Building an AI Agent Tutorial – Part 1


Use of the term “AI Agent” has increased 10x over the last year, according to data from Google Trends. This surge reflects a broader shift: people and organizations increasingly want AI systems that not only answer questions but also take actions on their behalf. From simplifying mundane tasks to streamlining business operations, the promise of Agentic AI is capturing global attention.

Trend for “AI Agent” over time (Image: Google Trends)

So, what does this really mean in practice? Let’s start with a relatable scenario of how AI Agents can transform everyday tasks in the near future. Imagine planning a vacation, which involves booking hotels, flights, and rental cars. Today, this process is fragmented and time-consuming. In an Agentic AI world, however, we could provide a single prompt that generates tailored travel packages, complete with itineraries, restaurants, and bookings.

Here is an example prompt that would work in such a scenario:

“I want to book a family trip with 2 kids in the months of June/July for a weekend plus 2 days. Don’t include the 2nd and 3rd weeks of June. I would just like to carry two cabin bags, and like tasting the best local food. Plan for an itinerary not longer than a 2-3 hour drive from the city.”

In this article, we’ll go beyond the buzzword that is AI Agents. You’ll first understand the fundamentals of AI Agents and then explore the platforms that make them possible. Finally, we’ll build a hands-on project: a YouTube Summarizer Agent using the Phidata framework. By the end, you’ll know what Agentic AI is and how to start building an agent with state-of-the-art (SOTA) tools.

Note: This is the first article in a two-part series on building AI Agents from the ground up. In this article, we’ll explore the value of AI Agents, introduce popular Agentic AI platforms, and walk through a hands-on tutorial for building a simple AI Agent. The next part of the series will dive deeper with a hands-on tutorial. There, we’ll build Agents that can automate tasks and interact with external tools and APIs.

Fundamentals of AI Agents

In simple terms, AI Agents are systems that can perform tasks autonomously by interpreting data from their environment. They make decisions based on that data to achieve their goals. Think of them as orchestrators that connect various tools and use Large Language Models (LLMs) to reason, plan, and execute tasks. For a detailed introduction to LLMs, you can refer to this article.

Let’s break down this definition using the vacation planning example above:

  • Perform tasks autonomously: Book flight, hotel, and rental car reservations through the respective vendors.
  • Interpret the data: Account for factors like weather, traffic, and local events to suggest the best activities that match the pace.
  • Make decisions: Given that there are dozens of restaurants available, Agents can provide recommendations based on the indicated preference and past reviews.
  • Achieve goals: Put together a travel plan that fits the requirements – dates, duration, preferences, and family needs.
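To make the four parts of this definition concrete, here is a minimal sketch of an agent loop in plain Python. Everything in it is hypothetical: the tool functions return canned data instead of calling real vendors, the restaurant ratings are made up, and the `plan` function is a hard-coded stand-in for the reasoning an LLM would normally do.

```python
from typing import Callable, Dict, List, Tuple

# Made-up data standing in for past reviews.
RESTAURANTS = [
    {"name": "Spice Route", "cuisine": "local", "rating": 4.7},
    {"name": "Quick Bites", "cuisine": "fast food", "rating": 3.9},
    {"name": "Harbor Grill", "cuisine": "local", "rating": 4.2},
]


# Hypothetical tools: each one stands in for a real vendor or data API.
def check_weather(city: str) -> str:
    """Interpret the data: return a canned forecast for the city."""
    return f"forecast for {city}: sunny"


def pick_restaurant(cuisine: str) -> str:
    """Make a decision: choose the best-rated restaurant for the cuisine."""
    matches = [r for r in RESTAURANTS if r["cuisine"] == cuisine]
    return max(matches, key=lambda r: r["rating"])["name"]


def book_hotel(city: str) -> str:
    """Perform a task: pretend to place a booking."""
    return f"hotel booked in {city}"


TOOLS: Dict[str, Callable[[str], str]] = {
    "check_weather": check_weather,
    "pick_restaurant": pick_restaurant,
    "book_hotel": book_hotel,
}


def plan(goal: dict) -> List[Tuple[str, str]]:
    """Stand-in for the LLM: a real agent would ask a model for this plan."""
    return [
        ("check_weather", goal["city"]),
        ("pick_restaurant", goal["cuisine"]),
        ("book_hotel", goal["city"]),
    ]


def run_agent(goal: dict) -> List[str]:
    """Achieve the goal: run each planned step with the matching tool."""
    return [TOOLS[tool_name](argument) for tool_name, argument in plan(goal)]


trip = run_agent({"city": "Lisbon", "cuisine": "local"})
print(trip)
```

The point of the sketch is the separation of concerns: tools do the work, the planner decides which tools to call, and the loop ties them together. Agentic frameworks provide this scaffolding so that the planner can be an LLM.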

Agentic AI Platforms

An Agentic AI framework is a toolkit that enables the creation of AI systems capable of reasoning, planning, and taking actions autonomously or semi-autonomously through tool use and memory. In short, these frameworks provide the structure needed to create agents.

There are several popular Agentic AI platforms, such as LangChain, CrewAI, and Phidata. For this tutorial, we’ll use Phidata – a lightweight and developer-friendly platform. Phidata comes with built-in access to a variety of tools and LLMs. This allows us to build and deploy AI Agents in just a few lines of code.

Popular built-in Tools and Model wrappers in Phidata (for a full list, see the links here – Models, Tools.)

Build a YouTube Summarizer Agent

The YouTube Summarizer Agent is designed to extract key insights and facts from any YouTube video. It saves time by providing concise summaries, so you don’t need to watch the entire video. For this tutorial, we’ll use a Google Colab notebook to write and execute the code and the Phidata Agentic AI platform to power the Agent.

Model: Within Phidata, we’ll leverage the Groq model hosting platform. It’s an inference service that runs LLMs on dedicated hardware (Groq’s own LPU chips rather than GPUs). Note that it’s different from Grok, which is an LLM from xAI. Since LLMs are resource-intensive, using Groq offloads computation from the local hardware or Colab-provided hardware, which makes execution faster and more efficient. Groq offers access to several models from different LLM providers (see the full list here).

Tools: To retrieve YouTube video data, we’ll use the built-in Tool from the Phidata framework (called YouTubeTools). This tool helps us access video metadata and captions. The agent then passes these to the selected LLM to generate accurate and insightful summaries.
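Before any captions can be fetched, a tool like this needs to work out which video a link points to. The sketch below shows one way to do that with the standard library. It is an illustration only, not Phidata’s implementation: the `extract_video_id` helper and the set of URL shapes it handles are assumptions made for this example.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video id from a few common YouTube URL shapes, or None."""
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # Short links look like https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None

    if host == "youtube.com" or host.endswith(".youtube.com"):
        # Standard links look like https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
        # Embed and Shorts links carry the id as the second path segment
        if parsed.path.startswith(("/embed/", "/shorts/")):
            return parsed.path.split("/")[2] or None

    return None


video_id = extract_video_id("https://www.youtube.com/watch?v=vStJoetOxJg")
print(video_id)
```

With the id in hand, a tool can request the captions and metadata for that video and hand the text to the LLM.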

Here is the code for the YouTube Summarizer Agent:

# Import the agent, the Groq model wrapper, and the YouTube tool from Phidata
from phi.agent import Agent
from phi.model.groq import Groq
from phi.tools.youtube_tools import YouTubeTools

# Requires GROQ_API_KEY to be set (see the Detailed Tutorial below)
agent = Agent(
    # model=Groq(id="llama3-8b-8192"),
    model=Groq(id="llama-3.3-70b-versatile"),  # swap the id to try a different LLM
    tools=[YouTubeTools()],  # lets the agent fetch video captions and metadata
    show_tool_calls=True,  # show each tool call in the output
    # debug_mode=True,
    description="You are a YouTube agent. Obtain the captions of a YouTube video and answer questions.",
)

# Ask the agent for a summary and stream the response as markdown
agent.print_response("Summarize this video https://www.youtube.com/watch?v=vStJoetOxJg", markdown=True, stream=True)

The following is the output generated by the YouTube Summarizer Agent (the code above). The YouTube link in the code is a video of Andrew Ng on the Machine Learning Specialization. As shown below, the agent accurately summarizes the video content. Note that the response may differ for each run because of the probabilistic nature of LLMs.

YouTube Summarizer Output
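To see why the same prompt can produce different summaries, it helps to look at how a model picks its next word: it samples from a probability distribution rather than always taking the top choice. The sketch below is a toy illustration with made-up scores for three candidate words. It is not how Groq or Phidata expose sampling, and the `softmax` helper is written here only for the example.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


tokens = ["course", "class", "program"]
logits = [2.0, 1.5, 0.5]  # made-up scores for the next word

probs = softmax(logits, temperature=1.0)
sharp = softmax(logits, temperature=0.1)

# Sampling from the distribution can give a different word on each run.
samples = [random.choices(tokens, weights=probs)[0] for _ in range(5)]
print(probs, sharp, samples)
```

At a temperature of 1.0 the three words all keep a meaningful share of the probability, so repeated runs vary. At 0.1 almost all of the probability moves to the top-scoring word, which is why lower temperatures tend to give more repeatable output.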

Detailed Tutorial

Here are the step-by-step instructions for creating the YouTube Summarizer Agent.

1. Clone Notebook

  • Clone the Colab notebook here (it requires a Google account)
  • Install dependencies (the first code cell)
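If you want to confirm that the install cell worked, a quick check like the following can help. The package list is an assumption based on the imports used in this tutorial (the notebook’s own first cell is the authoritative list), and `missing_packages` is a helper written for this example.

```python
import importlib.util

# Assumed mapping of import name -> pip package name for this tutorial.
REQUIRED = {
    "phi": "phidata",
    "groq": "groq",
    "youtube_transcript_api": "youtube_transcript_api",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages, try: pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```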

2. Get API key for Groq

Since we use the Groq model hosting platform, we need a Groq account to run the Agent. Follow the steps below to sign up or log in to Groq and get an API key.

– Go to the Groq Developer Portal: Open your browser and go to: https://console.groq.com

– Sign Up or Log In

  • If you already have an account, click Log In.
  • If you’re new, click Sign Up and follow the prompts to create an account (you may have to verify your email).

– Access the API Section

  • Once logged in, you’ll land on the Groq Console.
  • Navigate to the API Keys part from the sidebar or dashboard.

– Generate a New API Key

  • Click the “Create API Key” button.
  • Give your key a name (e.g., “workshop-key”).
  • Click Create or Generate.

– Copy and Store the Key Securely

  • Your API key will be shown only once: copy it immediately and store it in a secure location.
  • Never expose your API key in client-side code or public repositories.

3. Add the API key in the Secrets Manager

  • Click on Secrets (key icon) in the left pane of Colab
  • Provide the Name as GROQ_API_KEY and the Value as the API key copied in Step 2 above
  • Toggle notebook access to “ON”.
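Once the secret is stored, the notebook still has to read it. The sketch below shows one way to do that: it uses Colab’s `userdata` helper when it is available and falls back to an ordinary environment variable elsewhere. The `load_groq_key` function is a hypothetical helper for this example, and the notebook linked above may load the key differently.

```python
import os


def load_groq_key() -> str:
    """Read GROQ_API_KEY from Colab secrets, or from the environment."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata

        key = userdata.get("GROQ_API_KEY")
    except ImportError:
        key = os.environ.get("GROQ_API_KEY")

    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; see Steps 2 and 3 above.")

    # Export the key so libraries that read the environment can find it.
    os.environ["GROQ_API_KEY"] = key
    return key


# Usage (run after adding the secret): load_groq_key()
```

Reading the key at runtime keeps it out of the notebook’s source, so the notebook can be shared without exposing the key.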

Conclusion

In this article, we explored the rising demand for AI Agents and walked through a real-world example of how they can simplify everyday tasks. We broke down the fundamentals of AI Agents and introduced some popular Agentic AI frameworks. We also built a hands-on project: a YouTube Summarizer Agent powered by Phidata.

This is just the beginning. In the second article of this series, we’ll go deeper by building a study planner agent that doesn’t just generate plans but also takes actions. It will create tasks in Jira, send calendar invites, and demonstrate how AI Agents can seamlessly integrate with external tools and APIs to automate real-world workflows.

Check out part 2 of this series here – Building Study Planner Agent: AI Agent Tutorial Part 2

Co-author of the article: Abhishek Agrawal

Praveen is a seasoned Data Scientist with over a decade of experience in analytics. He has tackled complex business challenges and driven innovation through data-driven decision making. His expertise spans areas such as Machine Learning, Statistics, and Scalable Analytics, and he has helped launch several innovative products.
