Agents

Agents are LLMs equipped with tools and memory to interact with the environment and complete specific, user-defined objectives. They do this by following workflows that direct them in (i) planning which steps and tools are needed, (ii) executing an action, and (iii) reflecting on feedback from that action. They loop through these steps when the initial plan requires multiple actions or when reflection suggests that additional actions are needed to achieve the objective. Compared to an LLM on its own, the "plan-action-reflect" workflow gives agents a higher degree of agency and more capacity for complex or long-term tasks, while tools let them retrieve up-to-date information from the environment and offload certain computations.
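The plan-action-reflect loop can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular framework implements it: `run_agent`, the three callables, and the `max_steps` budget are all hypothetical names, and in a real agent `plan` and `reflect` would typically be LLM calls instead of the hard-coded stubs used here.

```python
from typing import Callable, List, Optional


def run_agent(
    objective: str,
    plan: Callable[[str, List[str]], Optional[str]],
    act: Callable[[str], str],
    reflect: Callable[[str, List[str]], bool],
    max_steps: int = 5,
) -> List[str]:
    """Loop plan -> act -> reflect until reflection says the objective is met."""
    history: List[str] = []
    for _ in range(max_steps):            # assumed budget so the loop always ends
        step = plan(objective, history)   # (i) decide the next action
        if step is None:                  # planner has nothing left to do
            break
        result = act(step)                # (ii) execute the action
        history.append(f"{step} -> {result}")
        if reflect(objective, history):   # (iii) judge the feedback
            break
    return history


# Hard-coded stand-ins for what would normally be LLM and tool calls.
def demo_plan(objective: str, history: List[str]) -> Optional[str]:
    steps = ["search weather", "summarize"]
    return steps[len(history)] if len(history) < len(steps) else None


def demo_act(step: str) -> str:
    return f"done({step})"


def demo_reflect(objective: str, history: List[str]) -> bool:
    return len(history) >= 2


trace = run_agent("report the weather", demo_plan, demo_act, demo_reflect)
print(trace)
```

The step budget is a design choice of this sketch: it bounds the number of iterations even if reflection never reports success.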

Components

Figure: Agent components

Tools

Tools can include external data sets (including unstructured data such as PDF documents), web searches, APIs, custom functions, and even other agents. Each tool should fulfill a clear objective that is communicated to the LLM through a short, formatted description, so that the LLM is aware of the tool's existence and can invoke it when necessary. Because an LLM's input and output are text-based, invoking a tool means generating structured output in a specific format, typically JSON or direct code. Standards for communication between tools and LLMs are also being established and adopted, with Anthropic's Model Context Protocol (2024) being a notable example.
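As a rough illustration of a tool description and a JSON tool call, consider the sketch below. The registry layout, the `get_weather` tool, and the `{"tool": ..., "arguments": ...}` call shape are assumptions made for this example; they are not the Model Context Protocol format or any specific framework's schema.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call a weather API."""
    return f"Sunny in {city}"


# Assumed registry layout: name -> description, parameters, and callable.
TOOLS = {
    "get_weather": {
        "description": "Return the current weather for a city.",
        "parameters": {"city": "string"},
        "function": get_weather,
    }
}


def describe_tools(tools: dict) -> str:
    """Render the short descriptions that would be placed in the LLM's prompt."""
    lines = []
    for name, spec in tools.items():
        params = ", ".join(f"{p}: {t}" for p, t in spec["parameters"].items())
        lines.append(f"{name}({params}): {spec['description']}")
    return "\n".join(lines)


def invoke_tool(llm_output: str, tools: dict) -> str:
    """Parse a JSON tool call emitted by the LLM and run the matching tool."""
    call = json.loads(llm_output)
    spec = tools.get(call["tool"])
    if spec is None:
        return f"Unknown tool: {call['tool']}"
    return spec["function"](**call["arguments"])


# Text an LLM might generate when it decides to use the tool.
llm_output = '{"tool": "get_weather", "arguments": {"city": "Toronto"}}'
result = invoke_tool(llm_output, TOOLS)
print(describe_tools(TOOLS))
print(result)
```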

Memory

The information retrieved from a tool updates the agent's memory, which also contains the context of its objective: the original user-defined objective, any other user input, and the results of prior planning, actions, and reflection. This memory is vital for the agent to complete its objective coherently and to avoid getting stuck in endless loops, conducting unnecessary actions, or offering irrelevant results. Memory does not have to be one continuous block: it can be separated into multiple sections with different persistence and frequencies of use, as is the case with LlamaIndex's composable memory.
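One way to picture memory split into sections with different persistence is the plain-Python sketch below. It does not use or mirror LlamaIndex's API; the `AgentMemory` class, its field names, and the window size of three are all assumptions chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    objective: str                                  # persistent: never evicted
    facts: dict = field(default_factory=dict)       # long-lived tool results
    # Short-term section: only the three most recent steps are kept.
    recent: deque = field(default_factory=lambda: deque(maxlen=3))

    def remember(self, key: str, value: str) -> None:
        """Store a tool result that should persist for the whole task."""
        self.facts[key] = value

    def record_step(self, step: str) -> None:
        """Log a step; the oldest entry is dropped once the window is full."""
        self.recent.append(step)

    def as_context(self) -> str:
        """Flatten all sections into text that could be handed to the LLM."""
        parts = [f"Objective: {self.objective}"]
        parts += [f"Fact: {k} = {v}" for k, v in self.facts.items()]
        parts += [f"Recent: {s}" for s in self.recent]
        return "\n".join(parts)


memory = AgentMemory(objective="Plan a picnic")
memory.remember("weather", "sunny")
for i in range(5):
    memory.record_step(f"step {i}")
context = memory.as_context()
print(context)
```

Here the objective and stored facts survive the whole task, while older steps fall out of the short-term window, which keeps the text passed to the LLM bounded.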

LLM

The underlying LLM powering the agent's every move can be any language model, including those listed in the pocket references of notable models (e.g., Llama-3 and DeepSeek-R1).

Framework

The final component of an agent, and one that cannot be overlooked, is the "connective tissue" that enables the LLM to work together with its memory and its tools. This is the code that structures the user-defined objective and the tool descriptions into a prompt for the LLM, parses the LLM's output and directs its JSON or code to the corresponding tool, and incorporates tool results into memory and into the text passed back to the LLM for reflection. Some popular frameworks include CrewAI, MetaGPT, smolagents, LangGraph, and LlamaIndex.
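The three jobs of this connective tissue (build a prompt, route the LLM's output to a tool, write the result back to memory) can be sketched as a single step function. This is a toy under stated assumptions, not how CrewAI, LangGraph, or any other framework is implemented: `build_prompt`, `framework_step`, the prompt layout, and the canned `fake_llm` reply are all hypothetical.

```python
import json
from typing import Callable, Dict, List


def build_prompt(objective: str, descriptions: Dict[str, str], memory: List[str]) -> str:
    """Structure the objective, tool descriptions, and memory into one prompt."""
    tools = "\n".join(f"- {name}: {desc}" for name, desc in descriptions.items())
    notes = "\n".join(memory) if memory else "(none)"
    return (
        f"Objective: {objective}\n"
        f"Tools:\n{tools}\n"
        f"Memory:\n{notes}\n"
        "Reply with a JSON tool call."
    )


def framework_step(
    objective: str,
    tools: Dict[str, Callable],
    descriptions: Dict[str, str],
    memory: List[str],
    llm: Callable[[str], str],
):
    """Run one prompt -> parse -> dispatch -> remember cycle."""
    prompt = build_prompt(objective, descriptions, memory)
    reply = json.loads(llm(prompt))                      # parse the LLM's output
    result = tools[reply["tool"]](**reply["arguments"])  # direct it to the tool
    memory.append(f"{reply['tool']} returned {result}")  # fold result into memory
    return result


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; always returns the same tool call."""
    return '{"tool": "add", "arguments": {"a": 2, "b": 3}}'


tools = {"add": lambda a, b: a + b}
descriptions = {"add": "Add two numbers a and b."}
memory: List[str] = []
result = framework_step("Compute 2 + 3", tools, descriptions, memory, fake_llm)
print(result, memory)
```

On the next cycle, `build_prompt` would include the new memory entry, which is how the tool result reaches the LLM for reflection.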

Applications

The potential applications for agents are quite broad. Some examples include:

Personal scheduling assistant
Customer service queue specialist
Internet-of-things (IoT) hub manager
Discussion forum moderator
Lab research assistant
etc.

While the above are theoretical applications, tangible agents have also begun making their way to the market. At the time of writing this reference, Deep Research (2025), a research assistant for synthesizing literature, and Manus (2025), for broader analysis and development, are two agents that have made headlines. Furthermore, Casper et al. (2025) have compiled an index to track existing agents and discover patterns. They have found agents "being deployed at a steadily increasing rate".

Limitations

Built on LLMs, agents can be computationally expensive and may therefore be unnecessary for simple tasks. Because agents operate with only partial human supervision, the possibility of an agent making inefficient LLM calls or, worse, getting stuck in feedback loops should be kept in mind. Extra care should be taken when implementing multi-agent collaboration, as a hallucination in one agent can affect the whole system. In addition, if all agents in a multi-agent system are built on the same LLM, any reasoning deficiency is shared across all of them. Lastly, more work and time are needed to foster community trust in the viability of agents for everyday life.

Advances have been made to address these perceived shortcomings, with fine-tuning being the foremost technique. Supervised fine-tuning with instructions (Zhang et al., 2024), alignment and safety fine-tuning (Raschka, 2023), and reasoning fine-tuning with reinforcement learning (Luong et al., 2024) are among the promising fine-tuning techniques proposed.

AI Pocket Reference: NLP