What Are AI Agents and How Do They Work?

A paradigm shift is taking place in the field of artificial intelligence (AI), moving from generative models that create synthetic content towards autonomous AI agents capable of realising complex, open-ended goals with minimal human intervention (Kolt 2025). An AI agent is fundamentally a computational entity that uses large language models (LLMs) as its "brain" to perceive its environment, reason, plan, and act in pursuit of a specific goal (Deng et al. 2025). This marks a significant advance over traditional systems limited to narrower tasks, opening up new possibilities in which AI is not merely a tool but an active collaborator in complex problem-solving. Such systems are becoming increasingly prevalent in research, where some refer to them as an "AI co-scientist" (Gottweis et al. 2025). Understanding the core principles, operational mechanisms, and types of agents is crucial for assessing the potential of current and future AI applications.

The fundamental operational framework of an AI agent is an iterative cycle composed of several closely interconnected modules. According to the consensus in the modern literature, these modules are perception, reasoning and planning, action, and memory (Xie et al. 2024; Sapkota et al. 2025). The perception module processes and interprets multimodal (textual, visual, auditory, etc.) information from the environment. The reasoning and planning module, which typically incorporates an LLM or a large multimodal model (LMM), decomposes tasks, defines sub-goals, and develops a plan of action. This "thinking" process often employs techniques such as Chain-of-Thought (CoT), which breaks a problem into intermediate steps to support more reliable deduction (Masterman et al. 2024). The action module executes the plan, most often with the help of external tools such as APIs, web search engines, or code execution environments. Finally, the memory module enables the agent to retain context, learn from previous interactions, and maintain state during long-running tasks (Xie et al. 2024).
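The perceive-plan-act-remember cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not a real agent: the `Agent` class and its method names are hypothetical, and the `plan` and `act` methods return canned values where a production system would call an LLM and external tools.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Toy sketch of the perception / planning / action / memory cycle."""
    memory: list = field(default_factory=list)

    def perceive(self, observation: str) -> str:
        # Perception: here just normalising a text observation.
        return observation.strip().lower()

    def plan(self, goal: str, observation: str) -> list:
        # Reasoning/planning: a real agent would prompt an LLM here
        # (e.g. with a Chain-of-Thought prompt); we return fixed steps.
        return [f"search for '{goal}'", f"summarise findings on '{goal}'"]

    def act(self, step: str) -> str:
        # Action: a real agent would call a tool (API, search, code runner).
        return f"result of: {step}"

    def run(self, goal: str, observation: str) -> list:
        obs = self.perceive(observation)
        results = []
        for step in self.plan(goal, obs):
            result = self.act(step)
            self.memory.append((step, result))  # Memory: retain context
            results.append(result)
        return results

agent = Agent()
out = agent.run("ai agents", "User asks about AI agents")
```

Each pass through `run` corresponds to one iteration of the cycle; the appended `memory` entries are what would let a long-running agent maintain state across iterations.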

One of the most important directions in the development of AI agents is a steady increase in architectural complexity, from simpler single-agent systems to sophisticated multi-agent "agentic AI" systems (Sapkota et al. 2025). Single-agent architectures rely on one agent that independently executes cycles of planning, action, and self-reflection. Such systems, built on patterns like ReAct or Reflexion, perform well on clearly defined problems that can be solved through iterative refinement and self-correction (Masterman et al. 2024). In contrast, multi-agent systems (or agentic AI) represent a fundamental paradigm shift, in which multiple agents endowed with specialised roles ("personas") collaborate to achieve a common goal. These systems are capable of dynamic task-sharing, coordination of specialised knowledge, and management of complex, parallelisable workflows (Sapkota et al. 2025). For instance, collaborative models are at the core of systems that automate systematic literature reviews (Sami et al. 2024) and of research assistants often referred to as "AI co-scientists" (Gottweis et al. 2025).
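The ReAct pattern mentioned above interleaves reasoning ("thoughts") with tool calls ("actions"), feeding each observation back into the next decision. The sketch below shows that loop under heavy assumptions: the `llm` function is a hard-coded stand-in for a language model, and `Calculate` is the only tool, so the example only demonstrates the control flow, not real reasoning.

```python
def llm(prompt: str) -> str:
    # Stand-in for an LLM call: chooses the next action from the transcript.
    # A real ReAct agent would generate a free-form thought plus an action.
    if "Observation: 4" in prompt:
        return "Finish[4]"
    return "Calculate[2 + 2]"

def calculate(expr: str) -> str:
    # A tiny calculator "tool" handling expressions of the form "a + b".
    a, _, b = expr.split()
    return str(int(a) + int(b))

def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        action = llm(transcript)
        if action.startswith("Finish["):
            return action[len("Finish["):-1]      # final answer extracted
        tool_input = action[len("Calculate["):-1]
        observation = calculate(tool_input)        # act, then observe
        transcript += f"\nAction: {action}\nObservation: {observation}"
    return "no answer"

answer = react("What is 2 + 2?")
```

The key design point is the growing `transcript`: because each observation is appended before the next model call, the agent can iteratively refine its behaviour and self-correct, which is what makes this pattern effective on clearly defined problems.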

The key to the operation of multi-agent systems is effective coordination and communication. Architectures can be vertical (hierarchical), where a lead agent directs the others, or horizontal, where agents collaborate as equals, for instance, within a debate or discussion framework (Masterman et al. 2024). The "generate, debate, and evolve" model presented by Gottweis et al. (2025) is an excellent example of horizontal collaboration, where agents refine and develop hypotheses during simulated scientific debates. This critical-reflective step is essential for increasing reliability and filtering out errors (e.g., "hallucinations"). Ensuring an effective flow of information, for example through "publish-subscribe" mechanisms, prevents unnecessary communication noise and ensures that each agent only accesses information relevant to it (Masterman et al. 2024).
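The publish-subscribe mechanism mentioned above can be illustrated with a minimal message bus. This is a generic sketch, not the design of any cited system: the `MessageBus` class and topic names are invented for the example. The point is that each agent receives only messages on topics it subscribed to, which keeps communication noise down.

```python
from collections import defaultdict

class MessageBus:
    """Toy publish-subscribe bus for multi-agent coordination."""
    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of handlers

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver only to agents that registered interest in this topic.
        for handler in self.subscribers[topic]:
            handler(message)

bus = MessageBus()
reviews = []
# A "reviewer" agent subscribes only to new hypotheses.
bus.subscribe("hypothesis", lambda m: reviews.append(f"critique of: {m}"))

bus.publish("hypothesis", "compound X inhibits enzyme Y")  # delivered
bus.publish("raw_data", "assay readings")                  # no subscriber: ignored
```

In a horizontal "debate" architecture, several agents would subscribe to the same topic and publish competing critiques; in a vertical one, only the lead agent would publish task assignments.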

In summary, AI agents represent the forefront of artificial intelligence development, moving beyond mere content generation into the realm of autonomous action and problem-solving. Their operation rests on an iterative cycle of perception, reasoning, planning, and action, enabled by the integration of LLMs with external tools. The shift from single-agent models to complex, collaborative agentic AI systems yields increasingly sophisticated capabilities, including the automation of scientific discovery (Gao et al. 2024) and other complex tasks. Although the technology still faces numerous challenges in reliability and evaluation (Kapoor et al. 2024), security (Deng et al. 2025), and governance (Kolt 2025), AI agents are already reshaping our interactions with the digital environment, increasingly functioning as an extension of human creativity and expertise rather than as mere replacements.

References:

1. Deng, Zehang, Yongjian Guo, Changzhou Han, Wanlun Ma, Junwu Xiong, Sheng Wen, and Yang Xiang. 2025. AI Agents under Threat: A Survey of Key Security Challenges and Future Pathways. ACM Computing Surveys 57 (7): 1–36. https://doi.org/10.1145/3643876

2. Gao, Shanghua, Ada Fang, Yepeng Huang, Valentina Giunchiglia, Ayush Noori, Jonathan Richard Schwarz, Yasha Ektefaie, Jovana Kondic, and Marinka Zitnik. 2024. Empowering Biomedical Discovery with AI Agents. Cell 187 (22): 6125–6151. https://doi.org/10.1016/j.cell.2024.06.001

3. Gottweis, Juraj, Wei-Hung Weng, Alexander Daryin, Tao Tu, Anil Palepu, Petar Sirkovic, Artiom Myaskovsky, Felix Weissenberger, Ke Rong, Ryutaro Tanno, and Kassem Saab. 2025. Towards an AI Co-Scientist. arXiv preprint arXiv:2502.18864. https://arxiv.org/abs/2502.18864

4. Kapoor, Sayash, Benedikt Stroebl, Zachary S. Siegel, Nitya Nadgir, and Arvind Narayanan. 2024. AI Agents That Matter. arXiv preprint arXiv:2407.01502. https://arxiv.org/abs/2407.01502

5. Kolt, Noam. 2025. Governing AI Agents. arXiv preprint arXiv:2501.07913. https://arxiv.org/abs/2501.07913

6. Masterman, Tula, Sandi Besen, Mason Sawtell, and Alex Chao. 2024. The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey. arXiv preprint arXiv:2404.11584. https://arxiv.org/abs/2404.11584

7. Sami, Abdul Malik, Zeeshan Rasheed, Kai-Kristian Kemell, Muhammad Waseem, Terhi Kilamo, Mika Saari, Anh Nguyen Duc, Kari Systä, and Pekka Abrahamsson. 2024. System for Systematic Literature Review Using Multiple AI Agents: Concept and an Empirical Evaluation. arXiv preprint arXiv:2403.08399. https://arxiv.org/abs/2403.08399

8. Sapkota, Ranjan, Konstantinos I. Roumeliotis, and Manoj Karkee. 2025. AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges. arXiv preprint arXiv:2505.10468. https://arxiv.org/abs/2505.10468

9. Xie, Junlin, Zhihong Chen, Ruifei Zhang, Xiang Wan, and Guanbin Li. 2024. Large Multimodal Agents: A Survey. arXiv preprint arXiv:2402.15116. https://arxiv.org/abs/2402.15116