LLM with Agent, Backend or both

📅 Published on September 17, 2024

An agent is essentially just code connected to a large language model (LLM), designed to automate tasks. It’s not some science-fiction villain poised to take over the world unless, of course, a quantum cyberhacking genius manages to break cryptographic standards such as AES-256, SHA-3, or even post-quantum schemes, and gains control over nuclear drones. In short, this idea is straight out of Hollywood.

That said, when properly configured, agents can display remarkable features such as autonomy, reactivity, proactiveness, social skills, and learning abilities. Pretty impressive, right?

Whether you use agents, a backend, or both, there is a wide range of tasks you can automate. Let’s dive into three scenarios to illustrate the scope of these technologies.

I. LLM + backend

When dealing with AI, the approach varies depending on whether you’re handling interactive or automated systems. We covered the corresponding analysis strategies in detail in a previous discussion (Chapter 1, Episode 4).

Once analysis is complete, the data can be transferred to your backend for further processing. Our AI model, Adriana, follows this pattern, with its backend connected to an LLM. The result? It can deliver a final income tax calculation in under 10 questions, showcasing the efficiency of this configuration.
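
To make the hand-off concrete, here is a minimal sketch of the pattern, not Adriana’s actual implementation. The LLM is replaced by a stub (fake_llm_extract) that returns the analysis as JSON, and the tax brackets are invented purely for illustration. The point is the split: the model extracts structured data, and the backend does the arithmetic deterministically.

```python
import json
from dataclasses import dataclass


def fake_llm_extract(conversation: list[str]) -> str:
    """Stand-in for an LLM call (assumption): returns the analysis as JSON.

    A real system would send the conversation to a model and ask for JSON.
    """
    return json.dumps({"gross_income": 50000.0, "deductions": 5000.0})


@dataclass(frozen=True)
class TaxInput:
    gross_income: float
    deductions: float


# Hypothetical progressive brackets: (upper limit of the bracket, rate).
BRACKETS = [(10000.0, 0.0), (30000.0, 0.10), (float("inf"), 0.20)]


def compute_tax(data: TaxInput) -> float:
    """Deterministic backend logic: no LLM is involved in the arithmetic."""
    taxable = max(data.gross_income - data.deductions, 0.0)
    tax, lower = 0.0, 0.0
    for upper, rate in BRACKETS:
        if taxable > lower:
            # Tax only the slice of income that falls inside this bracket.
            tax += (min(taxable, upper) - lower) * rate
        lower = upper
    return round(tax, 2)


def handle(conversation: list[str]) -> float:
    """LLM analysis first, then the backend calculation."""
    raw = json.loads(fake_llm_extract(conversation))
    return compute_tax(TaxInput(**raw))


print(handle(["I earn 50000 a year", "I have 5000 in deductions"]))
```

Keeping the calculation in the backend means the final figure does not depend on the model getting the arithmetic right, only on it extracting the inputs correctly.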

II. LLM + Agent

In this model, the agent takes the lead. It communicates with the LLM, performs analysis, and launches the appropriate tools for the task at hand.
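
As a rough sketch of that loop, assuming a scripted stand-in for the LLM (scripted_llm) and two toy tools (word_count and shout) that are invented for this example: the agent asks the model what to do, runs the chosen tool, feeds the result back, and stops when the model returns a final answer.

```python
from typing import Callable


def word_count(text: str) -> str:
    return str(len(text.split()))


def shout(text: str) -> str:
    return text.upper()


# Hypothetical tool registry: tool name -> callable.
TOOLS: dict[str, Callable[[str], str]] = {"word_count": word_count, "shout": shout}


def scripted_llm(task: str, observations: list[str]) -> dict:
    """Stand-in for the LLM: picks a tool first, then answers from its result."""
    if not observations:
        return {"action": "word_count", "input": task}
    return {"action": "final", "input": f"The task has {observations[-1]} words."}


def run_agent(task: str, llm=scripted_llm, max_steps: int = 5) -> str:
    observations: list[str] = []
    for _ in range(max_steps):
        decision = llm(task, observations)
        if decision["action"] == "final":
            return decision["input"]
        tool = TOOLS.get(decision["action"])
        if tool is None:
            # Report the problem back to the model instead of crashing.
            observations.append(f"unknown tool: {decision['action']}")
            continue
        observations.append(tool(decision["input"]))
    return "stopped: step limit reached"


print(run_agent("summarize the quarterly report"))
```

The step limit is a deliberate guard: a single agent that keeps calling tools without converging is one of the ways this setup becomes hard to manage.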

However, this strategy has limitations when it comes to scalability, mainly due to the complexity of managing multiple processes with a single agent. If you need to manage multiple workflows or handle intricate tasks, alternative strategies are worth exploring, such as:

Google’s Gemini strategy: Chain of Thought (Wei et al., 2022) for step-by-step reasoning
GPT-4: Tree of Thoughts (Yao et al., 2023) as a problem-solving framework
If you’re particularly advanced, OpenAI’s strategy, probably the one behind OpenAI o1: “Let’s Verify Step by Step” (Lightman et al., 2023) for reasoning through process supervision
Mistral’s strategy: Mixture of Experts (MoE), where a router activates only a few expert sub-networks for each input. In principle, experts can be specialized through fine-tuning, although they remain parts of one model rather than standalone agents. While this method can reduce compute during inference (only the selected experts are activated), training and fine-tuning multiple experts still require significant GPU power, particularly for large-scale models.
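
To illustrate the “only activating the necessary experts” point, here is a generic sketch of sparse top-k routing. It is a toy, not Mistral’s implementation: the experts are plain functions, the gate scores are passed in by hand rather than learned, and the names moe_forward and softmax are chosen for this example.

```python
import math
from typing import Callable

Expert = Callable[[float], float]


def softmax(scores: list[float]) -> list[float]:
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x: float, gate_scores: list[float], experts: list[Expert], k: int = 2):
    """Run only the k highest-scoring experts and mix their outputs.

    Returns (output, indices of the experts that actually ran).
    """
    ranked = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)
    top = ranked[:k]
    # Normalize the gate scores over the selected experts only.
    weights = softmax([gate_scores[i] for i in top])
    output = sum(w * experts[i](x) for w, i in zip(weights, top))
    return output, top


# Four hypothetical experts; in a real model these are neural sub-networks.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: -x, lambda x: x * x]
output, active = moe_forward(3.0, [0.0, 2.0, -1.0, 2.0], experts, k=2)
print(output, active)
```

Only two of the four experts run for this input, which is where the inference savings come from. All four still have to exist and be trained, which matches the GPU cost mentioned above.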

III. LLM + Agent + backend

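Now we’re talking about the strategy we really favor. It combines the two previous setups: the agent handles the LLM conversation and tool selection, while the backend handles the processing. In our experience, this makes the setup scalable, effective, and accurate.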

Coupled with a Domain-Driven Design (DDD) architecture, this approach allows for a scalable AI solution suitable for complex business use cases. Essentially, it can handle almost anything.
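
One way to picture this combination is the sketch below, which assumes a tiny billing domain invented for the example (Invoice, InvoiceRepository, BillingService) and a scripted stand-in for the LLM. The agent only translates a request into a command; the business rules live in the domain layer, and the backend can reject what the agent asks for.

```python
from dataclasses import dataclass, field


@dataclass
class Invoice:
    """Domain entity: owns its own business rule (no double payment)."""
    invoice_id: str
    amount: float
    paid: bool = False

    def pay(self) -> None:
        if self.paid:
            raise ValueError(f"invoice {self.invoice_id} is already paid")
        self.paid = True


@dataclass
class InvoiceRepository:
    """In-memory repository; a real backend would sit on a database."""
    _items: dict[str, Invoice] = field(default_factory=dict)

    def add(self, invoice: Invoice) -> None:
        self._items[invoice.invoice_id] = invoice

    def get(self, invoice_id: str) -> Invoice:
        return self._items[invoice_id]


class BillingService:
    """Application service: the only entry point the agent may call."""

    def __init__(self, repo: InvoiceRepository) -> None:
        self.repo = repo

    def pay_invoice(self, invoice_id: str) -> str:
        invoice = self.repo.get(invoice_id)
        invoice.pay()
        return f"invoice {invoice_id} paid ({invoice.amount:.2f})"


def scripted_llm_router(request: str) -> dict:
    """Stand-in for the LLM: maps a request to a backend command."""
    return {"command": "pay_invoice", "invoice_id": request.split()[-1]}


def agent(request: str, billing: BillingService) -> str:
    decision = scripted_llm_router(request)
    if decision["command"] == "pay_invoice":
        try:
            return billing.pay_invoice(decision["invoice_id"])
        except (KeyError, ValueError) as err:
            # The backend, not the model, has the final word.
            return f"rejected by backend: {err}"
    return "no matching command"


repo = InvoiceRepository()
repo.add(Invoice("INV-1", 120.0))
billing = BillingService(repo)
print(agent("please pay INV-1", billing))
print(agent("please pay INV-1", billing))
```

The second call is rejected by the domain rule even though the agent asked for it, which is the kind of guarantee a backend adds on top of an agent.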

You can also implement a multi-agent environment, where agents communicate and collaborate to complete tasks. However, such an environment can quickly become GPU-hungry, since every agent relies on LLM calls.
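
A minimal sketch of agents passing messages to each other, assuming two hypothetical roles (a researcher and a writer) whose handlers stand in for LLM-backed behavior. Names such as Message, Agent, and run are chosen for this example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass
class Message:
    sender: str
    recipient: str
    content: str


class Agent:
    def __init__(self, name: str, handler: Callable[[str, Message], list[Message]]):
        self.name = name
        self.handler = handler  # stands in for an LLM-backed policy

    def receive(self, msg: Message) -> list[Message]:
        return self.handler(self.name, msg)


def researcher(name: str, msg: Message) -> list[Message]:
    return [Message(name, "writer", f"facts about {msg.content}")]


def writer(name: str, msg: Message) -> list[Message]:
    return [Message(name, "user", f"Report based on {msg.content}")]


def run(agents: dict[str, Agent], first: Message, max_messages: int = 10) -> list[Message]:
    """Deliver messages in order until the queue is empty or the cap is hit."""
    queue = deque([first])
    delivered: list[Message] = []
    while queue and len(delivered) < max_messages:
        msg = queue.popleft()
        delivered.append(msg)
        if msg.recipient in agents:
            queue.extend(agents[msg.recipient].receive(msg))
    return delivered


agents = {"researcher": Agent("researcher", researcher), "writer": Agent("writer", writer)}
log = run(agents, Message("user", "researcher", "solar panels"))
for m in log:
    print(f"{m.sender} -> {m.recipient}: {m.content}")
```

In a real system each message handled by an agent would typically trigger at least one model call, so the message cap doubles as a simple budget on compute.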