At Lyzr, we’re pushing the boundaries of what’s possible with Large Language Models (LLMs) by employing innovative frameworks that enhance their capabilities.
Frameworks such as ReAct, CoT, and ToT have been shown to unlock significant capabilities of Large Language Models (LLMs), enabling the automation of complex tasks that were previously out of reach when using LLMs straight out of the box.
With proper prompting and feedback, Generalist Foundation Models can outcompete special-purpose fine-tuned models.
For instance, the accuracy of GPT-4 on competitive programming problems increased from 19% with a single well-designed direct prompt to 44% with Flow Engineering.
So what is Flow Engineering?
A Flow is an iterative process where multiple agents collaborate and interact with the environment and each other to get a task done. Flow Engineering elevates prompt engineering by breaking tasks into smaller steps and prompting the LLM to self-refine its answers, which improves accuracy and overall performance.
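To make the idea concrete, here is a minimal sketch of a flow that splits a task into steps and asks the model to critique and refine each answer. It is an illustration under assumptions, not Lyzr's implementation: `run_flow`, `toy_llm`, and the prompt wording are hypothetical, and `toy_llm` is a deterministic stand-in for a real model call.

```python
from typing import Callable, List

# Hypothetical stand-in type for an LLM client: prompt in, completion out.
LLM = Callable[[str], str]


def run_flow(task: str, steps: List[str], llm: LLM, max_refinements: int = 2) -> List[str]:
    """Run the steps in order; each answer is critiqued and refined before moving on."""
    results: List[str] = []
    context = task
    for step in steps:
        answer = llm(f"Task: {context}\nStep: {step}\nAnswer:")
        for _ in range(max_refinements):
            critique = llm(f"Critique this answer to '{step}': {answer}")
            if critique.strip().upper().startswith("OK"):
                break  # the critic accepts the answer, stop refining
            answer = llm(
                f"Step: {step}\nPrevious answer: {answer}\n"
                f"Critique: {critique}\nImproved answer:"
            )
        results.append(answer)
        # Later steps see the accepted answers of earlier steps.
        context = f"{context}\n{step}: {answer}"
    return results


def toy_llm(prompt: str) -> str:
    """Deterministic fake model so the sketch runs without any API key."""
    if prompt.startswith("Critique"):
        return "OK" if "v2" in prompt else "Too vague, add detail."
    if "Improved answer:" in prompt:
        return "draft v2"
    return "draft v1"


answers = run_flow("Summarise the quarterly report", ["outline", "write"], toy_llm)
```

With a real model behind `llm`, the same loop trades extra calls for answers that have passed at least one round of self-critique.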
To put the impact of advanced frameworks into perspective, let’s consider the Language Agent Tree Search (LATS) framework.
When paired with GPT-4, LATS has achieved a remarkable 94.4% success rate on the HumanEval benchmark for programming tasks. In contrast, GPT-4 alone, without specialized prompting, scores 67.0%. This striking difference of 27.4 percentage points highlights the untapped potential within LLMs that can be harnessed through Flow Engineering.
Drawing inspiration from Monte-Carlo Tree Search, LATS operates through a sequence of steps: selection, expansion, evaluation, simulation, and backpropagation. A node is selected, expanded, and evaluated; a simulation then runs until a terminal node is reached, and the resulting value is backpropagated up the tree.
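The loop described above can be sketched with a plain Monte-Carlo Tree Search over a toy problem. This is not the LATS implementation: in LATS an LLM proposes the candidate actions and scores the nodes, whereas here the hypothetical `propose`, `reward`, and `is_terminal` functions are simple Python callables, and evaluation is folded into the rollout reward to keep the example short.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    state: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)
    visits: int = 0
    value: float = 0.0  # running mean of backed-up rewards


def uct(node: Node, c: float = 1.0) -> float:
    """Upper-confidence score; unvisited nodes are always tried first."""
    if node.visits == 0:
        return float("inf")
    return node.value + c * math.sqrt(math.log(node.parent.visits) / node.visits)


def search(
    root_state: str,
    propose: Callable[[str], List[str]],
    reward: Callable[[str], float],
    is_terminal: Callable[[str], bool],
    iterations: int = 50,
    seed: int = 0,
) -> str:
    rng = random.Random(seed)
    root = Node(root_state)
    best_state, best_reward = root_state, float("-inf")
    for _ in range(iterations):
        # Selection: follow the highest UCT score down to a leaf.
        node = root
        while node.children:
            node = max(node.children, key=uct)
        # Expansion: an LLM would sample candidate actions here.
        if not is_terminal(node.state):
            node.children = [Node(s, parent=node) for s in propose(node.state)]
            node = rng.choice(node.children)
        # Evaluation and simulation: random rollout to a terminal state, then score it.
        state = node.state
        while not is_terminal(state):
            state = rng.choice(propose(state))
        r = reward(state)
        if r > best_reward:
            best_state, best_reward = state, r
        # Backpropagation: update statistics from the leaf back to the root.
        while node is not None:
            node.visits += 1
            node.value += (r - node.value) / node.visits
            node = node.parent
    return best_state


# Toy task: build a 4-character bit string; more "1"s means a higher reward.
best = search(
    "",
    propose=lambda s: [s + "0", s + "1"],
    reward=lambda s: s.count("1") / 4,
    is_terminal=lambda s: len(s) == 4,
)
```

The search returns the best terminal state seen across all rollouts, so more iterations give it more chances to find a high-reward string.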
Research on “dual process” models suggests that people have two modes in which they engage with decisions – a fast, automatic, unconscious mode (“System 1”) and a slow, deliberate, conscious mode (“System 2”).
The simple associative token-level choices of LMs are also reminiscent of “System 1”. They might therefore benefit from augmentation by a more deliberate “System 2” planning process that maintains and explores diverse alternatives for current choices instead of just picking one, and that evaluates its current status and actively looks ahead or backtracks to make more global decisions.
System 2 thinking with LLMs can be achieved at both the prompt level and the flow level. This can be done by including a few examples of thorough and deliberate thinking at the prompt level or splitting the end goal into smaller, detailed steps at the flow level.
We at Lyzr utilize research-backed prompting techniques and prompt engineering, paired with our own rigorous research and comprehensive benchmarking for the selection of the best prompts and prompting techniques for an enterprise workflow.
A few techniques that we regularly use with Flow Engineering while developing Agents at Lyzr:
– Let’s Verify Step by Step: Using process supervision by providing feedback for each intermediate reasoning step.
– Reflexion: A framework to reinforce language agents through linguistic feedback instead of updating weights. Reflexion converts binary or scalar feedback from the environment into verbal feedback in the form of a textual summary, which is then added as additional context for the LLM agent.
– OPRO: Optimization by PROmpting. Using an LLM to generate new solutions from a prompt that contains previously generated solutions along with their scores. Prompts optimized by OPRO outperform human-designed prompts by up to 50%.
– Automatic Prompt Engineering: LLMs are very good at optimizing their own prompts and are human-level prompt engineers. Using the LLM itself, we can take vague user requirements and turn them into high-quality prompts.
– Rephrase and Respond: Using LLMs to rephrase and expand questions posed by humans and provide responses. This method is complementary to Chain-of-Thought prompting.
– Step-Back Prompting: Using abstraction to derive high-level concepts and principles, which then guide the reasoning steps of LLMs.
– Chain-of-Verification: Generate Baseline Response -> Plan Verifications -> Execute Verifications -> Generate Final Verified Response.
– Emotional Stimuli: Combining the original prompt with emotional stimuli e.g. ‘This is a Critical Scenario’ can enhance performance by as much as 115% in specific scenarios.
– EVOPROMPT: Using Evolutionary Algorithms for automatic prompt generation, outperforming human-engineered prompts by up to 25%.
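As one example from the list above, the four Chain-of-Verification stages can be sketched as a small pipeline. This is a minimal illustration under assumptions: `chain_of_verification`, `toy_llm`, and the prompt wording are hypothetical, and `toy_llm` is a scripted stand-in so the code runs without a model.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in type for an LLM client: prompt in, completion out.
LLM = Callable[[str], str]


def chain_of_verification(question: str, llm: LLM) -> str:
    # 1. Generate Baseline Response.
    baseline = llm(f"Answer the question.\nQ: {question}\nA:")
    # 2. Plan Verifications: ask for one check question per line.
    plan = llm(
        "List verification questions, one per line, for this answer.\n"
        f"Q: {question}\nA: {baseline}"
    )
    checks: List[str] = [line.strip() for line in plan.splitlines() if line.strip()]
    # 3. Execute Verifications: each check is answered without seeing the draft,
    #    so mistakes in the baseline are less likely to be repeated.
    findings: List[Tuple[str, str]] = [
        (check, llm(f"Answer concisely.\nQ: {check}\nA:")) for check in checks
    ]
    # 4. Generate Final Verified Response.
    evidence = "\n".join(f"- {check} -> {answer}" for check, answer in findings)
    return llm(
        "Revise the draft using the verification results.\n"
        f"Q: {question}\nDraft: {baseline}\nVerification:\n{evidence}\nFinal answer:"
    )


def toy_llm(prompt: str) -> str:
    """Scripted fake model; a real flow would call an actual LLM here."""
    if prompt.startswith("List verification"):
        return "Is Paris in France?\nIs Paris the capital of France?"
    if prompt.startswith("Revise"):
        return f"FINAL({prompt.count('->')} checks)"
    if prompt.startswith("Answer concisely"):
        return "yes"
    return "Paris"


final = chain_of_verification("What is the capital of France?", toy_llm)
```

The same skeleton (generate, inspect, revise) carries over to several of the other techniques; what changes is which prompt produces the feedback and how it is folded back into the context.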
Why use Flow Engineering?
Efficiently leveraging LLMs for automating roles and streamlining processes begins with meticulous Flow Engineering, which involves designing task automation to reduce iterative rounds and ensure that LLM capabilities align with our expectations from the outset.
Flow Engineering enhances LLM performance, lowers operational costs, and provides improved output control. As the industry advances from chatbots to GenAI agents, mastering this approach becomes crucial for automating sophisticated processes effectively.
LLMs started out as assistants, supporting humans in specific tasks while the human always remained in charge and in the loop. As LLMs become more and more powerful, they also become increasingly reliable and capable of achieving goals on their own.
Flow Engineering helps in designing robust flows, with a focus on thorough testing, fault tolerance, and systems that fail safely when a failure does occur.
How are we using Flow Engineering at Lyzr?
Not all flows are equal; even the latest SOTA (State-of-the-Art) framework might not perform well for a specific use case.
Then there is the issue of latency and cost. Flows generally have two or more steps. This means multiple round trips to the LLM, and the additional token costs and latency can quickly add up! Thorough testing and validation have to be done to select the best flow.
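A back-of-the-envelope estimate shows how quickly this adds up. The sketch below compares a single prompt with a three-step flow. Every number in it (token counts, latencies, and per-1,000-token prices) is an illustrative placeholder rather than a real measurement or a real price list, and `Step` and `flow_cost` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    name: str
    prompt_tokens: int
    completion_tokens: int
    latency_s: float


def flow_cost(steps: List[Step], price_in_per_1k: float, price_out_per_1k: float) -> Tuple[float, float]:
    """Return (total cost, total latency) for steps that run one after another."""
    cost = sum(
        s.prompt_tokens / 1000 * price_in_per_1k
        + s.completion_tokens / 1000 * price_out_per_1k
        for s in steps
    )
    latency = sum(s.latency_s for s in steps)  # sequential calls add up
    return cost, latency


# Placeholder prices per 1,000 tokens (assumed, not a real price list).
PRICE_IN, PRICE_OUT = 0.01, 0.03

single_prompt = [Step("direct", 1000, 500, 2.0)]
three_step_flow = [
    Step("plan", 800, 200, 1.5),
    Step("draft", 1200, 600, 3.0),
    Step("review", 1500, 300, 2.0),
]

single_cost, single_latency = flow_cost(single_prompt, PRICE_IN, PRICE_OUT)
multi_cost, multi_latency = flow_cost(three_step_flow, PRICE_IN, PRICE_OUT)
```

Under these assumed numbers the flow costs roughly 2.7 times as much and takes a little over 3 times as long as the single prompt, which is why the accuracy gain of each step has to be validated against what it adds.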
Frameworks like AutoGen and CrewAI are popular options for creating AI Agents that work together to achieve a given task. While these frameworks are great for getting started with simple to medium-complexity workflows, they are not sufficient to build advanced agents like Lyzr’s DataAnalyzr and Knowledge ChatBot.
We at Lyzr developed the agents from scratch by leveraging the best available tools, so that we have complete control over the agent behavior.
Below is a simplified illustration of Lyzr’s conversational Data Analysis agent DataAnalyzr. The actual flow is far more sophisticated, with individual steps thoroughly optimized for task efficiency, accuracy, and overall agent performance.
The approach is modular, so individual agents and tasks can be modified to give fine-grained control over the overall performance and outcome.
Process supervision significantly outperforms outcome supervision. Thus, in the DataAnalyzr flow, we maintain synthetic supervision of the flow using agents, together with a human in the loop who can correct the intermediate steps to steer the model to the right/expected outcome.
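The idea of supervising every intermediate step, rather than only the final outcome, can be sketched as follows. This is a simplified illustration and not the DataAnalyzr code: `supervised_flow`, the `checker` (standing in for a supervising agent), and the `reviewer` (standing in for a human in the loop) are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

StepFn = Callable[[str], str]
Checker = Callable[[str, str], bool]             # (step name, output) -> acceptable?
Reviewer = Callable[[str, str], Optional[str]]   # returns a corrected output, or None


def supervised_flow(
    data: str,
    steps: List[Tuple[str, StepFn]],
    checker: Checker,
    reviewer: Reviewer,
) -> Tuple[str, List[str]]:
    """Check every intermediate output; escalate rejected ones to the reviewer."""
    log: List[str] = []
    for name, fn in steps:
        output = fn(data)
        if checker(name, output):
            log.append(f"{name}: ok")
        else:
            corrected = reviewer(name, output)
            if corrected is None:
                # Fail safely instead of passing a bad result downstream.
                raise ValueError(f"step '{name}' was rejected and not corrected")
            output = corrected
            log.append(f"{name}: corrected")
        data = output  # the next step consumes the supervised output
    return data, log


# Toy flow: the second step deliberately produces a malformed query.
steps = [
    ("clean", lambda q: q.strip()),
    ("to_sql", lambda q: "SELEC * FROM sales"),
]
checker = lambda name, out: name != "to_sql" or out.startswith("SELECT ")
reviewer = lambda name, out: out.replace("SELEC ", "SELECT ", 1)

result, log = supervised_flow("  total sales  ", steps, checker, reviewer)
```

Because the faulty step is caught where it happens, the correction is cheap and later steps never see the bad output.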
Why are we developing Automata, a new agent framework?
Developing an Agent from scratch can be tedious, time-consuming, and prone to failure. There is no established framework for developing scalable, performant, and advanced GenAI agents for enterprises.
That is why we at Lyzr are developing Lyzr Automata, an agent framework incorporating all the best practices and learnings from building enterprise-grade agents.
The agents built at Lyzr (DataAnalyzr, Knowledge Bot, RAG Bot, Voice Bot, and more) are being enhanced and integrated with Lyzr Automata, and they fit elegantly with the entire Lyzr ecosystem.
Lyzr Automata is a multi-agent framework for process automation. You can create multiple agents running autonomously, and automate processes end to end.
It uses an agent + task architecture in which tasks can be executed in parallel to reduce latency.
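To show what an agent + task split with parallel execution can look like in general, here is a generic sketch built on Python's standard library. It is not the Lyzr Automata API: the `Agent` and `Task` classes and the `run_parallel` helper are hypothetical names, and each agent's `act` function is a stand-in for an LLM-backed call.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Agent:
    role: str
    act: Callable[[str], str]  # stand-in for an LLM-backed call


@dataclass
class Task:
    name: str
    instruction: str
    agent: Agent

    def run(self) -> str:
        return self.agent.act(self.instruction)


def run_parallel(tasks: List[Task], max_workers: int = 4) -> Dict[str, str]:
    """Run independent tasks concurrently and collect results by task name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {task.name: pool.submit(task.run) for task in tasks}
        return {name: future.result() for name, future in futures.items()}


researcher = Agent("researcher", lambda text: f"notes on {text}")
writer = Agent("writer", lambda text: f"draft about {text}")

results = run_parallel([
    Task("research", "pricing trends", researcher),
    Task("outline", "pricing trends", writer),
])
```

Threads suit this case because LLM calls spend most of their time waiting on the network; tasks that depend on each other's output would still have to run in sequence.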
With Automata, our aim is to democratize flow engineering with a low-code approach, while keeping it powerful enough for power users to develop customized flows and agents that automate enterprise workloads with ease.
Version 0.2 is out, and it is open source. The code can be found on GitHub. Don’t forget to ⭐ the repo to show your support.