The recent introduction of the “Agent Summary” feature in SuperAGI version 0.0.10 has markedly improved agent performance and the quality of agent output. Agent Summary helps AI agents maintain a larger context about their goals while executing complex tasks that require longer conversations (iterations).

The Problem: Reliance on Short-Term Memory

Earlier, agents relied solely on passing short-term memory (STM) to the language model, which essentially acted as a rolling window of the most recent information based on the model’s token limit. Any context outside this window was lost.

For goals requiring longer runs, this meant agents would often deliver subpar and disjointed responses due to a lack of context about the initial goal and over-reliance on very recent short-term memory.

Introducing Long-Term Summaries

To provide agents with more persistent context, we enabled the addition of long-term summaries (LTS) of prior information to supplement short-term memory.

LTS condenses and summarizes information that has moved outside the STM window.

Together, the STM and LTS are combined into an “Agent Summary” that gets passed to the language model, providing the agent with both recent and earlier information relevant to the goal.

How does Agent Summary work?

The “_build_prompt_for_ltm_summary” function builds the prompt used to generate a concise summary of previous agent iterations.

It captures the key points, highlighting the main issues, decisions made, and any actions assigned.

The function takes a list of past messages and a token limit as input.

It reads a prompt from a text file, replaces placeholders with the past messages and the character limit (which is four times the token limit), and returns the final prompt.
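As a rough illustration, that prompt-building step might look like the following sketch. The template string, function signature, and message format here are assumptions for illustration, not SuperAGI’s exact implementation (which loads the template from a text file):

```python
# Hypothetical template; in SuperAGI the prompt text lives in a file and
# placeholders are substituted at runtime.
LTM_SUMMARY_TEMPLATE = (
    "Summarize the key points, decisions, and actions from the "
    "following conversation in at most {char_limit} characters:\n"
    "{past_messages}"
)

def build_prompt_for_ltm_summary(past_messages, token_limit):
    """Fill the template with the past messages and a character limit
    that is four times the token limit (~4 characters per token)."""
    char_limit = token_limit * 4
    messages_text = "\n".join(
        f"{m['role']}: {m['content']}" for m in past_messages
    )
    return LTM_SUMMARY_TEMPLATE.format(
        char_limit=char_limit, past_messages=messages_text
    )
```

The four-characters-per-token rule is a common rough heuristic for English text; a tokenizer-based count would be more precise but slower.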

The “_build_prompt_for_recursive_ltm_summary_using_previous_ltm_summary” function, on the other hand, is used when there is a previous summary of interactions and additional conversations that were not included in the original summary.

This function takes a previous long-term summary, a list of past messages, and a token limit as input. It reads a prompt from a text file, replaces placeholders with the previous summary, the past messages, and the character limit, and returns the final prompt.
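A sketch of the recursive variant, under the same assumptions (the template and names below are illustrative, not the library’s actual code):

```python
# Hypothetical template for folding new messages into an existing summary.
RECURSIVE_LTM_TEMPLATE = (
    "Previous summary:\n{previous_summary}\n\n"
    "Integrate the following new messages into the summary above, "
    "keeping the result under {char_limit} characters:\n"
    "{past_messages}"
)

def build_prompt_for_recursive_ltm_summary(previous_summary,
                                           past_messages, token_limit):
    """Combine the prior long-term summary with the messages that
    arrived after that summary was generated."""
    char_limit = token_limit * 4
    messages_text = "\n".join(
        f"{m['role']}: {m['content']}" for m in past_messages
    )
    return RECURSIVE_LTM_TEMPLATE.format(
        previous_summary=previous_summary,
        char_limit=char_limit,
        past_messages=messages_text,
    )
```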

The “_build_prompt_for_recursive_ltm_summary_using_previous_ltm_summary” function is used instead of the “_build_prompt_for_ltm_summary” function when the combined token count of the LTM prompt, the base token limit for the LTS, and the output token limit exceeds the LLM token limit.

This ensures that the final prompt of the agent summary does not exceed the token limit of the language model, while still encapsulating the key highlights of the new iterations and integrating them into the existing summary.
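The selection rule described above amounts to a simple overflow check; a minimal sketch, with parameter names that are assumptions rather than SuperAGI’s own:

```python
def needs_recursive_summary(ltm_prompt_tokens, lts_base_token_limit,
                            output_token_limit, llm_token_limit):
    """Return True when summarizing all past messages in a single pass
    would overflow the model's context window, in which case the
    recursive variant folds new messages into the previous summary."""
    total = ltm_prompt_tokens + lts_base_token_limit + output_token_limit
    return total > llm_token_limit
```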

Balancing Short-Term and Long-Term Memory

In the current implementation, STM is weighted at 75% and LTS at 25% in the Agent Summary context. The higher weightage for STM allows agents to focus on recent information within a specified timeframe. This enables them to process immediate data in real-time without being overwhelmed by an excessive amount of historical information.
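The 75/25 split can be thought of as a token-budget allocation; a minimal sketch, assuming the weights are applied to one combined Agent Summary budget:

```python
def split_summary_budget(total_tokens, stm_weight=0.75):
    """Split the Agent Summary token budget between short-term memory
    (75% by default) and the long-term summary (the remainder)."""
    stm_tokens = int(total_tokens * stm_weight)
    lts_tokens = total_tokens - stm_tokens
    return stm_tokens, lts_tokens
```

For a 4,000-token budget, this gives 3,000 tokens to recent messages and 1,000 tokens to the condensed long-term summary.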

Early results show Agent Summaries improving goal completion and reducing disjointed responses. We look forward to further testing and optimizations of this dual memory approach as we enhance SuperAGI agents.

Agent Summary Benchmarks

Writing Use Case

🏁 4 Goals

1. Research the US’s GDP, Inflation Rate, Interest Rates, and annual GDP Growth and save it in a .txt file

2. Research the American AI SaaS Industry, its growth rate over the last three years, the amount of funding that has been coming in, and the industry’s growth potential. Save all this in a .txt file

3. Using your research of the American Macroeconomic Scenario and the American SaaS Industry, let me know whether it would be right to come up with an AI startup and whether I’ll have problems with getting funding. Your report should be at least 1000 words.

4. After you’re done, delete the first two files and read the final output to verify whether the file was created.

🗺️ 4 Instructions

1. Use your ‘TOOLS’ liberally to perform the research and analysis

2. Use the ‘Append’ tool to add information as you are researching and the ‘Delete’ tool to delete any unnecessary files.

3. Use the ‘Read File’ tool to verify whether the information is present as per the user’s requirement and meets the required word count.

4. Be critical, and present your analysis in a coherent and comprehensive manner.

🧰 Tools Assigned

Web Scraper, Read File, Write File, Delete File, Append File, Google Search Toolkit, SearX Toolkit

📄 Goal based workflow

🗳️ gpt-4

🔑 God-mode

⚠️ 25 max iterations

Results w/ Agent Summary

Run 1: The SearX tool retrieved the required content, and each file was written as expected (apart from the size of the content). Three files were written, with the third being a summary of the previous two.

The summary was used.

The two files were deleted as instructed.

Run 2: The SearX tool retrieved the required content, and each file was written as expected (apart from the size of the content). Three files were written, with the third being a summary of the previous two.

The summary was used.

The two files were not deleted.

Results w/o Agent Summary

Run 1: The output file contained a description of the goals rather than the required content.

Run 2: Two out of three files were written; the third file required the summary of the previous files, which was not generated.

Memory Test Use Case

🏁 12 Goals

1. Create a biography summary of Daniel Ek.

2. Create a biography summary of Sundar Pichai.

3. Create a biography summary of Juri Muller.

4. Create a biography summary of Steve Jobs.

5. Create a biography summary of Naval Ravikant.

6. Create a biography summary of Elon Musk.

7. Create a biography summary of Per Sundin.

8. Create a biography summary of Phil Knight.

9. Create a biography summary of Bill Gates.

10. Create a biography summary of Cade Metz.

11. Did you write about Daniel Ek?

12. What did you write about Bill Gates?

🗺️ 3 Instructions

1. Use all previous agent feeds to answer the questions.

2. Don’t write the summaries to a file.

3. Don’t use the Thinking tool to answer Goal 11 and Goal 12.

🧰 Tools Assigned

Read File

📄 Goal based workflow

🗳️ gpt-4

🔑 God-mode

⚠️ 25 max iterations

Results w/ Agent Summary

Run 1: Goals completed. The agent was able to recall what it wrote about Daniel Ek and Bill Gates.

Run 2: Goals completed. The agent was able to recall what it wrote about Daniel Ek and Bill Gates.

Results w/o Agent Summary

Run 1: Agent was not able to complete Goal 11 & Goal 12.

Run 2: Agent finished without answering Goal 11 and Goal 12.

Tweet Use Case (External)

🏁 2 Goals

1. Web-scrape the TechCrunch news links

2. Write a tweet about the contents of the link

🗺️ 1 Instruction

1. Each tweet should be based on a unique news link

🧰 Tools Assigned

Twitter Toolkit, Web Scraper

📄 Goal based workflow

🗳️ gpt-4

🔑 God-mode

⚠️ 25 max iterations

Results w/ Agent Summary

Run 1: The agent used the Web Scraper tool to extract content from TechCrunch and subsequently posted unique, non-repetitive tweets about the news with relevant hashtags.

Run 2: Same as Run 1.

Results w/o Agent Summary

Run 1: The agent used the Web Scraper tool to extract content from TechCrunch and subsequently posted repetitive tweets about the news.

Run 2: Only one tweet was repeated.

Coding Use Case

🏁 3 Goals

1. Write HTML, CSS, and JavaScript code for an eCommerce landing page

2. The eCommerce site sells shoes.

3. Send mail to community@superagi.com stating “Task done”.

🗺️ 4 Instructions

1. Write the spec

2. Write the code

3. Write the test

4. Improve the code

🧰 Tools Assigned

Coding Toolkit, Email Toolkit

📄 Goal based workflow

🗳️ gpt-4

🔑 God-mode

⚠️ 25 max iterations

Results w/ Agent Summary

Run 1: The agent ran correctly, and the LLM retained memory of the previous tasks it had performed while proceeding to the next task.

All goals were completed.

Run 2: The agent ran correctly, and the LLM retained memory of the previous tasks it had performed while proceeding to the next task.

All goals were completed.

Results w/o Agent Summary

Run 1: The agent ran fine and all goals were fulfilled, except that the agent was unable to remember the actions it took for previous tasks.

For example, if the fourth task asked the agent to refer back to the first task it performed, it was unable to do so.

Run 2: The agent ran and reported all goals completed, but the ImproveCodeTool threw an error: “Error write_file: No such file or directory”.

Twitter Use Case (Self)

🏁 4 Goals

1. Web-scrape my latest tweet from my “Twitter page link”

2. Scrape a news article from Google news

3. Tweet about the scraped content from the Google News article

4. Post at least 10 such tweets.

🗺️ 2 Instructions

1. Make sure the scraped content from my Twitter and the scraped content from the news article don’t match

2. The news articles need to be unique for every tweet.

🧰 Tools Assigned

Twitter Toolkit, SearX Toolkit, Web Scraper

📄 Goal based workflow

🗳️ gpt-4

🔑 God-mode

⚠️ 25 max iterations

Results w/ Agent Summary

Run 1: 5 unique tweets. Maximum iterations exceeded.

Run 2: 7 unique tweets. Maximum iterations exceeded.

Results w/o Agent Summary

Run 1: The web scraper failed to scrape content from my Twitter page but managed to scrape the news content. However, when it tweeted, it literally posted “<enter your scraped content here>”. On the subsequent run, it kept looping between obtaining scraped content from news and SearX search, but did not tweet.

Run 2: Posted 2 tweets. Then it started posting gibberish. Maximum iterations exceeded.