Spurring the generative agent movement
Briefings highlight generational AI scaleups, startups, and projects. I was very fortunate to have chatted with BabyAGI creator Yohei Nakajima for this piece. Read on to learn more about Yohei’s background as a prolific builder and BabyAGI’s two-month history.
BabyAGI is an AI-powered task management agent. It automates brainstorming and task management by generating tasks according to the outcomes of previous tasks and a set objective. The system leverages off-the-shelf models, APIs, and components to create, prioritize, and execute tasks.
Most advances that AI Twitter folks prattle about are technical advances resulting in better, simpler, faster models. These advances are the result of smart PhDs in universities or engineers in technology companies keeping GPUs whirring all day. But last March, we saw an atypical non-technical innovation, generative agents, stir not only Twitter but also the general public media. The popular generative agent projects simply stitch together off-the-shelf models and databases. Some engineers on Twitter bemoan some of the projects because of how sloppily for-loops are put together.
But what made these projects so successful was that they gave the general public glimpses of AGI. Not the singularity kind but a practical kind - AI that autonomously figures out how to achieve an objective. Another interesting fact about this development is that most of the pivotal open source projects were not developed by the PhDs & developers in AI labs. Instead, they were built by hobbyists. One in particular, BabyAGI, was created by the unlikeliest of suspects: a venture capitalist.
Yohei Nakajima started his career bringing the nascent Los Angeles tech community together with coworking spaces and events. One part of the job he loved was meeting and learning from people. This curiosity made him a good match for the venture industry, where he eventually spent a decade helping global corporations & startups work together and scouring the world for startups to join Techstars. So it was foreseeable, if not expected, that he eventually started his own venture firm, Untapped Capital. His profile was perfect for it. But what’s not obvious is his shadow resume - his build-in-public log. Each entry is a project, or a presentation of one, that Yohei built himself. Since starting Untapped in 2020, he has logged over 100 entries. Curiosity to learn is a common trait among investors. But what is rare is an investor who builds. And even rarer is one who builds so prolifically.
Going through his build log, I noted down three themes: low/no-code, web3, and AI. Some of the more notable projects are:
Low/no-code: Dealflow digest, a tool to connect founders and investors
AI: Unofficial Zapier x OpenAI integration, which became the official integration
A cynic might conclude that he’s just following the hype cycle of what is hot in tech. But looking closer, the projects are all oriented towards helping him become a better venture investor. Tinkering helps him get a concrete grasp of the technologies he is investing in. But he is also building tools to automate the repeatable parts of his job: automating intros, drafting investment memos, answering FAQs, and many more. One of his AI projects is Mini Yohei, a chatbot that can complement regular Yohei in addressing questions from his portfolio companies.
Beginnings of BabyAGI
HustleGPT trended in early March as a creative challenge to use ChatGPT as an entrepreneurial companion - an AI cofounder - to turn $100 into your financial goals (e.g., $100,000). As a venture investor, this naturally intrigued Yohei. He is always on the lookout for founders. So when he read about HustleGPT, he wondered if it was possible to take it a step further: building an AI founder. This extra step is subtle but is the crux of creating autonomous AI agents. In HustleGPT, a human manned the ChatGPT terminal at every turn. A fully AI founder meant autonomy. It had to think, plan, and execute on its own.
The prompt above generated a program that Yohei calls a “Task-Driven Autonomous Agent”, the predecessor to BabyAGI 1.0. For simplicity, let’s call the former BabyAGI version 0.0. The program creates agents that leverage GPT-4, Pinecone, and LangChain to autonomously plan & perform tasks based on the objective a user types in. It was built to mirror Yohei’s day-to-day workflow - tackling the first thing on his task list, then adding new tasks throughout the day, and then reviewing & reprioritizing tasks at night for the next day. BabyAGI 0.0 was then fed to ChatGPT to write a pretty convincing scientific paper, which was in turn used to create a Twitter thread that garnered attention.
How long did it take him to do all of this? 2-3 hours.
Spurring the agentic AI movement
A few days later, he open sourced BabyAGI 1.0, a pared-down version zero. Version 1.0 simplified the code base by removing some of the tools, such as LangChain and Zapier. Yohei wanted the core BabyAGI to be more of a template so that other users could easily pick it up and swap in their preferred tools. Intentional simplicity worked. BabyAGI quickly grabbed the Twitter mindhive, inspiring a wave of projects and startups. I’ve listed a sampling of them below.
Cognosys - AI agent designed to revolutionize productivity and simplify complex tasks.
Embra - A fast, ChatGPT-like assistant for your mac. Personalized to you — and your work.
aiVA - The industrial AI virtual advisor your team will love. Reliable expert guidance at your finger tips, grounded in domain-specific knowledge and best practices.
Aomni - Aomni can break down a high level research question into a step-by-step plan, and execute it for you while you enjoy your coffee.
Nexus - a freelancer marketplace composed of AI agents
GPTRPG - A simple RPG-like environment for an LLM-enabled AI Agent to exist in
Agents have as many applications as there are human tasks. There are over a hundred additional examples if you follow the Twitter threads here and here. With BabyAGI’s popularity came pull requests, questions, and comments, overwhelming Yohei with the amount of outreach. He publicly asked for help. He was not a professional software developer, nor an experienced open source contributor. He was a full-time venture investor.
And help did come.
One remarkable aspect of open source is that once a community is built, help comes organically. Fraser Kelton, the former Head of Product at OpenAI and a venture investor at Spark Capital, volunteered to foster the budding community. BabyAGI now has an official website along with a Discord channel to bring the contributor community together. Of course, when a VC comes in to help an open source project, I can’t help but wonder if it’ll become a startup. But Yohei confirmed there is no company behind BabyAGI. What Fraser gave was elbow grease backed by the experience of founding a startup and leading product management at Airbnb and OpenAI. In the 10 weeks since BabyAGI launched, it has garnered 15.5K GitHub stars, inspired thousands of offshoot projects, and seen three new mods of the original BabyAGI published. The latest mod, called BabyDeerAGI, improves the original program by making it smarter (stops when it finishes work) and faster (parallel task execution). To experience BabyAGI, here’s a free app that one BabyAGI contributor generously built.
I asked Yohei what is next for him. He said he’ll continue to spend time on BabyAGI (see his June 18 update below) but declined to share specifics. Not because he fears someone will steal his ideas - he did open source BabyAGI. Rather, because for him, it takes the fun out of building. Discovering, iterating, and pivoting are all part of the process. A rule that Yohei follows when building is to let his curiosity guide him, unrestricted by a preset commitment. What this means for the Yohei-curious, like me, is we just have to follow his Twitter. Or better yet, contribute to the BabyAGI repo.
BabyAGI aims to be the simple framework that developers can use as the basis for building any AI assistant. Given the diverse set of tasks that assistants will be built for in the future, there is no plan to have pre-built solutions for every possible use-case within the main codebase. Instead, the plan is to provide a simple foundation which builders can start from and pair that with easy-to-follow recipes.
Using the BabyDeerAGI codebase as the ‘latest’ version of BabyAGI, the key concepts are (program functions & variables in quote marks):
Task Abstraction: In this system, a task is a fundamental unit of work. It is represented as an object with properties that describe the work to be done (‘task’), how it should be done (‘tool’), what other tasks it depends on (‘dependent_task_ids’), its current status (‘status’), and what results it produces (‘output’). This abstraction allows any type of work to be modeled into a task, given that it can be performed by one of the available tools.
Tool Abstraction: Tools are methods or procedures used to perform tasks. They are abstracted in a way that allows them to be interchangeably used depending on the nature of the task. For instance, if a task requires information generation, the ‘text-completion’ tool (using OpenAI's GPT-3.5 model) is utilized. If a task requires information gathering from the web, ‘web-search’ and ‘web-scrape’ tools are used. If a task requires human input, the ‘user-input’ tool is invoked.
Task Dependencies: Tasks can depend on one another, creating a relationship where some tasks can only be performed after certain others are completed. This relationship is modeled through the ‘dependent_task_ids’ property of a task. This concept allows the system to handle complex objectives that require multiple, dependent steps to achieve.
Task Management: The system manages tasks through a task list. The task list ensures that tasks are executed in the correct order, respecting their dependencies. Tasks are continually checked and executed if their dependencies are met, updating their status and storing their output when complete.
Automated Task Generation: Using an AI model, the system can generate its own tasks based on a given objective. This allows it to break down complex objectives into manageable tasks that can be performed by the available tools.
Session Summary: The system maintains a running summary of the work it has done. This summary includes the objective and the output of all completed tasks. It provides a consolidated view of the results produced by the system.
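To make the task and tool abstractions concrete, here is a minimal Python sketch. It is not taken from the BabyDeerAGI codebase: the property names (‘task’, ‘tool’, ‘dependent_task_ids’, ‘status’, ‘output’) and tool names come from the description above, while the ‘id’ field, the status values, the stub tool functions, and the helpers ‘dependencies_met’ and ‘run_task’ are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Task:
    """One unit of work, using the property names quoted above."""
    id: int                        # assumed identifier, used to express dependencies
    task: str                      # the work to be done
    tool: str                      # name of the tool that should do it
    dependent_task_ids: List[int] = field(default_factory=list)
    status: str = "incomplete"     # assumed values: "incomplete" / "complete"
    output: Optional[str] = None   # filled in once the task has run


# Hypothetical stand-ins for the real tools: no model or network calls here.
def text_completion(prompt: str) -> str:
    return f"[generated text for: {prompt}]"


def web_search(query: str) -> str:
    return f"[search results for: {query}]"


# Tools are looked up by name, so they can be swapped without touching tasks.
TOOLS: Dict[str, Callable[[str], str]] = {
    "text-completion": text_completion,
    "web-search": web_search,
}


def dependencies_met(task: Task, tasks: List[Task]) -> bool:
    """A task is runnable once every task it depends on is complete."""
    by_id = {t.id: t for t in tasks}
    return all(by_id[dep].status == "complete" for dep in task.dependent_task_ids)


def run_task(task: Task) -> None:
    """Dispatch the task to its tool, then record the output and status."""
    task.output = TOOLS[task.tool](task.task)
    task.status = "complete"


if __name__ == "__main__":
    tasks = [
        Task(id=1, task="Find recent articles about AI agents", tool="web-search"),
        Task(id=2, task="Summarize the articles", tool="text-completion",
             dependent_task_ids=[1]),
    ]
    print(dependencies_met(tasks[1], tasks))  # task 2 must wait for task 1
    run_task(tasks[0])
    print(dependencies_met(tasks[1], tasks))  # now task 2 is runnable
```

Because each tool is just an entry in a lookup table, adding a new capability in this sketch means registering one more function, which is in the spirit of the "simple foundation" goal described earlier.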
Here’s how BabyDeerAGI works step-by-step:
The system starts with a high-level ‘OBJECTIVE’.
The ‘task creation agent’ breaks down this objective into a list of manageable ‘tasks’, each associated with a specific ‘tool’ and potentially depending on other tasks.
These tasks are managed within a ‘task list’.
The system enters a loop, continuously checking the ‘task list’ and executing ‘tasks’ whose dependencies are met.
Each ‘tool’ performs its specific function to complete its assigned task.
Upon task completion, the ‘output’ is stored, the ‘status’ is updated, and the ‘session summary’ is updated.
This loop continues until all tasks are complete, at which point the session summary is saved, providing a consolidated view of the accomplished work.
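The steps above can be sketched as a short Python loop. This is an illustrative approximation under stated assumptions, not the actual BabyDeerAGI code: ‘create_tasks’ is a hypothetical stand-in for the task creation agent (which would normally call a language model), the tools are stubs, tasks run one after another rather than in parallel, and the summary is returned as a string instead of being saved to disk.

```python
from typing import Callable, Dict, List


def create_tasks(objective: str) -> List[dict]:
    """Hypothetical stand-in for the task creation agent (normally an LLM call)."""
    return [
        {"id": 1, "task": f"Search the web for: {objective}", "tool": "web-search",
         "dependent_task_ids": [], "status": "incomplete", "output": None},
        {"id": 2, "task": "Summarize the findings", "tool": "text-completion",
         "dependent_task_ids": [1], "status": "incomplete", "output": None},
    ]


# Stub tools: each receives the task text plus outputs of the tasks it depends on.
TOOLS: Dict[str, Callable[[str, List[str]], str]] = {
    "web-search": lambda task, context: f"results({task})",
    "text-completion": lambda task, context: (
        f"text({task}; given {len(context)} prior outputs)"
    ),
}


def run(objective: str) -> str:
    task_list = create_tasks(objective)
    session_summary = f"OBJECTIVE: {objective}\n"

    # Keep looping until every task in the list is complete.
    while any(t["status"] != "complete" for t in task_list):
        done = {t["id"] for t in task_list if t["status"] == "complete"}
        ready = [
            t for t in task_list
            if t["status"] != "complete" and set(t["dependent_task_ids"]) <= done
        ]
        if not ready:
            # Guard added for this sketch: avoid spinning forever on a bad plan.
            raise RuntimeError("remaining tasks have unmet dependencies")

        for t in ready:
            # Pass along the outputs of the tasks this one depends on.
            context = [d["output"] for d in task_list
                       if d["id"] in t["dependent_task_ids"]]
            t["output"] = TOOLS[t["tool"]](t["task"], context)
            t["status"] = "complete"
            session_summary += f"Task {t['id']}: {t['task']}\n{t['output']}\n"

    return session_summary


if __name__ == "__main__":
    print(run("learn about BabyAGI"))
```

In this sketch the tasks that are ready in a given pass do not depend on one another, so they are the natural candidates for the parallel execution mentioned earlier; the sequential inner loop is kept here only for readability.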