This essay will discuss the potential of AI agents. It's timely that one of the agentic companies we have featured in Generational before, PolyAI, is holding its annual online conference on November 14, 2024. Check it out here. PolyAI is one of the few companies actively deploying voice AI agents in enterprise settings today. With that, let us dive into AI agents.
AI agents are all the hype today, so you’ve probably already read about them. But in case you are interested in understanding the inner workings, check out my previous essay How to Create a Mind.
In this article, we will take a business and investor perspective on AI agents, rather than a product and engineering one. With all the buzz about how “AI agents will revolutionize everything,” it is essential to have a framework to identify the areas with the most potential. To build this framework, I analyzed the automation potential of approximately 16,000 tasks across 800 jobs.
Motivation
When ChatGPT was released at the end of November 2022, people found it remarkable. It was a fun tool, quirky, yet in many ways useful. Few considered how it might replace anyone in their jobs. The perception shifted when GPT-4 was released in March 2023, with benchmarks showing it outperformed most humans on exams that resonate with us: SAT, GRE, LSAT, AP.
While the benchmarks are useful, they do not directly translate to how well AI can perform in our jobs. A more helpful framework focuses on the specific tasks, the jobs-to-be-done, in our roles. By looking at specific tasks, we can gain more nuanced insight into how much of each job is automatable by AI. That also indicates how much opportunity there is for startups building AI products (or how replaceable we may be). If AI can do our jobs, then the market opportunity is far larger than the $500 billion US software market: it is 20 times larger, which works out to roughly $10 trillion.
To precisely identify where the market opportunity is, I analyzed ~16,000 tasks across ~800 jobs from the US Bureau of Labor Statistics. Much credit is due to the tireless government researchers who have profiled all those tasks and jobs, even differentiating between computer programmers, software developers, and web developers.
Analysis
I evaluated the tasks with GPT-4o (o1 was too expensive), scoring each one against a rubric. Below is a simplified version of it:
No Automation Exposure: AI cannot perform any aspect of this task.
Low Automation Exposure: AI could complete 0-50% of the task components at high quality.
Moderate Automation Exposure: AI could complete 50-90% of task components at high quality.
High Automation Exposure: AI could complete 90-100% of task components at high quality.
Full Automation Exposure: AI can complete all task aspects at high quality without oversight.
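To make the bands concrete, here is a minimal Python sketch of how the simplified rubric could be encoded. It is not the prompt or scoring code used in the analysis, which is not shown here. The names `Exposure` and `classify`, the `needs_oversight` flag, and the handling of the shared boundaries (50% and 90% fall into the higher band) are all assumptions made for illustration.

```python
from enum import Enum


class Exposure(Enum):
    """The five rubric levels, labeled as in the simplified rubric."""
    NONE = "No Automation Exposure"
    LOW = "Low Automation Exposure"
    MODERATE = "Moderate Automation Exposure"
    HIGH = "High Automation Exposure"
    FULL = "Full Automation Exposure"


def classify(share: float, needs_oversight: bool = True) -> Exposure:
    """Map the share of task components AI can do at high quality to a level.

    share: fraction between 0.0 and 1.0 (hypothetical input; in the essay
           the rating comes from GPT-4o, not from a numeric share).
    needs_oversight: whether a human still has to review the result.
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0.0 and 1.0")
    if share == 0.0:
        return Exposure.NONE
    if share < 0.5:
        return Exposure.LOW
    if share < 0.9:  # assumption: exactly 50% counts as Moderate
        return Exposure.MODERATE
    if share == 1.0 and not needs_oversight:
        return Exposure.FULL
    return Exposure.HIGH  # assumption: exactly 90% counts as High


if __name__ == "__main__":
    for share, oversight in [(0.0, True), (0.3, True), (0.7, True),
                             (0.95, True), (1.0, False)]:
        print(share, oversight, classify(share, oversight).value)
```

The only thing separating High from Full in this sketch is whether the work needs oversight, which mirrors the wording of the last rubric line.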
While others have run similar task studies, many did not specify what an “AI system” is. In my analysis, I distinguished the automation potential of base models, copilots, agents, and robotic systems. This differentiation allows us to be more precise about what we are measuring and where the opportunities are.
Findings and learnings
The incremental automation from base models to copilots is modest. Copilots in the broad sense, not just Microsoft’s offering, are impressive, but many feel they are not as transformative as initially marketed. Marc Benioff’s tweet captures this sentiment well. The next significant shift in AI is agents that autonomously execute tasks across tools and contexts, as people do in real jobs. AI agents are poised to be the next big thing, not just because of hype, but because the data supports it.
The Agent Framework
Automation potential is one dimension in understanding how fast companies can achieve ROI from AI agents. Computer programmers are among the most automatable roles, which explains why coding tools like GitHub Copilot and Anysphere/Cursor are gaining traction quickly. Another key dimension is total wages paid, reflecting market opportunity. While proofreaders are among the most automatable roles, there are only around 5,500 in the U.S., earning about $280 million combined. In contrast, roughly 1.7 million developers collectively earn close to $230 billion.
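The wage comparison is easy to check with a few lines of Python. This is a rough sketch that uses only the rounded headcounts and wage totals quoted above; the `Role` class and its field names are hypothetical, and no automation scores are included because the essay does not list them.

```python
from dataclasses import dataclass


@dataclass
class Role:
    """A job with the two wage figures quoted in the essay (rounded)."""
    name: str
    workers: int
    total_wages_usd: float

    @property
    def avg_wage_usd(self) -> float:
        # Average annual pay implied by the two rounded figures.
        return self.total_wages_usd / self.workers


proofreaders = Role("Proofreaders", 5_500, 280e6)
developers = Role("Developers", 1_700_000, 230e9)

# How many times larger the developer wage pool is.
wage_pool_ratio = developers.total_wages_usd / proofreaders.total_wages_usd

if __name__ == "__main__":
    for role in (proofreaders, developers):
        print(f"{role.name}: ~${role.avg_wage_usd:,.0f} average, "
              f"${role.total_wages_usd / 1e9:,.2f}B total")
    print(f"Developer wage pool is ~{wage_pool_ratio:,.0f}x larger")
```

On these rounded inputs, the developer wage pool comes out roughly 800 times larger than the proofreader one, which is why a highly automatable but small role is a much smaller market than a large one.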
Mapping 800 jobs across these two dimensions reveals the roles with the highest potential for AI agent automation. The chart below shows the frontier of jobs ripe for agentic automation. It is no surprise that many of these roles align with high-growth startup areas, where VCs are offering top valuations.
Software developers: Poolside at $3 billion
Lawyers: Harvey at $1.5 billion
Customer Service Representatives: Sierra at $4.5 billion
I am still digging through the data but have already found some interesting patterns that I will be publishing in December as part of the 3rd annual Business of AI report. Subscribe if you want to get a copy of it.