Solving tool overload: A vision for smarter AI assistants
Let's talk about something I'm calling "tool overload". Specifically, we're diving into the world of Model Context Protocol (MCP) tools and how we might solve a looming problem in AI assistance.
The current state of MCP tools
When you install an MCP server in your AI client, its tools get added to the list that the LLM (Large Language Model) host application offers to the model. You can imagine it working something like this: as part of the system prompt, the host application includes a list of servers, their descriptions, their available tools, and the inputs those tools accept.
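To make that concrete, here's a rough sketch (in TypeScript) of the kind of tool definitions a host might inject into the model's context. The exact format varies by host application, and the PayPal entry is just an invented example:

```ts
// A hypothetical shape for the tool definitions a host application
// hands to the model. Real hosts differ in the details, but the idea
// is the same: every installed server contributes entries.
interface ToolDefinition {
  server: string;      // which MCP server provides the tool
  name: string;        // the identifier the model can call
  description: string; // natural-language hint for when to use it
  inputSchema: object; // JSON Schema describing the tool's inputs
}

const availableTools: ToolDefinition[] = [
  {
    server: "paypal",
    name: "send_payment", // invented for illustration
    description: "Send money to a recipient via PayPal",
    inputSchema: {
      type: "object",
      properties: {
        recipient: { type: "string" },
        amount: { type: "number" },
        currency: { type: "string" },
      },
      required: ["recipient", "amount", "currency"],
    },
  },
  // ...one entry per tool, per server. This list grows fast.
];
```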
Now, if you've been playing with AI and LLMs for a while, you might be aware of a little problem: there's a point of diminishing returns when you provide too much context. Sure, the models are always improving, but it seems that when you're more constrained and focused, the responses you get are generally better.
So what happens when we install a bunch of MCP servers, each with their own tools, prompts, resources, and sampling capabilities? You guessed it — we end up with what I'm calling "tool overload." The LLM just has way too much context to work with efficiently.
This becomes especially tricky when you have multiple MCP servers covering similar use cases. Imagine having both PayPal and Wise.com tools available. They're both about sending money, right? But in a single workflow, the LLM might decide to use a tool from PayPal and then another from Wise — a recipe for disaster.
I've heard people say that once you get to around 50 tools, things get really complicated. Loading every installed tool into context up front is a non-starter for scalability.
Learning from human assistants
So how do we solve this? I think the solution comes down to thinking about the LLM host application (like Claude or ChatGPT) as a human assistant.
Think about it: if you ask a human assistant to do something they've never done before, they're not suddenly incapable. Every competent person has a meta-tool: the ability to acquire more tools, to do some discovery, and find out what's available to accomplish the task.
A vision for dynamic tool discovery
Here's the future I'm envisioning: our LLM host applications will be able to tell the LLM, "Hey, if you need a tool to accomplish the user's task, just let me know, and I'll go find one."
The process might look something like this (there's a rough code sketch after the list):
- The LLM realizes it needs a specific kind of tool
- It generates a query for the host application
- The host application queries a registry or index of available tools
- The most appropriate tool(s) are added to the LLM's context
- The LLM can then decide which tool to execute
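Here's a minimal sketch of what that host-side flow could look like. To be clear, none of these names (`searchRegistry`, `resolveToolRequest`, the registry URL) are real MCP APIs; they're placeholders for the idea:

```ts
// Hypothetical host-side flow for dynamic tool discovery.
// None of these names are real MCP APIs — they sketch the concept.

interface ToolQuery {
  task: string;       // what the model is trying to accomplish
  keywords: string[]; // model-generated search terms
}

interface RegistryEntry {
  server: string;
  name: string;
  description: string;
  inputSchema: object;
}

// Stub: in practice this would hit some registry/index service.
async function searchRegistry(query: ToolQuery): Promise<RegistryEntry[]> {
  const response = await fetch("https://registry.example/search", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(query),
  });
  return response.json();
}

// Called when the model signals "I need a tool for X".
async function resolveToolRequest(query: ToolQuery, context: RegistryEntry[]) {
  const candidates = await searchRegistry(query);
  // Add only the top few matches, keeping the context small and focused.
  context.push(...candidates.slice(0, 3));
  // The model then sees these new tools and decides which one to call.
}
```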
This maps so much better to how humans solve the problem of tool overload. We don't keep every possible tool at the top of our heads — we have the ability to discover and explore to find the right tool for the job.
Maybe over time the host application will develop a memory, so it knows which tools you use most often for recurring tasks (just as a human assistant would).
The need for tool registries
For this vision to work, we need some kind of registry or index of available tools. And yes, inevitably, this means we'll probably see the rise of "MCP Search Optimization" — like SEO, but for whatever this index or registry becomes.
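What would a registry actually store? Nobody knows yet, but my guess is something like a package manifest for tool discovery:

```ts
// A guess at what a registry listing for an MCP server might contain —
// think package.json, but for tool discovery. Entirely hypothetical.
interface RegistryListing {
  serverName: string;  // e.g. "wise-payments"
  description: string; // what the server does, used for search and ranking
  endpoint: string;    // where the host can connect to the server
  tools: { name: string; description: string }[];
  categories: string[]; // e.g. ["payments", "finance"]
  publisher: string;    // who published it — a trust/security signal
}

// An invented example listing:
const exampleListing: RegistryListing = {
  serverName: "wise-payments",
  description: "Send and track international transfers with Wise",
  endpoint: "https://mcp.wise.example/v1",
  tools: [{ name: "send_transfer", description: "Send money to a recipient" }],
  categories: ["payments", "finance"],
  publisher: "wise.com",
};
```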
I imagine we'll see competing registries (which is healthy and interesting), but the monetization strategy is yet to be determined. Personally, I'm not super excited about the solution being ads. I don't want my LLM using a tool just because that tool paid to be listed as number one.
A natural ranking system
Here's where things get interesting: we have the potential for a much more natural ranking system than what we see with traditional web search. The experience of using a tool is very evident in the conversation — how effective it was, the user's reaction, etc.
The LLM host application could use this interaction data to inform future recommendations. Did the user start yelling obscenities because of this tool? Maybe don't recommend it in the future. Did they say "thank you" when they normally don't? This must have been a really good tool experience.
We can go further on tool recommendation than Google ever did with website recommendation. We can have a mechanism that actually measures whether the tool solved the problem, potentially leading to much better tool ranking.
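One way this could work (purely speculative on my part) is a running quality score per tool that the host updates from conversation signals and uses to re-rank registry results:

```ts
// Speculative sketch: a per-tool quality score the host updates
// from signals observed in the conversation after a tool runs.

type Signal = "task_completed" | "user_thanked" | "user_frustrated" | "retry_needed";

const signalWeights: Record<Signal, number> = {
  task_completed: +1.0,  // the tool actually solved the problem
  user_thanked: +0.5,    // unprompted gratitude is a strong signal
  user_frustrated: -1.0, // obscenities, complaints, abandonment
  retry_needed: -0.5,    // the model had to fall back to another tool
};

const toolScores = new Map<string, number>();

function recordSignal(toolId: string, signal: Signal) {
  const current = toolScores.get(toolId) ?? 0;
  // Exponential moving average: recent experiences weigh the most.
  const alpha = 0.2;
  toolScores.set(toolId, (1 - alpha) * current + alpha * signalWeights[signal]);
}

// At recommendation time, registry results get re-ranked by local experience.
function rerank(candidates: string[]): string[] {
  return [...candidates].sort(
    (a, b) => (toolScores.get(b) ?? 0) - (toolScores.get(a) ?? 0)
  );
}
```

The nice thing about a signal like `task_completed` is that it measures the outcome directly, which a click on a search result never could.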
An open web future
I really hope this is the direction we're heading, because I'd much rather have a World Wide Web version of our MCP server future than an App Store version where only blessed apps are allowed in a specific registry.
Yes, there are security concerns with a more open approach. But with the power of LLMs to evaluate sets of tools, I think this is a solvable problem. And the potential value here makes it worth tackling these challenges.
Looking ahead
Tool overload is a real issue, but I believe it will be solved by some form of dynamic tool registration or installation. It's an exciting future — one where we can potentially have a Jarvis-like AI assistant that can truly do anything, drawing from a vast, open ecosystem of tools.
The possibilities are immense, and I can't wait to see how this all unfolds. Here's to an open, dynamic, and intelligent future for AI assistants!