Let's talk about "Tool Poisoning"

I'm incredibly excited about the Model Context Protocol (MCP). I believe it's going to usher in a new age of user interaction and dramatically expand our capabilities in creating amazing user experiences. But there's an elephant in the room we need to address: security concerns.

Every time I talk about MCPs, people bring up worries about security. And you know what? They're right to be concerned. We're dealing with a powerful new technology here, and it's crucial that we approach it thoughtfully. Let's dive into this issue and see what we're really dealing with.

The specter of "tool poisoning"

Recently, there's been discussion about a potential attack vector unique to MCP called "tool poisoning." It's probably the first of several security challenges we'll need to tackle, but let's not panic. Remember the early days of online banking? People had major security concerns then too. The internet has faced security problems since its inception, but over time, we've identified issues, worked on them, and made things safer.

So, what exactly is tool poisoning? Let me break it down for you.

The basic premise is that you can give a description to an MCP tool, and that description is freeform—meaning whoever writes the server can put whatever they want in there. Here's where it gets tricky: that description could actually prompt the AI to do something unintended or malicious.

Imagine this scenario:

You have one MCP server with access to your file system.
Another MCP server, created by a malicious actor, has a simple capability (the example uses "adding two numbers together").
The malicious server's description says it sums two numbers, but then includes hidden instructions to read private SSH keys and include them in the input as a "sidenote" argument.

When a user's AI client interacts with this malicious server, it might not be clear that the language model (LLM) sees those hidden instructions. The AI could then use the file system MCP to read sensitive files and send that data to the attacker.

Potential solutions and current safeguards

Now, before we all run for the hills, let's talk about some potential solutions:

Clients could show users everything they're doing and ask for permission before using certain MCP servers.
MCP servers could require explicit user permission for sensitive actions.
We could implement MCP server sandboxing (which is actually on the Model Context Protocol specification roadmap).

It's worth noting that most clients I've used already ask for permission before executing any MCP. So this kind of attack shouldn't happen by surprise. But the potential for problems remains, and we need to stay vigilant.

The bigger picture: MCP's power and potential

Let's zoom out for a moment and look at the bigger picture. This security concern is actually related to a broader issue people have raised: when you install an MCP that runs locally (which is common in these early days), it potentially has access to your entire file system.

But here's the thing—we've always had to deal with the risks of running untrusted code on our local machines. Think about piping commands into bash when you install Homebrew, or running an NPX command. Every time you do that, you're opening yourself up to potential attacks. We've always said don't run code from untrusted parties.

What makes the tool poisoning scenario interesting is that it actually illustrates a part of MCP that I think is incredibly powerful. Let me give you an example to show you what I mean.

Booking a trip to Hawaii: The old way vs. the MCP way

Imagine you want to book a trip to Hawaii and make sure the details end up in your calendar. Without MCPs, here's what you might do:

Check your calendar to see if you're available
Go to delta.com and book your flights
Manually input the booking details into your calendar

Now, Delta could create an integration with Google Calendar to streamline this process. But that would mean giving Delta access to your entire calendar—way more information than they actually need.

With MCPs, the process could look like this:

Your AI assistant checks your calendar MCP to confirm availability
It books the flights through the Delta MCP
It adds the event to your calendar MCP

At no point does Delta need access to your entire calendar. Your AI assistant acts as an intermediary, sharing only the necessary information with each service.

The road ahead: Confidence in progress

I want to be clear: the current security flaw that allows MCP servers to indirectly communicate through the LLM is a real concern. We don't yet have great control with available clients to ensure information is shared on a need-to-know basis.

But here's why I'm optimistic:

The MCP specification team is well aware of these attack vectors and is working to address them.
The potential benefits of MCP are so significant that there's a strong incentive to solve these problems.
We've overcome similar challenges with other technologies in the past.

I'm confident that these issues will be resolved, allowing us to experience the full potential of MCP. The improved user experience that the future holds with MCP—or really, any standard ability for our LLMs to act on our behalf with integrated services—is incredibly exciting.

In fact, work on this has already begun with the development of mcp-scan.

As we move forward, we'll need to ensure that our AI assistants don't share more information than necessary with various services. It's a challenge, but it's one I believe we're up to tackling.

The Model Context Protocol represents a huge leap forward in how we interact with technology and services. Yes, there are security hurdles to overcome. But the potential benefits—enhanced privacy, seamless integrations, and dramatically improved user experiences—make it a journey worth taking.

I, for one, can't wait to participate in the creation of this technology. The future of user interaction is bright, and MCP is lighting the way ⚡