Self-Hosted MCP Gateways: From Demo to Production

The 15-minute setup that takes three months to finish

Every MCP server has the same origin story. Someone runs a one-line install, connects it to Claude or another agent, and within fifteen minutes it's pulling data from a ticketing system or a database. It works. It feels like magic.

Then someone asks an obvious question: who else can use this, and what happens when it touches production data?

That question is where the real work begins. The gap between "MCP server running on my laptop" and "MCP infrastructure my whole team can trust" isn't a gap in features. It's a gap in everything that doesn't show up in a quickstart guide: authentication, access control, logging, and isolation.

Here's what actually closes that gap, and how to get there without turning it into a months-long infrastructure project.

What is an MCP gateway?

The Model Context Protocol (MCP) is a standard that lets AI agents connect to external tools and data sources through a common interface. Instead of writing custom integration code for every tool an agent needs, MCP gives agents a consistent way to discover and call tools.

An MCP gateway sits between AI agents and the MCP servers they talk to. It acts as a single entry point that handles authentication, routing, logging, and policy enforcement for all MCP traffic, instead of leaving every individual MCP server to handle these concerns on its own.

Put simply, an MCP server exposes one set of tools, while an MCP gateway manages access to many MCP servers at once and makes that access auditable and controllable.
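
To make the single-entry-point idea concrete, here is a minimal sketch of what a gateway does on each call: authenticate the caller, route to the right backend, and record the call. Everything in it is hypothetical, including the `Gateway` class, its `register` and `call` methods, the token table, and the plain Python callables standing in for MCP servers. A real gateway would speak the MCP protocol over a network transport rather than call functions in-process.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for an MCP server: a mapping of tool name -> callable.
ToolTable = Dict[str, Callable[..., Any]]


class Gateway:
    """Toy single entry point: authenticate, route, log."""

    def __init__(self, tokens: Dict[str, str]) -> None:
        self._tokens = tokens              # token -> caller identity (assumed scheme)
        self._servers: Dict[str, ToolTable] = {}
        self.log: List[str] = []           # in-memory log, for illustration only

    def register(self, server_name: str, tools: ToolTable) -> None:
        """Place a set of tools behind the gateway under one server name."""
        self._servers[server_name] = tools

    def call(self, token: str, server: str, tool: str, **params: Any) -> Any:
        # 1. Authenticate: unknown tokens are rejected before any routing.
        caller = self._tokens.get(token)
        if caller is None:
            raise PermissionError("unknown token")
        # 2. Route: the server and tool must both be registered.
        if server not in self._servers or tool not in self._servers[server]:
            raise LookupError(f"no such tool: {server}/{tool}")
        # 3. Log, then forward the call to the backend.
        self.log.append(f"{caller} -> {server}/{tool}")
        return self._servers[server][tool](**params)


gateway = Gateway(tokens={"secret-a": "agent-a"})
gateway.register(
    "tickets",
    {"count_open": lambda project: {"project": project, "open": 3}},
)
result = gateway.call("secret-a", "tickets", "count_open", project="web")
```

The point of the shape is that authentication and logging live in one place, so adding a second or third server behind `register` does not mean reimplementing either.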

Why "it works locally" isn't the same as "it's ready for production"

A local MCP setup usually has no authentication beyond a config file, no logging beyond stdout, and no separation between what the agent can technically do and what it should be allowed to do. That's fine for one developer testing one tool.

It stops being fine the moment more than one person, one agent, or one environment is involved. At that point, four questions need real answers:

Who is allowed to call which tools? Without role-based access control, every agent that connects to the gateway can call every tool behind it. That's a problem the moment one of those tools can delete records, send emails, or push code.

What did an agent actually do, and when? If an agent takes an unexpected action, "check the logs" needs to mean something. An audit trail that records every tool call, its parameters, and its result is the difference between debugging an incident in minutes and not being able to reconstruct it at all.

Can a tool be restricted without forking it? Sometimes a third-party MCP server exposes more than you want agents to use. Production setups need a way to create a restricted variant of a tool, limiting parameters or disabling specific actions, without modifying the original server's code.

What happens if an agent goes off-script? Sandboxing and short-lived, isolated execution environments contain the blast radius of a misbehaving agent or a prompt injection attack, so a single bad action doesn't cascade into the rest of the stack.
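
The first two questions, who may call what and what was actually called, can be sketched together. The example below is a simplified illustration, not any particular gateway's implementation: the `POLICY` table, the role names, the `server/tool` naming scheme, and the `audited_call` helper are all assumptions made for the sketch, and the in-memory list stands in for a real log store.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List, Set

# Hypothetical policy: role -> set of "server/tool" names the role may call.
POLICY: Dict[str, Set[str]] = {
    "reader": {"tickets/search"},
    "maintainer": {"tickets/search", "tickets/close"},
}

audit_log: List[str] = []  # stand-in for a database or log aggregation service


def is_allowed(role: str, tool: str) -> bool:
    """Deny by default: unknown roles and unlisted tools are rejected."""
    return tool in POLICY.get(role, set())


def audited_call(agent: str, role: str, tool: str, params: Dict[str, Any]) -> bool:
    """Check policy, then append one JSON line describing the attempt.

    Denied attempts are logged too, since they are often the most
    interesting entries when reconstructing an incident.
    """
    allowed = is_allowed(role, tool)
    audit_log.append(json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "role": role,
        "tool": tool,
        "params": params,
        "allowed": allowed,
    }, sort_keys=True))
    return allowed


audited_call("agent-a", "reader", "tickets/search", {"query": "login bug"})
audited_call("agent-a", "reader", "tickets/close", {"id": 42})
```

After these two calls the log holds two entries, the second marked as denied. A fuller record would also capture the tool's result once the call returns; it is left out here to keep the sketch short.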

None of these are exotic requirements. They're the same operational basics that any internal API has had to deal with for years. MCP just brings them back into focus, because the "client" calling your tools is now an autonomous agent making its own decisions about what to call and when.
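
As with an internal API, restricting a tool without forking it can be as simple as a thin wrapper applied at the gateway. The sketch below assumes tools are plain callables taking keyword arguments; `restrict`, `records_tool`, and the parameter names are hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict, Set


def restrict(tool: Callable[..., Any],
             allowed_values: Dict[str, Set[Any]]) -> Callable[..., Any]:
    """Return a wrapper that rejects calls whose parameters fall outside
    the allowed values. The wrapped tool itself is left unmodified."""

    def restricted(**params: Any) -> Any:
        for name, value in params.items():
            # Only parameters listed in allowed_values are checked;
            # everything else passes through untouched.
            if name in allowed_values and value not in allowed_values[name]:
                raise PermissionError(f"{name}={value!r} is not permitted")
        return tool(**params)

    return restricted


# Hypothetical third-party tool that can both read and delete.
def records_tool(action: str, table: str) -> str:
    return f"{action} on {table}"


# Restricted variant: same tool, but only the "read" action is exposed.
read_only_records = restrict(records_tool, {"action": {"read"}})
```

Calling `read_only_records(action="read", table="users")` goes through to the original tool, while `action="delete"` raises `PermissionError` before the tool is ever invoked. One caveat of this simple approach: a restricted parameter with a permissive default in the underlying tool would need its default pinned as well, which this sketch does not handle.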

Self-hosted vs managed: what actually changes

Managed MCP gateway platforms exist for a reason: they hide most of the operational work behind a dashboard. But "hidden" doesn't mean "gone." Someone is still running the gateway, patching it, and deciding what data passes through a third party's infrastructure.

For teams with data residency requirements, contractual restrictions on where data can flow, or simply a preference for owning their stack, self-hosting an MCP gateway often makes more sense. The tradeoff has traditionally been operational burden: setting up the gateway, the MCP servers behind it, networking, and monitoring all becomes someone's part-time job.

That burden is what container tooling is designed to reduce. An MCP gateway and the MCP servers it routes to are, from an infrastructure point of view, just another set of containers with defined dependencies, networking rules, and environment variables. The same multi-container patterns used for web app stacks apply directly.

What a self-hosted MCP gateway stack looks like

A practical self-hosted MCP setup is typically composed of a small number of services working together:

A gateway container handles incoming connections from agents, authenticates them, and routes requests to the correct MCP server based on policy.

One or more MCP server containers each expose a specific set of tools: a database connector, a ticketing system integration, a documentation search tool, and so on.

A policy and logging layer stores access rules and writes an audit trail of every tool call, often backed by a small database or log aggregation service.

A network boundary restricts which containers can reach which internal resources, so an MCP server connecting to a ticketing system has no path to, say, a production database it has no business touching.

Defined as a Compose file, this is a handful of services with clear roles and explicit connections between them. The complexity isn't in any single piece. It's in getting all the pieces configured correctly together, and keeping them that way as MCP servers get added or updated.
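
The network boundary is the piece most easily misconfigured, so it helps to reason about it explicitly. The sketch below models the stack the way Compose attaches services to named networks, where two services can talk directly only if they share one. The service and network names are hypothetical, and the model ignores published ports and host networking.

```python
from typing import Dict, Set

# Hypothetical stack: service name -> networks it is attached to.
# Each MCP server gets its own network to the gateway, plus one network
# to the single internal resource it is meant to reach.
STACK: Dict[str, Set[str]] = {
    "gateway": {"agents", "tickets-front", "database-front", "audit"},
    "mcp-tickets": {"tickets-front", "ticketing"},
    "mcp-database": {"database-front", "data"},
    "ticketing-api": {"ticketing"},
    "production-db": {"data"},
    "audit-store": {"audit"},
}


def can_reach(a: str, b: str) -> bool:
    """Two services can talk directly only if they share a network."""
    return bool(STACK[a] & STACK[b])


def reachable_from(service: str) -> Set[str]:
    """List every other service this one can reach directly."""
    return {other for other in STACK
            if other != service and can_reach(service, other)}
```

With this layout, `can_reach("mcp-tickets", "production-db")` is false: the ticketing connector shares no network with the production database, which is the property described above. A check like this can run before deployment to catch a service that was attached to one network too many.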

Where try.direct fits in

This is precisely the kind of stack try.direct is built for: multiple containers that need to work together, deployed on infrastructure you control, without hand-assembling configuration from scratch every time.

Instead of provisioning a server, installing Docker, writing a Compose file from a half-finished tutorial, and debugging networking between containers, you deploy a pre-composed MCP gateway stack and reuse it as a unit. The gateway, the MCP servers behind it, and the network rules connecting them are defined once and deployed consistently, whether that's a single environment for testing or separate environments for different teams.

For organizations that need their AI agent tooling to stay on infrastructure they own, whether for compliance, data residency, or simply control, this turns "self-hosted MCP gateway" from a multi-week infrastructure project into something that can be stood up, modified, and redeployed in the time it takes to review a Compose file.

Frequently asked questions

Is MCP only useful for large enterprises?

No. The protocol itself is lightweight and works for a single developer connecting one agent to one tool. The gateway layer, with authentication, access control, and audit logging, becomes important once more than one person or one agent depends on the same tools.

Can a self-hosted MCP gateway connect to cloud-based AI models?

Yes. Self-hosting the gateway and the MCP servers behind it controls where your data and tool access live. It doesn't restrict which AI models or agent platforms connect to that gateway; the gateway is the boundary, not the model.

What's the difference between an MCP server and an MCP gateway?

An MCP server exposes a specific set of tools, for example a connector to a single database or ticketing system. A gateway sits in front of one or more MCP servers and handles shared concerns: authentication, routing, access policy, and logging across all of them.

Do I need Kubernetes to run an MCP gateway in production?

Not necessarily. Docker Compose is sufficient for many production deployments, particularly for small to mid-sized teams. Kubernetes adds value at larger scale or when workloads need to scale dynamically, but it isn't a requirement to have authentication, audit logging, and isolation in place.

The takeaway

The hard part of MCP was never the protocol itself. It's everything that surrounds it once the question shifts from "does this work" to "can I trust this in production." Self-hosting closes that gap without handing your data and tool access over to a third party. And with the right deployment approach, it doesn't have to cost you months of infrastructure work either.