Give your AI agent an inbox: a practical guide to email over MCP
Why connecting email to your AI agent is harder than it looks — and how the Model Context Protocol makes Gmail, Outlook, and IMAP a single, secure endpoint your agent can actually use.
Most "AI email" demos stop at a screenshot. The agent drafts a reply, everyone claps, and nobody asks the awkward question: how does the model actually reach a real inbox — securely, across providers, without you pasting an app password into a prompt?
This post walks through why that last mile is deceptively hard, and how the Model Context Protocol (MCP) turns Gmail, Outlook, Fastmail, and generic IMAP into a single endpoint your agent can call like any other tool.
The problem with "just give it my email"
Email looks like a solved problem. It is not — at least not for an autonomous agent. Three things get in the way:
- Auth is provider-specific. Gmail wants OAuth with the right scopes. Outlook wants Microsoft identity. Fastmail wants an app password. Each has its own token lifecycle and failure modes.
- Credentials are radioactive. You do not want a mailbox password sitting in a chat transcript, a log line, or a vector store. Ever.
- The "shape" of email is messy. MIME, threading, HTML vs. plain text, attachments, folders that are really labels. An agent needs a clean, predictable interface — not raw RFC 5322.
The naive fix — handing the model an IMAP library and your password — fails all three at once.
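To make the "messy shape" point concrete, here is a minimal sketch of what turning a raw RFC 5322 message into something an agent can consume involves, using only Python's standard `email` package. The sample message and the `parse_message` helper are illustrative assumptions, not part of any particular MCP server.

```python
from email import message_from_string
from email.policy import default

# A tiny made-up message with both a plain-text and an HTML alternative.
RAW = """\
From: Ada <ada@example.com>
To: you@example.com
Subject: Quarterly numbers
Message-ID: <1@example.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="b1"

--b1
Content-Type: text/plain; charset="utf-8"

Numbers attached.
--b1
Content-Type: text/html; charset="utf-8"

<p>Numbers <b>attached</b>.</p>
--b1--
"""


def parse_message(raw: str) -> dict:
    """Flatten a raw message into a small, predictable dictionary."""
    msg = message_from_string(raw, policy=default)
    # Prefer the plain-text part, falling back to HTML if that is all there is.
    body = msg.get_body(preferencelist=("plain", "html"))
    return {
        "from": str(msg["From"]),
        "subject": str(msg["Subject"]),
        "message_id": str(msg["Message-ID"]),
        "text": body.get_content().strip() if body else "",
        "attachments": [part.get_filename() for part in msg.iter_attachments()],
    }
```

Even this toy case has to choose between two body representations. Doing that once on the server means the agent only ever sees the flattened result.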
What MCP actually changes
MCP is a standard way to expose tools and resources to a language model. Instead of teaching every agent how to speak IMAP, you run one server that presents email as a small, well-typed set of tools:
{
  "tools": [
    "inbox_list",
    "email_read",
    "email_compose",
    "email_organize"
  ]
}
The agent never sees a password. It sees verbs. That single shift is what makes email safe to automate.
The best agent integrations are boring on purpose: a stable verb, a predictable response, and no secrets in the prompt.
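One way to picture "a small, well-typed set of tools" is a table of tool specs that rejects malformed calls before they reach a mailbox. The sketch below is an assumption-heavy illustration: the tool names and the list, read, send, and reply actions come from this post, while `ToolSpec`, `validate_call`, and the `inbox_id` argument are hypothetical. No actions are checked for email_organize because the post does not list them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    name: str
    description: str
    actions: frozenset = frozenset()  # allowed "action" values; empty means none are checked
    required: tuple = ()              # hypothetical required argument names


TOOLS = {
    spec.name: spec
    for spec in (
        ToolSpec("inbox_list", "List connected inboxes"),
        ToolSpec("email_read", "List or read messages",
                 frozenset({"list", "read"}), ("inbox_id",)),
        ToolSpec("email_compose", "Send or reply",
                 frozenset({"send", "reply"}), ("inbox_id",)),
        ToolSpec("email_organize", "Organize messages", required=("inbox_id",)),
    )
}


def validate_call(name: str, args: dict) -> list[str]:
    """Return a list of problems with a tool call; an empty list means it is well-formed."""
    spec = TOOLS.get(name)
    if spec is None:
        return [f"unknown tool: {name}"]
    errors = [f"missing argument: {key}" for key in spec.required if key not in args]
    if spec.actions and args.get("action") not in spec.actions:
        errors.append(f"action must be one of {sorted(spec.actions)}")
    return errors
```

Note that nothing in a spec has a slot for a password or token, so a call cannot carry one even by accident.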
Discovery before action
A well-behaved email tool is stateless from the agent's point of view. The agent shouldn't hardcode a mailbox UUID. Instead it calls inbox_list first, discovers what's available, and then acts:
- inbox_list → returns connected accounts and their capabilities
- email_read (action list) → paginated, newest-first, per inbox
- email_read (action read) → the parsed body, sender, and metadata for one message
- email_compose (action send) → compose and dispatch, with the provider chosen for you
This discovery-first pattern is the difference between a demo and something you'd trust on a Tuesday afternoon.
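The discovery-first pattern can be sketched against an in-memory stand-in for the server. Everything here is hypothetical scaffolding: `call_tool` fakes the transport, and the `cursor`, `limit`, `next_cursor`, and `capabilities` fields are assumed names rather than a documented response format.

```python
# Fake server state. A real server would fetch this from the provider.
INBOXES = [
    {"id": "inbox-1", "provider": "gmail", "capabilities": ["read", "send"]},
]
MESSAGES = {
    "inbox-1": [  # newest first
        {"id": "m3", "subject": "Invoice overdue", "body": "Please pay."},
        {"id": "m2", "subject": "Standup notes", "body": "All green."},
        {"id": "m1", "subject": "Welcome", "body": "Hello."},
    ]
}


def call_tool(name, **args):
    """Stand-in for an MCP tool call; dispatches on tool name and action."""
    if name == "inbox_list":
        return {"inboxes": INBOXES}
    if name == "email_read":
        messages = MESSAGES[args["inbox_id"]]
        if args["action"] == "list":
            start, limit = args.get("cursor", 0), args.get("limit", 2)
            page = messages[start:start + limit]
            more = start + limit < len(messages)
            return {
                "messages": [{"id": m["id"], "subject": m["subject"]} for m in page],
                "next_cursor": start + limit if more else None,
            }
        if args["action"] == "read":
            return next(m for m in messages if m["id"] == args["message_id"])
    raise ValueError(f"unsupported call: {name} {args}")


def read_newest(capability="read"):
    """Discover an inbox first, then page through it and read the newest message."""
    inboxes = call_tool("inbox_list")["inboxes"]
    inbox = next(i for i in inboxes if capability in i["capabilities"])
    summaries, cursor = [], 0
    while cursor is not None:
        page = call_tool("email_read", action="list",
                         inbox_id=inbox["id"], cursor=cursor)
        summaries.extend(page["messages"])
        cursor = page["next_cursor"]
    newest = call_tool("email_read", action="read",
                       inbox_id=inbox["id"], message_id=summaries[0]["id"])
    return [s["id"] for s in summaries], newest
```

The inbox ID never appears as a literal in `read_newest`. It is always taken from the discovery response, which is the property that lets the same agent code work when accounts are added or removed.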
A concrete walk-through
Say you want Claude to triage your morning inbox. With email exposed over MCP, the conversation looks like this:
- The agent calls inbox_list and finds your work Gmail.
- It calls email_read with action list for the last 24 hours.
- For anything that looks urgent, it calls email_read with action read to get the full body.
- It drafts replies and calls email_compose with action reply, or just summarises and waits for your go-ahead.
No passwords crossed the wire. No provider-specific code lived in your prompt. The agent reasoned over verbs and got real work done.
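The triage steps above might look roughly like this in code. It is a sketch under stated assumptions: `call_tool` is a fake transport, the `since` and `message_id` arguments are hypothetical, and `looks_urgent` is a deliberately crude keyword check standing in for the model's own judgment. The sketch stops at drafting and does not send anything.

```python
from datetime import datetime, timedelta, timezone

# Fixed clock so the example is repeatable.
NOW = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)

FAKE_MAIL = [
    {"id": "m3", "subject": "URGENT: contract signature",
     "received": NOW - timedelta(hours=2), "body": "Please sign by noon."},
    {"id": "m2", "subject": "Lunch menu",
     "received": NOW - timedelta(hours=5), "body": "Tacos."},
    {"id": "m1", "subject": "urgent: old outage",
     "received": NOW - timedelta(days=3), "body": "Resolved."},
]


def call_tool(name, **args):
    """Stand-in for the MCP server; only the calls used below are supported."""
    if name == "inbox_list":
        return [{"id": "work-gmail", "provider": "gmail"}]
    if name == "email_read" and args["action"] == "list":
        return [{"id": m["id"], "subject": m["subject"]}
                for m in FAKE_MAIL if m["received"] >= args["since"]]
    if name == "email_read" and args["action"] == "read":
        return next(m for m in FAKE_MAIL if m["id"] == args["message_id"])
    raise ValueError(f"unsupported call: {name}")


def looks_urgent(subject: str) -> bool:
    return "urgent" in subject.lower()


def triage(now: datetime) -> list[dict]:
    """List the last 24 hours, read anything urgent, and return drafts for approval."""
    inbox = call_tool("inbox_list")[0]
    since = now - timedelta(hours=24)
    listing = call_tool("email_read", action="list",
                        inbox_id=inbox["id"], since=since)
    drafts = []
    for item in listing:
        if not looks_urgent(item["subject"]):
            continue
        full = call_tool("email_read", action="read",
                         inbox_id=inbox["id"], message_id=item["id"])
        drafts.append({"reply_to": full["id"],
                       "draft": f"Re: {full['subject']} (awaiting your go-ahead)"})
    return drafts
```

Only the full body of the urgent message is fetched. The listing step works from subjects alone, which keeps the number of read calls proportional to what actually needs attention.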
Security is the feature
The reason this works is that the boring parts are handled where they belong — on the server, not in the model:
- OAuth tokens are encrypted at rest and never returned to the client.
- Scopes are minimal: read and send, nothing more.
- The agent authenticates to the server, and the server authenticates to the provider. Two hops, one secret store.
If you remember one thing from this post, make it this: the model should hold capabilities, never credentials.
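A minimal sketch of the two-hop idea, assuming a hypothetical `EmailGateway` class: the agent holds only a session key, and the provider token stays inside the server. Encryption at rest and the real provider call are out of scope here and are replaced by a plain dictionary and a stub.

```python
import secrets


class EmailGateway:
    """Toy server: holds provider tokens and hands agents opaque session keys."""

    def __init__(self):
        # inbox_id -> provider token. A real server would encrypt this at rest.
        self._provider_tokens = {}
        # session key -> set of inbox IDs that session may touch.
        self._sessions = {}

    def connect_inbox(self, inbox_id: str, provider_token: str) -> None:
        self._provider_tokens[inbox_id] = provider_token

    def issue_session(self, inbox_ids: set) -> str:
        key = secrets.token_urlsafe(16)
        self._sessions[key] = set(inbox_ids)
        return key

    def call(self, session_key: str, tool: str, inbox_id: str) -> dict:
        # Hop 1: the agent authenticates to the server with its session key.
        if inbox_id not in self._sessions.get(session_key, set()):
            raise PermissionError("session is not allowed to use this inbox")
        # Hop 2: the server authenticates to the provider with the stored token.
        token = self._provider_tokens[inbox_id]
        return self._fetch_from_provider(token, tool, inbox_id)

    def _fetch_from_provider(self, token: str, tool: str, inbox_id: str) -> dict:
        # Stub for the provider request. The token is used here and never returned.
        return {"tool": tool, "inbox_id": inbox_id, "ok": bool(token)}
```

The response from `call` contains the tool result and nothing else, so there is no path by which the provider token can end up in a transcript or a log of agent output.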
Where to go from here
Connecting an inbox should take minutes, not an afternoon of OAuth debugging. If you want to try the discovery-first flow above, the quick-start guide gets you from zero to a live endpoint, and the provider matrix shows exactly which features light up for Gmail, Outlook, Fastmail, and friends.
Give your agent verbs, not passwords — and let it get to work.