EngineeringMay 2, 20269 min read

We Built an MCP Server for U.S. Dental Provider Data. Here's What Agents Can Do With It.

ProviderSignal now ships an MCP server (Model Context Protocol) at mcp.providersignal.com. Four read-only tools, citation envelope on every response, two billing rails (Bearer subscription or per-call USDC via x402). This is the architectural follow-up to our May 1 post on agent-callable healthcare data — what we promised, what we shipped, what we changed, and how an AI agent integrates in five minutes.

On May 1 we published an opinion piece on what healthcare data APIs need to do when the buyer is an AI agent rather than a human procurement cycle. It made four architectural commitments: structured citations, per-query billing rails, a free evaluation tier, and presence in agent-discovery catalogs.

One day later, all four are live. This post is the implementation report. If you operate an AI agent and want to query U.S. dental provider data right now, the fastest path is the discovery manifest at mcp.providersignal.com/.well-known/mcp.json. If you're a healthcare data builder thinking about how to ship something similar, the rest of this post is the architecture.

What is an MCP server?

The Model Context Protocol (MCP) is an open spec from Anthropic that standardizes how AI agents discover and call tools on a remote service. An agent — Claude.ai, Cursor, ChatGPT, custom LLM agents — connects to an MCP endpoint over HTTPS, fetches a list of available tools, and invokes them mid-conversation on the user's behalf. The protocol uses JSON-RPC 2.0 over a single Streamable HTTP endpoint.

An MCP server is the remote side of that protocol. We run one at mcp.providersignal.com. It exposes four read-only tools that wrap the same underlying API the ProviderSignal dashboard uses, with the same citation envelope on every response.

The four tools an agent can call

Every tool returns a citation envelope: source attribution per data table touched, last-refresh timestamps, schema versions, license posture, a request ID, and a 0-100 confidence score where inference is involved.

lookup_provider_by_npi(npi) — Full provider record by National Provider Identifier. Returns name, taxonomy, practice address, license metadata (status, dates, school, graduation year), DSO affiliation flag, OIG-exclusion flag, lat/lon, CMS Medicare-claims flag, and ~30 fields total. Source: NPPES + 38+ state dental boards + CMS Medicare Part B + OIG LEIE federal exclusions.
search_providers(filters) — Filtered list with pagination. Filters compose with AND: state, city, ZIP, last-name fragment, normalized license status. Max per_page is 100. Returns the same envelope shape, with the data array carrying compact per-row records.
get_dso_affiliation(npi) — Dental Service Organization affiliation lookup. Returns inferred parent brand, signal type (parent organization name match / multi-location practice / address cascade / shared-phone clustering), confidence score, cluster size, and a sibling NPI sample. This is proprietary inference an agent can't replicate from raw NPPES.
get_territory_summary(state | zip) — Aggregate market intelligence for a geography. Returns total provider count, DSO penetration share, retirement- risk-55+ count, license-status breakdown, and organization count. Useful for territory sizing and account planning.

How does an AI agent connect?

For Claude.ai users: Settings → Connectors → Add custom connector → URL https://mcp.providersignal.com/mcp + Bearer token (a ps_live_* API key from the ProviderSignal dashboard). Claude probes the discovery manifest, registers the four tools, and lists them as available actions in any conversation.

For Cursor and ChatGPT custom GPT actions: same URL, same Bearer token, paste into each client's custom-connector interface.

For custom agent code: open a POST stream to /mcpwith Authorization: Bearer ps_live_..., send initialize then tools/list, then tools/call per the MCP spec.

What does a response look like?

A call to lookup_provider_by_npi returns a JSON-RPC result whose content[0].text contains the citation envelope as a JSON string. Inside the envelope for NPI 1376155820 (a real Texas DDS):

data.npi, data.first_name, data.practice_address, data.license_status, etc. (~30 fields)
meta.source_attribution: array of {table, last_refresh, schema_version, license} for every data source touched (npi / tsbde / flmqa / ca_dental / oig_leie etc.)
meta.confidence.provider_record: {score: 66, tier: "Medium", method: "completeness+multi_source+freshness", breakdown: {...}}
meta.request: {id, endpoint, billed_credits, billing_method} — the agent learns exactly what it paid for.

Hallucination tolerance for an agent making decisions on healthcare data is zero. The envelope is verbose because the verbosity is the product. An agent that surfaces a provider flagged DSO can show the user which signal fired and against which underlying table. That's the difference between "trust me" and "here's the citation chain."

How much does an agent pay per call?

Two billing rails. The agent picks based on use case and what its framework supports.

Rail 1: Bearer-token subscription (default for MCP today)

The agent presents an existing ProviderSignal ps_live_* API key on every call. Per-call usage rides the subscription tier's rate limit (Pro: 500 req/min, Team / Growth: 200/min, Enterprise: higher). Monthly cost is the subscription. This is the recommended path for high-volume agents and for any agent whose framework doesn't yet sign payment challenges in MCP tool calls.

Rail 2: Per-call USDC via x402 (advertised in the manifest)

The agent skips the subscription entirely and calls https://providersignal.com/api/v1/agent/* directly. With no X-PAYMENT header, the response is a 402 Payment Required carrying explicit PaymentRequirements: price, USDC asset address, receiving wallet. The agent signs a USDC transfer on Base mainnet and retries with X-PAYMENT. Coinbase's x402 facilitator verifies and settles, then the gateway forwards to the same data layer. Per-call price ladder:

/lookup-by-npi — $0.10 per call
/search — $0.25 per call
/dso/affiliation — $1.00 per call
/scoring — $1.00 per call
/license/events/historical — $1.50 per call
/territory/rollup — $2.50 per call

Moat-tier endpoints (DSO inference, scoring, territory rollup, license-event history) are priced 10-25× the basic NPI lookup because they encode work the agent can't replicate from raw NPPES. The discovery manifest at /.well-known/mcp.json exposes both rails so an MCP-aware client can show the user both options.

What stops abuse?

Eight layers of defense in front of any data access:

IP rate limit: 60 req/min per cf-connecting-ip via Upstash sliding window. Anonymous spam stops here.
Bearer prefix check: regex match on Bearer \S+. Cheap local gate.
Body size cap: 16 KB max Content-Length. Prevents JSON-parse CPU DoS.
JSON-RPC 2.0 parse + per-tool input validation: NPI must match /^\d{10}$/, state /^[A-Z]{2}$/, ZIP /^\d{5}(-\d{4})?$/, etc. Validation errors return JSON-RPC -32602 without hitting the database.
Sidecar shared secret: constant-time compare on a 64-char hex value the main API only accepts from this MCP Worker.
Billing-method whitelist: must be subscription or per_query.
API key validation: SHA-256 hash lookup against the api_keys table; expired or inactive keys reject.
Per-key rate limit applied via Upstash on the validated key.

What an agent cannot do

Every commitment we made on May 1 about being a careful host for agent traffic shows up here as a hard limit:

No writes. All four tools are read-only. No mutation through MCP.
No bulk export. per_page capped at 100 — agents must paginate, and per-call billing applies to every page.
No PHI / no patient data. Provider business records only. Practice address, license, public CMS payments, federal exclusions. No claims data, no medical records.
No stale data without timestamps. Every response carries last_refresh per source so the agent decides whether to use or re-fetch.
No anonymous access. Either valid ps_live_* key or a signed USDC payment. Pick one.

What changed from the May 1 post

One commitment changed. The May 1 post named Stripe's Machine Payments Protocol (MPP) as the per-query billing rail. Two facts surfaced after publication:

MPP at this stage is preview-API-only with materially less adoption than x402, which had already crossed 119 million transactions by Q1 2026 with Coinbase, Cloudflare, and AWS endorsement.
More importantly, Stripe MPP currently excludes merchants operating in Texas regardless of buyer location. Braman Analytics LLC (the entity behind ProviderSignal) is a Texas LLC. MPP is not available to us until Stripe lifts that exclusion.

We shipped x402 instead. The architectural commitment (per-query billing alongside subscription) stands; the implementation is x402-only on the per-call rail. If Stripe lifts the Texas exclusion, the gateway architecture accepts MPP support as a parallel verifier without any change to the data plane.

The three-Worker architecture

Three separate Cloudflare Workers cooperate on every agent request:

providersignal (main) — Next.js app + dashboard + the actual data layer at /api/v1/agent/*.
providersignal-x402-gateway — intercepts providersignal.com/api/v1/agent/* via Cloudflare Routes, verifies USDC payments, forwards to the main Worker via service binding (zero public-internet hop).
providersignal-mcp — the MCP server at mcp.providersignal.com. JSON-RPC dispatcher that translates tool calls into HTTP forwards against the same data layer, with Bearer-token auth.

All three share a shared secret so the main Worker can verify which sidecar forwarded a request. Both sidecars share the Upstash Redis instance for IP-based rate limiting. This pattern (thin protocol-translation Workers in front of a single data layer) is the "agent-callable layer in front of the existing API" commitment from May 1 implemented literally.

How agents discover this

The .well-known/mcp.json manifest is the 2026 MCP-discovery convention. Claude.ai connectors, Cursor, ChatGPT custom GPT actions, mcp.so, and the Glama Registry all probe this URL on connect. Ours advertises:

Server name, version, protocol version (2025-06-18)
Transport: streamable-http at https://mcp.providersignal.com/mcp
Auth: bearer (with link to dashboard for key issuance)
alternateBilling: x402 protocol details, full per-endpoint price ladder, OpenAPI spec link
Tools: name + description for all four (full inputSchema served via the JSON-RPC tools/list method)

This makes ProviderSignal addressable in agent registries the way the App Store made apps addressable in 2008. The structural ranking advantage of being listed early — when fewer healthcare data providers have made the architectural investment — compounds as agent volume scales.

What comes next

Native x402 in the MCP path lands in v1.1 once major MCP clients (Claude.ai connectors, Cursor) ship the tool-call payment-challenge loop. Today the MCP path is subscription-only because the clients can't yet present a 402 response in the middle of a tool call. The plumbing on our side is already in place — same x402-gateway, same PaymentRequirements, same Coinbase facilitator. We'll flip the manifest to auth: ["bearer", "x402"] when the first client supports it.

Catalog submissions to mcp.so and Glama Registry are next on the list. The x402 Foundation listing is already live via the gateway's Bazaar discovery extension.

For other healthcare data builders

The May 1 post argued that schema gravity — being the API other agents standardize against — is its own moat. The commitment we made was that we'd publish a reference implementation with enough detail for anyone in the healthcare data category to copy the pattern, not just the surface shape. That's now public:

Citation-envelope shape: providersignal.com/docs/fields
Agent payments + per-call price ladder + 402 spec: providersignal.com/docs/agent-payments
Free evaluation tier (100 queries/day, no auth, light data): providersignal.com/docs/free-tier
Full OpenAPI spec: providersignal.com/openapi.json
MCP discovery manifest: mcp.providersignal.com/.well-known/mcp.json

The May 1 post said: "the training data being written now decides" which healthcare data products are cited by name when a 2027 agent asks where the authoritative source is. We took our own advice: the actual implementation is shipped, the URLs are public, and the schemas are stable. If you're a builder in this category, the pattern is documented and ready to copy.

We find the change. You make the call.

Start a free 7-day trial and see your territory triggers in under 60 seconds. Cancel anytime.

Start 7-day free trial