AI voice agents are software. The public telephone network is infrastructure that has been built, regulated, and operated for over a century. Getting the first to work reliably on the second is not a software problem — it is a carrier infrastructure problem. And most organizations deploying AI voice agents do not realize they have made a carrier infrastructure decision until they try to change something.


The Gap Between the AI Model and the Phone Network

An AI voice model, whether from OpenAI, Google, xAI (Grok), Microsoft Azure, or a specialized platform, operates in the cloud. It processes audio, generates responses, and manages conversation logic. What it does not natively do is make and receive phone calls on the public switched telephone network (PSTN).

Applications reach the PSTN today largely over SIP (Session Initiation Protocol) trunks, and doing so requires carrier relationships, number provisioning, and call routing infrastructure, all of which sit entirely outside the AI model's architecture. Connecting an AI agent to a real phone number requires a carrier layer between the AI platform and the telephone network. That carrier layer is where most of the long-term consequences of an AI voice deployment get determined.


How Most AI Voice Agents Connect Today: Twilio Programmable Voice

The dominant approach in the AI voice developer ecosystem is Twilio Programmable Voice. Vapi, Bland, Retell, ElevenLabs, and most emerging AI voice platforms use Twilio as their PSTN layer. You provision a number inside Twilio, wire it to your AI platform through the API, and calls flow. The setup is fast and well documented, which is why it became the default.

For straightforward deployments — a single AI use case, one phone number, no integration with other voice applications — Twilio Programmable Voice works. The problem emerges when the deployment grows or the organization's voice environment becomes more complex.

Twilio offers two separate voice products: Programmable Voice for AI and application-layer voice, and Elastic SIP Trunking for UCaaS and contact center connectivity. These are not the same product and they do not share a routing layer. Moving a call from one to the other (for example, failing over from an AI agent to a live Teams agent) requires custom development. There is no native routing between the two products.

For organizations running AI voice alongside a contact center, Teams Phone, and an outbound dialer — which increasingly describes the enterprise voice environment — the absence of a unified routing layer is a real operational constraint.


Why Carrier-Layer Performance Matters for AI Voice

Regardless of which PSTN layer an organization uses, three infrastructure variables directly affect AI voice performance.

Latency. Conversation quality depends on the delay between the AI generating a response and that audio reaching the caller. The carrier path adds to that delay with every network hop and media relay between the AI platform and the caller, so low-latency routing is a baseline requirement for natural-sounding AI conversation.
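
As a rough illustration of how the carrier path figures into that delay, the sketch below adds up per-segment delays and reports the carrier's share. Every segment name and millisecond figure here is a placeholder assumption chosen for the example, not a measurement from any real network.

```python
# Hypothetical one-way delay per segment, in milliseconds.
# These figures are illustrative placeholders, not measurements.
SEGMENT_DELAY_MS = {
    "ai_platform_to_carrier": 30,
    "carrier_transit": 40,
    "last_mile_to_caller": 50,
}


def path_delay_ms(segments):
    """Sum the delay of every segment between the AI platform and the caller."""
    return sum(segments.values())


def carrier_share(segments, carrier_key="carrier_transit"):
    """Fraction of the total path delay attributable to the carrier segment."""
    return segments[carrier_key] / path_delay_ms(segments)


total = path_delay_ms(SEGMENT_DELAY_MS)
share = carrier_share(SEGMENT_DELAY_MS)
print(f"path delay: {total} ms, carrier share: {share:.0%}")
```

Substituting measured values for a real deployment would show whether the carrier segment is a meaningful part of the budget or a rounding error.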

Connection reliability. An AI voice agent for inbound order taking or appointment scheduling is only useful if calls connect. STIR/SHAKEN attestation, route quality, and number reputation affect inbound delivery just as they affect outbound answer rates.

Uptime. An AI voice agent handling the majority of inbound calls at peak hours is a core operational dependency, not a supplemental feature. The reliability standard is five nines: 99.999% availability, which allows roughly five minutes of downtime per year. Downtime during peak hours has a direct, calculable cost.
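
The arithmetic behind an availability target and the cost of an outage can be sketched in a few lines. The availability math is standard. The cost function is a simplified assumption (every call missed during the outage is worth a fixed amount), and the parameter names are hypothetical.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes, averaging in leap years


def allowed_downtime_minutes(availability_percent):
    """Yearly downtime budget, in minutes, for a given availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def downtime_cost(outage_minutes, calls_per_minute, value_per_call):
    """Simplified outage cost: assumes every call during the outage is lost."""
    return outage_minutes * calls_per_minute * value_per_call


for label, pct in [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]:
    print(f"{label} ({pct}%): {allowed_downtime_minutes(pct):.1f} min/year")
# three nines (99.9%): 526.0 min/year
# four nines (99.99%): 52.6 min/year
# five nines (99.999%): 5.3 min/year
```

As a usage example, `downtime_cost(10, 4, 25.0)` prices a ten-minute peak-hour outage at four calls per minute and 25.00 per call, which comes to 1000.0 under this model.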


The Architecture That Scales Beyond a Single Use Case

The alternative to Twilio Programmable Voice as a PSTN layer is an independent carrier infrastructure where numbers live in the network — not in any application — and routing is controlled at the carrier layer.

In this architecture, the AI platform connects to the carrier network as one of several applications. Teams Phone, a Five9 contact center, and an outbound dialer all connect to the same network. The routing layer, a session border controller (SBC), directs traffic to the right application based on defined rules. When an AI agent transfers to a live agent, that happens at the carrier layer without custom development. When the AI platform changes, numbers stay in the network and routing updates point at the new platform.
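
The idea of numbers living in the network while rules point at applications can be sketched as a small routing table. This is a conceptual model only, not the configuration syntax of any real SBC. The application names, the `Route` type, and both helper functions are hypothetical, and the phone numbers come from the fictional 555-01xx range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    primary: str           # application that normally receives the call
    fallbacks: tuple = ()  # applications to try, in order, if the primary is down


# Numbers are keyed in the network; each rule points at an application.
ROUTES = {
    "+15555550100": Route("ai-agent", ("teams-phone",)),
    "+15555550101": Route("contact-center"),
}


def select_destination(number, routes, healthy):
    """Return the first healthy application for a number, or None."""
    route = routes.get(number)
    if route is None:
        return None
    for app in (route.primary, *route.fallbacks):
        if app in healthy:
            return app
    return None


def repoint(routes, old_app, new_app):
    """Return new rules with old_app replaced by new_app; numbers are unchanged."""
    def swap(app):
        return new_app if app == old_app else app
    return {
        number: Route(swap(r.primary), tuple(swap(f) for f in r.fallbacks))
        for number, r in routes.items()
    }
```

In this model, `select_destination("+15555550100", ROUTES, {"teams-phone"})` falls back to `"teams-phone"` when the AI agent is unavailable, and `repoint(ROUTES, "ai-agent", "new-ai-platform")` swaps the AI platform without touching a single number.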

It requires more setup than pointing Twilio at an AI platform. But for organizations where AI voice is a durable operational capability rather than a pilot project, it is the architecture that does not need to be rebuilt every time something changes.


The Bottom Line

Twilio Programmable Voice is a reasonable starting point for AI voice PSTN connectivity. It is not a unified carrier infrastructure. For organizations that run AI voice alongside other voice applications, and that expect those applications to share routing, failover, and carrier-layer visibility, the gap between a CPaaS (communications platform as a service) API and a carrier-grade network is the gap between a working pilot and a scalable deployment.