A provider outage took down an AI feature of mine for forty minutes last year. OpenAI returned 429s, my code had a single prompt() call with no fallback, and every request in that window just failed. The fix used to be a pile of try/catch blocks. The Laravel AI SDK now does failover natively — here's the pattern I run in production.
Why one provider is a single point of failure
Most AI features start the same way: pick a model, write the prompt, ship it. That's fine until the provider has a bad day. Rate limits during a traffic spike, a regional outage, a billing hiccup that drains your credits — any of these turns a working feature into a wall of exceptions.
The reason this hurts so much is that the failure is total. There's no degraded mode, no slower-but-working path. One dependency is down and the whole feature is down with it. If you've read The Complete Guide to the Laravel AI SDK in Laravel 13, you already know the SDK abstracts every provider behind one prompt() call. That abstraction is exactly what makes failover cheap — the same prompt and the same response shape work across OpenAI, Anthropic, and Gemini, so swapping providers mid-request doesn't change anything downstream.
Building a fallback chain across providers
Instead of one provider, pass an array. The SDK tries the first and, if that call throws a failover-eligible exception, moves on to the next:
use App\Ai\Agents\SalesCoach;
use Laravel\Ai\Enums\Lab;

// Tries OpenAI first, fails over to Anthropic on a service interruption
$response = (new SalesCoach)->prompt(
    'Analyze this sales transcript...',
    provider: [Lab::OpenAI, Lab::Anthropic],
);
That's the whole feature. No try/catch, no retry loop. A plain list uses each provider's default model. When you want to pin a specific model per provider, pass an associative array keyed by the Lab enum's value (the enum cases themselves can't be used as PHP array keys):
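use App\Ai\Agents\SalesCoach;
use Laravel\Ai\Enums\Lab;

// Keys are the enum's backing values; each value pins the model for that provider
$response = (new SalesCoach)->prompt(
    'Analyze this sales transcript...',
    provider: [
        Lab::Gemini->value => 'gemini-3-flash-preview',
        Lab::DeepSeek->value => 'deepseek-v4-pro',
    ],
);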
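To see why the keys have to be the `->value` strings, here is a minimal sketch in plain PHP. It uses a stand-in backed enum (`DemoLab` is hypothetical and is not the SDK's `Lab`; its case values and the model names are placeholders) and shows that PHP rejects an enum case as an array offset while the backing string works.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the SDK's Lab enum; the backing values are assumptions.
enum DemoLab: string
{
    case Gemini = 'gemini';
    case DeepSeek = 'deepseek';
}

/**
 * Returns true when PHP accepts $key as an array offset.
 */
function isUsableAsArrayKey(mixed $key): bool
{
    try {
        $probe = [];
        $probe[$key] = 'placeholder';

        return true;
    } catch (TypeError) {
        // Objects, including enum cases, are not valid array offsets.
        return false;
    }
}

// A provider => model map keyed by each case's backing string.
$chain = [
    DemoLab::Gemini->value => 'model-a',
    DemoLab::DeepSeek->value => 'model-b',
];
```

Because the keys are plain strings, the order of the map is still the order in which the providers are listed.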
Order matters. I keep a cheap, fast model as the primary and a reliable, possibly pricier one as the backup — the fallback only runs when the primary is genuinely unavailable, so you pay the premium rarely. If you'd rather not repeat the chain at every call site, set a default on the agent class itself with the #[Provider] attribute:
use Laravel\Ai\Attributes\Provider;
use Laravel\Ai\Contracts\Agent;
use Laravel\Ai\Enums\Lab;
use Laravel\Ai\Promptable;

#[Provider(Lab::Anthropic)]
class SalesCoach implements Agent
{
    use Promptable;

    public function instructions(): string
    {
        return 'You are a sales coach analyzing transcripts.';
    }
}
Each provider in the chain can also receive its own tuning through the providerOptions method, which the SDK calls with whichever provider is currently serving the request:
use Laravel\Ai\Enums\Lab;

public function providerOptions(Lab|string $provider): array
{
    return match ($provider) {
        Lab::OpenAI => ['reasoning' => ['effort' => 'low']],
        Lab::Anthropic => ['thinking' => ['budget_tokens' => 1024]],
        default => [],
    };
}
Classifying retryable vs fatal errors
This is the part people miss. Failover is not a catch-all. It only triggers when a FailoverableException is thrown — specifically a rate limit (RateLimitedException), an overloaded or unavailable provider (ProviderOverloadedException), or insufficient credits (InsufficientCreditsException). Ordinary errors — a malformed request, a validation failure, a bad API key — do not trigger failover, and that's correct. If your prompt is broken, sending it to a second provider just burns money to get the same error twice.
The distinction the SDK draws is the same one I'd draw by hand: transient infrastructure problems are retryable, request-level problems are fatal. The one gap worth knowing is that failover switches providers — it does not retry the same provider with back-off. For a brief blip on your primary, you may want both: a short retry against the primary, then failover. That retry-with-back-off logic belongs at the queue layer, and Laravel job middleware for rate-limiting and back-off is where I put it, so the AI call stays declarative and the resilience policy lives in one place. If the same limiter throttles your inbound traffic, fine-grained rate limiting on Laravel API routes keeps you from hammering the provider before failover even gets a chance.
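The combined policy described above (a short retry against the current provider, then failover, with request-level errors treated as fatal) can be sketched in plain PHP. This is an illustration only and not how the SDK is implemented: `TransientProviderError`, `DemoRateLimited`, and `callWithRetryThenFailover` are hypothetical names, and each provider is modelled as a simple callable.

```php
<?php

declare(strict_types=1);

// Hypothetical marker for transient, provider-side failures
// (a stand-in for the SDK's failover-eligible exceptions).
interface TransientProviderError extends Throwable
{
}

final class DemoRateLimited extends RuntimeException implements TransientProviderError
{
}

/**
 * Try each provider in order. Transient errors get a few retries with
 * exponential back-off before moving on to the next provider; any other
 * exception is treated as fatal and propagates immediately.
 *
 * @param array<string, callable(): string> $providers name => call, in priority order
 * @param (callable(int): void)|null $sleep receives the delay in milliseconds
 * @return array{provider: string, result: string}
 */
function callWithRetryThenFailover(
    array $providers,
    int $retriesPerProvider = 2,
    int $baseDelayMs = 200,
    ?callable $sleep = null,
): array {
    $sleep ??= static function (int $ms): void {
        usleep($ms * 1000);
    };
    $lastError = null;

    foreach ($providers as $name => $call) {
        for ($attempt = 0; $attempt <= $retriesPerProvider; $attempt++) {
            try {
                return ['provider' => $name, 'result' => $call()];
            } catch (TransientProviderError $e) {
                $lastError = $e;

                // Back off before retrying the same provider; after the
                // final attempt, fall through to the next provider instead.
                if ($attempt < $retriesPerProvider) {
                    $sleep($baseDelayMs * (2 ** $attempt));
                }
            }
        }
    }

    throw new RuntimeException('All providers failed', 0, $lastError);
}
```

In a real application the retry half of this would sit in job middleware rather than inline; the sketch only makes the ordering of the two mechanisms concrete. The injectable `$sleep` callable is there so the back-off can be observed in a test without actually waiting.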
Logging and monitoring failover rate and cost
Silent failover is dangerous. If your primary is failing 30% of requests and the backup quietly absorbs them, the feature looks healthy while your spend creeps up and a real problem goes unnoticed. You want to know every time the chain falls through.
The SDK dispatches an AgentPrompted event after each completed prompt. Listen for it and record which provider actually answered:
use Illuminate\Support\Facades\Event;
use Illuminate\Support\Facades\Log;
use Laravel\Ai\Events\AgentPrompted;

Event::listen(function (AgentPrompted $event) {
    Log::channel('ai')->info('agent.prompted', [
        'agent' => $event->agent::class,
        'usage' => $event->response->usage,
    ]);
});
From there it's a metric. Compare the serving provider against your configured primary: every request the backup answered is a failover, and the ratio of those to total requests is your failover rate. A spike in that number means your primary is degraded, so alert on it. If you want full prompt-and-token tracing rather than a log line, wire it into a dedicated tool; I cover that in Langfuse observability for LLM calls in Laravel.
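The arithmetic is simple enough to sketch in plain PHP. The example assumes each log record carries the name of the provider that served the request as a string, which is a field you would need to capture yourself; `failoverRate`, `shouldAlert`, and the 5% default threshold are hypothetical choices, not SDK features.

```php
<?php

declare(strict_types=1);

/**
 * Share of completed requests answered by anything other than the primary.
 *
 * @param list<string> $servingProviders one entry per completed request
 */
function failoverRate(array $servingProviders, string $primary): float
{
    $total = count($servingProviders);

    // No traffic means nothing failed over; avoid dividing by zero.
    if ($total === 0) {
        return 0.0;
    }

    $failovers = count(array_filter(
        $servingProviders,
        static fn (string $provider): bool => $provider !== $primary,
    ));

    return $failovers / $total;
}

/**
 * True when the rate is above the alerting threshold (an assumed default of 5%).
 */
function shouldAlert(float $rate, float $threshold = 0.05): bool
{
    return $rate > $threshold;
}
```

Computing the rate over a sliding window (say, the last few minutes of records) rather than over all time keeps the alert responsive to a primary that has only just started failing.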
Gotchas and Edge Cases
A few things have bitten me or come close:
The backup needs to actually work. Failover is worthless if your second provider has no credits or an expired key. Test the chain by forcing the primary to fail — the SDK fakes make this easy in a feature test — and confirm the backup answers.
Per-provider model availability. When you pin models with the associative-array syntax, make sure the model exists on that provider. A bad model name is a request-level error, not a failover-eligible one, so it'll surface as a hard failure instead of rolling to the next link.
Response format is consistent, prompts are not always portable. Failover doesn't change your prompt or the response shape — that's the whole point of the unified API. But a prompt tuned for one model's quirks may underperform on another. Keep prompts model-neutral, and if you depend on structured responses, enforce them explicitly as in Laravel AI SDK structured output with JSON schema so the backup can't return a shape your code doesn't expect.
Cost asymmetry. A pricey backup is fine as an occasional safety net and painful as a daily workhorse. Watch the failover rate; if it's consistently high, fix the primary rather than letting the backup carry the load.
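The cost asymmetry can be put into numbers with a blended per-request cost: the primary's price weighted by the share of requests it serves, plus the backup's price weighted by the failover rate. The sketch below is plain PHP with a hypothetical function name, and it assumes a failed attempt on the primary is not billed.

```php
<?php

declare(strict_types=1);

/**
 * Expected cost per request for a two-provider chain.
 *
 * Assumes requests that fail over cost only the backup's price
 * (failed primary attempts are treated as free).
 */
function blendedCostPerRequest(float $primaryCost, float $backupCost, float $failoverRate): float
{
    if ($failoverRate < 0.0 || $failoverRate > 1.0) {
        throw new InvalidArgumentException('Failover rate must be between 0 and 1.');
    }

    return (1.0 - $failoverRate) * $primaryCost + $failoverRate * $backupCost;
}
```

With illustrative prices of 1 unit for the primary and 10 units for the backup, a 2% failover rate costs about 1.18 units per request, while a 30% rate costs about 3.7. That gap is why a persistently high rate is a reason to fix the primary.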
Wrapping Up
Resilient AI in Laravel comes down to three moves: pass a fallback chain ordered by cost and reliability, trust the SDK to fail over only on genuine infrastructure errors, and log which provider answered so you can watch the rate. Start by adding a second provider to your highest-traffic agent today — it's a one-line change.
From here, pair failover with retry-and-back-off at the queue layer using Laravel job middleware, and if you're building retrieval-heavy features, building a RAG pipeline with the Laravel AI SDK and pgvector is the natural next step.
FAQ
How do I switch AI providers at runtime in Laravel?
Pass the provider: argument when you call an agent's prompt() method. Give it a single Lab enum case to use one provider, or an array like [Lab::OpenAI, Lab::Anthropic] to define a failover chain. This overrides whatever default the agent declares with its #[Provider] attribute, so you can change providers per request without touching the agent class.
Can the Laravel AI SDK retry a failed request on another provider?
Yes. When you pass an array of providers, the SDK automatically retries the request against the next provider in the list if the current one throws a failover-eligible exception. It does not retry the same provider with back-off, though — that's a separate concern you handle at the queue or job-middleware layer.
How do I handle LLM rate limits gracefully in Laravel?
A rate limit from a provider raises a RateLimitedException, which is one of the exceptions that triggers failover. So a fallback chain handles rate limits by switching to your backup provider automatically. For high-volume background work, also throttle the calls with job middleware so you avoid hitting the limit in the first place.
What is the best fallback strategy for production AI features?
Order your chain by cost and reliability: a cheap, fast primary model followed by a reliable secondary that you're happy to pay more for occasionally. Keep the backup genuinely healthy — funded credits, valid keys — and log which provider serves each request so you can monitor failover rate and spend. If the failover rate stays high, fix the primary rather than leaning on the backup.
Does provider failover change my prompts or response format?
No. The Laravel AI SDK exposes a unified API, so the same prompt and the same response structure work across providers. Failover swaps the provider behind the scenes without altering your input or output shape. The caveat is that a prompt heavily tuned to one model's behaviour may produce slightly different quality on the backup, so keep prompts model-neutral.