Scaling Laravel queues in production: Horizon, Redis, and multi-server workers
Every Laravel queue tutorial walks through the same three steps: create a job, dispatch it, run queue:work. That gets you from zero to "it works on my machine" in minutes. But the first time traffic spikes and your queue depth climbs past a thousand jobs while workers silently die from memory leaks, you realise the tutorial left out the hard part. Scaling Laravel queues in production requires deliberate architecture decisions, proper Horizon configuration, idempotent job design, and a failure-handling strategy that doesn't rely on someone manually checking the failed_jobs table.
This guide covers everything I've learned running queues at scale — from choosing the right driver through multi-server Horizon deployments.
What you'll learn
- How to choose a queue driver and design a queue topology that separates concerns
- How to configure Horizon supervisors with auto-balancing for production workloads
- How to write jobs that are small, fast, idempotent, and safe to retry
- How to handle failures with dead-letter patterns, alerting, and observability
- How to scale Horizon horizontally across multiple servers
- How to test queued jobs, batches, and chains
Why queue architecture matters at scale
Most Laravel applications start with a single default queue and one worker. This works until it doesn't. The moment you have a slow PDF-generation job sitting ahead of a time-sensitive webhook-processing job, your users feel the latency. The webhook sits in line behind three minutes of PDF rendering.
The core problem is priority inversion. Without explicit queue topology, every job competes for the same worker pool. Heavy jobs starve lightweight ones. A single poisoned job that throws exceptions on every retry can clog the entire pipeline while burning through your $tries limit.
Beyond priority, there's the resource question. A job that calls an external API with a two-second timeout behaves very differently from a job that crunches a CSV in memory. The API job is I/O-bound and benefits from many concurrent workers. The CSV job is CPU-bound and will degrade server performance if you run too many in parallel.
Proper queue architecture solves both problems by isolating jobs into purpose-built queues with dedicated worker pools. Each pool gets its own concurrency limits, timeout thresholds, and retry strategies.
Choosing a queue driver for scaling Laravel queues
Laravel supports several queue drivers out of the box. In production, the realistic choices are Redis, Amazon SQS, and the database driver.
Redis is the best default for most teams. It's fast, supports atomic operations needed for unique jobs and rate limiting, and integrates directly with Horizon. If you're already running Redis for cache (and most Laravel apps are), the operational overhead is zero.
Amazon SQS makes sense when you want a fully managed queue backend with no infrastructure to maintain. It scales automatically and integrates with AWS Lambda for serverless job processing. The trade-off is that SQS doesn't support job priorities within a single queue — you need separate SQS queues and separate worker processes for each priority level. SQS is also incompatible with Horizon.
The database driver works for low-throughput applications or when adding Redis isn't an option. It uses your existing database, so there's nothing extra to deploy. The downside is performance — every job dispatch and pickup is a database query, and under load this competes with your application queries. I'd avoid it for anything processing more than a few hundred jobs per hour.
| Driver | Throughput | Horizon support | Operational overhead | Best for |
|---|---|---|---|---|
| Redis | High | Yes | Low (if already running) | Most Laravel apps |
| SQS | Very high | No | None (managed) | AWS-native, serverless |
| Database | Low | No | None | Low-volume, simple apps |
For this guide, I'll assume Redis — it's what Horizon requires, and Horizon is the tool that makes production queue management practical.
Designing your queue topology
A queue topology is the set of named queues your application uses and the rules for which jobs go where. The goal is to separate jobs by their characteristics: priority, resource usage, and failure tolerance.
Here's a topology I've used in production that works well for most SaaS applications:
// config/queue.php
'connections' => [
    'redis' => [
        'driver' => 'redis',
        'connection' => 'queue',
        'queue' => env('REDIS_QUEUE', 'default'),
        'retry_after' => 90,
        'block_for' => null,
        'after_commit' => true,
    ],
],
The after_commit option is important. Setting it to true ensures jobs are only dispatched after database transactions commit successfully. Without this, you can dispatch a job that references a database record that doesn't exist yet because the transaction hasn't committed — a classic race condition that's difficult to debug.
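If you'd rather not make transaction-aware dispatching global, the same guarantee is available per dispatch. A minimal sketch, reusing the ProcessOrder job that appears in the testing examples later in this guide (the Order attributes are placeholders):

```php
use Illuminate\Support\Facades\DB;

DB::transaction(function () {
    $order = Order::create(['status' => 'pending']); // placeholder attributes

    // Held until the surrounding transaction commits,
    // and discarded entirely if it rolls back
    ProcessOrder::dispatch($order)->afterCommit();
});
```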
Now define your queues by purpose:
// In your job classes, assign queues explicitly

// High priority: user-facing, time-sensitive
class ProcessStripeWebhook implements ShouldQueue
{
    public $queue = 'webhooks';
}

// Medium priority: user-initiated but tolerant of seconds of delay
class SendOrderConfirmation implements ShouldQueue
{
    public $queue = 'emails';
}

// Low priority: background processing, no user waiting
class GenerateMonthlyReport implements ShouldQueue
{
    public $queue = 'reports';
}

// Heavy: long-running, resource-intensive
class ProcessVideoUpload implements ShouldQueue
{
    public $queue = 'heavy';
}
This separation means a slow report generation can never block a webhook. Each queue gets its own worker pool with appropriate concurrency and timeout settings.
A note on the difference between connections and queues: a connection is the backend driver (Redis, SQS, database). Within a single connection, you can have many named queues. You don't need a separate Redis instance per queue — they all share the same Redis connection but operate as independent FIFO lists.
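This is also why a plain queue:work worker can drain several named queues from one connection. If you're not running Horizon yet, comma-separated queue names give you a crude priority order, since queues are polled left to right:

```shell
# Jobs on 'webhooks' are always picked up before 'emails',
# which are picked up before 'default'
php artisan queue:work redis --queue=webhooks,emails,default
```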
Horizon configuration deep-dive
Laravel Horizon turns queue management from guesswork into a dashboard with metrics. But the real value is in its code-driven configuration. Instead of managing Supervisor configs per server, you define your entire worker topology in config/horizon.php and deploy it like any other config change.
Here's a production-ready configuration that matches the queue topology above:
// config/horizon.php
'environments' => [
    'production' => [
        'supervisor-webhooks' => [
            'connection' => 'redis',
            'queue' => ['webhooks'],
            'balance' => 'auto',
            'autoScalingStrategy' => 'time',
            'minProcesses' => 2,
            'maxProcesses' => 10,
            'balanceMaxShift' => 2,
            'balanceCooldown' => 3,
            'memory' => 128,
            'tries' => 5,
            'timeout' => 30,
            'maxJobs' => 500,
            'maxTime' => 3600,
        ],
        'supervisor-emails' => [
            'connection' => 'redis',
            'queue' => ['emails'],
            'balance' => 'simple',
            'minProcesses' => 1,
            'maxProcesses' => 5,
            'memory' => 128,
            'tries' => 5,
            'timeout' => 30,
            'maxJobs' => 1000,
            'maxTime' => 3600,
        ],
        'supervisor-default' => [
            'connection' => 'redis',
            'queue' => ['default'],
            'balance' => 'auto',
            'autoScalingStrategy' => 'size',
            'minProcesses' => 1,
            'maxProcesses' => 8,
            'balanceMaxShift' => 1,
            'balanceCooldown' => 3,
            'memory' => 128,
            'tries' => 3,
            'timeout' => 60,
            'maxJobs' => 500,
            'maxTime' => 3600,
        ],
        'supervisor-heavy' => [
            'connection' => 'redis',
            'queue' => ['heavy', 'reports'],
            'balance' => false,
            'processes' => 2,
            'memory' => 512,
            'tries' => 1,
            'timeout' => 600,
            'maxJobs' => 50,
            'maxTime' => 3600,
        ],
    ],
],
Let me walk through the key decisions:
Balance strategies. The auto strategy dynamically scales worker processes based on workload. The autoScalingStrategy determines how it measures demand — time scales based on how long it would take to clear the queue, while size scales based on job count. For webhooks, time is better because you want low latency. For the default queue, size works fine because you care about throughput.
The simple strategy distributes workers evenly across queues within a supervisor. I use it for emails because email sending is predictable — roughly the same work per job.
Setting balance to false disables auto-scaling entirely. The heavy queue gets a fixed two processes because these jobs are resource-intensive — you don't want Horizon spinning up ten of them.
balanceMaxShift and balanceCooldown. These control how aggressively Horizon scales. balanceMaxShift is the maximum number of processes added or removed per balance cycle, and balanceCooldown is the number of seconds Horizon waits between cycles. For webhooks, I set balanceMaxShift to 2 for faster scaling. For the default queue, 1 is fine.
maxJobs and maxTime. These force worker recycling. A worker that's processed 500 jobs or run for an hour gracefully exits, and Horizon restarts it. This prevents memory leaks — PHP workers accumulate memory over time, and recycling is the production-proven fix.
Memory limits. The heavy queue gets 512MB because video processing and report generation need more memory. Everything else gets 128MB. If a worker exceeds its memory limit, Horizon kills and restarts it.
Writing Laravel queue jobs that scale
The way you design your jobs matters as much as your infrastructure. A well-designed job is small, fast, idempotent, and safe to retry.
Keep jobs small and focused
Each job should do one thing. If you need to process an order — charge the customer, send a confirmation email, update inventory, and notify the warehouse — that's four jobs, not one.
// Dispatch a chain of focused jobs
use Illuminate\Support\Facades\Bus;

Bus::chain([
    new ChargeCustomer($order),
    new SendOrderConfirmation($order),
    new UpdateInventory($order),
    new NotifyWarehouse($order),
])->onQueue('default')->dispatch();
If UpdateInventory fails, the chain stops there. The customer is already charged and notified — you only need to retry the inventory update, not the entire flow.
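Chains also accept a catch callback that fires when any job in the chain fails permanently. A sketch of the same chain with failure logging bolted on (the log message is arbitrary):

```php
use Illuminate\Support\Facades\Bus;
use Illuminate\Support\Facades\Log;

Bus::chain([
    new ChargeCustomer($order),
    new SendOrderConfirmation($order),
    new UpdateInventory($order),
    new NotifyWarehouse($order),
])->catch(function (Throwable $e) {
    // Runs once, when a job in the chain exhausts its retries
    Log::error('Order chain halted: '.$e->getMessage());
})->onQueue('default')->dispatch();
```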
Make every job idempotent
Idempotency means running a job twice produces the same result as running it once. This is critical because Laravel will retry failed jobs, and network issues can cause a job to execute but fail to acknowledge completion — leading to duplicate processing.
use Illuminate\Queue\Middleware\RateLimited;
use Illuminate\Queue\Middleware\WithoutOverlapping;
use Illuminate\Support\Facades\DB;

class ProcessPaymentWebhook implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public int $tries = 5;
    public int $timeout = 30;
    public bool $failOnTimeout = true;

    public function __construct(
        private string $stripeEventId,
        private array $payload,
    ) {}

    public function backoff(): array
    {
        return [10, 30, 60, 300, 600];
    }

    public function middleware(): array
    {
        return [
            new WithoutOverlapping($this->stripeEventId),
            (new RateLimited('stripe-webhooks'))->dontRelease(),
        ];
    }

    public function handle(): void
    {
        // Idempotency guard — skip if already processed
        if (ProcessedWebhook::where('event_id', $this->stripeEventId)->exists()) {
            return;
        }

        DB::transaction(function () {
            // Process the webhook payload
            $this->processPayment();

            // Record processing to prevent duplicate handling
            ProcessedWebhook::create([
                'event_id' => $this->stripeEventId,
                'processed_at' => now(),
            ]);
        });
    }
}
The key elements: a database check at the start to skip already-processed events, a WithoutOverlapping middleware to prevent concurrent processing of the same event, and a RateLimited middleware to respect Stripe's API limits. The backoff array gives exponential delays between retries — 10 seconds, then 30, then 60, then 5 minutes, then 10 minutes.
Batch large workloads
When you need to process thousands of items, don't dispatch thousands of individual jobs. Use Bus::batch() to group them into a trackable unit of work:
use Illuminate\Bus\Batch;
use Illuminate\Support\Facades\Bus;
use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Facades\Log;
use Illuminate\Support\Facades\Notification;

$batch = Bus::batch(
    $products->chunk(100)->map(
        fn ($chunk) => new ImportProductChunk($chunk->pluck('id')->toArray())
    )->toArray()
)->then(function (Batch $batch) {
    Notification::route('slack', config('services.slack.imports'))
        ->notify(new ImportCompleted($batch->totalJobs));
})->catch(function (Batch $batch, Throwable $e) {
    Log::error('Product import batch failed', [
        'batch_id' => $batch->id,
        'failed' => $batch->failedJobs,
        'error' => $e->getMessage(),
    ]);
})->finally(function (Batch $batch) {
    Cache::forget("import:lock:{$batch->id}");
})->name('product-import')
  ->onQueue('heavy')
  ->dispatch();
Notice the jobs receive an array of IDs rather than serialized model collections. Passing IDs instead of models keeps the serialized job payload small and avoids stale data issues — the job fetches fresh models from the database when it runs.
Failure handling and alerting
Jobs fail. Servers reboot. APIs return 500s. Redis runs out of memory. The question isn't whether failures happen — it's whether you find out before your users do.
The failed jobs table
Laravel stores failed jobs in the failed_jobs table by default. If your application doesn't have the table yet, create it, and then monitor it proactively rather than querying it only after something goes wrong:
php artisan make:queue-failed-table
php artisan migrate
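A simple way to turn that into active monitoring is a scheduled check on recent failures. A sketch assuming Laravel 11's routes/console.php scheduling style and an arbitrary threshold of ten failures per hour:

```php
// routes/console.php
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Log;
use Illuminate\Support\Facades\Schedule;

Schedule::call(function () {
    // Count jobs that failed within the last hour
    $recent = DB::table('failed_jobs')
        ->where('failed_at', '>=', now()->subHour())
        ->count();

    if ($recent > 10) {
        // Swap the log call for your alerting channel of choice
        Log::warning("Queue health: {$recent} jobs failed in the last hour");
    }
})->everyFifteenMinutes();
```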
Dead letter pattern
When a job exhausts all retries, it's moved to failed_jobs. But you can add a more proactive pattern — re-dispatch the job to a quarantine queue for manual review and immediate notification:
class ProcessPaymentWebhook implements ShouldQueue
{
    // ... previous configuration ...

    public function failed(Throwable $exception): void
    {
        Log::error('Payment webhook permanently failed', [
            'event_id' => $this->stripeEventId,
            'exception' => $exception->getMessage(),
            'trace' => $exception->getTraceAsString(),
        ]);

        // Dispatch to quarantine queue for manual review
        QuarantineFailedJob::dispatch(
            jobClass: static::class,
            payload: $this->payload,
            error: $exception->getMessage(),
        )->onQueue('dead-letter');

        // Alert the team immediately
        Notification::route('slack', config('services.slack.alerts'))
            ->notify(new JobPermanentlyFailed(
                job: static::class,
                error: $exception->getMessage(),
            ));
    }
}
Horizon notifications
Horizon has built-in notification support. Configure it in your HorizonServiceProvider:
// app/Providers/HorizonServiceProvider.php
use Laravel\Horizon\Horizon;

public function boot(): void
{
    parent::boot();

    Horizon::routeSlackNotificationsTo(
        config('services.slack.horizon'),
        '#queue-alerts'
    );

    Horizon::routeMailNotificationsTo('[email protected]');
}
The long-wait threshold itself lives in config/horizon.php rather than the service provider — set it per connection and queue:
// config/horizon.php
// Notify when a queue's wait time exceeds the threshold
'waits' => [
    'redis:webhooks' => 30, // seconds
],
This gives you alerts when Horizon detects long wait times — meaning jobs are piling up faster than workers process them.
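One gotcha: the dashboard's metrics graphs are fed by periodic snapshots, so make sure horizon:snapshot runs on your scheduler or the charts stay empty. Shown here in Laravel 11's routes/console.php; older versions register it in the console Kernel:

```php
// routes/console.php
use Illuminate\Support\Facades\Schedule;

// Capture queue metrics for the Horizon dashboard graphs
Schedule::command('horizon:snapshot')->everyFiveMinutes();
```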
Scaling Laravel queues to multiple servers
A single server running Horizon has a ceiling. Eventually you need horizontal scaling — multiple servers, each running their own Horizon instance, all sharing the same Redis backend.
The good news: this works out of the box. Horizon uses Redis to coordinate across instances. Each server's Horizon process registers its supervisors, and Redis ensures no job is processed twice. There's no special configuration needed beyond what you already have.
The setup
On each worker server:
- Install your Laravel application (same codebase, same config)
- Point REDIS_HOST to your shared Redis instance (ElastiCache, Redis Cloud, or your own server)
- Run Horizon via Supervisor:
; /etc/supervisor/conf.d/horizon.conf
[program:horizon]
process_name=%(program_name)s
command=php /var/www/html/artisan horizon
autostart=true
autorestart=true
user=www-data
redirect_stderr=true
stdout_logfile=/var/www/html/storage/logs/horizon.log
stopwaitsecs=3600
The stopwaitsecs=3600 is critical — it gives currently-running jobs up to one hour to finish before Supervisor forcefully kills the process. Set this to at least the timeout of your longest-running job.
Scaling the process counts
With two servers each running Horizon with maxProcesses=10 on a supervisor, you effectively have up to 20 workers for that queue. Horizon's auto-balancing operates independently per server instance, so each server scales its own worker count based on the shared queue depth.
A typical scaling pattern:
| Traffic level | Worker servers | Max processes per server | Total capacity |
|---|---|---|---|
| Low | 1 | 10 | 10 workers |
| Medium | 2 | 10 | 20 workers |
| High | 3–4 | 15 | 45–60 workers |
| Peak/burst | Auto-scaled | 20 | Dynamic |
Auto-scaling with cloud providers
If you're on AWS, you can run Horizon in ECS tasks and use target-tracking auto-scaling based on a CloudWatch metric that monitors your Redis queue depth. When the queue grows, ECS spins up more Horizon containers. When it shrinks, ECS scales back down.
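CloudWatch can't see Redis queue depth by itself, so you have to publish it. A sketch using the AWS SDK for PHP's putMetricData call, run from a scheduled command; the namespace and metric name are my own choices, and Queue::size() reports the number of pending jobs on a named queue:

```php
use Aws\CloudWatch\CloudWatchClient;
use Illuminate\Support\Facades\Queue;

$cloudwatch = new CloudWatchClient([
    'region' => 'us-east-1', // your region
    'version' => 'latest',
]);

// Publish the current depth of the 'default' queue;
// schedule this to run every minute
$cloudwatch->putMetricData([
    'Namespace' => 'App/Queues', // arbitrary custom namespace
    'MetricData' => [[
        'MetricName' => 'QueueDepth',
        'Dimensions' => [['Name' => 'Queue', 'Value' => 'default']],
        'Value' => Queue::size('default'),
        'Unit' => 'Count',
    ]],
]);
```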
On Laravel Forge, you can provision additional worker servers and enable Horizon daemons on each. Forge manages the Supervisor configuration for you.
Deployment considerations
When deploying new code across multiple Horizon servers, you need to restart workers so they pick up the changes. Run this during your deploy script:
php artisan horizon:terminate
Horizon will finish processing current jobs, then exit. Supervisor restarts it automatically with the new code. If you're running Horizon's pause/continue commands during deployment, pair them to avoid processing jobs with partially-deployed code:
php artisan horizon:pause
# Deploy code, run migrations
php artisan horizon:continue
php artisan horizon:terminate
Advanced patterns and edge cases
Rate limiting external APIs
When your jobs call third-party APIs, you need to respect rate limits. Laravel's RateLimited job middleware integrates with the framework's rate limiter:
// In a service provider
use Illuminate\Cache\RateLimiting\Limit;
use Illuminate\Support\Facades\RateLimiter;

RateLimiter::for('stripe-webhooks', function ($job) {
    return Limit::perMinute(100);
});

RateLimiter::for('email-sending', function ($job) {
    return Limit::perMinute(50)->by($job->user->id);
});
When a job hits the rate limit, it's released back to the queue and retried after the cooldown. The ->dontRelease() option on the middleware causes the job to fail instead of retrying — useful when you'd rather fail fast than queue up a backlog of rate-limited jobs.
Preventing stale model data
Jobs serialize model data at dispatch time. If the model changes between dispatch and processing, your job works with stale data. Use the $deleteWhenMissingModels property so jobs involving deleted models don't waste retry attempts:
class SendInvoice implements ShouldQueue
{
    public $deleteWhenMissingModels = true;

    public function handle(): void
    {
        // Refresh from database to get current state
        $this->order->refresh();

        // Now work with fresh data
    }
}
Handling long-running jobs gracefully
For jobs that legitimately take minutes to run — video processing, large data imports — set $failOnTimeout = true to ensure they fail cleanly rather than hanging:
use Illuminate\Bus\Batchable;

class ProcessVideoUpload implements ShouldQueue
{
    use Batchable; // required for the $this->batch() check below

    public int $timeout = 600; // 10 minutes
    public bool $failOnTimeout = true;
    public int $tries = 1; // Don't retry expensive operations

    public function handle(): void
    {
        // Check if batch was cancelled before doing heavy work
        if ($this->batch()?->cancelled()) {
            return;
        }

        // Long-running processing...
    }
}
Security: encrypting sensitive job payloads
If your jobs contain sensitive data, implement ShouldBeEncrypted to encrypt the serialized payload at rest in Redis:
use Illuminate\Contracts\Queue\ShouldBeEncrypted;

class ProcessSensitiveData implements ShouldQueue, ShouldBeEncrypted
{
    public function __construct(
        private string $ssn,
        private string $accountNumber,
    ) {}
}
Testing queued jobs
Testing queued jobs requires a different approach than testing synchronous code. Laravel provides Queue::fake() and Bus::fake() to intercept dispatched jobs without actually processing them.
Testing individual job dispatch
use Illuminate\Support\Facades\Queue;

test('order creation dispatches processing job', function () {
    Queue::fake();

    $order = Order::factory()->create();

    $this->postJson("/api/orders/{$order->id}/process");

    Queue::assertPushed(ProcessOrder::class, function ($job) use ($order) {
        return $job->order->id === $order->id;
    });

    Queue::assertPushedOn('default', ProcessOrder::class);
});
Testing job chains
use Illuminate\Support\Facades\Bus;

test('order processing dispatches full chain', function () {
    Bus::fake();

    ProcessOrderWorkflow::dispatch($this->order);

    Bus::assertChained([
        ChargeCustomer::class,
        SendOrderConfirmation::class,
        UpdateInventory::class,
        NotifyWarehouse::class,
    ]);
});
Testing batches
test('product import creates a batch', function () {
    Bus::fake();

    $products = Product::factory()->count(500)->create();

    ImportProducts::dispatch($products);

    Bus::assertBatched(function ($batch) {
        return $batch->jobs->count() === 5; // 500 products / 100 per chunk
    });
});
Testing the job itself
Don't only test that jobs are dispatched — test that they work correctly when executed:
test('payment webhook job processes payment and records event', function () {
    $event = StripeEvent::factory()->create();

    $job = new ProcessPaymentWebhook(
        stripeEventId: $event->id,
        payload: $event->payload,
    );

    $job->handle();

    expect(ProcessedWebhook::where('event_id', $event->id)->exists())->toBeTrue();
    expect($event->fresh()->status)->toBe('processed');
});

test('payment webhook job is idempotent', function () {
    $event = StripeEvent::factory()->create();
    ProcessedWebhook::create(['event_id' => $event->id, 'processed_at' => now()]);

    $job = new ProcessPaymentWebhook(
        stripeEventId: $event->id,
        payload: $event->payload,
    );

    $job->handle();

    // Should not process again
    expect(ProcessedWebhook::where('event_id', $event->id)->count())->toBe(1);
});
Common mistakes
Dispatching jobs inside database transactions. If the transaction rolls back, the job still sits in Redis waiting to process data that doesn't exist. Set after_commit => true in your queue connection config, or use ->afterCommit() on the dispatch call.
Not setting maxJobs and maxTime on workers. PHP processes leak memory over time. Without recycling, your workers gradually consume more RAM until they crash or the OOM killer takes them out. Always set both flags — maxJobs catches high-throughput scenarios, maxTime catches low-throughput memory creep.
Serializing entire models instead of IDs. When you pass a full Eloquent model to a job constructor, Laravel serializes it and re-fetches it by ID when the job runs. This usually works fine, but if the model has large relations loaded or custom attributes, the serialized payload balloons. Pass only the ID and fetch the model in handle() when you need tight control.
Using queue:restart instead of horizon:terminate. If you're running Horizon, use horizon:terminate for deployments. queue:restart kills workers immediately without letting current jobs finish, which can leave jobs in an inconsistent state.
Setting retry_after lower than timeout. If retry_after (in your queue config) is shorter than your job's timeout, Redis will release the job back to the queue while it's still being processed — causing duplicate execution. Always set retry_after to at least 1.5x your longest job timeout.
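Applied to the heavy queue from earlier, where jobs run up to 600 seconds, a dedicated connection might look like this. The connection name is an assumption; the point is the ratio between the two values:

```php
// config/queue.php
'redis-heavy' => [
    'driver' => 'redis',
    'connection' => 'queue',
    'queue' => 'heavy',
    'retry_after' => 900, // 1.5x the 600-second job timeout
    'block_for' => null,
    'after_commit' => true,
],
```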
Wrapping up
We built a complete queue architecture: Redis as the driver, separated queues by job characteristics, Horizon supervisors with auto-balancing, idempotent job design with retry strategies, dead-letter patterns for failure handling, and multi-server horizontal scaling. The code examples build a real pattern you can adapt — from the config/horizon.php supervisor definitions through to the Pest tests that verify your jobs behave correctly.
For next steps: set up Horizon monitoring if you haven't already, apply the idempotent webhook-processing pattern above to your own webhook jobs, and review your deployment pipeline to include horizon:terminate in every deploy.
FAQ
How do I scale Laravel queue workers in production?
Use Laravel Horizon with Redis and configure multiple supervisors — one per queue type. Set the balance strategy to auto so Horizon dynamically scales worker processes based on queue depth. For additional capacity, run Horizon on multiple servers pointing at the same Redis instance.
What is the best queue driver for Laravel in production?
Redis is the best choice for most Laravel applications. It's fast, supports Horizon's dashboard and auto-balancing, and handles the atomic operations needed for unique jobs and rate limiting. Use SQS if you're fully invested in AWS and don't need Horizon. Avoid the database driver for anything beyond low-volume workloads.
How do I prevent duplicate job processing in Laravel?
Implement idempotency guards inside your job's handle() method — check a database flag before processing and set it within a transaction. Use the WithoutOverlapping middleware to prevent concurrent execution of the same job, and implement ShouldBeUnique for jobs that should only be queued once at a time.
How do I handle failed jobs in Laravel Horizon?
Configure Horizon's Slack and email notifications in HorizonServiceProvider. Implement the failed() method on critical jobs to dispatch to a dead-letter queue and alert your team. Monitor the Horizon dashboard for failed job trends and set longWaitTimeThreshold to catch queue backlogs early.
How many queue workers do I need for my Laravel app?
Start with one worker process per CPU core on your queue server. Monitor queue wait times in Horizon — if jobs consistently wait more than a few seconds, add workers. Use Horizon's auto balance strategy with minProcesses and maxProcesses to let it scale dynamically within bounds you control.
What is the difference between queue connection and queue name in Laravel?
A connection is the backend driver — Redis, SQS, or database. It defines where jobs are stored. A queue name is a logical partition within a connection. You can have dozens of named queues on a single Redis connection. Workers subscribe to specific queue names and process jobs from those queues only.
How do I run Laravel Horizon on multiple servers?
Deploy the same Laravel application to each server, point REDIS_HOST to a shared Redis instance, and run Horizon via Supervisor on each server. Horizon coordinates through Redis — no extra configuration needed. Each server's worker processes register independently, and Redis ensures no job is processed twice.
How do I monitor Laravel queues in production?
Use Horizon's built-in dashboard for real-time metrics on throughput, wait time, and runtime. Configure Slack notifications for long wait times and failed jobs. For deeper observability, add a custom Laravel Pulse recorder to track job metrics alongside your application metrics, and send structured logs to your aggregation service.
Steven is a software engineer with a passion for building scalable web applications. He enjoys sharing his knowledge through articles and tutorials.