Laravel Octane RoadRunner in Production — Tuning max_jobs, Worker Memory, and Supervisor

Tune Laravel Octane RoadRunner for production: max_jobs, max_worker_memory, worker counts, plus a Supervisor config and zero-downtime reloads on deploy.

Steven Richardson
· 7 min read

You ran php artisan octane:start --server=roadrunner in development, watched response times drop by 5x, and shipped it. A week later the box OOMs at 3am because nothing was recycling workers, nothing was capping memory, and nothing was supervising the process. This is how to run Laravel Octane RoadRunner in production without that pager going off.

Profile your baseline Octane RoadRunner process

Before touching any knob, capture what your workers actually consume. Start Octane with your current settings, send realistic traffic at it, and watch the resident memory of each worker process — this baseline tells you whether you have a leak and how fast it compounds.

# Watch per-worker RSS (KB); workers are PHP processes running roadrunner-worker under the rr master
watch -n 5 "ps -eo pid,ppid,rss,etime,args --sort=-rss | grep -E '[r]r serve|[r]oadrunner-worker|[o]ctane'"

A healthy worker climbs for the first few dozen requests as caches warm, then plateaus. A worker whose RSS climbs linearly with request count has a leak. Note the plateau number — you'll size max_worker_memory from it. If you want richer numbers than ps gives you, Telescope, Debugbar, and Pulse each cover a different slice of this — Pulse is the one I leave running in production.
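To make "plateaus" versus "climbs" less of a judgement call, here is a minimal sketch that classifies a series of RSS samples for one worker. The function name rss_trend and the 5% tolerance are my own assumptions for illustration, not anything Octane or RoadRunner ships.

```shell
#!/usr/bin/env bash
# Classify RSS samples (KB, oldest first) taken from a single worker PID.
# Hypothetical helper: the 5% default tolerance is an assumed threshold.
rss_trend() {
  local tolerance_pct=${RSS_TOLERANCE_PCT:-5}
  local samples=("$@")
  local n=${#samples[@]}
  if (( n < 4 )); then
    echo "insufficient-data"
    return 0
  fi
  local mid=${samples[n / 2]}
  local last=${samples[n - 1]}
  if (( mid <= 0 )); then
    echo "insufficient-data"
    return 0
  fi
  # Growth across the second half of the window, as a whole-number percentage.
  local growth_pct=$(( (last - mid) * 100 / mid ))
  if (( growth_pct > tolerance_pct )); then
    echo "climbing"
  else
    echo "plateau"
  fi
}
```

Feed it values collected at a fixed interval, for example from ps -o rss= -p <pid>. Ignoring the first half of the window is deliberate: it skips the warm-up climb so only sustained growth counts as a leak signal.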

Set max_jobs to recycle workers before leaks compound

max_jobs is RoadRunner's name for the number of requests a worker handles before it's destroyed and replaced with a fresh one. Octane exposes it as the --max-requests flag and defaults to 500 — every worker recycles after 500 requests whether it's leaking or not. This is the same defensive pattern as recycling queue workers with --max-jobs and --max-time, and it exists for the same reason: something in your app will leak eventually, and recycling makes it not matter.

php artisan octane:start --server=roadrunner --max-requests=500

Stick with 500 unless your baseline says otherwise. Drop it to 250 if memory climbs fast and you can't find the leak today; raise it to 1000+ only if your boot is expensive and your plateau is flat. Setting it too low burns CPU on framework boots; too high lets drift compound until the box swaps.
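It helps to translate a --max-requests value into how often each worker actually reboots. The sketch below is a back-of-envelope estimate only: recycle_interval_secs is a hypothetical helper, and it assumes requests are spread evenly across workers, which real traffic only approximates.

```shell
#!/usr/bin/env bash
# Rough seconds between recycles for one worker.
# Assumption: traffic is distributed evenly across all workers.
recycle_interval_secs() {
  local max_requests=$1 workers=$2 req_per_sec=$3
  if (( workers <= 0 || req_per_sec <= 0 )); then
    echo "workers and req_per_sec must be positive" >&2
    return 1
  fi
  # Each worker sees req_per_sec / workers requests per second.
  echo $(( max_requests * workers / req_per_sec ))
}
```

At an assumed 40 requests per second across 8 workers, recycle_interval_secs 500 8 40 gives roughly 100 seconds between boots per worker, and dropping to 250 halves that. If the result is only a few seconds, the framework boot cost is probably dominating.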

Cap worker memory with max_worker_memory

max_jobs recycles on a schedule; max_worker_memory recycles on evidence. RoadRunner's supervisor watches each worker's memory and gracefully replaces any worker that crosses the threshold — in-flight requests finish first. This setting lives in a .rr.yaml file, not in config/octane.php, and you point Octane at it with --rr-config.

# .rr.yaml in your project root
version: "3"
http:
  pool:
    supervisor:
      # MB per worker — recycle any worker that crosses this line
      max_worker_memory: 256

php artisan octane:start --server=roadrunner --rr-config=.rr.yaml --max-requests=500

Size it from your baseline: take the healthy plateau, add 50% headroom, and make sure workers × max_worker_memory stays around a third of the container's memory limit. With 8 workers at 256 MB you need 2 GB of worst-case worker memory on a box with comfortable room beyond that — the OS, RoadRunner master, and page cache need the rest.
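The sizing arithmetic is easy to script so it gets redone whenever the baseline changes. This is a sketch under the numbers in this section; both function names are hypothetical, and the 170 MB plateau in the example is an assumed measurement, not a recommendation.

```shell
#!/usr/bin/env bash
# Plateau plus 50% headroom, rounded up to a whole MB.
worker_memory_cap_mb() {
  local plateau_mb=$1
  echo $(( (plateau_mb * 3 + 1) / 2 ))
}

# Worst case: every worker sitting at the cap at the same time.
worst_case_worker_mb() {
  local workers=$1 cap_mb=$2
  echo $(( workers * cap_mb ))
}
```

An assumed 170 MB plateau gives a 255 MB cap, which you would round to 256. worst_case_worker_mb 8 256 returns 2048, the 2 GB figure above, which you then compare against the memory limit of the box or container.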

Pick the right worker count for your cores

By default Octane starts one worker per CPU core, and for CPU-bound workloads that's correct — more workers than cores just adds context-switching. The reason to go higher is IO wait: if your requests spend most of their time waiting on the database or external APIs, workers sit idle and you can afford 2–4 per core.

php artisan octane:start --server=roadrunner --workers=8 --max-requests=500

For a typical 4-core API box, 8 workers is the sweet spot in my experience — enough to cover IO wait without blowing the RAM budget. Work backwards from memory: available RAM ÷ max_worker_memory is your hard ceiling on worker count, whatever the CPU maths says.
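The "CPU suggests, memory decides" rule can be written as taking the smaller of two numbers. A minimal sketch, assuming you already know how much RAM you are willing to hand to workers; octane_worker_count and its parameters are hypothetical names for illustration.

```shell
#!/usr/bin/env bash
# Worker count = min(cores * workers-per-core, RAM budget / per-worker cap).
octane_worker_count() {
  local cores=$1 per_core=$2 ram_budget_mb=$3 cap_mb=$4
  if (( cap_mb <= 0 )); then
    echo "cap_mb must be positive" >&2
    return 1
  fi
  local by_cpu=$(( cores * per_core ))
  local by_ram=$(( ram_budget_mb / cap_mb ))
  if (( by_ram < by_cpu )); then
    echo "$by_ram"
  else
    echo "$by_cpu"
  fi
}
```

On a 4-core box with 2 workers per core, a 4096 MB worker budget and a 256 MB cap, this returns 8. Raise the multiplier to 4 with only 2048 MB to spend and it still returns 8, because memory becomes the limit. On Linux, nproc can supply the core count.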

Supervise Octane RoadRunner with Supervisor in production

Octane's --watch flag is a development tool — it restarts on file changes, not on crashes. In production the RoadRunner process needs a real supervisor that restarts it when it dies, the same way Horizon needs Supervisor for queue workers. Create a program file and let Supervisor own the lifecycle.

; /etc/supervisor/conf.d/octane.conf
[program:octane]
process_name=%(program_name)s_%(process_num)02d
command=php /home/forge/example.com/artisan octane:start --server=roadrunner --host=127.0.0.1 --port=8000 --workers=8 --max-requests=500 --rr-config=/home/forge/example.com/.rr.yaml
autostart=true
autorestart=true
user=forge
redirect_stderr=true
stdout_logfile=/home/forge/example.com/storage/logs/octane.log
stopwaitsecs=60
stopsignal=SIGTERM

sudo supervisorctl reread
sudo supervisorctl update
sudo supervisorctl start octane:*

One Supervisor program, one RoadRunner master, eight PHP workers under it. Don't run multiple Supervisor processes of octane:start on the same port — scale the --workers count instead.

Configure graceful shutdown so deploys don't drop requests

stopsignal=SIGTERM tells RoadRunner to stop accepting new connections and let in-flight requests finish; stopwaitsecs is how long Supervisor waits before escalating to SIGKILL. The rule: stopwaitsecs must exceed Octane's max_execution_time, which defaults to 30 seconds in config/octane.php — otherwise your longest request gets killed mid-flight on every restart.

// config/octane.php
'max_execution_time' => 30, // stopwaitsecs must be > this
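
This rule is easy to break when someone edits one file and not the other, so a small guard in CI or the deploy script can catch it. The sketch below is one possible check, not part of Octane or Supervisor: read_stopwaitsecs and check_stop_window are hypothetical helpers, and you pass the max_execution_time value in yourself rather than having it parsed from PHP.

```shell
#!/usr/bin/env bash
# Print the first stopwaitsecs value found in a Supervisor program file.
read_stopwaitsecs() {
  awk -F= '/^[ \t]*stopwaitsecs[ \t]*=/ { gsub(/[ \t]/, "", $2); print $2; exit }' "$1"
}

# Succeed only when the stop window is strictly longer than the request limit.
check_stop_window() {
  local stopwaitsecs=$1 max_execution_time=$2
  if (( stopwaitsecs > max_execution_time )); then
    return 0
  fi
  echo "stopwaitsecs=${stopwaitsecs} does not exceed max_execution_time=${max_execution_time}" >&2
  return 1
}
```

A deploy step could then run check_stop_window "$(read_stopwaitsecs /etc/supervisor/conf.d/octane.conf)" 30 and abort on a non-zero exit.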

If you're running behind a load balancer or in Kubernetes, pair this with readiness probes that pull the pod out of rotation before shutdown — graceful shutdown only helps if traffic stops arriving first.

Reload workers on every deploy

Octane serves your app from memory, so a deploy that swaps the code on disk changes nothing until workers reboot. Add octane:reload to the end of your deploy script — it gracefully recycles every worker, which boots the new code without dropping the listener socket.

# Last lines of your deploy script
php artisan config:cache
php artisan route:cache
php artisan octane:reload

This is what makes zero-downtime deploys work with Octane: the master process keeps the port open while workers are replaced one by one. Forgetting this line is the most common Octane bug report — "my fix is deployed but production still shows the old behaviour".
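
A reload only reboots workers, so a deploy that touches .rr.yaml or the Supervisor program file needs a full restart instead. One way to choose automatically is to look at which files changed. This is a sketch with assumptions: octane_deploy_action is a hypothetical helper, and the Supervisor path pattern matches the /etc/supervisor/conf.d/octane.conf location used in this article, which only applies if that file is tracked in your release.

```shell
#!/usr/bin/env bash
# Print "restart" if any changed path needs a full process restart, else "reload".
octane_deploy_action() {
  local path
  for path in "$@"; do
    case "$path" in
      .rr.yaml|*/.rr.yaml|*/supervisor/conf.d/octane.conf)
        echo "restart"
        return 0
        ;;
    esac
  done
  echo "reload"
}
```

Pass it the changed paths, for example from git diff --name-only between the previous and new release, then run php artisan octane:reload or sudo supervisorctl restart octane:* depending on the answer.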

Gotchas and Edge Cases

octane:status won't show you memory. It only reports whether the server is running. For per-worker memory you need ps (above), Pulse, or Nightwatch — don't go looking for a --memory flag, it doesn't exist.

CLI flags beat .rr.yaml. Octane passes --workers and --max-requests as overrides, so if your YAML says num_workers: 16 and your Supervisor command says --workers=8, you get 8. Keep pool sizing in the command and reserve .rr.yaml for things Octane has no flag for, like max_worker_memory.

octane:reload doesn't reload everything. It reboots workers, so new code and cached config load fine — but changes to .rr.yaml or the Supervisor command itself need a full supervisorctl restart octane:*.

The leak is usually a singleton. Container singletons that capture the request, the container, or a config repository in their constructor hold references across requests. Octane resets framework state between requests but can't reset yours. If RSS climbs and you bound services as singletons, audit those first.

RoadRunner vs FrankenPHP is a workload question, not a benchmark question. RoadRunner's mature worker supervision (this whole article) makes it my pick for long-lived VMs running a traditional monolith. FrankenPHP's single-binary model and Caddy integration win for containerised, autoscaled deployments — I covered that side in Laravel Octane + FrankenPHP in production.

Wrapping Up

Recycle on schedule with --max-requests=500, recycle on evidence with max_worker_memory, size workers from cores and RAM, and put the whole thing under Supervisor with a SIGTERM window longer than your slowest request. If you're weighing the alternative runtime, read the FrankenPHP production deployment guide next — and make sure your deploy pipeline calls octane:reload before you ship any of this.

FAQ

What is max_jobs in Laravel Octane RoadRunner?

max_jobs is RoadRunner's pool setting for how many requests a worker handles before it's destroyed and replaced with a fresh process. Octane exposes it as the --max-requests flag on octane:start and defaults to 500. It exists to stop slow memory leaks from compounding — a recycled worker returns its memory to the OS.

How many Octane workers should I run in production?

Octane defaults to one worker per CPU core, which is right for CPU-bound workloads. For IO-heavy APIs that wait on databases or external services, 2–4 workers per core is reasonable — 8 workers on a 4-core box is a common production starting point. Your hard ceiling is memory: available RAM divided by max_worker_memory caps the count regardless of CPU.

How do I stop RoadRunner worker memory leaks?

Set max_worker_memory in .rr.yaml so RoadRunner gracefully recycles any worker that crosses the threshold, and keep --max-requests at 500 as a schedule-based backstop. Then find the actual leak: it's almost always a container singleton holding request-scoped state, or data accumulating in a static property. Recycling makes leaks survivable; fixing the singleton makes them gone.

Should I use Supervisor or systemd for Laravel Octane?

Either works — what matters is that something restarts the RoadRunner process when it dies, because octane:start alone has no crash recovery. Supervisor is the Laravel-ecosystem default, it's what Forge configures, and its stopwaitsecs/stopsignal options map cleanly onto Octane's graceful shutdown needs. Use systemd if your ops standard is systemd units; the settings translate directly.

How does RoadRunner compare to FrankenPHP for Laravel?

RoadRunner is a Go-based application server with mature worker pool supervision — fine-grained control over recycling, memory caps, and timeouts, which suits long-lived VMs running a monolith. FrankenPHP embeds PHP in Caddy as a single binary with automatic HTTPS, which suits containerised and autoscaled deployments. Pick by workload shape and operational model, not by hello-world benchmarks.

How do I deploy Laravel Octane with zero downtime?

Deploy the new code, rebuild your caches, then run php artisan octane:reload as the final step. The RoadRunner master keeps the listener socket open while workers are gracefully recycled onto the new code, so no requests are dropped. Make sure Supervisor's stopwaitsecs exceeds your max_execution_time so in-flight requests finish during any full restart.

Steven Richardson

CTO at Digitonic. Writing about Laravel, architecture, and the craft of leading software teams from the west coast of Scotland.