Trace Every LLM Call — Langfuse Observability for Prism in Laravel

Trace every Prism LLM call in Laravel with Langfuse — model, tokens, latency, errors — using one env var. Self-hostable, Octane-ready, twenty-minute install.

Steven Richardson
· 8 min read

The agent picked the wrong tool. The prompt drifted after the last deploy. A retry doubled the cost. And your logs say nothing useful because a 4KB prompt doesn't fit in a Sentry breadcrumb. That's the gap Langfuse fills, and you can now wire Langfuse observability into a Laravel Prism app in about twenty minutes, with no per-call code changes. This is the install I run on every production Prism app.

Sign up for Langfuse and grab your keys

Langfuse comes in three flavours: the managed EU and US cloud, a Docker Compose deployment you can run on your own infrastructure, and a single-binary local dev option. For most teams I start on Langfuse Cloud — it takes a minute, the free tier is generous, and you can switch to self-hosted later without changing application code. If you're working with regulated data or your legal team objects to prompts leaving the perimeter, jump straight to the self-hosted track in the Langfuse docs.

After you create a project, open Settings → API Keys and create a new key pair. You'll get a public key prefixed pk-lf- and a secret key prefixed sk-lf-. Copy both — the secret key is only shown once. While you're there, note the host: the EU region uses https://cloud.langfuse.com, US uses https://us.cloud.langfuse.com. You'll need this if you're not on the EU default.

Install the laravel-langfuse package

The official Langfuse SDKs are Python and JavaScript only. For PHP we've got axyr/laravel-langfuse, a clean Laravel package that talks to the Langfuse HTTP API and ships zero-code auto-instrumentation for Prism. It requires PHP 8.2+ and Laravel 12 or 13, so it slots into any modern Laravel codebase without dependency gymnastics.

composer require axyr/laravel-langfuse

The package registers a service provider, a facade (Axyr\Langfuse\LangfuseFacade), an HTTP middleware, and a set of Prism event listeners. Nothing else to wire up. If you want to override defaults like the batch size, flush interval, or scoped tags, publish the config:

php artisan vendor:publish --tag=langfuse-config

This is also the moment to make sure you've got Prism installed and configured. If you haven't, the Getting Started With Laravel Prism guide walks through the provider setup in under five minutes.

Wire up the Langfuse credentials in .env

Open your .env file and add the two keys you copied from the dashboard. The package picks them up automatically — no service-provider boot logic, no config edits required. If your project is on the US region or a self-hosted instance, set the host too.

LANGFUSE_PUBLIC_KEY=pk-lf-xxxxxxxxxxxxxxxxxxxx
LANGFUSE_SECRET_KEY=sk-lf-xxxxxxxxxxxxxxxxxxxx

# Optional — defaults to https://cloud.langfuse.com (EU)
LANGFUSE_HOST=https://us.cloud.langfuse.com

# Send events through a queue worker instead of inline HTTP calls.
# Recommended for any production app — the user shouldn't pay for telemetry latency.
LANGFUSE_QUEUE=langfuse

I always set LANGFUSE_QUEUE=langfuse in production. Without it, every Prism call ends with a synchronous HTTP request to Langfuse before the response goes back to the user — fine in dev, painful at scale. Add a langfuse queue connection in config/queue.php and let Horizon (or any queue worker) drain it.
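A minimal sketch of what that connection could look like, assuming LANGFUSE_QUEUE names a queue connection (as described above) and that Redis is already your queue backend. The keys mirror Laravel's stock redis connection; the connection name and queue name are simply the value used in the .env example, and the driver is an assumption you should adapt to your own setup.

```php
<?php

// config/queue.php (excerpt): only the added connection is shown.
// Assumption: a Redis-backed connection named "langfuse".
return [
    'connections' => [
        // ... your existing connections stay as they are ...

        'langfuse' => [
            'driver' => 'redis',
            'connection' => env('REDIS_QUEUE_CONNECTION', 'default'),
            'queue' => 'langfuse',   // jobs land on their own queue, away from user-facing work
            'retry_after' => 90,     // seconds before a stuck job is retried
            'block_for' => null,
            'after_commit' => false,
        ],
    ],
];
```

With a connection like this in place, a worker started with php artisan queue:work langfuse --queue=langfuse would drain it. Under Horizon you would add a supervisor for the same connection and queue instead.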

Make a Prism call and confirm the trace lands

This is the moment the integration earns its keep. Flip the auto-instrumentation flag, fire a normal Prism call, and a trace appears in the dashboard with no extra code. No wrapper, no Langfuse::trace(...) call, no facade import in your controllers.

LANGFUSE_PRISM_ENABLED=true

Now any Prism::text(), Prism::structured(), or Prism::stream() call is intercepted by the package's listener and captured as a Langfuse trace:

use Prism\Prism\Prism;
use Prism\Prism\Enums\Provider;

class SummariseArticle
{
    public function __invoke(string $body): string
    {
        $response = Prism::text()
            ->using(Provider::Anthropic, 'claude-sonnet-4-5')
            ->withSystemPrompt('Summarise the article in two sentences. No filler.')
            ->withPrompt($body)
            ->asText();

        return $response->text;
    }
}

Open the Langfuse dashboard, click Traces, and you'll see a fresh entry with the provider, model, full prompt, full response, token counts, latency, and any error stack. That's the whole loop — one env flip, every LLM call captured.

If you're stretching this to a multi-step agent, the same instrumentation captures each tool invocation as a span. There's a deeper walkthrough in Building LLM Tool-Calling Agents With Laravel Prism and the production-grade pattern in Production AI Agents in Laravel — both will now render with full Langfuse traces.

Add user id, prompt version, and feature flag metadata

Auto-traces with no context are useful for ten minutes and then frustrating forever. "Which user hit this 500?" "Did prompt v2 actually improve latency?" "Are flagged users seeing worse completions?" The answer is metadata, and the cleanest place to attach it is the package's request middleware. It creates a trace per HTTP request and nests every Prism call inside it.

Register LangfuseMiddleware on the routes that do AI work:

use Axyr\Langfuse\Http\Middleware\LangfuseMiddleware;

Route::middleware(['auth', LangfuseMiddleware::class])->group(function () {
    Route::post('/chat', ChatController::class)->name('chat');
    Route::post('/summarise', SummariseController::class)->name('summarise');
});

Now each request opens a parent trace with the route name and the authenticated user id. To add prompt version and a feature flag, append metadata before the Prism call:

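use Axyr\Langfuse\LangfuseFacade as Langfuse;
use Illuminate\Http\Request;
use Illuminate\Http\Response;
use Laravel\Pennant\Feature;

public function __invoke(Request $request): Response
{
    // validated() only exists on FormRequest, so validate inline on a plain Request.
    $validated = $request->validate(['body' => ['required', 'string']]);

    // Attach request-scoped context to the trace opened by LangfuseMiddleware.
    Langfuse::current()?->update([
        'metadata' => [
            'prompt_version' => 'summary.v3',
            'feature_flag' => Feature::active('claude-sonnet-rollout') ? 'sonnet' : 'haiku',
            'tenant_id' => $request->user()->current_team_id,
        ],
        'tags' => ['summarise', 'production'],
    ]);

    return new Response(app(SummariseArticle::class)($validated['body']));
}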

The trace now carries user, tenant, prompt version, and the flag bucket. In the dashboard you can filter by any of them, compare token usage across prompt versions, and answer the questions that previously cost an afternoon of grep.

Run Sentry alongside Langfuse without the OTEL headache

If you've read the Langfuse docs, you'll have hit the OpenTelemetry coexistence section — the warnings about global TracerProvider conflicts, isolated providers, span filtering. Good news: that whole rabbit hole does not apply here. The axyr/laravel-langfuse package talks to the Langfuse HTTP API directly. It does not register an OpenTelemetry span processor and it does not touch a global tracer. Sentry's Laravel SDK can do its thing and Langfuse stays in its lane.

In practice you run them side by side:

# Sentry — application errors and performance
SENTRY_LARAVEL_DSN=https://<public-key>@<sentry-host>/...

# Langfuse — LLM-specific observability
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_PRISM_ENABLED=true

That's it. No span filter, no shared collector. Sentry sees the HTTP request, the unhandled exception, and the slow query. Langfuse sees the prompt, the response, and the token spend. If you're self-hosting both, which is the right call for regulated environments, my guide to installing Sentry self-hosted on EC2 covers the Docker + Forge side, and Langfuse drops in next to it with its own Docker Compose stack.

If you eventually outgrow the Laravel package and want to ship traces through OpenTelemetry to Langfuse's OTEL endpoint, that's where the docs' existing OpenTelemetry setup guide becomes essential — at that point you do need the shared-pipeline patterns.

Watch for the common gotchas

A few real things I've tripped over running this in production. None are blockers, all are easier to fix before you ship.

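The package is at v0.1.0 as of May 2026. Keep the constraint in composer.json tight: "axyr/laravel-langfuse": "0.1.*" and "^0.1" both resolve to >=0.1.0 <0.2.0, because Composer's caret stops at the next minor version for 0.x releases, so either one blocks a surprise jump to 0.2. If you want to rule out patch-level surprises as well, pin the exact version with "0.1.0". The maintainer is iterating quickly and breaking changes between minor versions are entirely plausible. The same advice applies to Prism itself, which is still pre-1.0.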

Sync flushing is fine in dev and lethal in production. With LANGFUSE_QUEUE unset, each event triggers an HTTP call inline. Three Prism calls in one request means three round trips to Langfuse on top of three calls to your model provider. Set LANGFUSE_QUEUE=langfuse and make sure your worker is actually draining it — I've seen teams configure the queue and then forget to deploy a worker for it.

Long-running CLI commands and queued jobs don't always trigger the shutdown handler that flushes pending events. If you run a one-shot command that fires a few Prism calls and exits, call Langfuse::flush() explicitly before you return. On Octane and FrankenPHP the package's scoped bindings handle per-request isolation correctly, but bulk-import scripts deserve the explicit flush.
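A minimal sketch of that explicit flush inside a one-shot Artisan command. Langfuse::flush() and the facade class come from the text above. The command name, the Article model, and its body and summary columns are hypothetical, and the App\Actions namespace for SummariseArticle is an assumption.

```php
<?php

namespace App\Console\Commands;

use App\Actions\SummariseArticle; // assumed namespace for the class shown earlier
use App\Models\Article;           // hypothetical model with `body` and `summary` columns
use Axyr\Langfuse\LangfuseFacade as Langfuse;
use Illuminate\Console\Command;

class SummariseBacklog extends Command
{
    protected $signature = 'articles:summarise-backlog';

    protected $description = 'Summarise articles that have no summary yet';

    public function handle(SummariseArticle $summarise): int
    {
        try {
            // lazyById() pages by primary key, so rows are not skipped when the
            // loop changes the very column the query filters on.
            foreach (Article::query()->whereNull('summary')->lazyById() as $article) {
                // One Prism call per article, traced by the auto-instrumentation.
                $article->summary = $summarise($article->body);
                $article->save();
            }
        } finally {
            // Send any buffered events before the process exits, even if a call threw.
            Langfuse::flush();
        }

        return self::SUCCESS;
    }
}
```

Putting the flush in a finally block means a failed Prism call halfway through the batch still ships the traces collected up to that point, which is usually when you want them most.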

Finally, watch what you log. Langfuse stores the full prompt and full response by default. For anything that touches personal data, healthcare records, or proprietary content, self-host. The Docker Compose deployment is straightforward, and your data never leaves your VPC. Pair it with my notes on Telescope vs Debugbar vs Pulse when you're deciding which observability tools to expose internally.

Wrapping Up

You've got Langfuse traces for every Prism call, request-scoped metadata, and a sensible Sentry coexistence pattern. The next step is what you do with the data — start an evaluation loop, A/B test prompts via Langfuse's prompt management, or attach quality scores from a downstream check.

When you're ready to push further, the RAG pipeline with Laravel AI SDK and pgvector guide shows how the same tracing pattern captures embedding calls, vector search, and the final generation as one nested trace tree.

FAQ

What is Langfuse and how does it help Laravel AI apps?

Langfuse is an open-source observability platform purpose-built for LLM applications. It captures every prompt, response, token count, latency reading, and tool call as a structured trace, plus prompt versioning and evaluation scores. For Laravel AI apps it answers the questions a generic APM can't: which prompt is slow, which model costs most per request, and whether the agent picked the right tool. Self-host it or use the managed cloud.

Does Langfuse have an official PHP SDK?

No. The official Langfuse SDKs are Python and JavaScript/TypeScript only — both built on OpenTelemetry. The Laravel community gap is filled by axyr/laravel-langfuse, a package that talks to the Langfuse HTTP API directly and provides a clean facade plus zero-code auto-instrumentation for Prism, Laravel AI, and Neuron AI. It's MIT licensed and works against both Langfuse v2 and v3.

How do I trace Prism calls automatically?

Install axyr/laravel-langfuse, add LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY to your .env, then set LANGFUSE_PRISM_ENABLED=true. Every Prism::text(), Prism::structured(), and Prism::stream() call from that point on is captured as a Langfuse trace with model, parameters, input, output, tokens, latency, and any errors. No wrapper code in your controllers or jobs.

Can I use Langfuse and Sentry together in Laravel?

Yes, and it's painless. The axyr/laravel-langfuse package uses the Langfuse HTTP API rather than OpenTelemetry, so it doesn't fight with Sentry's tracer provider the way the official Python and JS Langfuse SDKs can. Run Sentry's Laravel SDK for application errors and performance, and Langfuse for LLM-specific observability. Each tool sees what it needs to see, no shared collector required.

Is Langfuse self-hostable for compliance reasons?

Yes. Langfuse is open source under an MIT licence and ships a Docker Compose deployment for v2 and v3. If your prompts contain PII, PHI, or proprietary content that legal won't let leave your infrastructure, self-hosting is the right call. The Laravel package behaves identically against a self-hosted instance — just point LANGFUSE_HOST at your internal URL. The Docker stack also runs comfortably alongside a self-hosted Sentry deployment.

How do I capture token usage with Langfuse and Prism?

You don't have to — the auto-instrumentation handles it. With LANGFUSE_PRISM_ENABLED=true, the package reads Prism's response usage object and writes it to the Langfuse generation alongside model name and parameters. Input tokens, output tokens, and total tokens appear in the trace, and Langfuse calculates cost using its built-in pricing tables for major models. If you're on a less common model, set custom pricing in your Langfuse project settings.
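To sanity-check a cost figure in the dashboard, or to work out what to enter as custom pricing, the underlying arithmetic is simple. The sketch below is not Langfuse's implementation; the function name is hypothetical and the prices are illustrative placeholders, expressed per one million tokens.

```php
<?php

declare(strict_types=1);

/**
 * Estimate the cost of one generation from its token counts.
 * Prices are given per one million tokens.
 */
function estimateGenerationCost(
    int $inputTokens,
    int $outputTokens,
    float $inputPricePerMillion,
    float $outputPricePerMillion,
): float {
    $inputCost = ($inputTokens / 1_000_000) * $inputPricePerMillion;
    $outputCost = ($outputTokens / 1_000_000) * $outputPricePerMillion;

    return $inputCost + $outputCost;
}

// Hypothetical prices: 3.00 per million input tokens, 15.00 per million output tokens.
$cost = estimateGenerationCost(1200, 300, 3.00, 15.00);

printf("%.4f\n", $cost);
```

With those placeholder prices, a call using 1,200 input tokens and 300 output tokens works out to 0.0036 + 0.0045, or roughly 0.0081. Output tokens dominate even though there are four times fewer of them, which is why comparing prompt versions by output length is often worth the effort.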

Steven Richardson

CTO at Digitonic. Writing about Laravel, architecture, and the craft of leading software teams from the west coast of Scotland.