The first AI chat feature most Laravel teams try to ship is "make the bot's reply appear word by word, like ChatGPT." Many search for "Laravel streaming ChatGPT" and end up rebuilding the front end in React with EventSource, wiring up Pusher, or running a separate Node WebSocket server. None of that is necessary: Livewire 4 ships wire:stream, Prism ships ->asStream(), and the two fit together cleanly.
If you've already got Prism wired up from getting started with Laravel Prism and have a Livewire 4 app in front of it, this is the missing middle piece. We'll build a chat component, stream tokens into the bubble live, and persist the final message — all in PHP, no separate process.
Install Prism and configure the provider
We need Prism on the project and at least one provider configured. Prism is at v0.100+ as of this writing and requires PHP 8.2+ on Laravel 11/12/13. Pin the version to avoid surprises from breaking changes — the package is still pre-1.0.
composer require "prism-php/prism:^0.100"
php artisan vendor:publish --tag=prism-config
That publishes a config/prism.php file with provider stubs. Add your OpenAI or Anthropic key to .env; the key name matches whichever provider you'll call.
# .env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
The provider-agnostic shape is the point: the same code path works for OpenAI, Anthropic, Gemini, Groq, and Ollama. If you want a wider tour of the Prism API surface before going further, the complete Laravel AI SDK guide tells the same kind of unified-interface story.
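Build a Livewire chat component with a messages array
Generate the component and a backing chat_messages table. The data model has two columns of substance, role (user or assistant) and content, plus the user_id column that the queries below filter on. Add a foreign key to chat_sessions if you want multi-conversation support, and make user_id, role, and content mass-assignable on the model, because the component calls ChatMessage::create().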
php artisan livewire:make Chat
php artisan make:model ChatMessage -m
The component holds a $messages array hydrated from the database on mount, a $prompt field bound to the input, and a $streaming flag so the input disables while the bot is mid-sentence. Everything form-shaped sits here — though if your input area grows, lift it into a Livewire 4 form object with #[Validate] and let the component shrink back to a coordinator.
<?php

namespace App\Livewire;

use App\Models\ChatMessage;
use Livewire\Attributes\Validate;
use Livewire\Component;

class Chat extends Component
{
    public array $messages = [];

    #[Validate('required|string|max:2000')]
    public string $prompt = '';

    public bool $streaming = false;

    public function mount(): void
    {
        $this->messages = ChatMessage::query()
            ->where('user_id', auth()->id())
            ->orderBy('id')
            ->get(['role', 'content'])
            ->toArray();
    }

    public function render()
    {
        return view('livewire.chat');
    }
}
Add a streaming action method that calls Prism's asStream()
This is the heart of the component. send() validates the prompt, appends the user message, then triggers a second request, streamResponse(), that does the long-running work. Splitting the two methods is the single most important decision here: if you stream from inside send(), the user's own message doesn't appear until the bot finishes talking.
public function send(): void
{
    $this->validate();

    $this->messages[] = [
        'role' => 'user',
        'content' => $this->prompt,
    ];

    ChatMessage::create([
        'user_id' => auth()->id(),
        'role' => 'user',
        'content' => $this->prompt,
    ]);

    $this->prompt = '';
    $this->streaming = true;

    // Fire a second request; this one streams.
    $this->js('$wire.streamResponse()');
}

public function streamResponse(): void
{
    $assembled = '';

    // The facade is used here so that text() can be called statically.
    $stream = \Prism\Prism\Facades\Prism::text()
        ->using('openai', 'gpt-4o-mini')
        ->withSystemPrompt('You are a concise Laravel expert. Answer in 3-5 sentences.')
        ->withMessages($this->messagesForPrism())
        ->asStream();

    foreach ($stream as $event) {
        // We only care about text_delta events here.
        // Tool calls, artifacts and step events would go to other targets.
        if ($event->type() === \Prism\Prism\Enums\StreamEventType::TextDelta) {
            $chunk = $event->delta;
            $assembled .= $chunk;

            $this->stream(
                to: 'response',
                content: $chunk,
            );
        }
    }

    // After the stream: persist and commit to the messages array.
    ChatMessage::create([
        'user_id' => auth()->id(),
        'role' => 'assistant',
        'content' => $assembled,
    ]);

    $this->messages[] = ['role' => 'assistant', 'content' => $assembled];
    $this->streaming = false;
}

// Converts the stored role/content pairs into Prism message objects.
protected function messagesForPrism(): array
{
    return collect($this->messages)
        ->map(fn (array $m) => match ($m['role']) {
            'user' => new \Prism\Prism\ValueObjects\Messages\UserMessage($m['content']),
            'assistant' => new \Prism\Prism\ValueObjects\Messages\AssistantMessage($m['content']),
        })
        ->all();
}
Two things worth flagging. First, $this->stream(to: 'response', content: $chunk) defaults to append mode, which is exactly what you want for a chat reply — each token gets bolted onto the end of the bubble. Pass replace: true if you ever need to overwrite (count-downs, loading text). Second, the withMessages() history is what lets the model maintain conversation context — if you skip it, every reply will read like the first message of a new chat.
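To make the append/replace distinction concrete, here is a minimal plain-PHP sketch of the two modes. applyChunk() is a hypothetical helper written for this illustration and is not part of Livewire; it only models what happens to the text of a stream target.

```php
<?php

/**
 * Hypothetical model of a stream target's text content.
 * Not Livewire's implementation; it only mirrors the append/replace semantics.
 */
function applyChunk(string $current, string $chunk, bool $replace = false): string
{
    // Replace mode overwrites the target; append mode adds to the end.
    return $replace ? $chunk : $current . $chunk;
}

// Append mode: tokens accumulate into one reply.
$bubble = '';
foreach (['Lara', 'vel ', 'rocks'] as $token) {
    $bubble = applyChunk($bubble, $token);
}

// Replace mode: each update overwrites the previous one (a countdown, say).
$counter = '';
foreach (['3', '2', '1'] as $tick) {
    $counter = applyChunk($counter, $tick, replace: true);
}
```

After the loops, $bubble holds the whole reply while $counter holds only the last tick, which is why append mode suits chat bubbles and replace mode suits status text.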
Wire up the response with wire:stream in the Blade template
The Blade is where wire:stream does its work. You need three things in the template: the historical messages loop, a streaming target with the wire:stream="response" attribute, and the prompt form. The streaming target only renders while $streaming is true — once the message is persisted, it falls into the historical loop on the next render.
<div>
    <div class="space-y-4 mb-6">
        @foreach ($messages as $message)
            <article @class([
                'rounded-lg p-4',
                'bg-blue-50 text-blue-900' => $message['role'] === 'user',
                'bg-gray-50 text-gray-900' => $message['role'] === 'assistant',
            ])>
                <strong>{{ ucfirst($message['role']) }}</strong>
                <p class="mt-2 whitespace-pre-wrap">{{ $message['content'] }}</p>
            </article>
        @endforeach

        @if ($streaming)
            <article class="rounded-lg p-4 bg-gray-50 text-gray-900">
                <strong>Assistant</strong>
                <p class="mt-2 whitespace-pre-wrap" wire:stream="response"></p>
            </article>
        @endif
    </div>

    <form wire:submit="send" class="flex gap-2">
        <input
            type="text"
            wire:model="prompt"
            placeholder="Ask the bot…"
            class="flex-1 rounded border-gray-300"
            @disabled($streaming)
        />
        <button
            type="submit"
            class="rounded bg-blue-600 px-4 py-2 text-white"
            @disabled($streaming)
        >
            Send
        </button>
    </form>
</div>
The wire:stream="response" element starts empty. As Prism yields each text_delta event, Livewire's transport pushes the chunk into that paragraph and the DOM updates without a re-render. There is no JavaScript on your end — Livewire's runtime owns the SSE framing.
Persist the assembled message after the stream completes
The streamResponse() method above already persists after the loop ends, which is the right pattern: assemble the full text in PHP, write one row at the end. The temptation to call ChatMessage::create() inside the foreach is real and you must resist it — a 500-token response would be 500 individual INSERTs, each one waiting on a synchronous database round trip while a user is watching the page.
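The assemble-then-write-once pattern can be sketched in plain PHP without Prism or a database. Everything named here is an assumption for illustration: fakeTokens() stands in for the provider stream, and the $persist callback stands in for ChatMessage::create().

```php
<?php

/** Stand-in for a token stream; a real one would come from the LLM provider. */
function fakeTokens(): Generator
{
    yield from ['Use ', 'a ', 'single ', 'INSERT.'];
}

/**
 * Drains the stream, forwards each chunk to $emit, and calls $persist once
 * after the loop. Both callbacks are hypothetical stand-ins.
 */
function streamAndPersist(iterable $tokens, callable $emit, callable $persist): string
{
    $assembled = '';

    foreach ($tokens as $chunk) {
        $assembled .= $chunk; // cheap in-memory append
        $emit($chunk);        // push the chunk to the browser
    }

    $persist($assembled);     // exactly one write, after the loop

    return $assembled;
}

$emitted = [];
$writes = [];

$full = streamAndPersist(
    fakeTokens(),
    function (string $chunk) use (&$emitted): void { $emitted[] = $chunk; },
    function (string $text) use (&$writes): void { $writes[] = $text; },
);
```

Four chunks are emitted but $writes ends up with a single entry, which is the ratio you want between tokens streamed and rows inserted.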
Prism also gives you a built-in callback hook on the streaming methods so you don't have to track $assembled manually for routes that hand the stream back to the browser directly. If you've ever wanted to keep the persistence logic separate from the streaming loop — or if you're streaming over an HTTP endpoint rather than Livewire — this is the cleaner shape:
use App\Models\ChatMessage;
use Illuminate\Support\Collection;
use Prism\Prism\Facades\Prism;
use Prism\Prism\Streaming\Events\StreamEvent;
use Prism\Prism\Streaming\Events\TextDeltaEvent;
use Prism\Prism\Text\PendingRequest;

return Prism::text()
    ->using('openai', 'gpt-4o-mini')
    ->withMessages($history)
    ->asEventStreamResponse(function (PendingRequest $request, Collection $events) {
        // Rebuild the full reply from the text deltas once the stream has drained.
        $fullText = $events
            ->filter(fn (StreamEvent $event) => $event instanceof TextDeltaEvent)
            ->map(fn (TextDeltaEvent $event) => $event->delta)
            ->join('');

        ChatMessage::create([
            'user_id' => auth()->id(),
            'role' => 'assistant',
            'content' => $fullText,
        ]);
    });
That callback fires once, after the stream has fully drained, with a Collection of every event including stream_end (which carries the prompt + completion token counts on $event->usage — record that on the message if you're billing usage). For more on the surrounding agent loop and how to fit this into a queued workflow, see production AI agents in Laravel with Prism and Laravel Workflow.
Handle disconnects and stream errors gracefully
LLM streams die mid-sentence more often than you'd think. Provider 429s, dropped TCP connections, the user closing the tab: all of them leave $assembled half-written. Wrap the foreach in a try/catch/finally and persist whatever you have in the finally block, so a partial reply is never lost to the void.
public function streamResponse(): void
{
    $assembled = '';

    try {
        $stream = \Prism\Prism\Facades\Prism::text()
            ->using('openai', 'gpt-4o-mini')
            ->withMessages($this->messagesForPrism())
            ->asStream();

        foreach ($stream as $event) {
            if ($event->type() === \Prism\Prism\Enums\StreamEventType::TextDelta) {
                $assembled .= $event->delta;
                $this->stream(to: 'response', content: $event->delta);
            }
        }
    } catch (\Prism\Prism\Exceptions\PrismRateLimitedException $e) {
        $this->stream(
            to: 'response',
            content: "\n\n_(Rate limited. Try again in a few seconds.)_",
        );
    } catch (\Throwable $e) {
        report($e);

        $this->stream(
            to: 'response',
            content: "\n\n_(Something went wrong. Any partial response above was saved.)_",
        );
    } finally {
        // Runs on success and on failure: keep whatever text arrived.
        if ($assembled !== '') {
            ChatMessage::create([
                'user_id' => auth()->id(),
                'role' => 'assistant',
                'content' => $assembled,
            ]);

            $this->messages[] = ['role' => 'assistant', 'content' => $assembled];
        }

        $this->streaming = false;
    }
}
The finally block runs whether the stream completes or blows up — partial messages still get persisted, the $streaming flag still flips back to false, and the UI never gets stuck in a half-disabled state.
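The same guarantee can be checked in isolation with a plain-PHP sketch. flakyTokens() and drainWithFinally() are hypothetical names for this illustration: the generator simulates a connection that drops after two chunks, and the $saved array stands in for the chat_messages table.

```php
<?php

/** Stand-in stream that fails after two chunks, like a dropped connection. */
function flakyTokens(): Generator
{
    yield 'Partial ';
    yield 'answer';

    throw new RuntimeException('connection dropped');
}

/**
 * Drains the stream and saves whatever arrived, even when the stream throws.
 * Returns the error message, or null if the stream completed cleanly.
 */
function drainWithFinally(iterable $tokens, array &$saved): ?string
{
    $assembled = '';
    $error = null;

    try {
        foreach ($tokens as $chunk) {
            $assembled .= $chunk;
        }
    } catch (Throwable $e) {
        $error = $e->getMessage();
    } finally {
        // Reached on both paths, so partial text is never dropped.
        if ($assembled !== '') {
            $saved[] = $assembled;
        }
    }

    return $error;
}

$saved = [];
$error = drainWithFinally(flakyTokens(), $saved);
```

Even though the generator throws, $saved contains the two chunks that arrived before the failure, and $error carries the exception message for the UI to report.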
Gotchas and Edge Cases
A handful of things that have bitten me (or bitten people in the Prism GitHub issues) when running this in production.
Laravel Octane is not supported. The Livewire docs are blunt: wire:stream is not compatible with Octane. Octane's worker model keeps the response object open in a way that breaks Livewire's streaming transport. If you're on Octane, you have two clean fallbacks: use Prism's asEventStreamResponse() from a regular route and consume the SSE with a tiny Alpine listener, or push the work to a job and use asBroadcast() with Laravel Reverb for real-time notifications. The second option also unlocks multi-user broadcast — useful if more than one tab needs the same reply.
Laravel Telescope eats the stream. Telescope's HTTP Client Watcher buffers responses to log them, which means Prism's stream looks like a single empty response by the time your foreach gets to it. The fix is to either disable the HTTP Client Watcher in config/telescope.php or set ignore_paths to include the Livewire request paths (the livewire* pattern is the safe blanket). The Prism docs call this out explicitly under streaming.
Nginx buffering. Nginx's default proxy_buffering on will hold streamed responses until the buffer fills or the connection closes, meaning the chat bubble paints in one big chunk instead of token by token. Add proxy_buffering off; for the Livewire endpoints (or fastcgi_buffering off; if Nginx passes requests straight to PHP-FPM), or set the X-Accel-Buffering: no header on the response. Livewire's runtime sets this for you in v4.x, but it's worth verifying in your nginx -T output if you see batching behaviour.
Don't stream from send(). I mentioned this above but it's the single most common mistake. If you call ->asStream() from inside the same method as the user submit, the page won't paint the user's own message until the bot finishes — because Livewire holds the response open for the duration of the foreach. The $this->js('$wire.streamResponse()') trick separates the two requests so the UI updates twice.
Tool calls need a different target. If you extend this to tool-using agents (covered in building LLM tool-calling agents with Prism), the stream emits tool_call, tool_result, and step_finish events alongside text_delta. Route them to a separate wire:stream target or you'll get JSON arguments rendered inside your chat bubble.
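A minimal plain-PHP sketch of that routing, assuming events are identified by the names used above. targetFor() and the 'activity' target are hypothetical names chosen for illustration; they are not part of Prism or Livewire, and in a real component each branch would call $this->stream() with the matching target.

```php
<?php

/**
 * Hypothetical mapping from stream event names to wire:stream targets.
 * Returns null for events that should not be rendered at all.
 */
function targetFor(string $eventType): ?string
{
    return match ($eventType) {
        'text_delta' => 'response',
        'tool_call', 'tool_result' => 'activity',
        default => null, // step_finish and anything else: nothing to render
    };
}

// Simulated event log; payloads are plain strings for the sake of the sketch.
$events = [
    ['type' => 'text_delta', 'payload' => 'Hello'],
    ['type' => 'tool_call', 'payload' => '{"city":"Oslo"}'],
    ['type' => 'step_finish', 'payload' => ''],
    ['type' => 'text_delta', 'payload' => ' world'],
];

$targets = ['response' => '', 'activity' => ''];

foreach ($events as $event) {
    $target = targetFor($event['type']);

    if ($target !== null) {
        $targets[$target] .= $event['payload'];
    }
}
```

The chat bubble target receives only the text deltas, while the tool-call JSON lands in its own target instead of leaking into the reply.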
Wrapping Up
You now have a ChatGPT-style chat UI in pure Livewire and Prism — no Node process, no React front end, no WebSocket server. The keys are splitting submit from stream, using wire:stream append mode for tokens, and persisting once after $assembled is complete.
Two natural next steps. Add tracing so you can debug why a model gave a particular answer — set up Langfuse observability for Prism and every chunk above gets logged with its prompt, model, and token cost. And if you want to take this from "works on my laptop" to "survives traffic", queueing the LLM call via production AI agents with Prism and Laravel Workflow is the next move.
FAQ
How do I stream LLM responses with Prism PHP?
Use Prism::text()->using('openai', 'gpt-4o-mini')->withPrompt($prompt)->asStream(). The returned value is a generator of StreamEvent objects — iterate with a foreach and check $event->type() for TextDelta, then read $event->delta for the chunk of text. Always call ob_flush() and flush() after each echo if you're streaming to a plain route; Livewire handles flushing for you when you call $this->stream() from inside a component method.
What is wire:stream in Livewire 4 and when should I use it?
wire:stream is a Livewire 4 attribute that opens a long-running response to the browser and lets the server push chunks into a target element while the request is still open. You should use it for one-way server-to-client updates that arrive over time — LLM chat replies, count-downs, slow report generation. Use Reverb broadcasting instead if multiple clients need the same update, and use a regular AJAX call if the response is a single piece of data.
How do I implement Server-Sent Events in Laravel?
Two ways. The native option is response()->stream(function () { … }, 200, ['Content-Type' => 'text/event-stream', 'X-Accel-Buffering' => 'no']) and you echo data: …\n\n plus ob_flush(); flush(); for each event. The Prism option is Prism::text()->…->asEventStreamResponse(), which returns a fully-formed SSE response with text_delta, stream_end, and error event types already framed. For a Livewire UI specifically, wire:stream is a thinner layer over SSE — you don't write the SSE plumbing at all.
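For the native option, the framing itself is small enough to sketch in plain PHP. sseFrame() is a hypothetical helper, not a Laravel or Prism function; it follows the standard SSE wire format, where every line of the payload gets its own data: field and a blank line ends the frame.

```php
<?php

/**
 * Formats one Server-Sent Events frame.
 * Each line of $data becomes its own "data:" field; a blank line ends the frame.
 */
function sseFrame(string $data, ?string $event = null): string
{
    $frame = $event !== null ? "event: {$event}\n" : '';

    // Split on any newline style so multi-line payloads stay valid SSE.
    foreach (preg_split('/\r\n|\r|\n/', $data) as $line) {
        $frame .= "data: {$line}\n";
    }

    return $frame . "\n";
}

$simple = sseFrame('hi');
$named = sseFrame("line one\nline two", 'text_delta');
```

Inside a response()->stream() callback you would echo each frame and then flush, as described above.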
Why does my streamed Prism response stop halfway?
Three usual suspects. First, Telescope's HTTP Client Watcher consumes the stream buffer before you can iterate it — disable the watcher or add livewire* to ignore_paths. Second, Nginx's default proxy_buffering holds the response until it fills the buffer; add proxy_buffering off; or set X-Accel-Buffering: no. Third, you're on Laravel Octane — wire:stream is not supported there, fall back to asEventStreamResponse() or asBroadcast() via Reverb.
Can I stream Anthropic and OpenAI responses the same way with Prism?
Yes — that's the whole point of Prism's provider abstraction. The same ->asStream() call works against 'openai', 'anthropic', 'gemini', 'groq', 'ollama' and the rest. The TextDeltaEvent shape is identical across providers. The differences live one level down: Anthropic emits a thinking_delta event for reasoning content that OpenAI doesn't, and stream_end usage numbers come in provider-shaped JSON inside $event->usage. For the common path — text in, text streamed out — the only change is the string you pass to ->using().
How do I save a streamed chat message to the database without slowing the stream?
Assemble the chunks into a PHP string variable inside the foreach, then write one row after the loop ends. Don't ChatMessage::create() inside the foreach — a 500-token response would be 500 synchronous INSERTs, each one stalling the next token. If you want the persistence pulled out of the streaming code entirely, use Prism's completion callback: ->asEventStreamResponse(function (PendingRequest $request, Collection $events) { /* … */ }) runs once after the stream drains, with the full event log to assemble from.