Laravel Scout + Typesense — A Production Guide to Typo-Tolerant, Faceted, and Vector Search

Wire Laravel Scout to Typesense for typo-tolerant, faceted, and hybrid vector search. Docker, queued indexing, Pest tests, and a /up health check, end to end.

Steven Richardson
· 16 min read

A user types "lavrel midleware" into your search box and gets zero results. Your Postgres LIKE '%lavrel%' query crawls on a 200k-row articles table. The Scout database driver — the one you reached for because "search felt solved" — has no concept of typo tolerance, no facets, no vector ranking, and no path to any of those without a rewrite. This is the moment most Laravel teams discover they need a real search engine.

This guide walks the full path from docker compose up to a production-ready, queue-backed, hybrid keyword-plus-vector search experience using Laravel Scout and Typesense. Every "Scout with Typesense" tutorial I've read stops at composer require and a hello-world query — this one covers the production concerns: async indexing, the collection schema, faceted filtering, vector embeddings via the Laravel AI SDK, the Pest test suite, and a Typesense health check wired into /up.

Run Typesense locally with Docker Compose

Start Typesense on your laptop before touching any PHP. Running it locally — rather than reaching straight for Typesense Cloud — gives you full visibility into the HTTP API, the data directory, and the startup flags, and it keeps the development loop tight. The Typesense server is a single static binary written in C++ and the official image is around 80 MB, so it boots in well under a second.

Add a typesense service to your existing docker-compose.yml (the one alongside your MySQL and Redis services). Use a named volume for /data so the index survives container restarts, and pin the image to a specific Typesense version rather than latest; 28.0 was the latest stable at time of writing.

services:
  typesense:
    image: typesense/typesense:28.0
    restart: on-failure
    ports:
      - "8108:8108"
    volumes:
      - typesense-data:/data
    command: >
      --data-dir /data
      --api-key=local-typesense-api-key
      --enable-cors

volumes:
  typesense-data:

The --api-key flag sets the master API key for local development — never use local-typesense-api-key in production. The --enable-cors flag is only needed if you'll hit Typesense directly from the browser (Algolia-style InstantSearch); for server-rendered Scout queries you can drop it.

Bring it up and confirm the server is healthy:

docker compose up -d typesense
curl http://localhost:8108/health
# {"ok":true}

If you're on Laravel Sail, the Compose service name typesense becomes the hostname you point Scout at: set TYPESENSE_HOST=typesense in .env, not 127.0.0.1 or localhost, because inside the app container those resolve to the app container itself. That's the trap most first-time Sail users hit.

Install Laravel Scout and the Typesense driver

Scout is Laravel's driver-based search abstraction, and current Scout releases ship the Typesense engine in the box, so there is no separate driver package to install. (The older typesense/laravel-scout-typesense-driver package predates that native support and uses a different config layout from the one shown in this guide.) What you do need is the Typesense PHP SDK that the engine talks through. Install both and publish the Scout config; Scout ships no migrations, so there is nothing to migrate.

composer require laravel/scout
composer require typesense/typesense-php
php artisan vendor:publish --provider="Laravel\Scout\ScoutServiceProvider"

Wire the credentials into .env. Scout reads SCOUT_DRIVER first, then the driver-specific block:

SCOUT_DRIVER=typesense
SCOUT_QUEUE=true

TYPESENSE_API_KEY=local-typesense-api-key
TYPESENSE_HOST=localhost
TYPESENSE_PORT=8108
TYPESENSE_PROTOCOL=http

SCOUT_QUEUE=true is the single most important line in that block — it's what flips Scout from synchronous to asynchronous indexing. We'll wire the worker up properly in the queue step below. If you're new to running queues in production, the patterns in scaling Laravel queues in production lay the groundwork this guide assumes.

Configure the Typesense collection schema

Unlike Algolia (which infers schema from your records) and Meilisearch (which has optional schema-less mode), Typesense requires an explicit collection schema. This is a feature, not a wart: typed fields are how Typesense delivers consistent sub-50ms p99 latency without runtime guessing. The schema lives in config/scout.php under the typesense.model-settings array, keyed by the fully qualified model class.

Import your model at the top of the config file, then define the collection. Here's the schema I use for a content Article model — searchable title and body, filterable status and tags, sortable timestamps, and a placeholder vector field for the hybrid search step later.

use App\Models\Article;

return [
    // ...
    'typesense' => [
        'client-settings' => [
            'api_key' => env('TYPESENSE_API_KEY'),
            'nodes' => [
                [
                    'host' => env('TYPESENSE_HOST', 'localhost'),
                    'port' => env('TYPESENSE_PORT', '8108'),
                    'protocol' => env('TYPESENSE_PROTOCOL', 'http'),
                ],
            ],
            'connection_timeout_seconds' => 2,
        ],

        'model-settings' => [
            Article::class => [
                'collection-schema' => [
                    'fields' => [
                        ['name' => 'id', 'type' => 'string'],
                        ['name' => 'title', 'type' => 'string'],
                        ['name' => 'body', 'type' => 'string'],
                        ['name' => 'status', 'type' => 'string', 'facet' => true],
                        ['name' => 'tags', 'type' => 'string[]', 'facet' => true],
                        ['name' => 'author_id', 'type' => 'int64'],
                        ['name' => 'published_at', 'type' => 'int64', 'sort' => true],
                        ['name' => 'body_embedding', 'type' => 'float[]', 'num_dim' => 1536, 'optional' => true],
                    ],
                    'default_sorting_field' => 'published_at',
                ],
                'search-parameters' => [
                    'query_by' => 'title,body',
                ],
            ],
        ],
    ],
];

Three details worth flagging. The id field is always string even if your primary key is an integer — Typesense converts it for you, but the schema must declare string. The published_at field is int64 because Typesense stores timestamps as Unix epochs and you'll cast in toSearchableArray(). And the body_embedding field is optional: true so rows can be indexed before their vector is generated — critical for queued embedding generation.
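Because published_at is stored as a Unix epoch, date filters become plain integer comparisons. A minimal sketch, assuming a hypothetical publishedBetween() helper (not part of Scout) and Typesense's [min..max] range syntax for numeric fields:

```php
<?php

declare(strict_types=1);

/**
 * Build a Typesense range filter over the int64 published_at field.
 * Hypothetical helper for illustration; the field name matches the schema above.
 */
function publishedBetween(DateTimeInterface $from, DateTimeInterface $to): string
{
    // Both bounds are converted to epoch seconds, the same unit toSearchableArray() stores.
    return sprintf('published_at:[%d..%d]', $from->getTimestamp(), $to->getTimestamp());
}

$utc = new DateTimeZone('UTC');

echo publishedBetween(
    new DateTimeImmutable('2024-01-01 00:00:00', $utc),
    new DateTimeImmutable('2024-02-01 00:00:00', $utc),
), "\n";
```

The resulting string is the kind of value you would pass as a raw filter_by option when a date range is needed.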

Add the Searchable trait to your Article model

The Searchable trait does two jobs: it registers model observers that push every save and delete to Scout, and it exposes the Article::search() query DSL. Add the trait, then override toSearchableArray() to shape the payload that ships to Typesense. The keys here must match the field names in the collection schema above — that's the only contract Scout enforces.

namespace App\Models;

use Illuminate\Database\Eloquent\Model;
use Laravel\Scout\Searchable;

class Article extends Model
{
    use Searchable;

    public function toSearchableArray(): array
    {
        return [
            'id' => (string) $this->id,
            'title' => $this->title,
            'body' => strip_tags($this->body),
            'status' => $this->status->value,
            'tags' => $this->tags->pluck('name')->all(),
            'author_id' => (int) $this->author_id,
            'published_at' => $this->published_at?->getTimestamp() ?? 0,
            'body_embedding' => $this->body_embedding,
        ];
    }

    public function shouldBeSearchable(): bool
    {
        return $this->status->value === 'published';
    }
}

shouldBeSearchable() is the Scout escape hatch for keeping drafts out of the index — return false and Scout will skip indexing on save and remove the record from the index on update. It's how you stop user search from returning unpublished articles without filtering at query time.

Two casting traps catch people. Numeric fields need explicit casts ((int) $this->author_id): if Eloquent gives you a string and the schema says int64, Typesense rejects the document with a cryptic error. And strip_tags($this->body) removes HTML so search ranking isn't poisoned by markup tokens; if you're storing Markdown, render it to HTML with Str::markdown() first and then pass the result through strip_tags().
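One way to catch these mismatches before Typesense does is to check the payload against the schema's field list in a unit test. The sketch below is a hypothetical schemaViolations() helper, not part of Scout or the Typesense SDK, and it only understands the four field types used in this guide:

```php
<?php

declare(strict_types=1);

/**
 * Return human-readable type mismatches between a document and a
 * Typesense-style field list. Hypothetical helper for illustration.
 */
function schemaViolations(array $fields, array $document): array
{
    // One predicate per schema type used in this guide.
    $checks = [
        'string'   => fn ($v) => is_string($v),
        'int64'    => fn ($v) => is_int($v),
        'string[]' => fn ($v) => is_array($v) && array_filter($v, 'is_string') === $v,
        'float[]'  => fn ($v) => is_array($v)
            && array_filter($v, fn ($x) => is_int($x) || is_float($x)) === $v,
    ];

    $violations = [];

    foreach ($fields as $field) {
        $name = $field['name'];

        // Missing or null values are only acceptable for optional fields.
        if (! array_key_exists($name, $document) || $document[$name] === null) {
            if (! ($field['optional'] ?? false)) {
                $violations[] = "{$name}: missing required field";
            }
            continue;
        }

        $check = $checks[$field['type']] ?? null;

        if ($check !== null && ! $check($document[$name])) {
            $violations[] = sprintf(
                '%s: expected %s, got %s',
                $name,
                $field['type'],
                get_debug_type($document[$name]),
            );
        }
    }

    return $violations;
}
```

In a test you could feed it config('scout.typesense.model-settings') for the model and the output of toSearchableArray(), then assert the result is an empty array.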

Backfill the index with scout:import

With the model wired up, the index is empty until you push existing rows in. Scout ships two import commands: scout:import chunks through the table and dispatches index calls inline, and scout:queue-import (added in Scout v10) pushes batches onto the queue so a worker handles them in parallel. For anything over 10k rows, reach for the queued version.

# Synchronous — fine for a few thousand rows or local dev
php artisan scout:import "App\Models\Article"

# Queued — the right call for production-scale backfills
php artisan scout:queue-import "App\Models\Article"

scout:queue-import is meaningfully smarter than the older sync command. Instead of streaming rows sequentially (which serialises behind your slowest read), it queries the MIN and MAX of the primary key and dispatches one queued job per chunk. Twenty workers can then chew through a million-row table in parallel. I've used it to backfill an 8M-row catalogue in under twenty minutes against a single Typesense node.
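The range-splitting idea can be sketched in a few lines. chunkRanges() below is a hypothetical helper written for illustration, not Scout's actual implementation; it only shows how a MIN/MAX pair turns into independent windows that separate workers can process:

```php
<?php

declare(strict_types=1);

/**
 * Split an inclusive primary-key range into [start, end] windows of at most
 * $chunk ids each. Hypothetical helper illustrating the idea only.
 */
function chunkRanges(int $min, int $max, int $chunk): array
{
    if ($chunk < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1.');
    }

    $ranges = [];

    for ($start = $min; $start <= $max; $start += $chunk) {
        // The last window is clamped so it never runs past the real maximum id.
        $ranges[] = [$start, min($start + $chunk - 1, $max)];
    }

    return $ranges;
}

// Each window would become one queued job; workers pick them up independently.
foreach (chunkRanges(1, 2500, 1000) as [$from, $to]) {
    echo "index ids {$from}..{$to}\n";
}
```

Because no window depends on another, adding workers shortens the backfill until Typesense or the database becomes the bottleneck.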

A trick the docs hide: pass --chunk=1000 to tune the batch size against how heavy each record is. If toSearchableArray() touches three eager-loaded relationships per row, drop the chunk size to 250 to keep memory bounded. If your records are tiny, raise it to 5000 to cut HTTP round-trips. The default of 500 is a fine starting point but rarely the right answer at scale.

Move indexing onto the queue with SCOUT_QUEUE

Synchronous indexing is the default and it's a trap. Every $article->save() blocks the HTTP request until Typesense responds — usually 20–80ms, but spikes to 2 seconds when Typesense is busy or the network blips. Multiply that by every controller that touches an indexed model and you have a tail latency problem nobody can pin down.

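Flip Scout to async by setting SCOUT_QUEUE=true (already in our .env above), or go one step further and tell Scout which connection and queue to push to. The array form below replaces the env('SCOUT_QUEUE') boolean in the published config, and any non-empty value there turns queueing on. In config/scout.php: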

'queue' => [
    'connection' => 'redis',
    'queue' => 'scout',
],

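Then run a worker dedicated to that queue so Scout updates can't get stuck behind a long-running email job. Listing only scout is what makes it dedicated; a worker started with --queue=scout,default would still pick up default jobs whenever the scout queue is empty:

php artisan queue:work redis --queue=scout --tries=3 --max-time=3600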

The --max-time=3600 flag recycles the worker every hour to release memory accumulated by long-lived PHP processes. The patterns in tuning queue workers with max-jobs and max-time cover the trade-offs in depth. For monitoring, Horizon in production gives you per-queue throughput and failed-job dashboards — install it before the first time Scout silently stops indexing because the worker died.

One subtlety: even with SCOUT_QUEUE=false, the Algolia and Meilisearch drivers index asynchronously inside the engine, so the index briefly lags the database. Typesense indexes synchronously inside the engine — the document is searchable the moment the HTTP call returns. That's a meaningful property for tests, which we'll lean on in the Pest step.

Build the search endpoint with the Scout query DSL

The Scout query API is the same regardless of driver: Article::search($term). That call returns a Laravel\Scout\Builder which you can paginate, filter, sort, and convert to Eloquent results. Here's a search controller that powers a paginated results page:

namespace App\Http\Controllers;

use App\Models\Article;
use Illuminate\Http\Request;

class SearchController extends Controller
{
    public function __invoke(Request $request)
    {
        $validated = $request->validate([
            'q' => 'required|string|min:2|max:200',
        ]);

        // Use the validated plain string; $request->string() returns a
        // Stringable object rather than a string.
        $query = $validated['q'];

        $articles = Article::search($query)
            ->paginate(perPage: 20);

        return view('search.results', [
            'query' => $query,
            'articles' => $articles,
        ]);
    }
}

paginate() returns a LengthAwarePaginator exactly like an Eloquent paginator — $articles->links() and $articles->withQueryString() both work in your Blade view without any extra plumbing.

For an API response, hand the paginator to an API Resource. The JSON:API resource patterns walk through shaping search results into a spec-compliant envelope; for a simpler REST endpoint, ArticleResource::collection($articles) is enough.

Typo tolerance is on by default. A search for "lavrel midleware" can still surface the "laravel middleware" articles because Typesense tolerates a bounded number of edits per token. By default it allows up to 2 typos (num_typos), scaled by token length: a token needs at least 4 characters before 1 typo is allowed and at least 7 before 2 are (min_len_1typo and min_len_2typo). You can tighten this per query with options, for example Article::search($q)->options(['num_typos' => 1]), but the defaults are sensible for English content.
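To build intuition for length-scaled typo tolerance, here is a small sketch using PHP's built-in levenshtein(). The thresholds are assumptions modelled on Typesense's min_len_1typo and min_len_2typo parameters (4 and 7), and both functions are hypothetical; Typesense's own matching is more involved than a plain edit-distance check:

```php
<?php

declare(strict_types=1);

/**
 * How many typos a query token may contain, given length thresholds.
 * Defaults are assumptions; verify them against your Typesense version.
 */
function typoBudget(string $token, int $minLen1Typo = 4, int $minLen2Typo = 7, int $maxTypos = 2): int
{
    // Byte length is fine for the ASCII tokens used in this example.
    $length = strlen($token);

    $allowed = match (true) {
        $length >= $minLen2Typo => 2,
        $length >= $minLen1Typo => 1,
        default => 0,
    };

    // num_typos acts as an overall cap on top of the length rules.
    return min($allowed, $maxTypos);
}

/** True when $query is within its typo budget of $indexed (Levenshtein distance). */
function matchesWithTypos(string $query, string $indexed): bool
{
    return levenshtein($query, $indexed) <= typoBudget($query);
}

var_dump(matchesWithTypos('midleware', 'middleware')); // one missing letter, long token
var_dump(matchesWithTypos('sct', 'scout'));            // too short to earn any typo budget
```

Short tokens get no budget at all, which is why aggressive typo tolerance on two- and three-letter queries tends to produce noise rather than recall.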

Add faceted filtering and where clauses

Real search rarely returns "all matches" — it returns "matches filtered by status, scoped to a tenant, sorted by recency". Scout's where, whereIn, and options hooks expose Typesense's filter and facet primitives without you writing raw query strings.

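$articles = Article::search($request->string('q'))
    ->where('status', 'published')
    ->whereIn('author_id', $allowedAuthorIds)
    // Keep every filter in the builder: a raw filter_by passed through
    // options() would overwrite the where clauses above rather than add to them.
    ->whereIn('tags', ['laravel', 'scout'])
    ->options([
        'facet_by' => 'tags,status',
        'max_facet_values' => 20,
    ])
    ->paginate(20);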

where('status', 'published') and whereIn('author_id', [...]) compile to Typesense filter_by clauses automatically — no raw filter strings, no driver-specific code. The two patterns work identically against Meilisearch and Algolia, so swapping drivers later doesn't require touching this controller.
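To make the mapping concrete, here is a rough sketch of how equality and membership clauses could be flattened into a Typesense filter_by string. filterBy() is a hypothetical helper written for illustration, not Scout's internal code, and it skips the escaping a real implementation needs for values containing special characters:

```php
<?php

declare(strict_types=1);

/**
 * Flatten where / whereIn style clauses into a Typesense filter_by string.
 * Hypothetical helper; clauses are joined with && so all of them must match.
 */
function filterBy(array $wheres, array $whereIns = []): string
{
    // Booleans are spelled out; everything else is cast to its string form.
    $format = static fn ($value): string => is_bool($value)
        ? ($value ? 'true' : 'false')
        : (string) $value;

    $clauses = [];

    foreach ($wheres as $field => $value) {
        $clauses[] = sprintf('%s:=%s', $field, $format($value));
    }

    foreach ($whereIns as $field => $values) {
        $clauses[] = sprintf('%s:=[%s]', $field, implode(', ', array_map($format, $values)));
    }

    return implode(' && ', $clauses);
}

echo filterBy(['status' => 'published'], ['author_id' => [3, 7]]), "\n";
```

Seeing the flattened string also makes it easier to debug a query by pasting it straight into a curl call against the Typesense search endpoint.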

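For richer expressions (date ranges, OR groups, geo radius), drop into options(['filter_by' => '...']) with raw Typesense syntax. Be aware that options are merged over the parameters Scout generates, so a raw filter_by replaces the filter built from where() and whereIn() instead of combining with it; put every condition in the raw string when you go this route. The facet results come back on the raw Typesense response, which Scout exposes via Article::search($q)->raw() instead of paginate(). That's useful when you need both the paginated docs and the facet counts to render a sidebar of filter pills.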

A pattern I use: a SearchFacets value object that wraps the raw response and exposes ->tagCounts(), ->statusCounts(), etc. Keeps the controller thin and lets me test facet rendering without standing up Typesense in the test suite.
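A minimal sketch of such a value object follows. It assumes the raw response carries a facet_counts list whose entries have a field_name and a counts array of value/count pairs, which is the shape Typesense documents for search responses; the class and method names are illustrative rather than taken from any package:

```php
<?php

declare(strict_types=1);

/**
 * Read-only wrapper around the facet_counts section of a raw search response.
 * Illustrative sketch; adjust the keys if your response shape differs.
 */
final class SearchFacets
{
    /** @var array<string, array<string, int>> field name => [facet value => count] */
    private array $counts = [];

    public function __construct(array $rawResponse)
    {
        foreach ($rawResponse['facet_counts'] ?? [] as $facet) {
            foreach ($facet['counts'] ?? [] as $bucket) {
                $this->counts[$facet['field_name']][$bucket['value']] = (int) $bucket['count'];
            }
        }
    }

    /** Counts for any faceted field, or an empty array when the field is absent. */
    public function countsFor(string $field): array
    {
        return $this->counts[$field] ?? [];
    }

    public function tagCounts(): array
    {
        return $this->countsFor('tags');
    }

    public function statusCounts(): array
    {
        return $this->countsFor('status');
    }
}
```

In a controller this would be constructed from Article::search($q)->options([...])->raw(), while tests can pass a hand-written array and never touch Typesense.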

Lexical search nails exact-match recall. Vector search nails semantic intent — "best wineries in Napa" matches an article titled "Top Vineyards to Visit". The pair is called hybrid search and Typesense scores it natively using Reciprocal Rank Fusion, blending text match scores with cosine similarity into one ranking.

The setup is two-sided. At index time, generate an embedding for the body and store it in body_embedding. At query time, embed the search string the same way and pass it as a vector query. The Laravel AI SDK complete guide covers the embedding API in depth; the short version:

namespace App\Models;

use Illuminate\Database\Eloquent\Model;
use Laravel\Ai\Facades\Embeddings;
use Laravel\Scout\Searchable;

class Article extends Model
{
    use Searchable;

    public function toSearchableArray(): array
    {
        return [
            // ...existing fields...
            'body_embedding' => $this->body_embedding ?? $this->generateEmbedding(),
        ];
    }

    protected function generateEmbedding(): array
    {
        $vector = Embeddings::for(strip_tags($this->body))->generate();
        $this->forceFill(['body_embedding' => $vector])->saveQuietly();

        return $vector;
    }
}

saveQuietly() is critical — it persists the embedding without re-triggering Scout's saved observer, which would loop. With this in place, indexing flows like this: save() → Scout job → toSearchableArray() → generate embedding if missing → push to Typesense. The expensive embedding call only happens once per article and never on the request thread.

For the query, pair a normal keyword search with a vector_query option:

$embedding = Embeddings::for($request->string('q'))->generate();

$articles = Article::search($request->string('q'))
    ->options([
        'query_by' => 'title,body',
        'vector_query' => sprintf(
            'body_embedding:([%s], alpha: 0.3, k: 100)',
            implode(',', $embedding),
        ),
    ])
    ->paginate(20);

alpha is the weight given to the vector ranking, so alpha: 0.3 weights the result 30% semantic, 70% keyword. Tune empirically against a labelled set of queries: a docs search wants a low alpha (0.3 or below) because users type exact terms; a discovery feed wants the opposite. The same Postgres-side approach is covered in whereVectorSimilarTo and native semantic search if your stack is already on pgvector; Typesense earns its keep when you want lexical and semantic ranking in a single sub-50ms call without a join.
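To see how the weight shifts results, here is a simplified rank-fusion sketch. It assumes alpha is the weight on the vector ranking and (1 - alpha) the weight on the keyword ranking, and it scores each document by weighted reciprocal rank; fuseRanks() is a hypothetical function for building intuition, not Typesense's exact formula:

```php
<?php

declare(strict_types=1);

/**
 * Fuse two ranked id lists into one ordering using weighted reciprocal ranks.
 * $alpha weights the vector list; (1 - $alpha) weights the keyword list.
 */
function fuseRanks(array $keywordIds, array $vectorIds, float $alpha): array
{
    $scores = [];

    // Position 0 is the best hit, so its reciprocal rank is 1 / 1.
    foreach (array_values($keywordIds) as $position => $id) {
        $scores[$id] = ($scores[$id] ?? 0.0) + (1 - $alpha) / ($position + 1);
    }

    foreach (array_values($vectorIds) as $position => $id) {
        $scores[$id] = ($scores[$id] ?? 0.0) + $alpha / ($position + 1);
    }

    // Highest fused score first.
    arsort($scores);

    return array_keys($scores);
}

$keyword = ['a', 'b', 'c'];
$vector = ['c', 'a', 'd'];

echo implode(',', fuseRanks($keyword, $vector, 0.3)), "\n"; // keyword-leaning
echo implode(',', fuseRanks($keyword, $vector, 0.8)), "\n"; // semantic-leaning
```

With the keyword-leaning weight the top keyword hit stays first; with the semantic-leaning weight the top vector hit overtakes it, which is the behaviour you are tuning when you adjust alpha against a labelled query set.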

For RAG flows that pair retrieval with generation, the RAG pipeline with Laravel AI SDK and pgvector walks through feeding the retrieved chunks back to an LLM — the retrieval half of that pipeline can be lifted straight onto Typesense if you'd rather not run pgvector.

Test the integration with Pest and an in-memory array driver

Integration tests that hit real Typesense are slow, flaky, and a nightmare in CI. Scout doesn't need a live engine in tests: its built-in collection and null drivers keep indexing calls away from Typesense, and the community sti3bas/laravel-scout-array-driver goes further by keeping an in-memory index you can search and assert against.

Install the array driver as a dev dependency:

composer require sti3bas/laravel-scout-array-driver --dev

In phpunit.xml (or Pest.php), force the Scout driver to array for the test environment:

<env name="SCOUT_DRIVER" value="array"/>

Then the test reads like any other Pest feature test:

use App\Models\Article;
use Sti3bas\ScoutArray\Facades\Search;

it('returns published articles matching the query', function () {
    Article::factory()->create(['title' => 'Laravel Scout with Typesense', 'status' => 'published']);
    Article::factory()->create(['title' => 'Django Search', 'status' => 'published']);

    $response = $this->get('/search?q=laravel');

    $response->assertOk()
        ->assertSee('Laravel Scout with Typesense')
        ->assertDontSee('Django Search');
});

it('does not index draft articles', function () {
    $article = Article::factory()->create(['status' => 'draft']);

    Search::assertNotContains($article);
});

Search::assertContains() and assertNotContains() check the in-memory index without touching Typesense, so the suite stays fast: sub-second for hundreds of search tests. For higher-confidence end-to-end coverage, keep one Pest test that hits a real Typesense container in CI, but tag it with ->group('integration') so it only runs in the job that has a Typesense Docker service available. The strategies in the Pest architecture testing guide work cleanly here for catching missing Searchable traits across a growing codebase.

Wire Typesense into your /up health check

Laravel has shipped a /up health check endpoint since version 11, and Kubernetes, AWS ALBs, and Laravel Cloud can all probe it. By default it only verifies the framework boots; in production you want it to confirm Typesense is reachable too. If your search engine is down, the app shouldn't claim to be healthy and accept traffic.

Add a custom check that hits Typesense's /health endpoint with a tight timeout:

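use Illuminate\Foundation\Events\DiagnosingHealth;
use Illuminate\Support\Facades\Event;
use Illuminate\Support\Facades\Http;

// Register this in AppServiceProvider::boot(). The /up route dispatches a
// DiagnosingHealth event, and any exception thrown by a listener makes the
// endpoint report a failure instead of a 200.
Event::listen(function (DiagnosingHealth $event) {
    $node = config('scout.typesense.client-settings.nodes.0');

    // Tight timeout so a hung Typesense node cannot stall the probe.
    $response = Http::timeout(1)->get(
        sprintf('%s://%s:%s/health', $node['protocol'], $node['host'], $node['port']),
    );

    abort_unless($response->ok() && $response->json('ok') === true, 503, 'Typesense unhealthy');
});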

The 1-second timeout matters. Without it, a hung Typesense node can wedge the health check long enough that your orchestrator gives up on the pod entirely. The companion guide on Laravel health checks for Kubernetes probes walks through readiness vs liveness semantics — readiness should fail when Typesense is unreachable, but liveness should not, or you'll restart healthy app pods every time the search engine hiccups.
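The readiness-versus-liveness split can be expressed as a small decision function. This is a sketch under stated assumptions: typesenseHealthy() and probeStatus() are hypothetical helpers, and the probe names simply mirror the Kubernetes terms used above rather than any Laravel API:

```php
<?php

declare(strict_types=1);

/** Interpret a response from Typesense's /health endpoint. */
function typesenseHealthy(int $httpStatus, string $body): bool
{
    $payload = json_decode($body, true);

    return $httpStatus === 200 && is_array($payload) && ($payload['ok'] ?? false) === true;
}

/**
 * HTTP status a probe endpoint should return.
 * Liveness ignores Typesense so a search outage never restarts healthy pods;
 * readiness includes it so traffic is held back until search is reachable.
 */
function probeStatus(string $probe, bool $appBooted, bool $typesenseOk): int
{
    return match ($probe) {
        'liveness' => $appBooted ? 200 : 503,
        'readiness' => ($appBooted && $typesenseOk) ? 200 : 503,
        default => throw new InvalidArgumentException("Unknown probe: {$probe}"),
    };
}

$searchUp = typesenseHealthy(503, '');

echo probeStatus('liveness', true, $searchUp), "\n";
echo probeStatus('readiness', true, $searchUp), "\n";
```

Keeping the decision in a pure function like this makes the probe behaviour unit-testable without a running Typesense node.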

For production deployment, the Docker multi-stage build patterns cover how to bake the Typesense PHP SDK into the production image without ballooning the layer size, and how to scope Composer install to the production dependency tree only.

Wrapping up

You now have a Scout + Typesense stack that handles typo tolerance, faceted filtering, async indexing, hybrid keyword-plus-vector search, in-memory Pest tests, and a real production health check. None of those pieces require a paid Algolia tier; all of them work on a single Typesense container that costs roughly $10/month to self-host on the smallest viable VPS.

Three natural next steps. If you want predictable cost at higher search volume, swap the local Docker container for a Typesense Cloud cluster — the API and Scout config don't change, only the connection string. If you want to broaden the AI side, the Laravel AI SDK complete guide covers reranking, agents, and the embedding cache that makes per-query embedding affordable. And if you're going to lean on this in production at any meaningful traffic level, set up Horizon for queue monitoring before the first time Scout silently stops indexing.

FAQ

What is Laravel Scout and what does it do?

Laravel Scout is Laravel's first-party, driver-based search package, installed separately with Composer. It adds a Searchable trait to your Eloquent models and an Article::search($query) query DSL, then delegates the actual indexing and querying to a driver: Algolia, Meilisearch, Typesense, or the built-in database driver. The point of Scout is that you write the same Eloquent-flavoured code regardless of which search engine you use, so you can swap drivers without rewriting your controllers or models.

Why choose Typesense over Meilisearch for Laravel Scout?

Typesense and Meilisearch are both open-source search engines with first-party Scout drivers, but Typesense ships more production-grade features in the open-source core: vector search, geo filtering, faceting on arrays, and built-in horizontal scaling are all included without a paid tier. Meilisearch keeps a sharper developer-experience focus but gates HA clustering behind Meilisearch Cloud. For self-hosted Laravel apps that want hybrid vector search and predictable filtering performance without a hosted dependency, Typesense is the stronger fit.

How do I make Laravel Scout index models asynchronously?

Set SCOUT_QUEUE=true in your .env and run a queue worker. Scout then dispatches a queued job for every save(), update(), and delete() instead of indexing inline on the HTTP request. Configure the connection and queue in config/scout.php under the queue array so Scout updates land on a dedicated queue and can't get stuck behind long-running jobs. For a backfill, prefer php artisan scout:queue-import over scout:import because it dispatches chunked jobs and scales linearly with worker count.

How do I add faceted search to Laravel Scout?

Use Scout's where() and whereIn() for simple equality filters, and pass driver-native facet directives through ->options(['facet_by' => 'tags,status']). The same Scout query DSL compiles to filter strings on Typesense, Meilisearch, and Algolia, so the controller code stays portable. To render facet counts in the UI, switch from ->paginate() to ->raw() which returns the full Typesense response including facet_counts, then wrap that response in a value object so views and tests don't depend on the raw driver shape.

Can Typesense do vector search alongside keyword search?

Yes — that pairing is called hybrid search and Typesense scores it natively using Reciprocal Rank Fusion. At index time, generate an embedding with the Laravel AI SDK and store it in a float[] field declared with num_dim matching your model. At query time, pass both a query_by for the lexical fields and a vector_query containing the embedded search string with an alpha weight between 0.0 and 1.0. Typesense ranks each document by both its text-match score and its cosine similarity, then fuses them into a single ranking.

How do I self-host Typesense for production?

Run the official typesense/typesense Docker image on a small VM with persistent storage for /data, expose port 8108 behind a reverse proxy with TLS, and set a strong --api-key. A single 2-vCPU node with 4GB of RAM is a reasonable starting point for indexes under a million documents, though throughput depends heavily on document size and query shape, so load-test with your own traffic before committing to a figure. For HA, run a three-node cluster with --nodes pointing at a peers file; Typesense uses Raft for leader election and synchronous replication. Back up the data directory daily and snapshot it before any version upgrade because schema migrations between major versions sometimes require a reindex.

How do I migrate from the database driver to Typesense?

Change SCOUT_DRIVER from database to typesense, add the typesense block to config/scout.php with your collection schema, deploy the change, then run php artisan scout:queue-import "App\Models\Article" to backfill the new index. The Scout query API doesn't change: Article::search($q)->paginate() works identically against either driver, so your controllers and tests don't need updating. The one trap: the database driver's LIKE '%term%' matching finds a fragment anywhere inside a word, while Typesense matches whole tokens and prefixes. Prefix matching is on by default, so "lara" still matches "laravel", but a mid-word fragment such as "ravel" no longer does unless you enable infix search on that field.

How do I test Laravel Scout in Pest without hitting a real search server?

Set SCOUT_DRIVER=array in phpunit.xml and install the sti3bas/laravel-scout-array-driver package. Scout then writes every indexing call to an in-memory array, and the package exposes Search::assertContains(), Search::assertNotContains(), and Search::assertEmpty() so you can verify which models reached the index without a live Typesense container. For end-to-end coverage, keep one Pest test group that runs against a real Typesense Docker service in CI, but gate it behind a tag so the fast in-memory suite runs on every commit and the slower integration suite runs only on the integration job.

Does Scout support pagination with Typesense?

Yes — call ->paginate(perPage: 20) on the Scout builder and you get back an Illuminate\Pagination\LengthAwarePaginator identical to the one Eloquent returns, including $articles->links(), ->withQueryString(), and the total result count. Behind the scenes Scout requests page and per_page from Typesense and re-hydrates the matching documents into Eloquent models via a single whereIn lookup on the primary key, so pagination round-trips correctly even when combined with where() filters, sorts, and vector queries.
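The re-hydration step matters because a whereIn lookup returns rows in database order, not relevance order. A minimal sketch of restoring the ranking, using a hypothetical hydrateInRankOrder() helper and plain arrays in place of Eloquent models:

```php
<?php

declare(strict_types=1);

/**
 * Re-order fetched rows to match the search engine's ranking.
 * Ids that no longer exist in the database are skipped, which is what you
 * want when the index briefly lags a delete.
 */
function hydrateInRankOrder(array $hitIds, array $rowsById): array
{
    $ordered = [];

    foreach ($hitIds as $id) {
        if (array_key_exists($id, $rowsById)) {
            $ordered[] = $rowsById[$id];
        }
    }

    return $ordered;
}

// Hits arrive ranked by relevance; rows arrive keyed by primary key.
$hits = ['7', '3', '9'];
$rows = [
    3 => ['id' => 3, 'title' => 'Scout basics'],
    7 => ['id' => 7, 'title' => 'Typesense facets'],
];

foreach (hydrateInRankOrder($hits, $rows) as $row) {
    echo $row['title'], "\n";
}
```

The missing id 9 simply drops out, so a stale index entry shortens a page instead of throwing.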

Steven Richardson

CTO at Digitonic. Writing about Laravel, architecture, and the craft of leading software teams from the west coast of Scotland.