How to stream Postgres to Typesense
Build real-time search indexes with Postgres change data capture (CDC) and Typesense. Learn to keep your search collections in sync with your database.
This guide shows you how to set up Postgres change data capture (CDC) and stream changes to Typesense using Sequin.
With Postgres data streaming to Typesense, you can build real-time search experiences, maintain up-to-date search indexes, and provide fast, typo-tolerant search for your users.
By the end of this how-to, you’ll have a configured Typesense cluster that stays in sync with your database.
Prerequisites
If you’re self-hosting Sequin, you’ll need:
If you’re using Sequin Cloud, you’ll need:
Basic setup
Prepare your Typesense cluster
Sequin converts Postgres changes and rows into JSON documents and sends them to your Typesense cluster.
You’ll need to have a Typesense cluster running and accessible to Sequin. You can run Typesense locally for development or use a hosted instance.
If you’re using Sequin Cloud and developing locally, you can use the Sequin CLI’s tunnel command to connect Sequin’s servers to a local Typesense instance. So, no need to deploy your Typesense instance just yet!
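The sink setup later in this guide asks you to select a collection you created, so it helps to create one up front. Below is a minimal sketch of building that request against Typesense's `POST /collections` endpoint. The `products` schema, the local URL, and the API key are all placeholders chosen for illustration, not values from this guide.

```python
import json
import urllib.request

# Assumptions: a local Typesense node on its default port and a placeholder key.
TYPESENSE_URL = "http://localhost:8108"
TYPESENSE_API_KEY = "xyz"


def products_schema() -> dict:
    """Hypothetical schema for a `products` table; adjust fields to your table."""
    return {
        "name": "products",
        "fields": [
            {"name": "name", "type": "string"},
            {"name": "description", "type": "string", "optional": True},
            {"name": "price", "type": "float"},
            {"name": "in_stock", "type": "bool"},
        ],
    }


def build_create_collection_request(schema: dict) -> urllib.request.Request:
    """Build (but do not send) the request that creates the collection."""
    return urllib.request.Request(
        url=f"{TYPESENSE_URL}/collections",
        data=json.dumps(schema).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-TYPESENSE-API-KEY": TYPESENSE_API_KEY,
        },
        method="POST",
    )


# To actually create the collection, send the request, for example:
#   urllib.request.urlopen(build_create_collection_request(products_schema()))
```

The `id` field does not need to be declared in the schema; Typesense treats it as the document identifier.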
Mapping Postgres tables to Typesense collections
From the Typesense docs:
In general, we recommend that you create one collection per type of document / record you have.
Similarly, Sequin recommends that you create one sink per Postgres table. That sink will be configured to stream changes from a single Postgres table to a single Typesense collection. This ensures that all the documents in the same collection have the same structure.
When significantly scaling Typesense, you may want to shard collections. For instance, you may shard your users collection by region into users_us, users_eu, users_asia, etc.
You can achieve this using Sequin’s filters. For example, if you want to shard your users collection by region, you can create a sink per region and use a filter to route documents to the correct collection.
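The one-sink-per-region idea can be sketched as follows. This is a conceptual model only, not Sequin's configuration format: the `RegionSink` class and the `region` column name are hypothetical, and each instance stands in for one sink whose filter matches a single region.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RegionSink:
    """Stand-in for one sink: a filter on `region` plus a target collection."""
    region: str
    collection: str

    def accepts(self, row: dict) -> bool:
        # Mirrors a sink filter such as: region = 'eu'
        return row.get("region") == self.region


# One sink per shard, matching the collections named above.
SINKS = [
    RegionSink("us", "users_us"),
    RegionSink("eu", "users_eu"),
    RegionSink("asia", "users_asia"),
]


def collections_for(row: dict) -> List[str]:
    """Return the collections that would receive this row."""
    return [sink.collection for sink in SINKS if sink.accepts(row)]
```

A row whose region matches no sink is delivered nowhere, so make sure the filters together cover every value you expect.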
Sequin will soon launch routing functions, which will enable routing rows from one table to multiple collections within the same sink.
Create Typesense sink
Navigate to the “Sinks” tab, click “Create Sink”, and select “Typesense Sink”.
Configure the source
Select source table
Under “Source”, select the table you want to stream data from.
Specify filters
First, you can indicate whether you want to receive insert, update, and/or delete actions. We recommend selecting all three, which will ensure that your Typesense collection is kept in sync with your Postgres table.
You can, for instance, unselect delete if you want deleted rows to remain in your Typesense collection. In that case, records deleted from Postgres will not be removed from your Typesense collection and will still be searchable.
You can also specify filters to narrow down the documents you want to index. For example, if you only want to index products that are currently in_stock, you can add a filter on in_stock = true. This is also useful for sharding or multi-tenancy.
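The combined effect of action selection and a column filter can be sketched like this. It is an illustration of the semantics described above, not Sequin code; the function name and the `in_stock` column are assumptions.

```python
from typing import Callable, FrozenSet

ALL_ACTIONS: FrozenSet[str] = frozenset({"insert", "update", "delete"})


def should_deliver(
    action: str,
    record: dict,
    selected_actions: FrozenSet[str] = ALL_ACTIONS,
    column_filter: Callable[[dict], bool] = lambda record: True,
) -> bool:
    """A change is delivered only if its action is selected AND the row passes the filter."""
    return action in selected_actions and column_filter(record)


def in_stock_only(record: dict) -> bool:
    """Equivalent of the filter `in_stock = true`."""
    return record.get("in_stock") is True
```

For example, `should_deliver("delete", row, selected_actions=frozenset({"insert", "update"}))` is false for any row, which is why unselecting delete leaves deleted rows searchable.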
Specify backfill
You can optionally indicate if you want your Typesense collection to receive a backfill of all or a portion of the table’s existing data. Backfills are useful if you want Sequin to populate your Typesense collection with existing data from your Postgres table. If you choose not to backfill on creation, you can backfill at any time.
Transforms
Typesense documents expect a specific schema, including:
- An id field, which must be a string and is used to uniquely identify the document.
- Additional fields that are searchable.
You can map your Postgres table columns to Typesense fields using a functional transform.
We recommend having the transform return a map literal that lists each field explicitly. This ensures that the schema of your Typesense collection is well defined.
If instead you want any changes in your underlying table schema to be immediately reflected in your Typesense collection, you can return the record directly, perhaps only modifying the id field to ensure it exists and is a string.
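The two approaches differ mainly in what happens when a column is added to the table. The Python sketch below illustrates that difference; it is not Sequin's transform syntax, and the column names (`name`, `description`, `price`, `in_stock`) are hypothetical.

```python
def to_document_explicit(record: dict) -> dict:
    """Explicit mapping: only the listed fields reach Typesense."""
    return {
        "id": str(record["id"]),  # Typesense requires a string id
        "name": record["name"],
        "description": record.get("description"),
        "price": float(record["price"]),
        "in_stock": bool(record["in_stock"]),
    }


def to_document_passthrough(record: dict) -> dict:
    """Pass-through: every column is forwarded, with id coerced to a string."""
    return {**record, "id": str(record["id"])}
```

With the explicit mapping, a new column is ignored until you add it to the transform. With the pass-through, it is sent to Typesense as soon as it appears in the table.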
Specify message grouping
Under “Message grouping”, you’ll most likely want to leave the default option selected to ensure events for the same row are processed in order.
While your Typesense collection is processing a message, if a new change or version related to the same row is captured, Sequin will hold the message back until the first message is processed. This is almost always the desired behavior.
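The hold-back behavior can be modeled with a small dispatcher keyed by row. This is a sketch of the idea only, not Sequin's implementation; the class name and the use of a string such as "users:1" as the group key are assumptions.

```python
from collections import defaultdict, deque
from typing import Any, Hashable, Optional


class GroupedDispatcher:
    """Allows at most one in-flight message per group key, preserving order."""

    def __init__(self) -> None:
        self._in_flight = set()
        self._held = defaultdict(deque)

    def submit(self, key: Hashable, message: Any) -> Optional[Any]:
        """Return the message if it can be delivered now, else hold it and return None."""
        if key in self._in_flight:
            self._held[key].append(message)
            return None
        self._in_flight.add(key)
        return message

    def ack(self, key: Hashable) -> Optional[Any]:
        """Mark the in-flight message as processed and release the next held one, if any."""
        queue = self._held.get(key)
        if queue:
            next_message = queue.popleft()
            if not queue:
                del self._held[key]
            return next_message  # the key stays in flight for this message
        self._in_flight.discard(key)
        return None
```

Messages for different rows never block each other; only a second change to the same row waits for the first to be acknowledged.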
Configure delivery
Specify batch size
The right value for “Batch size” depends on your data and your requirements.
Typesense is optimized for bulk imports of documents. This has to be balanced against the memory Sequin needs to accumulate a batch of documents.
A good starting point is 100 to 1,000 documents per batch. Typesense supports up to 10,000 documents per batch.
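To make the trade-off concrete, here is a sketch of how documents are grouped into batches and serialized for Typesense's bulk import endpoint, which accepts one JSON document per line (JSONL). The helper names are hypothetical, and whether Sequin uses the `upsert` action shown here is an assumption this guide does not confirm.

```python
import json
from typing import Iterable, Iterator, List


def batches(documents: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield lists of at most batch_size documents, in order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[dict] = []
    for document in documents:
        batch.append(document)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller, batch
        yield batch


def to_jsonl(batch: List[dict]) -> str:
    """Serialize a batch as JSONL, the body format of the import endpoint."""
    return "\n".join(json.dumps(document) for document in batch)


def import_path(collection: str) -> str:
    """Path of the bulk import endpoint; action=upsert is an assumption here."""
    return f"/collections/{collection}/documents/import?action=upsert"
```

Larger batches mean fewer HTTP requests per document, but each batch must be held in memory in full before it is sent.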
Select the Typesense collection you created
Under “Typesense Collection”, select the collection that should receive the rows from this table.
Create the sink
Give your sink a name, then click “Create Typesense Sink”.
Verify & debug
To verify that your Typesense sink is working:
- Make some changes in your source table.
- Verify that the count of messages for your sink increases in the Sequin web console.
- Verify receipt of documents in your Typesense collection by searching for them.
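For the last step, one option is to query Typesense's search endpoint directly. The sketch below only builds the URL; the base URL, collection name, and `query_by` fields are placeholders, and the request also needs your API key in the `X-TYPESENSE-API-KEY` header.

```python
from urllib.parse import urlencode


def search_url(base_url: str, collection: str, query: str, query_by: str) -> str:
    """Build the URL for GET /collections/{collection}/documents/search."""
    params = urlencode({"q": query, "query_by": query_by})
    return f"{base_url}/collections/{collection}/documents/search?{params}"


# Example with placeholder values:
#   search_url("http://localhost:8108", "products", "lamp", "name")
```

If a row you just changed in Postgres appears in the results with its new values, the sink is delivering documents.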
If documents don’t seem to be flowing:
- Click the “Messages” tab to view the state of messages for your sink.
- Click any failed message.
- Check the delivery logs for error details, including the response from Typesense.
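When the logged response comes from a bulk import, Typesense returns one JSON result per line, each with a `success` flag and, on failure, an `error` message. Assuming the delivery log shows that response body verbatim (which this guide does not confirm), a small helper like the hypothetical one below can pick out the failing documents.

```python
import json
from typing import List, Tuple


def failed_imports(response_body: str) -> List[Tuple[int, str]]:
    """Return (line index, error message) for each failed document in an import response."""
    failures = []
    for index, line in enumerate(response_body.splitlines()):
        if not line.strip():
            continue  # ignore blank lines
        result = json.loads(line)
        if not result.get("success", False):
            failures.append((index, result.get("error", "unknown error")))
    return failures
```

The line index corresponds to the position of the document in the batch that was sent, which helps you find the offending row.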
Next steps
Assuming you’ve followed the steps above for your local environment, “How to deploy to production” will show you how to deploy your implementation to production.