Terraform for Kafka Streams.

Declare pipelines in config. Preview changes with typestream plan. Deploy native Kafka Streams topologies with typestream apply. No JVM code. No new infrastructure.

Your Kafka topics, already flowing → TypeStream → Enriched topics → Elasticsearch / ClickHouse

Works with your existing Kafka cluster and Schema Registry. Or bundles everything for teams starting fresh.


Your Kafka topics are full of data.
Getting value out shouldn't require a Java project.

01
Stop building a microservice for every pipeline

A new repo, a new deployment, a new thing to monitor. The overhead kills small use cases before they start.

02
Stop managing schemas by hand

You have Schema Registry, but every consumer deserializes each topic by hand and hopes the schema hasn't changed.

03
Stop deploying and hoping

There's no dry-run for your consumer. You deploy and find out in production whether it works.

04
Stop writing integration code for every sink

Elasticsearch, S3, ClickHouse -- every destination is custom code with its own retry logic and failure modes.

05
Stop joining data through your database

Combining data from two topics means querying a database or building state yourself. It shouldn't be this hard.

TypeStream addresses all five. Config replaces code. One engine replaces N microservices. Schema Registry integration is automatic. And typestream plan shows you what will change before anything runs.


Three steps. No Java required.

1

Define your pipeline

Write a config file that says which topics to read, how to transform the data, and where to send it. Or prototype with CLI pipes that read like Unix.

cat /dev/kafka/local/topics/page_views \
  | filter .country == "US" \
  | > /dev/kafka/local/topics/us_views

Or as a config file -- version-controlled, reviewable in a PR.

2

Preview with typestream plan

See exactly what will change before anything runs. Like Terraform for data pipelines.

$ typestream plan us-traffic.json

us-traffic:
  + CREATE pipeline "us-traffic"
      source: page_views
      steps:  filter(.country == "US")
      sink:   us_views

1 to create, 0 to update, 0 to delete.
3

Apply and it runs

TypeStream compiles your config into a Kafka Streams topology, reads schemas from your Schema Registry, and starts processing. Change the config, plan again, apply again.

$ typestream apply us-traffic.json
✓ Created pipeline "us-traffic"

Running on your Kafka cluster.

Define pipelines your way

Config files for production. CLI for exploration. Visual builder for discovery.

{
  "name": "us-traffic",
  "source": "page_views",
  "encoding": "AVRO",
  "steps": [
    { "filter": ".country == \"US\"" }
  ],
  "sink": "us_views"
}
Version-controlled. CI/CD friendly. Review in a PR. This is the real format -- what you see is what you deploy.
cat /dev/kafka/local/topics/page_views | filter .country == "US" | > /dev/kafka/local/topics/us_views
Pipe commands together. Great for prototyping and exploration. Reads like Unix because it is.

Drag, drop, connect. Good for exploring what's possible.

All three produce the same compiled topology. Start wherever feels natural -- export to config for production.


What teams build on their Kafka

Each pattern is a config file. Not a microservice.

Stream processing
Filter and route

Route events from one topic to many based on content. Region splits, priority routing, conditional fanout. Today this is a custom consumer or a Kafka Streams app.

TypeStream: one config file. Define predicates, set output topics, deploy.
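As a sketch of what that config could look like, extending the documented format above. The `routes` key and topic names here are illustrative assumptions, not documented syntax:

```json
{
  "name": "region-router",
  "source": "page_views",
  "encoding": "AVRO",
  "routes": [
    { "filter": ".country == \"US\"", "sink": "us_views" },
    { "filter": ".country == \"DE\"", "sink": "de_views" }
  ]
}
```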

Join two topics

Combine data from two topics by key. Enrich orders with user data, match events with metadata. One of the hardest things in Kafka -- joins, windowed joins, serde alignment.

TypeStream: one config file. Declare both sources, specify the join key, define the output.
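A join config might look something like this sketch. The `sources`, `join`, and `on` keys are hypothetical illustrations of "declare both sources, specify the join key":

```json
{
  "name": "enriched-orders",
  "sources": ["orders", "users"],
  "encoding": "AVRO",
  "steps": [
    { "join": { "on": "user_id" } }
  ],
  "sink": "orders_enriched"
}
```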

Real-time aggregations

Counts, windowed counts, grouped metrics from your event stream. Today that means writing a Kafka Streams app with state stores, KTables, serde config.

TypeStream: one config file. Group by a field, count, done. No state store boilerplate.
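"Group by a field, count, done" could plausibly read like the sketch below. The `group_by` and `count` step names are assumptions for illustration:

```json
{
  "name": "views-by-country",
  "source": "page_views",
  "encoding": "AVRO",
  "steps": [
    { "group_by": ".country" },
    { "count": {} }
  ],
  "sink": "views_by_country"
}
```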

Enrichment & AI
Enrich with AI in-flight

Every event needs classification, sentiment scoring, or entity extraction. You're either doing it in batch or building a custom ML pipeline alongside your streams.

TypeStream: add an OpenAI node to your pipeline config. Every event flows through the LLM and lands enriched in the output topic.
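A hedged sketch of an OpenAI node in a pipeline config. The `openai` step and its `prompt`/`output_field` keys are hypothetical names, as are the topic names:

```json
{
  "name": "ticket-sentiment",
  "source": "support_tickets",
  "encoding": "AVRO",
  "steps": [
    {
      "openai": {
        "prompt": "Classify the sentiment of .body as positive, neutral, or negative",
        "output_field": "sentiment"
      }
    }
  ],
  "sink": "tickets_scored"
}
```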

Add semantic search

Generate embeddings from your topic data and push to a vector store. Building this pipeline means standing up an embedding service, keeping it in sync, handling failures.

TypeStream: add an embedding node. Vectors flow to Weaviate or any vector store. As your data changes, embeddings update automatically.
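An embedding node might be declared like this sketch. The `embedding` step, its `field` key, and the `weaviate://` sink notation are illustrative assumptions:

```json
{
  "name": "product-embeddings",
  "source": "product_descriptions",
  "encoding": "AVRO",
  "steps": [
    { "embedding": { "field": ".description" } }
  ],
  "sink": "weaviate://products"
}
```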

Geo-enrich your events

IP addresses without locations. You need geo-tagged events for analytics, compliance, or routing -- but adding a GeoIP lookup to every consumer is tedious.

TypeStream: add a GeoIP node. Every event arrives with country, city, and region. Feed it into aggregations for real-time analytics by geography.
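A GeoIP node, sketched in the same config shape. The `geoip` step and its `field` key are assumed names for illustration:

```json
{
  "name": "geo-views",
  "source": "page_views",
  "encoding": "AVRO",
  "steps": [
    { "geoip": { "field": ".ip" } }
  ],
  "sink": "page_views_geo"
}
```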

Kafka Connect made easy
Sink to Elasticsearch

Keep a search index in sync with your topic data. Extract text, filter what gets indexed, handle backpressure when ES is slow. Connector config alone is a project.

TypeStream: config file with transform nodes and an Elasticsearch sink. Schema-aware, backpressure built in.
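A possible shape for an Elasticsearch sink pipeline. The `elasticsearch://` sink notation and topic names are illustrative, not documented syntax:

```json
{
  "name": "search-index",
  "source": "articles",
  "encoding": "AVRO",
  "steps": [
    { "filter": ".published == true" }
  ],
  "sink": "elasticsearch://articles-index"
}
```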

Replicate to your warehouse

Stream topic data to ClickHouse, Snowflake, BigQuery, or S3. Setting up the connector, managing offsets, handling schema changes -- it's always more work than expected.

TypeStream: config file with a warehouse sink. Every event arrives in seconds, not the next batch run.
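Sketched as a config, assuming a hypothetical `clickhouse://` sink notation and illustrative names:

```json
{
  "name": "events-to-clickhouse",
  "source": "events",
  "encoding": "AVRO",
  "steps": [],
  "sink": "clickhouse://analytics.events"
}
```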

Route CDC to anywhere

You have Debezium CDC topics. You need to filter, transform, and route them to downstream systems. But the CDC topics are complex and connector config is its own job.

TypeStream: CDC topics are just Kafka topics. Filter, enrich, and sink to any of 200+ Kafka Connect destinations.
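Since CDC topics are just Kafka topics, a Debezium routing pipeline could look like the sketch below. The Debezium-style source topic name, the `.op == "c"` (create-only) filter, and the `s3://` sink notation are illustrative assumptions:

```json
{
  "name": "cdc-orders-to-s3",
  "source": "dbserver1.inventory.orders",
  "encoding": "AVRO",
  "steps": [
    { "filter": ".op == \"c\"" }
  ],
  "sink": "s3://order-archive"
}
```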


Stop writing consumer microservices.

You have a service that consumes from a topic, does some processing, and writes somewhere. You've written it a dozen times. Here's what changes with TypeStream.

Build a pipeline
  Your microservice: write a service in your language, handle offsets, wire up deserialization, deploy it
  TypeStream: write a config file, deploy it

Add another pipeline
  Your microservice: another repo, another service, another deployment to monitor
  TypeStream: another config file

Handle schemas
  Your microservice: manually deserialize each topic, hope the schema hasn't changed
  TypeStream: reads your Schema Registry automatically, catches schema errors before anything runs

Join two topics
  Your microservice: query a database, or consume both and build state yourself
  TypeStream: declare both sources and the join key in config

Aggregate or count
  Your microservice: write to a database, run queries against it later
  TypeStream: aggregate into an auto-generated REST endpoint you can query directly

Push to Elasticsearch, S3, etc.
  Your microservice: write the integration code yourself, handle retries and backpressure
  TypeStream: declarative sink in the pipeline, with 200+ Kafka-native plugins deployed through TypeStream

Preview before deploying
  Your microservice: deploy and hope
  TypeStream: typestream plan shows exactly what will change

Consumer goes down
  Your microservice: hope your offset management is right, debug at 2am
  TypeStream: handles offsets, state, and recovery automatically

Under the hood, it's Kafka Streams. TypeStream compiles your config into Kafka Streams topologies -- the same technology powering stream processing at LinkedIn, Netflix, and Uber. You get reliability without writing Java microservices.


Your Schema Registry does more than you think.

TypeStream reads your Schema Registry and propagates types through every node in your pipeline. Source schemas are resolved automatically. Each transform declares how it changes the schema -- filter passes through, join merges, enrichment adds a field.

Auto-generated clients

Avro and Protobuf schemas generate typed clients in any language

Compile-time checks

Schema errors caught before any topology runs


Schema evolution

When upstream schemas change, your pipeline knows



Built for production, not just demos

Plan before you deploy

Every change goes through typestream plan first. See exactly what pipelines will be created, updated, or deleted -- before anything runs. Review in CI, approve in a PR.

Backpressure and delivery guarantees

If a sink is slow or down, TypeStream queues and retries automatically. No data loss, no duplicate processing. Built on Kafka Streams' exactly-once semantics.

Observability included

Metrics, logs, and traces out of the box. Know how many events per second each pipeline processes, where bottlenecks are, and when something fails.


Connects to your Kafka cluster. Doesn't replace it.

Already have Kafka?

Point TypeStream at your existing cluster and Schema Registry. It reads your topics, reads your schemas, and runs pipelines on the infrastructure you already operate. Nothing to migrate.

Starting fresh?

TypeStream bundles Kafka, Schema Registry, and Kafka Connect in a single docker compose up. Get the full streaming stack without learning to operate it.

Source available (BSL)

Read every line of code. Business Source License means no vendor lock-in. If TypeStream disappears tomorrow, you still have the code.

Your topics are already flowing.
Start building on them.

Try the demo, explore the repo, or tell us about your Kafka setup. We'll help you build your first pipeline.

or view on GitHub →