// how it works

From Raw Sensor Data to Training-Ready.
Fully Automated.

Six specialised agents run inside your infrastructure, end to end. Click any stage to learn what it does.

Ingest - rosbag · HDF5 · MP4 · raw streams
Compress - 100–150× · full signal preserved
Score - quality flags for low-signal episodes
Process - VLM annotation, filtering, curation
Visualize - compare, analyze and inspect multimodal data
Dataloader - optimized dataloaders, minimal GPU idle
All six stages run inside your infrastructure. Zero data egress.
01
// ingest agent

Ingest

The universal receiving dock for your robot data.

Every robot speaks a slightly different language. Knonik's Ingest Agent is fluent in all of them. It accepts raw data in every format your robots produce - rosbag files, HDF5 archives, raw MP4 video, or live sensor streams - and normalises everything into a single consistent structure without any manual conversion work from your team.

Think of it as a receiving dock staffed 24/7. Every delivery is checked, sorted, and put away correctly - automatically.

Formats accepted
rosbag, HDF5, MP4, raw sensor streams, LeRobot v2, LeRobot v3, Zarr
What it does
Validates, deduplicates, and normalises incoming episodes into a unified schema
What you don't do
Write custom parsers, convert formats by hand, or babysit uploads
Where it runs
Entirely inside your infrastructure - no data is uploaded anywhere
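The normalisation step can be pictured as a small key-mapping pass. This is a hypothetical sketch - the `Episode` schema and the per-format key maps are invented for illustration and are not Knonik's actual internal format:

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical unified episode schema -- illustrative only.
@dataclass
class Episode:
    episode_id: str
    source_format: str
    observations: list[dict[str, Any]] = field(default_factory=list)

def normalise(raw: dict[str, Any], source_format: str) -> Episode:
    """Map format-specific field names onto one consistent structure."""
    # Each format names the same fields differently; the agent's job is
    # to resolve them all to one vocabulary.  Maps below are invented.
    key_maps = {
        "rosbag": {"stamp": "t", "joint_states": "joints"},
        "hdf5":   {"timestamp": "t", "qpos": "joints"},
    }
    key_map = key_maps[source_format]
    obs = [{key_map.get(k, k): v for k, v in frame.items()}
           for frame in raw["frames"]]
    return Episode(raw["id"], source_format, obs)
```

Once every episode lands in the same shape, every downstream agent can assume a single vocabulary regardless of which robot produced the data.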
02
// compress agent

Compress

Shrink your dataset 100–150× without losing a single bit of learning signal.

Raw robot teleoperation data is extremely redundant. Cameras capture nearly identical frames at high frequency; joint sensors record tiny incremental movements. The Compress Agent orchestrates a carefully chosen combination of existing compression methods - tuned specifically for the patterns in robotics sensor data - and achieves 100–150× size reduction while preserving every meaningful signal your model will ever need. The insight isn't a new algorithm; it's knowing exactly which tools to use, in what order, and with what settings for this data type.

Smaller is faster. Data that fits in memory loads without stalling your GPU - and cloud and local storage costs drop with it.

Typical reduction
100–150× smaller (same dataset)
Signal fidelity
No loss of learning signal
Training impact
Compressed data trains to equal or better validation loss (see proof page)
Benefit
Faster I/O and lower storage bills
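The "right tools in the right order" idea can be shown on one stage of such a pipeline. This toy sketch delta-encodes joint samples before handing them to a generic codec - because teleop joints barely move between samples, the deltas are tiny and highly compressible. It is an assumption-laden illustration of the principle, not Knonik's actual compression stack:

```python
import struct
import zlib

def compress_joints(samples: list[list[float]]) -> bytes:
    """Delta-encode successive joint samples, then entropy-code them.

    The ordering matters: raw positions compress poorly, but their
    near-zero deltas compress very well.  Illustrative sketch only.
    """
    deltas, prev = [], [0.0] * len(samples[0])
    for s in samples:
        deltas.extend(a - b for a, b in zip(s, prev))
        prev = s
    raw = struct.pack(f"<{len(deltas)}f", *deltas)
    return zlib.compress(raw, level=9)

def decompress_joints(blob: bytes, n_joints: int) -> list[list[float]]:
    """Invert the pipeline: entropy-decode, then re-accumulate deltas."""
    raw = zlib.decompress(blob)
    flat = list(struct.unpack(f"<{len(raw) // 4}f", raw))
    out, prev = [], [0.0] * n_joints
    for i in range(0, len(flat), n_joints):
        prev = [p + d for p, d in zip(prev, flat[i:i + n_joints])]
        out.append(prev)
    return out
```

Video frames get a different treatment than joint streams, which is exactly the point: the ratio comes from matching each modality to the method that suits it.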
03
// score agent

Score

An AI reviewer that catches bad demonstrations before they reach your model.

Not every robot demonstration is worth training on. Shaky grasps, incomplete tasks, sensor glitches, and operator mistakes all produce episodes that teach your model the wrong thing. The Score Agent watches every episode and assigns a quality score based on task completion, motion smoothness, sensor consistency, and outcome success - automatically, before your training run even starts.

Garbage in, garbage out. The Score Agent is the quality gate that prevents bad data from ever reaching your model.

What it detects
Failed tasks, noisy trajectories, sensor drop-outs, repetitive or near-duplicate episodes
Output
A quality score and flag per episode - bad ones are quarantined, not deleted
Why it matters
One bad episode in a small dataset can degrade policy performance by 15–30%
Human role
Review flagged episodes if you want to - the agent handles the rest
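A quality gate like this can be sketched with two of the signals named above - motion smoothness and sensor drop-outs. The heuristics and thresholds here are hypothetical stand-ins; the real agent also weighs task completion and outcome success:

```python
def score_episode(positions: list, dt: float = 0.02) -> dict:
    """Toy quality score: penalise jerky motion and sensor drop-outs.

    Hypothetical heuristics for illustration only.
    """
    # Sensor drop-outs: missing (None) samples in the stream.
    dropouts = sum(1 for p in positions if p is None)
    clean = [p for p in positions if p is not None]
    # Motion smoothness: mean absolute jerk via 3rd finite difference.
    jerks = [abs(clean[i] - 3 * clean[i - 1] + 3 * clean[i - 2] - clean[i - 3]) / dt ** 3
             for i in range(3, len(clean))]
    mean_jerk = sum(jerks) / len(jerks) if jerks else 0.0
    smoothness = 1.0 / (1.0 + mean_jerk / 1e4)   # squash to (0, 1]
    dropout_rate = dropouts / len(positions)
    score = smoothness * (1.0 - dropout_rate)
    return {"score": score, "flagged": score < 0.5}
```

Flagged episodes are quarantined rather than deleted, so a human can always overrule the gate.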
04
// process agent

Process

A Vision-Language Model that reads your robot's videos and writes descriptions - no human labellers needed.

Training modern robot policies often requires natural-language task descriptions, object labels, and structured episode metadata. The Process Agent uses a Vision-Language Model (VLM) to watch each episode's video, understand what the robot is doing, and automatically generate accurate annotations - descriptions, object labels, task phases, and quality notes. It then filters and curates based on your criteria.

Language-conditioned policies need language labels. This agent generates them at the speed of your data pipeline, not your annotation budget.

Annotations generated
Task descriptions, object labels, phase segmentation, success / failure tags
Technology
Vision-Language Model (VLM) running on your infrastructure
Filtering
Apply custom rules to curate which episodes enter your training set
Cost vs humans
Eliminates manual labelling time entirely for standard annotation tasks
Compute efficiency
Input prompt optimisation sends only relevant frames to the VLM, not the whole video - reducing compute cost by over 70%
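The frame-selection idea behind that compute saving can be sketched as a change-detection filter: keep a frame only when it differs enough from the last kept one, then cap the count. The thresholds and even representation (flat pixel lists) are illustrative assumptions, not the agent's actual selection logic:

```python
def select_keyframes(frames: list[list[int]], max_frames: int = 8,
                     min_change: float = 0.1) -> list[int]:
    """Pick the frame indices worth sending to the VLM.

    Near-duplicate frames add cost without adding information, so we
    keep a frame only when it moves past a change threshold, then
    subsample evenly if still over budget.  Illustrative sketch.
    """
    kept = [0]
    for i in range(1, len(frames)):
        prev = frames[kept[-1]]
        diff = sum(abs(a - b) for a, b in zip(frames[i], prev)) / len(prev)
        if diff >= min_change:
            kept.append(i)
    # Still over budget: spread the picks evenly across what survived.
    if len(kept) > max_frames:
        step = len(kept) / max_frames
        kept = [kept[int(j * step)] for j in range(max_frames)]
    return kept
```

A static tabletop scene collapses to a handful of frames; a cluttered, fast-moving one keeps more - the prompt size tracks information, not video length.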
05
// visualize agent

Visualize

A live dashboard for understanding what your robot actually collected.

You shouldn't be training blind. The Visualize Agent generates a rich, interactive dashboard where your team can browse every episode, watch the raw video alongside joint trajectories, compare episodes side by side, and explore annotations and quality scores. It's the difference between trusting your dataset and actually knowing what's in it.

Every robot team has a dataset they've never fully looked at. This makes it possible to actually understand what you have.

Episode browser
Search, filter, and sort by task, score, date, or annotation
Multimodal playback
Video, joint angles, end-effector pose, and force signals in sync
Comparison view
Overlay two episodes to spot differences in strategy or execution
Access
Web dashboard running inside your infrastructure - no external accounts
06
// dataloader agent

Dataloader

High-speed data delivery that keeps your GPU busy instead of waiting.

GPU time is expensive. A dataloader that stalls - even for 100 ms per batch - wastes 10–30% of your training budget. Knonik's Dataloader is purpose-built for compressed robotics data, using parallel decoding and shared-memory inter-process communication to deliver the next batch to your GPU before it's needed, every time.

The fastest loader isn't the one with the highest throughput spec - it's the one that never makes your GPU wait.

Epoch speedup
4.3× faster per epoch than standard LeRobot v3 loading on the same dataset
Modes
Batched (best for long runs), OnDemand (instant start), Pipelined (parallel decode)
Cross-epoch cache
Data decoded once and served from RAM across all subsequent epochs
Tail latency
p95 wait under 150 ms across all modes - fewer GPU stalls per training run
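The core idea - decode the next batch while the GPU consumes the current one - can be shown with a single background thread and a bounded queue. This is a minimal sketch of prefetching in general; the real Dataloader uses parallel decoding and shared-memory inter-process communication, not this toy:

```python
import queue
import threading

class PrefetchLoader:
    """Decode batches on a background thread, ahead of the consumer.

    Minimal sketch: a bounded queue of depth `depth` lets the worker
    run at most `depth` batches ahead, overlapping decode with compute.
    """
    def __init__(self, batches, decode, depth: int = 2):
        self._q = queue.Queue(maxsize=depth)
        self._sentinel = object()

        def worker():
            for b in batches:
                self._q.put(decode(b))   # blocks once `depth` ahead
            self._q.put(self._sentinel)  # signal end of stream

        threading.Thread(target=worker, daemon=True).start()

    def __iter__(self):
        while (item := self._q.get()) is not self._sentinel:
            yield item
```

If decoding a batch is faster than training on the previous one, the queue is never empty and the consumer never waits - which is the whole game for GPU utilisation.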
// benchmarks

Don't Take Our Word For It.
See The Numbers.

We benchmarked Knonik against standard tooling on real robotics data. Compression quality verified across three policy architectures. DataLoader performance measured end-to-end.