60 BC-Transformer training runs across 3 storage modes, 10 tasks, and 2 seeds on the LIBERO-Object benchmark (500 demonstrations, 149,014 RGB frames). Policy, optimizer, and evaluation are held fixed; only the data format and dataloader change.
Knonik Lossless
64.25% vs 54.35% HDF5
+9.9 pp · 3.51× smaller
Knonik Lossy
54.83% (on par with the HDF5 baseline)
+0.5 pp · 18.23× smaller
Stability
HDF5 worst seed-to-seed gap: 90.5 pp
Lossless max gap: 61.5 pp · Lossy max gap: 46.0 pp
Figure 1 — Storage cost (log scale) vs. mean held-out success
Lossy needs roughly 18× less storage than HDF5 per percentage point of task success (0.38 GB at 54.83% vs. 6.93 GB at 54.35%). Lossless delivers both a 3.51× size reduction and a 9.9 pp quality gain.
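The storage-efficiency claim can be checked with a few lines of arithmetic. The sketch below divides each mode's dataset size (Table 2) by its mean best-checkpoint success (Table 3) to get gigabytes per percentage point of success. The metric name `gb_per_point` and the dictionary layout are illustrative choices for this example, not something defined in the paper.

```python
# Dataset size (GB, Table 2) and mean best-checkpoint success (%, Table 3).
modes = {
    "HDF5": {"size_gb": 6.93, "success": 54.35},
    "Knonik Lossless": {"size_gb": 1.98, "success": 64.25},
    "Knonik Lossy": {"size_gb": 0.38, "success": 54.83},
}


def gb_per_point(size_gb: float, success: float) -> float:
    """Storage spent per percentage point of held-out success (hypothetical metric)."""
    return size_gb / success


baseline = gb_per_point(**modes["HDF5"])
efficiency_gain = {}
for name, m in modes.items():
    cost = gb_per_point(**m)
    # How many times less storage per success point than the HDF5 baseline.
    efficiency_gain[name] = baseline / cost
    print(f"{name:16s} {cost:.4f} GB/pp  {efficiency_gain[name]:.1f}x vs HDF5")
```

Under this definition the lossy mode comes out a little over 18 times more storage-efficient than HDF5, and the lossless mode a little over 4 times.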
Figure 2 — Throughput (samples/s) and per-batch data-fetch latency (log scale)
Knonik's parallel prefetch hides decode overhead, so compressed data arrives faster than raw HDF5: median batch-fetch latency drops from 65 ms to 21 ms despite the added video decode step.
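How prefetching can hide decode time is easiest to see in a small sketch. The code below is not Knonik's loader. It is a generic pattern built on Python's standard `concurrent.futures`, and `fake_decode`, `prefetching_batches`, `workers`, and `depth` are hypothetical names chosen for this example. A pool of worker threads keeps a window of fetches in flight, so the consumer usually finds the next batch already decoded.

```python
import time
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List


def prefetching_batches(
    fetch: Callable[[int], List[int]],
    num_batches: int,
    workers: int = 4,
    depth: int = 8,
) -> Iterator[List[int]]:
    """Yield batches in index order while up to `depth` fetches run ahead."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = deque()
        next_idx = 0
        # Prime the window so workers start before the consumer asks.
        while next_idx < num_batches and len(pending) < depth:
            pending.append(pool.submit(fetch, next_idx))
            next_idx += 1
        while pending:
            # Blocks only if the oldest in-flight fetch is not finished yet.
            batch = pending.popleft().result()
            if next_idx < num_batches:
                pending.append(pool.submit(fetch, next_idx))
                next_idx += 1
            yield batch


def fake_decode(index: int) -> List[int]:
    """Hypothetical stand-in for reading and decoding one batch."""
    time.sleep(0.01)  # pretend decode cost
    return [index] * 4


batches = list(prefetching_batches(fake_decode, num_batches=6))
print(len(batches), batches[0], batches[-1])
```

Threads overlap here because `time.sleep` releases the interpreter lock. A real decoder would need to do the same (for example, in native code) or run in separate processes for the overlap to materialise.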
Oblivious format
Wrong dataloader
Silent GPU tax
Dataset
Training
Table 1 — The 10 LIBERO-Object Tasks
| ID | Task instruction |
|---|---|
| T0 | Pick up the alphabet soup and place it in the basket |
| T1 | Pick up the cream cheese and place it in the basket |
| T2 | Pick up the salad dressing and place it in the basket |
| T3 | Pick up the bbq sauce and place it in the basket |
| T4 | Pick up the ketchup and place it in the basket |
| T5 | Pick up the tomato sauce and place it in the basket |
| T6 | Pick up the butter and place it in the basket |
| T7 | Pick up the milk and place it in the basket |
| T8 | Pick up the chocolate pudding and place it in the basket |
| T9 | Pick up the orange juice and place it in the basket |
Table 2 — Dataset Footprint
| Mode | Format | Size | Ratio |
|---|---|---|---|
| Uncompressed (HDF5) | HDF5 raw uint8 | 6.93 GB | 1.00× |
| Knonik Lossless | Knonik lossless | 1.98 GB | 3.51× |
| Knonik Lossy | Knonik lossy (ul2) | 0.38 GB | 18.23× |
Figure 3 — Storage Footprint (GB)
Figure 4 — Mean Held-out Success · best ckpt · 200 rollouts · 10 tasks × 2 seeds
Mean success rate (%). HDF5 has the highest standard deviation of the three modes (±34.48 pp).
Table 3 — Held-out Success Rate (mean ± std)
| Mode | Best (%) | Latest (%) | N |
|---|---|---|---|
| Uncompressed (HDF5) | 54.35 ± 34.48 | 49.33 ± 32.64 | 20 |
| Knonik Lossless | 64.25 ± 22.17 | 53.83 ± 25.84 | 20 |
| Knonik Lossy | 54.83 ± 25.43 | 51.00 ± 24.79 | 20 |
Table 4 — Per-task Held-out Success (%) · Seed-averaged · Bold = winner per row
| Task | Object | Uncompressed | Knonik Lossless | Knonik Lossy |
|---|---|---|---|---|
| T0 | alphabet soup | 64.8 | 52.8 | **65.5** |
| T1 | cream cheese | **60.2** | 58.5 | 59 |
| T2 | salad dressing | **92** | 89.5 | 80.2 |
| T3 | bbq sauce | **73.8** | 64.8 | 54 |
| T4 | ketchup | **38.8** | 29.2 | 8.2 |
| T5 | tomato sauce | **83.5** | 77.8 | 68.8 |
| T6 | butter | 16.8 | **70.8** | 34.2 |
| T7 | milk | 29.2 | **74.8** | 45.5 |
| T8 | chocolate pudding | 39.2 | 66.2 | **81.2** |
| T9 | orange juice | 45.2 | **58.2** | 51.5 |
| Mean | | 54.35 | 64.25 | 54.83 |
HDF5 shows seed-to-seed collapses larger than any seen in either Knonik condition: T8 drops from 78% (seed 0) to 0.5% (seed 47), and T9 drops from 90.5% to 0%.
Figure 5 — Success by Task × Mode × Seed · Green = high · Red = collapse
| Task | Seed 0: HDF5 | Seed 0: Lossless | Seed 0: Lossy | Seed 47: HDF5 | Seed 47: Lossless | Seed 47: Lossy |
|---|---|---|---|---|---|---|
| T0 alphabet_soup | 57.5 | 68.5 | 75 | 72 | 37 | 56 |
| T1 cream_cheese | 79 | 65.5 | 68.5 | 41.5 | 51.5 | 49.5 |
| T2 salad_dressing | 92 | 84.5 | 66 | 92 | 94.5 | 94.5 |
| T3 bbq_sauce | 82.5 | 73 | 62 | 65 | 56.5 | 46 |
| T4 ketchup | 0 | 11 | 0 | 77.5 | 47.5 | 16.5 |
| T5 tomato_sauce | 78.5 | 91 | 76 | 88.5 | 64.5 | 61.5 |
| T6 butter | 32.5 | 75.5 | 15.5 | 1 | 66 | 53 |
| T7 milk | 19 | 71.5 | 62 | 39.5 | 78 | 29 |
| T8 chocolate_pu | 78 | 46.5 | 89 | 0.5 | 86 | 73.5 |
| T9 orange_juice | 90.5 | 89 | 74.5 | 0 | 27.5 | 28.5 |
Table 5 — |Δ| Between Seed 0 and Seed 47 (lower = better)
| Mode | Mean \|Δ\| | Max \|Δ\| | Std |
|---|---|---|---|
| Uncompressed (HDF5) | 37.7 pp | 90.5 pp | 32.4 pp |
| Knonik Lossless | 25.2 pp | 61.5 pp | 17.4 pp |
| Knonik Lossy | 24.6 pp | 46.0 pp | 11.0 pp |
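The seed-gap statistics can be recomputed directly from the per-seed values in Figure 5. The sketch below takes the absolute difference between seed 0 and seed 47 for each task, then reports the mean, maximum, and standard deviation per mode. Using the sample standard deviation (n − 1 denominator) is an assumption about how the Std column was computed, and the function name `seed_gap_stats` is hypothetical.

```python
from statistics import mean, stdev

# Per-task success (%) from Figure 5 as (seed 0, seed 47), tasks T0..T9.
results = {
    "HDF5": [(57.5, 72), (79, 41.5), (92, 92), (82.5, 65), (0, 77.5),
             (78.5, 88.5), (32.5, 1), (19, 39.5), (78, 0.5), (90.5, 0)],
    "Lossless": [(68.5, 37), (65.5, 51.5), (84.5, 94.5), (73, 56.5), (11, 47.5),
                 (91, 64.5), (75.5, 66), (71.5, 78), (46.5, 86), (89, 27.5)],
    "Lossy": [(75, 56), (68.5, 49.5), (66, 94.5), (62, 46), (0, 16.5),
              (76, 61.5), (15.5, 53), (62, 29), (89, 73.5), (74.5, 28.5)],
}


def seed_gap_stats(pairs):
    """Return (mean, max, sample std) of |seed 0 - seed 47| across tasks."""
    gaps = [abs(a - b) for a, b in pairs]
    return mean(gaps), max(gaps), stdev(gaps)


stats = {mode: seed_gap_stats(pairs) for mode, pairs in results.items()}
for mode, (avg, worst, spread) in stats.items():
    print(f"{mode:9s} mean={avg:.2f}  max={worst:.1f}  std={spread:.2f}")
```

With the sample standard deviation, the results agree with Table 5 after rounding to one decimal place.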
Figure 6 — GPU Utilisation vs. Idle (step/(step+gap))
Table 6 — Pipeline Efficiency (averaged across all tasks and seeds)
| Metric | Uncompressed | Knonik Lossless | Knonik Lossy |
|---|---|---|---|
| Throughput (samples/s) | 102.9 | 119.6 | 118.2 |
| Total wall time (min) | 56.9 | 48.9 | 49.5 |
| GPU util (step/cycle, %) | 62.9 | 74.8 | 74.2 |
| GPU idle (data-wait, %) | 37.1 | 25.2 | 25.8 |
| Batch fetch mean (ms) | 67.53 | 23.80 | 24.25 |
| Inter-step gap mean (ms) | 108.62 | 63.30 | 65.43 |
| GPU energy (Wh) | 133.3 | 122.9 | 123.8 |
| Est. cost (USD) | 2.90 | 2.50 | 2.53 |
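Figure 6 defines GPU utilisation as step / (step + gap). Table 6 reports the utilisation and the mean inter-step gap but not the step time itself, so the sketch below inverts the formula to recover the implied compute time per step. The function names are hypothetical, and the derived step times are an inference from the rounded table values rather than measured numbers.

```python
def gpu_utilisation(step_ms: float, gap_ms: float) -> float:
    """Fraction of each training cycle spent computing: step / (step + gap)."""
    return step_ms / (step_ms + gap_ms)


def implied_step_ms(util: float, gap_ms: float) -> float:
    """Invert the utilisation formula to recover compute time per step."""
    return util * gap_ms / (1.0 - util)


# (GPU utilisation as a fraction, mean inter-step gap in ms) from Table 6.
table6 = {
    "Uncompressed": (0.629, 108.62),
    "Knonik Lossless": (0.748, 63.30),
    "Knonik Lossy": (0.742, 65.43),
}

steps = {mode: implied_step_ms(u, gap) for mode, (u, gap) in table6.items()}
for mode, step in steps.items():
    util, gap = table6[mode]
    check = gpu_utilisation(step, gap)  # should reproduce the table value
    print(f"{mode:16s} step~{step:.0f} ms  gap={gap:.2f} ms  util={check:.3f}")
```

The implied step time comes out at roughly 184 to 188 ms in all three modes, which is what one would expect if the model and optimizer are unchanged and the utilisation gain comes from the shorter data-wait gap.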
Read the full paper
Complete methodology, all figures, extended ablations, and raw per-step profiler data.