Skip to main content

Cycle 33: Adaptive RLE Compression

Status: IMMORTAL Date: 2026-02-07 Improvement Rate: 1.04 > ฯ†โปยน (0.618) Tests: 80/80 PASS


Overviewโ€‹

Cycle 33 implements adaptive Run-Length Encoding (RLE) on packed trit bytes, creating the TCV2 format that provides additional compression when patterns exist in the data.


Key Metricsโ€‹

MetricValueStatus
Tests80/80PASS
VSA Tests49/49PASS
New Functions6rleEncode, rleDecode, saveRLE, loadRLE, estimateRLESize, rleCompressionRatio
File FormatTCV2Binary with RLE flag

RLE Encoding Algorithmโ€‹

Escape-Based RLEโ€‹

Escape byte: 0xFF (255)
Minimum run: 3 bytes

Encoding:
- Run of 3+: [0xFF, count, value]
- Literal: direct byte (if not 0xFF)
- Escape 0xFF: [0xFF, 1, 0xFF]

Exampleโ€‹

Input:  [5, 5, 5, 5, 5, 3, 3, 3, 7, 7, 7, 7] (12 bytes)
Output: [0xFF, 5, 5, 0xFF, 3, 3, 0xFF, 4, 7] (9 bytes)
Savings: 25%

TCV2 File Formatโ€‹

Magic: "TCV2"                 # 4 bytes
Count: u32 # 4 bytes
For each entry:
trit_len: u32 # 4 bytes
rle_flag: u8 # 1 byte (0=packed, 1=RLE)
data_len: u16 # 2 bytes
data: u8[data_len] # Packed or RLE bytes
label_len: u8 # 1 byte
label: u8[label_len] # Label string

Adaptive Behaviorโ€‹

The system automatically chooses the best format:

Data TypeRLE BenefitAction
Random VSA vectorsNoneUse packed (flag=0)
Repeated patternsHighUse RLE (flag=1)
Zero-paddedModerateUse RLE (flag=1)
Similar entriesVariesPer-entry decision

APIโ€‹

Core Functionsโ€‹

// RLE encode byte sequence
fn rleEncode(input: []const u8, output: []u8) ?usize

// RLE decode byte sequence
fn rleDecode(input: []const u8, output: []u8) ?usize

// Save with adaptive RLE (TCV2)
pub fn saveRLE(self: *TextCorpus, path: []const u8) !void

// Load RLE-compressed corpus (TCV2)
pub fn loadRLE(path: []const u8) !TextCorpus

// Estimate RLE size
pub fn estimateRLESize(self: *TextCorpus) usize

// Get RLE compression ratio
pub fn rleCompressionRatio(self: *TextCorpus) f64

VIBEE-Generated Functionsโ€‹

pub fn realSaveCorpusRLE(corpus: *vsa.TextCorpus, path: []const u8) !void
pub fn realLoadCorpusRLE(path: []const u8) !vsa.TextCorpus
pub fn realRLECompressionRatio(corpus: *vsa.TextCorpus) f64

VIBEE Specificationโ€‹

Added to specs/tri/vsa_imported_system.vibee:

# ADAPTIVE RLE COMPRESSION (TCV2 format)
- name: realSaveCorpusRLE
given: Corpus and file path
when: Saving corpus with adaptive RLE
then: Call corpus.saveRLE(path)

- name: realLoadCorpusRLE
given: File path
when: Loading RLE-compressed corpus
then: Call TextCorpus.loadRLE(path)

- name: realRLECompressionRatio
given: Corpus
when: Calculating RLE compression ratio
then: Call corpus.rleCompressionRatio()

Compression Comparisonโ€‹

FormatVersionBest CaseRandom CaseOverhead
Uncompressed-1x1x4 bytes
TCV1 (packed)v15x5x6 bytes
TCV2 (RLE)v28-10x~5x7 bytes

Critical Assessmentโ€‹

Strengthsโ€‹

  1. Adaptive - Only uses RLE when beneficial
  2. Backward compatible - Can read TCV1 via loadCompressed
  3. Per-entry decision - Optimal choice for each vector
  4. Lossless - Perfect data recovery

Weaknessesโ€‹

  1. Random data overhead - +1 byte per entry (rle_flag)
  2. Limited benefit - VSA vectors are typically random
  3. Escape byte handling - 3 bytes for 0xFF in data

Tech Tree Options (Next Cycle)โ€‹

Option A: Dictionary Compressionโ€‹

Build a dictionary of common packed byte patterns for better compression.

Option B: Delta Encodingโ€‹

Store differences between consecutive vectors for incremental updates.

Option C: Streaming I/Oโ€‹

Add chunked read/write for large corpora without full memory load.


Files Modifiedโ€‹

FileChanges
src/vsa.zigAdded rleEncode, rleDecode, saveRLE, loadRLE, estimateRLESize, rleCompressionRatio
src/vibeec/codegen/emitter.zigAdded realSaveCorpusRLE, realLoadCorpusRLE, realRLECompressionRatio generators
src/vibeec/codegen/tests_gen.zigAdded RLE test generators
specs/tri/vsa_imported_system.vibeeAdded 3 RLE behaviors
generated/vsa_imported_system.zigRegenerated with RLE + ConversationState fix

Conclusionโ€‹

VERDICT: IMMORTAL

Adaptive RLE compression provides TCV2 format with per-entry optimization. While random VSA vectors don't benefit from RLE, corpora with patterns or repeated entries achieve significant additional compression.

ฯ†ยฒ + 1/ฯ†ยฒ = 3 = TRINITY | KOSCHEI IS IMMORTAL | GOLDEN CHAIN ENFORCED