LiteSeed

Product

Trace every dataset to its origin

Understand exactly how a dataset was created, from the original seed file through the Blueprint to every generated version.

Why it matters

Data provenance at every step

When a model behaves unexpectedly, teams need to trace the issue back to the data. LiteSeed records the full lineage of every dataset version (seed file, Blueprint version, run parameters and generation identity) so teams can always answer the question: where did this data come from?

Full traceability

Every dataset version links back to the seed file, Blueprint version and run that created it.

Visual lineage graph

Explore the full lineage as an interactive graph: Seed → Blueprint → Dataset Versions.

Audit-ready

Blueprint hash and generation seed are stored with every dataset version for verification.

Core capabilities

Lineage graph

Every dataset has a visual lineage graph that shows the full chain from seed file to dataset versions.

  • Nodes: Seed → Blueprint → Dataset Versions
  • Clickable nodes navigate to source records
  • Version count and row count per node
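
As an illustration only, the chain above can be modelled as nodes that each point to their parent. This is a minimal sketch, not LiteSeed's API: the Node class, its fields, the trace_to_seed helper and the sample IDs are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node in a hypothetical lineage graph (names are assumptions)."""
    node_id: str
    kind: str                        # "seed", "blueprint" or "dataset_version"
    parent: Optional["Node"] = None  # None for the seed file
    row_count: Optional[int] = None


def trace_to_seed(node: Node) -> list[str]:
    """Walk parent links from any node back to the seed file."""
    path = []
    current: Optional[Node] = node
    while current is not None:
        path.append(f"{current.kind}:{current.node_id}")
        current = current.parent
    return path


# Made-up sample data: Seed -> Blueprint -> Dataset Version
seed = Node("customers.csv", "seed", row_count=1_000)
blueprint = Node("bp-1", "blueprint", parent=seed)
version = Node("v3", "dataset_version", parent=blueprint, row_count=50_000)

print(" <- ".join(trace_to_seed(version)))
```

Starting from any dataset version, following the parent links answers the question "where did this data come from?" in a single walk.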

Blueprint version history

Every Blueprint has a full version history with parent-child relationships. Changes to distributions, constraints or policies create a new Blueprint version.

  • Parent Blueprint ID stored with every version
  • Blueprint hash for integrity verification
  • Full diff between Blueprint versions
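
To make the hash and diff ideas concrete, here is a minimal sketch under stated assumptions. The page does not say how the Blueprint hash is computed, so SHA-256 over canonical JSON is an assumption; the blueprint_hash and diff_blueprints functions, the record layout and the sample Blueprint contents are hypothetical too.

```python
import hashlib
import json


def blueprint_hash(blueprint: dict) -> str:
    """Assumed scheme: SHA-256 over JSON with sorted keys and no extra whitespace."""
    canonical = json.dumps(blueprint, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_blueprints(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for each top-level key that differs."""
    keys = old.keys() | new.keys()
    return {k: (old.get(k), new.get(k)) for k in sorted(keys) if old.get(k) != new.get(k)}


# Made-up Blueprint contents; changing a constraint yields a new version.
v1 = {"distributions": {"age": "normal"}, "constraints": ["age >= 18"], "policies": []}
v2 = {**v1, "constraints": ["age >= 21"]}

# Each version record keeps its parent ID and its own hash.
record_v1 = {"id": "bp-1", "parent_id": None, "hash": blueprint_hash(v1)}
record_v2 = {"id": "bp-2", "parent_id": "bp-1", "hash": blueprint_hash(v2)}

print(diff_blueprints(v1, v2))
```

Hashing a canonical encoding means two Blueprints with the same content always produce the same hash, regardless of key order, so a stored hash can later be recomputed and compared.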

Dataset version registry

Every generated dataset version is stored with its full generation identity: Blueprint ID, generation seed, row count, mode and Blueprint hash.

  • generationSeed and blueprintHash per version
  • Retention policy management
  • Version comparison and export
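
A registry entry of this kind could look like the sketch below. It is a hypothetical model, not LiteSeed's schema: the DatasetVersion class, the version_id field, the compare_identity helper and all sample values (including the mode string) are assumptions. The generation_seed and blueprint_hash fields stand in for generationSeed and blueprintHash.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DatasetVersion:
    """Hypothetical registry record holding the generation identity."""
    version_id: str        # assumed identifier, not named on this page
    blueprint_id: str
    generation_seed: int   # generationSeed
    row_count: int
    mode: str              # placeholder value below; real modes are not listed here
    blueprint_hash: str    # blueprintHash


def compare_identity(a: DatasetVersion, b: DatasetVersion) -> dict:
    """Return the generation-identity fields that differ between two versions."""
    left, right = asdict(a), asdict(b)
    return {
        name: (left[name], right[name])
        for name in left
        if name != "version_id" and left[name] != right[name]
    }


# Made-up sample records that differ only in their seed.
v1 = DatasetVersion("v1", "bp-1", 42, 50_000, "standard", "abc123")
v2 = DatasetVersion("v2", "bp-1", 43, 50_000, "standard", "abc123")

print(compare_identity(v1, v2))
```

Comparing two records field by field shows at a glance whether a difference between versions comes from the Blueprint, the seed or the run settings.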

Run history

Every run is a permanent record of a generation event. Runs store the full parameter set and link to the resulting dataset version.

  • Run status, progress and completion time
  • Download links for all export formats
  • Distribution summary in run report