Technology
Blueprint Engine
The schema definition and generation engine at the core of LiteSeed — a versioned, constraint-enforced specification for every dataset.
What is a Blueprint?
A Blueprint is a versioned dataset specification.
A Blueprint defines the complete structure of a dataset: field names, types, statistical distributions, constraints and generation policies. It is the single source of truth for all generation runs.
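As a rough illustration of what such a specification might hold, the sketch below models a Blueprint as plain Python data. The class names (`FieldSpec`, `Blueprint`), the attribute names and the example `orders` dataset are all hypothetical; LiteSeed's actual schema format is not shown on this page.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class FieldSpec:
    """One column of the dataset (hypothetical shape)."""
    name: str
    type: str  # e.g. "numeric", "categorical", "uuid"
    distribution: Optional[Dict[str, Any]] = None


@dataclass(frozen=True)
class Blueprint:
    """A versioned dataset specification (hypothetical shape)."""
    name: str
    version: int
    fields: Tuple[FieldSpec, ...]
    constraints: Tuple[Dict[str, Any], ...] = ()
    policies: Dict[str, Any] = field(default_factory=dict)


# Example: a small, made-up "orders" dataset.
orders = Blueprint(
    name="orders",
    version=1,
    fields=(
        FieldSpec("order_id", "uuid"),
        FieldSpec("amount", "numeric", {"kind": "lognormal", "mu": 3.0, "sigma": 0.5}),
        FieldSpec(
            "status",
            "categorical",
            {"kind": "categorical", "values": ["paid", "refunded"], "weights": [0.95, 0.05]},
        ),
    ),
    constraints=({"type": "range", "field": "amount", "min": 0, "tier": "hard"},),
    policies={"seed": 42, "rows": 1000},
)
```

Keeping names, types, distributions, constraints and policies in one object is what lets a single document act as the source of truth for every run.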
Blueprint capabilities
Field types
10+ field types covering all common data structures in ML/AI training datasets.
- numeric: integer and float with distribution sampling
- categorical: enum values with configurable weights
- string: template-based string generation
- boolean: configurable true/false probability
- date: date range with configurable format
- uuid: RFC 4122 UUID generation
- computed: derived fields with dependency resolution
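The field types listed above could be generated along the following lines. This is a minimal sketch using only the Python standard library, not LiteSeed's engine: the spec keys (`mean`, `std`, `p_true`, `template`, `depends_on`, `fn`) and the `{n}` template placeholder are assumptions made for the example. Computed fields are ordered with a topological sort so each one runs after the fields it depends on.

```python
import random
import uuid
from datetime import date, timedelta
from graphlib import TopologicalSorter


def generate_row(fields, rng):
    """Generate one row; computed fields are resolved after their dependencies."""
    # Map each field to the set of fields it depends on (empty for most types).
    graph = {name: set(spec.get("depends_on", ())) for name, spec in fields.items()}
    row = {}
    for name in TopologicalSorter(graph).static_order():
        spec = fields[name]
        kind = spec["type"]
        if kind == "numeric":
            row[name] = rng.gauss(spec["mean"], spec["std"])
        elif kind == "categorical":
            row[name] = rng.choices(spec["values"], weights=spec["weights"])[0]
        elif kind == "string":
            row[name] = spec["template"].format(n=rng.randint(0, 9999))
        elif kind == "boolean":
            row[name] = rng.random() < spec["p_true"]
        elif kind == "date":
            span_days = (spec["end"] - spec["start"]).days
            day = spec["start"] + timedelta(days=rng.randint(0, span_days))
            row[name] = day.strftime(spec["format"])
        elif kind == "uuid":
            # Version 4 (random) UUID built from the seeded generator.
            row[name] = str(uuid.UUID(int=rng.getrandbits(128), version=4))
        elif kind == "computed":
            row[name] = spec["fn"](row)
        else:
            raise ValueError(f"unknown field type: {kind}")
    return row


# Hypothetical field specs; "price_with_tax" is listed first on purpose.
FIELDS = {
    "price_with_tax": {
        "type": "computed",
        "depends_on": ("unit_price",),
        "fn": lambda r: round(r["unit_price"] * 1.2, 2),
    },
    "unit_price": {"type": "numeric", "mean": 50.0, "std": 5.0},
    "tier": {"type": "categorical", "values": ["free", "pro"], "weights": [0.8, 0.2]},
    "email": {"type": "string", "template": "user_{n:04d}@example.com"},
    "active": {"type": "boolean", "p_true": 0.9},
    "created": {
        "type": "date",
        "start": date(2024, 1, 1),
        "end": date(2024, 12, 31),
        "format": "%Y-%m-%d",
    },
    "order_id": {"type": "uuid"},
}

row = generate_row(FIELDS, random.Random(7))
```

Passing a seeded `random.Random` instance keeps the sketch reproducible, which is one plausible way a generation run could be made repeatable.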
Statistical distributions
8 distribution types for precise control over statistical properties.
- normal, lognormal, gamma, uniform, poisson
- categorical: values + weights
- rare_event: base_value + rare_value + probability p
- mixture: blend of distributions with weights
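To make the eight distribution types concrete, here is a hedged sketch of a sampler built on Python's `random` module. The dictionary keys (`kind`, `mean`, `std`, `mu`, `sigma`, `shape`, `scale`, `low`, `high`, `lam`, `components`) are assumptions for illustration; only `base_value`, `rare_value`, `p`, `values` and `weights` are named in the list above. The standard library has no Poisson sampler, so the sketch uses Knuth's multiplication method, which is adequate for small rates.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's method: count uniform draws until their product falls below e^-lam."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def sample(dist, rng):
    """Draw one value from a distribution described by a plain dict."""
    kind = dist["kind"]
    if kind == "normal":
        return rng.gauss(dist["mean"], dist["std"])
    if kind == "lognormal":
        return rng.lognormvariate(dist["mu"], dist["sigma"])
    if kind == "gamma":
        return rng.gammavariate(dist["shape"], dist["scale"])
    if kind == "uniform":
        return rng.uniform(dist["low"], dist["high"])
    if kind == "poisson":
        return sample_poisson(dist["lam"], rng)
    if kind == "categorical":
        return rng.choices(dist["values"], weights=dist["weights"])[0]
    if kind == "rare_event":
        # Return rare_value with probability p, otherwise base_value.
        return dist["rare_value"] if rng.random() < dist["p"] else dist["base_value"]
    if kind == "mixture":
        # Pick one component by weight, then sample from it.
        component = rng.choices(dist["components"], weights=dist["weights"])[0]
        return sample(component, rng)
    raise ValueError(f"unknown distribution: {kind}")


rng = random.Random(1)
fraud_flag = {"kind": "rare_event", "base_value": 0, "rare_value": 1, "p": 0.02}
bimodal = {
    "kind": "mixture",
    "weights": [0.7, 0.3],
    "components": [
        {"kind": "uniform", "low": 0.0, "high": 1.0},
        {"kind": "uniform", "low": 10.0, "high": 11.0},
    ],
}
flags = [sample(fraud_flag, rng) for _ in range(10_000)]
blend = [sample(bimodal, rng) for _ in range(1_000)]
```

Because `mixture` recurses into `sample`, a component can itself be a `rare_event` or another `mixture`; whether LiteSeed allows such nesting is not stated here.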
Constraint system
A two-tier constraint system that enforces business rules and data validity.
- Hard constraints: reject and resample (up to 50 retries)
- Soft constraints: flag violations without blocking
- Constraint types: formula, range, regex, date_order, not_null, enum_only
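The two tiers can be sketched as a reject-and-resample loop. The function name `generate_valid_row`, the constraint names and the example row are hypothetical, and raising an error once the 50 retries are used up is an assumption: the page does not say what the engine does at that point.

```python
import random
import re

MAX_RETRIES = 50  # hard constraints: resample up to 50 times


def generate_valid_row(draw, hard, soft, rng, max_retries=MAX_RETRIES):
    """Redraw until every hard constraint passes; report soft violations as flags."""
    for _ in range(max_retries + 1):  # one initial draw plus the retries
        row = draw(rng)
        if all(check(row) for check in hard.values()):
            flags = [name for name, check in soft.items() if not check(row)]
            return row, flags
    # Assumption: give up loudly once the retry budget is exhausted.
    raise RuntimeError(f"hard constraints still failing after {max_retries} retries")


def draw_row(rng):
    """Hypothetical raw draw that sometimes violates the constraints below."""
    return {
        "amount": rng.gauss(100.0, 80.0),
        "start_day": rng.randint(1, 30),
        "end_day": rng.randint(1, 30),
        "sku": f"SKU-{rng.randint(0, 99999)}",
    }


HARD = {
    "amount_range": lambda r: 0.0 <= r["amount"] <= 500.0,  # range
    "start_before_end": lambda r: r["start_day"] <= r["end_day"],  # date_order
}
SOFT = {
    "sku_format": lambda r: re.fullmatch(r"SKU-\d{4}", r["sku"]) is not None,  # regex
}

row, flags = generate_valid_row(draw_row, HARD, SOFT, random.Random(3))
```

A row that fails a soft constraint is still returned, with the violated constraint names attached, so downstream code can decide whether to keep it.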
Blueprint versioning
Blueprints are versioned documents with parent-child lineage and hash-based provenance.
- Immutable Blueprint versions
- Parent-child lineage for schema evolution tracking
- Blueprint hash computed at generation time
- Version diff for field-level change comparison
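One way lineage, hashing and diffing could fit together is sketched below. SHA-256 over canonical JSON is an assumption, as are the function names and the `parent_hash` key; the page does not specify the hash algorithm or the storage format. The sketch treats each version as immutable by convention, building a new dict rather than editing an old one.

```python
import hashlib
import json


def blueprint_hash(spec):
    """Hash a canonical JSON form so that key order does not change the result."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def new_version(parent, fields):
    """Create a child version that records its parent's hash (None for a root)."""
    return {
        "version": parent["version"] + 1 if parent else 1,
        "parent_hash": blueprint_hash(parent) if parent else None,
        "fields": fields,
    }


def diff_fields(old, new):
    """Field-level comparison between two versions."""
    a, b = old["fields"], new["fields"]
    return {
        "added": sorted(b.keys() - a.keys()),
        "removed": sorted(a.keys() - b.keys()),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }


v1 = new_version(None, {"amount": {"type": "numeric"}})
v2 = new_version(
    v1,
    {
        "amount": {"type": "numeric", "distribution": "lognormal"},
        "status": {"type": "categorical"},
    },
)
changes = diff_fields(v1, v2)
```

Here `changes` reports `status` as added and `amount` as changed, and `v2["parent_hash"]` ties the new version back to the exact content of `v1`.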