// How It Works
Compilation Pipeline
>>
LEX
DFA-based lexer tokenizes TypeScript source into a flat stream of tokens. Handles string interpolation, regex literals, JSX fragments, and every ES2024 token type. Single-pass, zero-allocation where possible.
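As a rough illustration of the state-machine idea, the sketch below tokenizes a tiny subset (identifiers, integers, single-character punctuation) in one pass. The `Token` shape, the `tokenize` name, and the three states are assumptions made for this example; they are not the `chipts_lexer` API, and the real lexer covers far more token types.

```typescript
// Hypothetical sketch, not the real chipts_lexer API.
type TokenKind = "ident" | "number" | "punct";

interface Token {
  kind: TokenKind;
  text: string;
  start: number; // offset of the token's first character
}

// DFA states: "start" is the initial state, the others are inside a token.
type State = "start" | "ident" | "number";

const isIdentStart = (c: string): boolean => /[A-Za-z_$]/.test(c);
const isIdentPart = (c: string): boolean => /[A-Za-z0-9_$]/.test(c);
const isDigit = (c: string): boolean => /[0-9]/.test(c);

function tokenize(src: string): Token[] {
  const tokens: Token[] = [];
  let state: State = "start";
  let start = 0;

  // Run one position past the end so the final token is flushed.
  for (let i = 0; i <= src.length; i++) {
    const c = src.charAt(i); // "" once i reaches the end

    if (state === "ident") {
      if (isIdentPart(c)) continue; // stay in the same state
      tokens.push({ kind: "ident", text: src.slice(start, i), start });
      state = "start";
    } else if (state === "number") {
      if (isDigit(c)) continue;
      tokens.push({ kind: "number", text: src.slice(start, i), start });
      state = "start";
    }

    // In the start state: choose the next transition from the current character.
    if (c === "" || /\s/.test(c)) continue;
    start = i;
    if (isIdentStart(c)) state = "ident";
    else if (isDigit(c)) state = "number";
    else tokens.push({ kind: "punct", text: c, start: i });
  }
  return tokens;
}
```

Each character is examined once and the input is never backtracked over, which is what makes a single-pass design possible.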
{}
PARSE
Hand-written recursive-descent parser builds a typed AST. Covers declarations, expressions, statements, classes, enums, generics, decorators, and full destructuring patterns. Produces rich source-location spans for error reporting.
T:
TYPE-CHECK
Two-pass type checker: the first pass collects declarations and builds the type environment; the second verifies expressions and resolves generics. Types guide codegen decisions (layout, calling convention) and are erased before the binary is emitted, so they add zero runtime overhead.
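The two-pass structure can be sketched on a toy program model. Everything here (`FnDecl`, `typeCheck`, the error strings) is hypothetical and far simpler than `chipts_typeck`; generics are left out entirely. The point is only that collecting declarations first lets the second pass check calls to functions declared later in the file.

```typescript
// Hypothetical mini-AST for illustration; not the chipts_typeck data structures.
type Type = "number" | "string";

interface FnDecl {
  name: string;
  params: Type[];
  returns: Type;
  // For brevity, a body is just a list of calls with known argument types.
  calls: { callee: string; args: Type[] }[];
}

function typeCheck(program: FnDecl[]): string[] {
  const errors: string[] = [];

  // Pass 1: collect every declaration into the type environment.
  const env = new Map<string, FnDecl>();
  for (const fn of program) env.set(fn.name, fn);

  // Pass 2: verify each call against the collected signatures.
  for (const fn of program) {
    for (const call of fn.calls) {
      const target = env.get(call.callee);
      if (target === undefined) {
        errors.push(`${fn.name}: unknown function '${call.callee}'`);
        continue;
      }
      const ok =
        target.params.length === call.args.length &&
        target.params.every((p, i) => p === call.args[i]);
      if (!ok) {
        errors.push(`${fn.name}: bad arguments in call to '${call.callee}'`);
      }
    }
  }
  return errors;
}

// `main` calls `double` before `double` is declared; pass 1 makes that legal.
const sampleProgram: FnDecl[] = [
  {
    name: "main",
    params: [],
    returns: "number",
    calls: [
      { callee: "double", args: ["number"] }, // accepted
      { callee: "double", args: ["string"] }, // rejected: wrong argument type
    ],
  },
  { name: "double", params: ["number"], returns: "number", calls: [] },
];

const sampleErrors = typeCheck(sampleProgram);
```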
0x
CODEGEN
Typed AST is lowered to machine code through Cranelift IR (x86_64), a custom Thumb instruction emitter (ARM/RP2040/RP2350), or a WASM backend. Each target gets its own ABI, register allocation, and calling convention.
->
LINK
Custom ELF linker produces a self-contained static binary. No libc, no dynamic loader, no external dependencies. Syscalls are emitted directly. With --debug, DWARF debug info is embedded for GDB/LLDB debugging.
// Architecture
10 Rust Crates
chipts_lexer    Tokenizer / DFA lexer
chipts_parser   Recursive-descent parser + AST
chipts_typeck   Two-pass type checker
chipts_codegen  Cranelift x86_64 backend
chipts_arm      Custom ARM Thumb emitter
chipts_wasm     WebAssembly backend
chipts_linker   ELF linker + DWARF debug info
chipts_stdlib   Built-in standard library
chipts_runtime  Bump allocator + syscall layer
chipts_cli      CLI driver + build orchestration
// Memory Model
Bump Allocator
memory layout
┌─────────────────────────────────────────┐
│ HEAP (bump allocator)                   │
│                                         │
│  ┌──────┐ ┌──────────┐ ┌────────────┐   │
│  │String│ │  Array   │ │   Object   │   │
│  │ ptr ─┼→│ [len|cap]│ │ [keys|vals]│   │
│  │ len  │ │ [T,T,T,T]│ │ {k:v, k:v} │   │
│  └──────┘ └──────────┘ └────────────┘   │
│                                         │
│ ← alloc_ptr grows upward                │
├─────────────────────────────────────────┤
│ STACK (function frames, locals)         │
│ grows downward →                        │
├─────────────────────────────────────────┤
│ .rodata (string literals, constants)    │
├─────────────────────────────────────────┤
│ .text (machine code)                    │
└─────────────────────────────────────────┘
All heap allocations use a simple bump allocator: a pointer advances through a contiguous memory region. Zero fragmentation, deterministic performance, no GC pauses. Strings, arrays, and objects are reference-counted at the type level for deallocation.
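The pointer-bump idea fits in a few lines. The class below is a sketch written for this page, not the `chipts_runtime` code (which is Rust): the `BumpAllocator` name, the 8-byte default alignment, the `-1` failure value, and the `reset` method are all assumptions. Only `allocPtr` corresponds to something in the diagram (`alloc_ptr`).

```typescript
// Illustrative only; chipts_runtime internals are not shown in this document.
class BumpAllocator {
  private readonly memory: Uint8Array;
  private allocPtr = 0; // next free byte; only ever moves upward

  constructor(capacity: number) {
    this.memory = new Uint8Array(capacity);
  }

  // Returns the offset of a fresh block, or -1 when the region is exhausted.
  alloc(size: number, align = 8): number {
    // Round the current pointer up to the requested power-of-two alignment.
    const start = (this.allocPtr + align - 1) & ~(align - 1);
    const end = start + size;
    if (end > this.memory.length) return -1;
    this.allocPtr = end;
    return start;
  }

  // Releasing everything at once is a single pointer reset.
  reset(): void {
    this.allocPtr = 0;
  }

  get used(): number {
    return this.allocPtr;
  }
}

const heap = new BumpAllocator(64);
const first = heap.alloc(5);   // offset 0
const second = heap.alloc(16); // offset 8 (5 rounded up to the next multiple of 8)
const third = heap.alloc(64);  // -1: does not fit in the remaining space
```

Allocation is a bounds check plus an addition, which is where the deterministic cost comes from. The trade-off is that individual blocks cannot be returned to the allocator in this simple form; only the whole region can be reused.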
// Code Generation
Three Backends
>Cranelift (x86_64)
Production-grade code generation through Cranelift IR. Optimization passes include constant folding, dead code elimination, and function inlining.
>Custom Thumb (ARM)
Hand-written Thumb instruction emitter for RP2040 and RP2350. Generates bare-metal binaries with UF2 bootloader support.
>WASM Backend
Compiles to .wasm modules for browsers, edge runtimes, and WASI environments. Shared memory model with linear memory.
// No Runtime
Zero Overhead
>Direct syscalls — write, exit, mmap, and brk emitted inline
>Static linking — no dynamic loader, no libc, no shared libraries
>No garbage collector — deterministic bump allocator with arena reuse
>No interpreter loop — every expression compiles to native instructions
>Minimal binary size — 13-26 KB for typical programs
// What's Included
Language Features
>Variables — let, const with type inference and explicit annotations
>Functions — declarations, expressions, arrow functions, default/rest parameters
>Classes — constructors, methods, properties, inheritance, static members
>Interfaces — structural typing, optional properties, index signatures
>Enums — numeric and string enums with const evaluation
>Generics — type parameters, constraints, generic functions and classes
>Control Flow — if/else, switch, for, for-of, while, do-while, break, continue
>Destructuring — arrays, objects, nested patterns, rest elements, defaults
>Async/Await — promise-based async functions compiled to state machines
>Generators — yield expressions with iterator protocol support
>Template Literals — string interpolation with expression embedding
>Optional Chaining — ?. operator with null-safe property access
>Nullish Coalescing — ?? operator for default value resolution
>Spread Operator — array and object spreading in expressions and parameters
>Type Guards — typeof, instanceof, and user-defined type predicates
>Type Assertions — as-casting with compile-time verification
>Tuple Types — fixed-length typed arrays with element access
>Union Types — type narrowing through control flow analysis
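To show what "compiled to state machines" can mean for async functions, here is a hand-lowered version of a two-await function. It is a sketch of the general technique under assumed names (`sumAsync`, `sumLowered`, `step`); the code chipts actually generates is not shown in this document and may be structured differently.

```typescript
// Source form.
async function sumAsync(a: Promise<number>, b: Promise<number>): Promise<number> {
  const x = await a;
  const y = await b;
  return x + y;
}

// Hand-lowered equivalent: one state per await point (hypothetical shape).
function sumLowered(a: Promise<number>, b: Promise<number>): Promise<number> {
  return new Promise<number>((resolve, reject) => {
    let state = 0;
    let x = 0; // a local that must survive a suspension is hoisted out of step()

    const step = (value: number): void => {
      try {
        switch (state) {
          case 0: // entry: suspend on `a`
            state = 1;
            a.then(step, reject);
            return;
          case 1: // resumed with the value of `a`: suspend on `b`
            x = value;
            state = 2;
            b.then(step, reject);
            return;
          case 2: // resumed with the value of `b`: finish
            resolve(x + value);
            return;
        }
      } catch (err) {
        reject(err);
      }
    };

    step(0); // the initial value is ignored in state 0
  });
}
```

Each `await` becomes a numbered resume point, and a rejection of either input is forwarded to the outer promise.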
// Standard Library
Built-in Modules
console
log, error, warn, time, timeEnd
Direct syscall output — no buffering overhead
String
charAt, charCodeAt, includes, indexOf, lastIndexOf, slice, substring, trim, trimStart, trimEnd, padStart, padEnd, repeat, replace, split, toLowerCase, toUpperCase, startsWith, endsWith
15+ methods — heap-allocated, length-prefixed
Array
push, pop, shift, unshift, slice, splice, indexOf, includes, map, filter, reduce, forEach, find, findIndex, some, every, join, reverse, concat, flat
15+ methods — contiguous memory, capacity doubling
Math
abs, ceil, floor, round, max, min, pow, sqrt, log, log2, log10, sin, cos, tan, random, trunc, sign, PI, E
17 functions plus the constants PI and E, inlined where possible
Map
get, set, has, delete, clear, keys, values, entries, forEach, size
Hash map with linear probing
Set
add, has, delete, clear, keys, values, entries, forEach, size
Hash set built on Map internals
Number
toString, toFixed, isNaN, isFinite, parseInt, parseFloat
IEEE 754 double-precision floats
Object
keys, values, entries, assign, freeze, isFrozen
Property map with ordered insertion
JSON
parse, stringify
Recursive descent JSON parser + serializer
Date
now, getTime, toISOString, getFullYear, getMonth, getDate
Epoch-based via clock_gettime syscall
RegExp
test, exec, match, replace, search, split
NFA-based regex engine compiled from patterns
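For readers unfamiliar with the "linear probing" noted under Map, the sketch below shows the technique on a string-to-number table. It is not the `chipts_stdlib` implementation: the `ProbingMap` name, the FNV-1a-style hash, the 50% load-factor limit, and the initial capacity of 8 are assumptions, and `delete` is omitted because it needs tombstones or re-insertion.

```typescript
// Hypothetical sketch of open addressing with linear probing.
class ProbingMap {
  private keys: (string | undefined)[];
  private vals: number[];
  private count = 0;

  constructor(capacity = 8) {
    this.keys = new Array<string | undefined>(capacity).fill(undefined);
    this.vals = new Array<number>(capacity).fill(0);
  }

  // FNV-1a-style hash over UTF-16 code units, reduced to a slot index.
  private slotFor(key: string): number {
    let h = 0x811c9dc5;
    for (let i = 0; i < key.length; i++) {
      h ^= key.charCodeAt(i);
      h = Math.imul(h, 0x01000193);
    }
    return (h >>> 0) % this.keys.length;
  }

  // Walk forward from the home slot until the key or an empty slot is found.
  private probe(key: string): number {
    let i = this.slotFor(key);
    while (this.keys[i] !== undefined && this.keys[i] !== key) {
      i = (i + 1) % this.keys.length;
    }
    return i;
  }

  set(key: string, value: number): void {
    // Keep the table at most half full so probing always reaches an empty slot.
    if ((this.count + 1) * 2 > this.keys.length) this.grow();
    const i = this.probe(key);
    if (this.keys[i] === undefined) this.count++;
    this.keys[i] = key;
    this.vals[i] = value;
  }

  get(key: string): number | undefined {
    const i = this.probe(key);
    return this.keys[i] === key ? this.vals[i] : undefined;
  }

  has(key: string): boolean {
    return this.keys[this.probe(key)] === key;
  }

  get size(): number {
    return this.count;
  }

  // Double the table and reinsert every entry at its new home slot.
  private grow(): void {
    const oldKeys = this.keys;
    const oldVals = this.vals;
    this.keys = new Array<string | undefined>(oldKeys.length * 2).fill(undefined);
    this.vals = new Array<number>(oldKeys.length * 2).fill(0);
    this.count = 0;
    for (let i = 0; i < oldKeys.length; i++) {
      const k = oldKeys[i];
      if (k !== undefined) this.set(k, oldVals[i]);
    }
  }
}
```

Collisions are resolved by scanning adjacent slots, so lookups stay within one contiguous array, which tends to suit a runtime built around contiguous allocations.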
// CLI
Command Line Options
chipts --help
Usage: chipts [OPTIONS] <file.ts>

Arguments:
  <file.ts>                TypeScript source file to compile

Options:
  -o, --output <path>      Output binary path (default: a.out)
      --target <target>    Compilation target:
                             x86_64 (default)  Linux ELF binary
                             wasm              WebAssembly module
                             rp2040            Raspberry Pi Pico (UF2)
                             rp2350            Raspberry Pi Pico 2 (UF2)
                             qemu              QEMU LM3S6965 emulation
      --opt-level <level>  Optimization: 0 (none), 1 (basic), 2 (full)
      --emit-asm           Emit assembly alongside binary
      --emit-ir            Emit Cranelift IR for inspection
      --debug              Include DWARF debug information
      --no-std             Disable standard library
  -v, --verbose            Verbose compilation output
  -h, --help               Print this help message
  -V, --version            Print version
// Known Limitations
Heads Up
!Global mutation bug — re-assigning module-level variables inside functions may not propagate correctly
!String sort — Array.sort() on strings uses numeric comparison instead of lexicographic
!Function variables — storing functions in variables and calling them indirectly can fail in some edge cases
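For reference, the three patterns look like this in source. The comments describe standard TypeScript/JavaScript behavior; per the list above, a chipts-compiled binary may currently behave differently in these cases. The identifiers are made up for the example.

```typescript
// 1. Global mutation: a module-level variable reassigned inside a function.
let counter = 0;
function increment(): void {
  counter = counter + 1;
}
increment();
increment();
// Standard semantics: counter is now 2.

// 2. String sort: the default comparator orders strings lexicographically.
const names = ["pear", "apple", "fig"];
names.sort();
// Standard result: ["apple", "fig", "pear"].

// 3. Function variables: a function stored in a variable and called indirectly.
function square(n: number): number {
  return n * n;
}
const fn = square;
const squared = fn(7);
// Standard result: 49.
```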
// In The Pipeline
What's Next
[BUG]
Fix global mutation
Module-level variable reassignment inside functions doesn't always propagate. Needs IR-level indirection for global slots.
[BUG]
Fix string sort
Array.sort() on string arrays currently uses numeric comparison. Needs lexicographic comparator in the stdlib sort implementation.
[BUG]
Fix function variables
Storing functions in variables (let fn = myFunc) and calling them can fail due to incorrect closure capture in some cases.
[WIP]
Better async/await scheduling
Current state machine transform works but has edge cases with nested awaits and try/catch. Improving the scheduler for fairness and error propagation.
[TODO]
Dynamic imports
import() expressions for code splitting. Requires lazy linking support in the ELF backend.
[TODO]
Symbol type
Unique symbol support for property keys and well-known symbols (Symbol.iterator, Symbol.toPrimitive).
[TODO]
Decorators
TC39 stage 3 decorators for classes and class members. Compile-time metadata transformation.
[TODO]
WeakMap / WeakSet
Weak reference collections. Requires finalization hooks in the allocator for ephemeron semantics.
[TODO]
Property descriptors
Object.defineProperty, getOwnPropertyDescriptor, and the full descriptor protocol (get/set/configurable/enumerable/writable).
[TODO]
Logical assignment operators
&&=, ||=, and ??= operators. Straightforward lowering to existing comparison + assignment IR.
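The lowering for logical assignment can be pictured in source form: each operator becomes a test followed by a conditional assignment, so the right-hand side is evaluated only when the assignment happens. The `Options` type and `applyDefaults` function are invented for this sketch; how chipts will express this in IR is not specified here.

```typescript
interface Options {
  retries: number | null;
  label: string;
  verbose: boolean;
}

// Each statement is the desugared form of the operator named in the comment above it.
function applyDefaults(opts: Options, interactive: boolean): Options {
  // opts.retries ??= 3   (assign only when the target is null or undefined)
  if (opts.retries == null) opts.retries = 3;

  // opts.label ||= "job"   (assign only when the target is falsy)
  if (!opts.label) opts.label = "job";

  // opts.verbose &&= interactive   (assign only when the target is truthy)
  if (opts.verbose) opts.verbose = interactive;

  return opts;
}

const filled = applyDefaults({ retries: null, label: "", verbose: true }, false);
// filled is { retries: 3, label: "job", verbose: false }

const kept = applyDefaults({ retries: 0, label: "build", verbose: false }, true);
// kept is { retries: 0, label: "build", verbose: false }: 0 is falsy but not nullish
```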
// Roadmap
The Long Game
1
Full ECMAScript Compliance
Complete coverage of ES2024 specification. Every standard feature compiles correctly.
2
Package Manager Integration
Consume npm type definitions (.d.ts) for type checking third-party libraries. No runtime package support — types only.
3
Language Server Protocol
LSP server for IDE integration. Autocomplete, go-to-definition, diagnostics, hover info — all powered by the same type checker.
4
Multi-File Projects
Import resolution across files, dependency graph analysis, incremental compilation. Module bundling for WASM targets.
5
Garbage Collection
Optional tracing GC or reference counting for long-running programs. The bump allocator stays as the default for embedded and short-lived binaries.
6
More Embedded Targets
ESP32 (Xtensa/RISC-V), STM32 (Cortex-M), and Nordic nRF series. Bring TypeScript to every microcontroller.
7
Source Maps
Map compiled binary addresses back to TypeScript source locations. Better debugging, better stack traces.
8
REPL / Playground
Interactive TypeScript evaluation with JIT compilation. Web-based playground using the WASM backend.
9
Self-Hosting
Compile chipts with chipts. The ultimate milestone — proving the compiler is complete enough to build itself.