Edge Lab
A lab instrument for Gemma on iPhone, not an AI Edge Gallery clone. Import your .litertlm, run four presets, and export a versioned JSON manifest with decode tok/s, time to first token (TTFT), requested vs actual backend, and thermal state. Fully local. No cloud. No hidden toggles.
Reference run · iPhone 16 Pro Max

gemma-4-E2B-it.litertlm · LiteRT-LM 0.12.0 · iOS 26.6 · 256 decode tokens per preset
Why this exists
The Gemma 4 Benchmark Suite answers "how smart is this model on desktop hardware?" Edge Lab answers "how fast is this exact .litertlm on your phone, with settings you can reproduce?" Same model family, different instrument — quality scoring vs edge throughput manifests.
Greedy GPU · gpu · 14.6 tok/s · 10 s wall
Sampled GPU · gpu · 39.8 tok/s · 7 s wall
Greedy CPU · cpu · 4.7 tok/s · 52 s wall
Sampled CPU · cpu · 4.9 tok/s · 60 s wall
Four named presets
Greedy vs sampled × GPU vs CPU. Names match what's logged — not opaque Gallery UI labels.
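The 2 × 2 matrix can be sketched in a few lines of Python. This is an illustration only, not the app's code: the four display names come from this page, while the `decoding` key and the `build_presets` helper are hypothetical.

```python
from itertools import product

# Assumption: lowercase identifiers for the two axes of the matrix.
DECODING = ("greedy", "sampled")
BACKENDS = ("gpu", "cpu")


def build_presets():
    """Return the four presets, GPU pair first, as dictionaries."""
    presets = []
    # Backend is the outer axis so both GPU presets come before both CPU ones.
    for backend, decoding in product(BACKENDS, DECODING):
        presets.append({
            "name": f"{decoding.capitalize()} {backend.upper()}",
            "decoding": decoding,            # hypothetical key
            "requested_backend": backend,    # field name taken from this page
        })
    return presets


PRESETS = build_presets()
for preset in PRESETS:
    print(preset["name"])
```

Deriving the name from the two axes keeps the label and the logged settings from drifting apart, which is the point of naming presets after what is logged.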
Backend honesty
Every run logs requested_backend next to the actual backend, plus did_fallback for cases where artisan-only weights force a CPU run onto the GPU.
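A reader of an exported manifest could cross-check those three fields with a sketch like the one below. It assumes that did_fallback is a boolean that is true exactly when requested_backend and backend differ. That semantics is an assumption, not something this page specifies, and `check_backend_honesty` is a hypothetical helper.

```python
def check_backend_honesty(run):
    """Return a list of problems for one run record; empty means consistent."""
    requested = run.get("requested_backend")
    actual = run.get("backend")
    if requested is None or actual is None:
        return ["missing requested_backend or backend"]

    problems = []
    fell_back = requested != actual
    # Assumption: did_fallback should mirror whether the backends differ.
    if fell_back != bool(run.get("did_fallback", False)):
        problems.append(
            f"did_fallback={run.get('did_fallback')!r} but "
            f"requested={requested!r}, actual={actual!r}"
        )
    return problems


# A CPU request that ended up on the GPU and says so: consistent.
honest = {"requested_backend": "cpu", "backend": "gpu", "did_fallback": True}
# The same situation without the flag set: reported as a problem.
silent = {"requested_backend": "cpu", "backend": "gpu", "did_fallback": False}

print(check_backend_honesty(honest))
print(check_backend_honesty(silent))
```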
Share sheet export
Copy X thread text, JSON, or Markdown from the device. Drop manifests into GitHub or a blog.
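For readers who want to post-process the JSON export themselves, here is a rough sketch that turns a manifest into a Markdown table. The manifest layout is hypothetical: the `runs`, `preset`, `decode_tok_s`, and `wall_s` keys are assumptions made for illustration, and the real export may be shaped differently. The numbers are the reference run quoted on this page.

```python
import json

# Hypothetical manifest excerpt; key names are assumptions, values are
# the reference-run figures from this page.
MANIFEST_JSON = """
{
  "runs": [
    {"preset": "Greedy GPU",  "backend": "gpu", "decode_tok_s": 14.6, "wall_s": 10},
    {"preset": "Sampled GPU", "backend": "gpu", "decode_tok_s": 39.8, "wall_s": 7},
    {"preset": "Greedy CPU",  "backend": "cpu", "decode_tok_s": 4.7,  "wall_s": 52},
    {"preset": "Sampled CPU", "backend": "cpu", "decode_tok_s": 4.9,  "wall_s": 60}
  ]
}
"""


def manifest_to_markdown(manifest):
    """Render one table row per run, with right-aligned numeric columns."""
    header = "| Preset | Backend | Decode tok/s | Wall (s) |"
    rule = "| --- | --- | ---: | ---: |"
    rows = [
        f"| {r['preset']} | {r['backend']} | {r['decode_tok_s']:.1f} | {r['wall_s']} |"
        for r in manifest["runs"]
    ]
    return "\n".join([header, rule, *rows])


TABLE = manifest_to_markdown(json.loads(MANIFEST_JSON))
print(TABLE)
```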
BYOM
Any compatible .litertlm from HuggingFace litert-community or your own builds.
vs AI Edge Gallery
Google's demo is closed-source on iOS. Edge Lab is Apache-2.0, with every knob recorded in the export.
Not a benchmark suite
No LLM judge, no 403-test scoring grid — one matrix run, one manifest, ship the numbers.