On-device AI · Live

Edge Lab

A lab instrument for Gemma on iPhone — not a Gallery clone. Import your .litertlm, run four presets, export a versioned JSON manifest with decode tok/s, TTFT, requested vs actual backend, and thermal state. Fully local. No cloud. No hidden toggles.

Reference run · iPhone 16 Pro Max

[Figure: Edge Lab experiment matrix showing four presets with decode tok/s and wall-clock times]

gemma-4-E2B-it.litertlm · LiteRT-LM 0.12.0 · iOS 26.6 · 256 decode tokens per preset

Why this exists

The Gemma 4 Benchmark Suite answers "how smart is this model on desktop hardware?" Edge Lab answers "how fast is this exact .litertlm on your phone, with settings you can reproduce?" Same model family, different instrument — quality scoring vs edge throughput manifests.

Greedy GPU · gpu · 14.6 tok/s · 10s wall

Sampled GPU · gpu · 39.8 tok/s · 7s wall

Greedy CPU · cpu · 4.7 tok/s · 52s wall

Sampled CPU · cpu · 4.9 tok/s · 60s wall
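To compare backends at a glance, the reference numbers above can be reduced to a GPU-over-CPU speedup per decoding mode. This is a minimal sketch: the throughput values are copied from the reference run, while the dictionary layout and the `gpu_speedup` helper are illustrative names, not part of Edge Lab.

```python
# Decode throughput (tok/s) copied from the reference run above.
# The (mode, backend) key layout is an assumption made for this sketch.
REFERENCE_TOK_S = {
    ("greedy", "gpu"): 14.6,
    ("sampled", "gpu"): 39.8,
    ("greedy", "cpu"): 4.7,
    ("sampled", "cpu"): 4.9,
}


def gpu_speedup(tok_s):
    """Return the GPU-over-CPU decode speedup for each decoding mode."""
    modes = sorted({mode for mode, _backend in tok_s})
    return {mode: tok_s[(mode, "gpu")] / tok_s[(mode, "cpu")] for mode in modes}


speedup = gpu_speedup(REFERENCE_TOK_S)
for mode, ratio in speedup.items():
    print(f"{mode}: GPU decode is {ratio:.1f}x the CPU rate")
```

On these numbers the ratio is roughly 3.1x for greedy decoding and 8.1x for sampled decoding. A single run on one device is a data point, not a general claim about either backend.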

Four named presets

Greedy vs sampled × GPU vs CPU. Names match what's logged — not opaque Gallery UI labels.
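The four presets are a plain cross product of decoding mode and backend. A rough sketch of how such a matrix could be enumerated is below. The display names match the ones on this page and `requested_backend` is a field named on this page, but the other dictionary keys are hypothetical and may not match the app's actual manifest schema.

```python
from itertools import product

DECODING_MODES = ("Greedy", "Sampled")
BACKENDS = ("gpu", "cpu")


def preset_matrix():
    """Enumerate presets as backend x decoding mode, GPU presets first."""
    return [
        {
            "name": f"{decoding} {backend.upper()}",  # e.g. "Greedy GPU"
            "decoding": decoding.lower(),             # hypothetical key
            "requested_backend": backend,
        }
        for backend, decoding in product(BACKENDS, DECODING_MODES)
    ]


presets = preset_matrix()
for preset in presets:
    print(preset["name"], "->", preset["requested_backend"])
```

Deriving the names from the two axes keeps the label shown to the user and the label written to the log from drifting apart.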

Backend honesty

Each run records requested_backend alongside the actual backend, plus a did_fallback flag for cases where artisan-only weights force GPU on a CPU run.
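Once a manifest is exported, a reader can check for silent backend switches with a few lines of Python. The field names `requested_backend`, `backend`, and `did_fallback` come from the description above. The top-level `runs` list, the `preset` key, and the sample values are assumptions made for illustration only.

```python
import json


def find_fallbacks(manifest_text):
    """Return (preset, requested, actual) for runs that did not use the requested backend."""
    manifest = json.loads(manifest_text)
    flagged = []
    for run in manifest.get("runs", []):  # "runs" is a hypothetical wrapper key
        requested = run.get("requested_backend")
        actual = run.get("backend")
        if run.get("did_fallback") or requested != actual:
            flagged.append((run.get("preset"), requested, actual))
    return flagged


# Hypothetical manifest fragment, not real Edge Lab output.
SAMPLE = json.dumps({
    "runs": [
        {"preset": "Greedy GPU", "requested_backend": "gpu", "backend": "gpu", "did_fallback": False},
        {"preset": "Greedy CPU", "requested_backend": "cpu", "backend": "gpu", "did_fallback": True},
    ]
})

fallbacks = find_fallbacks(SAMPLE)
print(fallbacks)
```

Checking both the flag and the two backend fields is deliberate: if either signal is missing or inconsistent, the run is still surfaced for review.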

Share sheet export

Copy text for an X thread, JSON, or Markdown from the device. Drop manifests into GitHub or a blog.
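For readers who export only the JSON and want a Markdown table later, a conversion might look like the sketch below. The row keys (`preset`, `backend`, `decode_tok_s`, `wall_s`) are hypothetical and the rows reuse the reference numbers from this page. The app's own Markdown export may be formatted differently.

```python
def to_markdown(rows):
    """Render run rows as a Markdown table suitable for a README or blog post."""
    header = "| Preset | Backend | Decode tok/s | Wall (s) |"
    divider = "| --- | --- | ---: | ---: |"
    body = [
        f"| {row['preset']} | {row['backend']} | {row['decode_tok_s']:.1f} | {row['wall_s']} |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])


# Rows mirror the reference run on this page; the key names are assumptions.
ROWS = [
    {"preset": "Greedy GPU", "backend": "gpu", "decode_tok_s": 14.6, "wall_s": 10},
    {"preset": "Sampled GPU", "backend": "gpu", "decode_tok_s": 39.8, "wall_s": 7},
    {"preset": "Greedy CPU", "backend": "cpu", "decode_tok_s": 4.7, "wall_s": 52},
    {"preset": "Sampled CPU", "backend": "cpu", "decode_tok_s": 4.9, "wall_s": 60},
]

table = to_markdown(ROWS)
print(table)
```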

BYOM

Any compatible .litertlm from the Hugging Face litert-community organization or your own builds.

vs AI Edge Gallery

Google's demo is closed-source on iOS. Edge Lab is Apache-2.0, with every knob recorded in the export.

Not a benchmark suite

No LLM judge, no 403-test scoring grid — one matrix run, one manifest, ship the numbers.

Swift · LiteRT-LM · Gemma 4 · iOS · Local-first · Open Source