A collection of interactive demos showcasing machine learning models running directly in your browser. These demos use Transformers.js to run inference locally, so your data never leaves your device.
Visual Question Answering
Ask questions about images and get natural language answers powered by the Moondream vision-language model.
Moondream2 · SigLIP · Phi-1.5
Try Demo
Image Captioning
Generate natural language captions for any image using a Vision Transformer and GPT-2 decoder.
ViT-GPT2 · ~100MB cached
Try Demo
Privacy
Since these models run entirely in your browser, your images never leave your device. No data is sent to any server.
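As a rough illustration of what local inference can look like, the sketch below wraps an image-to-text pipeline of the kind Transformers.js exposes. The demos' actual source is not shown on this page, so the helper names (`firstCaption`, `captionImage`), the package name, and the model ID in the comments are assumptions rather than the demos' real code.

```typescript
// Sketch only: names, package, and model ID below are assumptions,
// not taken from the demos' source.

// Shape of one result from an image-to-text pipeline.
interface CaptionResult {
  generated_text: string;
}

// A captioner takes an image URL and resolves to a list of results.
type Captioner = (imageUrl: string) => Promise<CaptionResult[]>;

// Return the first caption with surrounding whitespace removed,
// or an empty string when the pipeline produced no results.
function firstCaption(results: CaptionResult[]): string {
  const first = results[0];
  return first ? first.generated_text.trim() : "";
}

// Run the captioner on one image and return its first caption.
// The captioner itself decides where inference happens; with
// Transformers.js it runs inside the page.
async function captionImage(
  captioner: Captioner,
  imageUrl: string
): Promise<string> {
  return firstCaption(await captioner(imageUrl));
}

// In a browser page, the captioner would typically come from
// Transformers.js (assumed package name and model ID):
//
//   import { pipeline } from "@huggingface/transformers";
//   const captioner = await pipeline(
//     "image-to-text",
//     "Xenova/vit-gpt2-image-captioning"
//   );
//   const caption = await captionImage(captioner as Captioner, "photo.jpg");
```

Passing the captioner in as a parameter keeps the helper independent of how the model is loaded, which also makes it easy to exercise with a stub before any model weights are downloaded.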
