Local LLM. Document never leaves your device.

Ask your PDF
anything.

Drop a PDF, ask questions, get answers with citations. The model runs in your browser. Nothing about your document touches a server.

Drop a PDF here_

or click below. Stays on this device, period.

PDF · multi-page · up to 200 MB in browser memory

Zero uploads. Model downloads once, then runs offline.
verify it yourself

Privacy as an instrument, not a claim.

01

The model downloads once, then runs offline.

The first time you load the tool, your browser downloads the Llama 3.2 3B model from Hugging Face. After that, the weights live in your browser cache. Disconnect from the internet and the tool still works.

02

Your document never goes to a server.

PDFs are read with pdfjs-dist in the page. Embeddings are computed locally with all-MiniLM-L6-v2 via transformers.js. Inference runs on your GPU through WebLLM. Open DevTools and watch the Network tab: after the model loads, traffic stays at zero.
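The local retrieval step can be sketched like this: rank the document's text chunks against the question by cosine similarity between their embeddings. This is a minimal illustration, not the tool's actual code; the function and field names (`rankChunks`, `embedding`, `page`) are assumptions, and the real pipeline produces the vectors with all-MiniLM-L6-v2 via transformers.js.

```javascript
// Cosine similarity between two embedding vectors of equal length.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// chunks: [{ text, page, embedding }] — illustrative shape, not the tool's API.
// Returns the topK chunks most similar to the question embedding; their
// page numbers are what the citations point back to.
function rankChunks(chunks, queryEmbedding, topK = 3) {
  return chunks
    .map((c) => ({ ...c, score: cosine(c.embedding, queryEmbedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

Everything above runs in the page; no vector ever needs to leave the device.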

03

CSP locks the page to your device.

The page's Content Security Policy allows scripts and styles only from this origin. The only external destinations permitted are the Hugging Face hosts (huggingface.co, hf.co) that serve the model weights and raw.githubusercontent.com, which serves the compiled WebGPU shaders, both for the one-time model download. Anything else would require a change to the CSP, and that change would be visible in this page's source.
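A policy along those lines could look like the following. This is a sketch, not the page's exact header; the real directive values live in the page's source, and the host list would need to match wherever the weights actually resolve.

```http
Content-Security-Policy:
  default-src 'self';
  script-src 'self';
  style-src 'self';
  connect-src 'self' https://huggingface.co https://hf.co https://raw.githubusercontent.com
```

With `connect-src` pinned like this, a fetch to any other host is blocked by the browser itself, which is what makes the claim checkable rather than a promise.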

run it offline

Don't trust the page. Take it home.

single file

Standalone HTML

One index.html plus the vendored libraries (pdfjs, transformers.js, MiniLM, WebLLM). The model itself still downloads on first use unless you save it from your browser cache.

Download the source →
self-host

Self-host on your own domain

Drop the directory on any static host. The README has the nginx CSP block and the Dockerfile to run it on Cloud Run, Fly, or your own box.
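A self-host setup along those lines might look like this nginx server block. It is a sketch with assumed paths and an illustrative CSP value; copy the exact header and Dockerfile from the README rather than this example.

```nginx
# Sketch: serve the static directory and pin the CSP at the server.
server {
    listen 80;
    root /var/www/ask;    # assumed path to the downloaded directory
    index index.html;

    # Illustrative policy — use the README's nginx CSP block verbatim.
    add_header Content-Security-Policy
        "default-src 'self'; connect-src 'self' https://huggingface.co https://hf.co https://raw.githubusercontent.com";
}
```

Because the host is purely static, the same directory works unchanged on Cloud Run, Fly, or any box that can run nginx.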

Read the self-host guide →
open source

View source on GitHub

MIT licensed. Check the CSP, the prompt template, the embedding pipeline. Submit issues for documents we read poorly.

github.com/xjmani/ask →