Generating Anki flashcards from e-book screenshots

The friction of typing flashcards by hand is what makes people stop using them, and with the flashcards go the learning benefits of spaced repetition. I built a tool that bypasses that friction: upload photos of a book’s glossary, get back a CSV that imports cleanly into an existing Anki deck.

The code is on GitHub, and there’s a live version to try on Hugging Face Spaces.

The build

An earlier version of the web app ran locally with Tesseract. It was free, ran on a laptop, and needed no API keys. That worked when books were more likely to have dedicated glossary sections, which is less common now. A lot of useful vocabulary isn’t in a glossary at all; it’s introduced inline, in bold. In a sentence like “Classification is a problem of assigning a label to an unlabeled example,” only one of the three bolded terms is being defined, while the others are cross-references. Pattern-matching can’t tell a definition from a cross-reference, so it stopped being a workable solution.

The current pipeline looks like this: upload a photo to an Express backend → ship it to Claude’s vision API → parse the JSON response → render an editable table → export a CSV. The whole stack came to about 300 lines of code.
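As a rough illustration of the “parse the JSON response” step, here is a minimal sketch of defensive parsing. It assumes the model was asked for a JSON array of objects with term and definition fields; that shape and the function name parseCards are my assumptions for the example, not necessarily what the app uses.

```javascript
// Hypothetical helper: pull a JSON array of cards out of the model's text reply.
// Assumes each card looks like { "term": "...", "definition": "..." }.
function parseCards(text) {
  // The model may wrap the array in extra prose, so locate the outermost brackets.
  const start = text.indexOf('[');
  const end = text.lastIndexOf(']');
  if (start === -1 || end <= start) return [];

  let data;
  try {
    data = JSON.parse(text.slice(start, end + 1));
  } catch {
    return []; // Not valid JSON: treat as "no cards" rather than crashing.
  }
  if (!Array.isArray(data)) return [];

  // Keep only well-formed cards and trim stray whitespace.
  return data
    .filter((c) => c && typeof c.term === 'string' && typeof c.definition === 'string')
    .map((c) => ({ term: c.term.trim(), definition: c.definition.trim() }))
    .filter((c) => c.term !== '' && c.definition !== '');
}
```

Returning an empty array on bad input keeps the editable table usable even when a reply is malformed; a real app might prefer to surface the error to the user instead.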

What surprised me

Most of the engineering wasn’t in the JavaScript. It was in three other places.

The prompt did more work than the code did. Telling the model explicitly to skip bold words that are merely referenced fixed almost every false positive in one edit. Iterating on the wording of that prompt was the highest-leverage work in the whole build.
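To make the idea concrete, here is a hypothetical prompt of the kind described above. It is not the prompt the app actually uses; the wording, the constant name EXTRACTION_PROMPT, and the JSON shape are all illustrative assumptions.

```javascript
// Hypothetical extraction prompt (illustrative only, not the app's real prompt).
// The key instruction is the explicit rule about skipping cross-referenced bold terms.
const EXTRACTION_PROMPT = [
  'You are given a photo of a page from a book.',
  'Extract only the terms that the text actually defines.',
  'A bold word counts as defined only if the sentence explains what it means.',
  'Skip bold words that are merely referenced or cross-referenced.',
  'Respond with a JSON array of objects, each with "term" and "definition" keys.',
  'If the page defines no terms, respond with an empty JSON array.',
].join('\n');
```

Keeping each rule on its own line makes it easier to change one instruction at a time and see which edit removed which false positive.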

The browser lied about file types. An uploaded WebP file came through labeled image/jpeg. Claude’s API rejected the mismatch, correctly. The fix was sniffing the actual format from the first few bytes of the file (“magic numbers”) and ignoring what the upload claimed.
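A minimal sketch of that byte-sniffing idea, covering the four image formats Claude’s vision API accepts (JPEG, PNG, GIF, WebP). The function name sniffImageType is hypothetical; the byte signatures themselves are the standard ones for each format.

```javascript
// Hypothetical helper: detect an image's real MIME type from its leading bytes.
// Accepts a Node Buffer or any Uint8Array; returns null for unrecognized data.
function sniffImageType(bytes) {
  // True if the byte sequence `sig` appears at `offset`.
  const startsWith = (sig, offset = 0) =>
    bytes.length >= offset + sig.length &&
    sig.every((b, i) => bytes[offset + i] === b);

  // JPEG: FF D8 FF
  if (startsWith([0xff, 0xd8, 0xff])) return 'image/jpeg';
  // PNG: 89 'P' 'N' 'G' 0D 0A 1A 0A
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'image/png';
  // GIF: "GIF8" (covers both GIF87a and GIF89a)
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return 'image/gif';
  // WebP: "RIFF" at byte 0, a 4-byte size, then "WEBP" at byte 8
  if (startsWith([0x52, 0x49, 0x46, 0x46]) && startsWith([0x57, 0x45, 0x42, 0x50], 8)) {
    return 'image/webp';
  }
  return null;
}
```

The server would then send the sniffed type as the image’s media type and reject the upload when the result is null, regardless of what the browser reported.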

Anki has its own CSV grammar. The first export produced cards with every term and definition on the front and a blank back. A plain CSV doesn’t tell Anki anything about how to map columns to fields or which deck to create. The fix was learning that Anki reads #header:value lines at the top of a CSV file. With five extra lines, the import became one click.
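Here is a sketch of what such an export could look like. The five header lines below (separator, html, notetype, deck, columns) are my guess at a sensible set based on Anki’s file-header syntax; the post doesn’t say which five the app writes, and the function names and the Basic note type with Front/Back fields are assumptions.

```javascript
// Quote a CSV field: wrap it in double quotes and double any embedded quotes.
function csvField(value) {
  return '"' + String(value).replace(/"/g, '""') + '"';
}

// Hypothetical exporter: cards is an array of { term, definition } objects.
function toAnkiCsv(cards, deckName) {
  // File headers Anki reads before the data rows (assumed set of five).
  const headers = [
    '#separator:Comma',
    '#html:false',
    '#notetype:Basic',
    `#deck:${deckName}`,
    '#columns:Front,Back',
  ];
  // One row per card: term on the front, definition on the back.
  const rows = cards.map((c) => `${csvField(c.term)},${csvField(c.definition)}`);
  return [...headers, ...rows].join('\n') + '\n';
}
```

Quoting every field is slightly verbose, but it means commas and quotation marks inside a definition can’t shift text into the wrong column.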

These problems are worth knowing about in advance. They lived in the gap between the docs and what the system actually did when I ran it.

A note on hosting

The app is now live on Hugging Face Spaces. For a while, hosting it publicly wasn’t an option: every extraction is a paid API call, and sharing my own API key would let any visitor drive up my personal costs. Guarding against that felt out of scope until I realized the app could simply ask each user to enter their own API key.
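A minimal sketch of the bring-your-own-key idea: the server builds each request to Anthropic’s Messages API from the key the visitor supplied and never writes it anywhere. The function name buildVisionRequest and its parameter names are hypothetical, and the model ID is left as a parameter because the post doesn’t say which model the app calls.

```javascript
// Hypothetical helper: build (but do not send) a Messages API request
// that uses the visitor's own key for this one call only.
function buildVisionRequest({ apiKey, model, mediaType, imageBase64, prompt }) {
  if (typeof apiKey !== 'string' || apiKey.trim() === '') {
    throw new Error('An Anthropic API key is required for each request.');
  }
  return {
    url: 'https://api.anthropic.com/v1/messages',
    options: {
      method: 'POST',
      headers: {
        'x-api-key': apiKey.trim(), // The user's key; not logged or persisted here.
        'anthropic-version': '2023-06-01',
        'content-type': 'application/json',
      },
      body: JSON.stringify({
        model, // Caller chooses a vision-capable model ID.
        max_tokens: 2048, // Arbitrary cap chosen for this sketch.
        messages: [
          {
            role: 'user',
            content: [
              {
                type: 'image',
                source: { type: 'base64', media_type: mediaType, data: imageBase64 },
              },
              { type: 'text', text: prompt },
            ],
          },
        ],
      }),
    },
  };
}
```

A route handler could pass the result straight to fetch(url, options). Because the key lives only in that one request object, nothing remains on the server once the response is returned.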

Bottom line: paste your own Anthropic key into the app. It’ll be used for that one request and not stored. Everyone pays only for their own usage. No shared bill whatsoever. The source is on GitHub, and the video above shows it in action.