I built it with Claude 3.5 Sonnet, initially as an Artifact prototype, then upgraded it to a standalone browser app that talks directly to the CORS-enabled Gemini API and stores the user’s API key in localStorage.
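For anyone curious what that pattern looks like, here is a minimal sketch. It is not the actual code from the app. The storage key name `gemini-api-key` and the helpers `getApiKey` and `buildGeminiRequest` are made up for illustration, and the request assumes the Gemini REST `generateContent` endpoint with the key passed as a query parameter.

```typescript
// Minimal shape of the storage we need; the browser's localStorage satisfies it.
interface KeyStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical storage key name, chosen for this sketch.
const STORAGE_KEY = "gemini-api-key";

// Return the saved API key; if none is stored, ask the user and save the answer.
function getApiKey(store: KeyStore, ask: () => string | null): string | null {
  const saved = store.getItem(STORAGE_KEY);
  if (saved) return saved;
  const entered = ask();
  if (entered) store.setItem(STORAGE_KEY, entered);
  return entered || null;
}

interface GeminiRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build a generateContent request carrying a text prompt plus a base64-encoded JPEG.
// Assumption: the v1beta REST endpoint and JSON body shape shown here.
function buildGeminiRequest(
  apiKey: string,
  prompt: string,
  jpegBase64: string,
  model: string = "gemini-1.5-pro"
): GeminiRequest {
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    model +
    ":generateContent?key=" +
    encodeURIComponent(apiKey);
  const body = JSON.stringify({
    contents: [
      {
        parts: [
          { text: prompt },
          { inline_data: { mime_type: "image/jpeg", data: jpegBase64 } },
        ],
      },
    ],
  });
  return {
    url,
    init: { method: "POST", headers: { "Content-Type": "application/json" }, body },
  };
}
```

In a browser you would call `getApiKey(localStorage, () => prompt("Gemini API key:"))` and then pass the result of `buildGeminiRequest(...)` to `fetch(req.url, req.init)`. Since there is no server in between, the key never leaves the user's browser except in the request to Google.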
Full details and prompt transcripts showing how I built it are here on my blog: https://simonwillison.net/2024/Aug/26/gemini-bounding-box-visualization/
How I built this is a pretty good illustration of how convoluted my workflow for getting useful results out of LLMs has become: I used Claude 3.5 Sonnet to build a web app for talking to Gemini 1.5 Pro, and fired up GPT-4o with Code Interpreter to help debug a weird JPEG issue.