Simon Willison

As an experiment I downloaded the two-column PDF of this new paper from Google Research, "SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL": research.google/pubs/sql-has-p

... and uploaded it to Google AI Studio and told Gemini Pro 1.5 "Convert this document to neatly styled semantic HTML" - and the results were pretty good! static.simonwillison.net/stati

3 comments
Matt Campbell

@simon I'd be really worried about both hallucination and prompt injection when using an LLM for document conversion, as an accessibility tool for blind or other disabled users. But the tools I've tried on this paper do worse than what you got out of Gemini.

Simon Willison

@matt yeah, me too. The responsible way to do this would be to use Gemini Pro to create the first draft, then spend significant time and effort checking and verifying it, iterating on the prompts, porting across the figures, etc.

Chris Keene

@simon considering that PDFs are still hard to work with (e.g. select/copy/paste) after many years, I do wonder if this is an area where AI can help (based on what a user can see on screen, what do you think they want to copy?) rather than a traditional approach.
