Simon Willison

Can modern screen readers read academic papers that are published as two column PDFs? Do they know how to separate out the two columns?

14 comments
Raphael Fetzer :kirby:

@simon You can specify the reading order in a PDF document so the screen reader can follow it and doesn’t need to guess.

Simon Willison

@pheraph that’s reassuring! Do you know if published papers tend to do that? Any way for me to tell if this one works properly? storage.googleapis.com/gweb-re

Raphael Fetzer :kirby:

@simon I am no expert in that field, maybe @yatil has an answer. In my experience PDF files often have tons of accessibility issues. You can check your specific document here: pave-pdf.org/pave/index.html It should highlight if the reading order isn’t specified.

jleedev

@simon @pheraph That PDF isn't tagged, so the only way to read it is in the page content order, which fortunately is sensible enough to present one column and then the other, but things like footnotes and figures run inline.
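As a rough illustration of what "tagged" means here: a tagged PDF normally declares a structure tree (`/StructTreeRoot`) and a `/MarkInfo` dictionary with `/Marked true` in its document catalog. The sketch below only scans raw bytes for those markers. The function name `tagging_hints` and the two sample byte strings are made up for illustration. The check is a heuristic and not a validator: it can miss catalogs stored inside compressed object streams, so a negative result is not proof that a file is untagged.

```python
import re


def tagging_hints(pdf_bytes: bytes) -> dict:
    """Look for the two catalog entries that usually indicate a tagged PDF.

    Heuristic only: entries inside compressed object streams are invisible
    to a raw byte scan, so False can be a false negative.
    """
    return {
        "struct_tree_root": b"/StructTreeRoot" in pdf_bytes,
        "marked_true": re.search(rb"/Marked\s+true", pdf_bytes) is not None,
    }


# Hypothetical, heavily abbreviated catalog objects (not complete PDF files).
untagged = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
tagged = (
    b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R "
    b"/MarkInfo << /Marked true >> /StructTreeRoot 5 0 R >>\nendobj\n"
)

print(tagging_hints(untagged))
print(tagging_hints(tagged))
```

For a real file, the same function could be called on the bytes read from disk. A dedicated checker such as the one linked earlier in the thread will be far more reliable.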

James Scholes

@simon This specific PDF is not tagged for accessibility, and is literally unreadable with NVDA plus Acrobat Reader on Windows. For instance, here's an excerpt of what I'm hearing:

> SQLhasbeenextremelysuccessfulasthedefactostandardlanguageforworkingwithdata.Virtuallyallmainstreamdatabase-like systemsuseSQLastheirprimaryquerylanguage.ButSQLisan oldlanguagewithsignificantdesignproblems,makingitdifficultto learn,difficulttouse,

Karl Pettersson

@simon Have not tested, but it should be no problem if the PDF is properly tagged (otherwise it might be problematic).

Bruce Elrick

@simon
I would think you should be able to tell by watching the order in which the selection highlighting expands as you drag the cursor.

Whenever the selection jumps around in non-reading order I suspect a screen reader would also jump around.

Björn Töpel

@simon That feature is what I miss the most on reMarkable! I've never seen a reader supporting that. 😢

Mikołaj Hołysz

@simon Short answer: just don't. Preferably provide both an HTML alternative and the LaTeX source.

PDF is essentially a vector graphics format. The ultimate goal of PDF is a document that prints and displays in exactly the same way for everybody; everything else is secondary. In HTML, the "recommended way to do things" is to essentially say "put an h1 here" and let the browser deal with it, possibly with some help from your style sheet along the way. In PDF, you essentially say "hey, here's some text, put it 2.7 inches from the left margin, 16 point, use font so and so". If you were so inclined, you could even re-order the characters in your font and use completely nonsensical codepoints, and things would still pretty much work visually.

LaTeX definitely uses shenanigans like that. Polish diacritics, for example, aren't expressed as a single character. Instead, the English letter is used, along with some extra markup that tells the renderer where to draw the acute accents on the page. From an a11y perspective, those acute accents aren't actually part of the character; they're just random squiggles that the renderer happens to be told to draw. Some say that modern JS frameworks are crazy; I say that PDF is far, far crazier than that.
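A loose analogy in Unicode terms, as a sketch only: if a text extractor happens to emit the base letter followed by a combining acute accent (U+0301), NFC normalization can recombine the pair into a single character. If it emits a standalone spacing accent (U+00B4) instead, normalization does not help. What a given PDF actually yields depends on its fonts and on the extraction tool; the sample strings here are assumptions chosen for illustration.

```python
import unicodedata

# "środa" (Polish for Wednesday) with the accent as a combining mark.
decomposed = "s\u0301roda"
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))
print(composed == "\u015broda")

# The same word with a standalone spacing acute accent after the letter.
# NFC only applies canonical compositions, so this stays as it is.
spacing = "s\u00b4roda"
print(unicodedata.normalize("NFC", spacing) == spacing)
```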

Speaking of the two-column stuff in particular, I've seen it work and I've also seen it not work. This probably depends on where the text goes in the document, what it is rendered with, and what software you're using and what its a11y implementation is like.
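To make the two-column problem concrete, here is a minimal sketch assuming a reader only knows where each text fragment sits on the page. The `Fragment` type, the coordinates, and both ordering functions are hypothetical, and `y` is measured from the top of the page for simplicity (real PDF coordinates start at the bottom left). Sorting purely top to bottom interleaves the columns, while splitting at the page midpoint first recovers the intended order. Real layouts with figures, footnotes, or full-width headings need much more than this.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    x: float   # distance from the left edge
    y: float   # distance from the top edge (an assumption for simplicity)
    text: str


def naive_order(frags: list[Fragment]) -> list[str]:
    """Read strictly top to bottom, then left to right."""
    return [f.text for f in sorted(frags, key=lambda f: (f.y, f.x))]


def two_column_order(frags: list[Fragment], page_width: float) -> list[str]:
    """Read the whole left column first, then the right column."""
    mid = page_width / 2
    return [f.text for f in sorted(frags, key=lambda f: (f.x >= mid, f.y, f.x))]


frags = [
    Fragment(320, 100, "R1"), Fragment(72, 100, "L1"),
    Fragment(72, 120, "L2"), Fragment(320, 120, "R2"),
]

print(naive_order(frags))            # ['L1', 'R1', 'L2', 'R2']
print(two_column_order(frags, 612))  # ['L1', 'L2', 'R1', 'R2']
```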

Yes, there's a way to mark PDFs up for accessibility properly, but very few people do it, LaTeX makes it far harder, there are a lot of other problems (think math), and support among reading programs is... spotty at best.

Adam Marcus

@simon I believe @jbigham has a few things to say about this

Simon Willison

As an experiment I downloaded the two column PDF of this new paper from Google Research, "SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL" research.google/pubs/sql-has-p

... and uploaded it to Google AI Studio and told Gemini 1.5 Pro "Convert this document to neatly styled semantic HTML" - and the results were pretty good! static.simonwillison.net/stati

Matt Campbell

@simon I'd be really worried about both hallucination and prompt injection when using an LLM for document conversion, as an accessibility tool for blind or other disabled users. But the tools I've tried on this paper do worse than what you got out of Gemini.

Simon Willison

@matt yeah, me too. The responsible way to do this would be to use Gemini Pro to create the first draft, then spend significant time and effort checking and verifying it, iterating on the prompts, porting across the figures etc

Chris Keene

@simon considering that PDFs are still hard to work with (e.g. select/copy/paste) after many years, I do wonder if this is an area where AI can help (based on what a user can see on screen, what do you think they want to copy?) rather than a traditional approach
