My notes on Google Research's new paper describing "pipe syntax", their alternative syntax for SQL queries which they've been rolling out internally since February https://simonwillison.net/2024/Aug/24/pipe-syntax-in-sql/
My notes on Google Research's new paper describing "pipe syntax", their alternative syntax for SQL queries which they've been rolling out internally since February https://simonwillison.net/2024/Aug/24/pipe-syntax-in-sql/ Can modern screen readers read academic papers that are published as two column PDFs? Do they know how to separate out the two columns?
Show previous comments
As an experiment I downloaded the two column PDF of this new paper from Google research "SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL" https://research.google/pubs/sql-has-problems-we-can-fix-them-pipe-syntax-in-sql/ ... and uploaded it to Google AI Studio and told Gemini Pro 1.5 "Convert this document to neatly styled semantic HTML" - and the results were pretty good! https://static.simonwillison.net/static/2024/Pipe-Syntax-In-SQL.html Lots of people are asking why Anthropic and OpenAI don't support OAuth, so you can bounce them through those providers to get a token that uses their API budget for your app My guess: they're worried malicious app developers would use it to trick people and obtain valid API keys Imagine a version of my dumb little "write a haiku about a photo you take" page which used OAuth, harvested API keys and then racked up hundreds of dollar bills against everyone who tried it out running illicit election interference campaigns or whatever An interesting thing about CORS is how poorly understood it is and how difficult it is to find a really clear explanation I’m not sure I could write a clear explanation myself The best I’ve seen is https://jakearchibald.com/2021/cors/
Show previous comments
@simon indeed is cryptic. The best reference I found was this one: https://jub0bs.com/posts/2023-02-08-fearless-cors/#cors-101 But the one you shared looks nice! @simon I tried my best in this section of an old post: https://jub0bs.com/posts/2023-02-08-fearless-cors/#cors-101 This story about why some companies are reconsidering their Microsoft Copliot 365 rollouts is amusing - in this case the problem is that the AI chatbot is /too/ effective, in that if you haven’t correctly configure permissions on documents like the employee salary spreadsheet anyone in your org who asks about it will get the right answer! https://simonwillison.net/2024/Aug/23/microsoft-copilot-data-governance/ @simon Fun story. I worked for a company that used Confluence for wikis and such. The VP of engineering would write all the private team meetings in here. Yet I was not part of the allowed members to see such content. However, I just told Confluence to e-mail me daily updates on when this VP posted. So I would get a daily e-mail that showed the contents of these meetings delivered conveniently to my inbox. Not sure if this would still work as it has been many years… 😀
[DATA EXPUNGED]
Any screen reader users able to help idenfity the best pattern for ensuring this proposed fediverse symbol gets read out loud correctly in different software? @simon Have you seen @heydon's take https://heydonworks.com/article/the-abbr-element/ on the `abbr`-element? @simon I'll file a report with Apple to see what's going on with Voiceover/other speech components. I'll also do some testing with different synthesizers. It could be that Apple will need to add this symbol to their emojis. @simon Tested in latest NVDA on Windows and this symbol is read as "Asterism," regardless of speech synthesizer. Using latest version of JAWS on Windows, it is not read at all. There is no pause, it's just skipped. Same for Windows Narrator. iOS 18 beta reads it as asterism, likewise for macOS beta. The only way to fix this is to beg and plead with screen reader maintainers to add a default pronunciation for ⁂. Users can add a replacement themselves, but this is a somewhat technical process. Sent out the latest edition of my newsletter - everything I’ve posted on my blog in the past couple of weeks, which it turns out adds up to a lot of stuff! https://simonw.substack.com/p/claudes-api-now-supports-cors-requests I like this idea, but I worry about accessibility I just tried using the Mobile Safari “read this page” feature and it skipped right over the ⁂ symbol as if it wasn’t even there @simon My screen reader (NVDA) announces it, but it's four syllables so might get a bit tedious. No idea about other SRs or how it manifests in braille. @FediverseSymbol maybe this is an Apple bug? At least one screen reader, NVDA, pronounces the symbol correctly as “Asterism” https://dragonscave.space/@jscholes/113009367662283225 @simon Ah, that’s annoying, would have expected “asterism” or some description. Depending on the usage, the symbol might be decorative and that’s maybe okay. But we need to think about cases where it convey meaning. Awesome new Anthropic API feature: you can now enable CORS support with a (currently undocumented) anthropic-dangerous-direct-browser-access: true request header - which means you can call their API directly from browser JavaScript now! My notes here: https://simonwillison.net/2024/Aug/23/anthropic-dangerous-direct-browser-access/ I used it to upgrade my fun little Haiku app, which uses your webcam to take a photo and then writes a Haiku about it using the Claude Haiku model Optimizing Datasette (and other weeknotes) https://simonwillison.net/2024/Aug/22/optimizing-datasette/ Armin’s take on what happens if Astral turn out to be a bad steward of uv is interesting: “[…] having seen the code and what uv is doing, even in the worst possible future this is a very forkable and maintainable thing. I believe that even in case Astral shuts down or were to do something incredibly dodgy licensing wise, the community would be better off than before uv existed.” @simon what if uv is the nose and nobody is looking at the Moon yet? I'm pretty sure their next product will be a PyPi alternative. Then they will add proprietary stuff. This will fragment the market and devs/companies will have to release Python packages to two places instead of one. Then it will become paid only etc... (one can only imagine the possibilities). I'm not saying this will 100% happens, but it's definitely possible. uv alone, is not an issue. Wow there is a LOT of stuff in the new release of uv - lockfiles, a pipx alternative, Poetry-style project management, even the ability to manage and download standalone Python versions directly It's taking me a while to dig through all of this https://astral.sh/blog/uv-unified-python-packaging
Show previous comments
@simon So much to unpack. As a minor-minor-nice-to-have (but really wish Python would settle on something here)...scaffold/template support: https://github.com/astral-sh/uv/issues/4759 Published some more detailed notes on today's huge new uv release https://simonwillison.net/2024/Aug/20/uv-unified-python-packaging/ New prompt injection data exfiltration attack today, this time against Slack and Slack AI It's a bit of a subtle one, but the net effect is that if you can get your malicious tokens into a Slack you can get their AI bot to trick users into exfiltrating private data by clicking on links My notes here: https://simonwillison.net/2024/Aug/20/data-exfiltration-from-slack-ai/ Original report by PromptArmor here: https://promptarmor.substack.com/p/data-exfiltration-from-slack-ai-via i quit my job just over 5 years ago to explain computer things (https://jvns.ca/blog/2019/09/13/a-year-explaining-computer-things/). I had no idea if I would like being my own boss but ultimately it's been really cool and I'm happy to have this weird job writing zines about computers. ("I’m not planning to hire employees or anything” turned out to not be an accurate prediction, now I work with 2 part-time employees who I don't know how I would manage without)
Show previous comments
@b0rk someone asked me recently how long it took me to get used to the rhythm of working for myself and I said “uh, maybe 3 years?”. I thought working for myself would be hard to adjust to and it was, but I'm happy I did it anyway My notes on trying out whisperfile, the new cross-platform executable packaging for the Whisper speech-to-text model, released as part of the latest update to llamafile
Show previous comments
Everyone who builds web applications should read the Reckoning series by @slightlyoff https://infrequently.org/series/reckoning/ My own notes here, but you should work through the entire thing: https://simonwillison.net/2024/Aug/18/reckoning/ Seriously, take a look at the case-study in which the California food stamps signup site takes 29.5s to become interactive on a slow rural mobile connection, and tell me we don't urgently need to do better! https://infrequently.org/2024/08/object-lesson/#the-golden-wait
Show previous comments
Killing all the tracking code to Google and all corporates makes our services on MPAQ, very fast. We are even working on self hosting a hit counter 😜 outsourcing is a major draw on performance. @simon @slightlyoff It is not just javascript. In spite of a high-bandwidth connection, I've been getting very bad response loading web pages recently. After diagnosing the problem, I found that there was 76% to 91% packet loss on DNS requests to the servers my ISP configures! I'm about to file a service request, but in the meantime, I changed DNS servers to publicly available ones. What made it worse was the number of DNS requests now needed to load a web page. @simon @slightlyoff I don't think saying Javascript is the cause of this is honest or helpful. I built a fun new Datasette plugin: datasette-checkbox, which looks for columns on a table called is_* or has_* or should_* and upgrades them to interactive checkboxes, provided the current user has update-row permission for that table https://simonwillison.net/2024/Aug/16/datasette-checkbox/ @simon I love that you share your transcripts. So cool to see the behind the scenes. 🙌 Eventually I wonder if we’ll land on a standardized/automated way to attach prompt transcripts to commits as a kind of provenance artifact. Interesting notes from Paul Gauthier on how asking an LLM to return code wrapped in a JSON object can result in a quality reduction compared to asking for that code in a less complex format such as fenced code Markdown blocks https://aider.chat/2024/08/14/code-in-json.html (Cross-posted from my blog: https://simonwillison.net/2024/Aug/16/llms-are-bad-at-returning-code-in-json/) Format encoding robs the network of capacity I said. Better to write encoder/decoders I said. Just because it can write #svg don't mean it should I said. 😉 Good to have data though. This might be interesting for you @diacritica |
@simon this looks a lot like Microsoft’s LINQ declarative query syntax.
@simon - This will be some hilarious prank on DBAs who have spent the last 2 decades irrationally railing against ORMs - Entity Framework in particular.
Under the hood, it's probably quite similar to LINQ and then an engine to translate the AST to SQL!
@simon Came for the SQL, stayed for the rant about PDFs with two columns 😅