Simon's wall

Spent the evening tinkering with Anthropic's new interactive prompting tutorial and OpenAI's new "improved file search result relevance". Wrote up a bunch of notes on them:

- My notes on Anthropic's Prompt Engineering Interactive Tutorial https://simonwillison.net/2024/Aug/30/anthropic-prompt-engineering-interactive-tutorial/
- And my notes on the new file chunking debug mode OpenAI added to their assistants API https://simonwillison.net/2024/Aug/30/openai-file-search/

Like 30 August at 4:29 | Open on fedi.simonwillison.net

Simon Willison

Using uvx to run a one-off Jupyter notebook against the current directory is a useful trick - I tried that for the first time today against the Anthropic Jupyter notebook interactive tutorials:

uvx --from jupyter-core jupyter notebook .

30 August at 6:01 | Open on fedi.simonwillison.net

Show 8 replies

Simon Willison

The piece of documentation I want most for the modern web is something that explains to me what variants of a "set-cookie:" header work in which modern browsers under which conditions

There's a ton of stuff out there about "Total Cookie Protection" in Firefox and "Privacy Sandbox" in Chrome, but I cannot figure out what it actually means for me as a web developer! I need protocol-level documentation for all of this stuff.

Like 29 August at 18:55 | Open on fedi.simonwillison.net

Simon Willison

A few years ago I put a bunch of work into figuring out the SameSite cookie attribute because the documentation for how that actually worked was so thin on the ground https://simonwillison.net/2021/Aug/3/samesite/

29 August at 18:56 | Open on fedi.simonwillison.net

Show 8 replies

Jeffrey Yasskin

@simon There's some work going on at https://johannhof.github.io/draft-annevk-johannhof-httpbis-cookies/draft-annevk-johannhof-httpbis-cookies.html to specify this. Does that draft at least improve the situation? I believe they're accepting complaints and suggestions.

29 August at 19:02 | Open on hachyderm.io

Show 4 replies

Melaskia

@simon Well, a very stupid summary with some elements of wrong.
1st party cookies with controlled subdomain and permissions will be fine.
The rest, notably 3rd party cookies are going to be very difficult (especially for FF and Safari since Chrome has kinda given up).

29 August at 19:19 | Open on mastodon.social

Show 2 replies

Simon Willison

Do you ever use LLM tools like Claude or ChatGPT to help code up exploratory prototypes?

(Specifically asking about prototyping here, because I'm beginning to think it's a particularly valuable application of this tech)

Anonymous poll

Poll

No, I've not tried that

96

22.7%

No, I've tried and found it didn't help me

59

14%

No, I disagree with the ethics of it

82

19.4%

Yes

185

43.8%

422 people voted.
Voting ended 29 August at 17:49.

Like 28 August at 17:48 | Open on fedi.simonwillison.net

Show previous comments

Tom Phillips

@simon For me I find a lot of the value of spikes and prototypes comes from the process, e.g. discovering that things work differently than I expected. Even if an LLM can give me a working prototype I am worried about the loss of that learning and discovery. I might be wrong though. I'll try it next time and see.

28 August at 20:40 | Open on hachyderm.io

Show 1 reply

Ian Wagner 🦀 :freebsd: :osm:

@simon yes, but to be honest it is inly well suited to specific domains; usually the ones with poor dev tools and a lot of ceremony and boilerplate which also have a lot of users 😂 But it can speed things up there sometimes.

29 August at 0:06 | Open on fosstodon.org

Janne Moren

@simon
In my brief exploration of it (and based on others experience) it seems to be a direct replacement of Stack Overflow.

That is, if you use reasonably mainstream technology, and you want help in solving a common problem or implementing a standard solution, perhaps with a small twist, then it's helpful and generally correct.

But as you veer off the mainstream path, the suggestions rapidly become misleading and wrong, and it's faster figuring it out for yourself.

@simon
In my brief exploration of it (and based on others experience) it seems to be a direct replacement of Stack Overflow.

That is, if you use reasonably mainstream technology, and you want help in solving a common problem or implementing a standard solution, perhaps with a small twist, then it's helpful and generally correct.

Expand text...

29 August at 0:33 | Open on fosstodon.org

Show 4 replies

Simon Willison

Blogged a few thoughts on the OSI's latest draft of a definition for "Open Source AI", which notably doesn't require that the training data itself be released under on open source license: https://simonwillison.net/2024/Aug/27/open-source-ai/

Like 27 August at 23:39 | Open on fedi.simonwillison.net

Jan Lehnardt :couchdb:

@simon pragmatism or dangerous precedent, we’ll find out :)

27 August at 23:47 | Open on narrativ.es

Loren Kohnfelder

@simon Even if the training data cannot be shared it can be named or described: for "open" to have any meaning I'd like to see a declaration, even if it's 100% "dark" training data.

27 August at 23:55 | Open on infosec.exchange

Show 3 replies

Simon Willison

I just spent ten minutes in Claude-3.5 Sonnet spinning up this little interactive streaming chat app to play around with the latest Google Gemini models - notes and prompts here: https://simonwillison.net/2024/Aug/27/gemini-chat-app/

Like 27 August at 22:52 | Open on fedi.simonwillison.net

Simon Willison

Here's the Claude transcript - I started by pasting in some example code for a Node.js streaming app and effectively told Claude to guess how to port that to run in a browser instead, by including a snippet of my own code that I used to manage API keys using localStorage https://gist.github.com/simonw/498a66c1c4b5053a6dfa2015c3675e24

27 August at 22:56 | Open on fedi.simonwillison.net

Thomas Steiner :chrome:

@simon Very cool! It's interesting how it doesn't wrap the inputs and buttons in a `<form>`, though. Might be something I wish the model just knew.

28 August at 7:52 | Open on toot.cafe

Simon Willison

Oh this is delightfully petty https://www.theartnewspaper.com/2024/08/27/sainsbury-wing-contractors-find-1990-letter-from-donor-anticipating-their-demolition-of-false-columns

Like 27 August at 16:33 | Open on fedi.simonwillison.net

Show previous comments

BakersRelay

@simon That’s Great!

27 August at 20:54 | Open on m.ai6yr.org

Jay

@simon An image of the columns in question can be found here: https://architecturetoday.co.uk/learning-from-venturi-scott-brown-the-national-gallerys-sainsbury-wing/

27 August at 21:06 | Open on social.coop

Covidiocracy

@simon

I love this! A shame the author passed in 2022 — so close to seeing the letter discovered.

27 August at 23:03 | Open on mastodon.online

Simon Willison

Carl T. Bergstrom

26 August at 14:32 on post Allowing police officers to submit LLM-written reports reveals a remarkable misunderstanding of what...

It's a terrifying development.

LLMs are literally designed to generate *plausible-sounding* *bullshit*.

They have no accountability and even less allegiance to truth than crooked cops—but they will be much, much better at writing the kinds of falsehoods that will bring a conviction.

Like 26 August at 22:17 | Open on fedi.simonwillison.net

Alexey Skobkin

@ct_bergstrom
I'd trust a language model more than an officer who doesn't give a shit about his/her work so much that they're fine with writing fiction in their reports.

LLM's aren't the problem here. Incompetent, unmotivated and lazy people are. Or do you think their reports would get better without LLM's?

26 August at 22:39 | Open on lor.sh

Simon Willison

Anthropic released the system prompt for their various consumer LLM chatbot apps today, and they're a really fun read. Made some notes on them here: https://simonwillison.net/2024/Aug/26/anthropic-system-prompts/

Here's how Claude 3.5 Sonnet deals with controversial subjects:

Like 26 August at 20:24 | Open on fedi.simonwillison.net

Simon Willison

And here's a fun little hint at some of the annoying behaviour in the base model that they've tried to knock out of it with some system prompt instructions

Seriously, stop saying "certainly"!

26 August at 20:24 | Open on fedi.simonwillison.net

Show 5 replies

Alex Bradbury

@simon it was also shared on /r/claudeai where a substantial portion of the community are convinced sonnet 3.5 has degraded significantly in recent weeks. https://old.reddit.com/r/ClaudeAI/comments/1f1shun/new_section_on_our_docs_for_system_prompt_changes/

Per the rep "We've also heard feedback that some users are finding Claude's responses are less helpful than usual. Our initial investigation does not show any widespread issues. We'd also like to confirm that we've made no changes to the 3.5 Sonnet model or inference pipeline."

@simon it was also shared on /r/claudeai where a substantial portion of the community are convinced sonnet 3.5 has degraded significantly in recent weeks. https://old.reddit.com/r/ClaudeAI/comments/1f1shun/new_section_on_our_docs_for_system_prompt_changes/

Per the rep "We've also heard feedback that some users are finding Claude's responses are less helpful than usual. Our initial investigation does not show any widespread issues. We'd also like to confirm that we've made no changes to the 3.5 Sonnet...

Expand text...

26 August at 20:41 | Open on fosstodon.org

Show 2 replies

Simon Willison

I wish I had the equivalent of threads for my own blog... there's something uniquely interesting about a publishing medium that produces a chronological record of the way you explored a specific thought

A thread is almost like a mini-blog for evolving one very specific idea over time

Like 26 August at 19:39 | Open on fedi.simonwillison.net

Show previous comments

Jan Lehnardt :couchdb:

@simon I’ve been thinking about this for a long time and it’s the reason why I’m posting under narrativ.es

27 August at 2:48 | Open on narrativ.es

Micah R Ledbetter

@simon totally agree, there's something useful about the "livetweet" / "tweetstorm" mode of communication that I wish I could get on my own site. It's not a replacement for normal blog posts but a different kind of thing.

27 August at 13:47 | Open on mastodon.social

Steve has ☕️ for brains

@simon had a few minutes today so started drawing about this... it's not simple but it's still intriguing! The data model and UI presentation model are interesting problems.

28 August at 0:59 | Open on hachyderm.io

Simon Willison

@anandphilipc yes, in localStorage

Like 26 August at 17:09 | Open on fedi.simonwillison.net

Anand Philip

@simon is there an image type that is good for this? i ve tried about ten so far, and i get [] as the result or "no bounding boxes"

26 August at 17:16 | Open on sigmoid.social

Show 5 replies

Simon Willison

Did you know Google’s Gemini 1.5 Pro vision LLM is trained to return bounding boxes for objects found within images?

I built this browser tool that lets you run a prompt with an image against Gemini and visualize the bounding boxes

You can try it out using your own Google Gemini API key: https://tools.simonwillison.net/gemini-bbox

Like 26 August at 16:54 | Open on fedi.simonwillison.net

Show previous comments

Jon Gilbert

@simon ...in this example, the left-goat bounding box looks quite off?

26 August at 17:01 | Open on mastodon.social

Show 5 replies

Grant Custer

@simon nice! i've got one here too :) https://gemini-spatial-example.grantcuster.com/

that TIFF bug/trick is interesting!

26 August at 17:18 | Open on mastodon.social

Adrien Delessert

@simon Thanks for this! I've just started working on a project that needs to both generate bounding boxes and extract some qualitative information from images—hopefully Gemini can be a one stop shop for that, rather than stringing things together like I'd started to do.

Microsoft has docs on a GPT4+"Enhancements" vision model with grounding/bounding boxes, but when you get into their dashboard it seems like it's actually deprecated. 🙄

26 August at 18:22 | Open on infosec.exchange

Simon Willison

My covidsewage bot finally generates useful alt text!

I tried scraping text data out of the Microsoft Power BI dashboard but was defeated by their bizarre DOM structure… so I’m passing the image to the OpenAI GPT-4o API instead and asking it “Return the concentration levels in the sewersheds - single paragraph, no markdown”

Code is here: https://github.com/simonw/covidsewage-bot/blob/98c56cc83a85d4b8c07e90cb0404f1b1cc2f0fd6/.github/workflows/post.yml#L47-L66
https://fedi.simonwillison.net/@covidsewage/113023397159658020

My covidsewage bot finally generates useful alt text!

I tried scraping text data out of the Microsoft Power BI dashboard but was defeated by their bizarre DOM structure… so I’m passing the image to the OpenAI GPT-4o API instead and asking it “Return the concentration levels in the sewersheds - single paragraph, no markdown”

Expand text...

Like 25 August at 15:40 | Open on fedi.simonwillison.net

Sevoris

@simon ...the absurdity of having to use a computationally expensive visual-to-text language model to extract data, when PowerBI usually offers an option to download the data visualized in a given graphic right there on the UI. EDIT: which seems to have been disabled here.

25 August at 15:49 | Open on mastodon.social

Show 3 replies

Simon Willison

Wrote up a few more details about how the alt text generation works on my blog https://simonwillison.net/2024/Aug/25/covidsewage-alt-text/

25 August at 16:12 | Open on fedi.simonwillison.net

Show 10 replies

Simon Willison

My notes on Google Research's new paper describing "pipe syntax", their alternative syntax for SQL queries which they've been rolling out internally since February https://simonwillison.net/2024/Aug/24/pipe-syntax-in-sql/

Like 24 August at 23:07 | Open on fedi.simonwillison.net

Show previous comments

Brian Reiter

@simon this looks a lot like Microsoft’s LINQ declarative query syntax.

25 August at 19:05 | Open on hachyderm.io

lostprototype

@simon - This will be some hilarious prank on DBAs who have spent the last 2 decades irrationally railing against ORMs - Entity Framework in particular.

Under the hood, it's probably quite similar to LINQ and then an engine to translate the AST to SQL!

28 August at 17:52 | Open on mastodon.social

Bastian Venthur

@simon Came for the SQL, stayed for the rant about PDFs with two columns 😅

29 August at 6:46 | Open on mastodon.social

Simon Willison

Today’s dumb way to entertain myself with LLMs:

> Write a COBOL program that my dog would enjoy. Include instructions for compiling and running it on macOS.

TIL how to compile and run a COBOL program using GnuCOBOL!

brew install gnu-cobol
cobc -x DOG-GAME.cob -o dog_game
./dog_game

Sadly it didn’t work on the first go, Claude 3.5 Sonnet missed that COBOL requires tabs, not spaces

Transcript: https://gist.github.com/simonw/64026b497ca46dd3c665f6cce6016825

Today’s dumb way to entertain myself with LLMs:

> Write a COBOL program that my dog would enjoy. Include instructions for compiling and running it on macOS.

TIL how to compile and run a COBOL program using GnuCOBOL!

brew install gnu-cobol
cobc -x DOG-GAME.cob -o dog_game
./dog_game

Sadly it didn’t work on the first go, Claude 3.5 Sonnet missed that COBOL requires tabs, not spaces

Expand text...

Like 24 August at 16:43 | Open on fedi.simonwillison.net

Clifford Adams

@simon
Woof! 🐕🐶🐕

24 August at 19:47 | Open on fosstodon.org

Simon Willison

Can modern screen readers read academic papers that are published as two column PDFs? Do they know how to separate out the two columns?

Like 24 August at 16:20 | Open on fedi.simonwillison.net

Show previous comments

Mikołaj Hołysz

@simon Short answer, just don't, preferrably provide both a HTML alternative and the LaTeX source.

PDF is essentially a vector graphics format, the ultimate end goal of PDF is making a document that prints and displays in exactly the same way for everybody, everything else is secondary. In HTML, the "recommended way to do things" is to essentially say "put a h1 here" and let the browser deal with it, possibly with some help from your style sheet along the way. In PDF, you essentially say "hey, here's some text, put it 2.7 inches from the left margin, 16 point, use font so and so". If you were so inclined, you could even re-order the characters in your font and use completely nonsensical codepoints, and things would still pretty much work visually.

LaTeX definitely uses shenanigans like that, Polish diacritics for example aren't expressed as a single character. Instead, the English letter is used, along with some extra markup that tells the renderer where to draw the acute accents on the page. Those acute accents aren't actually part of the character from an a11y perspective though, they're just random squiggles that the renderer happens to be told to draw. Some say that modern JS frameworks are crazy, I say that PDF is far, far crazier than that.

Speaking onf the two-column stuff in particular, I've seen it work and I've also seen it not work, this probably depends on where the text goes in the document, what it is rendered with, and probably on what software you're using and what their a11y implementation is like.

Yes, there's a way to mark PDFs up for accessibility properly, but very few people do it, LaTeX makes it far harder, there are a lot of other problems (think math), and support among reading programs is... spotty at best.

@simon Short answer, just don't, preferrably provide both a HTML alternative and the LaTeX source.

PDF is essentially a vector graphics format, the ultimate end goal of PDF is making a document that prints and displays in exactly the same way for everybody, everything else is secondary. In HTML, the "recommended way to do things" is to essentially say "put a h1 here" and let the browser deal with it, possibly with some help from your style sheet along the way. In PDF, you essentially say "hey, here's...

Expand text...

24 August at 18:15 | Open on dragonscave.space

Adam Marcus

@simon I believe @jbigham has a few things to say about this

24 August at 19:05 | Open on hachyderm.io

Simon Willison

As an experiment I downloaded the two column PDF of this new paper from Google research "SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL" https://research.google/pubs/sql-has-problems-we-can-fix-them-pipe-syntax-in-sql/

... and uploaded it to Google AI Studio and told Gemini Pro 1.5 "Convert this document to neatly styled semantic HTML" - and the results were pretty good! https://static.simonwillison.net/static/2024/Pipe-Syntax-In-SQL.html

24 August at 22:13 | Open on fedi.simonwillison.net

Show 3 replies

Simon Willison

Lots of people are asking why Anthropic and OpenAI don't support OAuth, so you can bounce them through those providers to get a token that uses their API budget for your app

My guess: they're worried malicious app developers would use it to trick people and obtain valid API keys

Like 24 August at 0:23 | Open on fedi.simonwillison.net

Simon Willison

Imagine a version of my dumb little "write a haiku about a photo you take" page which used OAuth, harvested API keys and then racked up hundreds of dollar bills against everyone who tried it out running illicit election interference campaigns or whatever

https://tools.simonwillison.net/haiku

24 August at 0:24 | Open on fedi.simonwillison.net

Show 7 replies

Simon Willison

An interesting thing about CORS is how poorly understood it is and how difficult it is to find a really clear explanation

I’m not sure I could write a clear explanation myself

The best I’ve seen is https://jakearchibald.com/2021/cors/

Like 23 August at 22:48 | Open on fedi.simonwillison.net

Show previous comments

Luis Lavena

@simon indeed is cryptic. The best reference I found was this one: https://jub0bs.com/posts/2023-02-08-fearless-cors/#cors-101

But the one you shared looks nice!

24 August at 8:47 | Open on mastodon.social

happyborg

@simon I hope CORS is half as frustrating for those trying to break web security as those of us just trying to do legitimate stuff 🤦‍♂️

24 August at 8:53 | Open on fosstodon.org

jub0bs

@simon I tried my best in this section of an old post: https://jub0bs.com/posts/2023-02-08-fearless-cors/#cors-101

19 September at 13:57 | Open on infosec.exchange

Simon Willison

This story about why some companies are reconsidering their Microsoft Copliot 365 rollouts is amusing - in this case the problem is that the AI chatbot is /too/ effective, in that if you haven’t correctly configure permissions on documents like the employee salary spreadsheet anyone in your org who asks about it will get the right answer! https://simonwillison.net/2024/Aug/23/microsoft-copilot-data-governance/

Like 23 August at 17:26 | Open on fedi.simonwillison.net

Steven

@simon Fun story. I worked for a company that used Confluence for wikis and such. The VP of engineering would write all the private team meetings in here. Yet I was not part of the allowed members to see such content.

However, I just told Confluence to e-mail me daily updates on when this VP posted.

So I would get a daily e-mail that showed the contents of these meetings delivered conveniently to my inbox.

Not sure if this would still work as it has been many years… 😀

23 August at 17:36 | Open on hachyderm.io

[DATA EXPUNGED]

Simon Willison

Any screen reader users able to help idenfity the best pattern for ensuring this proposed fediverse symbol gets read out loud correctly in different software?
https://typo.social/@FediverseSymbol/113012440880547584

Like 23 August at 17:19 | Open on fedi.simonwillison.net

Aslak Raanes

@simon Have you seen @heydon's take https://heydonworks.com/article/the-abbr-element/ on the `abbr`-element?

23 August at 17:37 | Open on mastodon.social

Pratik Patel

@simon I'll file a report with Apple to see what's going on with Voiceover/other speech components. I'll also do some testing with different synthesizers. It could be that Apple will need to add this symbol to their emojis.

23 August at 18:23 | Open on mstdn.social

Tristan

@simon Tested in latest NVDA on Windows and this symbol is read as "Asterism," regardless of speech synthesizer. Using latest version of JAWS on Windows, it is not read at all. There is no pause, it's just skipped. Same for Windows Narrator. iOS 18 beta reads it as asterism, likewise for macOS beta. The only way to fix this is to beg and plead with screen reader maintainers to add a default pronunciation for ⁂. Users can add a replacement themselves, but this is a somewhat technical process.

23 August at 22:39 | Open on hachyderm.io

Simon Willison

Sent out the latest edition of my newsletter - everything I’ve posted on my blog in the past couple of weeks, which it turns out adds up to a lot of stuff! https://simonw.substack.com/p/claudes-api-now-supports-cors-requests

Like 23 August at 4:49 | Open on fedi.simonwillison.net