OK, all done (I just went through and added alt text...

Simon's posts Post Back to profile

Top-level

Simon Willison

OK, all done (I just went through and added alt text to the images with the help of Claude)

Like 20 December at 18:39 | Open on fedi.simonwillison.net

8 comments

Simon Willison

By far the best coverage of o3 is this essay by François Chollet, it's crammed with interesting insights beyond just reporting on the benchmark score: https://arcprize.org/blog/oai-o3-pub-breakthrough

Published my own notes on that here: https://simonwillison.net/2024/Dec/20/openai-o3-breakthrough/

20 December at 22:33 | Open on fedi.simonwillison.net

Joe Pasqua

@simon agreed. Very good piece.

21 December at 1:21 | Open on mastodon.social

Xing Shi Cai

Sensitive content

@simon It feels way to expensive to run these models. But if the price drops to a level to, say chatgpt pro level ($200), I can many researchers will give it a try.

21 December at 1:44 | Open on mathstodon.xyz

Brian "bex" Exelbierd

@simon what is your prompt for this. I have had mixed results. And the API constantly asks if I want to keep doing the remaining images.

21 December at 15:28 | Open on toot.io

Simon Willison

@bexelbie I have a Claude Project set up with these custom instructions

You write alt text for any image pasted in by the user. Alt text is always presented in a fenced code block to make it easy to copy and paste out. It is always presented on a single line so it can be used easily in Markdown images. All text on the image (for screenshots etc) must be exactly included. A short note describing the nature of the image itself should go first.

21 December at 17:21 | Open on fedi.simonwillison.net

Brian "bex" Exelbierd

@simon Thank you. Have you got something similar for reformatting transcripts and other longer texts that prevents “Would you like me to continue”?

21 December at 18:51 | Open on toot.io

Simon Willison

@bexelbie length limits are still really frustrating, o1 and o1-mini might do better on that but generally I think it may need a custom harness that knows how to run "keep going" prompts automatically a few times when needed

21 December at 19:12 | Open on fedi.simonwillison.net

Brian "bex" Exelbierd

@simon this is where I had gotten too as well. It seems to be very limiting for using LLMs as part of automated processes. Especially if it’s hard to detect final states and the LLMs are apparently bad at managing to length on tasks of variable length.

22 December at 8:36 | Open on toot.io

Go Up