I keep finding new ways to entertain myself with ChatGPT advanced voice mode...
"I need you to pretend to be a California brown pelican with a very thick Russian accent, but you talk to me exclusively in Spanish"
I spun up a new LLM benchmark: how well can they handle this prompt?

"Generate an SVG of a pelican riding a bicycle"

I find the results so far utterly delightful: https://simonwillison.net/2024/Oct/25/pelicans-on-a-bicycle/
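For anyone curious to reproduce this, a minimal sketch of running the benchmark prompt against one model might look like the following; it assumes the OpenAI Python SDK with an OPENAI_API_KEY in the environment, and the model name is just an illustrative choice:

```python
# Minimal sketch: run the pelican benchmark prompt against a single model.
# Assumes the OpenAI Python SDK and OPENAI_API_KEY; the model name is just
# an example - any chat-capable model can be swapped in.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": "Generate an SVG of a pelican riding a bicycle",
    }],
)

# Models often wrap the SVG in a Markdown code fence, so strip that first
text = response.choices[0].message.content
if "```" in text:
    text = text.split("```")[1].removeprefix("svg").strip()
with open("pelican.svg", "w") as f:
    f.write(text)
```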
@simon saw this and thought of you https://bsky.app/profile/socalleslie.bsky.social/post/3l7ewtd4koe2x

@simon How hard is it to process untrusted SVG data to strip out any potentially harmful tags or attributes (like stuff that might execute JavaScript)? I feel like this is well-trodden ground for HTML these days; are there robust solutions for the SVG version of this problem?
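One common answer is an allowlist: parse the SVG and keep only tags and attributes known to be safe, dropping <script>, <foreignObject>, event-handler attributes and href values outright. Here is a rough Python sketch of the idea; the allowlists are illustrative and nowhere near complete, so a maintained sanitizer library is the safer bet for real input:

```python
# Rough sketch of allowlist-based SVG sanitization. The allowlists are
# illustrative and incomplete; use a maintained sanitizer for real input.
import xml.etree.ElementTree as ET

ALLOWED_TAGS = {"svg", "g", "path", "rect", "circle", "ellipse", "line",
                "polyline", "polygon", "text", "title", "desc"}
ALLOWED_ATTRS = {"d", "x", "y", "x1", "y1", "x2", "y2", "cx", "cy", "r",
                 "rx", "ry", "width", "height", "viewBox", "points",
                 "fill", "stroke", "stroke-width", "transform"}

def local(name: str) -> str:
    """Strip an XML namespace prefix like {http://www.w3.org/2000/svg}."""
    return name.rsplit("}", 1)[-1]

def sanitize(svg_source: str) -> str:
    root = ET.fromstring(svg_source)

    def clean(el: ET.Element) -> None:
        # Drop event handlers (onclick=...), href/xlink:href (javascript:
        # URLs), style, and anything else not explicitly allowed.
        for attr in list(el.attrib):
            if local(attr) not in ALLOWED_ATTRS:
                del el.attrib[attr]
        # Remove disallowed children outright: <script>, <foreignObject>...
        for child in list(el):
            if local(child.tag) not in ALLOWED_TAGS:
                el.remove(child)
            else:
                clean(child)

    clean(root)
    return ET.tostring(root, encoding="unicode")

# Example: both the onload handler and the <script> element are stripped
print(sanitize('<svg onload="alert(1)"><script>x()</script><circle r="4"/></svg>'))
```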
@simon Based on the experience of people who tried to create a Python sandbox over the decades, I'd say it is pretty much impossible (save for having the browser separate it into another page box, i.e. an iframe).

@simon CSP is probably a good second layer of security, no matter what you end up doing.

@simon I wonder if you could take a similar approach to the one Figma used for their third-party plugins: make it all happen in a sandboxed WASM script that renders to a canvas?

Constantly persisting everything in a web form to localStorage in case of browser or tab accidents is such a cheap and effective trick! I keep meaning to knock up a little bit of JavaScript for the Django Admin that does this automatically for every add form

@simon We recently added that sort of feature to MediaWiki: https://mediawiki.org/wiki/Help:Edit_Recovery (IndexedDB rather than localStorage though, because it can store more).

@simon Don't browsers do some level of form persistence if you use plain HTML forms?

Anthropic's https://claude.ai/ grew a new feature today: an equivalent of OpenAI's ChatGPT Code Interpreter mode, where the chatbot can write and then execute code in order to help answer questions (e.g. to run calculations that are beyond a next-token-predicting LLM).

OpenAI use server-side Python for this, but Anthropic instead chose to use client-side JavaScript running in a Web Worker. Here are my notes so far on the new feature: https://simonwillison.net/2024/Oct/24/claude-analysis-tool/
@simon interesting choice.

@simon would be cool to see Llama 3.2 1B or similar doing it right inside the browser 😄

@simon Interesting. Do you know (or does the code show) why the vis nodes for the dependencies are of different sizes?

I finally have a procedure in place I like for hacking on Python CLI apps using a development environment managed by uv - full notes here:
@simon Cool post! I am not sure the `tool.uv.dev-dependencies` is the same as, or interoperable with, PEP 735 though.

@simon Extra tip: you can do `uv tool install -e .`, and now you can run demo-app directly, without the `uv run` prefix, in any directory.

@simon Then, for scripts you run frequently, do you just add `alias demo="uv run python -m demo_app"` to your .bashrc? My few Python CLI apps each have a .venv directly, but it's tiresome to have to juggle activations: `source ~/bin/demo/.venv/bin/activate && demo && deactivate` 🫠

I know this should be obvious, but it still surprises me how much more fun my blog feels now that I've started habitually using images in my posts. Here are two screenshots and a GIF demo from today's posts: https://simonwillison.net/2024/Oct/23/

@simon it does make the posts more engaging. Our little rat brains like stimulation

I built a Bash script for running prompts with images or PDFs against the Google Gemini models - a prototype of how multi-modal support for my LLM CLI tool is going to work. It's so much fun to play with, especially since Gemini somehow costs less than 1/10th of a cent per image

Came up with a creative way to post quotes from video content on my blog - since my quotes support images I can run MacWhisper to extract a text transcript of part of the video, then drop a screenshot in the middle to illustrate the quote

@simon Do you understand the difference between third party and direct sales in that quote? Third party sounds like regular API key usage; I'm unclear on what counts as direct sales

The Internet Archive being down helps expose quite how much I rely on their Wayback Machine - I've headed over there at least four times in the past week and been disappointed at not being able to use it.

Today people are saying that it's interesting that Anthropic's Claude 3.5 Opus model is no longer mentioned on their models page - but without the archive I can't see for myself if it used to be listed there or not https://docs.anthropic.com/en/docs/about-claude/models

Anthropic released a fascinating new capability today called "Computer Use" - a mode of their Claude 3.5 Sonnet model where it can do things like accept screenshots of a remotely operated computer and send back commands to click on specific coordinates, enter text etc.

My notes on what I've figured out so far: https://simonwillison.net/2024/Oct/22/computer-use/
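To make the shape of that loop concrete, here is a hedged sketch of a single iteration using the Anthropic Python SDK; the model name, tool type and beta flag below match the values Anthropic published at launch, but treat the details as assumptions and check their docs:

```python
# One iteration of the Computer Use loop (Anthropic Python SDK).
# Tool type / beta flag are the launch values; verify against current docs.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    betas=["computer-use-2024-10-22"],
    tools=[{
        "type": "computer_20241022",
        "name": "computer",
        "display_width_px": 1280,
        "display_height_px": 800,
    }],
    messages=[{"role": "user", "content": "Open the system settings"}],
)

# The model responds with tool_use blocks such as {"action": "screenshot"}
# or {"action": "left_click", "coordinate": [640, 400]}. Your harness
# executes each action, then sends the result (e.g. a screenshot) back as
# a tool_result block and repeats until the model stops asking.
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)
```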
@simon this has huge accessibility implications, hopefully I'll have some time to test this out from a screen reader perspective over the coming weekend.

This is an absurdly clever hack. I broke it down to figure out how it works here: https://simonwillison.net/2024/Oct/21/sudoku-in-python-packaging/

@simon that is amazing! reminds me of when @fgnass golfed a sudoku solver into a tweet... he didn't have enough bytes to return the result so he stopped the recursive iteration by throwing: https://youtu.be/JsAetmgJRss?t=1467

You can solve sudokus in Python packaging. Not Python code, Python packages:

@konstin This is not a new idea

I exported my Claude data to poke around with it and found out I've become a HEAVY user of Claude Artifacts - the feature that lets Claude build a full interactive HTML+JavaScript tool for you based on your prompts. I built 14 (somewhat) useful things with it in just the past week! Here's a post describing them all: https://simonwillison.net/2024/Oct/21/claude-artifacts/

@simon I haven't played as much with Canvas but like I told you at DCUS, Claude Projects can't be unseen once you use it. I saw your app last night, but I'm going to carve out some time to try it out. I have 1000s of chats and 100s of projects driven from it. I know it's not a true sync but I suspect I'll get a ton of good out of it.

A year ago Mastodon 4.2 expanded the HTML filter to allow a whole bunch of additional tags... anyone know of any good examples of accounts that are using that bounty of additional formatting options? https://docs.joinmastodon.org/spec/activitypub/#sanitization
@simon Running 4.2 here, but seeing raw tags in the replies 😪 <i>someday this might be in italics</i>

@simon tantek uses it (via bridgy) to do in-post footnotes https://fed.brid.gy/r/https://tantek.com/2024/285/t1/io-domain-suggested-steps

I really like Drew's framework here dividing current AI use-cases into Gods (human replacement, which I think of as still mostly science fiction), Interns (assistants you delegate closely-reviewed tasks to, which is most of how I use LLMs today) and Cogs (smaller tools that can more reliably serve a single purpose, like Whisper for transcription) - more of my own notes on this here: https://simonwillison.net/2024/Oct/20/gods-interns-and-cogs/

@simon For the "gods" category, also check out @forrestbrazeal's excellent song "AGI (Artificial God Incarnate)": https://www.youtube.com/watch?v=1ZhhO7MGknQ

@simon I have been struggling with terminology, so this is useful. That said, I'm not a fan of "interns" used like this. The context you used it in felt more appropriate than a whole class of AI terminology that literally means to replace a useful class of workers and their learning. I have personally struggled with the term Agents, for lack of a framework or way to use them outside of running a Python script.

The cold open from Abbott Elementary Season 4 Episode 2 (Ringworm) - the one with the PTA meeting - is already iconic

@simon the result on Firefox mobile is a slightly distressing amount of horizontal scroll for some reason, which rather spoils the effect!

I finally managed to get the Llama 3.2 and Phi 3.5 vision models to run on my M2 Mac laptop, using the mistral.rs Rust library and its CLI tool and Python bindings: https://simonwillison.net/2024/Oct/19/mistralrs/

Here's what I got from Llama 3.2 11B for this photo I took at the Pioneer Memorial Museum in Salt Lake City https://www.niche-museums.com/111 "describe this image including any text"

@simon thanks for this! I had some issues replicating it, though: on an M3 Max it always crashes. (It's also annoying that it crashes or errors if it can't find an image; there is a PR to fix that, but it's not merged yet.) And even on the M3 Max, since the in-situ quantization is done on one core, it takes a while... have you experienced one or all of these?
@simon I think it could be a game changer for learning a language.
@simon ChatGPT response: "You have very peculiar needs." 🤪
Here's a short audio clip https://static.simonwillison.net/static/2024/russian-pelican-in-spanish.m4a