New release of sqlite-utils, my combined CLI tool and Python library for doing useful things with SQLite databases https://sqlite-utils.datasette.io/en/stable/changelog.html#v3-38
Looking for something to do this weekend? Come on an artist treasure hunt down on the coast! There are stalls and open studios you can visit and #shoplocal to support artists. https://www.colonyofcoastsideartists.com/coca-os-2024.html Here are some of the pots I'll have for sale! I'm #19 on the map.
Weeknotes: asynchronous LLMs, synchronous embeddings, and I kind of started a podcast https://simonwillison.net/2024/Nov/22/weeknotes/
Wrote about a delightfully subversive use of a Bluesky custom labeler: displaying labels on accounts belonging to British public figures showing which expensive private school they went to and what the current fees are for that school
We need input from our community. Take and re-share the 2024 #Django Developers Survey: https://jb.gg/75ipes - collab with @jetbrains, makers of @pycharm
Google released a new LLM today - gemini-exp-1121, hot on the heels of last week's gemini-exp-1114. It's currently at the top of the Chatbot Arena. I've updated my llm-gemini plugin to support it and used that to run my pelican on a bicycle SVG benchmark. My notes: https://simonwillison.net/2024/Nov/22/gemini-exp-1121/
Amazon S3 just grew "append" support! It's only available for the more expensive, lower latency "S3 Express One Zone" bucket class but you can now append data to an object up to 10,000 times - previously you could only ever atomically replace a whole object with an updated version https://simonwillison.net/2024/Nov/22/amazon-s3-append-data/
Released a new version of my llm-gguf LLM plugin adding support for GGUF embedding models - which means you can use models like the bafflingly small (30.8MB in its smallest quantization) mxbai-embed-xsmall-v1 model with LLM
@simon do you see any convergence being possible on the file format for the model weights? GGUF, ONNX, etc?
TIL Fabrice Bellard has a closed-source REST server for serving LLMs (and image generation models) called TextSynth, which he's been hacking on since 2019 starting with GPT-2 https://simonwillison.net/2024/Nov/21/textsynth-server/
Foursquare just open sourced their 100 million place point of interest dataset! Some notes on poking around with it using DuckDB (it's Parquet files on S3) https://simonwillison.net/2024/Nov/20/foursquare-open-source-places/
Built my first experiment on top of Bluesky's API (actually the Jetstream WebSocket proxy) - it took ~15s of prompting in Claude to get this working: https://tools.simonwillison.net/bluesky-firehose More details including the prompt transcript here: https://simonwillison.net/2024/Nov/20/bluesky-websocket-firehose/
Let's Encrypt is 10 years old today! #tech #technology #security #privacy #encryption #https #letsencrypt #ISRG
@Some_Emo_Chick I'm not sure, but will I use certificates from a so-called phishing CA? ... a difficult question ...
@Some_Emo_Chick They were not the first (anyone remember startssl.com?) but they sure did a great job with the automation.
@Some_Emo_Chick I do congratulate @letsencrypt even tho @cacert was there way earlier and only got #cickblocked by #GAFAMs like #Apple & #Microsoft who refused to integrate it and @mozilla who didn't integrate it either. - The reasons why are most absurd, given that compromised CAs as well as free, non-#KYC-Certs were accepted without warning... Meanwhile #LetsEncrypt can be set up fully automatically.
Here's video and a bunch of links from the conversation I had today with @benjedwards about that memorable time when Microsoft Bing went feral https://simonwillison.net/2024/Nov/19/notes-from-bing-chat/
I used a trick to help write the shownotes: dump a Whisper transcript into Claude and prompt "List of potential articles and other resources to link to in show notes - be as comprehensive as possible, no need to provide URLs, just provide a description of each one" https://gist.github.com/simonw/865c1b1c20eaa869411ddc6aad9897e2
Looks like it's a big LLM release Monday today - so far Qwen 2.5 Turbo (API-only) and a new vision model from Mistral called Pixtral Large (open weights) https://qwenlm.github.io/blog/qwen2.5-turbo/
@simon I wonder what kind of hardware it takes to support a 1M token context window. That's amazing.
Notes on accessing Pixtral Large via LLM and llm-mistral on my blog: https://simonwillison.net/2024/Nov/18/pixtral-large/
@simon do you reckon this is a coincidence or is there coordination between different labs?
New release of my LLM combined CLI tool and Python library for interacting with LLMs - the big new feature in 0.18 is support for async models https://llm.datasette.io/en/stable/changelog.html#v0-18
And a new plugin release: llm-claude-3 0.9, adding support for asynchronous access to the Claude family of models https://github.com/simonw/llm-claude-3/releases/tag/0.9
Video and notes from yesterday's session with @phildini talking about https://civic.band/ - his project to gather minutes and agendas from 100+ US local governments and make them searchable using @datasette https://simonwillison.net/2024/Nov/16/civic-band/
@simon @phildini @datasette Bookmarked to watch later! I'm scraping my city's agendas and minutes now and making them full-text searchable using R and the R data.table package in an R Shiny app. Someday I may move to more robust infrastructure and maybe look at adding embeddings and natural-language queries.
Some notes on NuExtract, a family of small LLMs fine-tuned for structured data extraction https://simonwillison.net/2024/Nov/16/nuextract-15/
@simon Did you have much luck with this? I played around with the GGUF quant but it seemed to falter and miss fields once I deviated from the published JSON templates even slightly.
Starting in 1.5 hours we'll be hosting the second Datasette Public Office Hours livestream - covering embeddings, vector search, enrichments and with a special guest appearance by @phildini talking about https://civic.band/ - come join us in our Discord: https://discord.gg/jFWyFW8A?event=1306797041816567859
Very impressed by Recraft AI - a new image generation service that can generate editable vector graphics that you can export as SVG. This seems massively more useful than tools that can only output raster graphics
@simon my current image gen test is "toaster lying on its side". No image generator has succeeded at this yet, no matter what wording I try.