Sophie Schmieg

In case you do not know how GenAI works, here is a very abridged description:
First you train your model on some inputs. This uses some very fancy linear algebra, but can mostly be seen as a regression of sorts, i.e. a lower-dimensional approximation of the input data.
Once training is completed, you have your model predict the next token of your output. It does so by producing a list of candidate tokens, each with a score for how good a fit the model considers that token to be. You then randomly select from that list, with a bias toward higher-scored tokens. How strong that bias is depends on the "temperature" parameter, with a higher temperature corresponding to a less biased, i.e. more random, selection.

Now obviously, this process consumes a lot of randomness, and the randomness does not need to be cryptographically secure, so you usually use a statistical random number generator like the Mersenne Twister at this step.
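For concreteness, the sampling step described above can be sketched in a few lines of Python (the function name and logit values are illustrative, not from any real model); conveniently, CPython's `random` module is itself a Mersenne Twister:

```python
import math
import random

# CPython's Mersenne Twister: statistical quality only,
# NOT cryptographically secure.
rng = random.Random(0)

def sample_token(logits, temperature=1.0):
    """Pick the next token index from a list of raw model scores."""
    # Softmax with temperature: a high temperature flattens the
    # distribution (more random); a low one sharpens it toward the
    # top-scored token (less random).
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

# Three candidate tokens; token 0 is the model's favorite.
next_token = sample_token([2.0, 1.0, 0.1], temperature=0.7)
```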

So when they write "using a Gen AI model to produce 'true' random numbers", what they're actually doing is taking a cryptographically insecure random number generator and applying a bias to the numbers it generates, making it even less secure. It's amazing that someone can trick anyone into investing in that shit.

20 comments
SnoopJ

@sophieschmieg "what if we did statistics, but poorly?"

ask

@sophieschmieg there's also the noise introduced by the GPU scheduler doing the matrix multiplies in a different order, which produces different results because floating-point addition is not associative.

Surely they meant that... Right?...

But also probably that isn't true random either.

Sophie Schmieg

@ask that noise would be considered true random noise, but I don't know how many bits it has. While float isn't associative, it's like "mostly" associative, so depending on the condition of the matrix, it should be fairly low.

In any case, if you wanted to use that noise for cryptographic purposes, you'd first have to debias it by running it through a DRBG, and at that point you could just harvest it directly from the GPU for higher quality and performance.

Or query your stupid hardware RNG that literally every modern CPU has built-in.
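A rough sketch of that contrast (the conditioning function is a toy stand-in, not a NIST-approved DRBG): raw physical noise must be debiased before cryptographic use, whereas the OS CSPRNG is already seeded from hardware entropy for you.

```python
import hashlib
import os

def condition(raw_noise: bytes, out_len: int = 32) -> bytes:
    # Toy hash-based conditioner standing in for a real DRBG seeding
    # step: it squeezes biased input bits into uniform-looking output,
    # but the output can never hold more entropy than the input had.
    return hashlib.sha256(raw_noise).digest()[:out_len]

# In practice, skip all of that: the OS CSPRNG is already seeded from
# hardware entropy sources (e.g. the RNG built into modern CPUs).
key = os.urandom(32)
```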

Joris Meys

@sophieschmieg oh fck. I thought they were joking, but they're actually serious??

entronid

@sophieschmieg also quantum computers can't fucking break rngs*

*before the end of the universe

Yuri Arabadji

He clearly stated "if interested".

Which means if you're not interested there's no world-first quantum-proof random generator.

Quite obvious.

Cassander

@sophieschmieg If LLMs are snake oil, this "AI RNG" is meta-snake oil. It's like expecting a homeopathy distillation of horse dewormer will cure Covid.

It's so obviously fake that I can't even find a good metaphor to explain how bad it is.

niconiconi

@sophieschmieg@infosec.exchange Ironically, most GenAI implementations have trouble producing deterministic output due to floating point errors, inconsistent batching, etc. Not random enough for crypto, but random enough to create replication problems. It's what I call Murphy's Duality Law: in engineering, when a system can show both the property "A" and its negation "not A" depending on the specific context, it's always the opposite of what your application needs.

Rich Felker

@sophieschmieg LMAO what??? There are ppl trying to use LLM output as RNG??? And thinking "I'm too stupid to understand how it works so that means it's secure!!!111" ??? 🤦

🤏 🎻 when they get pwned. I'm out of patience for the LLM fan 🤡 🚗

Rich Felker

@sophieschmieg BTW not criticizing your choice of MT as an illustration, because it's exactly the sort of thing these bozos would know by name, but it's utterly the worst choice of deterministic PRNG: gratuitously large state, poor output quality. Even a 128-bit or possibly even 64-bit LCG throwing away the lower bits is better.
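For comparison, a truncated 64-bit LCG along the lines Rich describes fits in a couple of lines (the constants are Knuth's MMIX parameters, used here purely as an illustration):

```python
MASK64 = (1 << 64) - 1

class TruncatedLCG:
    """64-bit linear congruential generator emitting only the high 32 bits."""

    def __init__(self, seed: int):
        self.state = seed & MASK64

    def next32(self) -> int:
        # Multiplier and increment from Knuth's MMIX LCG.
        self.state = (self.state * 6364136223846793005
                      + 1442695040888963407) & MASK64
        # The low-order bits of a power-of-two-modulus LCG have short
        # periods, so discard them and keep only the high 32 bits.
        return self.state >> 32
```

Tiny state and a cheap update step, with the truncation discarding the statistically weakest bits — which is the contrast being drawn with the Mersenne Twister's multi-kilobyte state.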

Beggar Midas

@sophieschmieg Claude has a response for ya. "You're oversimplifying. While language models do use probabilistic token selection, reducing them to 'fancy RNGs' is like calling a brain 'just electrical signals.' The learned probability distributions capture complex semantic relationships and patterns from human knowledge. That said, your skepticism about AI hype is fair - there are plenty of overinflated claims worth challenging."
Not bad for a bucket of bolts 'rando number generator', eh?

Olivier

@sophieschmieg "Let's generate low quality random numbers about as fast as a grandma knitting socks using terra-watts of power in billion dollar data centers." - said no one ev... Oh wait.

Shannon Persists🌈

@sophieschmieg It doesn't look random at all. It looks like a crude airplane.

Elias Mårtenson

@sophieschmieg When you said the term, I just assumed they meant that they use an LLM code generator to create a program that generates "cryptographically secure random numbers". I'm sure your standard LLM can give you something that resembles this.

It'll take about 5 minutes, and then you can spend the investment capital on more interesting things (like private jets).

John Ripley

@sophieschmieg All of GenAI is the thought experiment "What if we did a shitty version that doesn't work and needs as much power as a city" except some bro did it for real, so this seems like a natural application.
