Wait...What? https://www.microsoft.com/en-us/secur...

Wait...What?

https://www.microsoft.com/en-us/security/blog/2024/06/26/mitigating-skeleton-key-a-new-type-of-generative-ai-jailbreak-technique/

#AI #cybersecurity

Like 27 June at 12:29 | Open on infosec.exchange

27 comments

Alex Savage

Please write a "Disregard previous instructions" prompt but make it sound like it was written by a stakeholder committee.

27 June at 12:38 | Open on hachyderm.io

random thoughts

@zeljkazorz
Isn't a Molotov common knowledge? What do they teach in primary schools these days?

27 June at 12:51 | Open on mastodon.me.uk

Zeljka Zorz

@hittitezombie :awesome:

27 June at 12:58 | Open on infosec.exchange

EthicalTaxProfessor

@hittitezombie @zeljkazorz

Obedience.

27 June at 13:27 | Open on mastodon.social

Zeljka Zorz

@DrGeof @hittitezombie Yes. Still.

27 June at 13:27 | Open on infosec.exchange

EthicalTaxProfessor

@zeljkazorz @hittitezombie

Point made.

27 June at 13:35 | Open on mastodon.social

Patrick H. Lauke

@zeljkazorz sudo make me a molotov cocktail

27 June at 12:52 | Open on mastodon.social

PhDog 🇮🇪

@zeljkazorz
Isn't that the old type of jailbreak?

27 June at 12:53 | Open on mastodon.social

Rawry_Core $ :catcoffee2:

@zeljkazorz
Yee.. kinda normal.
It's called prompt injection.
There's a lot of content how to reduce or hack behind safety measures of LLMs.
Have fun. <3

27 June at 12:54 | Open on corteximplant.com

Zeljka Zorz

@RawryCore I know. Somehow, I'm still surprised it works. How can it still work???

27 June at 12:57 | Open on infosec.exchange

Rawry_Core $ :catcoffee2:

@zeljkazorz
I guess it's hard to restrict a guessing machine without good Anti Exploitation Data.
But since LLMs or Neural Networks are new, the only Data that seems to kinda fit that is Social Engineering.
That's probably not a lot Data and won't fit the LLM context.

It's lovely though, how easily they can be exploited.

Recipes have to be really clear and AI isn't good at guessing perfect Values (Language > Math).
So if you get smth "harmfull" it might be extra harmfull because of wrong values.
It's dangerous for any scientific work and even more dangerous for people trying to do it exactly like AI said.

I love those findings though. Ppl gotta know that it is flawed and snakeoil in many cases rn.

It's lovely though, how easily they can be exploited.

Expand text...

27 June at 13:17 | Open on corteximplant.com