@drahardja pretty phony hero. I just gave a look to the paper, this attack works only if the prompt is very short, and the most relevant word subject was poisoned. It won’t work with long prompts. The very moment you call “a dog barking”, the trick is filtered away.