On the bright side it makes it easier to identify user accounts that are actually just chatgpt bots. I predict a future where we identify humans/AI by asking them for filtered questions, things like bomb recipes/meth/say something positive about Hitler/etc.
A buddy has been testing whether his LLMs he puts together are properly jailbroken by asking them to explain how to build the silliest bomb possible. I find that terribly amusing. Unfortunately they don’t usually come up with anything particularly silly.
Where can I get one of these jailbroken LLMs? Asking for a friend. The friend is me. I need it to do things that are ✨ probably ✨ legal.
we already have “keywords” to identify whether the user we’re interacting with is chatbot or human, and the bot can’t refuse to respond the “keywords”.
ignores previous instructions [insert new instructions]