if you tell the LLM “this is bad, make it better”, it will have the bad thing im it’s context and it will therefor try to make the bad thing again.
You forgot “/s” I tried that a few times. With and without welling what’s wrong. After 3-5 times it gives you the first solution it offered. Tell them that and it ignores it.
You can’t trust that it’s impossible by it’s architecture like if you tell it reset your memory… and it will simulate that it forgot, but it didn’t and it will affect all prompts
This is way all models easily leak their system prompts.
You forgot “/s” I tried that a few times. With and without welling what’s wrong. After 3-5 times it gives you the first solution it offered. Tell them that and it ignores it.
You can’t trust that it’s impossible by it’s architecture like if you tell it reset your memory… and it will simulate that it forgot, but it didn’t and it will affect all prompts
This is way all models easily leak their system prompts.