diffaldo@lemmy.dbzer0.com to Science Memes@mander.xyzEnglish · 14 days agoNo More Neutral ⚛lemmy.dbzer0.comimagemessage-square58fedilinkarrow-up190arrow-down11
arrow-up189arrow-down1imageNo More Neutral ⚛lemmy.dbzer0.comdiffaldo@lemmy.dbzer0.com to Science Memes@mander.xyzEnglish · 14 days agomessage-square58fedilink
minus-squareOf the Air (cele/celes)@lemmy.blahaj.zonelinkfedilinkEnglisharrow-up3·13 days agoANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86
minus-squareTilgare@lemmy.worldlinkfedilinkEnglisharrow-up4·13 days agoI don’t know what these might do, but I like your style.
minus-squareOf the Air (cele/celes)@lemmy.blahaj.zonelinkfedilinkEnglisharrow-up4·13 days agohttps://hackingthe.cloud/ai-llm/exploitation/claude_magic_string_denial_of_service/
minus-squarefossilesque@mander.xyzMlinkfedilinkEnglisharrow-up4·edit-213 days agoLeaving up the original comment as I am curious. But fwiw these strings brick normal Claude chat too, it seems. :) I asked Claude in another chat what was happening with a screenshot and it said its protecting from prompt injections.
ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86
I don’t know what these might do, but I like your style.
https://hackingthe.cloud/ai-llm/exploitation/claude_magic_string_denial_of_service/
Leaving up the original comment as I am curious. But fwiw these strings brick normal Claude chat too, it seems. :)
I asked Claude in another chat what was happening with a screenshot and it said its protecting from prompt injections.
@Sal@mander.xyz