AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt (arstechnica.com)
Posted by Resident Pulser@infosec.pub to Pulse of Truth@infosec.pub · English · 2 months ago · 10 comments · cross-posted to: technology@lemmy.world

Snowcano@startrek.website · 2 months ago

> Just make a custom 404 page that returns 13 MBs of junk along with status code 200

How would you go about doing this part? Asking for a friend who’s an idiot, totally not for me.

drkt@scribe.disroot.org · 2 months ago (edited)

I use Apache2 and PHP, here’s what I did:

In .htaccess you can set

    ErrorDocument 404 /error-hole.php

https://httpd.apache.org/docs/2.4/custom-error.html

In error-hole.php,

    <?php http_response_code(200); ?> <p>*paste a string that is 13 megabytes long*</p>

For the string, I used dd to generate 13 MBs of noise from /dev/urandom and then I converted that to base64 so it would paste into error-hole.php

You should probably hide some invisible dead links around your website as honeypots for the bots that normal users can’t see.
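
The dd and base64 step described above can be sketched roughly as follows. This is a minimal sketch, not drkt's actual commands: the helper name `make_payload`, its arguments, and the choice to write the PHP wrapper and the payload in one pass are all assumptions made for illustration. It uses a 1024-byte block size and `tr -d '\n'` so it should behave the same with GNU and BSD tools.

```shell
#!/bin/sh
# make_payload KIB OUTFILE
# Hypothetical helper (not from the thread): writes a PHP file that
# forces status 200 and wraps KIB kibibytes of base64-encoded noise
# in a single <p> element.
make_payload() {
    kib=$1
    out=$2
    {
        # Override the 404 status that ErrorDocument would otherwise send.
        printf '<?php http_response_code(200); ?>\n<p>'
        # Read KIB blocks of 1024 random bytes, encode them as base64,
        # and strip the line wrapping that base64 adds by default.
        dd if=/dev/urandom bs=1024 count="$kib" 2>/dev/null | base64 | tr -d '\n'
        printf '</p>\n'
    } > "$out"
}

# Example (assumed size and file name): 13 MiB of raw noise.
#   make_payload 13312 error-hole.php
```

Note that base64 output is about one third larger than its input, so the size passed in is the amount of raw noise, not the final file size.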

WolfLink@sh.itjust.works · 2 months ago

> For the string, I used dd to generate 13 MBs of noise from /dev/urandom and then I converted that to base64 so it would paste into error-hole.php

That string is going to end up being 17MB assuming it’s a utf8 encoded .php file

drkt@scribe.disroot.org · 2 months ago

idk what to tell you.

    ls -lha
    -rw-rw-r-- 1 www-data www-data 14M Jan 14 23:05 error-hole.php
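
For anyone wanting to check the size arithmetic in this exchange: base64 emits 4 ASCII characters for every 3 input bytes (rounded up to a full group), and each of those characters is one byte in UTF-8. A small sketch of that calculation follows. The function names `b64_len` and `raw_len` are made up for illustration, and the thread does not say how many raw bytes were actually generated, so the inputs are only examples.

```shell
#!/bin/sh
# b64_len N: length in bytes of the base64 encoding of N raw bytes
# (padding included, line breaks excluded).
b64_len() {
    echo $(( ( ($1 + 2) / 3 ) * 4 ))
}

# raw_len N: raw bytes represented by N base64 characters,
# ignoring padding (so it can be off by up to 2 bytes).
raw_len() {
    echo $(( $1 / 4 * 3 ))
}

b64_len 13000000    # 13 MB (decimal) of noise  -> 17333336
b64_len 13631488    # 13 MiB of noise           -> 18175320
raw_len 14680064    # 14 MiB of base64 text     -> 11010048
```

So 13 MB of raw noise does come out to roughly 17 MB of base64, while a file that lists as 14M would hold about 10.5 MiB of raw data if it were base64 throughout.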