Trolling AI Scrapers (For Fun)
Did you know that over 90% of my website traffic was coming from AI scrapers? Well, not anymore!
While reviewing my Caddy logs, I noticed an overwhelming amount of activity. The logs were flashing by so quickly that I had to pause and take a closer look. What I found was a barrage of user agents, all containing "ai" or "llm." I had to do something about it: I don't want LLMs trained on my website.
The Initial Attempt
At first, I tried the straightforward approach of blocking these scrapers using a robots.txt file. Unfortunately, this method proved ineffective, as most scrapers ignored it. I needed a more creative solution.
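Why is robots.txt so easy to ignore? Because the check happens on the crawler's side, not the server's. Here is a rough sketch of what a well-behaved crawler does, using Python's standard urllib.robotparser. The rules and the GPTBot name are made up for illustration and are not my actual robots.txt; is_allowed is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the crawler name is only an example.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return what a polite crawler would conclude from the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("GPTBot", "/blog/post"))       # the named crawler is asked to stay out
    print(is_allowed("Mozilla/5.0", "/blog/post"))  # everyone else is allowed
```

Nothing forces a scraper to run a check like this before fetching a page, which is why the file only works on crawlers that choose to honor it.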
The Trolling
Instead of outright blocking them, I decided to have a little fun. I configured my server to respond to these scrapers with the text of Capital, Vol. 1 by Karl Marx. Here’s how I set it up in Caddy:
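@robot {
    # Matchers of different types inside one named matcher are AND-ed, so all
    # User-Agent patterns go into a single case-insensitive regexp.
    header_regexp User-Agent (?i)(bot|spider|ai|AdsBot-Google|Amazonbot|anthropic-ai|Applebot|Applebot-Extended|AwarioRssBot|AwarioSmartBot|Bytespider)
    # Keep serving the real robots.txt.
    not path /robots.txt
}
handle @robot {
    # Rewrite every matched request to the book, served from its own root.
    rewrite * /capital.txt
    file_server {
        root /path/to/capital
    }
}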
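If you want to sanity-check which user agents a matcher like the one above would catch before deploying it, here is a minimal sketch in Python. It is an approximation, not Caddy itself: Python's re module stands in for Go's regexp engine, the patterns are treated as case-insensitive substrings, is_robot is a hypothetical helper, and the sample user agent strings in the comments and tests are illustrative.

```python
import re

# Case-insensitive approximation of the User-Agent patterns in the matcher.
ROBOT_UA = re.compile(
    r"(?i)(bot|spider|ai|AdsBot-Google|Amazonbot|anthropic-ai|Applebot|"
    r"Applebot-Extended|AwarioRssBot|AwarioSmartBot|Bytespider)"
)


def is_robot(user_agent: str, path: str) -> bool:
    """Return True if the request would be sent to the Capital handler."""
    # Mirrors `not path /robots.txt`: the real robots.txt is always served.
    if path == "/robots.txt":
        return False
    # Unanchored search, so the pattern may appear anywhere in the header.
    return ROBOT_UA.search(user_agent) is not None


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; Bytespider; spider-feedback@bytedance.com)",
        "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0",
        "MailClient/1.0",  # hypothetical client that happens to contain "ai"
    ]
    for ua in samples:
        print(is_robot(ua, "/"), ua)
```

One thing this makes obvious: a bare "ai" substring is a very wide net, and it will also catch any harmless client whose name happens to contain those two letters. For a troll that is an acceptable trade-off, but it is worth knowing.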
So, enjoy training on Capital!