Trolling AI Scrapers (For Fun)

Did you know that over 90% of my website traffic was coming from AI scrapers? Well, not anymore!

While reviewing my Caddy logs, I noticed an overwhelming amount of activity. The logs were flashing by so quickly that I had to pause and take a closer look. What I found was a barrage of user agents, all containing "ai" or "llm". I had to do something about it, because I don't want LLMs trained on my website.

The Initial Attempt

At first, I tried the straightforward approach of blocking these scrapers using a robots.txt file. Unfortunately, this method proved ineffective, as most scrapers ignored it. I needed a more creative solution.
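
The reason robots.txt fails here is that it is purely advisory: the crawler itself has to fetch the file and choose to obey it. Below is a minimal sketch in Python of what a well-behaved crawler does, using the standard library's urllib.robotparser. The robots.txt content, the example.com URL, and the is_allowed helper are assumptions for illustration, not the file from this site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: Bytespider
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite crawler runs a check like this before every request.
    print(is_allowed("Bytespider", "https://example.com/post"))
    print(is_allowed("Mozilla/5.0", "https://example.com/post"))
```

Nothing on the server enforces the answer. A scraper that never runs this check simply never sees the "no".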

The Trolling

Instead of outright blocking them, I decided to have a little fun. I configured my server to respond to these scrapers with the text of Capital, Vol. 1 by Karl Marx. Here’s how I set it up in Caddy:

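@robot {
    # Caddy ANDs different matcher types inside one matcher set, so the
    # generic keywords and the named crawlers live in a single regex.
    # header_regexp takes an optional name (ua), the header field, then
    # the pattern; (?i) makes the match case-insensitive.
    header_regexp ua User-Agent (?i)(bot|spider|ai|AdsBot-Google|Amazonbot|anthropic-ai|Applebot|Applebot-Extended|AwarioRssBot|AwarioSmartBot|Bytespider)
    # Keep robots.txt itself reachable.
    not path /robots.txt
}
handle @robot {
    # Serve files from the directory that holds the book.
    file_server {
        root /path/to/capital
    }
    # Whatever path was requested, answer with the same text file.
    rewrite * /capital.txt
}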
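
The matcher above keys on a few short keywords plus a list of named crawlers, so it is worth checking what those strings actually catch before pointing real traffic at the rule. Here is a rough sketch in Python, assuming case-insensitive substring matching. The looks_like_scraper helper and the sample user agents are made up for illustration; this is not what Caddy runs internally.

```python
import re

# Keywords and a few crawler names taken from the Caddy matcher.
# Assumption: case-insensitive substring matching on the User-Agent.
SCRAPER_PATTERN = re.compile(
    r"bot|spider|ai|AdsBot-Google|Amazonbot|anthropic-ai|Applebot|Bytespider",
    re.IGNORECASE,
)


def looks_like_scraper(user_agent: str) -> bool:
    """Return True if the user agent contains any scraper keyword."""
    return SCRAPER_PATTERN.search(user_agent) is not None


if __name__ == "__main__":
    samples = [
        "anthropic-ai",
        "Mozilla/5.0 (compatible; Amazonbot/0.1)",
        "Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0",
        "ExampleMailClient/1.0",  # hypothetical, contains "ai" inside "Mail"
    ]
    for ua in samples:
        print(looks_like_scraper(ua), ua)
```

The last sample is the catch: a keyword as short as "ai" also matches unrelated clients, so a real visitor with an unlucky user agent would get Marx too.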

So, enjoy training on Capital!

Inspiration