self-sabotage against llms

2025-06-22

Inspired by Bibbit's How to ruin an AI summary of your work..

These are all welcome tips:

Containerize local LLMs
Treat downloaded files from them as potentially hostile
Learn how different kinds of prompt injection works

"Defending your work" by hijacking LLM output via wholly incorrect info strikes me as odd. Yes, I understand impeding adaptive consumption might thwart bad apples. However, I'm not keen on the involved deception, accessibility costs, and potential for friendly fire. I feel the same way with statements like "expose the LLM-cheat as a fraud".

Certainly, such tactics could be weaponized against groups you associate with or like. Depending on AI uptake across those groups and beyond, you might not enjoy the results. In contrast, saying what you want gets to the point. It also avoids throwing those that support you and summarize via AI under the bus.

Content deserves more scrutiny against this. If I find these tactics employed, it's my duty to divert my focus toward content which welcomes me instead. That diversion per se might be overdue with how much I have seen people denigrate AI, LLMs, and more.

Want to reach out? Connect with me however you prefer:

Email me via your mail client
Copy my email address or remember it for later: yoursimperfect@proton.me
Email me via Letterbird contact form or open it in a new tab