Nightshade Antidote: A New Defensive Approach to Poisoned Images

A man dumping a vial of poison onto an '80s computer
SD Prompt: A man dumping a vial of poison onto an '80s computer. SDXL 1.0

The threat of Nightshade poisoning AI image datasets has been looming for the past few days, but Richard Aragon has a new defensive approach. According to a recent blog post by the creator, “I believe the answer is not more poisoning but instead releasing ‘antidotes’ to detect and filter out poisoned content.”

Before getting into all that, though, we will take a look at the tools and protocols currently in use: the measures meant to stop AI from scraping images into datasets, and the methods being used to counter them.

Meet Glaze

On March 15th, the University of Chicago introduced a free piece of software called ‘Glaze’. Glaze is designed to keep generative AI from mimicking an artist’s unique style. It works by making subtle alterations to a digital image that cloak it when posted online, making it (slightly more) difficult for generative AI to replicate the artist’s style. However, there are still workarounds.

Glaze was developed in response to artists’ concerns about AI programs that have collected their work. Programs like Midjourney and Stable Diffusion can easily emulate their styles using online image datasets, and with an ever-growing collection of LoRAs and models, this is a real concern for some. Glaze hopes to provide a hurdle, albeit a small one, for those worried about their art becoming part of a dataset without consent. According to its creators, “The hope is that Glaze and similar projects will protect artists at least until defensive laws or regulations can be implemented.” Because that totally stops people, the law.

Enter Nightshade

Nightshade is a data-poisoning attack that exploits a security vulnerability in generative AI: these models are trained on extensive internet-collected data, specifically images. Nightshade subtly disrupts those images and ‘poisons’ the dataset, and the corrupted data causes AI models to misinterpret what they see. According to the research paper that accompanied Nightshade, by Ben Zhao of the University of Chicago, it only takes about 0.1% of the images in a set to corrupt it.

Ben Zhao of the University of Chicago's example of Nightshade poisoned images in a trained dataset.

Going back through datasets and removing the tainted data is a laborious task for tech companies, as they must individually identify and delete each corrupted sample. Researchers tested Nightshade on different AI models, and even a small number of poisoned images led to unusual and inaccurate outputs.

In the example images included with the paper, we can see how, with just a very small set of poisoned samples, a model trained on a large dataset quickly begins to fall apart and confuse one object for another.

A New Defensive Approach To Nightshade Poisoned Images

The Simple Defense

Several means of removing Nightshade have been reported, including blurring the images a few times (just enough not to destroy the integrity of the image entirely) and running poisoned images through Stable Diffusion’s img2img at a VERY low denoising strength. Both were reported to successfully remove the poisoning. Now, however, we have Richard Aragon’s Nightshade Antidote.
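
For the curious, here is a minimal sketch of what that simple defense might look like in practice, assuming the Pillow and Hugging Face diffusers libraries. The model ID, blur radius, and denoising strength below are illustrative choices on our part, not values prescribed by the reports mentioned above.

# A minimal sketch of the "simple defense": a light Gaussian blur followed by
# a low-denoise Stable Diffusion img2img pass. Model ID, blur radius, and
# strength are illustrative assumptions, not values from the original reports.
import torch
from PIL import Image, ImageFilter
from diffusers import StableDiffusionImg2ImgPipeline

def light_blur(image, radius=1.0):
    # Mild blur: enough to disturb pixel-level perturbations,
    # not enough to visibly degrade the picture.
    return image.filter(ImageFilter.GaussianBlur(radius=radius))

def low_denoise_pass(image, prompt="a photo"):
    # img2img with a very low strength keeps the output close to the input
    # while the diffusion process re-renders the fine detail.
    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    result = pipe(prompt=prompt, image=image, strength=0.1, guidance_scale=5.0)
    return result.images[0]

img = Image.open("suspect.png").convert("RGB")
low_denoise_pass(light_blur(img)).save("cleaned.png")

Whether this actually strips a given poisoning scheme is an empirical question; the snippet simply reproduces the blur-plus-img2img recipe described above.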

How The Antidote Works

Nightshade Antidote performs a series of image-forensics checks intended to detect poisoning. According to the GitHub page, the script contains several functions that can be called independently to perform specific analyses (a rough sketch of what a couple of these checks could look like follows the list):

  • detect_copy_move – Detect copy-move forgery
  • analyze_metadata – Extract and print metadata
  • spectral_analysis – Frequency domain analysis
  • pixel_ordering_check – Check DCT coefficients
  • compression_artifacts_check – Check for JPEG artifacts
  • file_format_check – Verify file format
  • output_report – Generate analysis report
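
To make the idea concrete, here is an illustrative sketch of what two of the simpler checks above might look like. This is not the NightshadeAntidote source: it reuses two of the listed function names, but the implementations and the threshold value are our own assumptions, built on NumPy and Pillow.

# Illustrative sketch only -- not the NightshadeAntidote source. It reuses two
# of the listed function names, but the bodies and the threshold are our own
# assumptions, built on NumPy and Pillow.
import numpy as np
from PIL import Image

def spectral_analysis(path, threshold=0.25):
    # Frequency-domain check: flag images whose high-frequency energy share is
    # unusually large, a rough indicator of adversarial pixel perturbations.
    gray = np.asarray(Image.open(path).convert("L"), dtype=np.float32)
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray)))
    h, w = spectrum.shape
    cy, cx = h // 2, w // 2
    y, x = np.ogrid[:h, :w]
    # Everything outside a small central disc counts as "high frequency".
    high_freq = ((y - cy) ** 2 + (x - cx) ** 2) > (min(h, w) // 8) ** 2
    ratio = spectrum[high_freq].sum() / spectrum.sum()
    return ratio > threshold  # True means "suspicious"

def file_format_check(path):
    # Verify the file actually decodes as a valid image
    # rather than trusting its extension.
    try:
        with Image.open(path) as img:
            img.verify()
        return True
    except Exception:
        return False

# A dataset-ingestion gate in the spirit of the article:
# only admit images that pass the checks.
path = "candidate.png"
if file_format_check(path) and not spectral_analysis(path):
    print("image passes these basic checks")
else:
    print("image flagged; keep it out of the training set")

A real pipeline would run the full battery of checks listed above and calibrate its thresholds against known-clean images before rejecting anything.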

This series of checks is meant to keep corrupted images from ever entering a dataset: if an image matches the criteria for poisoning, it is not allowed in. The new methodology is best explained by the program’s author themselves:

From The Creator

The following is from creator Richard Aragon’s announcement of the tool on their blog.

There is no Defensive Mode in AI. Defense is a feature that AI does not understand. If you weaponize it, then that only goes one way. Escalation theory. If you decide to poison an AI model, the natural counter to that is to poison your AI model back. Nightshade is a GAN. It is designed to be the Generator within the GAN. The most natural counter I could possibly come up with in that would be to flat out poison your GAN. To release a model that poisons Nightshade. That would be the most natural counter to it.

The reason why this code is so long, and complex, is because I very actively chose to avoid that method. All this code does is detect the adversary. It is a defensive tool as written, rather than an offensive one. It would have been 10x easier to create code that poisons any model that tries to poison another model. Where would that get AI in the end? An eye for an eye is a crappy methodology for humans to follow, why are we so quick to introduce to AI models? 

Do I endorse AI companies scraping web data for profit? No. The opposite. Sue the pants off of any company who has engaged in such practices, that is my stance on that. I hope they lose the court battles. Don’t introduce poison to the AI models though because the companies that created them scraped data. This is counter intuitive on too many levels for me to support. That is why Nightshade Antidote exists. That is why an antidote will be quickly invented and released for anyone else who tries similar tactics. 

An AI Man inspecting a computer
SD Prompt: A Man inspecting a computer

Moving Forward

Having an approach that counteracts AI dataset poisoning without going on the offensive is key. And while tools like Nightshade will evolve, and even come and go, what needs to stay is that defensive way of thinking about the problem.

AI is still a new tool for people to utilize, and ‘counter AI’ measures are something that will likely persist in the future. How we develop these tools, their countermeasures, and those countermeasures’ countermeasures is what we need to think about. The worry of AI only needs to be the worry of the creator.

Try It Out

The GitHub repository for Nightshade Antidote can be found here: /RichardAragon/NightshadeAntidote. The tool is licensed under MIT, which, according to the creator, allows for very permissive use, modification, and distribution. This means that as Nightshade evolves, the tools to counter it can advance just as quickly.
