In response to the rapid proliferation of massive and often illicit data scraping by AI companies, ImmuniWeb unveils a free online tool to quickly assess a website’s protection from bots.
As evidenced by the recent US Senate Judiciary Committee hearing “AI Industry’s Mass Ingestion of Copyrighted Works”, the unauthorized exploitation of copyrighted and proprietary creative works, including all kinds of texts, images, compositions and videos, by AI companies to train their Large Language Models (LLMs) is an emerging economic, legal and social problem.
According to the Database of AI Litigation (DAIL) by the Ethical Tech at the George Washington University, there are currently over 250 pending lawsuits against major AI vendors for, among other things, copyright infringement, unwarranted data scraping and even the exploitation of pirated content for AI training purposes.
In July, Cloudflare, a leading cybersecurity company whose technical solutions are estimated to protect over 20% of global websites, including the largest websites in most countries, announced that AI bots would be blocked by default to prevent unwarranted data scraping from websites protected by Cloudflare.
Today, to empower individuals and companies of all sizes to quickly ascertain whether their websites and web applications are properly protected from AI bots and other unauthorized data-scraping web crawlers, ImmuniWeb presents a new feature in its online website security test that checks for the presence and correct implementation of anti-bot protection.
The new feature quickly detects the presence of a Web Application Firewall (WAF) or any other network-wide or server-side security mechanism that prevents data scraping by rogue bots, and verifies the presence of a “robots.txt” file on the web server with instructions for rules-abiding AI bots not to crawl the website content. Importantly, many AI bots, the majority of which are operated by vendors outside of Europe and the US, deliberately ignore “robots.txt” instructions, leaving website owners with no recourse but to implement robust anti-bot protection to comprehensively block trespassing bots.
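To illustrate the “robots.txt” part of such a check, here is a minimal Python sketch that parses the text of a “robots.txt” file and reports which AI crawlers it tells to stay away. It is not ImmuniWeb's implementation: the function name `ai_bots_disallowed`, the sample file and the short list of crawler tokens are assumptions chosen for illustration, and the list is far from exhaustive. Only the standard-library `urllib.robotparser` module is used.

```python
from urllib.robotparser import RobotFileParser

# Illustrative, non-exhaustive list of AI crawler user-agent tokens (an assumption).
AI_USER_AGENTS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended", "PerplexityBot"]


def ai_bots_disallowed(robots_txt, agents=AI_USER_AGENTS, path="/"):
    """Map each agent to True if the given robots.txt text forbids it from fetching `path`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, path) for agent in agents}


# Hypothetical robots.txt: block one AI crawler, allow everything else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    for agent, blocked in ai_bots_disallowed(SAMPLE).items():
        print(f"{agent}: {'disallowed' if blocked else 'allowed'}")
```

A check like this only covers instructions addressed to rules-abiding bots. As noted above, crawlers that ignore “robots.txt” can only be stopped by a WAF or a comparable server-side mechanism, which this sketch does not attempt to detect.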
ImmuniWeb Website Security Test: Verifying Protection from AI Bots
Dr. Ilia Kolochenko, Chief Architect & CEO at ImmuniWeb, says: “We are delighted to bring this important and highly demanded feature to all our users, who are increasingly concerned over the amplitude of the irreparable harm caused by illicit data scraping that some AI companies and their clandestine suppliers in offshore jurisdictions deploy to ingest petabytes of creative content owned by third parties, which is subsequently exploited to train commercial LLMs without properly compensating the authors. Now anyone can easily verify whether their personal, academic or corporate websites are adequately protected from AI and other bots chasing for proprietary data without author’s permission. More is coming very soon, please stay tuned.”
The free website security test is part of ImmuniWeb’s Community Edition, which currently runs over 100,000 daily security scans in over 100 countries. Statistical data from the Community Edition has been utilized in the Verizon Data Breach Investigations Report (DBIR), to which ImmuniWeb is a Contributor, and leveraged in strategic partnerships that ImmuniWeb has with various NGOs and international organizations, including the UN ITU.
To test your website’s protection from AI bots, click on this link.