Cloudflare Offers Simpler Way To Stop AI Bots

Unchecked AI bots scraping content for their training could spell the end of the open web if enterprises follow one analyst's advice to put their intellectual property behind a paywall.

Credit: Michael Rivera

Content distribution network Cloudflare is making it simpler for customers who have had enough of badly behaved bots to block them from their website.

It's long been possible to prevent well-behaved bots from crawling your corporate website by adding a "robots.txt" file listing who's welcome and who isn't - and content distribution networks such as Cloudflare offer visual interfaces to simplify the creation of such files.
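As a rough illustration of how such a file is interpreted, the sketch below feeds a hypothetical robots.txt to Python's standard-library parser and asks whether a given bot may fetch a page. The file contents, the example.com URL and the is_allowed helper are assumptions made for this example; they are not taken from Cloudflare's tooling. A robots.txt file is only a request, so it restrains only crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: turn away two AI crawlers, welcome everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "ClaudeBot", "SomeOtherBot"):
        print(agent, is_allowed(agent, "https://example.com/articles/1"))
```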

But faced with the arrival of a new generation of badly behaved AI bots, scraping content to feed their large language models (LLMs), Cloudflare has introduced an even quicker way to block all such bots with one click.

"The popularity of generative AI has made the demand for content used to train models or run inference on skyrocket, and although some AI companies clearly identify their web scraping bots, not all AI companies are being transparent," Cloudflare staff wrote in a blog post.

According to the authors of the post, "Google reportedly paid $60 million a year to license Reddit's user generated content, Scarlett Johansson alleged OpenAI used her voice for their new personal assistant without her consent, and most recently, Perplexity has been accused of impersonating legitimate visitors in order to scrape content from websites. The value of original content in bulk has never been higher."

Last year, Cloudflare introduced a way for any of its customers, on any plan, to block specific categories of bots, including certain AI crawlers. These bots, Cloudflare said, honor the directives in sites' robots.txt files, and neither use unlicensed content to train their models nor gather it to feed retrieval-augmented generation (RAG) applications.

To do this, Cloudflare identifies bots by their "user-agent string" - a kind of calling card presented by browsers, bots and other tools requesting data from a web server.

"Even though these AI bots follow the rules, Cloudflare customers overwhelmingly opt to block them. We hear clearly that customers do not want AI bots visiting their websites, and especially those that do so dishonestly," the post said.

The top four AI web crawlers visiting sites protected by Cloudflare were Bytespider, Amazonbot, ClaudeBot and GPTBot, it said. Bytespider, the most frequent visitor, is operated by ByteDance, the Chinese company that owns TikTok. It visited 40.4% of protected websites, and is reportedly used to gather training data for ByteDance's LLMs, including those that support its ChatGPT rival Doubao. Amazonbot is reportedly used to index content to help Amazon's Alexa chatbot answer questions, while ClaudeBot gathers data for Anthropic's AI assistant Claude.

Blocking bad bots

Blocking bots based on their user-agent string will only work if such bots tell the truth about their identity - but there are signs that not all do, or not all the time.
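The limitation is easy to see in a minimal sketch. The code below refuses requests whose User-Agent header contains one of the four crawler names listed above; the substring-matching rule, the response_status helper and the sample header values are assumptions made for illustration and do not describe Cloudflare's actual detection logic.

```python
# Crawler names as reported in the article. The matching rule is an
# assumption for illustration, not Cloudflare's detection method.
AI_CRAWLER_TOKENS = ("bytespider", "amazonbot", "claudebot", "gptbot")


def is_declared_ai_crawler(user_agent: str) -> bool:
    """Return True if the user-agent string names a listed AI crawler."""
    ua = user_agent.lower()
    return any(token in ua for token in AI_CRAWLER_TOKENS)


def response_status(headers: dict[str, str]) -> int:
    """Return 403 for self-declared AI crawlers, 200 for everything else."""
    user_agent = headers.get("User-Agent", "")
    return 403 if is_declared_ai_crawler(user_agent) else 200


if __name__ == "__main__":
    # Made-up header values: one honest bot, one browser-like string.
    honest = {"User-Agent": "Mozilla/5.0 (compatible; GPTBot/1.0)"}
    browser_like = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}
    print(response_status(honest))
    print(response_status(browser_like))
```

A scraper that sends the second, browser-like header passes this check regardless of who operates it, because the header is supplied entirely by the client.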

In such cases, other measures will be necessary - and enterprises' main recourse against unwanted web scraping is normally reactive: pursuing legal action, according to Thomas Randall, director of AI market research at Info-Tech Research Group.

"While some software applications exist for web scraping prevention (such as DataDome and Cloudflare), these can only go so far: if an AI bot is rarely scraping a site, the bot may still go undetected," he said via email.

To justify legal action against the operators of bad bots, enterprises will need to do more than claim that the bot didn't leave when asked.

The best course of action, Randall said, is for "enterprises to hide intellectual property or other important information behind a membership paywall. Any scraping done behind the paywall is liable for legal action, reinforced with a clear restrictive copyright license on the site. The organization must, therefore, be prepared to legally follow through. Any scraping done on the public site is accepted as part of the organization's risk tolerance."

Randall noted that organizations with the resources to go further could consider rate-limiting connections to their site; automatically and temporarily blocking suspicious IP addresses; limiting information on why access has been blocked to a message such as "For help, contact support via [email protected]" in order to force a human interaction; and double-checking how much of their website's content is available through their mobile site and apps.
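Two of these measures, rate limiting and temporary IP blocks, can be sketched together in a few lines. The example below is one possible approach under stated assumptions: the RateLimiter class, its thresholds, the 429 status code and the wording of the block message are hypothetical choices for illustration, not recommendations from Randall or Cloudflare.

```python
import time
from collections import defaultdict, deque

# Deliberately terse reply that gives no reason for the block (assumed wording).
BLOCK_MESSAGE = "For help, contact support."


class RateLimiter:
    """Sliding-window limiter that temporarily blocks IPs exceeding the limit."""

    def __init__(self, max_requests=5, window_seconds=10.0,
                 block_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.block_seconds = block_seconds
        self._clock = clock
        self._recent = defaultdict(deque)  # ip -> timestamps of recent requests
        self._blocked_until = {}           # ip -> time at which the block ends

    def allow(self, ip: str) -> bool:
        now = self._clock()
        blocked_until = self._blocked_until.get(ip)
        if blocked_until is not None and now < blocked_until:
            return False
        self._blocked_until.pop(ip, None)  # any earlier block has expired

        recent = self._recent[ip]
        # Forget requests that have fallen out of the sliding window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        recent.append(now)

        if len(recent) > self.max_requests:
            self._blocked_until[ip] = now + self.block_seconds
            recent.clear()
            return False
        return True


def handle(limiter: RateLimiter, ip: str) -> tuple[int, str]:
    """Return an (HTTP status, body) pair for a request from ip."""
    if limiter.allow(ip):
        return 200, "OK"
    return 429, BLOCK_MESSAGE
```

The clock is injectable so the behavior can be checked without waiting in real time. As the article notes, a bot that scrapes rarely would stay under any such threshold.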

"Ultimately, scraping cannot be stopped, but hindered at best," he said.
