Websites & users can now opt out of Google's Bard & upcoming AI training
Google has introduced a control that lets you decide whether or not to grant it permission to use your web content to train its Bard AI and any future models.

Highlights
- The use case of AI training differs significantly from web indexing
- Google's new policy attempts to balance consent and authenticity
In a significant development, website owners and users of Google services and apps have been given the right to choose whether the tech giant can use their data or content to feed Bard, its AI chatbot, or any AI tools it plans to build in the future.
Numerous users have complained about Google scraping data from their websites without consent, and even de-ranking human-written content in favour of AI-generated content.
Large language models are a specialised type of artificial intelligence (AI) trained on vast amounts of text data to understand existing content and generate original content. According to many web publishers, some of the data used to train these models was scraped without their consent or knowledge.
While Google’s approach to developing AI aims to be ethical, training an AI model is a very different use case from indexing the web (the process by which a search engine adds web content to its index). According to the company’s VP of Trust, Danielle Romain, many web publishers want the choice and control to at least see how their content and data are being used by generative AI.
How to stop Google Bard from using your data
Web publishers who do not wish to share their data with Google Bard can opt out by disallowing the “Google-Extended” user agent token in their website’s robots.txt file, the file that tells crawlers which parts of a site they may access. This instructs Google’s crawlers not to use the site’s content to train Bard or Google’s future AI models, as shown in the sketch below.
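As a minimal sketch, this is what the rule looks like when you want to block Google-Extended across an entire site; the token name and Disallow syntax follow the standard robots.txt convention, and you can narrow the path if you only want to exclude part of your site.

```
# Block Google's Google-Extended token so this site's content is not
# used to train Bard and future generative AI models.
User-agent: Google-Extended
Disallow: /
```

To double-check that the rule is live, here is a small Python sketch using the standard library’s robots.txt parser. The domain is a hypothetical placeholder, not something from the article; swap in your own site.

```python
# Minimal sketch: verify that a site's robots.txt disallows the
# Google-Extended token, i.e. that the site has opted out of AI training.
from urllib import robotparser

SITE = "https://example.com"  # hypothetical placeholder; use your own domain

rp = robotparser.RobotFileParser()
rp.set_url(f"{SITE}/robots.txt")
rp.read()  # fetch and parse the live robots.txt

# can_fetch() evaluates the parsed rules for the given user agent token.
if rp.can_fetch("Google-Extended", f"{SITE}/"):
    print("Google-Extended is still allowed: content may be used for AI training.")
else:
    print("Google-Extended is disallowed: the site has opted out.")
```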
Danielle Romain's suggestion
In a Google blog post, Romain noted that web publishers had been asking for greater choice and control over their content and how it is used. Notably, the word “train” never appears in the post, even though training is precisely what the raw data is being used for.
Instead, Romain framed the opt-out as a choice for platforms that do not want to contribute to making the AI model’s results more accurate, adding that the company is simply asking for users’ help in improving its AI rather than taking anything away from them.
Significance of consent in AI training
Consent is something Google should have secured before using website owners’ data to train Bard. While Google claims to have developed Bard with authenticity, a great deal of data had already been scraped for its training without permission.
Therefore, asking web publishers and users for permission is a more ethical route to training AI models, and it is a norm that should be established across the industry.