Wikipedia has been struggling with the impact that scraper bots, which harvest text and multimedia from the encyclopedia to train generative artificial intelligence models, are having on its servers, leading to higher costs and, in some cases, slower load times for users. Perhaps in an attempt to stop the bots from hammering Wikipedia's public website and consuming too much bandwidth, the Wikimedia Foundation (which manages Wikipedia's data) is offering AI developers a dataset that they can use freely.
The organization has partnered with Kaggle, a data science platform, to offer a beta version of a structured dataset in both English and French. According to Google, which owns Kaggle, the dataset is formatted for machine learning to make it more useful for training, development, and data science.
Wikimedia Enterprise says that the dataset includes “abstracts, short descriptions, infobox-style information, image links, and clearly segmented article sections.” It contains no links or other “non-core elements,” such as video clips. The lack of links could make attribution for information in the dataset somewhat murky. Nevertheless, Wikimedia Enterprise (the part of the Wikimedia Foundation that seeks to make Wikipedia data available through an API) says that the content in the dataset is freely licensed under Creative Commons, the public domain, and so on, since it all comes from Wikipedia.