Meta Is Building an AI to Fact-Check Wikipedia
That’s right—all 6.5 million articles

Vanessa Bates Ramirez
Published: Thursday, September 29, 2022 - 11:02

Most people older than 30 probably remember doing research with good old-fashioned encyclopedias. You’d pull a heavy volume from the shelf, check the index for your topic of interest, then flip to the appropriate page and start reading. It wasn’t as easy as typing a few words into the Google search bar, but on the plus side, you knew the information you found in the pages of the Britannica or the World Book was accurate and true.

Not so with internet research today. The overwhelming multitude of sources is confusing enough, but with the proliferation of misinformation, it’s a wonder any of us believe a word we read online.

Wikipedia is a case in point. As of early 2020, the site’s English version was averaging about 255 million page views per day, making it the eighth-most-visited website on the internet. As of last month, it had moved up to the No. 7 spot, and the English version currently has more than 6.5 million articles. But as high-traffic as this go-to information source may be, its accuracy leaves something to be desired. The site’s page on its own reliability states, “The online encyclopedia does not consider itself to be reliable as a source and discourages readers from using it in academic or research settings.”

Meta—formerly Facebook—wants to change this. In a blog post published last month, the company’s employees describe how AI could help make Wikipedia more accurate. Though tens of thousands of people participate in editing the site, the facts they add aren’t necessarily correct; even when citations are present, they’re not always accurate or even relevant.

Meta is developing a machine learning model that scans these citations and cross-references their content with the corresponding Wikipedia articles to verify that not only do the topics line up, but that the specific figures cited are accurate. This isn’t just a matter of picking out numbers and making sure they match.
Meta’s AI will need to “understand” the content of cited sources. “Understand,” however, is a misnomer; as complexity theory researcher Melanie Mitchell would tell you, AI is still in its “narrow” phase, meaning it’s a tool for highly sophisticated pattern recognition, while “understanding” is a word used for human cognition, which is still a very different thing.

Meta’s model will “understand” content not by comparing text strings and making sure they contain the same words, but by comparing mathematical representations of blocks of text, which it arrives at using natural language understanding (NLU) techniques.

“What we have done is to build an index of all these web pages by chunking them into passages and providing an accurate representation for each passage,” Fabio Petroni, Meta’s Fundamental AI Research tech lead manager, tells Digital Trends. “That is not representing word-by-word the passage, but the meaning of the passage. That means that two chunks of text with similar meanings will be represented in a very close position in the resulting n-dimensional space where all these passages are stored.”

The AI is being trained on a set of four million Wikipedia citations, and besides picking out faulty citations on the site, its creators would like it to eventually be able to suggest accurate sources to take their place, pulling from a massive index of data that is continuously updated.

One big issue left to work out is a grading system for sources’ reliability. A paper from a scientific journal, for example, would receive a higher grade than a blog post. The amount of content online is so vast and varied that you can find “sources” to support just about any claim, but parsing the misinformation from the disinformation (the former is merely incorrect; the latter deliberately deceives), the peer-reviewed from the non-peer-reviewed, and the fact-checked from the hastily slapped-together is no small task.
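Petroni’s description of the approach (chunk pages into passages, represent each passage as a point in an n-dimensional space, and treat nearby points as similar in meaning) can be illustrated with a toy sketch in Python. This is a hypothetical, minimal illustration: Meta’s actual system uses learned neural NLU embeddings, while the bag-of-words vectors and cosine similarity below are simple stand-ins chosen only to show the geometry of the idea.

```python
import math
from collections import Counter

def embed(passage: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector. Meta's real index uses
    # learned neural representations; word counts merely illustrate the
    # idea of passages as points in an n-dimensional space.
    return Counter(passage.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    # Passages with similar vectors sit "close" together in that space.
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.hypot(*a.values()) * math.hypot(*b.values())
    return dot / norm if norm else 0.0

# Build the index: chunk source pages into passages, one vector each.
index = [(p, embed(p)) for p in [
    "The Eiffel Tower is 330 metres tall.",
    "Paris is the capital of France.",
]]

# Verify a citation: embed the cited claim, retrieve the nearest passage.
claim = "The Eiffel Tower stands 330 metres tall."
best_passage, score = max(
    ((p, cosine_similarity(embed(claim), v)) for p, v in index),
    key=lambda pair: pair[1],
)
print(best_passage, round(score, 2))
```

At the scale Meta describes, a continuously updated index holds passages from millions of web pages, so an approximate nearest-neighbor search over dense vectors would replace the exhaustive loop above.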
But it’s a very important one when it comes to trust. Meta has open-sourced its model, and the curious can see a demo of the verification tool. Meta’s blog post notes that the company isn’t partnering with Wikimedia on the project, which is still in the research phase and not currently being used to update content on Wikipedia.

If you imagine a not-too-distant future where everything you read on Wikipedia is accurate and reliable, wouldn’t that make doing any sort of research a bit too easy? There’s something valuable about checking and comparing various sources ourselves, isn’t there? It was a big leap to go from paging through heavy books to typing a few words into a search engine and hitting the “Enter” key. Do we really want Wikipedia to move from a research jumping-off point to a gets-the-last-word source?

In any case, Meta’s AI research team will continue working toward a tool to improve the online encyclopedia. “I think we were driven by curiosity at the end of the day,” Petroni says. “We wanted to see what was the limit of this technology. We were absolutely not sure if [this AI] could do anything meaningful in this context. No one had ever tried to do something similar.”

First published August 26, 2022, on Singularity Hub.
Vanessa Bates Ramirez is senior editor of Singularity Hub. She’s interested in biotechnology and genetic engineering, the nitty-gritty of the renewable energy transition, the roles technology and science play in geopolitics and international development, and countless other topics.
© 2023 Quality Digest. Copyright on content held by Quality Digest or by individual authors. Contact Quality Digest for reprint information.
“Quality Digest” is a trademark owned by Quality Circle Institute, Inc.