To Measure Bias in Data, NIST Initiates ‘Fair Ranking’ Research Effort

New initiative has long-term goal of making search technology more even-handed

Published: Wednesday, December 18, 2019 - 13:02

A new research effort at the National Institute of Standards and Technology (NIST) aims to address a pervasive issue in our data-driven society: a lack of fairness that sometimes turns up in the answers we get from information retrieval software.

A measurably “fair search” would not always return the exact same list of answers to a repeated, identical query. Instead, the software would consider the relative relevance of the answers each time the search runs—thereby allowing different, potentially interesting answers to appear higher in the list at times.

Software of this type is everywhere, from popular search engines to less-known algorithms that help specialists comb through databases. This software usually incorporates forms of artificial intelligence that help it learn to make better decisions over time. But it bases these decisions on the data it receives, and if those data are biased in some way, the software will learn to make decisions that reflect that bias, too. These decisions can have real-world consequences—for instance, influencing which music artists a streaming service suggests, and whether you get recommended for a job interview.

“It’s now recognized that systems aren’t unbiased,” says Ellen Voorhees, a NIST computer scientist. “They can actually amplify existing bias because of the historical data the systems train on. The systems are going to learn that bias and recommend you take an action that reflects it.”

As a step toward confronting this problem, NIST has launched the Fair Ranking track this year as part of its long-running Text Retrieval Conference (TREC), which took place last month at NIST’s Gaithersburg, Maryland, campus. Proposed and organized by researchers from Microsoft, Boise State University, and NIST, the track—essentially an incubator for a new area of study—aims to coordinate research around the idea of fairness. By finding appropriate ways to measure the amount of bias in data and search techniques, the organizers hope to identify strategies for eliminating it.

“We would like to develop systems that serve all of their users, as opposed to benefiting a certain group of people,” says Asia Biega, a postdoctoral researcher at Microsoft Research in Montreal and one of the track’s co-organizers. “We are trying to avoid developing systems that amplify existing inequality.”

Although awareness of the trouble that biased data create is growing, there are also many different ways of defining and evaluating fairness in data sets and search tools. In order to keep the research effort focused, the organizers have chosen a particular set of data used by a specific search tool: the Semantic Scholar search engine, developed by the nonprofit Allen Institute for Artificial Intelligence to help academics search for papers relevant to their field. The Allen Institute provided TREC with a 400-gigabyte database of queries made to Semantic Scholar, the resulting lists of answers it returned, and—as a measure of each answer’s relevance to the searcher—the number of clicks each answer received.
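To make the click-count relevance signal concrete, here is a minimal sketch that computes a per-document click-through rate from a toy query log. The documents and log format are invented for illustration; this is not the actual Semantic Scholar data schema.

```python
from collections import Counter

def click_through_rates(log):
    """Estimate per-document relevance as click-through rate:
    times clicked / times shown in a result list."""
    shown = Counter()
    clicked = Counter()
    for results, click in log:
        shown.update(results)       # every document shown gets an impression
        if click is not None:
            clicked[click] += 1     # at most one click per query here
    return {doc: clicked[doc] / shown[doc] for doc in shown}

# Each hypothetical log entry: (documents shown, document clicked or None).
log = [
    (["a", "b", "c"], "a"),
    (["a", "b", "c"], "a"),
    (["a", "c", "d"], "c"),
    (["b", "c", "d"], None),
]
ctr = click_through_rates(log)  # e.g., "a" was clicked 2 of the 3 times shown
```

A real system would also correct for position bias, since documents shown lower on the page are clicked less regardless of relevance.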

The organizers also have elected to concentrate on a problem that often crops up both in commercial search engine results and in scholarly searches for research papers: the same answers appearing at the top of the list every time a particular search is run.

One problem with academic searches is that they often return a list of papers by the best-known researchers from high-ranked institutions. In one way, this approach to ranking makes sense, as conscientious scientists want to show they have reviewed the most relevant past research before claiming to have discovered something new. Similarly, when we use an internet search tool, we might not mind if we see a well-respected company at the top of our search for a product we know little about.

Although this result is fine if we are looking for the most popular answer, it is problematic if there are great numbers of worthwhile answers. To be sure, some of us do scroll through pages of results, but by and large, most people who use search tools never look beyond the first page.

“The results on that first page influence people’s economic livelihood in the real world,” Biega says. “Search engines have the power to amplify exposure. Whoever is on the first page gets more.”
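The exposure effect Biega describes can be quantified with a standard position-bias model from the fair-ranking literature, in which attention decays logarithmically with rank. The model below is a common assumption in this research area, not a measurement taken from the TREC data.

```python
import math

def exposure(rank):
    """Position-bias model: attention falls off logarithmically with
    rank (DCG-style discounting). Rank is 1-based."""
    return 1.0 / math.log2(rank + 1)

# Under this model, the first few positions capture the bulk of the
# attention paid to a ten-result page.
weights = [exposure(r) for r in range(1, 11)]
share_of_top_3 = sum(weights[:3]) / sum(weights)
```

Under this assumption, roughly half of all attention on a ten-result page goes to the top three positions, which is why repeatedly ranking the same items first concentrates exposure so heavily.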

A fair algorithm, by the research track’s measure, would not always return the exact same list of articles in the same order in response to a query, but instead would give other articles their fair share of exposure. In practice, this would mean more prominent articles might still show up more frequently, and the less prominent ones less so—but the returned list would not always be the same. It would contain answers relevant to the searcher’s needs, but it would vary in ways that would be quantifiable.
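One common way to realize this kind of randomized, relevance-weighted ranking is Plackett-Luce sampling, sketched below with invented relevance scores. This illustrates the general idea of fair exposure; it is not the specific metric or method adopted by the TREC track.

```python
import random
from collections import defaultdict

def sample_ranking(relevance, rng):
    """Plackett-Luce sampling: fill each position by drawing one of the
    remaining items with probability proportional to its relevance.
    High-scoring items usually rank high, but lower-scoring items
    still surface near the top some of the time."""
    remaining = dict(relevance)
    ranking = []
    while remaining:
        items = list(remaining)
        weights = [remaining[i] for i in items]
        choice = rng.choices(items, weights=weights, k=1)[0]
        ranking.append(choice)
        del remaining[choice]
    return ranking

# Hypothetical relevance scores (e.g., click counts) for four papers.
relevance = {"paper_a": 8.0, "paper_b": 4.0, "paper_c": 2.0, "paper_d": 1.0}

# Over many repeated "queries," first-position exposure is spread
# roughly in proportion to relevance instead of always going to one item.
rng = random.Random(0)
top_counts = defaultdict(int)
for _ in range(10_000):
    top_counts[sample_ranking(relevance, rng)[0]] += 1
```

Each individual ranking still looks sensible to the searcher, but across many repetitions every relevant paper receives a share of first-page exposure, which is the quantifiable variation the track's definition calls for.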

This is of course not the only way to define fairness, and Voorhees says she does not expect a single research project to solve such a broad societal problem as this one. She does say that quantifying the problem is an appropriate first step, however.

“It’s important for us to be able to measure the amount of bias in a system effectively enough that we can do research on it,” she says. “We need to measure it if we want to try.”

The Fair Ranking track is open to all interested research teams. NIST will make the official call this month for participation in the 2020 TREC, which will take place Nov. 18–20, 2020, in Gaithersburg, Maryland.

About The Author

NIST

Founded in 1901, the National Institute of Standards and Technology (NIST) is a nonregulatory federal agency within the U.S. Department of Commerce. Headquartered in Gaithersburg, Maryland, NIST’s mission is to promote U.S. innovation and industrial competitiveness by advancing measurement science, standards, and technology in ways that enhance economic security and improve our quality of life.