Editor's Note: The Partner Spotlight series features posts from our Partner
Network – a community of companies offering a wide range of services that
leverage Mechanical Turk.
Today’s post is from Kevin Dodds, a Senior
Project Manager at Information
Evolution, a consulting
partner focused on helping clients utilize
Mechanical Turk to collect large quantities of data online.
At Information Evolution we help clients collect
information from many sources across the web. Data
harvesting (or web scraping) can be a very cost efficient approach to extracting
data from websites. That said, sometimes
automated methods aren’t effective, and it’s actually easier, cheaper or more
accurate to have people find and aggregate the data instead.
Mechanical Turk enables us to drastically reduce
the turnaround time and cost of harvesting data by using Mechanical Turk Workers
in place of in-house resources or outsourcers.
In our post today, we share some of our primary strategies for managing
data collection projects on Mechanical Turk, as well as a few tips that
you can use to optimize results.
Using Plurality to Verify Results
For HITs where there is only one
correct result, we typically assign each HIT to multiple Workers and compare
their answers. In our experience, this method – referred to as the Simple Plurality Review Policy in Mechanical Turk’s API
documentation – can be a very effective strategy for verifying results.
If two or more Workers
agree, we treat the answer as correct.
If Workers disagree, we extend the
HIT to additional Workers until a sufficient percentage of
the Workers agree on the same answer. Keep in mind that Workers can agree on the wrong answer. For this reason, we
typically use Qualifications (specifically, Masters; see more below) to limit
access to Workers that meet our performance criteria. If Workers have a proven record of
submitting accurate results, chances are higher that they’ll agree on the
right answer.
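As a rough illustration of the agreement check described above, here is a minimal Python sketch. It is not Mechanical Turk's Review Policy implementation. The function name `plurality_answer` and the thresholds `min_agreement` and `min_share` are assumptions made for this example.

```python
from collections import Counter


def plurality_answer(answers, min_agreement=2, min_share=0.5):
    """Return the agreed answer, or None if more assignments are needed.

    answers       -- list of answers submitted by different Workers for one HIT
    min_agreement -- hypothetical minimum number of Workers who must agree
    min_share     -- hypothetical share of all answers the winner must exceed
    """
    if not answers:
        return None
    counts = Counter(answers)
    top_answer, top_count = counts.most_common(1)[0]
    # Require both an absolute number of matching Workers and a
    # sufficient share of all answers collected so far.
    if top_count >= min_agreement and top_count / len(answers) > min_share:
        return top_answer
    return None  # caller would extend the HIT to additional Workers
```

When the function returns `None`, the Requester would add assignments and call it again with the larger list of answers.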
Also, Plurality is ideal for clearly
defined, objective questions – particularly when the answer can be consistently
found from a specific resource online. For
example, if you’re asking Workers to verify a business address or phone number, you
could require Workers to use the business’s primary website or, alternatively, a
specific online directory.
Here are a few additional helpful
tips:
- Disagreements between Workers can yield valuable
insight, potentially exposing ambiguities in instructions or opportunities to
improve your HIT. Alternatively, Workers may disagree because there isn’t a clear
answer, which can also be helpful to know.
- Clearly defining
formatting requirements when Workers are submitting free text answers makes
comparing results easier and minimizes the number of assignments required. For
instance, when asking Workers to return a phone number, do you require “555-867-5309”
or “(555) 867-5309”? We typically use data validators in our HIT template to
standardize formatting and strip extraneous characters from Worker inputs. Using fuzzy logic (treating near-identical strings as the same answer) is another approach that
can make comparing answers easier.
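The formatting tip above can be sketched in Python as follows. This is an illustration only, not the validators used in our HIT templates. It assumes 10-digit US phone numbers, and the helper names `normalize_phone` and `fuzzy_match` and the `threshold` value are hypothetical. The fuzzy comparison uses the standard library's `difflib.SequenceMatcher`, which is one of several ways to measure string similarity.

```python
import re
from difflib import SequenceMatcher


def normalize_phone(raw):
    """Strip non-digits and return the number as 555-867-5309, or None if invalid.

    Assumes 10-digit US numbers, optionally prefixed with a country code of 1.
    """
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def fuzzy_match(a, b, threshold=0.9):
    """Treat two free-text answers as equal if they are similar enough."""
    # Lowercase and collapse whitespace before comparing.
    a_norm = " ".join(a.lower().split())
    b_norm = " ".join(b.lower().split())
    return SequenceMatcher(None, a_norm, b_norm).ratio() >= threshold
```

Normalizing answers before comparing them means two Workers who typed the same number in different styles count as agreeing, so fewer extra assignments are needed.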
Targeting the Best Workers with Masters
Masters
are Mechanical Turk Workers who’ve demonstrated accurate
performance while completing thousands of HITs for Requesters on Mechanical
Turk. While there’s an additional cost
per assignment with Masters, when used in conjunction with Plurality, agreement
between Workers tends to be higher. As a
result, we’ve been able to reduce the total number of assignments required to
process HITs, and subsequently, our overall cost. In some cases, our results
have been so accurate with Masters that we’ve been able to assign only one Worker per
HIT, creating even greater cost efficiency.
For more
subjective HITs where there isn’t one right answer, but rather varying degrees of
correctness, using Plurality may not be an option. If this is the case, qualifying Workers based
on skill or past performance is critical. Masters are often a great starting point, but
for the most complex tasks, we’ll often start by running small batches of HITs with Masters and
manually evaluating results. As we
review, we track which Workers meet our performance expectations and grant them
a custom Qualification. While this process is more time consuming, it will
enable you to narrow your Worker pool to only those that meet your specific
expectations.
Measuring Performance with Known Answers
Known Answers, another
Review Policy supported by Mechanical Turk’s API, is a further method of managing Worker performance
over time. We use Known Answers wherever
possible in conjunction with Qualifications and Plurality to manage quality. In
our experience, a Worker’s performance can deteriorate over time if
unmonitored. With Known Answers, Requesters can continually audit Worker inputs
in real time, while automating actions like approving or rejecting assignments
based on whether a Worker’s inputs match the answers you expect. Keep in mind that, just as with Plurality, Known
Answers work best with clearly defined, objective questions and consistent
formatting. When answers are free text, fuzzy logic or validators can make it
easier to detect when a Worker’s answer matches the Known Answer.
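To make the idea concrete, here is a small Python sketch of grading one assignment against known answers. It is an illustration under stated assumptions, not the Known Answers Review Policy itself. The function name `grade_against_known_answers`, the dictionary layout, and the `pass_ratio` threshold are all hypothetical.

```python
def grade_against_known_answers(worker_answers, known_answers, pass_ratio=0.8):
    """Score one assignment against the questions that have a known answer.

    worker_answers -- dict of question ID -> the Worker's free-text answer
    known_answers  -- dict of question ID -> the expected answer
    pass_ratio     -- hypothetical share of known answers that must match
    """
    checked = [q for q in known_answers if q in worker_answers]
    if not checked:
        return None  # nothing to audit in this assignment
    # Compare case-insensitively and ignore surrounding whitespace.
    correct = sum(
        worker_answers[q].strip().lower() == known_answers[q].strip().lower()
        for q in checked
    )
    score = correct / len(checked)
    decision = "approve" if score >= pass_ratio else "reject"
    return {"score": score, "decision": decision}
```

The exact-match comparison here is deliberately simple. For free-text fields, a validator or a similarity check could replace it.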
Additionally, by using the Known
Answers Review Policy in conjunction with Worker
Statistics, another tool available in the
API, you can track Workers’ performance on assignments over time. Based on their scores, you can automate
actions like rewarding the best Workers with bonuses or revoking Qualifications
from Workers whose performance wanes (thereby removing their access to your
HITs).
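The sketch below shows one way such score-based actions could be decided, written in Python. It does not call the Mechanical Turk API and is not the Worker Statistics tool. The class name `WorkerScoreboard` and the thresholds `bonus_at`, `revoke_below`, and `min_checked` are assumptions chosen for illustration. The actual bonus or Qualification change would be a separate API call.

```python
from collections import defaultdict


class WorkerScoreboard:
    """Accumulate known-answer results per Worker and suggest an action."""

    def __init__(self, bonus_at=0.95, revoke_below=0.7, min_checked=20):
        self.bonus_at = bonus_at          # hypothetical accuracy that earns a bonus
        self.revoke_below = revoke_below  # hypothetical accuracy that loses access
        self.min_checked = min_checked    # wait for enough data before acting
        self.totals = defaultdict(lambda: [0, 0])  # worker_id -> [correct, checked]

    def record(self, worker_id, correct, checked):
        """Add the known-answer results from one assignment."""
        totals = self.totals[worker_id]
        totals[0] += correct
        totals[1] += checked

    def action_for(self, worker_id):
        """Return 'bonus', 'revoke', or 'keep' for this Worker."""
        correct, checked = self.totals[worker_id]
        if checked < self.min_checked:
            return "keep"
        accuracy = correct / checked
        if accuracy >= self.bonus_at:
            return "bonus"
        if accuracy < self.revoke_below:
            return "revoke"
        return "keep"
```

The `min_checked` guard keeps a Worker from being rewarded or removed on the basis of only a handful of audited answers.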
The methods above will improve
your ability to manage quality, especially when used in combination. Requesters
can also take additional steps to make HITs easier for Workers, which in turn can improve quality. Specifically, designing ergonomically
efficient HITs can improve Worker engagement, and as Workers become more
familiar with a specific HIT, the quality of their results typically improves. For instance, if your HIT requires a Google
Search, you might consider including a link in the HIT that opens a new window
with a pre-populated Google query.
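A pre-populated search link like the one described above could be generated as in this Python sketch. The helper name `google_search_link` and the default label are hypothetical, and how the resulting HTML is placed into a HIT template is left out.

```python
from html import escape
from urllib.parse import urlencode


def google_search_link(query, label="Search Google"):
    """Build an HTML link that opens a pre-populated Google search in a new window."""
    # urlencode handles spaces and special characters in the query.
    url = "https://www.google.com/search?" + urlencode({"q": query})
    return (
        f'<a href="{escape(url)}" target="_blank" rel="noopener">'
        f"{escape(label)}</a>"
    )
```

For example, `google_search_link("Information Evolution phone")` produces a link whose address ends in `q=Information+Evolution+phone`, so the Worker does not have to retype the query.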
We hope these insights are helpful
as you consider future data collection HITs.
For more information on Information Evolution’s Managed Crowdsourcing
services, visit our website.