Many Requesters want to manage HITs in groups that make sense for their use cases. For example, say a Requester wants to tag some pictures and has two collections: nature and people. The HITs are similar for both collections, but the Requester wants to manage each collection's HITs separately. The Requester also wants to control how the HITs are grouped for Workers, so that Workers work on the nature pictures as one group and the people pictures as another. How would the Requester manage these HITs in Mechanical Turk?
Requesters can use different techniques to manage groups of HITs in mTurk, so we’ll walk through the options depending on whether they use the User Interface, the API, or the command line tools. Whichever interface the Requester uses, they can also control how the HITs are grouped for Workers by managing the metadata associated with a HIT.
Using the Requester User Interface (https://requester.mturk.com)
The Requester User Interface provides a batching mechanism to manage your groups. When Requesters want to publish some HITs, they walk through a Design -> Publish -> Manage workflow. HITs are generated by merging the selected HIT Template with the input data provided, and are published in batches. The HIT Template contains the metadata associated with any HIT that is created from that template. A Batch is defined by what’s in the CSV file: all the HITs created from it (one per row of input) are in the same Batch, and the UI lets Requesters access them as Batches. Requesters use the Manage tab to review the different Batches and act on the data.
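The merge step can be pictured with a small sketch. This is not how the Requester UI is implemented; it only illustrates the "one HIT per CSV row" idea, assuming a template whose `${...}` placeholders are named after the CSV column headers. The template text, the `image_url` column, and the `build_batch` helper are all made up for illustration.

```python
import csv
import io
from string import Template

# Hypothetical HIT template body; ${image_url} is filled from the CSV column of the same name.
HIT_TEMPLATE = Template('<p>Tag this picture:</p><img src="${image_url}"/>')

def build_batch(csv_text):
    """Return one rendered HIT body per data row of the CSV input."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [HIT_TEMPLATE.substitute(row) for row in rows]

# One input file per picture collection gives one Batch per collection.
nature_csv = "image_url\nhttps://example.com/tree.jpg\nhttps://example.com/lake.jpg\n"
nature_batch = build_batch(nature_csv)
print(len(nature_batch))  # 2
```

Feeding a second input file (say, the people pictures) through the same function would produce a second, separate Batch.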
Workers see HITs on the mturk.com site grouped by the metadata – i.e. the HIT Template’s properties. Specifically, if the following fields - Title, Description, Keywords, ‘Reward per assignment’, ‘Results are automatically approved in’, ‘Time allotted per assignment’, and Masters & Additional Qualifications – are all the same, then the HITs will be grouped together for Workers. If any of those are different, then the HITs will be grouped separately for Workers. If the Requester publishes HITs multiple times using the same template, the Requester will see multiple Batches, but the Workers see all those HITs in the same group. If the Requester uses a different template with different metadata, then Workers will see those HITs in a different group.
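The grouping rule above can be sketched as a key built from the listed fields: HITs that agree on every field land in the same Worker-facing group, no matter which Batch they came from. The field names, sample values, and the `group_for_workers` helper are illustrative assumptions, not MTurk's own data model.

```python
from collections import defaultdict

# Fields named in the text above; HITs that match on all of them share a group for Workers.
GROUPING_FIELDS = ("title", "description", "keywords", "reward",
                   "auto_approval_delay", "assignment_duration", "qualifications")

def group_for_workers(hits):
    """Group HIT ids by the tuple of their grouping metadata."""
    groups = defaultdict(list)
    for hit in hits:
        key = tuple(hit[field] for field in GROUPING_FIELDS)
        groups[key].append(hit["id"])
    return list(groups.values())

base = {"description": "Add tags to a picture", "keywords": "image, tag",
        "reward": "0.05", "auto_approval_delay": 259200,
        "assignment_duration": 600, "qualifications": ()}
hits = [
    {"id": "n1", "title": "Tag nature pictures", **base},  # first Batch
    {"id": "n2", "title": "Tag nature pictures", **base},  # second Batch, same template
    {"id": "p1", "title": "Tag people pictures", **base},  # different template
]
print(group_for_workers(hits))  # [['n1', 'n2'], ['p1']]
```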
In our example of tagging pictures, the Requester can manage the nature and people pictures separately by using two templates and two different input files, one of each per collection.
Using the API
Requesters who use the API can use some simple techniques to keep their groups of HITs organized. A good way is to use a different HITType for each group. Requesters can use the RegisterHITType API to create a new HITType, which consists of the following properties of a HIT:
- Title
- Description
- Keywords
- Reward
- AssignmentDurationInSeconds
- AutoApprovalDelayInSeconds
- A set of zero or more QualificationRequirements
Changing any of these will result in a new HITType, while registering identical parameters returns the same HITType to the Requester. For each group of related work, a Requester can vary one or more of these attributes to create a new HITType; for example, changing the Title or Description yields a different HITType. Using a different HITType for each group provides an easy way to manage groups of HITs. Requesters can simply call CreateHIT with a group-specific HITType to set up different groups, say one HITType for people pictures and another for nature pictures in our example. Calls to GetReviewableHITs take the HITType as a parameter, so using a different HITType for each group sets you up conveniently for later retrieval. Workers will see HITs created from the same HITType in the same group. Note that this technique is for API users only: the UI does not expose the HITType for its Batches (although it associates HITType information with the HIT Template).
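As a rough sketch of this workflow in Python: the boto3 SDK exposes these operations under newer names, `create_hit_type`, `create_hit_with_hit_type`, and `list_reviewable_hits`, rather than the RegisterHITType, CreateHIT, and GetReviewableHITs names used above. The code assumes the caller passes in a boto3 MTurk client (for instance from `boto3.client("mturk")`); the helper names and the keyword, reward, duration, and assignment values are illustrative assumptions.

```python
def register_group(client, title, description):
    """Register (or look up) the HITType for one group of work and return its id."""
    response = client.create_hit_type(
        Title=title,
        Description=description,
        Keywords="image, tagging",
        Reward="0.05",                      # illustrative reward, in USD
        AssignmentDurationInSeconds=600,    # illustrative time allotted
        AutoApprovalDelayInSeconds=259200,  # illustrative auto-approval delay
    )
    return response["HITTypeId"]

def publish(client, hit_type_id, questions):
    """Create one HIT per question document under the given HITType."""
    hit_ids = []
    for question in questions:
        response = client.create_hit_with_hit_type(
            HITTypeId=hit_type_id,
            MaxAssignments=3,         # illustrative
            LifetimeInSeconds=86400,  # illustrative
            Question=question,
        )
        hit_ids.append(response["HIT"]["HITId"])
    return hit_ids

def reviewable_hit_ids(client, hit_type_id):
    """Return ids of reviewable HITs for one group (first page of results only)."""
    response = client.list_reviewable_hits(HITTypeId=hit_type_id)
    return [hit["HITId"] for hit in response["HITs"]]
```

A Requester in the picture-tagging example might call `register_group` once with a nature title and once with a people title, publish each collection under its own HITType id, and later pass the same id to `reviewable_hit_ids` to retrieve only that group.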
Using the Command Line Tools (CLT)
The CLT also provides a natural way to group items. Each tool in the CLT has options that specify its input and output file(s), which depend on the operation the tool performs. All the items included in an input or output file constitute a group of HITs that is acted upon every time the CLT command is invoked. For example, GetResults takes a file containing all the HITIDs to be retrieved and produces an output file with all the retrieved HIT data in it; the contents of the file define the group of HITs being acted upon. The Help documentation that is part of the Command Line Tools provides details for each command (as does typing in commandName -h). Users can manage different groups of HITs by keeping each group's items in its own input/output files.