If you’re a Requester using the API on Mechanical Turk to create a lot of HITs, here are some tips to help you make your API usage more efficient:
- Track the HITIds that come back from CreateHIT: we often see Requesters use GetReviewableHITs/SearchHITs (GetAllHITs in .NET) to query which HITs they have in the system, and then operate on only a few of those HITs. These calls return far more HITs than you need, and you then have to sort through the results to find the ones you're looking for. If you store the HITIds when you create the HITs, you can call GetHIT with the HITId instead, and you'll find it responds faster.
- Dispose of old HITs: many Requesters don't explicitly dispose of their HITs. While the system will eventually Auto-Dispose these HITs after 120 days of inactivity, letting them accumulate after you are done with them increases the amount of data returned by your other calls, such as GetReviewableHITs/SearchHITs. As the data returned grows, these calls take longer to respond. Call DisposeHIT after you have approved or rejected the HIT and stored the Worker's submission. This reduces the un-actionable data returned by your API calls and improves performance.
- Use Larger Page Sizes: some of our APIs have pagination built in (e.g. GetReviewableHITs). When calling these paginated APIs with a large number of items, use larger page sizes (such as 100) rather than smaller ones, so that you need fewer calls to retrieve the full set.
- Try Different Sort Properties: If you have large numbers of HITs in the system and are looping through them to find specific HITs near the end, you can try using different sort properties to bring HITs you are looking for to the front. Also consider enabling or disabling filtering by HitType.
- Use the Reviewing State: If you are processing a large batch of HITs, you can use the Reviewing state to reduce the number of HITs returned in each call to GetReviewableHITs. Setting a HIT's status to Reviewing once you have picked it up lets you filter for only the new data that has been submitted since your last call. A while back, we posted the lifecycle of a HIT, which outlines the various states a HIT can be in. Reviewing is one of those states, and it can help streamline your API usage.
- Use just a few threads: Requesters who keep up steady activity from a limited number of client threads can get large amounts of work done. Increasing the number of threads does not always help, and too many threads can actually slow you down, because your threads will be competing with each other. There is no single ‘optimal’ setting for a Requester, so experiment with your settings to find your sweet spot.
- Manage your concurrency: bursts of activity on the same API are more likely to see errors associated with capacity or concurrency. Reducing ‘bursts’ in your calls should reduce error rates as well.
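
The last two tips can be sketched together in Python. This is a minimal illustration under stated assumptions, not Mechanical Turk client code: `Pacer`, `run_paced`, and the `call` argument are hypothetical names, and `call` stands in for whatever function in your own client issues a single API request (for example, a GetHIT for one tracked HITId). The worker count and interval are placeholder values to experiment with, not recommendations.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class Pacer:
    """Spaces out calls so that at most one starts per `min_interval` seconds."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = clock()

    def wait(self):
        # Reserve the next free time slot under the lock, then sleep
        # outside the lock so other threads can reserve their own slots.
        with self._lock:
            now = self._clock()
            slot = max(now, self._next_slot)
            self._next_slot = slot + self._min_interval
        delay = slot - now
        if delay > 0:
            self._sleep(delay)


def run_paced(items, call, max_workers=4, min_interval=0.2):
    """Apply `call` to each item using a few threads and evenly spaced starts.

    `call` is a hypothetical stand-in for one API request made by your client.
    Results come back in the same order as `items`.
    """
    pacer = Pacer(min_interval)

    def task(item):
        pacer.wait()  # smooth out bursts before each request
        return call(item)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, items))
```

A call such as `run_paced(hit_ids, fetch_hit, max_workers=3, min_interval=0.25)` (where `hit_ids` and `fetch_hit` are your own list and function) would start at most about four requests per second across three threads. Adjusting those two numbers is one way to look for the sweet spot mentioned above.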