Crowdsourcing performance evaluations of user interfaces

Title: Crowdsourcing performance evaluations of user interfaces
Author: Komarov, Steven; Reinecke, Katharina; Gajos, Krzysztof Z

Note: Order does not necessarily reflect citation order of authors.

Citation: Komarov, Steven, Katharina Reinecke, and Krzysztof Z. Gajos. 2013. “Crowdsourcing Performance Evaluations of User Interfaces.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Paris, France, April 27-May 2, 2013.
Abstract: Online labor markets, such as Amazon's Mechanical Turk (MTurk), provide an attractive platform for conducting human subjects experiments because the relative ease of recruitment, low cost, and a diverse pool of potential participants enable larger-scale experimentation and a faster experimental revision cycle compared to lab-based settings. However, because the experimenter gives up direct control over the participants' environments and behavior, concerns about the quality of the data collected in online settings are pervasive. In this paper, we investigate the feasibility of conducting online performance evaluations of user interfaces with anonymous, unsupervised, paid participants recruited via MTurk. We implemented three performance experiments to re-evaluate three previously well-studied user interface designs. We conducted each experiment both in the lab and online with participants recruited via MTurk. The analysis of our results did not yield any evidence of significant or substantial differences in the data collected in the two settings: all statistically significant differences detected in the lab were also present on MTurk, and the effect sizes were similar. In addition, there were no significant differences between the two settings in the raw task completion times, error rates, consistency, or the rates of utilization of the novel interaction mechanisms introduced in the experiments. These results suggest that MTurk may be a productive setting for conducting performance evaluations of user interfaces, providing a complementary approach to existing methodologies.
Published Version: doi:10.1145/2470654.2470684
Terms of Use: This article is made available under the terms and conditions applicable to Open Access Policy Articles, as set forth at