MVT dataset, Minimum Viewing Time

The dataset provides an objective measure of the absolute difficulty of an image, with the aim of bridging the gap between machine and human-level performance in object recognition. The metric is the Minimum Viewing Time (MVT), which rates the difficulty of an image by how long subjects need to view a briefly flashed image in order to classify the object in it. A shorter viewing time indicates an easier image; a longer viewing time indicates a harder one.
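As a rough illustration of how a per-image MVT could be derived from trial records, the sketch below takes the shortest presentation time at which more than half of the responses for an image are correct. The majority threshold, the function name `minimum_viewing_time`, and the `(image_id, duration_ms, correct)` trial layout are assumptions made for this example; they are not taken from the dataset files.

```python
from collections import defaultdict

# Presentation durations in milliseconds, as listed in the dataset snapshot
# (10 sec is written as 10000 ms).
DURATIONS_MS = [17, 50, 100, 150, 250, 10_000]


def minimum_viewing_time(trials, threshold=0.5):
    """Return {image_id: shortest duration with accuracy > threshold, or None}.

    `trials` is an iterable of (image_id, duration_ms, correct) tuples.
    This layout and the majority threshold are assumptions for illustration.
    """
    # (image_id, duration_ms) -> [number correct, number of trials]
    counts = defaultdict(lambda: [0, 0])
    for image_id, duration_ms, correct in trials:
        entry = counts[(image_id, duration_ms)]
        entry[0] += int(correct)
        entry[1] += 1

    image_ids = {image_id for image_id, _ in counts}
    mvt = {}
    for image_id in image_ids:
        mvt[image_id] = None  # stays None if no duration reaches the threshold
        for duration in DURATIONS_MS:  # scan from shortest to longest
            n_correct, n_total = counts.get((image_id, duration), (0, 0))
            if n_total and n_correct / n_total > threshold:
                mvt[image_id] = duration
                break
    return mvt


# Toy trials, invented for the example.
example_trials = [
    ("a", 17, True), ("a", 17, True), ("a", 17, False),
    ("b", 17, False), ("b", 50, False), ("b", 100, True), ("b", 100, True),
    ("c", 17, False),
]
example_mvt = minimum_viewing_time(example_trials)
```

Under these assumptions, image "a" is recognized by a majority at 17 ms and is therefore easy, image "b" needs 100 ms, and image "c" is never recognized and gets no MVT.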

Data Card Author(s)

  • David Mayo* (corresponding author)
  • Jesse Cummings*
  • Xinyu Lin*
  • Dan Gutfreund
  • Boris Katz
  • Andrei Barbu

*equal contribution

Dataset Overview

Data Subject(s)

  • 2,647 participants from Amazon Mechanical Turk
  • 200,382 trials

Dataset Snapshot

  • 4,771 images from ImageNet and ObjectNet
  • 6 image presentation times: 17 ms, 50 ms, 100 ms, 150 ms, 250 ms, and 10 s
  • 42 presentations of each image
  • 7 presentations per image per timing
  • 50 object categories
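The counts in this card are mutually consistent, which a few lines of arithmetic can confirm. The variable names below are illustrative only; the numbers are the ones stated above.

```python
# Figures as stated in the data card.
n_images = 4771
n_durations = 6               # 17, 50, 100, 150, 250 ms and 10 s
presentations_per_timing = 7

# Each image is shown 7 times at each of the 6 durations.
presentations_per_image = n_durations * presentations_per_timing
total_trials = n_images * presentations_per_image

print(presentations_per_image)  # 42
print(total_trials)             # 200382
```

The product matches both the 42 presentations per image and the 200,382 trials reported under Data Subject(s).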

Sensitivity of Data

Sensitivity Type(s)

  • This dataset contains no sensitive PII

Anonymized Fields

  • worker_id: an anonymized random string corresponding to a unique worker. (S/PII were collected as part of the dataset creation process.)

Dataset Version and Maintenance

Maintenance Status

Limited Maintenance - The data will not be updated, but any technical issues will be addressed.

Version Details

Current Version: 1.0

Last Updated: 05/2023

Release Date: 05/2023

Maintenance Plan

This dataset will be hosted on MIT servers in perpetuity, with a backup on Dropbox. Our dataset collection toolbox is hosted publicly on GitHub.

Dataset License

Creative Commons

  • We release our data under the Creative Commons BY-SA license