What is the Met dataset?
The Met dataset is a large-scale dataset for Instance-Level Recognition (ILR) in the artwork domain.
- We rely on the open access collection of the Metropolitan Museum of Art (The Met) in New York to form the training set, which consists of about 400k images from more than 224k classes, with artworks spanning world-wide geographic coverage and chronological periods reaching back to the Paleolithic. Each museum exhibit corresponds to a unique artwork and defines its own class. The training set exhibits a long-tail distribution, with more than half of the classes represented by a single image, making it a special case of few-shot learning.
- We have established ground truth for more than 1,100 images taken by museum visitors, which form the Met queries. The goal is to recognize the Met exhibits depicted in the Met queries. There is a distribution shift between these queries and the training images, which are captured in studio-like conditions. We additionally include a large set of images unrelated to The Met, which form an Out-Of-Distribution (OOD) query set, the distractor queries. No Met exhibit is depicted in the distractor queries. The query set is the union of these two sets.
Recognition performance on the test set (queries) of the Met dataset is measured with average classification accuracy (ACC) on the Met queries, and with Global Average Precision (GAP) on all queries.
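GAP (also known as micro Average Precision) ranks the predictions of all queries jointly by confidence; a prediction made on a distractor query counts as a false positive. A minimal sketch of the computation (the function name and input format are our own illustration, not the provided evaluation code):

```python
def gap(predictions, num_met_queries):
    """Global Average Precision over all queries.

    predictions: one (confidence, is_correct) pair per query;
                 is_correct is always False for distractor queries.
    num_met_queries: number of Met queries (queries with a true class).
    """
    # Rank all predictions across all queries by descending confidence.
    ranked = sorted(predictions, key=lambda p: -p[0])
    correct_so_far = 0
    total = 0.0
    for rank, (_, is_correct) in enumerate(ranked, start=1):
        if is_correct:
            correct_so_far += 1
            total += correct_so_far / rank  # precision at this rank
    return total / num_met_queries
```

With this metric, a confident wrong prediction on a distractor pushes correct predictions down the ranking, so abstaining (low confidence) on OOD queries is rewarded.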
The images of the dataset and the ground-truth files can be downloaded from the links below. All images have been resized so that their largest side is 500 pixels.
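The provided images are already resized, but if you want to apply the same preprocessing to your own images, the target dimensions can be computed as follows (a sketch under our own assumptions; the exact resizing code used to produce the dataset is not specified here):

```python
def resized_dims(width, height, max_side=500):
    """Return (width, height) after scaling so the largest side is
    at most max_side pixels, preserving the aspect ratio.
    Images already small enough are left unchanged."""
    if max(width, height) <= max_side:
        return (width, height)
    scale = max_side / max(width, height)
    return (round(width * scale), round(height * scale))
```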
Models for descriptor extraction can be downloaded from the links below. They can be used directly to extract descriptors with the provided code for descriptor extraction.
- ResNet18-IN-SRC: trained on Met with contrastive loss (Syn+Real-Closest). Initialization: ImageNet pre-training
- ResNet18-SWSL-SRC: trained on Met with contrastive loss (Syn+Real-Closest). Initialization: SWSL
Descriptors extracted using the models above can be downloaded from the links below. They can be used directly with the provided code for kNN classification.
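The kNN classifier operates on L2-normalized global descriptors, so similarity reduces to a dot product. A minimal sketch of such a classifier (a simplification for illustration, not the repository code; names and the voting scheme are our own):

```python
import numpy as np

def knn_classify(query, train_desc, train_labels, k=1):
    """Classify one query descriptor by k nearest neighbours.

    query:       (D,) L2-normalized query descriptor.
    train_desc:  (N, D) L2-normalized training descriptors.
    train_labels: length-N list of class ids (one class per exhibit).
    Returns (predicted_label, confidence), where confidence is the
    best similarity among neighbours of the predicted class.
    """
    sims = train_desc @ query              # cosine similarity to all images
    top = np.argsort(-sims)[:k]            # indices of the k nearest images
    # Vote: pick the class with the highest summed similarity in the top-k.
    scores = {}
    for i in top:
        scores[train_labels[i]] = scores.get(train_labels[i], 0.0) + sims[i]
    pred = max(scores, key=scores.get)
    conf = max(sims[i] for i in top if train_labels[i] == pred)
    return pred, conf
```

The returned confidence can be thresholded, or fed directly to the GAP evaluation, so that distractor queries receive low-confidence predictions.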
Code is provided on GitHub to support:
- using the dataset
- performing the evaluation
- reproducing the experiments in the NeurIPS 2021 paper
The Met Dataset: Instance-level Recognition for Artworks
N.A. Ypsilantis, N. Garcia, G. Han, S. Ibrahimi, N. van Noord, G. Tolias
Accepted at NeurIPS 2021 Track on Datasets and Benchmarks.