Over time, many scientific repositories and file systems become disorganized, accumulating poorly described and error-ridden data. As a result, researchers often struggle to discover crucial data. In this poster, we present a collection of image processing modules that together extract metadata from a variety of image formats. We implement these modules in Skluma, a system designed to automatically extract metadata from structured and semi-structured scientific formats. Our modules apply several image metadata extraction techniques: processing file system metadata and header information, computing color content statistics, extracting text, forming feature-based clusters, and predicting tags with a supervised learning model. Our goal is to collect a large amount of metadata that can then be used to organize, understand, and analyze the data stored in a repository.
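
To make three of the techniques listed above concrete (file system metadata, header information, and color content statistics), here is a minimal Python sketch. It is not Skluma's implementation: the function names and the returned dictionary keys are hypothetical, the header parser handles only PNG files, and the color statistics assume the pixels have already been decoded into a list of (R, G, B) tuples.

```python
import os
import struct

# The fixed 8-byte signature that begins every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def file_system_metadata(path):
    """Collect basic metadata that the file system already stores."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),
        "extension": os.path.splitext(path)[1].lower(),
        "size_bytes": st.st_size,
        "modified": st.st_mtime,  # seconds since the epoch
    }


def png_header_metadata(path):
    """Read width, height, bit depth, and color type from a PNG IHDR chunk.

    Returns None if the file does not start with a PNG signature
    followed by an IHDR chunk.
    """
    with open(path, "rb") as f:
        head = f.read(26)
    if len(head) < 26 or head[:8] != PNG_SIGNATURE or head[12:16] != b"IHDR":
        return None
    # Bytes 16-25: width and height (big-endian uint32), bit depth, color type.
    width, height, bit_depth, color_type = struct.unpack(">IIBB", head[16:26])
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "color_type": color_type,
    }


def color_statistics(pixels):
    """Per-channel mean, minimum, and maximum for a list of (R, G, B) tuples.

    Decoding the image into pixels is assumed to happen elsewhere.
    """
    n = len(pixels)
    if n == 0:
        return None
    channels = list(zip(*pixels))  # three tuples: all R, all G, all B
    return {
        "mean_rgb": tuple(sum(c) / n for c in channels),
        "min_rgb": tuple(min(c) for c in channels),
        "max_rgb": tuple(max(c) for c in channels),
    }
```

Merging the dictionaries returned by these functions would give one metadata record per image. The remaining techniques (text extraction, feature-based clustering, and supervised tag prediction) would need image decoding and machine learning libraries and are left out of this sketch.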