ScrubJay

ScrubJay is a framework for automatic, scalable data integration. You describe your input datasets (files, formats, database tables) and the semantics of the datasets you want, and ScrubJay derives them for you in a consistent, reproducible way.
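
To make the idea concrete, here is a toy sketch in Python of what "describe the inputs, name what you want, let the tool derive it" can look like. It is not ScrubJay's API: `natural_join`, `derive`, and the `jobs` and `racks` sample data are all hypothetical, and the search is deliberately naive (it tries each input, then each pairwise join on shared column names).

```python
from itertools import combinations

# Toy illustration only: none of these names come from ScrubJay itself.

def natural_join(left, right):
    """Join two non-empty lists of row dicts on every column name they share."""
    shared = set(left[0]) & set(right[0])
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[k] == r[k] for k in shared)
    ]

def derive(datasets, wanted):
    """Return rows holding the `wanted` columns, joining two inputs if needed."""
    wanted = set(wanted)
    candidates = list(datasets.values())
    # Also consider pairwise joins of inputs that share at least one column.
    for a, b in combinations(datasets.values(), 2):
        if a and b and set(a[0]) & set(b[0]):
            candidates.append(natural_join(a, b))
    for rows in candidates:
        if rows and wanted <= set(rows[0]):
            # Project each row down to just the requested columns.
            return [{k: row[k] for k in sorted(wanted)} for row in rows]
    raise ValueError("no derivation found for the requested columns")

# Hypothetical inputs: which node each job ran on, and which rack holds each node.
jobs = [{"job": 1, "node": "n1"}, {"job": 2, "node": "n2"}]
racks = [{"node": "n1", "rack": "r1"}, {"node": "n2", "rack": "r1"}]

result = derive({"jobs": jobs, "racks": racks}, ["job", "rack"])
```

Neither input relates jobs to racks directly; the request for `job` and `rack` is satisfied by joining the two inputs on their shared `node` column. A real system would also need to reason about units, time ranges, and longer chains of derivations, which this sketch ignores.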

ScrubJay was originally developed to analyze the supercomputing facilities at Lawrence Livermore National Laboratory, but it is not tied to any particular kind of data.
