Authors:
Kaushik Jagini 1; Yifan Zhang 1; Yichen Guo 2; Julian Goddy 3; Dale Stansberry 4; Joshua Agar 3 and Jeff Heflin 1
Affiliations:
1 Computer Science and Engineering, Lehigh University, Bethlehem, PA, U.S.A.
2 Materials Science and Engineering, Lehigh University, Bethlehem, PA, U.S.A.
3 Mechanical Engineering and Mechanics, Drexel University, Philadelphia, PA, U.S.A.
4 National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, TN, U.S.A.
Keyword(s):
Scientific Data Discovery, Data-Centric Indexing, Federated Data, User Interface, Semi-Structured Data.
Abstract:
There is a need for powerful, user-friendly tools for scientific data management and discovery. We present an architecture based on DataFed and Elasticsearch that lets scientists easily share the data they produce, together with a novel interface that lets other scientists discover data of interest. The interface presents summary-level information about a collection of datasets, which users can then refine using schema-free search. We extend the recent idea of cell-centric search to semi-structured data, describe the architecture of the system, present a use case from materials science, and evaluate the efficacy of the system.