Managing Scientific Data with Named Data Networking




“Managing Scientific Data with Named Data Networking”
By Chengyu Fan, Catherine Olschanowsky, Susmit Shannigrahi, Christos Papadopoulos, Steve DiBenedetto, and Harvey Newman.
Network-aware Data Management Workshop, November 2015

Many scientific domains, such as climate science and High Energy Physics (HEP), have data management requirements that are not well supported by the IP network architecture. Named Data Networking (NDN) is a new network architecture whose service model is better aligned with the needs of data-oriented applications. NDN provides features such as best-location retrieval, caching, load sharing, and transparent failover that would otherwise be painstakingly (re-)implemented by each application using point-to-point semantics in an IP network. We present the first scientific data management application designed and implemented on top of NDN. We use this application to manage climate and HEP data over a dedicated, high-performance testbed. Our application has two main components: a UI for dataset discovery queries and a federation of synchronized name catalogs. We show how NDN primitives can be used to implement common data management operations such as publishing, search, efficient retrieval, and publication access control.