Named Data Networking Strategies for Improving Large Scientific Data Transfers

by Susmit Shannigrahi, Chengyu Fan and Christos Papadopoulos.
In Proceedings of the IEEE International Conference on Communications, May 2018.

Scientific workflows in fields such as climate science and High Energy Particle Physics (HEP) routinely generate and use large volumes of observed or simulated data. Users of the data are geographically dispersed and often need to transfer large volumes of data over the network for replication, archiving, or local analysis. Scientific communities have built sophisticated applications and dedicated networks to facilitate such data transfers, yet users continue to experience failures, delays, and unpredictable transfer latency.

Named Data Networking (NDN) is a new Internet architecture that can provide a much more flexible and intelligent network layer suitable for large data transfers. In this work, we use a real scientific data flow to demonstrate NDN’s flexibility and versatility, which can make it a suitable choice for large-data workflows. We use deadline-based data transfers as our driving example since they are widely used for HEP data flows. We first discuss several NDN-based forwarding strategies that can help such data flows. In addition to using standard forwarding strategies, we propose, at a high level, a bandwidth reservation protocol for NDN and an on-demand high-speed path creation mechanism. Using these as building blocks, we create a deadline-based data transfer protocol and show how NDN can simplify scientific data distribution that currently requires complex applications. Finally, we use a week-long HEP data log to evaluate our protocol analytically.