Scalable Data Coupling Abstraction for Data-Intensive Simulation Workflows


Professor Manish Parashar and Dr. Ivan Rodero received an NSF grant of $547,283 for the project "Scalable Data Coupling Abstraction for Data-Intensive Simulation Workflows". The project abstract is given below.

Coupled scientific simulation workflows running at extreme scales on high-end resources have the potential for achieving unprecedented levels of accuracy and providing dramatic insights into complex phenomena. However, as data volumes and generation rates grow, the costs (latencies and energy) associated with extracting this data and transporting it for coupling and analysis are dominating and are dictating the level of performance and productivity that can be achieved. The goal of this project is to develop conceptual solutions as well as a software framework that can enable large-scale data-intensive simulations. Our approach is based on the premise that, given the large data volumes and associated costs, data will largely have to be processed online, "in-situ" and "in-transit", while it is staged using resources within the computational platform, and that the programming and runtime system must provide abstractions and mechanisms that facilitate such data processing. Our effort is organized around three key research thrusts: (1) programming abstractions for in-situ/in-transit data management; (2) design and implementation of a scalable data staging substrate; and (3) data-centric mapping and scheduling.