Prof. Jha receives NSF CAREER Award


Dr. Shantenu Jha received the NSF CAREER Award for his project "Abstractions and Middleware for D3 Science on NSF Distributed Cyberinfrastructure". This is a five-year project with a budget of $700,000. The abstract is given below.

Dr. Jha joined the Department of Electrical and Computer Engineering in January 2011 as an Assistant Professor. His interests lie in high-performance and distributed computing, computational and data-intensive science and engineering, and large-scale cyberinfrastructure for science and engineering.

"CAREER: Abstractions and Middleware for D3 Science on NSF Distributed Cyberinfrastructure"

This CAREER Award project will develop middleware to support Distributed Dynamic Data-intensive (D3) science on Distributed Cyberinfrastructure (DCI). Existing NSF-funded CI systems, such as the Extreme Science and Engineering Discovery Environment (XSEDE) and the Open Science Grid (OSG), use distributed computing to substantially increase the computational power available to research scientists around the globe; however, such distributed systems face limitations in their ability to handle the large data volumes being generated by today's scientific instruments and simulations. To address this challenge, the PI will develop and deploy extensible abstractions that will facilitate the integration of high-performance computing and large-scale data sets. Building on previous work on pilot-jobs, these new abstractions will implement the analogous concept of "pilot-data" and the linking principle of "affinity."

The result will be a unified conceptual framework for improving the matching of data and computing resources and for facilitating dynamic workflow placement and scheduling. This research has the potential to significantly advance multiple areas of science and engineering by generating production-grade middleware for accomplishing scalable big-data science on a range of DCI systems.

Increasingly, the high-performance computing resources available to scientific researchers are distributed across multiple machines in multiple locations. The integration of these resources requires a fabric of "middleware," upon which a wide variety of user applications, tools and services can be built and run. However, as more accurate and more ubiquitous scientific instruments and models produce ever-larger volumes of data, this distributed cyberinfrastructure (DCI) is confronting unprecedented data-handling challenges that exceed the capabilities of existing DCI middleware. In this project, the PI will develop, test and implement new middleware solutions specifically designed for the coming era of big-data distributed supercomputing.