Harnessing the Power of Big Data

Rutgers engineers are working with big data, harnessing its power to make complex industrial processes run smoothly and to tackle challenges in health care. They are also building computing tools and software that allow high-energy physicists to probe the secrets of subatomic particles.

The School of Engineering is pioneering an industrial process that is pushing the bounds of big data in chemical engineering: the “continuous manufacturing” of pharmaceutical tablets. Pill-making today is a “batch process,” which until now has been the only way to meet the industry’s strict monitoring and quality-control requirements. But batch processes are slow and expensive.

“Time to market is very important,” says Rohit Ramachandran, associate professor in chemical and biochemical engineering. “The faster you produce something of good quality, the more money you will save.”

Although chemical engineers have been designing continuous processes for years, continuous manufacturing of pharmaceuticals poses new challenges.
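One reason the continuous approach leans so heavily on data is that quality has to be verified on the fly rather than at the end of a batch. The short Python sketch below is a hypothetical illustration of that idea, not Ramachandran’s actual control software: it assumes an in-line sensor streams tablet weights and flags any reading that drifts outside preset control limits.

# Hypothetical sketch of real-time quality monitoring on a continuous tablet line.
# The sensor model, target weight, and tolerance are illustrative assumptions.
import random

TARGET_MG = 500.0      # assumed nominal tablet weight
TOLERANCE_MG = 25.0    # assumed allowed deviation before a tablet is diverted

def weight_sensor(n_readings):
    """Simulate an in-line sensor streaming tablet weights in milligrams."""
    for _ in range(n_readings):
        yield random.gauss(TARGET_MG, 10.0)

def monitor(readings):
    """Check each tablet the moment it is measured, instead of after the batch."""
    for i, weight in enumerate(readings):
        if abs(weight - TARGET_MG) > TOLERANCE_MG:
            print(f"tablet {i}: {weight:.1f} mg out of spec, divert it")

if __name__ == "__main__":
    monitor(weight_sensor(1000))

The point of the sketch is the shape of the problem: every tablet produces a measurement, so a line running around the clock generates a continuous stream of data that must be checked as it arrives.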

A Rutgers development team is collaborating with Janssen Pharmaceuticals, a unit of Johnson & Johnson, to implement continuous manufacturing in a commercial production facility in Puerto Rico, a step that adds yet another big-data challenge for Ramachandran.

“You’ll have companies with sites across different geographic regions,” he says. “We will need cloud-computing strategies to quickly send large amounts of data between multiple sites around the world.”
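What moving large amounts of data between sites involves varies by cloud platform, but the bookkeeping is similar everywhere: split a dataset into chunks, checksum each chunk so the receiving site can verify it, and ship the chunks over the network. The sketch below is a generic, standard-library illustration of that bookkeeping; the file name and chunk size are assumptions, and it is not the project’s actual cloud pipeline.

# Generic sketch of preparing a large dataset for transfer to another site.
# The path and chunk size are illustrative assumptions.
import hashlib
from pathlib import Path

CHUNK_BYTES = 64 * 1024 * 1024   # 64 MB per chunk

def chunk_and_checksum(path):
    """Yield (index, sha256 digest, payload) for each chunk of a file."""
    with open(path, "rb") as f:
        index = 0
        while True:
            payload = f.read(CHUNK_BYTES)
            if not payload:
                break
            yield index, hashlib.sha256(payload).hexdigest(), payload
            index += 1

if __name__ == "__main__":
    manifest = [(i, digest, len(payload))
                for i, digest, payload in chunk_and_checksum(Path("plant_data.bin"))]
    # in a real pipeline each chunk would be uploaded here and re-verified on arrival
    print(f"{len(manifest)} chunks ready to send")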

To deal with the challenges of big data, Ramachandran tapped the expertise of colleague Shantenu Jha, an associate professor in electrical and computer engineering. Jha is helping Ramachandran integrate his applications with advanced resources such as supercomputers and cloud computing.

“In just a year’s time, we’ve been able to get Rohit’s team to start using the world’s fastest and most powerful supercomputers,” Jha says.

Jha heads a team of cyberinfrastructure experts called RADICAL—the “Rutgers Advanced Distributed Cyberinfrastructure and Applications Laboratory.” He describes it as “a producer of technologies for big data-enabled science.”

Supercomputers, Jha points out, have traditionally been used to model and simulate physical processes, such as colliding black holes or galaxy formation. “However, instead of just running simulations that produce data from equations and governing principles, there is a reverse need: to process large volumes of data and derive governing principles,” he says. “The RADICAL team makes middleware for the largest NSF project that is tasked with doing this.”
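To make the “reverse need” concrete: a simulation starts from a governing equation and produces data, while data-driven science starts from measurements and tries to recover the equation. The toy Python sketch below is not RADICAL’s middleware; it simply assumes noisy measurements that follow an unknown power law y = a·x^b and recovers a and b with a least-squares fit in log space.

# Toy illustration of deriving a governing relation from data.
# The power-law model and the synthetic measurements are assumptions for the example.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "measurements" secretly generated from y = 3.0 * x**1.5, plus noise.
x = np.linspace(1.0, 100.0, 200)
y = 3.0 * x**1.5 * rng.lognormal(sigma=0.05, size=x.size)

# Taking logs turns the power law into a straight line, so a linear fit recovers it.
b, log_a = np.polyfit(np.log(x), np.log(y), deg=1)
print(f"recovered law: y ~ {np.exp(log_a):.2f} * x^{b:.2f}")

At the scale Jha works with, the same idea plays out across thousands of compute nodes and far messier data, which is where middleware that coordinates those resources comes in.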

Jha also works on processing data from the ATLAS subatomic particle detector at the Large Hadron Collider in Geneva, Switzerland. ATLAS is possibly the first scientific project ever to have processed an “exabyte,” or a billion gigabytes, of data in a single year.

“The process of particles colliding is a technology tour de force,” Jha says. “From our point of view, it’s a massive source of data generation. By some accounts, it’s the world’s most data-intensive academic project.”

He is part of a project team that recently received $2 million in funding from the U.S. Department of Energy to help design the next generation of supercomputing resources that will help process all that data.