Building the Next-Generation Data Networks Needed for High Energy Physics

Caltech’s High Energy Physics and network teams collaborate with partners to develop computer networks for “big data” science projects.
During a research exhibition at the Supercomputing 2018 Conference (SC18) in Dallas in November, Caltech’s High Energy Physics and network teams demonstrated the latest developments in global-scale computer networks, including several new methods for improving the speed and efficiency of data flows among widely distributed computing facilities in different regions of the world. The teams, led by Harvey Newman, Marvin L. Goldberger Professor of Physics, worked in collaboration with many university, laboratory, network, and industry partners.

Their efforts currently focus on the science program at the Large Hadron Collider (LHC) of the European Organization for Nuclear Research (CERN), which involves more than 170 computing and storage facilities worldwide, and on the Large Synoptic Survey Telescope (LSST), now under construction in La Serena, Chile. The methods and developments, however, are widely applicable to many other data-intensive disciplines.

“The intelligent software-driven end-to-end network systems and high-throughput applications we developed and demonstrated at SC18 with our partners will enable the LHC and other leading programs to meet the unprecedented challenges they face in global-scale processing, distribution, and collaborative analysis of massive data sets, and to operate with a new level of efficiency and control,” says Newman. “Our demonstrations signify the emergence of a new paradigm among widely distributed computing and storage facilities hosting exabytes of data, where intelligent software at the end-sites and in the networks themselves cooperate to reach stable equilibrium at high rates.” An exabyte of data is equivalent to 1 quintillion bytes, or 1 billion gigabytes.

Highlights of the SC18 demonstrations included: high-volume intercontinental data flows at close to wire speed going from the LSST telescope construction site in Chile to Dallas via Brazil and Miami, and onward to Korea via Chicago and Miami; a comprehensive intelligent network system that negotiates and provides end-to-end services across multiple labs, campuses, and intercontinental networks; a new unified software framework enabling the science programs to co-schedule massive distributed computing and network resources while balancing flows over multiple paths securely crossing many intermediate networks; and the first 400-gigabits-per-second data network on the SC18 floor, which supported a steady flow of close to 800 gigabits per second between the Caltech and USC booths. “