A supercomputer that was once the world’s fastest will be retired at Oak Ridge National Laboratory on August 1.
The supercomputer is a Cray XK7 machine called Titan. It is operated by the Oak Ridge Leadership Computing Facility. It’s a petaflop system capable of performing up to 27 quadrillion calculations per second.
Titan was the world’s fastest supercomputer in November 2012, but it was bumped to number two by Tianhe-2, a Chinese supercomputer, in June 2013. Still, Titan continued to rank as one of the world’s top 10 fastest supercomputers from its debut at number one in 2012 until this June, when it dropped to number 12.
In June, ORNL said Titan, which has been operating for seven years, will be decommissioned on August 1 and disassembled for recycling. Titan will be removed to make room for a new, much more powerful supercomputer, Frontier. That will be an exascale system capable of 1.5 exaflops, or 1.5 quintillion calculations per second (a billion billion calculations per second). Frontier will be a $600 million Cray computer that is expected to be the world’s most powerful when it debuts in 2021.
To prepare for the new machine, the Oak Ridge Leadership Computing Facility, or OLCF, is retrofitting 20,000 square feet of data center space that now includes Titan, its Atlas file system, and the Cray XC30 Eos cluster, all of which will be decommissioned in August.
Many OLCF system users are moving from Titan to Summit, an IBM AC922 supercomputer that launched in 2018 and has its own data center space. Summit, a $200 million system, is now the world’s fastest supercomputer. It is capable of 200 petaflops, or 200,000 trillion calculations per second. Frontier will eventually surpass Summit in computing power.
In a story published on the OLCF website, ORNL said June 30 was the last day that users could submit jobs to Titan or the Cray Eos cluster, which is also seven years old, before the August 1 decommissioning of both systems. Atlas, the file system, will be decommissioned on August 15.
“Titan has run its course,” said Operations Manager Stephen McNally. “The components of Titan are now seven years old, and it’s really impressive that users have been successfully producing high-impact science results since the system became available to them. But the reality is, in electronic years, Titan is ancient. Think of what a cell phone was like seven years ago compared to the cell phones available today. Technology advances rapidly, including supercomputers.”
Decommissioning a computer of Titan’s size requires collaboration between onsite staff, facility vendors, and users, ORNL said. So, OLCF staff members are supporting users who need to complete runs, save data, or transition their projects to Summit and other resources.
“We’ve communicated shutdown deadlines to users so they can be prepared while still getting high-quality research done,” McNally said. “One big task for users has been cleaning up 32 petabytes of data and moving data from Atlas to other storage systems.”
ORNL said electricians will safely shut down the nine-megawatt-capacity system, and Cray staff will disassemble and recycle Titan’s electronics and its metal components and cabinets (which predate the system, having first served as Jaguar’s cabinets).
“People ask why we can’t split up Titan and donate sets of cabinets to different research groups, but the answer is that it’s simply not worth the cost to a data center or university of powering and cooling even fragments of Titan,” McNally said. “Titan’s value lies in the system as a whole.”
ORNL said Titan has served hundreds of research teams around the world through more than 26 billion core hours of computing time. Researchers use supercomputers to work on complex problems in research areas that include advanced materials, artificial intelligence, astrophysics, biology, energy, genetics, human health, materials science, and physics. ORNL is a U.S. Department of Energy laboratory.
Before Titan, ORNL had Jaguar, a Cray XT5 system that was capable of 2.3 petaflops. It was the fastest petascale system on the semiannual TOP500 list of the world’s most powerful supercomputers in November 2009.
Titan, a second-generation petascale system capable of 27 petaflops, was designed to be about 10 times more powerful than Jaguar while consuming roughly the same amount of energy.
Titan was a new generation of supercomputer with a revolutionary architecture that combined AMD 16-core Opteron central processing units (CPUs) and NVIDIA Kepler accelerated processors known as graphics processing units (GPUs). The GPUs tackled computationally intensive math problems, while the CPUs efficiently directed tasks, ORNL said.
“Choosing a GPU-accelerated system was considered a risky choice,” said OLCF Program Director Buddy Bland. “A DOE independent project review committee insisted that we demonstrate that our users would be able to effectively use Titan for the broad range of modeling and simulation applications we support. We spent six months working with Cray, NVIDIA, and our users to convince the reviewers, DOE, and ourselves that GPUs would deliver what we needed. Yes, there was risk, but we developed effective ways to manage the risks and educate both our staff and users in how to use the system. The result has been a remarkably productive system that has led the way for many GPU-accelerated systems.”
Summit, which is about eight times more powerful than Titan, also uses CPUs and GPUs. In an interview after Summit’s debut last year, ORNL Director Thomas Zacharia said the supercomputer’s central processing units, or CPUs, can do very detailed high-precision 64-bit calculations, while the graphics processing units, or GPUs, can do faster mixed-precision 16-bit calculations.
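For readers who want to check the comparisons, the peak figures quoted in this story can be put side by side. The short Python sketch below is only an illustration: the function and variable names are made up for this example, the numbers are the peak values reported above (Frontier's is a projection), and peak figures are not the same as the benchmark results used for rankings.

```python
# Peak performance figures quoted in this article, in petaflops.
# 1 petaflop = 10**15 calculations per second; 1 exaflop = 1,000 petaflops.
PEAK_PETAFLOPS = {
    "Jaguar": 2.3,
    "Titan": 27.0,
    "Summit": 200.0,
    "Frontier": 1500.0,  # 1.5 exaflops, projected for 2021
}


def speedup(newer: str, older: str) -> float:
    """Return how many times higher the newer system's peak is (hypothetical helper)."""
    return PEAK_PETAFLOPS[newer] / PEAK_PETAFLOPS[older]


def calculations_per_second(system: str) -> float:
    """Convert a system's peak from petaflops to calculations per second."""
    return PEAK_PETAFLOPS[system] * 10**15


if __name__ == "__main__":
    pairs = [("Titan", "Jaguar"), ("Summit", "Titan"), ("Frontier", "Summit")]
    for newer, older in pairs:
        print(f"{newer} vs. {older}: about {speedup(newer, older):.1f}x")
```

On these peak numbers, Titan comes out at roughly 11.7 times Jaguar, Summit at roughly 7.4 times Titan, and Frontier at roughly 7.5 times Summit, in line with the "about 10 times" and "about eight times" descriptions.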
“The combination of traditional processors with graphics processing units to accelerate the performance of leadership-class scientific supercomputers is an approach pioneered by ORNL and its partners and successfully demonstrated through ORNL’s number one-ranked Titan and Summit supercomputers,” the lab said in a press release about Frontier in May.
Supercomputers are large. Titan is as big as a basketball court, and Summit is as big as two tennis courts. They have rows of cabinets housing computing equipment, lots of wiring, and large cooling systems. Frontier, the next supercomputer at ORNL, will weigh more than one million pounds, about as much as 35 school buses, said Peter Ungaro, president and chief executive officer of Cray. It will cover an area of 7,300 square feet, equal to almost two basketball courts.
Supercomputers are also fast and, with machine learning, considered “smart.”
As an example of Frontier’s power, it will be capable of loading 100,000 high-definition movies per second. It will have as much power as 160 of today’s top supercomputers combined.
There is international competition in supercomputing, particularly between the United States and China. Before Summit was named most powerful in June 2018, China had held the top supercomputing spot since June 2013, and the country had held the top two spots since June 2016.
Frontier is one of three exascale systems that will be developed at DOE laboratories in the next several years. Planning for the exascale computers was under way even before Summit was unveiled in June 2018.
See the OLCF story about Titan here.
More information will be added as it becomes available.
You can contact John Huotari, owner and publisher of Oak Ridge Today, at (865) 951-9692.
Copyright 2019 Oak Ridge Today. All rights reserved. This material may not be published, broadcast, rewritten, or redistributed.