Tesla has unveiled the latest version of its Dojo supercomputer, which drew so much power during testing that it tripped a local power substation in Palo Alto.
Dojo is Tesla’s own dedicated supercomputing platform, built from the ground up for machine learning and specifically for training on video data collected from Tesla electric vehicles.
The automaker already has a large supercomputer based on Nvidia GPUs, which is one of the most powerful in the world, but the new custom Dojo computer uses chips and infrastructure designed entirely by Tesla.
The new supercomputer will boost Tesla’s ability to train neural networks on video data, which is critical for the computer vision technology behind its self-driving effort. The company first announced Dojo a year ago, and it has now described the progress made since then.
Tesla claims that a single Dojo tile can replace six GPU boxes, and that the tile costs less than one GPU box. Each system tray holds six tiles, and Tesla says one tray is equivalent to “3-4 fully loaded supercomputer racks.” Two system trays fit in a single Dojo cabinet. Tesla is still developing and testing the infrastructure needed to combine multiple cabinets into the first Dojo Exapod, whose headline specifications are already known: 1.1 EFLOPS, 1.3 TB of SRAM, and 13 TB of high-bandwidth DRAM.
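The figures quoted above can be combined with some simple arithmetic. The Python sketch below is only an illustration: it takes Tesla's claims at face value, assumes that capacity scales linearly with the number of Exapods, and uses names such as `ExapodSpec` and `gpu_units_replaced_per_cabinet` that are invented here and do not come from Tesla.

```python
from dataclasses import dataclass

# Figures as quoted in the article (Tesla's own claims, not measurements).
GPU_UNITS_PER_TILE = 6   # GPU units one Dojo tile is said to replace
TILES_PER_TRAY = 6       # tiles on one system tray
TRAYS_PER_CABINET = 2    # system trays that fit in one Dojo cabinet


@dataclass(frozen=True)
class ExapodSpec:
    """Headline Exapod figures (hypothetical container type)."""
    eflops: float
    sram_tb: float
    dram_tb: float

    def scaled(self, count: int) -> "ExapodSpec":
        # Assumption: totals grow linearly with the number of Exapods.
        return ExapodSpec(
            self.eflops * count,
            self.sram_tb * count,
            self.dram_tb * count,
        )


EXAPOD = ExapodSpec(eflops=1.1, sram_tb=1.3, dram_tb=13.0)


def tiles_per_cabinet() -> int:
    """Tiles in one cabinet: trays per cabinet times tiles per tray."""
    return TRAYS_PER_CABINET * TILES_PER_TRAY


def gpu_units_replaced_per_cabinet() -> int:
    """GPU units one cabinet would stand in for, per Tesla's claim."""
    return tiles_per_cabinet() * GPU_UNITS_PER_TILE


if __name__ == "__main__":
    print("tiles per cabinet:", tiles_per_cabinet())
    print("GPU units replaced per cabinet:", gpu_units_replaced_per_cabinet())
    print("seven Exapods:", EXAPOD.scaled(7))
```

On these assumptions, one cabinet holds 12 tiles and stands in for 72 GPU units. Calling `EXAPOD.scaled(7)` gives a rough aggregate for the seven Exapods planned for Palo Alto.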
“We knew we needed to rethink every aspect of the data center infrastructure to deliver unprecedented cooling and power density,” said Bill Chang, Tesla’s chief system engineer for the Dojo project.
Tesla had to develop its own high-capacity cooling and power system for the Dojo cabinets. Chang confirmed that Tesla tripped its local power grid substation while testing Dojo: “Earlier this year, we started stress testing our power and cooling infrastructure and managed to push it past 2 MW before we tripped our substation and received a call from the city.”
There are currently plans to build seven Dojo Exapods in Palo Alto.