The 405B version of Llama 3 was trained on over 16K NVIDIA H100 GPUs, one of the largest clusters used to date.
Elon's cluster that went live in Memphis: 100K H100s to train Grok 3.
It is going to get interesting.