
Minimizing GPU RAM and Scaling Model Training Horizontally with Quantization and Distributed Training


From: Continuous Improvement, a podcast about technology, business and personal development.



Length: 6 minutes
Released: Aug 8, 2024
Format: Podcast episode

Description


Training multibillion-parameter models poses significant challenges, particularly around GPU memory limits. A single NVIDIA A100 or H100 GPU, with its 80 GB of GPU RAM, often falls short for 32-bit full-precision models: at 4 bytes per parameter, the weights of a 20-billion-parameter model alone fill 80 GB, before gradients, optimizer states, and activations are counted. This blog post will delve into two powerful techniques to overcome these challenges: quantization and distributed training.
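As a rough illustration of the first technique, the sketch below estimates per-parameter weight storage at different bit widths and then loads a model with its weights quantized to 4-bit. It is a minimal sketch, assuming the Hugging Face transformers and bitsandbytes libraries are installed; the checkpoint name is only a placeholder, and the episode does not prescribe this particular toolchain.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Rough per-parameter storage: 4 bytes at fp32, 2 at fp16/bf16, 1 at int8, 0.5 at 4-bit.
def weight_memory_gb(n_params: float, bits: int) -> float:
    return n_params * bits / 8 / 1e9

for bits in (32, 16, 8, 4):
    print(f"20B params at {bits}-bit: {weight_memory_gb(20e9, bits):.0f} GB")

# Load a model with weights quantized to 4-bit (NF4) via bitsandbytes.
# "meta-llama/Llama-2-7b-hf" is just a placeholder checkpoint name.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16, storage stays 4-bit
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=quant_config,
    device_map="auto",  # place layers across available GPUs/CPU as needed
)
```

For the second technique, one common way to scale training horizontally is PyTorch's Fully Sharded Data Parallel (FSDP), which shards parameters, gradients, and optimizer state across GPUs so each device holds only a fraction of the full training state. The sketch below uses a toy placeholder model and a dummy loss purely to show the wiring; it is an illustrative example under those assumptions, not the episode's specific setup.

```python
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Launched with e.g.: torchrun --nproc_per_node=8 train.py
dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# Placeholder model; in practice this would be the large transformer being trained.
model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 4096),
).cuda()

# FSDP shards parameters, gradients, and optimizer state across all ranks.
model = FSDP(model)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(8, 4096, device="cuda")
loss = model(x).pow(2).mean()  # dummy loss just to exercise one training step
loss.backward()
optimizer.step()
optimizer.zero_grad()
dist.destroy_process_group()
```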
