Minimizing GPU RAM and Scaling Model Training Horizontally with Quantization and Distributed Training
Length: 6 minutes
Released: Aug 8, 2024
Format: Podcast episode
Description
Training multibillion-parameter models in machine learning poses significant challenges, particularly concerning GPU memory limitations. A single NVIDIA A100 or H100 GPU, with its 80 GB of GPU RAM, often falls short when handling 32-bit full-precision models. This blog post will delve into two powerful techniques to overcome these challenges: quantization and distributed training.
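To make the 80 GB figure concrete: at 32-bit full precision each parameter occupies 4 bytes, so the weights of a 20-billion-parameter model alone fill an 80 GB A100 before gradients, optimizer states, or activations are counted. The sketch below is illustrative only (the model size and GPU count are assumptions, not figures from the episode); it shows how quantizing to lower precision and sharding weights across multiple GPUs each shrink the per-device footprint.

```python
# Illustrative back-of-the-envelope memory arithmetic (assumed numbers, not
# from the episode): estimate the GPU RAM needed for a model's weights at
# different precisions, and the per-device share when weights are sharded
# evenly across several GPUs.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """GPU RAM in GB needed just for the model weights at a given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

def per_gpu_memory_gb(num_params: float, precision: str, num_gpus: int) -> float:
    """Approximate per-device weight memory when weights are sharded evenly."""
    return weight_memory_gb(num_params, precision) / num_gpus

if __name__ == "__main__":
    params = 20e9  # hypothetical 20-billion-parameter model
    for precision in ("fp32", "fp16", "int8", "int4"):
        single = weight_memory_gb(params, precision)
        sharded = per_gpu_memory_gb(params, precision, num_gpus=4)
        print(f"{precision}: {single:.1f} GB on one GPU, "
              f"{sharded:.1f} GB per GPU across 4 GPUs")
```

Running this prints 80 GB for fp32 on a single GPU, dropping to 20 GB with int8 quantization and to 10 GB per device when int4 weights are also sharded across four GPUs, which is the combined effect the episode's two techniques aim for.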