Recently I worked on INT8 mixed-precision training in torchao. The relevant PR is pytorch/ao#748.
Preliminary results with torchtitan show a ~20% speedup on 8x A100, with no noticeable difference in the loss curve. See the PR for more details.
Would you be open to adding an experimental flag for this in torchtitan, similar to the one for Float8 training? That would also make it possible to profile and optimize INT8 training performance directly in torchtitan for future perf work.
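As a rough illustration of what I have in mind, the flag could mirror how Float8 training is toggled from the job config. This is only a hypothetical sketch: the section name `[int8]` and the option name `enable_int8_mixed_precision` are placeholders I made up, not existing torchtitan options.

```toml
# Hypothetical torchtitan job-config section, analogous to the Float8 one.
# Names below are placeholders, not an existing torchtitan API.
[int8]
enable_int8_mixed_precision = true
```

Internally this would call into the torchao INT8 mixed-precision training prototype from pytorch/ao#748, the same way the Float8 flag wires up torchao's Float8 support.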
cc @msaroufim