Issues: pytorch/torchtitan

Issues list

[BE] remove unused value "max_batch_size" (label: CLA Signed)
#585 by XilunWu was merged Sep 19, 2024
max_batch_size argument in ModelArgs
#583 by eminorhan was closed Sep 19, 2024
Better lr tracking
#579 by philippguevorguian was closed Sep 16, 2024
fix float8 after the HSDP PR (label: CLA Signed)
#575 by tianyu-l was merged Sep 12, 2024
update README to include new features and remove outdated msg (label: CLA Signed)
#574 by tianyu-l was merged Sep 13, 2024
Fixes test_fused_rms_norm.py (label: CLA Signed)
#573 by fegin was merged Sep 10, 2024
How to calculate the total batchsize (label: question)
#572 by zyushun was closed Sep 10, 2024
Revert "merge upstream changes" (label: CLA Signed)
#570 by tianyu-l was merged Sep 8, 2024
merge upstream changes (label: CLA Signed)
#569 by tianyu-l was merged Sep 8, 2024
Removed unused dw tensor in Triton RMSNorm (label: CLA Signed)
#567 by awgu was merged Sep 4, 2024
Multi-node training without AWS EFA clusters (label: question)
#566 by LeoXinhaoLee was closed Sep 4, 2024
remove float8 install as H100 is not available in CI yet (label: CLA Signed)
#565 by tianyu-l was merged Aug 30, 2024
Add 3d+compile to test runner (label: CLA Signed)
#563 by H-Huang was merged Aug 30, 2024
[ez] Remove legacy float8 enabling flag in 405B toml (label: CLA Signed)
#559 by fduwjj was merged Aug 23, 2024
Address comment from PR(#557) (label: CLA Signed)
#558 by fduwjj was merged Aug 23, 2024
Fix the CI failure for using pciutils (label: CLA Signed)
#557 by fduwjj was merged Aug 23, 2024
Fix the performance number for 405B (label: CLA Signed)
#556 by fduwjj was merged Aug 23, 2024
remove PP tracer (label: CLA Signed)
#555 by tianyu-l was merged Aug 22, 2024
[405B] Add performance data for 405B model (label: CLA Signed)
#554 by fduwjj was merged Aug 23, 2024
Update 3b-cfg to latest (label: CLA Signed)
#553 by daviswer was closed Aug 21, 2024 Draft
405b more (label: CLA Signed)
#552 by fduwjj was closed Aug 21, 2024
[MRG] relax the FP8 CUDA arch limitation to SM89 (label: CLA Signed)
#549 by leeeizhang was merged Aug 21, 2024