feat: Add multi-GPU training, save training config, ignore *.pt

Add train_multigpu.py for distributed data parallel training.

Update train.py to save the training configuration to a JSON file.
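
A minimal sketch of what saving the training configuration to JSON could look like. The `TrainConfig` fields, the `save_config`/`load_config` helpers, and the file layout are hypothetical names for illustration; the actual train.py may structure this differently.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class TrainConfig:
    # Hypothetical fields; the real train.py may use different names.
    learning_rate: float = 3e-4
    batch_size: int = 64
    max_epochs: int = 10


def save_config(config: TrainConfig, path: str) -> None:
    """Write the config as indented JSON, creating parent directories."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", encoding="utf-8") as f:
        json.dump(asdict(config), f, indent=2)


def load_config(path: str) -> TrainConfig:
    """Read a config previously written by save_config."""
    with open(path, encoding="utf-8") as f:
        return TrainConfig(**json.load(f))
```

Writing the config next to the checkpoints makes it possible to reconstruct a run's hyperparameters later without re-reading the training script.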

Generalize .gitignore to exclude all *.pt checkpoint files.

Delete obsolete train_dpp.py file.
2025-10-17 14:09:34 +08:00
parent 053f86f4da
commit d760c45baf
4 changed files with 282 additions and 401 deletions

.gitignore

@@ -5,7 +5,7 @@
 __pycache__/
 # Model checkpoints
-best_model_checkpoint.pt
+*.pt
 # Large data files
 ukb_delphi.txt