feat: Add load_model function and update training script

Added a `load_model` function to `utils.py` to allow loading of trained models from configuration and state dictionary files.

The `train_iter.py` script was also modified. The hunk shown below changes `TrainConfig`, setting `pdrop` and `token_pdrop` from 0.1 to 0.0, which disables dropout.
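
The body of `load_model` is not shown in this view. As a rough illustration of the pattern the message describes (rebuild a model from a configuration file, then load a state dictionary into it), here is a minimal PyTorch sketch. The `TinyModel` class, the JSON config format, and the `load_model` signature are all assumptions made for this example, not the code added to `utils.py`.

```python
import json
from pathlib import Path

import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for the project's real model class."""

    def __init__(self, n_embd: int, n_layer: int, pdrop: float = 0.0):
        super().__init__()
        self.drop = nn.Dropout(pdrop)
        self.layers = nn.ModuleList(
            [nn.Linear(n_embd, n_embd) for _ in range(n_layer)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = self.drop(torch.relu(layer(x)))
        return x


def load_model(config_path, state_path, device="cpu"):
    """Rebuild a model from a JSON config file and load its saved weights.

    Assumed signature: the real `load_model` in `utils.py` may differ.
    """
    # The config holds the constructor arguments used at training time.
    config = json.loads(Path(config_path).read_text())
    model = TinyModel(**config)

    # map_location lets a checkpoint saved on a GPU load on a CPU-only machine.
    state_dict = torch.load(state_path, map_location=device)
    model.load_state_dict(state_dict)

    model.to(device)
    model.eval()  # switch to inference mode so dropout is inactive
    return model
```

Under these assumptions, a checkpoint would be written with `torch.save(model.state_dict(), "model.pt")` alongside a JSON dump of the constructor arguments, and restored with `load_model("config.json", "model.pt")`. Saving only the state dictionary keeps the checkpoint independent of the class definition's module path, at the cost of needing the config to rebuild the architecture.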
2025-10-18 11:07:59 +08:00
parent f7356b183c
commit a631ac6d59
2 changed files with 48 additions and 2 deletions

@@ -23,8 +23,8 @@ class TrainConfig:
     n_embd = 120
     n_layer = 12
     n_head = 12
-    pdrop = 0.1
-    token_pdrop = 0.1
+    pdrop = 0.0
+    token_pdrop = 0.0
     # Training parameters
     max_iter = 200000