feat: Add load_model function and update training script
Added a `load_model` function to `utils.py` that loads a trained model from a configuration file and a state dictionary file. `train_iter.py` was also modified. The `TrainConfig` hunk in this commit changes `pdrop` and `token_pdrop` from 0.1 to 0.0, which disables dropout.
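The body of `load_model` is not shown in this commit view, so the following is only a minimal, framework-agnostic sketch of the pattern the message describes: build the model from a config file, then restore its saved state. The `TinyModel` class, the JSON config format, the pickle state file, and the `load_model` signature are all assumptions made for illustration. With PyTorch, the state file would typically be read with `torch.load` and the model would be an `nn.Module`.

```python
import json
import pickle
from pathlib import Path


class TinyModel:
    """Hypothetical stand-in for the real model class (not from the commit)."""

    def __init__(self, n_embd, n_layer, n_head, pdrop=0.0, token_pdrop=0.0):
        # Keep the hyperparameters so callers can inspect what was loaded.
        self.config = dict(n_embd=n_embd, n_layer=n_layer, n_head=n_head,
                           pdrop=pdrop, token_pdrop=token_pdrop)
        self.weights = {}

    def state_dict(self):
        return dict(self.weights)

    def load_state_dict(self, state):
        self.weights = dict(state)


def load_model(config_path, state_path, model_cls=TinyModel):
    """Rebuild a model from a JSON config file, then restore its saved state.

    Assumed signature; the real function in utils.py may differ.
    """
    config = json.loads(Path(config_path).read_text())
    model = model_cls(**config)
    with open(state_path, "rb") as f:
        # Assumption: pickle stands in for the framework's own deserializer.
        state = pickle.load(f)
    model.load_state_dict(state)
    return model
```

Constructing the model from the saved config before restoring the state keeps the architecture and the weights in agreement. Because pickle can execute arbitrary code on load, a loader like this should only be pointed at trusted files.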
@@ -23,8 +23,8 @@ class TrainConfig:
     n_embd = 120
     n_layer = 12
     n_head = 12
-    pdrop = 0.1
-    token_pdrop = 0.1
+    pdrop = 0.0
+    token_pdrop = 0.0
 
     # Training parameters
     max_iter = 200000