a832a45c62f584931b2cabc3da38ea74ffbf323a
Increase model size (n_embd, n_layer, n_head) for the multi-GPU configuration. Explicitly set AdamW betas to (0.9, 0.99).
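The commit message names the changed settings but not their values. As a minimal sketch of what such a configuration might look like, the example below uses a hypothetical `TrainConfig` dataclass; the sizes (384/6/6 for the baseline, 768/12/12 for the multi-GPU variant) and the helper `multi_gpu_config` are illustrative assumptions, not values from this repository. Only the betas `(0.9, 0.99)` come from the commit message. If the optimizer is PyTorch's `torch.optim.AdamW` (also an assumption), its default betas are `(0.9, 0.999)`, so setting them explicitly lowers the second-moment decay rate.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TrainConfig:
    """Hypothetical training config; field names follow the commit message."""

    # Assumed baseline sizes; the commit does not state the actual numbers.
    n_embd: int = 384
    n_layer: int = 6
    n_head: int = 6
    # AdamW betas as given in the commit message.
    betas: Tuple[float, float] = (0.9, 0.99)

    def __post_init__(self) -> None:
        # Each attention head gets an equal slice of the embedding,
        # so the embedding width must divide evenly across heads.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")

    @property
    def head_dim(self) -> int:
        return self.n_embd // self.n_head


def multi_gpu_config() -> TrainConfig:
    """Return a larger model config (sizes are illustrative assumptions)."""
    return TrainConfig(n_embd=768, n_layer=12, n_head=12)


if __name__ == "__main__":
    cfg = multi_gpu_config()
    print(cfg, "head_dim =", cfg.head_dim)
```

When scaling `n_embd` and `n_head` together, keeping `n_embd` divisible by `n_head` avoids a shape error in the attention layers; the check in `__post_init__` surfaces that mistake at configuration time rather than at model construction.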
DeepHealth
Languages: Python 60.9%, Jupyter Notebook 39.1%