- Removed the redundant `run_evaluations_multi_gpu.sh` script.
- Updated `run_experiments_multi_gpu.sh` to handle evaluation jobs instead of training jobs.
- Changed command-line options to support evaluation-specific parameters.
- Implemented run directory discovery and validation for evaluation jobs.
- Enhanced logging to capture evaluation details and outputs.
- Added options for centralized output management and skipping existing results.
- Introduced `evaluate.py` for time-dependent model evaluation, covering data loading and model inference (see the first sketch after this list).
- Added `evaluation_time_dependent.py` to compute evaluation metrics, including AUC, average precision, and precision/recall at specified thresholds (second sketch below).
- Implemented cumulative incidence function (CIF) calculations in `losses.py` for different loss types, including exponential and piecewise exponential models (third sketch below).
- Created utility functions in `utils.py` for context selection and multi-hot encoding of events within specified horizons (final sketch below).
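
For reference, the inference loop in `evaluate.py` looks roughly like the sketch below. The `.npz` file layout, array names (`features`, `event_times`, `event_types`), and function names are hypothetical stand-ins, not the actual interface:

```python
from pathlib import Path

import numpy as np
import torch


def load_eval_batches(data_dir: Path, batch_size: int):
    """Yield (features, event_times, event_types) batches from .npz shards."""
    for npz_path in sorted(data_dir.glob("*.npz")):
        arrays = np.load(npz_path)
        n = len(arrays["features"])
        for start in range(0, n, batch_size):
            sl = slice(start, start + batch_size)
            yield (
                torch.as_tensor(arrays["features"][sl], dtype=torch.float32),
                arrays["event_times"][sl],
                arrays["event_types"][sl],
            )


@torch.no_grad()
def run_inference(model: torch.nn.Module, data_dir: Path, batch_size: int = 256):
    """Collect risk scores and ground truth for time-dependent metrics."""
    model.eval()
    scores, times, types = [], [], []
    for features, event_times, event_types in load_eval_batches(data_dir, batch_size):
        scores.append(model(features).cpu().numpy())
        times.append(event_times)
        types.append(event_types)
    return np.concatenate(scores), np.concatenate(times), np.concatenate(types)
```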
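A minimal sketch of the metric computation in `evaluation_time_dependent.py`, assuming binary labels of the form "an event occurs within the horizon" and ignoring censoring (the actual script may handle censoring differently); `time_dependent_metrics` is an illustrative name:

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score


def time_dependent_metrics(risk_scores, event_times, horizon, thresholds=(0.5,)):
    """Binary metrics for the label 'an event occurs within `horizon`'."""
    labels = (np.asarray(event_times) <= horizon).astype(int)
    risk_scores = np.asarray(risk_scores)
    # Assumes both classes occur; roc_auc_score raises otherwise.
    metrics = {
        "auc": roc_auc_score(labels, risk_scores),
        "avg_precision": average_precision_score(labels, risk_scores),
    }
    for thr in thresholds:
        preds = (risk_scores >= thr).astype(int)
        true_pos = int(np.sum((preds == 1) & (labels == 1)))
        metrics[f"precision@{thr}"] = true_pos / max(int(preds.sum()), 1)
        metrics[f"recall@{thr}"] = true_pos / max(int(labels.sum()), 1)
    return metrics
```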
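Both CIF forms have closed-form expressions in the single-risk case: `F(t) = 1 - exp(-rate * t)` for a constant hazard, and `F(t) = 1 - exp(-cumulative hazard at t)` for a piecewise-constant hazard. A numerical sketch under that single-risk assumption, with hypothetical function names (the actual `losses.py` methods, and any competing-risks handling, may differ):

```python
import numpy as np


def cif_exponential(rate: float, t):
    """Constant hazard `rate`: F(t) = 1 - exp(-rate * t)."""
    return 1.0 - np.exp(-rate * np.asarray(t, dtype=float))


def cif_piecewise_exponential(rates, breakpoints, t):
    """Piecewise-constant hazard: F(t) = 1 - exp(-cumulative hazard at t).

    `rates[j]` applies on [edges[j], edges[j + 1]); there is one more rate
    than there are breakpoints, with `rates[-1]` used beyond the last edge.
    """
    edges = np.concatenate([[0.0], np.asarray(breakpoints, dtype=float)])
    t = np.atleast_1d(np.asarray(t, dtype=float))
    cum_hazard = np.zeros_like(t)
    for j, rate in enumerate(np.asarray(rates, dtype=float)):
        left = edges[j]
        width = edges[j + 1] - left if j + 1 < len(edges) else np.inf
        # Time spent under rate j before t, clipped to the interval width.
        cum_hazard += rate * np.clip(t - left, 0.0, width)
    return 1.0 - np.exp(-cum_hazard)
```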
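Finally, a sketch of the `utils.py` helpers, with hypothetical names and signatures: context selection keeps events observed before a cutoff, and the multi-hot target marks which event codes occur within each horizon after it:

```python
import numpy as np


def select_context(event_times, event_codes, cutoff):
    """Keep only events observed strictly before `cutoff` as model context."""
    event_times = np.asarray(event_times)
    mask = event_times < cutoff
    return event_times[mask], np.asarray(event_codes)[mask]


def multi_hot_within_horizons(event_times, event_codes, cutoff, horizons, vocab_size):
    """Return a (len(horizons), vocab_size) array; entry [h, c] is 1 iff
    event code c occurs in the window (cutoff, cutoff + horizons[h]]."""
    event_times = np.asarray(event_times, dtype=float)
    event_codes = np.asarray(event_codes, dtype=int)
    targets = np.zeros((len(horizons), vocab_size), dtype=np.float32)
    for h, horizon in enumerate(horizons):
        in_window = (event_times > cutoff) & (event_times <= cutoff + horizon)
        targets[h, np.unique(event_codes[in_window])] = 1.0
    return targets
```

For example, events at times `(1, 3, 7)` with codes `(2, 0, 2)`, a cutoff of `2`, and horizons `(2, 10)` yield a context of just the event at time 1, a first target row marking only code 0 (window `(2, 4]`), and a second row marking codes 0 and 2 (window `(2, 12]`).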