# DeepHealth

## Evaluation

This repo includes two event-driven evaluation entrypoints:

- `evaluate_next_event.py`: next-event prediction using short-window CIF
- `evaluate_horizon.py`: horizon-capture evaluation using CIF at multiple horizons

### IMPORTANT metric disclaimers

- **AUC** reported by `evaluate_horizon.py` is “time-dependent” only because the label depends on the chosen horizon $\tau$. Without explicit follow-up end times / censoring, this is **not** a classical risk-set AUC with IPCW. Use it for **model comparison and diagnostics**, not strict statistical interpretation.
- **Brier score** reported by `evaluate_horizon.py` is an unadjusted diagnostic/proxy metric (no censoring adjustment). Use it to detect probability-mass compression / numerical stability issues; do not claim calibrated absolute risk.

### Example

```bash
# Next-event (no --horizons)
python evaluate_next_event.py \
  --run_dir runs/your_run \
  --tau_short 0.25 \
  --age_bins 40 45 50 55 60 65 70 inf \
  --device cuda \
  --batch_size 256 \
  --seed 0

# Horizon-capture
python evaluate_horizon.py \
  --run_dir runs/your_run \
  --horizons 0.25 0.5 1.0 2.0 5.0 10.0 \
  --age_bins 40 45 50 55 60 65 70 inf \
  --device cuda \
  --batch_size 256 \
  --seed 0
```
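
To make the metric disclaimers concrete, here is a minimal pure-Python sketch of what "horizon-dependent label" and "unadjusted Brier score" can mean. It is not taken from `evaluate_horizon.py`: the helper names (`horizon_labels`, `unadjusted_brier`, `rank_auc`), the `<=` comparison against $\tau$, the use of `inf` for "no event observed", and the toy data are all assumptions made for illustration.

```python
import math


def horizon_labels(time_to_event, tau):
    """Binary label per sample: 1 if the event occurs within horizon tau, else 0.

    Assumption: time_to_event is math.inf when no event was observed. Such
    samples count as negatives at every horizon, which is why the metrics
    below carry no censoring adjustment.
    """
    return [1 if t <= tau else 0 for t in time_to_event]


def unadjusted_brier(pred_cif, labels):
    """Mean squared error between predicted CIF at tau and the binary label (no IPCW)."""
    return sum((p - y) ** 2 for p, y in zip(pred_cif, labels)) / len(labels)


def rank_auc(pred_cif, labels):
    """Plain AUC: probability that a positive outranks a negative (ties count 0.5)."""
    pos = [p for p, y in zip(pred_cif, labels) if y == 1]
    neg = [p for p, y in zip(pred_cif, labels) if y == 0]
    if not pos or not neg:
        return float("nan")  # undefined when one class is missing at this horizon
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: years until the event (inf = none observed)
# and a made-up predicted CIF at tau = 1.0 for each sample.
time_to_event = [0.4, 3.0, math.inf, 0.9]
pred_cif_1y = [0.7, 0.2, 0.1, 0.6]

labels = horizon_labels(time_to_event, tau=1.0)
brier = unadjusted_brier(pred_cif_1y, labels)
auc = rank_auc(pred_cif_1y, labels)
```

The same samples receive different labels at a different horizon (for example, `horizon_labels(time_to_event, 0.5)` marks only the first sample as positive), so the AUC changes with $\tau$ even though no risk set or censoring weight is involved.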