Amirmohammad Farzaneh

Research Associate

NeurIPS 2025 Tutorial: From Tuning to Guarantees


December 19, 2025

Earlier this month at NeurIPS 2025, I had the pleasure of delivering a tutorial titled
“From Tuning to Guarantees: Statistically Valid Hyperparameter Selection.”
It was a genuinely rewarding experience, and I wanted to share a few reflections here, along with the tutorial materials.
The tutorial focused on a question that comes up repeatedly in modern machine learning practice:
how can we move beyond heuristic hyperparameter tuning and instead design procedures with explicit statistical guarantees?
We discussed a unifying perspective based on hypothesis testing, reliability graphs, and multiple testing control, covering methods such as Learn then Test (LTT), Pareto Testing, and more recent approaches for multi-objective and structured hyperparameter spaces.
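To make the hypothesis-testing view concrete, here is a minimal sketch in the spirit of LTT. It is not taken from the tutorial slides. It assumes losses bounded in [0, 1], a held-out calibration set for each candidate, a Hoeffding-based p-value, and a plain Bonferroni correction. The function names `hoeffding_p_value` and `select_reliable` and the toy data are illustrative only.

```python
import math


def hoeffding_p_value(empirical_risk, n, alpha):
    """P-value for the null 'true risk > alpha', assuming losses in [0, 1]."""
    gap = max(alpha - empirical_risk, 0.0)
    return math.exp(-2.0 * n * gap ** 2)


def select_reliable(losses_by_config, alpha, delta):
    """Return the candidates whose null is rejected at level delta / m.

    With the Bonferroni threshold delta / m, the probability of returning
    any candidate whose true risk exceeds alpha is at most delta.
    Each candidate is assumed to have a non-empty list of calibration losses.
    """
    m = len(losses_by_config)
    selected = []
    for name, losses in losses_by_config.items():
        n = len(losses)
        empirical_risk = sum(losses) / n
        if hoeffding_p_value(empirical_risk, n, alpha) <= delta / m:
            selected.append(name)
    return selected


# Toy calibration losses (0/1 errors) for three hypothetical candidates.
toy_losses = {
    "a": [1] * 50 + [0] * 950,   # empirical risk 0.05
    "b": [1] * 190 + [0] * 810,  # empirical risk 0.19
    "c": [1] * 500 + [0] * 500,  # empirical risk 0.50
}
reliable = select_reliable(toy_losses, alpha=0.2, delta=0.1)
print(reliable)
```

Bonferroni is the simplest form of multiple testing control. Procedures that exploit structure among the candidates can be less conservative, at the cost of a more involved testing order.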

How it went

I was delighted by the level of engagement throughout the session. Attendance was excellent, and the questions and discussions were thoughtful, technical, and wide-ranging, exactly what one hopes for in a tutorial setting. It was particularly encouraging to see interest from both theory-oriented researchers and practitioners thinking about deployment and reliability.
I am very grateful to Professor Osvaldo Simeone for his support and input during the preparation of this tutorial. Many of the ideas presented build on ongoing collaborations and discussions around statistically valid learning and decision-making.

Slides and recording

For those who attended and would like to revisit the material, or for anyone who could not make it in person:
  • The tutorial slides are attached to this post below.
  • The live recording is available via the NeurIPS platform.
The slides include references, algorithmic details, and examples that we did not have time to fully expand on during the live session.
Thanks again to everyone who attended and contributed to the discussion. I hope the material proves useful, and I look forward to continuing the conversation around principled, reliable model selection.