Machine learning using Stata via H2O

November 2025

Eduardo García Echeverri

Senior Econometrician

Which customers will default on their loan? How will stock prices change in the coming weeks? What key factors drive surgical success?

High-performance machine learning methods can answer questions like these and many more, even when the data-generating process is complex. With the new h2oml suite in Stata, you can fit two of the most widely used machine learning models via H2O: gradient boosting machines (GBM) and random forests (RF).

In this webinar, we will see how to fit these models to predict continuous, binary, and multinomial outcomes; how to evaluate model performance; and how to tune model hyperparameters. We will also explore the tools available in Stata for interpreting and explaining the predictions of machine learning models, including variable importance, global surrogate models, partial dependence plots, individual conditional expectation (ICE) curves, and Shapley values.