Abstract: We present a hybrid surf-zone model that combines numerical simulations and statistical/machine learning techniques, enabling accurate calculations of nearshore wave and hydrodynamic parameters with high computational efficiency. The approach involves defining representative forcing conditions, carrying out
numerical model (XBeach) simulations for these cases, and training machine learning models capable of
predicting selected model output variables. Data decomposition via Empirical Orthogonal Function analysis
further simplifies the process by reducing the dimensionality of the output data with minimal loss of accuracy (with
the exception of certain wetting-drying processes). Three machine learning approaches of increasing complexity are compared: a multivariate linear regression (LR), a Radial Basis Function (RBF) interpolator, and a Deep Neural Network (DNN). The LR model fails to account for the complex non-linearities in coastal wave dynamics, which warrants the use of more complex machine learning techniques. Both the RBF interpolator and the DNN predict short wave heights, mean wavelength, and depth-averaged currents with high accuracy, and long (infragravity) wave heights and the fraction of breaking waves with slightly lower accuracy. The proposed surrogate model thus offers an efficient alternative to computationally expensive numerical model simulations, enabling rapid and reliable long-period deterministic simulations (multi-decadal hindcasts) and/or multi-ensemble probabilistic scenario simulations of nearshore hydrodynamic conditions. We provide a comprehensive description of the implementation details and assess the surrogate model's performance in representing various wave and hydrodynamic parameters. We discuss potential use cases and limitations, noting that this hybrid modeling technique can be adapted for use with other numerical models in various settings.
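
As a minimal illustration of the workflow summarized above (representative forcing cases, simulated outputs, EOF decomposition, and a surrogate trained on the leading modes), the following Python sketch may be helpful. It is not the implementation used in this study: the function `toy_model` is a hypothetical stand-in for XBeach output, and the two forcing parameters, the Gaussian kernel, the shape parameter `eps`, the ridge term, and the number of retained modes are all assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the numerical model (NOT XBeach): maps two
# forcing parameters to a field sampled on 50 cross-shore grid points.
def toy_model(forcing):
    x = np.linspace(0.0, 1.0, 50)
    h, wl = forcing
    return h * np.exp(-3.0 * x) + 0.3 * np.sin(2.0 * np.pi * x * wl)

# 1. Representative forcing cases and the corresponding "simulated" outputs.
X_train = rng.uniform(0.5, 2.0, size=(80, 2))          # (cases, parameters)
Y_train = np.array([toy_model(f) for f in X_train])    # (cases, grid points)

# 2. EOF decomposition: SVD of the mean-removed output matrix.
Y_mean = Y_train.mean(axis=0)
_, s, Vt = np.linalg.svd(Y_train - Y_mean, full_matrices=False)
n_modes = 5                                  # assumed truncation level
eofs = Vt[:n_modes]                          # spatial modes, (modes, grid points)
pcs = (Y_train - Y_mean) @ eofs.T            # principal components, (cases, modes)
explained = (s[:n_modes] ** 2).sum() / (s ** 2).sum()

# 3. Gaussian RBF interpolation from forcing parameters to principal components.
def rbf_kernel(A, B, eps=4.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-eps * d2)

K = rbf_kernel(X_train, X_train)
ridge = 1e-6 * np.eye(len(X_train))          # small ridge term for numerical stability
weights = np.linalg.solve(K + ridge, pcs)    # (cases, modes)

# 4. Surrogate: predict the principal components, then rebuild the field.
def surrogate(forcing):
    k = rbf_kernel(np.atleast_2d(forcing), X_train)
    return (Y_mean + (k @ weights) @ eofs)[0]

pred = surrogate(np.array([1.2, 1.0]))
print(f"variance explained by {n_modes} modes: {explained:.4f}")
print("max abs error vs. toy model:", np.abs(pred - toy_model([1.2, 1.0])).max())
```

In this sketch the surrogate is trained on `n_modes` principal components rather than on every grid point, which is where the dimensionality reduction pays off; swapping the RBF step for a linear regression or a neural network would leave steps 1, 2, and 4 unchanged.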