This work focuses on estimating soil properties from soil moisture measurements. We consider simulated data generated by using the Fokas method to solve the initial-boundary value problem governing vertical infiltration in a homogeneous, bounded soil profile. The parameter identification problem is formulated as a two-output regression task, for which we explore various machine learning models. The performance of each model is assessed under three data conditions: full, noisy, and limited. Overall, the prediction of the diffusivity $ D $ tends to be more accurate than that of the hydraulic conductivity $ K $. Among the models considered, Support Vector Machines (SVMs) and Neural Networks (NNs) demonstrate the highest robustness, achieving near-perfect accuracy and minimal errors.
Table 1. Hyperparameters and architecture of the support vector regressor
| Component | Hyperparameter / Setting |
| Kernel | Radial basis function |
| Regularization parameter $ c $ | 1000 |
| Kernel coefficient $ \gamma $ | 0.01 |
| Estimator $ \epsilon $ | 0.01 |
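The settings in Table 1 map directly onto a standard support vector regressor. The following is a minimal sketch, assuming scikit-learn (the library is not named here) and assuming one regressor per target, since scikit-learn's `SVR` predicts a single output. The data are hypothetical stand-ins, not the simulated moisture profiles.

```python
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

# Settings from Table 1. SVR is single-output, so the wrapper fits one copy
# per target (an assumption about how the two outputs are handled).
svr = SVR(kernel="rbf", C=1000, gamma=0.01, epsilon=0.01)
svr_model = MultiOutputRegressor(svr)

# Hypothetical stand-in data: 50 samples, 8 features, 2 targets (D and K).
rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 8))
Y = np.column_stack([X.sum(axis=1), X[:, 0] - X[:, 1]])

svr_model.fit(X, Y)
svr_pred = svr_model.predict(X[:3])  # one column per target
```

Wrapping the regressor keeps the two targets independent, so each fitted copy sees only its own output column.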
Table 2. Hyperparameters and architecture of the XGB regressor
| Component | Hyperparameter / Setting |
| Loss function | Squared error |
| Estimators | 300 |
| Max depth | 7 |
| Learning rate | 0.2 |
| $ L^2 $ regularization parameter | 100 |
Table 3. Hyperparameters and architecture of the random forest regressor
| Component | Hyperparameter / Setting |
| Number of trees | 100 |
| Maximum depth | 20 |
| Min sample split | 2 |
| Min sample leaf | 1 |
| Number of features | Sqrt |
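As an illustration of Table 3, the sketch below instantiates a random forest with the listed settings, assuming scikit-learn. The `random_state` value and the toy data are assumptions added for reproducibility and are not taken from the table.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Settings from Table 3; random_state is an added assumption.
rf_model = RandomForestRegressor(
    n_estimators=100,
    max_depth=20,
    min_samples_split=2,
    min_samples_leaf=1,
    max_features="sqrt",
    random_state=0,
)

# Hypothetical stand-in data: 60 samples, 9 features, 2 targets (D and K).
rng = np.random.default_rng(1)
X = rng.uniform(size=(60, 9))
Y = np.column_stack([X[:, 0] + X[:, 1], X[:, 2] * X[:, 3]])

rf_model.fit(X, Y)            # random forests handle two outputs natively
rf_pred = rf_model.predict(X[:4])
```

With `max_features="sqrt"`, each split considers only about the square root of the feature count (3 of the 9 hypothetical features here).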
Table 4. Hyperparameters and architecture of the fully connected neural network model
| Component | Hyperparameter / Setting |
| Activation function | LeakyReLU with $ \alpha=0.01 $ |
| Regularization | $ L^2 $ with parameter $ 2\times10^{-4} $ |
| Layer 1 | Dense with 128 nodes |
| Layer 2 | Dense with 64 nodes |
| Layer 3 | Dense with 32 nodes |
| Output layer | Dense with 2 nodes |
| Optimizer | Adam with learning rate 0.0001 |
| Loss function | Mean Squared Error (MSE) |
| Epochs | 2000 |
| Early stopping | Patience = 50 and $ \Delta = 10^{-4} $ |
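The early stopping rule in Table 4 can be sketched in plain Python. This assumes the common convention that an epoch counts as an improvement only if the monitored loss drops by more than $ \Delta $ below the best value so far, and that training halts after `patience` consecutive epochs without improvement. Which loss is monitored is not stated in the table, so `losses` below is a hypothetical sequence.

```python
def early_stopping_epoch(losses, patience=50, min_delta=1e-4):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # consecutive epochs without sufficient improvement
    for epoch, loss in enumerate(losses, start=1):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None  # rule never triggered; training runs the full epoch budget


# A loss that never improves after the first epoch stops at epoch 51.
flat = [1.0] * 100
# A loss that keeps dropping by 1e-3 per epoch never triggers the rule.
decreasing = [1.0 - 0.001 * i for i in range(100)]

stop_flat = early_stopping_epoch(flat)
stop_decreasing = early_stopping_epoch(decreasing)
```

Under this convention, the 2000 epochs listed in Table 4 act as an upper bound that is reached only if the rule never fires.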
Table 5. Hyperparameters and architecture of the kNN regressor
| Component | Hyperparameter / Setting |
| Number of nearest neighbors | 3 |
| Weight function | distance |
| Power parameter | 2 |
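A minimal sketch of the configuration in Table 5, assuming scikit-learn: three neighbors, inverse-distance weighting, and power parameter 2 (Euclidean distance). The one-dimensional data are hypothetical and chosen so the weighting is easy to follow.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Settings from Table 5.
knn_model = KNeighborsRegressor(n_neighbors=3, weights="distance", p=2)

# Hypothetical toy data: one feature, two targets.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
Y = np.array([[0.0, 0.0], [1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
knn_model.fit(X, Y)

# A query that coincides with a training point returns that point's targets.
knn_exact = knn_model.predict([[1.0]])

# Otherwise the three nearest targets are averaged with weights 1/distance.
knn_interp = knn_model.predict([[1.2]])
```

With distance weighting, the prediction at a training point reproduces its target exactly, so training-set errors are not informative for this model.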
Table 6. Metrics for predicting the diffusivity $ D $ in the test set using exact data.
| Metric | SVM | XGBoost | RF | NN | kNN |
| $ R^2$ score | 1.0000 | 0.9999 | 0.9999 | 1.0000 | 0.9998 |
| MSE | 2.9787 | 9.0156 | 14.4745 | 0.5293 | 15.0795 |
| MAE | 1.4625 | 2.2871 | 2.7821 | 0.5640 | 2.8197 |
Table 7. Metrics for predicting the conductivity $ K $ in the test set using exact data.
| Metric | SVM | XGBoost | RF | NN | kNN |
| $ R^2$ score | 1.0000 | 0.9710 | 0.9612 | 0.9999 | 0.9687 |
| MSE | 0.0000 | 0.0212 | 0.0284 | 0.0001 | 0.0229 |
| MAE | 0.0042 | 0.1044 | 0.1206 | 0.0064 | 0.1114 |
Table 8.
Randomly selected predicted values of the diffusivity
| Actual | SVM | XGBoost | RF | NN | kNN |
| 1684.59 | 1683.87 | 1684.87 | 1682.27 | 1685.58 | 1685.83 |
| 1869.99 | 1869.60 | 1867.21 | 1870.23 | 1870.85 | 1868.19 |
| 2117.94 | 2121.01 | 2116.82 | 2121.32 | 2118.22 | 2117.13 |
| 1735.98 | 1737.62 | 1737.57 | 1739.88 | 1736.25 | 1744.83 |
| 1382.93 | 1384.57 | 1384.15 | 1383.48 | 1382.67 | 1379.06 |
| MAE | 1.49 | 1.40 | 2.08 | 0.53 | 3.31 |
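The MAE row of Table 8 can be reproduced from the five listed samples. The sketch below takes the MAE to be the mean of the absolute differences between actual and predicted values, computed per model over these five rows only.

```python
# Values copied from Table 8.
actual = [1684.59, 1869.99, 2117.94, 1735.98, 1382.93]
predicted = {
    "SVM": [1683.87, 1869.60, 2121.01, 1737.62, 1384.57],
    "XGBoost": [1684.87, 1867.21, 2116.82, 1737.57, 1384.15],
    "RF": [1682.27, 1870.23, 2121.32, 1739.88, 1383.48],
    "NN": [1685.58, 1870.85, 2118.22, 1736.25, 1382.67],
    "kNN": [1685.83, 1868.19, 2117.13, 1744.83, 1379.06],
}


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between paired values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


mae = {
    name: round(mean_absolute_error(actual, values), 2)
    for name, values in predicted.items()
}
print(mae)
```

Rounded to two decimals, the results agree with the MAE row of the table (1.49, 1.40, 2.08, 0.53, 3.31).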
Table 9.
Randomly selected predicted values of the conductivity
| Actual | SVM | XGBoost | RF | NN | kNN |
| 4.323 | 4.320 | 4.211 | 4.393 | 4.323 | 4.322 |
| 4.596 | 4.602 | 4.648 | 4.606 | 4.595 | 4.651 |
| 5.472 | 5.475 | 5.242 | 5.332 | 5.460 | 5.435 |
| 5.305 | 5.302 | 5.409 | 5.174 | 5.302 | 5.053 |
| 4.092 | 4.088 | 3.919 | 4.025 | 4.089 | 4.173 |
| MAE | 0.004 | 0.134 | 0.084 | 0.004 | 0.085 |
Table 10.
Cross validation (CV
| Statistic | SVM | XGBoost | RF | NN | kNN |
| Mean | 1.0000 | 0.9999 | 0.9998 | 1.0000 | 0.9998 |
| StDev. | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
Table 11.
Cross validation (CV
| Statistic | SVM | XGBoost | RF | NN | kNN |
| Mean | 1.0000 | 0.9604 | 0.9539 | 0.9999 | 0.9623 |
| StDev. | 0.0000 | 0.0059 | 0.0057 | 0.0001 | 0.0038 |
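Mean and standard deviation rows like those in the cross-validation tables are typically obtained by scoring a model on several held-out folds. The sketch below shows one way to do this, assuming scikit-learn, an $ R^2 $ score, and five shuffled folds. The fold count, the scoring choice, and the synthetic data are all assumptions and are not stated in the table captions.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical stand-in data: 200 samples, 5 features, one target.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 3.0, 0.0])

# Five shuffled folds (assumed); each fold is held out once for scoring.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
model = KNeighborsRegressor(n_neighbors=3, weights="distance", p=2)
cv_scores = cross_val_score(model, X, y, cv=cv, scoring="r2")

cv_mean = cv_scores.mean()
cv_std = cv_scores.std()  # population standard deviation (ddof=0)
print(f"mean={cv_mean:.4f}, std={cv_std:.4f}")
```

A small standard deviation across folds indicates that the score does not depend strongly on which samples are held out.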
Table 12.
Metrics for predicting the diffusivity
| Metric | SVM | XGBoost | RF | NN | kNN |
| $R^2 $ score | 1.0000 (0%) | 0.9999 (0%) | 0.9997 (-0.02%) | 1.0000 (0%) | 0.9998 (0%) |
| MSE | 2.9248 (-1.8%) | 7.0271 (-22.0%) | 26.0184 (+79.7%) | 0.7718 (+45.8%) | 24.2639 (+61%) |
| MAE | 1.4709 (+0.6%) | 2.0080 (-12.2%) | 3.7125 (+33.4%) | 0.7007 (+24.2%) | 3.7301 (+32.3%) |
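The percentages in parentheses are consistent with relative changes measured against the exact-data metrics of Table 6. Under that assumption, they can be recomputed as follows; the helper name is illustrative.

```python
import math


def relative_change_percent(baseline, value):
    """Percentage change of `value` with respect to `baseline`."""
    if baseline == 0:
        # A baseline that rounds to 0.0000 makes the ratio undefined;
        # an increase is then reported as infinite.
        return math.inf if value > 0 else 0.0
    return 100.0 * (value - baseline) / baseline


# SVM, MSE: 2.9787 (Table 6) -> 2.9248 (Table 12)
svm_mse_change = round(relative_change_percent(2.9787, 2.9248), 1)
# NN, MSE: 0.5293 (Table 6) -> 0.7718 (Table 12)
nn_mse_change = round(relative_change_percent(0.5293, 0.7718), 1)

print(svm_mse_change, nn_mse_change)
```

Because the tabulated baselines are rounded to four decimals, percentages recomputed from them can differ slightly from the printed ones when the baseline is very small.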
Table 13.
Metrics for predicting the conductivity
| Metric | SVM | XGBoost | RF | NN | kNN |
| $R^2 $ score | 1.0000 (0%) | 0.9792 (+0.8%) | 0.9690 (+0.8%) | 0.9999 (0%) | 0.9758 (+0.7%) |
| MSE | 0.0000 (0%) | 0.0152 (-28.3%) | 0.0227 (-20.0%) | 0.0001 (0%) | 0.0177 (-22.7%) |
| MAE | 0.0036 (-14.3%) | 0.1044 (-14.7%) | 0.1077 (-10.7%) | 0.0067 (+4.7%) | 0.0964 (-13.4%) |
Table 14.
Metrics for predicting the diffusivity
| Metric | SVM | XGBoost | RF | NN | kNN |
| $ R^2$ score | 1.0000 (0%) | 0.9997 (-0.02%) | 0.9995 (-0.04%) | 1.0000 (0%) | 0.9997 (-0.01%) |
| MSE | 4.3500 (+46.0%) | 31.1700 (+245.8%) | 47.0970 (+225.5%) | 1.2679 (+139.5%) | 29.9027 (+98.3%) |
| MAE | 1.8180 (+24.3%) | 3.8978 (+70.4%) | 4.9244 (+77.0%) | 0.8583 (+52.2%) | 3.4076 (+20.8%) |
Table 15.
Metrics for predicting the conductivity
| Metric | SVM | XGBoost | RF | NN | kNN |
| $ R^2$ score | 1.0000 (0%) | 0.9430 (-2.88%) | 0.9080 (-5.54%) | 0.9997 (-0.02%) | 0.9432 (-2.63%) |
| MSE | 0.0000 (0%) | 0.0417 (+96.7%) | 0.0672 (+136.6%) | 0.0002 (+100.0%) | 0.0415 (+81.2%) |
| MAE | 0.0051 (+21.4%) | 0.1411 (+35.1%) | 0.1921 (+59.3%) | 0.0092 (+43.8%) | 0.1319 (+18.4%) |
Table 16.
Metrics for predicting the diffusivity
| Metric | SVM | XGBoost | RF | NN | kNN |
| $ R^2$ score | 0.9990 (-0.10%) | 0.9983 (-0.16%) | 0.9979 (-0.20%) | 0.9986 (-0.14%) | 0.9991 (-0.07%) |
| MSE | 98.7650 (+3216.2%) | 168.2688 (+1766.3%) | 210.1488 (+1351.8%) | 135.3956 (+25457.2%) | 90.5098 (+500.4%) |
| MAE | 6.9092 (+372.3%) | 8.8069 (+285.0%) | 8.8462 (+218.0%) | 8.2829 (+1368.7%) | 6.9310 (+145.8%) |
Table 17.
Metrics for predicting the conductivity
| Metric | SVM | XGBoost | RF | NN | kNN |
| $ R^2$ score | 0.6497 (-35.0%) | 0.5853 (-39.7%) | 0.7440 (-22.6%) | 0.6742 (-32.6%) | 0.7610 (-21.5%) |
| MSE | 0.2559 (+∞%) | 0.3029 (+1327.8%) | 0.1870 (+558.5%) | 0.2380 (+168102%) | 0.1745 (+662.0%) |
| MAE | 0.3539 (+8328.6%) | 0.3753 (+259.4%) | 0.3169 (+162.9%) | 0.3462 (+5309.4%) | 0.3065 (+175.3%) |
Table 18.
Performance on predicting
| Metric | SVM | XGBoost | RF | NN | kNN |
| $ R^2 $ score | 0.9914 (-0.9%) | 0.9999 (0%) | 0.9986 (-0.1%) | 0.9999 (-0.01%) | 0.9998 (0%) |
| MSE | 8.7198 (+192.7%) | 846.0999 (+9266.6%) | 139.4403 (+863.1%) | 12.1855 (+2199.0%) | 23.6522 (+56.8%) |
| MAE | 2.2573 (+54.3%) | 17.2005 (+651.6%) | 8.3986 (+201.8%) | 2.7825 (+393.2%) | 3.6748 (+30.3%) |
Table 19.
Performance on predicting
| Metric | SVM | XGBoost | RF | NN | kNN |
| $ R^2 $ score | 0.9864 (-1.4%) | 0.5541 (-43.0%) | 0.8008 (-16.7%) | 0.9756 (-2.4%) | 0.9546 (-1.4%) |
| MSE | 0.0099 (+∞%) | 0.3257 (+1436.3%) | 0.1455 (+412.0%) | 0.0178 (+17700.0%) | 0.0331 (+44.5%) |
| MAE | 0.0757 (+1702.4%) | 0.4001 (+283.2%) | 0.2758 (+128.6%) | 0.1093 (+1607.8%) | 0.1431 (+28.5%) |
Moisture profiles
PCA dimension reduction for
UMAP dimension reduction for
Relative errors between the actual and predicted values of
The loss function of the neural network model for exact data
Regression plots of
Residual plots of
The five most important features identified for each model
Regression plots of
The impact of test set noise on the performance metrics for predicting
The impact of test set noise on the performance metrics for predicting
The effect of dataset size on the metrics for predicting
The effect of dataset size on the metrics for predicting
The effect of feature count on the metrics for predicting
The effect of feature count on the metrics for predicting