
Estimating properties of a homogeneous bounded soil using machine learning models

  • *Corresponding author: Leonidas Mindrinos


The first author acknowledges support by the Sectoral Development Program (SDP 5223471) of the Ministry of Education, Religious Affairs and Sports, through the National Development Program (NDP) 2021-25, grant no 200/1029.

  • This work focuses on estimating soil properties from soil moisture measurements. We consider simulated data generated by solving, with the Fokas method, the initial-boundary value problem governing vertical infiltration in a homogeneous, bounded soil profile. The parameter identification problem is formulated as a two-output regression task, and we explore several machine learning models to solve it. Each model is assessed under three data conditions: full, noisy, and limited. Overall, the diffusivity $ D $ is predicted more accurately than the hydraulic conductivity $ K $. Among the models considered, Support Vector Machines (SVMs) and Neural Networks (NNs) are the most robust, achieving near-perfect accuracy ($ R^2 \geq 0.9999 $ for both outputs on exact data) and minimal errors.

    Mathematics Subject Classification: Primary: 35G16, 35R30, 76S05; Secondary: 86-10.

  • Figure 1.  Moisture profiles $ \theta $ in soil for different infiltration times due to flooding (left). Moisture profiles at given depth positions with respect to time (right)

    Figure 2.  PCA dimension reduction for $ D $ (left) and $ K $ (right)

    Figure 3.  UMAP dimension reduction for $ D $ (left) and $ K $ (right)

    Figure 4.  Relative errors between the actual and predicted values of $ D $ (left), as presented in Table 8, and of $ K $ (right), as shown in Table 9. The indices on the $ x $-axis represent the row numbers of the respective tables

    Figure 5.  The loss function of the neural network model for exact data

    Figure 6.  Regression plots of $ D \, \left[\mathrm{cm}^2 / \mathrm{h}\right] $ (left) and $ K \, [\mathrm{cm} / \mathrm{h}] $ (right) in the test set for exact data

    Figure 7.  Residual plots of $ D \, \left[\mathrm{cm}^2 / \mathrm{h}\right] $ (left) and $ K \, [\mathrm{cm} / \mathrm{h}] $ (right) in the test set for exact data

    Figure 8.  The five most important features identified for each model

    Figure 9.  Regression plots of $ D \, \left[\mathrm{cm}^2 / \mathrm{h}\right] $ (left) and $ K \, [\mathrm{cm} / \mathrm{h}] $ (right) in the test set for data with $ 2 \% $ noise. Methods that achieved an $ R^2 $ score greater than 0.95 for both outputs are presented

    Figure 10.  The impact of test set noise on the performance metrics for predicting $ D $

    Figure 11.  The impact of test set noise on the performance metrics for predicting $ K $

    Figure 12.  The effect of dataset size on the metrics for predicting $ D $. The results refer to the test set for exact data

    Figure 13.  The effect of dataset size on the metrics for predicting $ K $. The results refer to the test set for exact data

    Figure 14.  The effect of feature count on the metrics for predicting $ D $. The results refer to the test set for exact data

    Figure 15.  The effect of feature count on the metrics for predicting $ K $. The results refer to the test set for exact data

    Table 1.  Hyperparameters and architecture of the support vector regressor

    | Component | Hyperparameter / Setting |
    | --- | --- |
    | Kernel | Radial basis function |
    | Regularization parameter $ c $ | 1000 |
    | Kernel coefficient $ \gamma $ | 0.01 |
    | Estimator $ \epsilon $ | 0.01 |
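
As a rough illustration of how the settings in Table 1 could be instantiated, the sketch below assumes scikit-learn (the library is not named in this listing) and maps $ c $, $ \gamma $ and $ \epsilon $ to the `C`, `gamma` and `epsilon` arguments of `SVR`. Because `SVR` predicts a single target, the two outputs $ (D, K) $ are handled here by `MultiOutputRegressor`, which fits one regressor per output. That wrapper, the feature scaling step, and the synthetic arrays are assumptions made only for this example.

```python
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data: 200 samples, 10 moisture features, 2 targets (D, K).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 10))
Y = np.column_stack([X.sum(axis=1), X[:, 0] - X[:, 1]])

# Table 1 settings: RBF kernel, C = 1000, gamma = 0.01, epsilon = 0.01.
svr = SVR(kernel="rbf", C=1000, gamma=0.01, epsilon=0.01)

# One SVR per output; the scaling step is an assumption, not taken from the table.
svr_model = make_pipeline(StandardScaler(), MultiOutputRegressor(svr))
svr_model.fit(X, Y)

svr_pred = svr_model.predict(X[:5])  # shape (5, 2): one column per output
```

Fitting one regressor per output keeps the kernel settings identical for $ D $ and $ K $; separate tuning per output would be an equally plausible design.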

    Table 2.  Hyperparameters and architecture of the XGB regressor

    | Component | Hyperparameter / Setting |
    | --- | --- |
    | Loss function | Squared error |
    | Estimators | 300 |
    | Max depth | 7 |
    | Learning rate | 0.2 |
    | $ L^2 $ regularization parameter | 100 |
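
The settings in Table 2 can be written down with the argument names of XGBoost's scikit-learn style `XGBRegressor`. The mapping of "Estimators" to `n_estimators` and of the $ L^2 $ regularization parameter to `reg_lambda` is one reading of the table, not something the listing states, and the import is guarded so the sketch runs even where `xgboost` is not installed.

```python
# Table 2 settings expressed with XGBoost's scikit-learn style argument names.
xgb_params = {
    "objective": "reg:squarederror",  # squared-error loss
    "n_estimators": 300,              # number of boosted trees
    "max_depth": 7,
    "learning_rate": 0.2,
    "reg_lambda": 100,                # L2 regularization weight (assumed mapping)
}

try:
    from xgboost import XGBRegressor
    xgb_model = XGBRegressor(**xgb_params)
except ImportError:
    # xgboost is optional here; the dictionary above still records the settings.
    xgb_model = None
```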

    Table 3.  Hyperparameters and architecture of the random forest regressor

    | Component | Hyperparameter / Setting |
    | --- | --- |
    | Number of trees | 100 |
    | Maximum depth | 20 |
    | Min sample split | 2 |
    | Min sample leaf | 1 |
    | Number of features | Sqrt |
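
A minimal sketch of the Table 3 configuration, assuming scikit-learn's `RandomForestRegressor` (the library is an assumption). The forest accepts a two-column target directly, so no wrapper is needed for the pair $ (D, K) $. The data and `random_state` are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data: 150 samples, 9 features, 2 targets.
rng = np.random.default_rng(0)
X = rng.uniform(size=(150, 9))
Y = np.column_stack([X[:, 0] + X[:, 1], X[:, 2] * X[:, 3]])

# Table 3 settings; "Sqrt" means each split considers sqrt(n_features) candidates.
rf_model = RandomForestRegressor(
    n_estimators=100,
    max_depth=20,
    min_samples_split=2,
    min_samples_leaf=1,
    max_features="sqrt",
    random_state=0,  # placeholder for reproducibility
)
rf_model.fit(X, Y)

rf_pred = rf_model.predict(X[:3])     # shape (3, 2)
n_trees = len(rf_model.estimators_)   # number of fitted trees
```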

    Table 4.  Hyperparameters and architecture of the fully connected neural network model

    | Component | Hyperparameter / Setting |
    | --- | --- |
    | Activation function | LeakyReLU with $ \alpha=0.01 $ |
    | Regularization | $ L^2 $ with parameter $ 2\times10^{-4} $ |
    | Layer 1 | Dense with 128 nodes |
    | Layer 2 | Dense with 64 nodes |
    | Layer 3 | Dense with 32 nodes |
    | Output layer | Dense with 2 nodes |
    | Optimizer | Adam with learning rate 0.0001 |
    | Loss function | Mean Squared Error (MSE) |
    | Epochs | 2000 |
    | Early stopping | Patience = 50 and $ \Delta = 10^{-4} $ |
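
The early stopping row of Table 4 can be read as follows: training halts once the monitored loss has failed to improve by more than $ \Delta $ for 50 consecutive epochs. The pure-Python sketch below implements that common convention. Which loss is monitored is not stated in this listing, so the hypothetical function `early_stopping_epoch` simply takes a list of per-epoch loss values.

```python
def early_stopping_epoch(losses, patience=50, min_delta=1e-4):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float("inf")
    wait = 0  # epochs since the last sufficient improvement
    for epoch, loss in enumerate(losses, start=1):
        if loss < best - min_delta:
            # Improvement larger than min_delta: record it and reset the counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# The loss plateaus after epoch 2, so the 50th stagnant epoch is epoch 52.
plateau = [1.0, 0.5] + [0.5] * 60
stop_epoch = early_stopping_epoch(plateau)

# A loss that keeps dropping by more than min_delta never triggers the stop.
decreasing = [1.0 - 0.005 * i for i in range(100)]
never_stops = early_stopping_epoch(decreasing)
```

With the 2000-epoch cap in Table 4, this rule would end training early whenever the loss curve flattens for 50 epochs.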

    Table 5.  Hyperparameters and architecture of the $ k $-nearest neighbors regressor

    | Component | Hyperparameter / Setting |
    | --- | --- |
    | Number of nearest neighbors | 3 |
    | Weight function | distance |
    | Power parameter | 2 |
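
The entries of Table 5 match the argument names of scikit-learn's `KNeighborsRegressor`, so the sketch below assumes that library. Power parameter 2 corresponds to the Euclidean distance, and the `distance` weight function weights each of the 3 neighbors by the inverse of its distance to the query. The arrays are synthetic placeholders.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data: 100 samples, 6 features, 2 targets.
rng = np.random.default_rng(0)
X_train = rng.uniform(size=(100, 6))
Y_train = np.column_stack([X_train.sum(axis=1), X_train[:, 0]])

# Table 5 settings: 3 neighbors, inverse-distance weights, Euclidean metric (p = 2).
knn_model = KNeighborsRegressor(n_neighbors=3, weights="distance", p=2)
knn_model.fit(X_train, Y_train)

# With distance weighting, a query that coincides with a training sample
# receives that sample's targets exactly.
knn_pred = knn_model.predict(X_train[:4])
```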

    Table 6.  Metrics for predicting the diffusivity $ D $ in the test set using exact data

    | Metric | SVM | XGBoost | RF | NN | kNN |
    | --- | --- | --- | --- | --- | --- |
    | $ R^2 $ score | 1.0000 | 0.9999 | 0.9999 | 1.0000 | 0.9998 |
    | MSE | 2.9787 | 9.0156 | 14.4745 | 0.5293 | 15.0795 |
    | MAE | 1.4625 | 2.2871 | 2.7821 | 0.5640 | 2.8197 |
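
For reference, the three metrics reported in the tables are assumed here to follow their standard definitions, with $ y_i $ the actual value, $ \hat{y}_i $ the predicted value, $ \bar{y} $ the mean of the actual values, and $ n $ the number of test samples:

```latex
\begin{align*}
\mathrm{MSE} &= \frac{1}{n}\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2, \\
\mathrm{MAE} &= \frac{1}{n}\sum_{i=1}^{n} \left|y_i - \hat{y}_i\right|, \\
R^2 &= 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}.
\end{align*}
```

MSE and MAE carry the units of the target (squared for MSE), which is why their values for $ D $, of order $ 10^3 \, \mathrm{cm}^2/\mathrm{h} $, are numerically much larger than those for $ K $, of order $ 1 \, \mathrm{cm}/\mathrm{h} $. Only $ R^2 $ is directly comparable between the two outputs.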

    Table 7.  Metrics for predicting the conductivity $ K $ in the test set using exact data

    | Metric | SVM | XGBoost | RF | NN | kNN |
    | --- | --- | --- | --- | --- | --- |
    | $ R^2 $ score | 1.0000 | 0.9710 | 0.9612 | 0.9999 | 0.9687 |
    | MSE | 0.0000 | 0.0212 | 0.0284 | 0.0001 | 0.0229 |
    | MAE | 0.0042 | 0.1044 | 0.1206 | 0.0064 | 0.1114 |

    Table 8.  Randomly selected predicted values of the diffusivity $ D $ in the test set for exact data

    | Actual | SVM | XGBoost | RF | NN | kNN |
    | --- | --- | --- | --- | --- | --- |
    | 1684.59 | 1683.87 | 1684.87 | 1682.27 | 1685.58 | 1685.83 |
    | 1869.99 | 1869.60 | 1867.21 | 1870.23 | 1870.85 | 1868.19 |
    | 2117.94 | 2121.01 | 2116.82 | 2121.32 | 2118.22 | 2117.13 |
    | 1735.98 | 1737.62 | 1737.57 | 1739.88 | 1736.25 | 1744.83 |
    | 1382.93 | 1384.57 | 1384.15 | 1383.48 | 1382.67 | 1379.06 |
    | MAE | 1.49 | 1.40 | 2.08 | 0.53 | 3.31 |
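
The MAE row of Table 8 can be recomputed from the five listed samples. The short sketch below does so in plain Python and also evaluates per-sample relative errors of the kind plotted in Figure 4. The relative error is taken here as $ |y - \hat{y}| / |y| $, which is an assumption, since the listing does not give the formula used for the figure.

```python
# Values transcribed from Table 8 (diffusivity D, in cm^2/h).
actual = [1684.59, 1869.99, 2117.94, 1735.98, 1382.93]
predicted = {
    "SVM":     [1683.87, 1869.60, 2121.01, 1737.62, 1384.57],
    "XGBoost": [1684.87, 1867.21, 2116.82, 1737.57, 1384.15],
    "RF":      [1682.27, 1870.23, 2121.32, 1739.88, 1383.48],
    "NN":      [1685.58, 1870.85, 2118.22, 1736.25, 1382.67],
    "kNN":     [1685.83, 1868.19, 2117.13, 1744.83, 1379.06],
}


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_errors(y_true, y_pred):
    """Per-sample relative error |y - y_hat| / |y| (assumed definition)."""
    return [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]


# MAE per model over the five samples, rounded as in the table.
mae_row = {name: round(mean_absolute_error(actual, preds), 2)
           for name, preds in predicted.items()}

# Relative errors for the model with the largest MAE in this sample.
knn_rel = relative_errors(actual, predicted["kNN"])
```

Even for kNN, the largest relative error among these five samples stays below 1%, which is consistent with the $ R^2 $ scores for $ D $ in Table 6.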

    Table 9.  Randomly selected predicted values of the conductivity $ K $ in the test set for exact data

    | Actual | SVM | XGBoost | RF | NN | kNN |
    | --- | --- | --- | --- | --- | --- |
    | 4.323 | 4.320 | 4.211 | 4.393 | 4.323 | 4.322 |
    | 4.596 | 4.602 | 4.648 | 4.606 | 4.595 | 4.651 |
    | 5.472 | 5.475 | 5.242 | 5.332 | 5.460 | 5.435 |
    | 5.305 | 5.302 | 5.409 | 5.174 | 5.302 | 5.053 |
    | 4.092 | 4.088 | 3.919 | 4.025 | 4.089 | 4.173 |
    | MAE | 0.004 | 0.134 | 0.084 | 0.004 | 0.085 |

    Table 10.  Cross validation (CV$ = 5 $) $ R^2 $ results for predicting the diffusivity $ D $ using exact data

    |  | SVM | XGBoost | RF | NN | kNN |
    | --- | --- | --- | --- | --- | --- |
    | Mean | 1.0000 | 0.9999 | 0.9998 | 1.0000 | 0.9998 |
    | StDev. | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |

    Table 11.  Cross validation (CV$ = 5 $) $ R^2 $ results for predicting the conductivity $ K $ using exact data

    |  | SVM | XGBoost | RF | NN | kNN |
    | --- | --- | --- | --- | --- | --- |
    | Mean | 1.0000 | 0.9604 | 0.9539 | 0.9999 | 0.9623 |
    | StDev. | 0.0000 | 0.0059 | 0.0057 | 0.0001 | 0.0038 |
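
Mean and standard deviation rows such as those in Tables 10 and 11 are what a 5-fold cross validation of the $ R^2 $ score produces. A minimal sketch, assuming scikit-learn and using the kNN settings of Table 5 on synthetic data, is given below. The shuffling, the seed, and the use of the population standard deviation are assumptions.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the features and one target.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 6))
d = 1500.0 + 100.0 * X.sum(axis=1)

# Five folds; each fold serves once as the held-out part.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(
    KNeighborsRegressor(n_neighbors=3, weights="distance", p=2),
    X, d, cv=cv, scoring="r2",
)

cv_mean = scores.mean()  # corresponds to the "Mean" row
cv_std = scores.std()    # corresponds to the "StDev." row
```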

    Table 12.  Metrics for predicting the diffusivity $ D $ in the test set for exact data using the five most important features. Relative changes (%) compared to all features (see Table 6) are provided in parentheses

    Metric SVM XGBoost RF NN kNN
    $R^2 $ score 1.0000 (0%) 0.9999 (0%) 0.9997 (-0.02%) 1.0000 (0%) 0.9998 (0%)
    MSE 2.9248 (-1.8%) 7.0271 (-22.0%) 26.0184 (+79.7%) 0.7718 (+45.8%) 24.2639 (+61%)
    MAE 1.4709 (+0.6%) 2.0080 (-12.2%) 3.7125 (+33.4%) 0.7007 (+24.2%) 3.7301 (+32.3%)
     | Show Table
    DownLoad: CSV
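
The percentages in parentheses are relative changes with respect to the all-features baseline. A small sketch of that computation, using MSE values for $ D $ from Tables 6 and 12, is shown below. The function name `relative_change` is hypothetical, and the formula $ 100 \, (\text{new} - \text{baseline}) / \text{baseline} $ is the usual definition, assumed to be the one used here.

```python
import math


def relative_change(baseline, new):
    """Percentage change of `new` with respect to `baseline`."""
    if baseline == 0:
        if new == 0:
            return 0.0
        # A baseline that rounds to 0.0000 gives an unbounded change.
        return math.copysign(math.inf, new)
    return 100.0 * (new - baseline) / baseline


# MSE for D: all features (Table 6) versus the five most important (Table 12).
svm_change = relative_change(2.9787, 2.9248)  # about -1.8
nn_change = relative_change(0.5293, 0.7718)   # about +45.8
```

Two remarks follow from this. Some tabulated percentages differ in the last digit from what the rounded table entries give, presumably because they were computed from unrounded metrics. And a baseline that rounds to 0.0000, such as the SVM MSE for $ K $ in Table 7, yields an infinite relative change, which is how the $ \infty $ entries in the later tables arise.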

    Table 13.  Metrics for predicting the conductivity $ K $ in the test set for exact data using the five most important features. Relative changes (%) compared to all features (see Table 7) are provided in parentheses

    | Metric | SVM | XGBoost | RF | NN | kNN |
    | --- | --- | --- | --- | --- | --- |
    | $ R^2 $ score | 1.0000 (0%) | 0.9792 (+0.8%) | 0.9690 (+0.8%) | 0.9999 (0%) | 0.9758 (+0.7%) |
    | MSE | 0.0000 (0%) | 0.0152 (-28.3%) | 0.0227 (-20.0%) | 0.0001 (0%) | 0.0177 (-22.7%) |
    | MAE | 0.0036 (-14.3%) | 0.1044 (-14.7%) | 0.1077 (-10.7%) | 0.0067 (+4.7%) | 0.0964 (-13.4%) |

    Table 14.  Metrics for predicting the diffusivity $ D $ in the test set using PCA generated features. Relative changes (%) compared to all features (see Table 6) are provided in parentheses

    | Metric | SVM | XGBoost | RF | NN | kNN |
    | --- | --- | --- | --- | --- | --- |
    | $ R^2 $ score | 1.0000 (0%) | 0.9997 (-0.02%) | 0.9995 (-0.04%) | 1.0000 (0%) | 0.9997 (-0.01%) |
    | MSE | 4.3500 (+46.0%) | 31.1700 (+245.8%) | 47.0970 (+225.5%) | 1.2679 (+139.5%) | 29.9027 (+98.3%) |
    | MAE | 1.8180 (+24.3%) | 3.8978 (+70.4%) | 4.9244 (+77.0%) | 0.8583 (+52.2%) | 3.4076 (+20.8%) |

    Table 15.  Metrics for predicting the conductivity $ K $ in the test set using PCA generated features. Relative changes (%) compared to all features (see Table 7) are provided in parentheses

    | Metric | SVM | XGBoost | RF | NN | kNN |
    | --- | --- | --- | --- | --- | --- |
    | $ R^2 $ score | 1.0000 (0%) | 0.9430 (-2.88%) | 0.9080 (-5.54%) | 0.9997 (-0.02%) | 0.9432 (-2.63%) |
    | MSE | 0.0000 (0%) | 0.0417 (+96.7%) | 0.0672 (+136.6%) | 0.0002 (+100.0%) | 0.0415 (+81.2%) |
    | MAE | 0.0051 (+21.4%) | 0.1411 (+35.1%) | 0.1921 (+59.3%) | 0.0092 (+43.8%) | 0.1319 (+18.4%) |
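
The PCA-generated features behind Tables 14 and 15 could be produced along the following lines. The sketch assumes scikit-learn, standardizes the features first, and fits the projection on the training set only so that no test information leaks into it. The number of components (5) and the array sizes are hypothetical, since the listing does not state them.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for training and test feature matrices.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(120, 20))
X_test = rng.normal(size=(30, 20))

# Fit scaling and projection on the training data only.
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=5).fit(scaler.transform(X_train))  # 5 is hypothetical

# Apply the same fitted transforms to both sets.
Z_train = pca.transform(scaler.transform(X_train))
Z_test = pca.transform(scaler.transform(X_test))

# Fraction of the training variance retained by the kept components.
explained = pca.explained_variance_ratio_.sum()
```

The regressors would then be trained on `Z_train` and evaluated on `Z_test` in place of the original features.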

    Table 16.  Metrics for predicting the diffusivity $ D $ using UMAP features. Relative changes (%) compared to all features (see Table 6) are provided in parentheses

    | Metric | SVM | XGBoost | RF | NN | kNN |
    | --- | --- | --- | --- | --- | --- |
    | $ R^2 $ score | 0.9990 (-0.10%) | 0.9983 (-0.16%) | 0.9979 (-0.20%) | 0.9986 (-0.14%) | 0.9991 (-0.07%) |
    | MSE | 98.7650 (+3216.2%) | 168.2688 (+1766.3%) | 210.1488 (+1351.8%) | 135.3956 (+25457.2%) | 90.5098 (+500.4%) |
    | MAE | 6.9092 (+372.3%) | 8.8069 (+285.0%) | 8.8462 (+218.0%) | 8.2829 (+1368.7%) | 6.9310 (+145.8%) |

    Table 17.  Metrics for predicting the conductivity $ K $ using UMAP features. Relative changes (%) compared to all features (see Table 7) are provided in parentheses

    | Metric | SVM | XGBoost | RF | NN | kNN |
    | --- | --- | --- | --- | --- | --- |
    | $ R^2 $ score | 0.6497 (-35.0%) | 0.5853 (-39.7%) | 0.7440 (-22.6%) | 0.6742 (-32.6%) | 0.7610 (-21.5%) |
    | MSE | 0.2559 (+$ \infty $%) | 0.3029 (+1327.8%) | 0.1870 (+558.5%) | 0.2380 (+168102%) | 0.1745 (+662.0%) |
    | MAE | 0.3539 (+8328.6%) | 0.3753 (+259.4%) | 0.3169 (+162.9%) | 0.3462 (+5309.4%) | 0.3065 (+175.3%) |

    Table 18.  Performance on predicting $ D $ for noisy test data (2% noise) with relative changes (%) compared to exact data of Table 6

    | Metric | SVM | XGBoost | RF | NN | kNN |
    | --- | --- | --- | --- | --- | --- |
    | $ R^2 $ score | 0.9914 (-0.9%) | 0.9999 (0%) | 0.9986 (-0.1%) | 0.9999 (-0.01%) | 0.9998 (0%) |
    | MSE | 8.7198 (+192.7%) | 846.0999 (+9266.6%) | 139.4403 (+863.1%) | 12.1855 (+2199.0%) | 23.6522 (+56.8%) |
    | MAE | 2.2573 (+54.3%) | 17.2005 (+651.6%) | 8.3986 (+201.8%) | 2.7825 (+393.2%) | 3.6748 (+30.3%) |

    Table 19.  Performance on predicting $ K $ for noisy test data (2% noise), with relative changes (%) compared to exact data in Table 7

    | Metric | SVM | XGBoost | RF | NN | kNN |
    | --- | --- | --- | --- | --- | --- |
    | $ R^2 $ score | 0.9864 (-1.4%) | 0.5541 (-43.0%) | 0.8008 (-16.7%) | 0.9756 (-2.4%) | 0.9546 (-1.4%) |
    | MSE | 0.0099 (+$ \infty $%) | 0.3257 (+1436.3%) | 0.1455 (+412.0%) | 0.0178 (+17700.0%) | 0.0331 (+44.5%) |
    | MAE | 0.0757 (+1702.4%) | 0.4001 (+2832.2%) | 0.2758 (+1286.4%) | 0.1093 (+1607.8%) | 0.1431 (+128.5%) |
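
To make the "2% noise" setting of Tables 18 and 19 concrete, the sketch below perturbs a test feature matrix with multiplicative Gaussian noise of relative level 0.02. The noise model is an assumption (the listing gives only the noise level), and the function name `add_relative_noise` and the sample array are hypothetical.

```python
import numpy as np


def add_relative_noise(X, level=0.02, seed=0):
    """Return X perturbed entrywise by Gaussian noise of the given relative level."""
    rng = np.random.default_rng(seed)
    return X * (1.0 + level * rng.standard_normal(X.shape))


# Stand-in for a small block of moisture measurements.
X_test = np.linspace(0.1, 0.4, 12).reshape(3, 4)
X_noisy = add_relative_noise(X_test, level=0.02)

# Entrywise relative deviation introduced by the noise.
rel_dev = np.abs(X_noisy - X_test) / np.abs(X_test)
```

In a setup like this only the test features are perturbed, so the tables measure how models trained on exact data react to noisy inputs.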