and MSE of 1.78 and 6.75 MPa, which are very close to the
ANN performance using 700 features. Converting to RNT,
these errors come out to 0.75 and 1.20 °C, respectively. On the
NCA side, the same GPR performs slightly better, by about 0.2 in MAE
and 1 in MSE, which gives NCA a slight edge over mRMR.
However, given the small errors involved, the two methods could
be used interchangeably, which validates the use of feature
extraction in the methods previously proposed by the authors.
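The conversion from stress error to neutral-temperature error mentioned above can be illustrated with a short sketch. It assumes the standard thermal-stress relation for a constrained rail, σ = Eα(T_N − T), so that a stress error maps to a temperature error through the factor Eα. The function name and the values of E and α below are hypothetical; this section does not state the constants the authors used. Under these assumed values, the MAE scales by 1/(Eα) and the MSE by 1/(Eα)², which lands close to the 0.75 and 1.20 figures quoted above.

```python
# Illustrative constants only: this section of the paper does not state them.
E_MPA = 210_000.0      # Young's modulus of rail steel in MPa (assumed)
ALPHA_PER_C = 11.3e-6  # thermal expansion coefficient in 1/°C (assumed)


def stress_to_temperature_errors(mae_mpa, mse_mpa2, e_mpa=E_MPA, alpha=ALPHA_PER_C):
    """Convert stress-error metrics into temperature-error metrics.

    Assumes sigma = E * alpha * (T_N - T), so a stress error d_sigma
    corresponds to a temperature error d_sigma / (E * alpha).
    """
    k = e_mpa * alpha           # MPa of stress per °C of temperature change
    mae_c = mae_mpa / k         # an absolute error scales linearly
    mse_c2 = mse_mpa2 / k ** 2  # a squared error scales with the square
    return mae_c, mse_c2


mae_c, mse_c2 = stress_to_temperature_errors(1.78, 6.75)
print(f"MAE = {mae_c:.2f} °C, MSE = {mse_c2:.2f} °C²")
# prints: MAE = 0.75 °C, MSE = 1.20 °C²
```

With different material constants the converted errors change proportionally, so the numbers printed here should be read as a consistency check rather than a reproduction of the authors' calculation.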
Conclusions
This paper presented the latest results on a vibration-based
NDT method, currently under development at the
University of Pittsburgh, to determine longitudinal stress and
neutral temperature in continuous welded rail. Specifically, the
study aimed to identify the most significant frequency
bandwidths that could be used to train an ML algorithm
to meet the goals of the testing approach.
In addition, the predictions from seven different ML algorithms
were systematically cross-compared.
The results demonstrated that using only a few selected bandwidths
within the frequency range of 0–700 Hz does not
compromise the accuracy of the proposed methodology.
In addition, Gaussian process regression and ANNs
outperformed the other algorithms. The analyses conducted here were
based on a relatively small set of experimental data collected
during four days of testing. Once fully proven, the method will
require only a few measurements, which can be taken at any time of the
day regardless of the steel temperature.
As the current approach relies on knowing the neutral
temperature associated with the empirical data used for the
TABLE 4
Machine learning algorithm sweep results for stress (MPa) using the top 30 NCA features

| Model | Type | Mean absolute error | Mean-squared error | R² |
|---|---|---|---|---|
| Linear | Linear | 3.607 | 17.76 | 0.940 |
| Linear | Interactions | 3.197 | 14.73 | 0.950 |
| Linear | Robust | 3.535 | 17.83 | 0.940 |
| Tree | Fine | 3.201 | 14.73 | 0.950 |
| Tree | Medium | 2.358 | 14.84 | 0.950 |
| Tree | Coarse | 2.390 | 13.95 | 0.953 |
| SVM | Linear | 2.386 | 12.55 | 0.958 |
| SVM | Quadratic | 3.630 | 18.01 | 0.939 |
| SVM | Cubic | 2.722 | 11.50 | 0.961 |
| SVM | Fine Gaussian | 2.581 | 10.19 | 0.966 |
| SVM | Medium Gaussian | 2.017 | 7.792 | 0.974 |
| SVM | Coarse Gaussian | 2.272 | 8.365 | 0.972 |
| Ensemble | Boosted trees | 2.662 | 10.80 | 0.963 |
| Ensemble | Bagged trees | 2.753 | 11.52 | 0.961 |
| GPR | Rational quadratic | 2.205 | 11.58 | 0.961 |
| GPR | Squared exponential | 2.213 | 8.275 | 0.972 |
| GPR | Matern 5/2 | 1.857 | 6.730 | 0.977 |
| GPR | Exponential | 1.527 | 5.678 | 0.981 |
| ANN | Narrow | 1.657 | 5.981 | 0.980 |
| ANN | Medium | 2.397 | 9.302 | 0.969 |
| ANN | Wide | 2.211 | 8.385 | 0.972 |
| ANN | Bilayered | 1.922 | 7.211 | 0.976 |
| ANN | Trilayered | 2.182 | 8.265 | 0.972 |
| Kernel | SVM kernel | 2.078 | 8.133 | 0.972 |
| Kernel | Least-squares kernel | 3.599 | 21.69 | 0.927 |

Note: The best result (GPR, Exponential) was shown in red in the original.
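For readers who want to reproduce the three columns of Table 4 for their own predictions, the following is a minimal sketch of how MAE, MSE, and R² are conventionally computed. The function name and the sample values are hypothetical and are not taken from the study's data; the toolchain the authors used to obtain the table is not stated in this section.

```python
def regression_metrics(y_true, y_pred):
    """Return (MAE, MSE, R^2) for paired measured and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n   # mean absolute error
    ss_res = sum(r * r for r in residuals)     # residual sum of squares
    mse = ss_res / n                           # mean-squared error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are equal")
    r2 = 1.0 - ss_res / ss_tot                 # coefficient of determination
    return mae, mse, r2


# Made-up stress values in MPa, for illustration only.
measured = [-20.0, -10.0, 0.0, 10.0, 20.0]
predicted = [-18.0, -11.0, 1.0, 9.0, 22.0]
mae, mse, r2 = regression_metrics(measured, predicted)
print(f"MAE = {mae:.3f} MPa, MSE = {mse:.3f} MPa², R² = {r2:.3f}")
# prints: MAE = 1.400 MPa, MSE = 2.200 MPa², R² = 0.989
```

Because MSE squares each residual, it penalizes a few large errors more heavily than MAE does, which is why two models with similar MAE in the table can still differ noticeably in MSE.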
JANUARY 2024 • MATERIALS EVALUATION 77