(A only or B only), the y = x² + x model performs better than the y = x model.
Since the data were simulated from a quadratic function (of the form y = x² + x + m + mx + mx²), a model that includes a cubic term should result in overfitting. To test this, a higher-order model was also fit: y = x³ + x² + x + m + xm + x²m + x³m. In the simulation, the estimated coefficients for the x³ and x³m terms were not significantly different from zero, indicating that these terms do not improve the model and should not be included.
If this model were written explicitly, it would be ŷ = 0 × x³ + 5.5155 + 29.5736x + 2.2768m + 28.8131mx + 6.9602x² + 7.4635mx² + 0 × x³m, which is identical to the last model in Table 1. Although the estimated model is the same, adding the cubic terms leads to a poorer fit, as indicated by the larger (and therefore worse) AIC and BIC values in Table 2 when comparing the overfit model (labeled x³ + … + mx³) to the best model (labeled y = x² + x + m + mx + mx²). Overfitting is also evident in the difference between the approximate R² of 0.9593 and the adjusted R² (which penalizes unnecessary variables), R²ₐ = 0.9562, which is smaller. Finally, the critical values a₉₀ and a₉₀/₉₅ become negative, so this overfit model does not provide usable POD estimates.
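As an illustration of this kind of comparison, the sketch below fits the quadratic model with interactions and the cubic model to a small simulated dataset and reports AIC, BIC, adjusted R², and the likelihood ratio test. This is a minimal sketch, not the authors' code: the data-generating coefficients (loosely rounded from the estimates quoted above), the noise level, the sample size, and the column names are assumptions for illustration.

```python
# Minimal sketch: compare the quadratic-with-interactions and cubic models.
# Coefficients, noise level, and column names (x = size, m = material, y = response)
# are illustrative assumptions, not the paper's simulation settings.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 60
x = rng.uniform(0.05, 0.6, n)
m = rng.integers(0, 2, n)  # 0 = Material A, 1 = Material B (assumed coding)
y = (5.5 + 29.6 * x + 7.0 * x**2 + 2.3 * m + 28.8 * m * x
     + 7.5 * m * x**2 + rng.normal(0, 0.5, n))
df = pd.DataFrame({"x": x, "m": m, "y": y})

# Quadratic model with interactions, and the cubic model that adds x^3 and m*x^3.
quad = smf.ols("y ~ x + I(x**2) + m + m:x + m:I(x**2)", data=df).fit()
cubic = smf.ols("y ~ x + I(x**2) + I(x**3) + m + m:x + m:I(x**2) + m:I(x**3)",
                data=df).fit()

# Smaller AIC/BIC is better; the adjusted R^2 penalizes unnecessary terms.
print("AIC:", quad.aic, cubic.aic)
print("BIC:", quad.bic, cubic.bic)
print("R^2 vs adjusted R^2 (quadratic):", quad.rsquared, quad.rsquared_adj)

# Likelihood ratio test of the nested pair: a large p-value indicates the
# cubic terms do not significantly improve the fit.
lr_stat, p_value, df_diff = cubic.compare_lr_test(quad)
print("LRT:", lr_stat, p_value, df_diff)
```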
Tables 4 and 5 present the normality (via Shapiro–Wilk) and constant variance (via Durbin–Watson) tests [17]. Normality is met in all cases, but when material is not accounted for in the model, the constant variance assumption is violated (P-values < 0.05). These tables also list the μ̂_pod and σ̂_pod values. Note that the values are reported on different scales and cannot be compared directly. While Equation 6 describes how to calculate these values for the simple linear model, they are calculated differently as the models grow in complexity.
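The diagnostics in Tables 4 and 5 can be reproduced with standard routines; a minimal sketch follows, applying the Shapiro–Wilk test and computing the Durbin–Watson statistic on the residuals of an illustrative fit (the data and model here are assumptions, not the paper's).

```python
# Minimal sketch of the residual diagnostics reported in Tables 4 and 5.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(2)
x = rng.uniform(0.05, 0.6, 60)
y = 5.5 + 29.6 * x + 7.0 * x**2 + rng.normal(0, 0.5, 60)
fit = smf.ols("y ~ x + I(x**2)", data=pd.DataFrame({"x": x, "y": y})).fit()

sw_stat, sw_p = stats.shapiro(fit.resid)  # normality not rejected if p > 0.05
dw_stat = durbin_watson(fit.resid)        # Durbin-Watson statistic (values near 2 are desirable)
print(sw_stat, sw_p, dw_stat)
```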
Figure 3 shows the resulting POD curves for all considered models (from Table 2), and Table 6 lists the a₅₀, a₉₀, and a₉₀/₉₅ values for each model. In this simple simulation, several of the more complex models provided smaller critical values, suggesting that a poor-fitting model may underestimate the capability of the NDE system. The common practice of using μ̂_pod = a₅₀ does not apply when a change of variables is used, as with the quadratic models.
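As a hedged illustration of why the μ̂_pod = a₅₀ shortcut does not carry over, the sketch below solves POD(a) = p numerically from a fitted â-versus-a model under an assumed decision threshold y_dec. This is a generic construction, not necessarily the paper's Equation 6, and the a₉₀/₉₅ value, which requires a lower confidence bound on the POD curve, is omitted.

```python
# Minimal sketch: read a50 and a90 off a fitted model by solving POD(a) = p.
# The decision threshold, the use of the residual standard deviation as the
# noise scale, and the data are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.optimize import brentq
from scipy.stats import norm

rng = np.random.default_rng(3)
x = rng.uniform(0.05, 0.6, 60)
y = 5.5 + 29.6 * x + 7.0 * x**2 + rng.normal(0, 0.5, 60)
fit = smf.ols("y ~ x + I(x**2)", data=pd.DataFrame({"x": x, "y": y})).fit()

y_dec = 12.0              # assumed decision threshold on the response scale
tau = np.sqrt(fit.scale)  # residual standard deviation of the fit

def pod(a):
    """P(response exceeds the decision threshold at flaw size a)."""
    yhat = fit.predict(pd.DataFrame({"x": [a]})).iloc[0]
    return norm.cdf((yhat - y_dec) / tau)

a50 = brentq(lambda a: pod(a) - 0.50, 0.0, 1.0)
a90 = brentq(lambda a: pod(a) - 0.90, 0.0, 1.0)
print(a50, a90)
```

For a straight-line fit this root coincides with (y_dec − β₀)/β₁, which is why μ̂_pod = a₅₀ holds there but not for the quadratic fits.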
A better linear fit to the data will provide a more accurate estimate of the POD. In Simulation 1, a general trend is seen in which the a₉₀ and a₉₀/₉₅ values decreased as the quality of the model increased. This is not surprising: better-fitting models have smaller standard errors, and these critical values are sensitive to the standard error.
The next section describes Simulation 2, in which
Simulation 1 is repeated 10 000 times. This Monte Carlo study
is intended to provide estimates of how the results could
change with the quality of the model, given a large population
of datasets.
4.3. Simulation 2: Monte Carlo Study
To estimate how much the results change with the quality of the model, a Monte Carlo simulation was run using 10 000 runs of Equation 42, refitting all models after each run. Across the 10 000 simulations, the AIC and BIC values were consistently smaller (and therefore better) for the quadratic models. Consistent with the behavior observed in the single simulation (see Table 3), the LRT results showed that the more complex models outperformed the simpler models in all but three of the 10 000 simulations (0.03%).
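The structure of such a Monte Carlo study might look like the sketch below: each run regenerates the data, refits a simple and a full model, and records the information criteria and the LRT p-value. The data-generating settings and the reduced run count are illustrative assumptions.

```python
# Minimal sketch of the Monte Carlo design (not the authors' code).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n_runs, n = 1000, 60  # the paper uses 10 000 runs
records = []
for _ in range(n_runs):
    x = rng.uniform(0.05, 0.6, n)
    m = rng.integers(0, 2, n)
    y = (5.5 + 29.6 * x + 7.0 * x**2 + 2.3 * m + 28.8 * m * x
         + 7.5 * m * x**2 + rng.normal(0, 0.5, n))
    df = pd.DataFrame({"x": x, "m": m, "y": y})
    simple = smf.ols("y ~ x", data=df).fit()
    full = smf.ols("y ~ x + I(x**2) + m + m:x + m:I(x**2)", data=df).fit()
    lr_stat, lr_p, _ = full.compare_lr_test(simple)
    records.append({"aic_simple": simple.aic, "aic_full": full.aic, "lrt_p": lr_p})

results = pd.DataFrame(records)
print("runs where the full model has smaller AIC:",
      (results.aic_full < results.aic_simple).mean())
print("runs with a significant LRT (p < 0.05):",
      (results.lrt_p < 0.05).mean())
```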
The best overall model was y = x² + x + m + mx + mx². The y = x² + x model outperformed y = x, and y = x + m + xm also outperformed y = x. Although the LRT cannot be used to compare the y = x² + x and y = x + m + xm models directly (since they are not nested), a direct comparison of log-likelihoods, AIC, and BIC revealed that y = x + m + xm was always a better fit to the data than y = x² + x.
According to the LRT results, the y = x + m + xm model is always different from the y = x³ + x² + x + m + xm + x²m + x³m model. However, in 93.6% of the simulations, the best model (quadratic with interactions, y = x² + x + m + mx + mx²) showed no significant improvement (P-value > 0.05) when the cubic terms were added (i.e., y = x³ + x² + x + m + xm + x²m + x³m).
Contrary to expectation, the median (via Hodges–Lehmann) critical values of a₅₀, a₉₀, and a₉₀/₉₅ from the quadratic model (y = x² + x + m + mx + mx²) were more similar to those from the y = x + m + xm model than to those from the y = x + material model. Figures 4, 5, and 6 show box plots and density plots for each of the critical values, and Table 7 provides the Hodges–Lehmann estimates of the median critical values with 95% confidence intervals for the 10 000 simulations.
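For reference, the Hodges–Lehmann point estimate is the median of the pairwise (Walsh) averages; a minimal sketch follows. The interval shown is a simple bootstrap stand-in, since the paper's 95% confidence intervals may be computed differently (e.g., from the Wilcoxon signed-rank distribution).

```python
# Minimal sketch of a Hodges-Lehmann estimate for a vector of simulated
# critical values. The a50 sample below is synthetic, for illustration only.
import numpy as np

def hodges_lehmann(values):
    """Median of all pairwise averages (x_i + x_j)/2 with i <= j."""
    v = np.asarray(values)
    i, j = np.triu_indices(len(v))
    return np.median((v[i] + v[j]) / 2.0)

rng = np.random.default_rng(5)
a50_samples = rng.normal(0.27, 0.03, 500)  # stand-in for 10 000 simulated a50 values

estimate = hodges_lehmann(a50_samples)
boot = np.array([
    hodges_lehmann(rng.choice(a50_samples, size=a50_samples.size, replace=True))
    for _ in range(500)
])
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])
print(estimate, (ci_low, ci_high))
```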
TABLE 6
Critical values for the example simulation (corresponding to the values in Figure 3)

Model (y =)     Material   a50       a90       a90/95
x               Combo      0.2702    0.4706    0.5055
x² + x          Combo      0.2089    0.2874    0.3076
x               A subset   0.2645    0.3980    0.4320
x² + x          A subset   0.3742    0.4335    0.4503
x + m           A          0.2645    0.4544    0.4958
x² + x + m*     A          0.2707*   0.3474*   0.3652*
x               B subset   0.2731    0.3911    0.4213
x² + x          B subset   0.2464    0.2923    0.3053
x + m           B          0.2731    0.3693    0.3917
x² + x + m*     B          0.1771*   0.2176*   0.2271*
*The critical values corresponding to the model that best fit the data.
For Material A, a₅₀ has a similar mean for many of the models, but the y = x² + x model (collapsed over material) has a smaller a₅₀, while the y = x² + x model (fit to Material A only) has a larger a₅₀. Figures 5 and 6 show that the results for a₉₀ and a₉₀/₉₅ are similar. For Material A, the y = x² + x model (collapsed over material) yields the smallest estimates, likely because it averages over material, so the smaller values in Material B cause the estimates to be smaller. All the other models produce estimates larger than the quadratic model with interactions. For Material B, the quadratic model with interactions (y = x² + x + m + mx + mx²) yields smaller critical values than all the simpler models. Models that excluded the x² term (e.g., y = x) were more conservative than they needed to be (i.e., overconservative) when compared with models that included the x² term.
According to the AIC, BIC, and LRT results, the best overall model for all the simulated data is the quadratic model with interactions: y = x² + x + m + mx + mx². Since this model
[Figure 4: box plots and density plots of a₅₀ for Material A and Material B; models compared: y = x² + x + material, y = x + material, y = x² + x (Mat A/B only), y = x (Mat A/B only), y = x² + x (collapsed), y = x (collapsed); x-axis: a₅₀, y-axis: density]
Figure 4. Resulting a₅₀ values from 10 000 simulations by model. The dashed black line indicates the median for the best-fitting model, the quadratic model with interactions (y = x² + x + m + mx + mx²).
[Figure 5: box plots and density plots of a₉₀ for Material A and Material B, with the same model set as Figure 4; x-axis: a₉₀, y-axis: density]
Figure 5. Resulting a₉₀ values from 10 000 simulations by model. The dashed black line indicates the median for the best-fitting model, the quadratic model with interactions (y = x² + x + m + mx + mx²).