data was simulated from a quadratic function, so models with
cubic terms constitute overfitting. None of the 10 000 values for
Material A existed, and the 10 000 values for Material B were
negative. Specifically, ​​ 90​​​ covered a range of –0.0508 to –0.0162,
and ​​ 90/95​​​ covered a range of –0.0421 to –0.0021. Even if the
cubic term were to provide an explanation for the negative
values, the absolute values of these critical values are an
order of magnitude smaller than those produced by the other
models. Future work, which carefully handles multicollinearity,
may provide different results.
Conclusion
The current military handbook for probability of detection
(POD) studies (MIL-HDBK-1823A) uses a simple linear model
relating the signal from an NDE system to the observed discon-
tinuity size [4]. However, other important variables may be nec-
essary to describe the NDE response. Currently, POD studies
involving important variables beyond discontinuity size are
handled by either (1) creating subsets of the data, building a
POD for each one, and choosing the most conservative (largest
a90/95) result, or (2) naively assuming that discontinuity size is
the only important variable, essentially averaging over all other
variables. Subsetting the data may cause problems with sample
size, and taking the most conservative result from the set can
lead to an unnecessarily conservative (i.e., overconservative)
estimate, rather than a fleet-wide average, as demonstrated
by the simulation experiment. Ignoring important variables
inflates the variance, creating a larger confidence interval.
Furthermore, without the other variables included, the linear
model will be skewed toward the cases where the majority of
the data occurs. Data collection is often influenced by cost and
availability—for example, if one material is cheaper, it may be
overrepresented in the study, skewing the results toward that
material. Neither of these options is ideal. Therefore, a meth-
odology was built so that additional variables included in the
linear model are correctly handled in the transformation to
probability with respect to discontinuity size.
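The variance inflation from omitting an important variable (option 2 above) can be illustrated with a small simulation. The sketch below is illustrative only: the coefficients, sample size, and noise level are made up, and ordinary least squares is solved directly with NumPy rather than with a POD package.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0.5, 2.0, size=2 * n)   # discontinuity size
m = np.repeat([0.0, 1.0], n)            # material indicator (A = 0, B = 1)
# Hypothetical quadratic truth with a material effect (made-up coefficients)
y = 1.0 + 0.8 * x + 0.3 * x**2 + 0.5 * m + 0.2 * m * x \
    + rng.normal(0, 0.1, 2 * n)

def resid_var(X, y):
    """Fit OLS by least squares and return the residual variance s^2."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid / (len(y) - X.shape[1])

# Option 2 from the text: ignore material and curvature, fit y ~ x only
X_naive = np.column_stack([np.ones_like(x), x])
# Fuller model: y ~ x + x^2 + m + m*x, material handled explicitly
X_full = np.column_stack([np.ones_like(x), x, x**2, m, m * x])

s2_naive = resid_var(X_naive, y)
s2_full = resid_var(X_full, y)
print(s2_naive > s2_full)  # omitting material inflates the residual variance
```

The inflated residual variance is what widens the confidence interval on the POD curve, as described above.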
It is reasonable to expect that a better linear fit to the data
will provide a more accurate POD estimate. Using a linear
model that omits important variables in a POD study can
introduce bias into the POD curve estimates. MIL-HDBK-1823A
explains how to perform POD analysis for the relationship
between signal response and discontinuity size (â vs. a); this
paper extends those methods by showing how to include other
important variables and higher-order terms of discontinuity
size. These methods can be used to build more accurate linear
models for NDE applications. The categorical variable used in
our example demonstrated differences among materials, but
these methods could also be used to model other categorical
variables—such as cracks versus notches—potentially eliminat-
ing the need to separately estimate a knock-down factor.
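Once such a model is fitted, the critical sizes follow from inverting the fitted mean-signal curve against the decision threshold, separately for each material level. The sketch below uses hypothetical coefficients, residual standard deviation, and decision threshold (none are from this paper); it solves mean(a) = y_dec + sigma * inverse-normal(p) by bisection, which also works when higher-order terms leave no convenient closed-form inverse.

```python
from statistics import NormalDist

# Hypothetical fit of y = b0 + b1*x + b2*x^2 + m*(b3 + b4*x + b5*x^2)
beta = dict(b0=0.2, b1=1.1, b2=0.35, b3=-0.3, b4=0.15, b5=-0.05)
sigma = 0.25   # residual standard deviation (assumed)
y_dec = 1.5    # decision threshold on the signal (assumed)

def mean_signal(a, m):
    b = beta
    return (b["b0"] + b["b1"] * a + b["b2"] * a**2
            + m * (b["b3"] + b["b4"] * a + b["b5"] * a**2))

def a_p(p, m, lo=1e-6, hi=10.0):
    """Size at which POD reaches p for material m:
    solve mean_signal(a, m) = y_dec + sigma * z_p by bisection
    (the mean signal is increasing in a over this interval)."""
    target = y_dec + sigma * NormalDist().inv_cdf(p)
    f = lambda a: mean_signal(a, m) - target
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

for m, name in [(0, "Material A"), (1, "Material B")]:
    print(name, "a50=%.3f a90=%.3f" % (a_p(0.5, m), a_p(0.9, m)))
```

Because the material indicator enters the mean directly, both materials' critical sizes come from one fitted model, with no separate subsets or knock-down factors.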
This paper used a simulation size of 10 000 to illus-
trate how simple models may yield biased estimates. Future
work could apply these methodologies to a large dataset
of bootstrap-sampled NDE data to determine the relation-
ship between the quality of a linear model fit and the bias
in the POD estimate. For some NDE applications, a simple
model may be sufficiently accurate, while for others, a more
complex model may be necessary. This paper provides a prac-
tical demonstration that can be readily extended to include
N continuous and/or categorical variables, along with addi-
tional higher-order terms. Overfitting the model can lead
TABLE 8
Hodges-Lehmann estimated differences in the critical values when comparing simpler models to the
complex model (y = x² + x + m + mx + mx²)

Model (y =) | Material | a50 (95% confidence)        | a90 (95% confidence)        | a90/95 (95% confidence)
x           | A        | –0.0028 (–0.0032, –0.0024)  | –0.1349 (–0.1353, –0.1346)  | –0.1620 (–0.1623, –0.1616)
x² + x      | A        | 0.0609 (0.0607, 0.0611)     | 0.0543 (0.0541, 0.0545)     | 0.0539 (0.0537, 0.0541)
x           | A subset | –0.1023 (–0.1024, –0.1022)  | –0.0993 (–0.0995, –0.0992)  | –0.1024 (–0.1026, –0.1023)
x² + x      | A subset | –0.0028 (–0.0031, –0.0025)  | –0.0413 (–0.0417, –0.0409)  | –0.0615 (–0.0619, –0.0611)
x + m       | A        | –0.0026 (–0.0031, –0.0020)* | –0.1217 (–0.1223, –0.1212)* | –0.1550 (–0.1555, –0.1544)*
x           | B        | –0.0947 (–0.0950, –0.0945)  | –0.2616 (–0.2619, –0.2614)  | –0.2928 (–0.2931, –0.2925)
x² + x      | B        | –0.0310 (–0.0310, –0.0309)  | –0.0723 (–0.0724, –0.0722)  | –0.0768 (–0.0770, –0.0767)
x           | B subset | –0.0691 (–0.0691, –0.0690)  | –0.0679 (–0.0680, –0.0678)  | –0.0696 (–0.0696, –0.0695)
x² + x      | B subset | –0.0947 (–0.0950, –0.0945)  | –0.1681 (–0.1683, –0.1678)  | –0.1923 (–0.1926, –0.1920)
x + m       | B        | –0.0947 (–0.0950, –0.0945)* | –0.1524 (–0.1527, –0.1522)* | –0.1703 (–0.1706, –0.1700)*

Note: Negative values indicate that the estimates from the quadratic model with interactions were smaller than those from the simpler model listed.
Mann-Whitney matched-pairs two-tailed tests indicated that every comparison was significant (all P-values < 0.0001).
The values in bold were closest to the quadratic model with interactions.
*The critical values corresponding to the model that best fit the data.
AUGUST 2025 | MATERIALS EVALUATION | 71
to nonexistent or inaccurate results, but overfitting can be
avoided by monitoring when the adjusted R² (R²a) is smaller than R² and when
AIC and BIC values increase (rather than decrease) as more
variables are included. Additionally, potential inflation of
parameter variances can be reduced by centering interaction
and higher-order terms in the model. Future work is planned
to investigate other NDE techniques and demonstrate how this
process can be applied to more complex inspection scenar-
ios. More realistic and complex simulations are also planned
to explore how these methods affect estimates of the critical
values (a50, a90, and a90/95).
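Both diagnostics mentioned above can be computed directly. The sketch below uses made-up data: it shows how centering a size variable shrinks the correlation between that variable and its square (reducing parameter-variance inflation), and computes Gaussian-OLS AIC and BIC (up to an additive constant) for a quadratic model versus one carrying a spurious cubic term.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(1.0, 3.0, 500)   # sizes away from zero exaggerate collinearity

# Raw vs. centered higher-order term: correlation between x and x^2
corr_raw = np.corrcoef(x, x**2)[0, 1]
xc = x - x.mean()
corr_centered = np.corrcoef(xc, xc**2)[0, 1]
print(round(corr_raw, 3), round(corr_centered, 3))  # centering shrinks it

def aic_bic(X, y):
    """Gaussian-OLS AIC and BIC, up to an additive constant."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + 2 * k, n * np.log(rss / n) + k * np.log(n)

# Hypothetical quadratic truth; the cubic term is pure overfitting
y = 0.5 + 0.8 * x + 0.3 * x**2 + rng.normal(0, 0.2, x.size)
X2 = np.column_stack([np.ones_like(x), x, x**2])
X3 = np.column_stack([X2, x**3])
aic2, bic2 = aic_bic(X2, y)
aic3, bic3 = aic_bic(X3, y)
# BIC's log(n) penalty typically rises when the spurious cubic is added
print(round(bic2, 1), round(bic3, 1))
```

A model whose AIC/BIC increase when a term is added, or whose higher-order critical values stop making physical sense, is a candidate for simplification.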
EDITOR'S NOTE
Appendices for this paper are available to download from the digital
edition.
REFERENCES
1. Berens, A. P. 1989. “NDE reliability data analysis.” In ASM Handbook,
9th ed., Vol. 17: 689–701. ASM International.
2. Hovey, P. W., and A. P. Berens. 1988. “Statistical Evaluation of NDE
Reliability in the Aerospace Industry,” in Review of Progress in Quantita-
tive Nondestructive Evaluation, eds. D. O. Thompson and D. E. Chimenti.
Boston, MA: Springer. https://doi.org/10.1007/978-1-4613-0979-6_108.
3. Cherry, M., and C. Knott. 2022. “What is probability of detection?” Mate-
rials Evaluation 80 (12): 24–28. https://doi.org/10.32548/2022.me-04324.
4. US Department of Defense. 2009. Department of Defense Handbook:
Nondestructive Evaluation System Reliability Assessment (MIL-HDBK-
1823A). Standardization Order Desk: Philadelphia, PA.
5. US Department of Defense. 2016. Department of Defense Standard
Practice: Aircraft Structural Integrity Program (MIL-STD-1530D). Standard-
ization Order Desk: Philadelphia, PA.
6. Annis, C. 2016. mh1823 POD (Probability of Detection) Software, Version
5.2.1.
7. Gohmann, C., T. Boehnlein, and C. Knott. 2023. POD (Probability of
Detection) Software, Version 4.5.
8. Hoppe, W. C. 2009. “Parametric probability of detection (POD) estima-
tion for eddy current crack detection.” 14th International Workshop on
Electromagnetic Nondestructive Evaluation. Dayton, OH.
9. Aldrin, J. C., J. S. Knopp, and H. A. Sabbagh. 2013. “Bayesian methods
in probability of detection estimation and model-assisted probability of
detection evaluation.” AIP Conference Proceedings 1511: 1733–1740. https://
doi.org/10.1063/1.4789250.
10. Shell, E. B., J. C. Aldrin, H. A. Sabbagh, E. Sabbagh, R. K. Murphy, S.
Mazdiyasni, and E. A. Lindgren. 2015. “Demonstration of model-based
inversion of electromagnetic signals for crack characterization.” AIP
Conference Proceedings 1650: 484–493. https://doi.org/10.1063/1.4914645.
11. Müller, C., and T. Öberg. 2004. “Strategy for Verification and Demon-
stration of the Sealing Process for Canisters for Spent Fuel.” SKB Report
R-04-56. https://skb.com/publication/22558.
12. Ronneteg, U., L. Cederqvist, H. Rydén, T. Öberg, and C. Müller. 2006.
“Reliability in sealing of canister for spent nuclear fuel.” SKB Report
R-06-26. https://www.skb.com/publication/1137244.
13. Aldrin, J. C., E. A. Medina, J. Santiago, E. A. Lindgren, C. F. Buynak,
and J. S. Knopp. 2012. “Demonstration study for reliability assessment
of SHM systems incorporating model-assisted probability of detection
approach.” AIP Conference Proceedings 1430: 1543–1550. https://doi.
org/10.1063/1.4716398.
14. Smart, L. J., B. J. Engle, L. J. Bond, J. MacKenzie, and G. Morris. 2016.
“Material characterization of pipeline steels: Inspection techniques review
and potential property relationships.” Proceedings of the 2016 11th Inter-
national Pipeline Conference. Vol. 3: Operations, Monitoring, and Mainte-
nance. https://doi.org/10.1115/IPC2016-64157.
15. Barrett, A., R. Smith, and M. Modarres. 2018. “A multivariate model
to assess the probability of detection and sizing of defects in aluminum
panels using eddy current inspections.” Engineering Failure Analysis 94:
182–194. https://doi.org/10.1016/j.engfailanal.2018.07.028.
16. Knott, C. E., C. S. Kabban, and J. C. Aldrin. 2023. “Simple and multiple
linear regression for probability of detection.” Proceedings SPIE 12491, 8th
International Workshop on Reliability of NDT/NDE: 1249103. https://doi.
org/10.1117/12.2660140.
17. Kutner, M. H., C. I. Nachtsheim, J. Neter, and W. Li. 2004. Applied
Linear Statistical Models. 5th ed. New York, NY: McGraw-Hill-Irwin.
18. Montgomery, D. C. 2017. Design and Analysis of Experiments. 9th ed.
New York: John Wiley.
19. Casella, G., and R. Berger. 2001. Statistical Inference. 2nd ed. Boston,
MA: Cengage Learning.
20. Box, G. E. P., and D. R. Cox. 1964. “An analysis of transformations.”
Journal of the Royal Statistical Society: Series B, Statistical Methodology 26
(2): 211–243. https://doi.org/10.1111/j.2517-6161.1964.tb00553.x.
21. Millard, S. P. 2025. “boxcox: Boxcox Power Transformation.” Accessed
29 June 2025. https://www.rdocumentation.org/packages/EnvStats/
versions/3.0.0/topics/boxcox.
22. Stroup, W. W. 2013. Generalized Linear Mixed Models: Modern
Concepts, Methods and Applications. Boca Raton, FL: Taylor & Francis
Group LLC.
23. James, G., D. Witten, T. Hastie, and R. Tibshirani. 2013. An Introduction
to Statistical Learning with Applications in R. New York: Springer.
24. Therneau, T. M. 2023. A Package for Survival Analysis in R, R package
Version 3.5-5. https://CRAN.R-project.org/package=survival [in Appendix].
25. Truxillo, C. 2012. Statistical Analysis with the GLIMMIX Procedure:
Course Notes. Cary, NC: SAS Institute Inc. [in Appendix].