ABSTRACT
This paper presents a methodology for extending
probability of detection (POD) modeling for
continuously valued (â vs. a) signal responses to
allow for the addition of multiple variables beyond
the simple discontinuity size model, along with
higher-level interactions. The statistical methodology
for correctly transforming these more complex
linear models into POD curves is provided, and the
approach is illustrated with a simulated dataset that
includes polynomial and categorical predictors.
KEYWORDS: probability of detection, linear regression,
â versus a, eddy current
1. Introduction
Probability of detection (POD) modeling is a useful tool for
assessing the detection capability of a nondestructive evalu-
ation (NDE) inspection system. Structural components that
may experience fatigue during normal use can be periodically
reinspected to find discontinuities so that defective components can
be repaired or replaced before the discontinuities grow to dangerous sizes.
These inspection intervals are chosen based on the inspec-
tion system’s capability. POD considers false detection, but it
focuses on true detection and an inspection system’s capa-
bility to find discontinuities before they reach a critical size.
An inspection system that finds small discontinuities is of less
interest than one that consistently finds all the large disconti-
nuities. Inspection systems prone to missing large discontinu-
ities will require more frequent reinspection [1–5].
Often, variables other than the discontinuity size can affect
the nondestructive inspection response. Current methods and
software [6, 7] for POD modeling do not support such additional variables. Some
prior work has considered additional parameters in â versus
a signal-response analysis. Improved model fits have been
achieved by considering additional parameters such as discon-
tinuity depth, alongside discontinuity length, for representing
signals from surface discontinuities using eddy current testing
[8–10]. Concepts of multifactorial designed experiments and
multiple linear regression (MLR) for nondestructive testing
reliability evaluation were discussed by Müller and Öberg [11,
12], but no applications with data were presented. In another
study, a multi-parameter linear regression model was fit to
experimental vibration-based structural health monitoring
data and used to evaluate POD for varying measurement
locations and degradation over time [13]. This work used a
Monte Carlo approach to generate POD estimates, but it did
not present much detail on the model form or consideration of
higher-order terms. Smart et al. [14] considered a multivariate
regression model for pipe material characterization perfor-
mance, but this model was not extended to POD evaluation.
In other work [15], a Bayesian approach was successfully used
to model 11 additional variables, and although this method
provided POD estimates, it did not provide a functional form.
This paper extends current signal-response models for
POD to include additional categorical and higher-order
response variables. It was inspired by a large bolt-hole eddy
current study [16] in which many variables, in addition to
discontinuity size, had a significant effect on the NDE response
(see Section 4.1 for further details).
MULTIVARIATE PROBABILITY OF DETECTION
MODELING INCLUDING CATEGORICAL
VARIABLES AND HIGHER-ORDER RESPONSE
MODELS
CHRISTINE E. KNOTT†*, CHRISTINE SCHUBERT KABBAN‡, AND JOHN C. ALDRIN§
† Materials and Manufacturing Directorate, Air Force Research Laboratory,
Wright-Patterson AFB, OH 45433
‡ Department of Mathematics and Statistics, Air Force Institute of Technology,
Graduate School of Engineering and Management, Wright-Patterson AFB, OH
45433
§ Computational Tools, Gurnee, IL 60031
*Corresponding author: christine.knott.1@us.af.mil
Materials Evaluation 83 (8): 57–72
https://doi.org/10.32548/2025.me-04532
©2025 American Society for Nondestructive Testing
NDT TECH PAPER | MATERIALS EVALUATION, AUGUST 2025
2. Background: Standard Signal Response
Methodology
Inspection systems traditionally provide results which have
either continuous values (handled with â vs. a analysis) or
binary values (handled with hit/miss analysis). This paper
focuses on POD methodology for continuous data using â vs.
a methods. According to MIL-HDBK-1823A [4] and other sup-
porting references [1, 2, 5], once the data is collected, the â vs.
a statistical methods proceed through the following steps:
1. Build a simple linear model relating the NDE sensor
response to discontinuity size.
2. Transform the simple linear model into probability space to
form a probability curve.
3. Estimate $a_{90}$, the discontinuity size corresponding to 90%
POD.
4. Estimate a 95% confidence interval on the probability
curve and determine $a_{90/95}$, which is the discontinuity size
corresponding to 90% POD with 95% confidence. This step
generates a useful metric that is often used to represent NDE
capability.
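The first three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the handbook's implementation: the data values, the decision threshold `y_dec`, and the function names are hypothetical; ordinary least squares stands in for survival regression, so censoring is not handled; and the sketch assumes the usual signal-response relationship in which POD at size a is the Gaussian probability that the response exceeds the decision threshold. Step 4 is omitted because the confidence bound also requires the covariance of the parameter estimates.

```python
import math
from statistics import NormalDist

# Hypothetical data: discontinuity sizes (a) and signal responses (a-hat).
sizes = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
responses = [1.1, 1.9, 3.2, 3.8, 5.3, 5.9, 7.2, 7.8]
y_dec = 3.0  # assumed decision threshold on the response scale


def fit_simple_linear(x, y):
    """Step 1: least-squares fit of y = b0 + b1 * x; returns (b0, b1, sigma)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma = math.sqrt(rss / (n - 2))  # residual standard deviation
    return b0, b1, sigma


def pod(a, b0, b1, sigma, threshold):
    """Step 2: probability that the response at size a exceeds the threshold."""
    return NormalDist().cdf((b0 + b1 * a - threshold) / sigma)


def a_at_pod(p, b0, b1, sigma, threshold):
    """Step 3: size at which the POD curve reaches probability p."""
    return (threshold + NormalDist().inv_cdf(p) * sigma - b0) / b1


b0, b1, sigma = fit_simple_linear(sizes, responses)
a90 = a_at_pod(0.90, b0, b1, sigma, y_dec)
print(f"b0={b0:.3f}, b1={b1:.3f}, sigma={sigma:.3f}, a90={a90:.3f}")
```

Evaluating `pod(a90, b0, b1, sigma, y_dec)` returns 0.90 by construction, which is a quick consistency check on the transformation from the linear model to the probability curve.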
This section describes each of these steps in detail. The
notation in this paper follows the conventions of linear
modeling, but in practice, POD data analysis benefits from
using survival regression, which offers the added benefits of
correctly handling censoring and estimating the variance of
the residual error ($\sigma_\varepsilon$). When describing the standard linear
modeling process, $x$ will be used as a generic explanatory
variable, but when the explanatory variable represents the size
of the discontinuity, $a$ will be used instead.
2.1. Building a Linear Model Relating NDE Response to
Discontinuity Size
The NDE response can be written as a vector, $\mathbf{y}$. The variables
that cause the variability in the NDE response are fixed effects,
$\mathbf{X}$, written as a matrix of values related to $\mathbf{y}$. Any variability
in the NDE response that cannot be explained by the fixed
effects is considered random error, $\boldsymbol{\varepsilon}$, a vector representing the
random noise in the experiment [3]. A linear model [17, 18, 23]
for data with $N$ observations and $k$ variables has the form of
Equation 1:
(1) $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$
In Equation 1, the first column of the matrix $\mathbf{X}$ consists of
ones because it corresponds to the intercept. A simple linear
model includes only one variable, discontinuity size, denoted
$x_{ij}$, so it can be written as Equation 2:
(2) $\mathbf{y} = \beta_0 \mathbf{1}_{N \times 1} + \beta_1 \mathbf{x}_1 + \boldsymbol{\varepsilon}$
Note that $\mathbf{1}_{N \times 1}$ is a vector
of ones, $\mathbf{x}_1$ is a vector of the observed discontinuities, and $\boldsymbol{\varepsilon}$
is a vector of random error terms, each of size $N \times 1$.
Using software, a maximum likelihood estimation can
provide the best estimates of the parameters in $\boldsymbol{\beta}$ given the
observed data. Equation 3 is the fitted model (from Equation 1),
with the estimated values denoted by hats (^):
(3) $\hat{\mathbf{y}} = \hat{\beta}_0 \mathbf{1}_{N \times 1} + \hat{\beta}_1 \mathbf{x}_1 + \cdots + \hat{\beta}_k \mathbf{x}_k$
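For uncensored data with Gaussian errors, the maximum likelihood estimates have a well-known closed form. It is given here as a supplementary note, not as part of the original derivation. When observations are censored, which is the case survival regression is designed to handle, no such closed form is available and the likelihood is maximized numerically.

```latex
\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y},
\qquad
\hat{\sigma}_{\varepsilon}^{2} = \frac{1}{N}\,
  (\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}})^{\top}
  (\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}})
```

The maximum likelihood variance estimate divides by $N$; the more common unbiased estimate divides by $N - k - 1$ instead.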
The linear model also allows for variables to have a
combined effect on the response, as defined by adding an
interaction term (e.g., $\beta_i x_1 x_2$ for variables $x_1$ and $x_2$). Higher-
order terms of continuous values are also possible; for
example, a quadratic term could be included to form
$\beta_0 + \beta_1 x_1 + \beta_2 x_1^2$. Although interactions and polynomial-
ordered variables commonly occur in real data, standard POD
methodology is defined only for the simple case (Equation 2).
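As a rough illustration of how interaction and higher-order terms enter the linear model, the sketch below builds a design matrix by hand and fits it by least squares. The data, the column layout, and the helper names (`design_row`, `solve_normal_equations`) are hypothetical and are not taken from the paper's dataset. The normal equations are solved with plain Gaussian elimination for brevity, not for numerical robustness.

```python
def design_row(x1, x2):
    """One row of X: intercept, x1, x2, interaction x1*x2, quadratic x1**2."""
    return [1.0, x1, x2, x1 * x2, x1 ** 2]


def solve_normal_equations(X, y):
    """Solve (X'X) beta = X'y by Gaussian elimination with partial pivoting."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution on the upper-triangular system.
    beta = [0.0] * p
    for i in reversed(range(p)):
        known = sum(A[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = (A[i][p] - known) / A[i][i]
    return beta


# Hypothetical noise-free data generated from known coefficients, so the fit
# should recover them up to floating-point error.
true_beta = [0.5, 2.0, -1.0, 0.3, 0.1]
points = [(x1, x2) for x1 in (1.0, 2.0, 3.0, 4.0) for x2 in (0.0, 1.0, 2.0)]
X = [design_row(x1, x2) for x1, x2 in points]
y = [sum(b * v for b, v in zip(true_beta, row)) for row in X]

beta_hat = solve_normal_equations(X, y)
print([round(b, 6) for b in beta_hat])
```

The model stays linear in the coefficients even though it is no longer linear in $x_1$, which is why the same fitting machinery applies once the extra columns are added.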
When using a linear model, several assumptions must be
met to provide statistical inference [3, 17–19]. A linear model
describes the mean behavior of the response, and therefore
assumes that $\mathbf{y}$ is Gaussian distributed with a mean of $\mathbf{X}\boldsymbol{\beta}$ and
variance of $\sigma_\varepsilon^2$. Note that the inference for $\mathbf{y}$ is based on the
random error. The random errors ($\boldsymbol{\varepsilon}$) are assumed to be inde-
pendently and identically distributed Gaussian (or Normal)
with a mean of zero and a variance of $\sigma_\varepsilon^2 \mathbf{I}$, where $\mathbf{I}$ represents
the identity matrix, i.e., $\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$.
To verify these assumptions, one can examine the residuals
of the fitted model, calculated as the difference between the original
$\mathbf{y}$ values and the fitted values, $\hat{\mathbf{y}}$. If these assumptions are
violated, then the inferences generated from the linear model
may be inaccurate. Of the assumptions required, autocorrelation
(lack of independence) is rarely observed in randomized
POD studies; however, normality and constant variance are
rarely met in the original dataset. Often a transformation of the
signal is needed. A common transformation is the natural loga-
rithm, but these authors prefer a more flexible approach called
the Box-Cox transformation [20] (Equation 4):
(4) Box-Cox transformation of $y$: $y' = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0 \\ \log(y) & \text{if } \lambda = 0 \end{cases}$
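A direct transcription of Equation 4 into Python may help make the two cases concrete. This is a minimal sketch: the function names and the sample signal values are hypothetical, the logarithm is the natural logarithm, and the choice of $\lambda$ (typically estimated from the data) is not shown. The transformation is only defined for strictly positive responses, so the sketch rejects other inputs.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y for a given lambda."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(y_prime, lam):
    """Map a transformed value back to the original signal scale."""
    if lam == 0:
        return math.exp(y_prime)
    return (lam * y_prime + 1.0) ** (1.0 / lam)


signals = [0.8, 1.5, 2.7, 4.1, 9.6]  # hypothetical positive signal responses
for lam in (0.0, 0.5, 1.0):
    transformed = [box_cox(s, lam) for s in signals]
    print(lam, [round(t, 4) for t in transformed])
```

As $\lambda$ approaches zero, the power form converges to the logarithm, so the two branches join smoothly. The inverse function is what allows a decision threshold or a fitted value on the transformed scale to be reported in the original signal units.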