current study [16] in which many variables, in addition to dis-
continuity size, had a significant effect on the NDE response
(see Section 4.1 for further details).
2. Background: Standard Signal Response
Methodology
Inspection systems traditionally provide results which have
either continuous values (handled with â vs. a analysis) or
binary values (handled with hit/miss analysis). This paper
focuses on POD methodology for continuous data using â vs.
a methods. According to MIL-HDBK-1823A [4] and other sup-
porting references [1, 2, 5], once the data is collected, the â vs.
a statistical methods proceed through the following steps:
1. Build a simple linear model relating the NDE sensor
response to discontinuity size.
2. Transform the simple linear model into probability space to
form a probability curve.
3. Estimate $a_{90}$, the discontinuity size corresponding to 90%
POD.
4. Estimate a 95% confidence interval on the probability
curve and determine $a_{90/95}$, which is the discontinuity size
corresponding to 90% POD with 95% confidence. This step
generates a useful metric that is often used to represent NDE
capability.
This section describes each of these steps in detail. The
notation in this paper follows the conventions of linear
modeling, but in practice, POD data analysis benefits from
using survival regression, which offers the added benefits of
correctly handling censoring and estimating the variance of
the residual error ($\sigma_{\varepsilon}$). When describing the standard linear
modeling process, $x$ will be used as a generic explanatory
variable, but when the explanatory variable represents the size
of the discontinuity, $a$ will be used instead.
2.1. Building a Linear Model Relating NDE Response to
Discontinuity Size
The NDE response can be written as a vector, $\mathbf{y}$. The variables
that cause the variability in the NDE response are fixed effects,
$\mathbf{X}$, written as a matrix of values related to $\mathbf{y}$. Any variability
in the NDE response that cannot be explained by the fixed
effects is considered random error, $\boldsymbol{\varepsilon}$, a vector representing the
random noise in the experiment [3]. A linear model [17, 18, 23]
for data with $N$ observations and $k$ variables has the form of
Equation 1:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon} \tag{1}$$
In Equation 1, the first column of the matrix $\mathbf{X}$ consists of
ones because it corresponds to the intercept. A simple linear
model includes only one variable (discontinuity size, denoted
$x_{ij}$), so it can be written as Equation 2:

$$\mathbf{y} = \beta_0 \mathbf{1}_{N \times 1} + \beta_1 \mathbf{x}_1 + \boldsymbol{\varepsilon} \tag{2}$$

In Equation 2, $\mathbf{1}_{N \times 1}$ is a vector of ones, $\mathbf{x}_1$ is a vector of the
observed discontinuities, and $\boldsymbol{\varepsilon}$ is a vector of random error
terms, each of size $N \times 1$.
Using software, maximum likelihood estimation can
provide the best estimates of the parameters in $\boldsymbol{\beta}$ given the
observed data. Equation 3 is the fitted model (from Equation 1),
with the estimated values denoted by hats:

$$\hat{\mathbf{y}} = \hat{\beta}_0 \mathbf{1}_{N \times 1} + \hat{\beta}_1 \mathbf{x}_1 + \cdots + \hat{\beta}_k \mathbf{x}_k \tag{3}$$
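As a rough illustration of the fitting step, the sketch below fits the simple linear model by ordinary least squares, which coincides with the maximum likelihood estimates of the intercept and slope when the errors are Gaussian and no observations are censored. The helper name `fit_simple_linear` and the sample data are hypothetical, and the residual standard deviation uses the maximum likelihood form (dividing by $N$), which is an assumption of this sketch rather than something stated in the text.

```python
import math

def fit_simple_linear(a, y):
    """Least-squares fit of y = b0 + b1 * a (hypothetical helper).

    Returns the intercept, the slope, the maximum likelihood estimate
    of the residual standard deviation, and the residuals y - y_hat.
    """
    n = len(a)
    a_bar = sum(a) / n
    y_bar = sum(y) / n
    sxx = sum((ai - a_bar) ** 2 for ai in a)
    sxy = sum((ai - a_bar) * (yi - y_bar) for ai, yi in zip(a, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * a_bar
    residuals = [yi - (b0 + b1 * ai) for ai, yi in zip(a, y)]
    sigma_eps = math.sqrt(sum(r * r for r in residuals) / n)
    return b0, b1, sigma_eps, residuals

# Made-up discontinuity sizes and sensor responses, for illustration only.
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]
responses = [2.1, 3.9, 6.2, 7.8, 10.0]
b0, b1, sigma_eps, residuals = fit_simple_linear(sizes, responses)
```

With censored data this closed form no longer applies, and the likelihood has to be maximized numerically.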
The linear model also allows for variables to have a
combined effect on the response, as defined by adding an
interaction term (e.g., $\beta_i x_1 x_2$ for variables $x_1$ and $x_2$).
Higher-order terms of continuous values are also possible; for
example, a quadratic term could be included to form
$\beta_0 + \beta_1 x_1 + \beta_2 x_1^2$. Although interactions and
polynomial-ordered variables commonly occur in real data, standard POD
methodology is defined only for the simple case (Equation 2).
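To make the interaction and quadratic terms concrete, the following minimal sketch builds rows of the matrix $\mathbf{X}$ for two variables. The second variable `x2` and the sample values are hypothetical; the text does not name specific additional variables here.

```python
def design_row(x1, x2):
    """One row of X: intercept, x1, x2, interaction x1*x2, quadratic x1**2."""
    return [1.0, x1, x2, x1 * x2, x1 ** 2]

# Hypothetical observations of (x1, x2), for illustration only.
observations = [(0.5, 10.0), (1.0, 20.0), (2.0, 10.0)]
X = [design_row(x1, x2) for x1, x2 in observations]
```

Each extra column adds one coefficient to $\boldsymbol{\beta}$, so the model stays linear in its parameters even though it is no longer linear in $x_1$.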
When using a linear model, several assumptions must be
met to provide statistical inference [3, 17–19]. A linear model
describes the mean behavior of the response, and therefore
assumes that $\mathbf{y}$ is Gaussian distributed with a mean of $\mathbf{X}\boldsymbol{\beta}$ and
variance of $\sigma_{\varepsilon}^2$. Note that statistical inference is based on the
random error. The random errors ($\boldsymbol{\varepsilon}$) are assumed to be
independently and identically distributed Gaussian (or Normal)
with a mean of zero and a variance of $\sigma_{\varepsilon}^2 \mathbf{I}$, where $\mathbf{I}$ represents
the identity matrix, i.e., $\varepsilon_i \sim N(0, \sigma_{\varepsilon}^2)$.
To verify these assumptions, one can examine the residuals
of the fitted model, calculated as the difference between the observed
$\mathbf{y}$ values and the fitted values, $\hat{\mathbf{y}}$. If these assumptions are
violated, then the inferences generated from the linear model
may be inaccurate. Of the possible violations, autocorrelation
(lack of independence) is rarely observed in randomized
POD studies; however, normality and constant variance are
rarely met in the original dataset. Often a transformation of the
signal is needed. A common transformation is the natural logarithm,
but these authors prefer a more flexible approach called
the Box-Cox transformation [20] (Equation 4):
$$\text{Box-Cox transformation of } y = y' = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0 \\ \log(y) & \text{if } \lambda = 0 \end{cases} \tag{4}$$
MATERIALS EVALUATION • AUGUST 2025
Generally, the response $\mathbf{y}'$, a vector of Box-Cox transformed
$\mathbf{y}$ values, is modeled as a function of $\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$ using maximum
likelihood estimation. The method for estimating $\lambda$ is described
in [16, 17, 19], and many commercial and open-source software
packages [21] can recommend an optimal $\lambda$ by comparing the
sum of squared errors at various $\lambda$ values across the range of
power transforms, stepping in increments (e.g., 0.1) across
both negative and positive values, with the transformation at
a value of 0 representing the natural logarithm.
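The grid search described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any particular package: the candidate values of $\lambda$ are compared with the Box-Cox profile log-likelihood rather than the raw sum of squared errors, because the profile log-likelihood includes the Jacobian term that makes fits on different transformed scales comparable. The function names, the grid limits of -2 to 2, and the synthetic data are all hypothetical.

```python
import math

def boxcox(y, lam):
    """Box-Cox transform of a positive value y (Equation 4)."""
    if abs(lam) < 1e-12:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def sse_simple_linear(a, y):
    """Sum of squared residuals of a least-squares line y = b0 + b1 * a."""
    n = len(a)
    a_bar, y_bar = sum(a) / n, sum(y) / n
    sxx = sum((ai - a_bar) ** 2 for ai in a)
    sxy = sum((ai - a_bar) * (yi - y_bar) for ai, yi in zip(a, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * a_bar
    return sum((yi - b0 - b1 * ai) ** 2 for ai, yi in zip(a, y))

def profile_loglik(a, y, lam):
    """Box-Cox profile log-likelihood, up to an additive constant."""
    n = len(y)
    sse = sse_simple_linear(a, [boxcox(yi, lam) for yi in y])
    jacobian = (lam - 1.0) * sum(math.log(yi) for yi in y)
    return -0.5 * n * math.log(sse / n) + jacobian

def best_lambda(a, y, k_min=-20, k_max=20, step=0.1):
    """Return the grid value of lambda with the highest profile log-likelihood."""
    grid = [k * step for k in range(k_min, k_max + 1)]
    return max(grid, key=lambda lam: profile_loglik(a, y, lam))

# Synthetic data whose logarithm is linear in size, with small alternating noise.
sizes = [0.5 * i for i in range(1, 21)]
responses = [math.exp(0.5 + 0.3 * a + 0.01 * (-1) ** i)
             for i, a in enumerate(sizes)]
lam_hat = best_lambda(sizes, responses)
```

Because the synthetic responses are log-linear by construction, the search should settle on the logarithmic transformation; real inspection data will generally give a less clear-cut answer.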
After fitting the model, the transformation should be
reversed to return to the original data units of the response, $\mathbf{y}$.
Once the $\lambda$ value is estimated and applied to the response,
$\lambda$ is considered fixed. The residuals are an estimate of the
variability that the model cannot explain. When using a simple
linear model, residuals may be examined, or a lack-of-fit test
may be conducted to determine if additional variables beyond
discontinuity size are necessary [3, 17, 18].
If a transformation is applied to $\mathbf{y}$, the same transformation
should also be applied to the variable $y_{dec}$ (which will
be described in Section 2.2), so that they have the same units
(say, $y'_{dec}$). For simplicity, the following equations will refer to $\mathbf{y}$,
but if a transformation is necessary, $\mathbf{y}'$ should be substituted.
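A minimal sketch of this bookkeeping is shown below: the same fixed $\lambda$ is applied to the responses and to the decision threshold, and the inverse transform returns values to the original units. The value of `lam`, the sample responses, and the threshold are hypothetical, and the inverse shown assumes positive responses.

```python
import math

def boxcox(y, lam):
    """Box-Cox transform of a positive value y (Equation 4)."""
    if abs(lam) < 1e-12:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def inv_boxcox(t, lam):
    """Inverse of the Box-Cox transform, back to original response units."""
    if abs(lam) < 1e-12:
        return math.exp(t)
    return (lam * t + 1.0) ** (1.0 / lam)

lam = 0.5                       # hypothetical lambda, treated as fixed
responses = [4.0, 9.0, 16.0]    # hypothetical sensor responses
y_dec = 6.25                    # hypothetical decision threshold

responses_t = [boxcox(y, lam) for y in responses]   # transformed responses
y_dec_t = boxcox(y_dec, lam)                        # threshold in the same units
recovered = [inv_boxcox(t, lam) for t in responses_t]
```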
Lastly, if the data shows evidence of censoring, a modifica-
tion to the likelihood equation is necessary during maximum
likelihood estimation. Censoring often occurs in real-world
NDE measurements. Very large discontinuities may max out
the NDE system’s signals, providing a flat response at the
maximum detection limit, and thus providing a lower bound
on the response, rather than the true value. This is common
if the gain on a system is set high such that small discontinu-
ities are detectable, but large discontinuities may saturate the
detection system. Very small discontinuities that fall below the
detection threshold of the NDE system may return values of
zero, which is also a form of censoring.
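The modification to the likelihood can be sketched as follows, assuming Gaussian errors around a simple linear model. An exact observation contributes the normal density; a saturated reading contributes the probability that the true response lies at or above the recorded limit (right censoring); a reading below the detection threshold contributes the probability that it lies at or below the limit (left censoring). The function name, the status labels, and the data are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def censored_loglik(b0, b1, sigma, data):
    """Gaussian log-likelihood with left- and right-censored observations.

    data: iterable of (a, y, status), where status is "exact", "left",
    or "right". For censored rows, y holds the censoring limit.
    """
    total = 0.0
    for a, y, status in data:
        z = (y - (b0 + b1 * a)) / sigma
        if status == "exact":
            total += math.log(STD_NORMAL.pdf(z) / sigma)
        elif status == "left":       # true response is at or below y
            total += math.log(STD_NORMAL.cdf(z))
        elif status == "right":      # true response is at or above y
            total += math.log(1.0 - STD_NORMAL.cdf(z))
        else:
            raise ValueError(f"unknown status: {status}")
    return total

# Hypothetical data: one exact reading, one below threshold, one saturated.
data = [(2.0, 2.0, "exact"), (0.5, 1.0, "left"), (9.0, 8.0, "right")]
ll = censored_loglik(b0=0.0, b1=1.0, sigma=1.0, data=data)
```

Maximizing this quantity over the intercept, slope, and residual standard deviation (for example with a numerical optimizer) gives the censored maximum likelihood estimates.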
2.2. Transforming the Linear Model into Probability
Assume a simple linear model has been fitted using $\mathbf{y}$ and
$\mathbf{a}$ (discontinuity sizes): $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 a_i$. Note that, by convention,
$a_i$ is now used in place of $x_i$ in the simple linear model. Recall
that $y_i$ is assumed to be normally distributed. A POD curve
consists of a standard normal cumulative distribution function
(CDF), denoted $\Phi$, as the response. Therefore, the first step
requires a transformation from the normal distribution (with
estimated mean $\hat{y}_i$ and estimated residual error variance $\hat{\sigma}_{\varepsilon}^2$)
to the standard normal distribution (with mean 0 and
variance 1), as shown in Equation 5:
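$$y_i = \hat{\sigma}_{\varepsilon} Z_i + \hat{y}_i \sim N(\hat{y}_i, \hat{\sigma}_{\varepsilon}^2) \;\Rightarrow\; Z_i = \frac{1}{\hat{\sigma}_{\varepsilon}}\left(y_i - \hat{y}_i\right) \sim N(\mu = 0, \sigma^2 = 1) \tag{5}$$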
Before performing a POD evaluation, a predefined procedure
incorporating calibration and specific call criteria based
on the system's noise floor is established for the NDE system.
The resulting response level is called the decision threshold, $y_{dec}$. Therefore,
Equation 6 becomes the formula for the POD of a discontinuity
of size $a_i$ [1–4]:
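$$\mathrm{POD}(a_i) = \Phi(Z_i) = \Phi\left(\frac{\hat{y}_i - y_{dec}}{\hat{\sigma}_{\varepsilon}}\right) = \Phi\left(\frac{a_i + (\hat{\beta}_0 - y_{dec})/\hat{\beta}_1}{\hat{\sigma}_{\varepsilon}/\hat{\beta}_1}\right) = \Phi\left(\frac{a_i - \hat{\mu}_{pod}}{\hat{\sigma}_{pod}}\right) \tag{6}$$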
The final expression provides a new mean, $\hat{\mu}_{pod}$, and a new
variance, $\hat{\sigma}_{pod}^2$, as shown in Equation 7:
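$$\hat{\mu}_{pod} = -\frac{\hat{\beta}_0 - y_{dec}}{\hat{\beta}_1} = \frac{y_{dec} - \hat{\beta}_0}{\hat{\beta}_1}, \quad \text{and} \quad \hat{\sigma}_{pod}^2 = \left(\frac{\hat{\sigma}_{\varepsilon}}{\hat{\beta}_1}\right)^2 \tag{7}$$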
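As a quick numerical check of Equations 6 and 7, the sketch below evaluates the POD in two ways: from the fitted response and decision threshold, and from the POD mean and standard deviation. The parameter values are hypothetical, and a positive slope is assumed so that POD increases with discontinuity size.

```python
from statistics import NormalDist

def pod_parameters(b0, b1, sigma_eps, y_dec):
    """POD-curve mean and standard deviation (Equation 7)."""
    mu_pod = (y_dec - b0) / b1
    sigma_pod = sigma_eps / b1
    return mu_pod, sigma_pod

def pod(a, b0, b1, sigma_eps, y_dec):
    """POD(a) written in terms of mu_pod and sigma_pod (last form of Equation 6)."""
    mu_pod, sigma_pod = pod_parameters(b0, b1, sigma_eps, y_dec)
    return NormalDist().cdf((a - mu_pod) / sigma_pod)

def pod_from_response(a, b0, b1, sigma_eps, y_dec):
    """POD(a) written in terms of the fitted response (second form of Equation 6)."""
    y_hat = b0 + b1 * a
    return NormalDist().cdf((y_hat - y_dec) / sigma_eps)

# Hypothetical fitted values and decision threshold, for illustration only.
b0, b1, sigma_eps, y_dec = 1.0, 2.0, 0.5, 3.0
mu_pod, sigma_pod = pod_parameters(b0, b1, sigma_eps, y_dec)
pod_at_mean = pod(mu_pod, b0, b1, sigma_eps, y_dec)
```

A discontinuity of size $\hat{\mu}_{pod}$ has a fitted response exactly at the decision threshold, so its POD is 50%.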
2.3. Estimating a90, the Discontinuity Size at 90% POD
By inverting Equation 6, the discontinuity size associated with
each probability can be calculated. The discontinuity size, $a$,
with POD $p\% = \mathrm{POD}(a)$ is denoted $a_p$ (see Equation 8),
where $\Phi^{-1}(p/100) = z_p$ is calculated using a standard normal
Z-table [1–4].
$$a_p = \Phi^{-1}(\mathrm{POD}(a))\,\hat{\sigma}_{pod} + \hat{\mu}_{pod} = \Phi^{-1}(p)\,\hat{\sigma}_{pod} + \hat{\mu}_{pod} = z_p\,\hat{\sigma}_{pod} + \hat{\mu}_{pod} \tag{8}$$
Equation 9 shows how to find $a_{90}$, the discontinuity
size associated with 90% probability, which corresponds to
$p = 0.90$:
$$a_{90} = z_{0.9}\,\hat{\sigma}_{pod} + \hat{\mu}_{pod} \approx 1.2816\,\hat{\sigma}_{pod} + \hat{\mu}_{pod} \tag{9}$$
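The inversion in Equations 8 and 9 can be checked numerically with the standard normal quantile function instead of a Z-table. In this sketch the probability is passed as a fraction (0.90 rather than 90), and the values of `mu_pod` and `sigma_pod` are hypothetical.

```python
from statistics import NormalDist

def a_p(p, mu_pod, sigma_pod):
    """Discontinuity size whose POD equals p, with p given as a fraction."""
    z_p = NormalDist().inv_cdf(p)
    return z_p * sigma_pod + mu_pod

# Hypothetical POD-curve parameters, for illustration only.
mu_pod, sigma_pod = 1.0, 0.25
a90 = a_p(0.90, mu_pod, sigma_pod)
a50 = a_p(0.50, mu_pod, sigma_pod)
z_90 = NormalDist().inv_cdf(0.90)
```

Feeding `a90` back into the POD curve should return a probability of 0.90, and `a50` should coincide with `mu_pod`.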
For a simple linear regression model, Equations 8 and 9
simplify to Equation 10:
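$$a_p = z_p\left(\frac{\hat{\sigma}_{\varepsilon}}{\hat{\beta}_1}\right) + \frac{y_{dec} - \hat{\beta}_0}{\hat{\beta}_1} \quad \text{and} \quad a_{90} = z_{0.9}\left(\frac{\hat{\sigma}_{\varepsilon}}{\hat{\beta}_1}\right) + \frac{y_{dec} - \hat{\beta}_0}{\hat{\beta}_1} \approx 1.2816\left(\frac{\hat{\sigma}_{\varepsilon}}{\hat{\beta}_1}\right) + \frac{y_{dec} - \hat{\beta}_0}{\hat{\beta}_1} \tag{10}$$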
2.4. Estimating a 95% Confidence Interval and the a90/95
Discontinuity Size
A confidence interval can be estimated for the linear model, in
terms of the three estimated model parameters: $\beta_0$, $\beta_1$, and $\sigma_{\varepsilon}$.
Estimates of the variances and the covariances of the parameters
are also provided in the variance-covariance matrix (see
Equation 11) returned when fitting the linear model.
However, to calculate a confidence interval on the POD
curve, a variance-covariance matrix is needed in terms of the
parameters $\hat{\mu}_{pod}$ and $\hat{\sigma}_{pod}$, as shown in Equation 12:
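$$\mathbf{V}_{pod} = V(\hat{\mu}_{pod}, \hat{\sigma}_{pod}) = \begin{bmatrix} \mathrm{Var}(\hat{\mu}_{pod}) & \mathrm{Cov}(\hat{\mu}_{pod}, \hat{\sigma}_{pod}) \\ \mathrm{Cov}(\hat{\mu}_{pod}, \hat{\sigma}_{pod}) & \mathrm{Var}(\hat{\sigma}_{pod}) \end{bmatrix} \tag{12}$$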
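The passage above does not state how the matrix in Equation 12 is obtained from the fitted-model covariance matrix. One common route, assumed here purely for illustration, is the first-order delta method: the Jacobian of $(\hat{\mu}_{pod}, \hat{\sigma}_{pod})$ from Equation 7 with respect to $(\hat{\beta}_0, \hat{\beta}_1, \hat{\sigma}_{\varepsilon})$ is applied on both sides of the 3 × 3 covariance matrix. The function name, the parameter ordering, and the numerical values are hypothetical.

```python
def pod_covariance(b0, b1, sigma_eps, y_dec, V):
    """Delta-method covariance of (mu_pod, sigma_pod).

    V is the 3x3 covariance matrix of (b0, b1, sigma_eps), in that order.
    This is an assumed approach, not one taken from the text above.
    """
    mu_pod = (y_dec - b0) / b1
    sigma_pod = sigma_eps / b1
    # Rows: partial derivatives of mu_pod and sigma_pod
    # with respect to b0, b1, and sigma_eps.
    J = [[-1.0 / b1, -mu_pod / b1, 0.0],
         [0.0, -sigma_pod / b1, 1.0 / b1]]
    JV = [[sum(J[i][k] * V[k][j] for k in range(3)) for j in range(3)]
          for i in range(2)]
    return [[sum(JV[i][k] * J[j][k] for k in range(3)) for j in range(2)]
            for i in range(2)]

# Hypothetical estimates and a diagonal covariance matrix, for illustration only.
V = [[0.04, 0.0, 0.0],
     [0.0, 0.01, 0.0],
     [0.0, 0.0, 0.0025]]
V_pod = pod_covariance(b0=1.0, b1=2.0, sigma_eps=0.5, y_dec=3.0, V=V)
```

Note that `V_pod` has a nonzero off-diagonal term even though `V` is diagonal, because both POD parameters depend on the slope.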