the discontinuity (hit, $D = 1$) or does not
(miss, $D = 0$). The probability of detection
is the conditional probability:
(2) $\mathrm{POD}(a) = P(D = 1 \mid a)$
Probability distributions such as
Normal (Gaussian), Log-Normal, and
Beta distributions are commonly used to
build POD curves, providing a statistical
foundation for addressing uncertainty.
The selection of appropriate probability
models is crucial for deriving meaningful
POD curves.
• The Normal (Gaussian) distribution
models symmetrical variations around
a mean value, making it well suited
for cases where errors result from
random deviations that are evenly
distributed, such as sensor noise
and measurement fluctuations. It is
frequently applied in ultrasonic testing
or radiographic inspection, where
measurement noise and variability are
common [49].
• The Log-Normal distribution is
effective for modeling asymmetric
uncertainties, where smaller values
occur frequently but large deviations
are possible. Its multiplicative nature
aligns with processes like corrosion
propagation or material wear [50].
• The Beta distribution is commonly used when the variables are naturally constrained: it quantifies detection confidence levels by modeling bounded uncertainties (between 0 and 1). It has proven effective for representing discontinuity detection probabilities, because its bounded nature aligns well with the physical constraints often encountered in NDE inspections, where measurements or probabilities cannot exceed realistic limits [51].
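As an illustration of how a chosen distribution turns into a POD curve, the sketch below assumes a cumulative Log-Normal model, $\mathrm{POD}(a) = \Phi((\ln a - \mu)/\sigma)$, which is one common parametric choice. The function names and the parameter values are hypothetical and are not taken from the cited studies.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def pod_lognormal(a, mu, sigma):
    """Assumed model: POD(a) = Phi((ln a - mu) / sigma), for size a > 0."""
    return STD_NORMAL.cdf((math.log(a) - mu) / sigma)


def size_at_pod(p, mu, sigma):
    """Invert the assumed model: size detected with probability p."""
    return math.exp(mu + sigma * STD_NORMAL.inv_cdf(p))


# Hypothetical parameters: median detectable size 1.0 mm, log-scale spread 0.5.
mu, sigma = math.log(1.0), 0.5

a50 = size_at_pod(0.50, mu, sigma)
a90 = size_at_pod(0.90, mu, sigma)

for a in (0.5, 1.0, 2.0, 4.0):
    print(f"a = {a:.1f} mm -> POD = {pod_lognormal(a, mu, sigma):.3f}")
print(f"a50 = {a50:.3f} mm, a90 = {a90:.3f} mm")
```

In practice, $\mu$ and $\sigma$ would be fitted to hit/miss or signal-response data rather than set by hand.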
RELIABILITY-BASED METHODS FOR
DISCONTINUITY DETECTION
Reliability-based UQ methods evaluate
the probability of discontinuity detection
failure by incorporating probabilistic
models and statistical reliability analysis.
Instead of merely identifying a disconti-
nuity, reliability-based UQ predicts the
likelihood of a detected discontinuity
leading to failure under various operat-
ing conditions, making these methods
essential in industries with strict safety
requirements.
First-order and second-order reliability methods (FORM and SORM) are typical reliability approaches that approximate the boundary between safe and failure conditions. Instead of considering every possible scenario, FORM locates the most probable point (MPP) on the limit-state surface, $g(X) = 0$, by transforming the original variables into standard-normal U-space and then linearizing at that point via a first-order Taylor expansion. Central to this method is the Hasofer–Lind reliability index $\beta$, which quantifies the shortest distance from the origin to the limit-state surface in standard normal space [52]. The failure probability is approximated by:
(3) $P_f \approx \Phi(-\beta)$
where $\Phi$ is the standard-normal CDF.
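A minimal sketch of the FORM search is given below, using the Hasofer–Lind/Rackwitz–Fiessler (HL-RF) iteration, which is one standard way to locate the MPP. It assumes independent normal inputs, so the U-space transformation is simply $x_i = \mu_i + \sigma_i u_i$, and it assumes the origin lies in the safe region. The limit state (capacity minus demand) and all numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def form_hlrf(g, mu, sigma, tol=1e-6, max_iter=100, h=1e-6):
    """Return (beta, u_star) for limit state g(x) with independent normal x."""

    def g_u(u):
        # Map standard-normal u back to physical x (independent normals assumed).
        return g([m + s * ui for m, s, ui in zip(mu, sigma, u)])

    u = [0.0] * len(mu)  # start the search at the origin of U-space
    for _ in range(max_iter):
        g0 = g_u(u)
        grad = []
        for i in range(len(u)):  # central-difference gradient in U-space
            up, um = list(u), list(u)
            up[i] += h
            um[i] -= h
            grad.append((g_u(up) - g_u(um)) / (2 * h))
        norm2 = sum(c * c for c in grad)
        # HL-RF update: project onto the linearized limit state g = 0.
        scale = (sum(c * ui for c, ui in zip(grad, u)) - g0) / norm2
        u_new = [scale * c for c in grad]
        step = math.sqrt(sum((a - b) ** 2 for a, b in zip(u_new, u)))
        u = u_new
        if step < tol:
            break
    beta = math.sqrt(sum(ui * ui for ui in u))  # distance from origin to MPP
    return beta, u


def limit_state(x):
    """Hypothetical limit state: failure when capacity - demand <= 0."""
    capacity, demand = x
    return capacity - demand


# Hypothetical means and standard deviations for capacity and demand.
beta, u_star = form_hlrf(limit_state, mu=[200.0, 120.0], sigma=[20.0, 25.0])
pf = NormalDist().cdf(-beta)  # Equation 3: P_f is approximately Phi(-beta)
print(f"beta = {beta:.4f}, Pf = {pf:.3e}")
```

Because this limit state is linear in normal variables, the FORM result coincides with the exact failure probability; for curved limit states it is only a first-order approximation.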
SORM improves on FORM by adding the second-order terms of the Taylor series, fitting a curved surface at the MPP [53]. It corrects FORM’s linear approximation with curvature factors derived from the principal radii of curvature of the limit-state surface, yielding more accurate failure probabilities for highly nonlinear problems, as illustrated in Figure 3.
Recent innovations have enhanced
FORM’s capabilities by combining it
with complementary techniques. For
instance, Zhu and Xiang [54] paired
FORM with the stochastic pseudo-
excitation method (SPEM) to improve
dynamic reliability analysis under
random excitation. Such hybrid
approaches address FORM’s limitations
in handling complex system behaviors,
expanding its applicability in NDE.
Monte Carlo reliability analysis
(MCRA) is another prominent reli-
ability method for UQ. It operates on
the principle of random sampling to
estimate failure probabilities, making
it particularly effective for complex or
high-dimensional problems where tradi-
tional methods like FORM may struggle.
The strength of MCRA lies in its ability
to handle nonlinear and discontinuous
performance functions without relying
on gradient-based approximations
[55]. Its versatility extends to various
NDE contexts, including structural reli-
ability analysis [56] and evaluation of
cyber-physical systems [57], where it can
simulate complex failure modes and
operational constraints.
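The sampling principle can be sketched in a few lines. The example below is a crude Monte Carlo estimate, assuming a hypothetical capacity-minus-demand limit state with independent normal inputs; the function names, distributions, and sample size are illustrative only.

```python
import random


def mc_failure_probability(g, sample_inputs, n_samples, seed=0):
    """Estimate P(g(X) <= 0) by random sampling; return (pf, standard error)."""
    rng = random.Random(seed)  # seeded generator for reproducibility
    failures = 0
    for _ in range(n_samples):
        if g(sample_inputs(rng)) <= 0.0:
            failures += 1
    pf = failures / n_samples
    std_err = (pf * (1.0 - pf) / n_samples) ** 0.5  # binomial standard error
    return pf, std_err


def limit_state(x):
    """Hypothetical limit state: failure when capacity - demand <= 0."""
    capacity, demand = x
    return capacity - demand


def sample_inputs(rng):
    """Draw one realization of the hypothetical independent normal inputs."""
    return rng.gauss(200.0, 20.0), rng.gauss(120.0, 25.0)


pf, std_err = mc_failure_probability(limit_state, sample_inputs, 200_000, seed=42)
print(f"Pf = {pf:.4e} +/- {std_err:.1e} (one standard error)")
```

No gradient of the limit state is needed, which is why the approach tolerates nonlinear or discontinuous performance functions. The cost is that the standard error shrinks only with the square root of the sample size, which motivates the adaptive sampling strategies mentioned in the text.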
Current research gaps include
the need for improved integration of
FORM and Monte Carlo methods and
the development of adaptive sampling
strategies to enhance computational effi-
ciency. Machine learning advancements
could further refine reliability-based
UQ in NDE by optimizing discontinu-
ity detection under uncertainty [58].
Addressing these challenges will advance
the reliability and accuracy of NDE tech-
niques for complex engineering systems.
Statistical Approaches
Statistical UQ methods provide probabi-
listic evaluations of detection accuracy,
discontinuity size estimation, and mea-
surement noise effects. By integrating
statistical techniques such as measure-
ment uncertainty analysis, confidence
intervals, and resampling methods,
engineers can enhance sensor calibra-
tion, discontinuity characterization,
and decision-making in various NDE
applications.
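[Figure 3. Comparison of first-order and second-order reliability methods (FORM and SORM) [53]. Axes: $U_1$ and $U_2$ in standard-normal space. Labeled elements: origin $o$, reliability index $\beta$, MPP $u^*$, the limit state $g = 0$ with the regions on either side of it, and the FORM and SORM approximations.]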
NDT TUTORIAL
|
UA&UQ
28
MATERIALS EVALUATION, AUGUST 2025
MEASUREMENT UNCERTAINTY
ANALYSIS
Measurement uncertainty analysis,
based on the Guide to the Expression
of Uncertainty in Measurement (GUM)
framework [59], quantifies both
systematic and random errors in NDE
inspections. It is essential in engineering
metrology, ensuring that measurement
deviations are accurately quantified and
reported. Unlike methods that assume
a predefined probability model, this
approach systematically identifies and
combines uncertainty sources—such as
instrument calibration, environmental
conditions, and operator variability—into
a comprehensive uncertainty budget.
For example, if we want to estimate
discontinuity depth $d$ from an ultrasonic
echo, the workflow of the GUM-based
uncertainty analysis is as follows:
• Identify input quantities. List every variable/uncertainty that influences $d$, such as sound velocity, time of flight, probe angle, temperature, and calibration block tolerances.
• Classify and evaluate. Each input is then classified: Type A inputs come from statistical data obtained through repeat scans, whereas Type B inputs rely on specifications or expert knowledge. Each input is assigned a standard uncertainty $u(x_i)$.
• Uncertainty propagation. Build a measurement model $d = f(x_1, x_2, \ldots)$ to propagate these individual uncertainties. Insert the measured (or nominal) values of all inputs into $f$ to obtain the best point estimate $\hat{d} = f(\hat{x}_1, \hat{x}_2, \ldots)$, which is the central value about which the confidence interval will be built. Linearize or run a Monte Carlo simulation to obtain the combined standard uncertainty:
(4) $U_c(d) = \sqrt{\sum_i \left( \dfrac{\partial f}{\partial x_i}\, u(x_i) \right)^2}$
• Expand uncertainty. Multiply $U_c(d)$ by a coverage factor $k$ (typically $k = 2$ for approximately 95% confidence) to get the expanded uncertainty, $U = k\,U_c(d)$, which is used in the final uncertainty statement.
• Uncertainty estimation. Report the inspection result as $d = \hat{d} \pm U$.
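The steps above can be sketched numerically. The example assumes a simple pulse-echo model, $d = v\,t/2$, with only two inputs (sound velocity $v$ and time of flight $t$) that are taken to be uncorrelated; the model, the input values, and their standard uncertainties are hypothetical. Sensitivity coefficients are obtained by central differences rather than analytically.

```python
import math


def combined_standard_uncertainty(f, x_hat, u_x, rel_step=1e-6):
    """Linearized propagation (Equation 4), assuming uncorrelated inputs."""
    contributions = []
    for i, (xi, ui) in enumerate(zip(x_hat, u_x)):
        h = rel_step * abs(xi) if xi != 0 else rel_step
        x_plus, x_minus = list(x_hat), list(x_hat)
        x_plus[i] = xi + h
        x_minus[i] = xi - h
        sensitivity = (f(x_plus) - f(x_minus)) / (2 * h)  # df/dx_i
        contributions.append(sensitivity * ui)
    return math.sqrt(sum(c * c for c in contributions)), contributions


def depth_model(x):
    """Assumed measurement model: pulse-echo depth d = v * t / 2."""
    velocity, time_of_flight = x
    return velocity * time_of_flight / 2.0


# Hypothetical best estimates and standard uncertainties (SI units).
x_hat = [5900.0, 6.78e-6]  # velocity in m/s, time of flight in s
u_x = [30.0, 0.02e-6]      # e.g. Type B for velocity, Type A for time

d_hat = depth_model(x_hat)  # best point estimate of depth
u_c, contributions = combined_standard_uncertainty(depth_model, x_hat, u_x)

k = 2.0      # coverage factor, approximately 95% confidence
U = k * u_c  # expanded uncertainty

print(f"d = {d_hat * 1e3:.2f} mm +/- {U * 1e3:.2f} mm (k = {k:g})")
for name, c in zip(("velocity", "time of flight"), contributions):
    print(f"  contribution from {name}: {abs(c) * 1e3:.3f} mm")
```

Listing the individual contributions is the uncertainty budget in miniature: it shows which input dominates and therefore where calibration effort is best spent.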
Using GUM makes NDE uncertainty
statements transparent, traceable, and
comparable across laboratories and
inspection procedures. For example, in
phased array ultrasonic testing (PAUT)
for weld inspections, uncertainties arise
from variations in ultrasonic wave speed,
probe positioning, and couplant layer
thickness, all of which affect discontinuity
sizing accuracy. GUM has been
applied to analyze the uncertainty of
beam parameters in NDE [60].
Recent research explores the integra-
tion of advanced computational tech-
niques such as deep learning to improve
uncertainty quantification in dynamic
inspection environments [24].
CONFIDENCE INTERVALS AND
HYPOTHESIS TESTING
Confidence intervals (CIs) and hypoth-
esis testing (HT) provide a statistical
framework for quantifying uncertainty
in discontinuity detection and material
property measurements. Confidence
intervals estimate the probable range
where a discontinuity parameter, such
as discontinuity size or wall thickness,
is likely to fall, providing a measure of
quantification precision. Hypothesis
testing evaluates whether a detected
discontinuity is statistically significant
or merely an artifact caused by sensor
noise, material variability, or operator
inconsistencies, helping to reduce false
positives in inspections.
For instance, bootstrap confidence
intervals are used in process capability
analysis to evaluate performance indices
for non-normal data, as shown by Kashif
et al. [61] and Rao et al. [62]. These
methods provide reliable coverage prob-
abilities and narrower interval widths,
making them suitable for asymmetric
data distributions common in NDE.
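A minimal sketch of a percentile bootstrap CI is shown below. It is not the specific procedure of the cited studies; the right-skewed sizing data, the number of resamples, and the function name are hypothetical.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for stat(data) at confidence level 1 - alpha."""
    rng = random.Random(seed)  # seeded generator for reproducibility
    n = len(data)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    lo_idx = int((alpha / 2) * n_boot)          # approximate lower percentile
    hi_idx = int((1 - alpha / 2) * n_boot) - 1  # approximate upper percentile
    return estimates[lo_idx], estimates[hi_idx]


# Hypothetical right-skewed discontinuity sizes in mm.
sizes_mm = [0.8, 0.9, 1.0, 1.0, 1.1, 1.2, 1.3, 1.5, 1.9, 2.6, 3.4, 5.2]

lo, hi = bootstrap_ci(sizes_mm)
print(f"mean = {statistics.mean(sizes_mm):.3f} mm, 95% CI = ({lo:.3f}, {hi:.3f}) mm")
```

Because the interval is read directly from the resampled distribution, it need not be symmetric about the sample mean, which is the property that makes it attractive for skewed data.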
Hypothesis testing is particularly
useful in NDE applications where mea-
surement variability can lead to false
indications. In radiographic testing (RT)
of welded joints, HT distinguishes actual
discontinuities from false signals caused
by scatter radiation or imaging incon-
sistencies. By reducing false alarms and
misclassifications, HT enhances inspec-
tion accuracy and helps prevent unnec-
essary repairs [63].
Generally, CIs quantify how precisely
the discontinuity size (or other mea-
surand) is known, while HT provides a
yes-or-no decision against an acceptance
criterion, with controlled false alarm risk.
It is often beneficial to combine both
methods to deliver a complete, safety-
focused reliability assessment for NDE
inspections.
For example, to assess discontinuity depth from an ultrasonic echo against an acceptance threshold $d_{\mathrm{crit}}$, the workflow is as follows:
• Collect representative data and compute descriptive statistics. Acquire repeated measurements $x_1, x_2, \ldots, x_n$ under controlled, reproducible conditions. Calculate the mean $\bar{x}$ and standard deviation $s$.
• Construct the CI. Select the confidence level $1 - \alpha$ (commonly 95%, that is, $\alpha = 0.05$). The two-sided CI for the mean is:
(5) $\mathrm{CI} = \bar{x} \pm t_{n-1,\,\alpha/2}\,\dfrac{s}{\sqrt{n}}$
where $t_{n-1,\,\alpha/2}$ is the Student’s $t$-value for $n - 1$ degrees of freedom, and $t_{n-1,\,\alpha/2}\, s/\sqrt{n}$ is the half-width, which is the amount added and subtracted to create the confidence band around that mean.
• Formulate hypotheses. Define the null and alternative hypotheses:
$H_0\colon \mu \le d_{\mathrm{crit}}$ (Discontinuity acceptance)
$H_1\colon \mu > d_{\mathrm{crit}}$ (Discontinuity rejection)
• Calculate test statistic. Choose significance level $\alpha$ (commonly 0.05), and then calculate the one-sample $t$ statistic:
(6) $t = \dfrac{\bar{x} - d_{\mathrm{crit}}}{s/\sqrt{n}}$
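• Decision-making. Compare the test statistic with the critical $t$-value at the chosen significance level to decide whether to reject $H_0$. The CI quantifies measurement precision, while HT provides a rigorously justified pass/fail call. It is useful to combine both results to strengthen the reliability assessment in NDE inspections.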
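The workflow can be sketched end to end as follows. The depth readings and the threshold are hypothetical, and the test is assumed to be one-sided ($H_0\colon \mu \le d_{\mathrm{crit}}$ against $H_1\colon \mu > d_{\mathrm{crit}}$). To keep the example free of third-party dependencies, the Student's $t$ distribution is evaluated by simple numerical integration; a statistics library would normally supply these values.

```python
import math
import statistics


def t_pdf(x, nu):
    """Student's t probability density with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)


def t_cdf(x, nu, steps=2000):
    """Student's t CDF: Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, nu):
    """Invert t_cdf by bisection on a wide bracket."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical repeated depth readings (mm) and acceptance threshold (mm).
depths_mm = [2.91, 3.05, 2.98, 3.10, 2.95, 3.02, 2.99, 3.07, 2.93, 3.00]
d_crit = 3.20
alpha = 0.05

n = len(depths_mm)
x_bar = statistics.mean(depths_mm)
s = statistics.stdev(depths_mm)  # sample standard deviation (n - 1 divisor)
sem = s / math.sqrt(n)           # standard error of the mean

# Equation 5: two-sided CI for the mean.
half_width = t_quantile(1 - alpha / 2, n - 1) * sem
ci = (x_bar - half_width, x_bar + half_width)

# Equation 6: one-sample t statistic; one-sided p-value for H1: mu > d_crit.
t_stat = (x_bar - d_crit) / sem
p_value = 1.0 - t_cdf(t_stat, n - 1)
reject_h0 = p_value < alpha

print(f"mean = {x_bar:.3f} mm, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}) mm")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
print("reject H0 (discontinuity rejected)" if reject_h0
      else "fail to reject H0 (discontinuity accepted)")
```

With these illustrative readings, the whole interval lies below the threshold and the test does not reject $H_0$, so the two results agree; a CI that straddles $d_{\mathrm{crit}}$ would signal that more measurements are needed before a confident call.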