MEASUREMENT UNCERTAINTY ANALYSIS
Measurement uncertainty analysis,
based on the Guide to the Expression
of Uncertainty in Measurement (GUM)
framework [59], systematically assesses
systematic and random errors in NDE
inspections. It is essential in engineering
metrology, ensuring that measurement
deviations are accurately quantified and
reported. Unlike methods that assume
a predefined probability model, this
approach systematically identifies and
combines uncertainty sources—such as
instrument calibration, environmental
conditions, and operator variability—into
a comprehensive uncertainty budget.
For example, if we want to estimate the discontinuity depth \( d \) from an ultrasonic echo, the workflow of the GUM-based uncertainty analysis is as follows:
Ñ Identify input quantities. List every variable and uncertainty source that influences \( d \), such as sound velocity, time of flight, probe angle, temperature, and calibration block tolerances.
Ñ Classify and evaluate. Each input is then classified. Type A inputs come from statistical data obtained through repeat scans; Type B inputs rely on specifications or expert knowledge. Each input is assigned a standard uncertainty \( u(x_i) \).
Ñ Uncertainty propagation. Build a measurement model \( d = f(x_1, x_2, \ldots) \) to propagate these individual uncertainties. Insert the measured (or nominal) values of all inputs into \( f \) to obtain the best point estimate \( \hat{d} = f(\hat{x}_1, \hat{x}_2, \ldots) \), which is the central value about which the confidence interval will be built. Linearize or run a Monte Carlo simulation to obtain the combined standard uncertainty:

(4) \( u_c(d) = \sqrt{\sum_i \left( \frac{\partial f}{\partial x_i}\, u(x_i) \right)^2} \)
Ñ Expand uncertainty. Multiply \( u_c(d) \) by a coverage factor \( k \) (typically \( k = 2 \) for 95% confidence) to get the expanded uncertainty \( U = k\, u_c(d) \).
Ñ Uncertainty estimation. Report the inspection result as \( d = \hat{d} \pm U \).
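The linearized GUM workflow above can be sketched in a few lines of Python. The measurement model \( d = v t / 2 \) (ultrasonic pulse-echo depth from sound velocity and time of flight) and all numerical values below are illustrative assumptions, not data from the text:

```python
import math

# Hypothetical inputs for an ultrasonic depth measurement d = v * t / 2:
# sound velocity v and time of flight t, each with a standard uncertainty.
v, u_v = 5900.0, 15.0      # m/s (Type B, from a calibration certificate)
t, u_t = 6.8e-6, 0.05e-6   # s   (Type A, from repeat scans)

# Best point estimate d_hat = f(v_hat, t_hat)
d_hat = v * t / 2.0

# Linearized propagation (Equation 4): sensitivity coefficients df/dx_i
dd_dv = t / 2.0
dd_dt = v / 2.0
u_c = math.sqrt((dd_dv * u_v) ** 2 + (dd_dt * u_t) ** 2)

# Expanded uncertainty with coverage factor k = 2 (~95% confidence)
k = 2.0
U = k * u_c
print(f"d = {d_hat * 1e3:.3f} mm +/- {U * 1e3:.3f} mm (k = {k})")
```

Replacing the hand-coded sensitivity coefficients with repeated random draws of the inputs gives the Monte Carlo variant of the same propagation step.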
Using GUM makes NDE uncertainty
statements transparent, traceable, and
comparable across laboratories and
inspection procedures. For example, in
phased array ultrasonic testing (PAUT)
for weld inspections, uncertainties arise
from variations in ultrasonic wave speed,
probe positioning, and couplant layer
thickness—all of which affect discon-
tinuity sizing accuracy. GUM has been
applied to analyze the uncertainty of
beam parameters in NDE [60].
Recent research explores the integra-
tion of advanced computational tech-
niques such as deep learning to improve
uncertainty quantification in dynamic
inspection environments [24].
CONFIDENCE INTERVALS AND
HYPOTHESIS TESTING
Confidence intervals (CIs) and hypoth-
esis testing (HT) provide a statistical
framework for quantifying uncertainty
in discontinuity detection and material
property measurements. Confidence
intervals estimate the probable range
where a discontinuity parameter, such
as discontinuity size or wall thickness,
is likely to fall, providing a measure of
quantification precision. Hypothesis
testing evaluates whether a detected
discontinuity is statistically significant
or merely an artifact caused by sensor
noise, material variability, or operator
inconsistencies, helping to reduce false
positives in inspections.
For instance, bootstrap confidence
intervals are used in process capability
analysis to evaluate performance indices
for non-normal data, as shown by Kashif
et al. [61] and Rao et al. [62]. These
methods provide reliable coverage prob-
abilities and narrower interval widths,
making them suitable for asymmetric
data distributions common in NDE.
Hypothesis testing is particularly
useful in NDE applications where mea-
surement variability can lead to false
indications. In radiographic testing (RT)
of welded joints, HT distinguishes actual
discontinuities from false signals caused
by scatter radiation or imaging incon-
sistencies. By reducing false alarms and
misclassifications, HT enhances inspec-
tion accuracy and helps prevent unnec-
essary repairs [63].
Generally, CIs quantify how precisely
the discontinuity size (or other mea-
surand) is known, while HT provides a
yes-or-no decision against an acceptance
criterion, with controlled false alarm risk.
It is often beneficial to combine both
methods to deliver a complete, safety-
focused reliability assessment for NDE
inspections.
For example, to set the acceptance threshold \( d_{\mathrm{crit}} \) for discontinuity depth from an ultrasonic echo, the workflow is as follows:
Ñ Collect representative data and compute descriptive statistics. Acquire repeated measurements \( x_1, x_2, \ldots, x_n \) under controlled, reproducible conditions. Calculate the mean \( \bar{x} \) and standard deviation \( s \).
Ñ Construct the CI. Select a confidence level \( 1 - \alpha \) (commonly 95%). The two-sided CI for the mean is:

(5) \( \mathrm{CI} = \bar{x} \pm t_{n-1,\,\alpha/2}\, \frac{s}{\sqrt{n}} \)

where \( t_{n-1,\,\alpha/2} \) is the Student's \( t \)-value for \( n - 1 \) degrees of freedom, and \( t_{n-1,\,\alpha/2}\, s/\sqrt{n} \) is the half-width, which is the amount added and subtracted to create the confidence band around the mean.
Ñ Formulate hypotheses. Define the null and alternative hypotheses:

\( H_0\!: \mu \le d_{\mathrm{crit}} \) (discontinuity acceptance)
\( H_1\!: \mu > d_{\mathrm{crit}} \) (discontinuity rejection)
Ñ Calculate test statistic. Choose a significance level \( \alpha \) (commonly 0.05), and then calculate the one-sample \( t \) statistic:

(6) \( t = \frac{\bar{x} - d_{\mathrm{crit}}}{s/\sqrt{n}} \)
Ñ Decision-making. The CI quantifies
measurement precision, while HT
provides a rigorously justified pass/
fail call. It is useful to combine both
results to strengthen the reliability
assessment in NDE inspections.
AUGUST 2025 | MATERIALS EVALUATION | 29
For the CI check, if the confidence interval \( \bar{x} \pm \) half-width lies entirely beyond the critical limit \( d_{\mathrm{crit}} \), the discontinuity is deemed too large and is flagged. For the HT check, compare the calculated statistic \( t \) with the critical value \( t_{n-1,\,\alpha} \) (or use the P-value). Reject \( H_0 \) if \( t \) is larger; otherwise, fail to reject \( H_0 \).
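The CI and HT steps above can be sketched with SciPy's Student's \( t \) quantiles. The repeated depth measurements and the limit \( d_{\mathrm{crit}} \) below are hypothetical values chosen for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical repeated depth measurements (mm) and acceptance limit d_crit.
x = np.array([4.9, 5.1, 5.3, 5.0, 5.2, 5.4, 5.1, 5.2])
d_crit = 5.5          # mm, critical discontinuity depth
alpha = 0.05

n = len(x)
x_bar = x.mean()
s = x.std(ddof=1)     # sample standard deviation

# Two-sided CI for the mean (Equation 5): x_bar +/- t_{n-1, alpha/2} * s / sqrt(n)
half_width = stats.t.ppf(1 - alpha / 2, df=n - 1) * s / np.sqrt(n)
ci = (x_bar - half_width, x_bar + half_width)

# One-sample t statistic (Equation 6) against H0: mu <= d_crit
t_stat = (x_bar - d_crit) / (s / np.sqrt(n))
t_crit = stats.t.ppf(1 - alpha, df=n - 1)   # one-sided critical value

reject_h0 = t_stat > t_crit                 # flag the discontinuity if True
print(f"CI = ({ci[0]:.3f}, {ci[1]:.3f}) mm, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

With these example values the mean depth sits well below the limit, so the null hypothesis (discontinuity acceptance) is not rejected.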
BOOTSTRAP RESAMPLING FOR
UNCERTAINTY ESTIMATION
Bootstrap resampling is a nonparamet-
ric statistical method used to estimate
uncertainty by resampling measured
data to construct empirical confidence
intervals. Unlike parametric methods
that assume a specific probability
distribution, bootstrap resampling is
effective for non-Gaussian data, small
sample sizes, and variable discon-
tinuity characteristics. By randomly
sampling with replacement and recal-
culating discontinuity parameters
across multiple datasets, this method
quantifies uncertainty without requir-
ing predefined data assumptions. Its
effectiveness is demonstrated in NDE
applications such as image registra-
tion uncertainty estimation, where it
outperforms traditional approaches
like the Cramér–Rao bound [64], and
in metrology, where it reduces bias in
small-sample scenarios [65].
The basic process can be described
as follows:
Ñ Collect baseline data. Obtain an original sample of inspection results \( x_1, x_2, \ldots, x_n \) (e.g., signal amplitudes, discontinuity sizes, POD hit/miss data).
Ñ Resample with replacement. Generate \( B \) bootstrap samples. For each \( b = 1, \ldots, B \), draw \( n \) observations from the baseline data with replacement. Then, compute the statistic of interest \( T^{*(b)} \) (e.g., mean discontinuity depth, POD at a given discontinuity size).
Ñ Build the empirical distribution. The set \( \{ T^{*(1)}, \ldots, T^{*(B)} \} \) approximates the sampling distribution of the statistic without assuming any parametric model.
Ñ Estimate bias and standard error:

(7) \( \mathrm{Bias} = \bar{T}^{*} - T, \quad \mathrm{SE} = \sqrt{\frac{1}{B-1} \sum_{b=1}^{B} \left( T^{*(b)} - \bar{T}^{*} \right)^{2}} \)

where \( T \) is the statistic from the original data, and \( \bar{T}^{*} \) is the bootstrap mean.
Ñ Form confidence intervals (CIs). Using the percentile method, define the confidence interval using the \( \alpha/2 \) and \( 1 - (\alpha/2) \) quantiles of the bootstrap distribution, such as \( [\,T^{*}_{\alpha/2},\, T^{*}_{1-(\alpha/2)}\,] \). The best estimate and its uncertainty can be expressed as:

(8) \( T = \bar{T}^{*} \pm \) half-width

where the half-width is half the distance between the lower and upper limits of the selected bootstrap interval, representing the 95% CI.
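The bootstrap procedure above can be sketched as follows. The baseline depths, the number of resamples \( B \), and the choice of the sample mean as the statistic \( T \) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical baseline inspection data: measured discontinuity depths (mm).
x = np.array([4.8, 5.2, 5.0, 5.6, 4.9, 5.3, 5.1, 5.4, 5.0, 5.2])
T = x.mean()                       # statistic from the original data
B = 5000                           # number of bootstrap samples
alpha = 0.05

# Resample with replacement and recompute the statistic B times
T_star = np.array([rng.choice(x, size=len(x), replace=True).mean()
                   for _ in range(B)])

bias = T_star.mean() - T                                   # Equation 7
se = T_star.std(ddof=1)
lo, hi = np.quantile(T_star, [alpha / 2, 1 - alpha / 2])   # percentile CI
half_width = (hi - lo) / 2                                 # Equation 8

print(f"T = {T:.3f} mm +/- {half_width:.3f} mm "
      f"(95% bootstrap CI [{lo:.3f}, {hi:.3f}])")
```

The same loop works for any statistic; replacing `.mean()` with a POD estimator, for example, bootstraps a POD value instead.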
Simulation-Based UQ Methods
in NDE
Another widely used approach for ana-
lyzing UQ in NDE is simulation-based
UQ methods, which allow engineers to
model the physical interactions between
inspection techniques and inspected
materials. Unlike purely statistical
methods, simulation-based UQ propa-
gates input variability through a numerical model of the inspection; therefore, the
full distribution of the NDE output can be
evaluated without closed-form formulas.
Three widely used approaches in
NDE are Monte Carlo simulation (MCS)
[66], Bayesian inference via Markov
chain Monte Carlo (MCMC) [67], and
polynomial chaos expansion (PCE)
[68]. Those methods share the same
core idea: begin by assigning prob-
ability distributions to all uncertain
inputs (e.g., material properties, dis-
continuity geometry, sensor settings)
and then propagate those input uncer-
tainties through a forward NDE model.
Whether by direct random sampling
(MCS), posterior sampling (Bayesian
MCMC), or spectral projection (PCE),
each method generates multiple sim-
ulated responses, which are then used
to calculate summary statistics such as
the average outcome, its variance, and
uncertainty bounds. In other words, they
all transform uncertainties in inputs into
quantified uncertainties in inspection
responses. The comparative process is
presented in Figure 4.
MONTE CARLO SIMULATION FOR NDE
MCS is a probabilistic method that helps
estimate failure probabilities and detec-
tion uncertainties by simulating a wide
range of inspection scenarios under dif-
ferent conditions. Instead of relying on
a single fixed outcome, MCS relies on
repeated random sampling to approx-
imate the probability distributions of
output quantities, making it particularly
useful for complex systems where ana-
lytical solutions are intractable.
Key advantages are its flexibility
in handling nonlinear and correlated
inputs and its ease of implementation
[66], making it suitable for a wide range
of NDE applications. In addition, all
the independent samples run concur-
rently, making the method well suited to
modern multicore and GPU hardware.
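As a sketch of direct random sampling, consider an MCS for the ultrasonic depth model \( d = v t / 2 \); the input distributions below are assumed for illustration and all draws are independent, so the loop vectorizes trivially:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 100_000   # independent samples; trivially parallelizable

# Assign probability distributions to the uncertain inputs (illustrative values):
v = rng.normal(5900.0, 15.0, N)        # sound velocity, m/s
t = rng.normal(6.8e-6, 0.05e-6, N)     # time of flight, s

# Propagate each sampled input set through the forward model d = v * t / 2
d = v * t / 2.0 * 1e3                  # depth in mm

# Summary statistics of the output distribution
mean_d = d.mean()
lo, hi = np.quantile(d, [0.025, 0.975])   # 95% uncertainty bounds
print(f"mean depth = {mean_d:.3f} mm, 95% interval = [{lo:.3f}, {hi:.3f}] mm")
```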
For instance, MCS has been applied
to estimate discontinuity detection
[Figure 4 shows three panels: S1: Input signal; S2: Simulation (MCS: direct random sampling; MCMC: posterior distribution; PCE: surrogate model); S3: Response (mean response, 95% confidence interval).]
Figure 4. Illustrative process of three simulation-based UQ methods: Monte Carlo simulation (MCS), Bayesian inference via Markov chain Monte Carlo (MCMC), and polynomial chaos expansion (PCE).
NDT TUTORIAL | UA&UQ