a three-layer architecture was feasible with an inference time of 1.08 ms (SD = 0.15 ms), while a four-layer architecture was infeasible at 1.34 ms per A-scan (SD = 0.13 ms). Finally, with 32 filters (Figure 5c), architectures with up to two layers were feasible at 0.97 ms per A-scan (SD = 0.27 ms), while a three-layer architecture had an inference time of 1.26 ms (SD = 0.13 ms). In terms of parameter count, the largest feasible architecture was three layers with 16 filters in the first layer, yielding 100 997 parameters. Thus, this architecture was used for all subsequent experimentation.

Performance Evaluation

The architecture selected in the feasibility study performed strongly overall (Table 2). Within 10 ms of the event, sensitivity for the four events ranged from 0.742 (SD = 0.023) to 0.935 (SD = 0.005). Within 30 ms, sensitivity reached at least 0.977 for all events except saturation, which peaked at 0.898 (SD = 0.009). Specificity was highest for expulsion events at 0.986 (SD = 0.002), while the melting and SSID events had proportionally more false positives, with specificities of 0.807 (SD = 0.025) and 0.821 (SD = 0.031), respectively.

Event detectability curves were similar for melting and SSID, while curves for expulsion and saturation differed greatly. The models were consistent with respect to each event across all timing error windows (Figure 6). Expulsion detection reached an asymptote at approximately the 5 ms error window, melting and SSID at approximately 15 ms, and saturation at approximately 30 ms of absolute error.

Example distributions of timing error for melting (Figure 7a) and SSID (Figure 7b) events were highly symmetrical, centered at approximately zero with relatively mild variance. Saturation (Figure 7c), on the other hand, yielded a timing error distribution with slight negative skew and greater variance, centered slightly above zero.
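As an illustration of how timing error distributions like those in Figure 7 might be summarized numerically, the following is a minimal sketch. It assumes that signed error is defined as predicted time minus ground truth time (the text does not state the sign convention); the function names and any sample values are hypothetical and not taken from the study.

```python
from statistics import mean, pstdev


def timing_errors(predicted_ms, truth_ms):
    """Signed timing error per detected event.

    Assumption: error = predicted time - ground truth time, in ms.
    """
    return [p - t for p, t in zip(predicted_ms, truth_ms)]


def skewness(values):
    """Population skewness (third standardized moment).

    Returns 0.0 for a perfectly symmetric sample; negative values
    indicate a longer tail toward negative errors.
    """
    m = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return 0.0
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)


def summarize(errors):
    """Center, spread, and asymmetry of a timing error distribution."""
    return {
        "mean": mean(errors),
        "sd": pstdev(errors),
        "skew": skewness(errors),
    }
```

Under this convention, a mean slightly above zero together with a negative skew would correspond to the pattern described for saturation, while a small standard deviation would correspond to a tight distribution.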
Expulsion timing error (Figure 7d) yielded an extremely tight distribution centered just above zero. All models yielded similar error distributions.

Model outputs plotted over time and compared against ground truth data (Figure 8) were stable and smooth on relatively clear M-scans, while output noise increased with decreasing M-scan quality. Overall, the models were insensitive to reasonable amounts of electromagnetic noise, weld time, stackup, and weld quality. Output curves for MNS were smooth and consistent with ground truth curves in terms of shape and position. In addition, welds without nugget formation (e.g., Figure 8a) were often correctly characterized. In general, welds with extremely late nugget formation (e.g., Figure 8b) were more difficult to characterize than those with earlier nugget formation (Figure 8c).

Discussion

A fast and performant approach was developed for real-time interpretation of data from ultrasonic RSW process monitoring, with the aim of using deep learning to create actionable feedback for a weld controller. All events were reliably detected: over 95% of events were detected within 18 ms of ground truth for every event type except saturation, which was detected at a rate of 90% within 30 ms.
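Statements such as "over 95% of events were detected within 18 ms" can be read off a detectability curve of sensitivity against timing error window. The sketch below shows one way such a curve could be computed. It assumes that an event counts as detected when the absolute timing error of its prediction falls within the window and that missed events contribute no error value; the study's exact counting rules are not given here, and all names are hypothetical.

```python
def sensitivity_within(abs_errors_ms, n_events, window_ms):
    """Fraction of ground truth events detected within window_ms.

    abs_errors_ms: one absolute timing error per detected event (ms).
    n_events: total number of ground truth events, including misses.
    """
    if n_events <= 0:
        raise ValueError("n_events must be positive")
    hits = sum(1 for e in abs_errors_ms if e <= window_ms)
    return hits / n_events


def detectability_curve(abs_errors_ms, n_events, windows_ms):
    """Sensitivity at each timing error window, as (window, value) pairs."""
    return [
        (w, sensitivity_within(abs_errors_ms, n_events, w))
        for w in windows_ms
    ]


def smallest_window_reaching(abs_errors_ms, n_events, target, windows_ms):
    """Smallest window whose sensitivity meets the target, else None."""
    for w, s in detectability_curve(abs_errors_ms, n_events, sorted(windows_ms)):
        if s >= target:
            return w
    return None
```

For example, calling `smallest_window_reaching(errors, n_events, 0.95, range(0, 31))` on a set of per-event absolute errors would return the narrowest window, in whole milliseconds, at which 95% sensitivity is reached, or `None` if the curve plateaus below that level.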
Table 2. Summary of performance results

| Metric | Melting | SSID | Saturation | Expulsion |
|---|---|---|---|---|
| Sensitivity (within 10 ms) | 0.870 (0.010) | 0.886 (0.009) | 0.742 (0.023) | 0.935 (0.005) |
| Sensitivity (within 20 ms) | 0.963 (0.006) | 0.962 (0.007) | 0.860 (0.014) | 0.954 (0.006) |
| Sensitivity (within 30 ms) | 0.984 (0.003) | 0.979 (0.002) | 0.898 (0.009) | 0.977 (0.001) |
| Specificity | 0.807 (0.025) | 0.821 (0.031) | 0.933 (0.018) | 0.986 (0.002) |

%A-scans correct (within 0.1): 90.5 (0.4)

Note: Mean sensitivity for each event within 10, 20, and 30 ms of event ground truth, mean specificity for each event, and mean accuracy of MNS within 0.1. Standard deviation in parentheses.

[Figure 6 plot. x-axis: timing absolute error (ms), 0 to 30. y-axis: sensitivity, 0 to 1. Legend: melting, SSID, saturation, expulsion.]

Figure 6. Event detection sensitivity per event given absolute error of model prediction of event timing versus ground truth event timestamp. Means across three models are plotted, with 1 standard deviation of the mean shown with error bars. Expulsion (purple line) was most easily detectable, SSID (orange line) and melting (green line) were similarly moderately detectable, and saturation (red line) was most difficult to detect.

JULY 2023 • MATERIALS EVALUATION 67

It was expected that expulsions would be most reliably detected, as they appear very clearly on M-scans as a discontinuity in which the stack bottom boundary abruptly moves upward
in the M-scan (Figure 8d). It was not expected, however, that they would be detected so rapidly, with 90% of detections occurring within 4 ms of ground truth. Because they appear so clearly in M-scans, they are also the easiest event to label during dataset preparation. On the other hand, saturation is by far the most difficult event to label, as the saturation point was defined as “the moment at which the molten nugget appeared to stop growing vertically,” which is highly subjective without perfect nugget and stack boundary annotations. Similarly, but to a lesser degree, melting and SSID are not always as apparent as expulsions. Thus, from our experience in reading these images, and considering both the relative difficulty for a human of interpreting ultrasonic M-scans to identify these events and the relative consistency of event annotations, we found that the relative detection rates of the four events align completely with expectations.

Relatedly, as the ground truth labels for event timing, as well as the top and bottom labels for the nugget and stack, were used to develop the curves for MNS, the subjectivity and consistency of labels affect the performance of the models on the regression task as well. In particular, stack boundaries are almost always reasonably visible except after expulsions, while nugget boundaries vary in visibility based on nugget pool size, stage of weld, and stack geometry. With the investigated

[Figure 8 panels: Non-weld, Nominal, Expulsion, Insufficient. x-axis: weld time (ms). y-axis: AI output level, 0 to 1.]

Figure 8. M-scan samples with ground truth markup and model outputs superimposed. Ground truth event timestamps are shown as dark, thick vertical lines reaching the halfway mark vertically in the images; ground truth MNS is the darker blue curve. Model outputs (thin curves: unprocessed model outputs; thin vertical lines: event probability outputs thresholded at 0.5) are from the most performant model. Event colors: green, melting; yellow, SSID; red, saturation; purple, expulsion. Blue indicates model output for MNS. All model outputs and MNS targets are superimposed on images with 0 = bottom of image and 1 = top of image. Images cover various stackups and weld outcomes.

[Figure 7 panels (a) to (d). x-axis: timing error (ms), -30 to 30. y-axis: proportion of events, 0 to 0.12 in three panels and 0 to 0.4 in one panel.]

Figure 7. Timing error distributions for detected (a) melting, (b) SSID, (c) saturation, and (d) expulsion events for the model with the best overall performance.
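The Figure 8 caption mentions event probability outputs thresholded at 0.5, and Table 2 reports the percentage of A-scans for which MNS was correct within 0.1. The sketch below illustrates both operations under stated assumptions: the event time is taken as the first A-scan whose probability is at or above the threshold, and `ascan_period_ms` is a hypothetical parameter for the time between A-scans. Neither detail is specified in the text, so this should be read as one plausible interpretation rather than the authors' procedure.

```python
def first_crossing_ms(probabilities, ascan_period_ms, threshold=0.5):
    """Time (ms) of the first A-scan whose event probability meets the threshold.

    Assumption: A-scan i occurs at i * ascan_period_ms.
    Returns None if the probability never reaches the threshold.
    """
    for i, p in enumerate(probabilities):
        if p >= threshold:
            return i * ascan_period_ms
    return None


def fraction_within(outputs, targets, tolerance=0.1):
    """Fraction of A-scans whose output lies within tolerance of the target."""
    if len(outputs) != len(targets):
        raise ValueError("outputs and targets must have the same length")
    if not outputs:
        raise ValueError("at least one A-scan is required")
    hits = sum(1 for o, t in zip(outputs, targets) if abs(o - t) <= tolerance)
    return hits / len(outputs)
```

Multiplying the result of `fraction_within` by 100 gives a percentage comparable in form to the "%A-scans correct (within 0.1)" row of Table 2.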