How Do We Actually Test Semiconductors?

Verifying Billions of Transistors

Mar 21, 2026

Have you ever wondered how semiconductor testing actually works?

How do we determine whether a chip containing billions of transistors is functioning correctly? The complexity of circuitry packed into something smaller than your fingernail is hard to imagine. If humans had to inspect the transistors one by one, it would take hundreds of years to test a single chip.

My PhD research and current work focus on exactly this field.

Design for Testability (DFT): A design methodology that enables effective testing

In this article, I’ll walk you through the entire journey of how a semiconductor goes from fabrication to being certified as a “good” chip, explained in a way that even non-engineers can understand.

1. What Is Testing? The Game of Inputs and Outputs

The essence of testing is simple:

Apply a specific input and verify that the expected output appears.

The problem lies in the structure of modern semiconductors. While externally accessible pins number only in the hundreds, internally there are billions of transistors. And we can’t directly observe internal states from outside.

Let me use the human body as an analogy. When something goes wrong in your body, how do you know whether the problem is in your stomach, intestines, or gallbladder? Doctors use endoscopes to look inside, or administer specific medications and observe the body’s response.

Semiconductors work the same way. We embed structures during the design phase that make internal states observable and controllable from the outside. This is the starting point of DFT (Design for Test).

2. DFT: The Minimum Infrastructure for Testability

DFT stands for Design for Test, literally “designing for testability.” Rather than diving deep into DFT, I’ll cover just enough to understand how testing becomes possible.

Scan Architecture: A Window into the Chip

By connecting internal flip-flops in a scan structure, we can inject and extract internal states from outside the chip.

Shift-in: Inject desired values into internal nodes
Capture: Store results after a brief operation
Shift-out: Extract internal results for observation

This structure enables detection of internal defects that would otherwise be inaccessible through external pins alone.
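
To make the three steps concrete, here is a minimal Python sketch of a toy scan chain. The ScanChain class, the 4-bit length, and the toy_logic function are all assumptions I made up for illustration; a real chip has many chains and far more complicated logic between the flip-flops.

```python
from typing import Callable, List


class ScanChain:
    """Toy model of flip-flops stitched into a single scan chain."""

    def __init__(self, length: int) -> None:
        self.flops: List[int] = [0] * length

    def shift(self, scan_in_bits: List[int]) -> List[int]:
        """Shift one bit per clock; return the bits that appear at scan-out."""
        scan_out: List[int] = []
        for bit in scan_in_bits:
            scan_out.append(self.flops[-1])        # last flop drives scan-out
            self.flops = [bit] + self.flops[:-1]   # every value moves one flop over
        return scan_out

    def capture(self, logic: Callable[[List[int]], List[int]]) -> None:
        """One functional clock: flops capture the combinational logic outputs."""
        self.flops = logic(self.flops)


def toy_logic(state: List[int]) -> List[int]:
    # Hypothetical logic: each flop captures the XOR of itself and its neighbor.
    n = len(state)
    return [state[i] ^ state[(i + 1) % n] for i in range(n)]


chain = ScanChain(4)
chain.shift([1, 0, 1, 1])                # shift-in: load the test stimulus
loaded = list(chain.flops)
chain.capture(toy_logic)                 # capture: one functional clock
response = chain.shift([0, 0, 0, 0])     # shift-out: unload the response
```

The tester would compare response against the value predicted by simulation; any mismatch points to a defect somewhere in the logic feeding those flip-flops.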

ATPG: The Automatic Pattern Generator

Once scan architecture is in place, ATPG (Automatic Test Pattern Generation) tools create test patterns based on fault models.

Defect vs. Fault: A Defect is an actual physical flaw that occurs during manufacturing. A Fault is the circuit-level abstraction or model of that defect.

The main fault models include:

Stuck-at Fault: A signal line stuck permanently at 0 or 1
Transition Fault: Slow transitions from 0→1 or 1→0 (delay defects)

ATPG designs patterns to “catch the maximum number of faults with the minimum number of patterns.”
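
The idea of “maximum faults, minimum patterns” can be sketched on a tiny circuit. The example below assumes a made-up circuit y = (a AND b) OR c, enumerates every stuck-at fault, and greedily picks patterns. This is only an illustration of the goal: production ATPG tools use far more scalable algorithms than brute-force enumeration, and the names here are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Set, Tuple

Fault = Tuple[str, int]          # (net name, stuck value)
Pattern = Tuple[int, int, int]   # input values (a, b, c)

NETS = ["a", "b", "c", "n1", "y"]   # n1 is the internal AND output


def simulate(pattern: Pattern, fault: Optional[Fault] = None) -> int:
    """Evaluate y = (a AND b) OR c, optionally with one net stuck at 0 or 1."""
    def force(net: str, value: int) -> int:
        return fault[1] if fault is not None and fault[0] == net else value

    a = force("a", pattern[0])
    b = force("b", pattern[1])
    c = force("c", pattern[2])
    n1 = force("n1", a & b)
    return force("y", n1 | c)


faults: List[Fault] = [(net, v) for net in NETS for v in (0, 1)]
patterns: List[Pattern] = list(product((0, 1), repeat=3))

# A pattern detects a fault when the faulty output differs from the good output.
detects: Dict[Pattern, Set[Fault]] = {
    p: {f for f in faults if simulate(p, f) != simulate(p)} for p in patterns
}

# Greedy selection: keep taking the pattern that catches the most remaining faults.
remaining: Set[Fault] = set(faults)
chosen: List[Pattern] = []
while remaining:
    best = max(patterns, key=lambda p: len(detects[p] & remaining))
    if not detects[best] & remaining:
        break   # whatever is left cannot be detected by any pattern
    chosen.append(best)
    remaining -= detects[best]

coverage = 1 - len(remaining) / len(faults)
```

In this toy case four of the eight possible input patterns are enough to detect all ten stuck-at faults.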

Scan Compression: The Key to Reducing Test Time

While scan enables internal observation, every pattern has to be shifted in and out one bit per clock cycle, so long chains mean long shift times. That’s why production uses scan compression to reduce test time.

The core concept is simple:

Drive more internal chains simultaneously with fewer ATE pins.

Without this capability, production costs would explode.
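
A rough back-of-the-envelope calculation shows why. The sketch below assumes a simplified model in which shift time is set by the longest chain and the pattern count stays the same with compression (in practice it usually changes somewhat). The flop count, chain counts, and pattern count are hypothetical numbers.

```python
import math


def shift_cycles(num_flops: int, num_chains: int, num_patterns: int) -> int:
    """Approximate shift cycles when flops are split evenly across chains."""
    chain_length = math.ceil(num_flops / num_chains)
    # Loading pattern N+1 overlaps with unloading pattern N,
    # so the total is (patterns + 1) full shifts of the chain length.
    return chain_length * (num_patterns + 1)


# Hypothetical design, chosen only for illustration.
FLOPS = 1_000_000
PATTERNS = 10_000

# 8 tester-driven chains vs. 800 short internal chains behind a decompressor.
uncompressed = shift_cycles(FLOPS, num_chains=8, num_patterns=PATTERNS)
compressed = shift_cycles(FLOPS, num_chains=800, num_patterns=PATTERNS)
reduction = uncompressed / compressed
```

With 100 times more (and 100 times shorter) internal chains, this simplified model gives a 100x reduction in shift cycles for the same tester pin count.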

3. ATE: The Heart of Testing

Any discussion of semiconductor testing must include ATE (Automatic Test Equipment). These machines, built by companies like Teradyne and Advantest, cost anywhere from hundreds of thousands to tens of millions of dollars.

ATE’s job can be summarized in one line:

Supply power, apply inputs, and determine if outputs are correct.

But reality is far more sophisticated. ATE isn’t just a pattern-applying machine.

What ATE actually manages:

Levels (voltage references): What voltage to apply to each pin, and thresholds for High/Low determination
Timing: Nanosecond-precision definition of when outputs are valid (strobe timing)
Pin Electronics: Drive strength, comparison thresholds, protection conditions
DC Measurements: Leakage current, open/short detection, power consumption parameters

In other words, ATE performs both “digital comparison” and “electrical measurement” simultaneously.
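
The “digital comparison” half is driven by the level settings. Here is a minimal sketch of how a strobed output voltage might be judged against High/Low thresholds; the function name, the threshold values, and the H/L/X notation are assumptions for illustration, not any tester vendor’s API.

```python
from typing import List, Optional


def compare_pin(voltage: float, expected: str, voh: float, vol: float) -> bool:
    """Return True when the strobed voltage matches the expected logic state."""
    if expected == "H":
        return voltage >= voh    # must be at or above the High threshold
    if expected == "L":
        return voltage <= vol    # must be at or below the Low threshold
    if expected == "X":
        return True              # don't-care: this strobe is masked
    raise ValueError(f"unknown expected state: {expected!r}")


# Hypothetical thresholds (volts) and strobed samples for one output pin.
VOH, VOL = 0.8, 0.4
samples: List[float] = [1.05, 0.10, 0.60, 0.95]
expected: List[str] = ["H", "L", "H", "X"]

results = [compare_pin(v, e, VOH, VOL) for v, e in zip(samples, expected)]
first_fail: Optional[int] = results.index(False) if False in results else None
```

The third sample sits between the two thresholds, so it is neither a valid High nor a valid Low and the comparison fails at that cycle.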

In production, ATE time costs are enormous. Test time equals money. That’s why the #1 goal in test engineering usually boils down to this:

Reduce test time.

Even a 1-second reduction in test time translates to massive cost savings at production scale. One key method for achieving this is Multi-site Testing, testing multiple chips simultaneously with a single tester.

4-site, 8-site, 16-site, and even 32+ site configurations are possible
Tester cost per chip drops roughly in proportion to the number of sites
Trade-off: significantly more complex test programs and resource management
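
The economics can be sketched with a simple cost model. The hourly tester cost, the test time, and the efficiency parameter below are hypothetical; efficiency is my shorthand for how much extra time a parallel run takes compared with testing a single chip.

```python
def cost_per_chip(tester_cost_per_hour: float, test_time_s: float,
                  sites: int, efficiency: float = 1.0) -> float:
    """Tester cost attributed to one chip when `sites` chips run in parallel.

    efficiency = 1.0 means N sites take no longer than one site;
    lower values mean each additional site adds some test time.
    """
    # Time for one parallel run grows as efficiency drops below 1.0.
    run_time_s = test_time_s * (1 + (sites - 1) * (1 - efficiency))
    cost_per_run = tester_cost_per_hour / 3600 * run_time_s
    return cost_per_run / sites


# Hypothetical tester at $360/hour and a 10-second test.
single = cost_per_chip(360.0, 10.0, sites=1)
ideal_8 = cost_per_chip(360.0, 10.0, sites=8)
real_8 = cost_per_chip(360.0, 10.0, sites=8, efficiency=0.9)
```

Under these assumptions the per-chip cost falls from about $1.00 to about $0.125 with perfect 8-site parallelism, and to roughly $0.21 once overhead is included.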

4. When Does Testing Happen?

Semiconductor testing is divided into three main stages. Each stage has different purposes, environments, and criteria for screening defects.

1) Wafer Test (Wafer Sort, CP Test)

When wafers are completed at the fab, testing begins before cutting them into individual chips. This is called Wafer Test or CP (Circuit Probe) Test.

At this stage, a probe card makes contact with each die on the wafer using microscopic needles to exchange electrical signals.

The core purpose of Wafer Test is singular:

Screen out as many defects as possible before packaging.

Packaging is expensive. Eliminating defective dies early saves significant costs. However, this stage has physical limitations—probe contact may not be perfect, and achieving the stable temperature control or high-speed conditions of Final Test can be difficult. Wafer Test is less about “perfect inspection” and more about filtering candidates for subsequent processes.

2) Package Test (Final Test, FT)

Dies that pass Wafer Test are cut (dicing) and placed into packages. Once packaging is complete, Final Test (FT) is performed.

Final Test includes functional testing and high-speed I/O testing depending on product requirements. The biggest difference from Wafer Test is the test environment:

Socket-based contact is much more stable
Various temperature conditions can be applied: room temperature, high (125°C), low (-40°C)
Final verification including any defects introduced during packaging

Final Test is essentially the last gate before shipment.

3) System Level Test (SLT)

Passing Final Test isn’t the end. Increasingly, companies are adopting System Level Test (SLT).

SLT tests chips under conditions similar to actual system environments. For a mobile AP, for example, this means actually booting an operating system and running applications.

Why is this stage necessary?

Because some defects only appear in real-world usage conditions that ATE-based testing can’t catch.

As processes shrink and chips grow more complex, cases of “passed testing but died in actual use” increase. SLT is the final insurance against this risk.

5. What’s the Actual Test Sequence?

Production testing operates in two modes: SOF (Stop on Failure) and COF (Continue on Failure).

SOF halts testing the moment any failure occurs. Since a chip that fails even one test is rejected, there’s no point continuing. This is the standard for production where time is critical.

COF continues testing even after failures occur. This is used for yield analysis to diagnose which stage produced which failures.

In production, SOF provides rapid pass/fail decisions, while separate samples run COF to collect detailed data for yield analysis. Because SOF stops at the first failure, tests are arranged “cheapest first, most-likely-to-fail first”: if an early test fails, the longer tests that follow are skipped.

A typical test flow looks like this:

Continuity Test: Pin open/short, contact issues
DC Parametric Test: Leakage current, power consumption, basic electrical characteristics
Scan Test (Stuck-at): Basic structural defect detection
Scan Test (Transition, At-speed): Delay defect detection
MBIST: Memory testing
Functional Test: Verification of actual operating modes
Speed/Voltage Margin: Measure operating limits for grade classification
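
Here is a minimal sketch of how SOF and COF differ when running a flow like this. Only four of the steps are modeled, and the test times, limits, and chip measurements are made-up values for illustration.

```python
from typing import Callable, Dict, List, Tuple

Chip = Dict[str, float]
Test = Tuple[str, float, Callable[[Chip], bool]]   # (name, seconds, pass check)

# Hypothetical flow, ordered cheapest first; times and limits are invented.
FLOW: List[Test] = [
    ("continuity", 0.01, lambda chip: chip["contact_ok"] == 1),
    ("dc_leakage", 0.05, lambda chip: chip["leakage_ua"] <= 50.0),
    ("scan_stuck", 0.80, lambda chip: chip["scan_fails"] == 0),
    ("functional", 1.50, lambda chip: chip["fmax_ghz"] >= 2.5),
]


def run_flow(chip: Chip, stop_on_fail: bool) -> Tuple[List[str], float]:
    """Run tests in order; return (names of failed tests, total test time)."""
    failures: List[str] = []
    elapsed = 0.0
    for name, seconds, check in FLOW:
        elapsed += seconds
        if not check(chip):
            failures.append(name)
            if stop_on_fail:
                break   # SOF: the chip is already a reject, skip the rest
    return failures, elapsed


leaky_chip: Chip = {"contact_ok": 1, "leakage_ua": 120.0,
                    "scan_fails": 3, "fmax_ghz": 2.9}
sof_failures, sof_time = run_flow(leaky_chip, stop_on_fail=True)
cof_failures, cof_time = run_flow(leaky_chip, stop_on_fail=False)
```

SOF rejects this chip after 0.06 seconds with a single failure recorded, while COF spends the full 2.36 seconds but also reveals the scan failure, which is exactly the extra data yield analysis wants.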

MBIST and Repair: For memory products, MBIST (Memory Built-In Self-Test) circuits identify defects, then a Repair process replaces faulty rows/columns using redundancy. Even defective memory can be salvaged as good product.
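
The repair decision can be sketched as a small search problem: given the failing cells reported by MBIST, can the available spare rows and columns cover all of them? The brute-force function below is my own illustration with hypothetical names and tiny numbers; real repair analysis runs in hardware or dedicated software with much smarter algorithms.

```python
from itertools import combinations
from typing import Optional, Set, Tuple

Cell = Tuple[int, int]   # (row, column) of a failing memory cell


def find_repair(fails: Set[Cell], spare_rows: int,
                spare_cols: int) -> Optional[Tuple[Set[int], Set[int]]]:
    """Return (rows to replace, columns to replace), or None if unrepairable."""
    failing_rows = sorted({r for r, _ in fails})
    # Try every choice of rows to replace; the rest must be fixed by columns.
    for k in range(min(spare_rows, len(failing_rows)) + 1):
        for rows in combinations(failing_rows, k):
            leftover_cols = {c for r, c in fails if r not in rows}
            if len(leftover_cols) <= spare_cols:
                return set(rows), leftover_cols
    return None


# Three fails share row 3, one isolated fail sits in column 2.
repairable = find_repair({(3, 0), (3, 5), (3, 9), (7, 2)},
                         spare_rows=1, spare_cols=1)

# Three fails on a diagonal need three spares, but only two exist.
unrepairable = find_repair({(0, 0), (1, 1), (2, 2)},
                           spare_rows=1, spare_cols=1)
```

In the first case one spare row and one spare column are enough, so the die is salvaged; in the second it has to be rejected.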

6. Binning: Not All Good Chips Are Equal

Do all chips that pass testing have the same value?

No. Even with identical designs, process variation causes individual chips to differ in performance and power characteristics.

That’s why production doesn’t end at Pass/Fail—it includes Binning.

HBIN (Hardware Bin)

HBIN is the bin used when the tester physically sorts chips. It determines which tray chips go into and drives logistics and shipment decisions.

HBIN examples:

Good Grade A: Highest performance
Good Grade B: Standard quality
Good Grade C: Budget tier
Fail Scrap: Discard
Hold: Retained for analysis
Retest: Requires retesting

HBIN is limited by physical constraints, typically to a few dozen bins.

SBIN (Software Bin)

SBIN is a detailed cause code recorded in software. If HBIN answers “where does it go?”, SBIN answers “why?”

SBIN examples:

101: Scan Chain Fail
201: SA Fail Block A
202: SA Fail Block B
301: Transition Fail
401: Leakage Over
501: Specific IO Margin Fail

SBIN is critical data for yield analysis. “SBIN 301 spiked in this lot” means a specific delay fault pattern increased, triggering immediate process feedback.
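
A sketch of that kind of spike check is below. The SBIN codes follow the examples above, but the pass code, the baseline rates, the 3x threshold, and the lot data are all hypothetical values I chose for illustration.

```python
from collections import Counter
from typing import Dict, List

PASS_SBIN = 1   # hypothetical code for a passing chip


def sbin_rates(sbins: List[int]) -> Dict[int, float]:
    """Fraction of the lot that landed in each failing SBIN."""
    counts = Counter(s for s in sbins if s != PASS_SBIN)
    return {code: n / len(sbins) for code, n in counts.items()}


def find_spikes(sbins: List[int], baseline: Dict[int, float],
                factor: float = 3.0) -> List[int]:
    """SBIN codes whose rate exceeds `factor` times the historical baseline."""
    rates = sbin_rates(sbins)
    return sorted(code for code, rate in rates.items()
                  if rate > factor * baseline.get(code, 0.0))


# Hypothetical lot of 100 chips: 88 pass, the rest fall into three SBINs.
lot = [PASS_SBIN] * 88 + [201] * 2 + [301] * 9 + [401] * 1
baseline = {201: 0.02, 301: 0.01, 401: 0.01}

spiking = find_spikes(lot, baseline)
```

Here only SBIN 301 (Transition Fail) stands out, at 9% against a 1% baseline, which is the kind of signal that would be sent back to the process team.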

What’s a Lot? Fabs don’t manage wafers individually—they process about 25 wafers together as a batch. That batch is a Lot.

Performance-Based Binning

Even among good chips, performance varies:

Speed Binning: Grades based on maximum operating frequency. 3.0GHz+ is premium, 2.8GHz is standard, 2.5GHz is budget. Those same CPUs at different price points? Usually the result of speed binning.

Leakage Binning: Grades based on leakage current. For mobile products where battery life matters, low-leakage chips command premium prices.

Functional Binning: If certain functional blocks are defective, those functions are disabled and the chip is sold as a lower-tier product. GPUs with some cores disabled are a classic example. Even flawed chips generate value instead of becoming scrap.
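
Speed binning in particular is easy to picture as code. The sketch below reuses the example frequencies from above as bin floors; treating them as hard lower limits, and rejecting anything under 2.5GHz, are my assumptions.

```python
from collections import Counter
from typing import List


def speed_bin(fmax_ghz: float) -> str:
    """Assign a grade from the measured maximum operating frequency."""
    if fmax_ghz >= 3.0:
        return "premium"
    if fmax_ghz >= 2.8:
        return "standard"
    if fmax_ghz >= 2.5:
        return "budget"
    return "fail"   # assumed: below the lowest grade the chip is rejected


# Hypothetical measured Fmax values (GHz) for a handful of chips.
measured: List[float] = [3.1, 2.9, 2.85, 2.6, 2.4, 3.0]
bin_counts = Counter(speed_bin(f) for f in measured)
```

Identical designs end up spread across several grades purely because of where each chip’s measured frequency lands.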

7. Testing and Yield: The Numbers Game That Drives Profitability

Yield is the ratio of good chips to total chips:

Yield = (Good chips / Total chips) × 100%

The difference between 90% and 95% yield is just 5 percentage points, but in production, that’s money. Running tens of thousands of wafers monthly can mean billions in difference.

D0: Defect Density

D0 represents defect density per unit area. In the simplest Poisson model, yield is approximated as:

Yield ≈ e^(-D0 × A), where A is die area

The core message is simple:

Larger dies have higher probability of containing defects.

This is one reason why cutting-edge AI chips struggle with yield—their massive die sizes. In reality, defects aren’t purely random (clustering effects exist), making models more complex, but the direction remains the same.
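
Plugging numbers into the Poisson model makes the effect obvious. The defect density and die areas below are hypothetical, picked only to contrast a small die with a very large one.

```python
import math


def poisson_yield(d0_per_cm2: float, die_area_cm2: float) -> float:
    """Simple Poisson yield model: Yield = exp(-D0 * A)."""
    return math.exp(-d0_per_cm2 * die_area_cm2)


D0 = 0.1   # hypothetical defect density, defects per cm^2

small_die = poisson_yield(D0, 1.0)   # 1 cm^2 (100 mm^2) die, about 0.905
large_die = poisson_yield(D0, 8.0)   # 8 cm^2 (800 mm^2) die, about 0.449
```

At the same defect density, the small die yields about 90% while the large die yields about 45%, so an 8x larger die loses more than half its output to defects in this model.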

Yield Learning and Diagnosis

Improving yield requires understanding “why defects occur.”

Diagnosis uses DFT structures to rapidly identify which test patterns, which design blocks, which nets, which cells, even which layers are producing defects. Once identified, findings are communicated to design or process teams, leading directly to silicon improvements. This feedback loop that progressively improves initial yield is called Yield Ramp-up.

This is exactly what I did at Samsung, Qualcomm, and AMD. In my current role, I’m back to DFT design work.

Test Escape vs. Overkill

Test engineering faces two eternal dilemmas:

Test Escape: Defective chips that pass testing and ship to customers

Overkill: Good chips incorrectly flagged as defective and discarded

Stricter criteria reduce escapes but increase overkill; looser criteria do the opposite. Finding the optimal balance point is the core of test strategy.
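
The trade-off shows up clearly if you sweep a test limit over a small population. In the sketch below, each chip has a measured leakage value and a flag for whether it is truly defective; the data and limits are invented for illustration, and in real life the “truly defective” column is exactly what you don’t know.

```python
from typing import List, Tuple

# (measured leakage in uA, truly defective?) for a hypothetical population.
chips: List[Tuple[float, bool]] = [
    (10, False), (20, False), (30, False), (40, False), (55, False),
    (45, True), (60, True), (70, True), (80, True),
]


def screen(population: List[Tuple[float, bool]],
           limit: float) -> Tuple[int, int]:
    """Return (test escapes, overkill) when chips above `limit` are rejected."""
    escapes = sum(1 for leak, bad in population if bad and leak <= limit)
    overkill = sum(1 for leak, bad in population if not bad and leak > limit)
    return escapes, overkill


strict = screen(chips, limit=35)     # tight limit
balanced = screen(chips, limit=50)   # middle limit
loose = screen(chips, limit=65)      # relaxed limit
```

The tight limit ships no bad chips but throws away two good ones, the relaxed limit throws away none but ships two bad ones, and the middle limit accepts one of each.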

Closing Thoughts

Semiconductor testing is a critical technology that quietly underpins the entire semiconductor industry.

In a world where even a single defect among billions of transistors can be catastrophic, accurately distinguishing good from bad is anything but simple.

Ultimately, semiconductor testing can be summarized in one sentence:

The process that determines with what confidence and at what price this chip can be sold to customers.

Next time you see “yield” mentioned in semiconductor news, perhaps take a moment to think about the test data behind that number—and the massive system working to protect it.

Damnang’s Substack
