

# Lecture 4: Failures and Errors

Instructor: M. Tahoori

Copyright 2010, M. Tahoori

TDS I: Lecture 4




## **Definitions**

- Chip is Defective if
  - it Doesn't Function
    - as Specified, or
    - as Designed due to Presence of a Failure
- Error
  - Incorrect Signal Value
- Failure
  - Deviation from Designed Characteristics
- Fault
  - Models Effect of Failure on Logical Signals




#### **Error**

- Incorrect Signal Value
  - Caused by Failure or Coupled Disturbance
  - Effect of failure
- Transient Error from Coupled Disturbance
  - Conductive Coupling
    - Power Supply, Ground Bounce
  - Capacitive Coupling
    - Adjacent Conductors
  - Electromagnetic Coupling
    - EMI
  - Radiation
    - α-Particles, cosmic rays


#### Evidence of Manufacturing Defects

[Figure panels: Metal 2 extrusion / ILD2 crack; Metal 1 shelving; M4-M4 short; poly stringer; M4 void formations]




#### Failure Mode

- Cause of Rejection of Failed Device
  - "recognizable electrical symptom by which the failure is observed"
- Failure Modes
  - Catastrophic
    - Open Interconnect
    - Shorted Connection
  - Degradation
    - Parameter out of Specification
    - Excess supply current
  - Permanent
    - Failure is Always Present
  - Temporary (Intermittent)
    - Failure is Not Always Present
    - Can Depend on Temperature, Voltage, Vibration






## CMOS Short-circuit Failure Modes

- Gate Oxide Short
  - defect in insulating gate oxide
- Bridging Short
  - Incorrect additional connection between logic nodes
- Gate Circuit Internal Short
  - Incorrect additional connection between internal nodes




## **CMOS Open-circuit Failure Modes**

- Imperfect conduction of an interconnect
  - complete-open defect
  - resistive-open defect (partial open)
  - tunneling-open defect
- More common with copper interconnect
- Causes:
  - Electromigration
  - Mask pattern defects
  - Imperfect metallization




#### Failure Mechanism

- Basic Chemical or Physical Failure Cause
  - "specific microscopic physical, chemical, metallurgical, environmental phenomena or processes [that] cause device degradation or malfunction"
- General Categories
  - Surface and Bulk Effects
  - Metallization and Metal-Semiconductor
  - Package Related
- Yield Detractor
- Infant Mortality
- Reliability Limiter








#### Life Cycle Failure Rate

- Three Time Periods
  - Early Failures -- Infant Mortality
    - Manufacturing Defects or Damage
    - One to Twenty Weeks
  - Normal Lifetime Failures (Stress or Random)
    - Low Constant or Slightly Decreasing λ
    - Stress: voltage, temperature, humidity, vibration
    - λ is failure rate
      - measured in FITs (1 FIT = one failure per 10<sup>9</sup> device-hours)
  - Wearout Failures
    - Rapidly Increasing Failure Rate After 10-20 Years
      - Not a Factor for Microelectronics
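As a rough illustration of the FIT unit, the sketch below converts a failure rate in FITs to an MTBF and to an expected number of failures in a population. It assumes the constant failure rate of the normal-lifetime period; the function names and the sample numbers (100 FIT, 10,000 units, one year) are hypothetical and chosen only for illustration.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours


def fit_to_mtbf_hours(fit):
    """MTBF in hours for a constant failure rate given in FITs.

    1 FIT = 1 failure per 1e9 device-hours, so MTBF = 1e9 / FIT.
    """
    return 1e9 / fit


def expected_failures(fit, units, hours):
    """Expected failures among `units` devices, each operated for `hours`."""
    return fit * units * hours / 1e9


# Hypothetical example values, not taken from the lecture.
mtbf = fit_to_mtbf_hours(100)
failures = expected_failures(100, 10_000, HOURS_PER_YEAR)
print(f"MTBF = {mtbf:.3g} hours, expected failures in one year = {failures:.2f}")
```

Under these assumed numbers, a 100 FIT part corresponds to an MTBF of 10<sup>7</sup> hours, and a population of 10,000 such parts would be expected to show fewer than ten failures per year.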




## Reliability Or Early Failure Screening

- "Process designed to detect incipient or latent flaws which if undetected would, in all likelihood, manifest themselves as early field failures."
- Reliability Or Environmental Stress Screening
  - "Process which implies the application of a specific type of environmental stress, on an accelerated basis, but within design capability, in an attempt to surface latent or incipient hardware flaws, which if undetected would in all likelihood manifest themselves in the early life of the system."




## Reliability Or Early Failure Screening

- Eliminate Marginal Devices
- Indirect: Reject
  - Low yield wafers,
  - wafers with high oxide defect density,
  - dies in vicinity of bad dies




## Reliability Or Early Failure Screening

- Direct: Age Whole Population of Parts
  - by an Accelerated Stress
  - Cause weakest parts to fail
- Stress to Induce Early Failure of Marginal Devices
  - SHOVE: short duration, high voltage operation
    - oxide failures
  - High temperature or voltage Burn-in
  - Stabilization Bake, Temperature Cycle
  - Thermal Shock, Vibration
- IDDQ, VLV, MinVdd




## **Accelerated Life Testing**

- Simulate Long-term Operation
  - Used to Predict Field Failure Rates
  - Devices Heated to Accelerate Failure Rate
  - Static or Dynamic Input Bias




## **Acceleration Calculations**

#### Burn In and Life Test

- Elevated temperature
  - Electromigration
  - Dielectric wearout
- Arrhenius Equation -- Mechanism Reaction Rate
- $\lambda = A e^{-E_a/(KT)}$
- Thermal Acceleration Factor, TAF, [Hnatek 95]
- TAF = $t_n / t_{BI} = \exp [(E_a/K) (1/T_n - 1/T_{BI})]$
  - T<sub>n</sub>, T<sub>BI</sub> are temperatures in K
  - t<sub>n</sub>, t<sub>BI</sub> are MTBFs at T<sub>n</sub> and T<sub>BI</sub> respectively
  - E<sub>a</sub> is activation energy in eV
  - K is Boltzmann's constant (8.62 x 10<sup>-5</sup> eV/K)
- Activation energy depends on failure mechanism
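
One way to see where the TAF expression comes from is sketched below. It assumes a constant failure rate at each temperature, so that MTBF is $1/\lambda$ (an assumption not stated on the slide), and writes $\lambda_n$ and $\lambda_{BI}$ for the Arrhenius rates at $T_n$ and $T_{BI}$:

$$
\text{TAF} = \frac{t_n}{t_{BI}} = \frac{\lambda_{BI}}{\lambda_n}
= \frac{A\, e^{-E_a/(K T_{BI})}}{A\, e^{-E_a/(K T_n)}}
= \exp\left[\frac{E_a}{K}\left(\frac{1}{T_n} - \frac{1}{T_{BI}}\right)\right]
$$

Because $T_{BI} > T_n$, the exponent is positive and TAF is greater than 1; the prefactor $A$ cancels, so only $E_a$ and the two temperatures matter.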




#### **Acceleration Calculations**

- Example
  - TAF = $t_n / t_{BI} = \exp [ (E_a/K) (1/T_n - 1/T_{BI}) ]$
  - K is Boltzmann's constant (8.62 x 10<sup>-5</sup> eV/K)
  - $E_a = 0.6 \text{ eV}$
  - $T_n = 50^{\circ}\text{C} \ (323 \text{ K})$
  - $T_{BI} = 125^{\circ}\text{C} \ (398 \text{ K})$
  - $t_{BI} = 168 \text{ hours } (7 \text{ days})$
  - $t_n \approx 9{,}840 \text{ hours } (1.1 \text{ years})$
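
The example can be checked numerically with a short sketch like the one below. The function and variable names are hypothetical; the inputs are the values listed in the example. With these rounded inputs the computed TAF comes out near 58 and the equivalent field time within about 1% of the figure quoted above, the small gap being attributable to rounding of the constants.

```python
import math

BOLTZMANN_EV_PER_K = 8.62e-5  # Boltzmann's constant in eV/K, as given in the lecture


def thermal_acceleration_factor(ea_ev, t_normal_k, t_burn_in_k, k=BOLTZMANN_EV_PER_K):
    """TAF = exp[(Ea/K) * (1/Tn - 1/TBI)], temperatures in kelvin."""
    return math.exp((ea_ev / k) * (1.0 / t_normal_k - 1.0 / t_burn_in_k))


# Values from the example: Ea = 0.6 eV, Tn = 323 K, TBI = 398 K, tBI = 168 hours.
taf = thermal_acceleration_factor(0.6, 323.0, 398.0)
equivalent_hours = 168 * taf  # t_n = TAF * t_BI
print(f"TAF = {taf:.1f}, equivalent field time = {equivalent_hours:.0f} hours")
```

Raising the activation energy or the burn-in temperature increases the factor exponentially, which is why the activation energy of the targeted failure mechanism dominates the result.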




## **Acceleration Calculations**

#### Burn In and Life Test

- Elevated voltage
- Voltage Acceleration Factor, [Hnatek 95]
  - $\text{VAF} = \exp [ C (V_s - V_o) ]$
  - V<sub>s</sub> = Applied stress voltage
  - V<sub>o</sub> = Standard operating voltage
  - C = constant, depends on dielectric type
    - C = 1.8 for dielectrics, 0.41 for junction defects
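
A minimal sketch of the voltage acceleration factor follows. It assumes C is expressed per volt, and the stress and operating voltages (6.5 V and 5.0 V) are hypothetical values chosen only to illustrate the formula; only the constants 1.8 and 0.41 come from the slide.

```python
import math


def voltage_acceleration_factor(c, v_stress, v_operating):
    """VAF = exp[C * (Vs - Vo)], with C assumed to be in 1/V."""
    return math.exp(c * (v_stress - v_operating))


# Hypothetical voltages; C values as listed above.
vaf_dielectric = voltage_acceleration_factor(1.8, 6.5, 5.0)
vaf_junction = voltage_acceleration_factor(0.41, 6.5, 5.0)
print(f"dielectric VAF = {vaf_dielectric:.1f}, junction VAF = {vaf_junction:.2f}")
```

For the same overvoltage, the larger C for dielectrics gives a much larger acceleration than for junction defects, so elevated voltage is mainly effective against oxide-related failures.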




#### Burn In Issues

- Damage to good parts
- Non-accelerated failure mechanisms
  - Tunneling opens
- Thermal Acceleration difficult
  - Thermal runaway (exponential increase in current)
  - With many chips self-heating, the oven must act as a refrigerator
- Voltage Acceleration Factor decreasing
  - Reduced voltage margins
- Burn in fallout
  - Higher than customer fails (factor of 10)




#### Summary

- Many Failure Mechanisms
  - Yield Detractors
    - Improve Process to Minimize; Particles Remain
  - Reliability Limiters
    - Control by Design and Improve Process
      - Hot Electrons
      - Time Dependent Dielectric Breakdown
      - Electromigration
- Reliability Screening
  - Eliminate Marginal Devices
- Accelerated Life Test
  - Predict Reliability
