Module 6.6 - Thermal Simulation & Validation

Using simulation tools and measurement techniques to verify thermal design

Checkpoint 1: CFD or FEA Simulation Performed [Major]

For designs with significant thermal challenges (>5W total, high ambient, sealed enclosures, or critical components), a computational fluid dynamics (CFD) or finite element analysis (FEA) simulation should be performed to predict temperatures before hardware is built.

Simulation Tool Selection

Tool | Type | Strengths | Best For | Cost Level
─────────────────────────────────────────────────────────────
Ansys Icepak | CFD | Full conjugate heat transfer, radiation | System-level, enclosures | $$$$
Siemens Flotherm | CFD | Electronics-specific, ECXML import | Board + enclosure analysis | $$$$
6SigmaET | CFD | Fast setup, EDA import | Rapid board-level analysis | $$$
Ansys Mechanical | FEA | Conduction + structural coupling | Thermo-mechanical stress | $$$$
SOLIDWORKS Flow | CFD | CAD-integrated, accessible | Enclosure airflow design | $$
SimScale | Cloud CFD | No hardware needed, collaborative | Initial studies, small teams | $$
OpenFOAM | CFD | Free, flexible, scriptable | Research, budget-constrained | Free

Simulation Setup Process

  1. Import geometry: Import PCB layout from EDA tool (IDF, STEP, or ECXML format). Include enclosure, heatsinks, and nearby boards. Simplify geometry -- remove features smaller than mesh resolution.
  2. Define material properties: Assign thermal conductivities to the PCB (anisotropic: in-plane ~20-40 W/m·K, through-plane ~0.3-0.5 W/m·K depending on copper %), copper layers, solder, TIMs, and enclosure material.
  3. Assign heat sources: Apply power dissipation to each component. Use worst-case power values from the Module 6.1 power budget.
  4. Set boundary conditions: Define ambient temperature, inlet/outlet conditions (velocity or pressure), wall conditions (adiabatic, temperature, or convection coefficient).
  5. Mesh the domain: Refine mesh around heat sources and thin layers. PCB layers may need local refinement (1-3 cells through each copper/prepreg layer).
  6. Solve and post-process: Run steady-state solution. Extract junction temperatures, temperature distributions, airflow patterns, and heat flux paths.

PCB Thermal Model (Effective Properties)

Anisotropic PCB Thermal Conductivity:

In-plane (x,y): k_xy = Σ(k_i × t_i) / Σ(t_i)
Through-plane (z): k_z = Σ(t_i) / Σ(t_i / k_i)

Example: 4-layer, 1.6mm board
Layer 1: Cu 35µm (k=385), Prepreg 200µm (k=0.3)
Layer 2: Cu 35µm (k=385), Core 800µm (k=0.3)
Layer 3: Cu 35µm (k=385), Prepreg 200µm (k=0.3)
Layer 4: Cu 35µm (k=385)

Assuming 60% copper fill on inner, 40% on outer:
k_Cu_effective_inner = 0.6 × 385 + 0.4 × 0.3 = 231.1 W/m·K
k_Cu_effective_outer = 0.4 × 385 + 0.6 × 0.3 = 154.2 W/m·K

k_xy = (154.2×35 + 0.3×200 + 231.1×35 + 0.3×800 + 231.1×35 + 0.3×200 + 154.2×35) / 1600
k_xy = (5397 + 60 + 8089 + 240 + 8089 + 60 + 5397) / 1600
k_xy = 27,332 / 1600 = 17.1 W/m·K

k_z = 1600 / (35/154.2 + 200/0.3 + 35/231.1 + 800/0.3 + 35/231.1 + 200/0.3 + 35/154.2)
k_z = 1600 / (0.227 + 667 + 0.151 + 2667 + 0.151 + 667 + 0.227)
k_z = 1600 / 4001.8 = 0.4 W/m·K

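Anisotropy ratio: k_xy/k_z = 17.1/0.4 = 43:1

Note: the listed layers sum to 1340µm, not 1600µm; the calculation above divides by the nominal 1.6mm board thickness. Dividing by the 1340µm layer sum instead gives k_xy ≈ 20.4 W/m·K and k_z ≈ 0.33 W/m·K (ratio ≈ 61:1).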
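
The effective-property formulas can be scripted so that different stack-ups and copper fills are quick to compare. The following is a minimal sketch, not a vendor method: the names Layer, copper_layer, and effective_k are hypothetical, and blending copper with FR4 by area fraction follows the assumption used in the example. The optional total_um argument reproduces the worked example's division by the nominal 1.6mm thickness; omitting it divides by the sum of the listed layers (1340µm).

```python
from dataclasses import dataclass

K_CU = 385.0   # W/m·K, copper (value used in the worked example)
K_FR4 = 0.3    # W/m·K, FR4 prepreg/core (value used in the worked example)


@dataclass
class Layer:
    thickness_um: float
    k: float  # effective conductivity of this layer, W/m·K


def copper_layer(thickness_um: float, fill: float) -> Layer:
    """Copper layer with partial coverage, blended with FR4 by area fraction."""
    return Layer(thickness_um, fill * K_CU + (1.0 - fill) * K_FR4)


def effective_k(layers, total_um=None):
    """Return (k_xy, k_z) in W/m·K.

    k_xy is the thickness-weighted (parallel) average and k_z the series
    average. total_um defaults to the sum of the layer thicknesses.
    """
    layer_sum = sum(layer.thickness_um for layer in layers)
    total = layer_sum if total_um is None else total_um
    k_xy = sum(layer.k * layer.thickness_um for layer in layers) / total
    k_z = total / sum(layer.thickness_um / layer.k for layer in layers)
    return k_xy, k_z


# 4-layer stack from the example: 40% copper on outer layers, 60% on inner
stack = [
    copper_layer(35, 0.4), Layer(200, K_FR4),
    copper_layer(35, 0.6), Layer(800, K_FR4),
    copper_layer(35, 0.6), Layer(200, K_FR4),
    copper_layer(35, 0.4),
]

k_xy_nom, k_z_nom = effective_k(stack, total_um=1600)  # nominal board thickness
k_xy_sum, k_z_sum = effective_k(stack)                 # sum of listed layers

print(f"nominal 1600 um: k_xy = {k_xy_nom:.1f} W/m·K, k_z = {k_z_nom:.2f} W/m·K")
print(f"layer sum:       k_xy = {k_xy_sum:.1f} W/m·K, k_z = {k_z_sum:.2f} W/m·K")
```

Either way, the in-plane value is dominated by the copper layers and the through-plane value by the dielectric, which is why a uniform FR4 block misses the copper spreading.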

Good example: Full CFD simulation in Flotherm: PCB imported from Altium via IDF export. Board modeled as a detailed layer stack-up with actual copper coverage percentages extracted from Gerber files. All 15 heat-generating components assigned power from measured/calculated values. Enclosure included with ventilation openings. Results: Max Tj predicted = 112°C for DC-DC converter IC. Measured on prototype: 108°C. Correlation within 4°C.

Bad example: "Simulation" done by modeling the entire PCB as a uniform 0.3 W/m·K block (FR4 only, ignoring copper). All heat applied as a single lumped source. No enclosure modeled. Results showed 95°C maximum temperature. Actual measurement showed 135°C -- 40°C error because copper spreading was not captured and the enclosure trapped heat.

Checkpoint 2: Boundary Conditions Realistic [Critical]

Simulation accuracy depends entirely on the quality of boundary conditions. Unrealistic boundary conditions produce misleading results that create false confidence in the thermal design.

Key Boundary Conditions to Define

Boundary | Common Mistake | Correct Approach
─────────────────────────────────────────────────────────────
Ambient temperature | Using 25°C (lab conditions) | Use worst-case: 55°C industrial, 85°C automotive
Airflow | Assuming free air convection everywhere | Model actual enclosure, vents, fans, and obstructions
Radiation | Ignoring radiation entirely | Include radiation (30-50% of heat removal in natural convection)
PCB mounting | Floating PCB in infinite air | Include standoffs, mounting points, thermal paths to chassis
Adjacent boards | Ignoring nearby heat sources | Model adjacent PCBs, power supplies, heat from other systems
Solar loading | Not considered for outdoor | Add 1000 W/m² solar flux on exposed surfaces (worst case)
Altitude | Assuming sea level | Reduce air density for elevation: ρ = ρ₀ × e^(-h/8500)
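
The altitude correction in the last row is easy to evaluate numerically. The sketch below assumes h is in meters and uses 1.225 kg/m³ as the sea-level reference density; that reference value and the function names are assumptions for illustration, not part of the original table.

```python
import math

RHO_SEA_LEVEL = 1.225    # kg/m^3, assumed sea-level reference density
SCALE_HEIGHT_M = 8500.0  # m, scale height used in the altitude formula


def density_ratio(altitude_m: float) -> float:
    """Return rho/rho0 at the given altitude (exponential-atmosphere approximation)."""
    return math.exp(-altitude_m / SCALE_HEIGHT_M)


def air_density(altitude_m: float, rho0: float = RHO_SEA_LEVEL) -> float:
    """Return air density in kg/m^3 at the given altitude in meters."""
    return rho0 * density_ratio(altitude_m)


for h in (0, 1500, 3000):
    print(f"{h:>5} m: rho/rho0 = {density_ratio(h):.3f}, rho = {air_density(h):.3f} kg/m^3")
```

At 3000m the ratio is about 0.70, so a fan moving the same volume flow carries roughly 30% less air mass than at sea level.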

Convection Boundary Conditions

Natural Convection (typical h values for simulation):
Vertical plate in open air: h = 5-12 W/(m²·K)
Horizontal plate (hot side up): h = 6-15 W/(m²·K)
Horizontal plate (hot side down): h = 3-6 W/(m²·K)
Inside sealed enclosure: h = 3-8 W/(m²·K)
Narrow channel (board-to-board <10mm): h = 2-5 W/(m²·K)

Forced Convection:
Laminar flow over flat surface: h = 10-50 W/(m²·K)
Turbulent flow (typical electronics): h = 25-100 W/(m²·K)
Impingement jet (fan directly on component): h = 100-500 W/(m²·K)

When using CFD, DO NOT manually set h values. Instead, model the actual geometry and let the solver calculate h from first principles.

  1. Define the operating environment: indoor/outdoor, sealed/vented, orientation (vertical/horizontal).
  2. Set ambient temperature to the maximum spec (not typical). For validation runs, use actual test conditions.
  3. If enclosed: model the enclosure geometry, vent locations, and vent free-area percentages.
  4. If fans present: use the full fan curve (P-Q), not just free-air CFM. Place fan at actual mounting location.
  5. Enable radiation for natural convection cases. Set emissivity: solder mask = 0.9, bare metal = 0.05-0.1, plastic = 0.9.
  6. For outdoor applications: add solar radiation as a heat flux on exposed surfaces (account for enclosure color/absorptivity).

Good example: Outdoor telecom equipment: Ambient = 55°C (desert specification) + 10°C solar loading contribution on enclosure. Modeled sealed aluminum enclosure with external fins. Internal air modeled as buoyancy-driven flow. Board mounted vertically on standoffs (thermal path through standoffs modeled). Altitude derating applied for 3000m installation. Result: conservative prediction confirmed by thermal chamber testing at 55°C.

Bad example: Board simulated with "open air" boundary on all sides at 25°C. In reality, board is inside a sealed plastic enclosure with only 5mm clearance to the cover. Trapped air reaches 45°C above ambient before any board heat is considered. Simulation under-predicts temperatures by 40°C or more.

Checkpoint 3: Worst-Case Ambient Considered [Critical]

The simulation must use the worst-case ambient temperature for the product's intended operating environment. This is not the lab temperature (20-25°C) but the maximum temperature specified in the product requirements.

Standard Operating Environments

Environment Class | Max Ambient | Typical Standard | Examples
─────────────────────────────────────────────────────────────
Office/Home | 35-40°C | IEC 60068 (Mild) | Laptops, routers, consumer electronics
Industrial (indoor) | 55°C | IEC 60068-2-2 | Factory controllers, rack equipment
Industrial (extended) | 70°C | MIL-STD-810 | Motor drives, outdoor cabinets
Automotive (cabin) | 85°C | SAE J1211 | Infotainment, body controllers
Automotive (underhood) | 125°C | AEC-Q100 Grade 1 | Engine controllers, sensors
Automotive (exhaust) | 150°C | AEC-Q100 Grade 0 | Exhaust sensors, turbo electronics
Military (ground) | 71°C | MIL-STD-810H | Ground vehicle electronics
Aerospace | -55 to +85°C | RTCA DO-160 | Avionics, satellites

Enclosure Temperature Rise Estimation:

For sealed enclosures, internal ambient rises above external ambient:
ΔT_internal = P_total / (h_effective × A_enclosure)

Example: 20W total in sealed box (300×200×100mm)
A_enclosure = 2×(0.3×0.2 + 0.3×0.1 + 0.2×0.1) = 0.22 m²
h_effective = 8 W/(m²·K) (natural convection + radiation outside)
ΔT_internal = 20 / (8 × 0.22) = 11.4°C

If external ambient = 55°C:
Internal ambient around PCB ≈ 55 + 11.4 = 66.4°C
Use 66.4°C (not 55°C!) as the ambient for component thermal analysis
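
The sealed-enclosure estimate can be wrapped in a small helper to check how sensitive the internal ambient is to the assumed h_effective. This is a minimal sketch of the lumped formula above; the function names are hypothetical, and the sweep values of 5 and 12 W/(m²·K) are illustrative assumptions rather than recommended bounds.

```python
def box_surface_area(length_m: float, width_m: float, height_m: float) -> float:
    """Total external surface area of a rectangular enclosure, in m^2."""
    return 2.0 * (length_m * width_m + length_m * height_m + width_m * height_m)


def internal_ambient(p_total_w: float, area_m2: float, h_eff: float, t_ext_c: float):
    """Return (temperature rise, internal ambient) in °C for a sealed enclosure.

    Lumped estimate: delta_T = P_total / (h_effective * A_enclosure).
    """
    delta_t = p_total_w / (h_eff * area_m2)
    return delta_t, t_ext_c + delta_t


# Values from the example: 20 W in a 300 x 200 x 100 mm box at 55°C external ambient
area = box_surface_area(0.3, 0.2, 0.1)
delta_t, t_internal = internal_ambient(20.0, area, h_eff=8.0, t_ext_c=55.0)
print(f"A = {area:.2f} m^2, rise = {delta_t:.1f} °C, internal ambient = {t_internal:.1f} °C")

# Sensitivity to the assumed h_effective (illustrative range)
for h_eff in (5.0, 8.0, 12.0):
    rise, t_int = internal_ambient(20.0, area, h_eff, 55.0)
    print(f"h_eff = {h_eff:4.1f} W/(m^2·K): rise = {rise:.1f} °C, internal = {t_int:.1f} °C")
```

Because the rise scales inversely with h_effective, an optimistic coefficient directly understates the ambient seen by the components.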

Good example: Product specification states: "Operating ambient: -20 to +60°C." Thermal simulation performed at 60°C ambient. Additional simulation at 60°C + 8°C enclosure rise = 68°C effective ambient at PCB level. Transient simulation shows startup at -20°C does not cause condensation issues on cold components.

Bad example: All thermal analysis done at "25°C ambient, natural convection." Product deployed in Arizona server closet where ambient regularly reaches 45°C. With sealed cabinet, internal temperatures reach 65°C. Multiple components exceed ratings in the field. Warranty claims spike in summer months.

Checkpoint 4: Thermal Camera Validation Planned [Major]

Infrared thermal imaging provides rapid, non-contact temperature measurement across the entire PCB surface. A validation plan should define when, how, and under what conditions thermal imaging will be performed.

Thermal Camera Setup for PCB Measurement

  1. Camera selection: Minimum resolution: 320×240 pixels. Thermal sensitivity: ≤0.05°C NETD. Accuracy: ±2°C or ±2%. Common: FLIR E96, Testo 885, InfraTec VarioCAM.
  2. Surface preparation: Apply a uniform coating of known emissivity to the board surface. Options: matte black spray paint (ε≈0.95), electrical tape strips (ε≈0.95), or calibrate for solder mask (ε≈0.85-0.90).
  3. Camera positioning: Place camera perpendicular to board surface. Maintain distance for correct field-of-view (entire board visible). Avoid reflections from hot objects in the environment.
  4. Operating conditions: Run the board at the worst-case operating point. Wait for thermal steady-state (minimum 15-30 minutes for PCBs, longer for systems with large thermal mass).
  5. Emissivity correction: Shiny copper surfaces (ε≈0.05) will read incorrectly low. Use point measurements with thermocouples to calibrate absolute values.
  6. Documentation: Record camera model, distance, ambient temp, operating point, time to steady-state, and emissivity settings for each capture.

Emissivity Values for PCB Materials

Surface | Emissivity (ε) | IR Camera Accuracy | Notes
─────────────────────────────────────────────────────────────
Green solder mask | 0.85-0.92 | Good | Most common, relatively uniform
Black solder mask | 0.90-0.95 | Excellent | Best for thermal imaging
White solder mask | 0.80-0.88 | Good | Slightly less predictable
Bare copper (oxidized) | 0.4-0.7 | Poor | Variable, unreliable reading
Bare copper (polished) | 0.03-0.07 | Unusable | Reflects surroundings
ENIG (gold finish) | 0.02-0.05 | Unusable | Mirror-like, reflects IR
IC package (black epoxy) | 0.90-0.95 | Excellent | Reliable for package top reading
QFN exposed pad (solder) | 0.05-0.1 | Unusable | Need thermocouple for this
Kapton tape (applied) | 0.95 | Excellent | Apply over reflective surfaces

Temperature Correction for Emissivity Error:
(temperatures in kelvin; assumes the camera compensates for reflected background using ε_set)
T_actual = ((ε_set × T_measured⁴ - (ε_set - ε_actual) × T_background⁴) / ε_actual)^0.25

Simplified (for small errors):
Error ≈ (ε_set - ε_actual) / ε_actual × (T_measured - T_background)

Example: Copper pad measured with ε=0.9 (solder mask setting)
Actual ε of oxidized copper = 0.5
T_measured = 50°C, T_background = 25°C
Error ≈ (0.9 - 0.5) / 0.5 × (50-25) = 20°C
Actual temperature ≈ 50 + 20 = 70°C (the full T⁴ expression gives ≈66.5°C)
With the camera set for solder mask, low-emissivity surfaces read far too low.
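
The emissivity correction can be checked numerically. The sketch below is an assumption-laden model, not a camera vendor's algorithm: it uses a total-radiation (T⁴) balance in kelvin, ε_actual·T_actual⁴ + (1-ε_actual)·T_background⁴ = ε_set·T_measured⁴ + (1-ε_set)·T_background⁴, which assumes the camera compensates for reflected background using its ε setting. Real cameras respond over a limited wavelength band, so treat the result as an estimate. Function names are hypothetical.

```python
KELVIN_OFFSET = 273.15


def corrected_temp_c(t_meas_c: float, t_bg_c: float, eps_set: float, eps_actual: float) -> float:
    """Estimate the true surface temperature (°C) from a reading taken with the wrong emissivity.

    Assumes a total-radiation (T^4) model with reflected-background compensation.
    """
    t_meas_k = t_meas_c + KELVIN_OFFSET
    t_bg_k = t_bg_c + KELVIN_OFFSET
    t_actual_4 = (eps_set * t_meas_k**4 - (eps_set - eps_actual) * t_bg_k**4) / eps_actual
    return t_actual_4**0.25 - KELVIN_OFFSET


def linearized_error_c(t_meas_c: float, t_bg_c: float, eps_set: float, eps_actual: float) -> float:
    """Small-error approximation of (true - measured) temperature, in °C."""
    return (eps_set - eps_actual) / eps_actual * (t_meas_c - t_bg_c)


# Example from the text: oxidized copper (eps 0.5) read with a solder mask setting (eps 0.9)
t_full = corrected_temp_c(50.0, 25.0, eps_set=0.9, eps_actual=0.5)
t_lin = 50.0 + linearized_error_c(50.0, 25.0, eps_set=0.9, eps_actual=0.5)
print(f"full T^4 model: {t_full:.1f} °C, linearized: {t_lin:.1f} °C")
```

The linearized form overestimates slightly at this temperature difference, and both agree that the reading is well below the true surface temperature. When ε_set equals ε_actual, both functions return the measured value unchanged.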

Good example: Thermal validation protocol: Board operated at maximum load for 45 minutes in a thermal chamber at 55°C. Board sprayed with Krylon #1602 ultra-flat black paint (ε=0.95) for uniform emissivity. FLIR T1030sc camera positioned 300mm above board. Three thermocouples (K-type, 36AWG) attached to IC packages for absolute calibration. Camera images and thermocouple data logged simultaneously. Delta between camera and TC: <3°C after emissivity correction.

Bad example: Quick thermal image taken with phone attachment thermal camera (FLIR One, 80×60 pixels) in ambient lab conditions (25°C). Board running at "normal" power (not worst case). No emissivity correction applied. Engineer declares "looks fine, nothing over 60°C" -- but bare copper areas are actually at 95°C and showing as cool due to low emissivity.

Checkpoint 5: Thermocouple Measurement Points Identified [Minor]

Thermocouples provide accurate point measurements that complement thermal imaging. Key measurement locations should be identified during the design phase so that prototype boards can be instrumented effectively.

Priority Measurement Points

Priority | Location | Purpose | TC Type/Size
─────────────────────────────────────────────────────────────
1 | Hottest IC package top | Validate junction temp calculation | K-type, 36AWG (thin)
2 | Electrolytic cap body | Verify lifetime calculation | K-type, 36AWG (thin)
3 | Heatsink base | Verify TIM performance | K-type, 30AWG
4 | PCB hotspot (between vias) | Board-level spreading check | K-type, 36AWG, soldered
5 | Air inlet temperature | Actual ambient reference | K-type, 24AWG (shielded)
6 | Air outlet temperature | Total heat load verification | K-type, 24AWG (shielded)
7 | MOSFET case/drain tab | SOA margin verification | K-type, 36AWG (attached to tab)
8 | Power inductor body | Core + copper loss check | K-type, 30AWG (kapton taped)

Thermocouple Attachment Methods

Best Practices for TC Attachment:

For IC package tops:
- Use 36AWG K-type thermocouple (minimize thermal mass)
- Attach with thermal epoxy (Arctic Alumina) or kapton tape
- Ensure junction is centered on package top (hottest point)
- Route TC wire along board surface for 10mm to minimize conduction error

For PCB surface:
- Solder TC junction directly to a copper pad (best contact)
- Alternatively: epoxy to solder mask surface
- Do not use tape that insulates the TC from the surface

For air temperature:
- Shield TC from radiation (wrap in aluminum foil tube)
- Place in representative location (not in stagnant pocket)
- Ensure adequate air flow over the junction

Error budget:
K-type accuracy: ±1.5°C or ±0.4% of reading, whichever is greater (IEC 60584 Class 1)
Attachment error: ±1-3°C (depends on method)
Total measurement uncertainty: ±2-5°C typical

  1. During schematic review, identify the 5-8 critical thermal measurement points.
  2. Add unpopulated TC pads on the PCB layout at these locations (small SMD pads for solder attachment).
  3. Document measurement plan: location, TC type, attachment method, expected temperature range.
  4. Specify measurement equipment: data logger (e.g., Pico TC-08), sampling rate (1 sample/second minimum).
  5. Define test conditions: ambient temperature, operating mode, duration, and acceptance criteria for each point.
  6. Plan for measurement during thermal chamber testing: TC wires must route through chamber port without air leaks.

Good example: Prototype PCB includes 8 dedicated thermocouple attachment pads (2mm × 2mm copper pads, no solder mask, labeled TC1-TC8 in silkscreen). Test procedure document specifies: TC1 on FPGA package center, TC2 on input electrolytic body, TC3 on heatsink base, TC4-TC5 on PCB hotspots, TC6 on power inductor, TC7 air inlet, TC8 air outlet. Pico TC-08 logger records all channels at 1Hz for 60 minutes.

Bad example: No measurement plan created. After prototype boards arrive, engineer realizes there is no way to attach a thermocouple to the critical MOSFET because it's covered by a heatsink. Removing the heatsink changes the thermal conditions. Temperature "measurement" is done by touching the board with a finger and declaring "it feels warm but not hot."

Checkpoint 6: Simulation vs. Measurement Correlation [Major]

After prototype measurement, the simulation model must be correlated with actual data. Good correlation (within ±10-15%) validates the model for use in design optimization and variant analysis. Poor correlation requires model debugging.

Correlation Process

  1. Match test conditions exactly: Set simulation ambient to measured ambient (not specification). Match power dissipation to measured input power. Include actual fan speed, door/cover position, etc.
  2. Compare at all measurement points: Create a correlation table with predicted vs. measured temperatures at each TC location.
  3. Calculate error metrics: For each point: Error = T_sim - T_meas. Acceptable: |Error| < 10°C or <15% of temperature rise.
  4. Identify systematic bias: If all predictions are high, check power input or convection coefficients. If all low, check for missing heat sources or thermal paths.
  5. Adjust model: Common corrections: TIM conductivity (often worse than datasheet due to voids), board copper percentage (estimate vs. actual), component power (actual vs. datasheet).
  6. Validate corrected model: After adjustment, verify against a second operating condition (different power level or ambient) without further tuning.

Correlation Quality Metrics

Metric | Excellent | Acceptable | Poor (requires debug)
─────────────────────────────────────────────────────────────
Max absolute error | <5°C | <10°C | >15°C
Average error (bias) | <3°C | <5°C | >8°C
RMS error | <4°C | <8°C | >12°C
Temperature rise error | <10% | <15% | >25%
Hot spot location | Exact match | Within 5mm | Different component
Temperature distribution shape | Good match | Similar pattern | Different pattern

Correlation Table Example:

Location | Simulated | Measured | Error | ΔT_rise_err
─────────────────────────────────────────────────────────────
IC1 (DC-DC) | 98°C | 103°C | -5°C | -6%
IC2 (MCU) | 78°C | 75°C | +3°C | +6%
IC3 (LDO) | 112°C | 108°C | +4°C | +5%
Cap C1 (elec.) | 72°C | 68°C | +4°C | +9%
Inductor L1 | 88°C | 92°C | -4°C | -6%
PCB hotspot | 85°C | 82°C | +3°C | +5%
Air outlet | 42°C | 40°C | +2°C | +13%
─────────────────────────────────────────────────────────────
Ambient: 25°C (both sim and test)
RMS Error: 3.7°C -- EXCELLENT CORRELATION
Max Error: 5°C (IC1) -- ACCEPTABLE
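
The metrics quoted for the table can be recomputed from the raw columns. The sketch below is illustrative: it assumes the temperature-rise error is taken relative to the measured rise above ambient, and the helper names and the "needs review" label for anything outside the Acceptable column are hypothetical choices.

```python
import math

AMBIENT_C = 25.0

# (location, simulated °C, measured °C) from the correlation table
POINTS = [
    ("IC1 (DC-DC)", 98.0, 103.0),
    ("IC2 (MCU)", 78.0, 75.0),
    ("IC3 (LDO)", 112.0, 108.0),
    ("Cap C1 (elec.)", 72.0, 68.0),
    ("Inductor L1", 88.0, 92.0),
    ("PCB hotspot", 85.0, 82.0),
    ("Air outlet", 42.0, 40.0),
]


def correlation_metrics(points, ambient_c):
    """Return max absolute error, bias, RMS error (°C) and worst rise error (%)."""
    errors = [sim - meas for _, sim, meas in points]
    # Assumption: rise error is normalized by the measured rise above ambient
    rise_pct = [100.0 * (sim - meas) / (meas - ambient_c) for _, sim, meas in points]
    n = len(errors)
    return {
        "max_abs": max(abs(e) for e in errors),
        "bias": sum(errors) / n,
        "rms": math.sqrt(sum(e * e for e in errors) / n),
        "max_rise_pct": max(abs(p) for p in rise_pct),
    }


def grade(value, excellent_limit, acceptable_limit):
    """Classify a metric against the Excellent/Acceptable thresholds."""
    if value < excellent_limit:
        return "excellent"
    if value < acceptable_limit:
        return "acceptable"
    return "needs review"


metrics = correlation_metrics(POINTS, AMBIENT_C)
print(f"max |error| = {metrics['max_abs']:.1f} °C ({grade(metrics['max_abs'], 5, 10)})")
print(f"bias        = {metrics['bias']:+.1f} °C ({grade(abs(metrics['bias']), 3, 5)})")
print(f"RMS error   = {metrics['rms']:.1f} °C ({grade(metrics['rms'], 4, 8)})")
print(f"worst rise error = {metrics['max_rise_pct']:.0f}% ({grade(metrics['max_rise_pct'], 10, 15)})")
```

Reporting the bias alongside the RMS error helps separate a systematic offset (for example, overestimated power) from scatter between individual points.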

Common Reasons for Poor Correlation

Symptom | Likely Cause | Solution
─────────────────────────────────────────────────────────────
All predictions too high | Power input overestimated | Measure actual power consumption
All predictions too low | Missing heat source or blocked airflow | Check for unmeasured losses (e.g., cable heating)
One point very wrong, others OK | Local model error (TIM, via) | Check thermal interface at that component
Good avg, high spread | Airflow pattern incorrect | Add flow visualization (smoke) or adjust turbulence model
Transient timing wrong | Thermal mass incorrect | Verify component mass, heatsink mass, PCB density

Good example: Correlation report shows: 8 measurement points, RMS error = 4.2°C, maximum error = 7°C on electrolytic cap (discovered cap is partially shielded by adjacent connector, not modeled). Model updated to include connector geometry. Re-simulation matches within 3°C. Corrected model used to predict worst-case (85°C ambient) -- all components within specification with >15°C margin.

Bad example: Simulation predicts 85°C on the main IC. Measurement shows 115°C. Engineer adjusts "contact resistance" parameter until simulation matches measurement but does not understand why. The real issue is that thermal vias were not fabricated as filled (manufacturing error). The "tuned" model would give wrong predictions for a board revision where vias are correctly filled.

  • Power verification: Measure actual input power with a DMM at the power connector. Compare to simulation total. They should match within 10%.
  • Energy balance: Check that Q_out (air mass flow rate × c_p × air temperature rise from inlet to outlet) approximately equals Q_in (electrical input). This validates your measurement setup.
  • Sensitivity study: Run the simulation with ±20% on key parameters (TIM k, copper %, power) to understand which inputs dominate uncertainty.
  • Two-point correlation: If possible, correlate at two different operating points (half power and full power). A model that matches both is much more trustworthy.
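
The energy balance check can be sketched as Q_out = ρ × V̇ × c_p × ΔT_air. The example below is illustrative only: the flow rate, temperatures, and input power are hypothetical measurements, the air properties are assumed values for roughly 30°C at sea level, and the 10% tolerance borrows the figure quoted for power verification. The check is meaningful only when nearly all heat leaves with the air stream (a vented, fan-cooled enclosure); a sealed box loses heat through its walls instead.

```python
RHO_AIR = 1.16   # kg/m^3, assumed air density near 30°C at sea level
CP_AIR = 1006.0  # J/(kg·K), assumed specific heat of air


def heat_carried_by_air(flow_m3_s: float, t_in_c: float, t_out_c: float,
                        rho: float = RHO_AIR, cp: float = CP_AIR) -> float:
    """Heat removed by the air stream in watts: rho * V_dot * cp * (T_out - T_in)."""
    return rho * flow_m3_s * cp * (t_out_c - t_in_c)


def balance_mismatch(q_in_w: float, q_out_w: float) -> float:
    """Fractional difference between heat carried out and electrical input."""
    return abs(q_out_w - q_in_w) / q_in_w


# Hypothetical measurement set, for illustration only
q_in = 20.0  # W, electrical input dissipated inside the enclosure
q_out = heat_carried_by_air(flow_m3_s=0.0012, t_in_c=25.0, t_out_c=40.0)
mismatch = balance_mismatch(q_in, q_out)

print(f"Q_in = {q_in:.1f} W, Q_out = {q_out:.1f} W, mismatch = {mismatch:.1%}")
print("balance OK" if mismatch <= 0.10 else "check flow rate, TC placement, or wall losses")
```

A large mismatch usually points at the measurement setup (flow rate estimate, outlet thermocouple in a non-representative location) before it points at the simulation.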