Event ID 129: Disk I/O Timeout
Complete troubleshooting guide for Exchange Server Event ID 129 disk I/O timeout errors causing database hangs, slow performance, and potential data integrity issues.
Error Overview
Event ID 129: Disk I/O Timeout (Reset to device)
"Reset to device, \Device\RaidPort0, was issued."
Or: "The IO operation at logical block address 0x12345678 for Disk 1 (PDO name: \Device\00000001) was retried."
What This Error Means
Event ID 129 is a critical storage event indicating that a disk I/O operation exceeded the timeout threshold (typically 30 seconds) and Windows had to reset the I/O path to recover. For Exchange Server, this is a serious warning that storage cannot keep up with demand, potentially leading to database dismounts, corruption, or data loss.
Severity Indicators
- • Single event: Investigate promptly
- • Multiple events/hour: Critical issue (see the hourly-count sketch after this list)
- • Events + DB dismount: Emergency
- • During backup: Verify backup integrity
- • After hardware change: Check config
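The rate matters more than any single occurrence. A minimal sketch for bucketing the last 24 hours of Event ID 129 by hour (the 24-hour lookback is an arbitrary choice):
# Sketch: count Event ID 129 occurrences per hour over the last 24 hours to gauge severity.
Get-WinEvent -FilterHashtable @{
    LogName = 'System'; Id = 129; StartTime = (Get-Date).AddHours(-24)
} -ErrorAction SilentlyContinue |
    Group-Object { $_.TimeCreated.ToString('yyyy-MM-dd HH:00') } |
    Sort-Object Name |
    Select-Object @{N='Hour';E={$_.Name}}, Count |
    Format-Table -AutoSize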
Impact on Exchange
- • Database may dismount
- • Transaction logs may fail to write
- • Users experience hangs
- • Mail flow may stop
- • DAG replication may lag
Critical Warning
Event ID 129 is one of the most serious storage events for Exchange servers. Even a single occurrence indicates storage that momentarily failed to respond. Multiple events require immediate investigation to prevent database corruption.
Symptoms & Detection
User-Reported Symptoms
- ✗ Outlook completely freezes for 30+ seconds
- ✗ OWA pages fail to load or timeout
- ✗ Sudden disconnection from mailbox
- ✗ Sent emails appear to disappear
- ✗ Calendar updates don't save
Administrator Detection
- → Event ID 129 in System log
- → Event ID 153 (I/O retries)
- → Database dismount events
- → Disk latency spikes in PerfMon
- → Storage alerts from SAN management
Event Log Entry Example
Log Name: System
Source: storvsc (or storport, disk)
Event ID: 129
Level: Warning
Description: Reset to device, \Device\RaidPort0, was issued.
Additional Information:
- Device: \Device\Harddisk1\DR1
- Bus Type: iSCSI / Fibre Channel / SAS
- Port: 0
- Path: 0
- Target: 0
- LUN: 1
Correlation with Exchange: Shortly after this event, you may see:
- MSExchangeIS Event 1002 (database dismounted)
- ESE Event 490 (log write failure)
- MSExchangeRepl Event 2024 (log copying failed)
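To see whether a given reset actually touched Exchange, the two logs can be joined on time. The following is a rough sketch (the five-minute correlation window and one-day lookback are assumptions) that lists Exchange-side errors logged shortly after each Event ID 129:
# Sketch: list Exchange-side errors logged within 5 minutes after each Event ID 129.
$resets = Get-WinEvent -FilterHashtable @{
    LogName = 'System'; Id = 129; StartTime = (Get-Date).AddDays(-1)
} -ErrorAction SilentlyContinue
$exchErrors = Get-WinEvent -FilterHashtable @{
    LogName = 'Application'; ProviderName = 'MSExchangeIS', 'ESE', 'MSExchangeRepl'
    Level = 2; StartTime = (Get-Date).AddDays(-1)
} -ErrorAction SilentlyContinue
foreach ($reset in $resets) {
    $related = @($exchErrors | Where-Object {
        $_.TimeCreated -ge $reset.TimeCreated -and $_.TimeCreated -le $reset.TimeCreated.AddMinutes(5)
    })
    [PSCustomObject]@{
        ResetTime        = $reset.TimeCreated
        ExchangeErrors   = $related.Count
        ExchangeEventIds = ($related.Id | Sort-Object -Unique) -join ', '
    }
}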
Common Causes
Storage Array Overwhelmed
The SAN, NAS, or local storage cannot process I/O requests fast enough, causing them to exceed the timeout threshold. This is often due to insufficient IOPS capacity or competing workloads.
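One quick server-side signal of an overwhelmed array is throughput and queue depth sampled together. The sketch below is illustrative only; the 30-second window (6 samples at 5-second intervals) is an assumption, and the per-disk averages should be read alongside the array's own statistics.
# Sketch: sample per-disk IOPS (Disk Transfers/sec) and queue length for about 30 seconds.
$samples = Get-Counter -Counter @(
    "\PhysicalDisk(*)\Disk Transfers/sec",
    "\PhysicalDisk(*)\Current Disk Queue Length"
) -SampleInterval 5 -MaxSamples 6
$samples.CounterSamples |
    Where-Object { $_.InstanceName -ne '_total' } |
    Group-Object { "$($_.InstanceName) :: $($_.Path.Split('\')[-1])" } |
    ForEach-Object {
        $avg = ($_.Group | Measure-Object -Property CookedValue -Average).Average
        "{0,-45} Avg={1,10:N1}" -f $_.Name, $avg
    }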
Physical Disk Failure
A failing disk in the array causes I/O operations to hang while the controller attempts retries. Even with RAID protection, a degraded array has reduced performance and may timeout under load.
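For direct-attached disks, Windows can report drive-level reliability data; a minimal sketch using Get-StorageReliabilityCounter follows (SAN and virtual LUNs usually return nothing here, so use the array's own diagnostics for those).
# Sketch: surface per-disk reliability counters for direct-attached disks.
# SAN/virtual LUNs typically report no data - check the array console for those instead.
Get-PhysicalDisk | ForEach-Object {
    $r = $_ | Get-StorageReliabilityCounter
    [PSCustomObject]@{
        Disk            = $_.FriendlyName
        HealthStatus    = $_.HealthStatus
        Temperature     = $r.Temperature
        ReadErrorsTotal = $r.ReadErrorsTotal
        Wear            = $r.Wear
    }
} | Format-Table -AutoSize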
HBA/Storage Path Issues
Problems with the Host Bus Adapter, Fibre Channel switches, iSCSI network, or multipath configuration can cause I/O delays. Firmware bugs, cable issues, and misconfigured multipathing are common culprits.
Driver or Firmware Issues
Outdated or buggy storage drivers, HBA firmware, or storage array firmware can cause I/O handling problems that result in timeouts.
Virtual Machine Storage Contention
For virtualized Exchange servers, contention on shared datastores, insufficient storage reservations, or hypervisor I/O scheduling can cause timeouts.
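It is worth confirming the server really is virtual before blaming the hypervisor, and then comparing I/O pressure across neighbouring VMs. A rough sketch follows; the Measure-VM portion is an assumption in that it must run on the Hyper-V host with resource metering already enabled.
# Sketch: from inside the guest, confirm the server is virtual.
$cs = Get-CimInstance Win32_ComputerSystem
Write-Host ("Manufacturer: {0}  Model: {1}" -f $cs.Manufacturer, $cs.Model)
# Typical values: "Microsoft Corporation" / "Virtual Machine" (Hyper-V),
# "VMware, Inc." / "VMware Virtual Platform" (ESXi)
# On the Hyper-V host (assumption: resource metering enabled via Enable-VMResourceMetering),
# per-VM storage pressure can be compared with:
# Get-VM | Measure-VM | Select-Object VMName, AggregatedAverageNormalizedIOPS, AggregatedAverageLatency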
Diagnostic Steps
Step 1: Check Event Log for I/O Issues
# Search for Event ID 129 and related storage events
$events = Get-WinEvent -FilterHashtable @{
LogName = 'System'
Id = 129, 153, 51, 11, 15, 9
StartTime = (Get-Date).AddDays(-7)
} -ErrorAction SilentlyContinue | Sort-Object TimeCreated -Descending
Write-Host "=== Storage Events (Last 7 Days) ===" -ForegroundColor Cyan
$events | Group-Object Id | Select-Object @{N='EventID';E={$_.Name}}, Count |
Format-Table -AutoSize
# Show recent Event ID 129 details
Write-Host "`n=== Recent Event ID 129 Details ===" -ForegroundColor Yellow
$events | Where-Object {$_.Id -eq 129} | Select-Object -First 10 |
ForEach-Object {
Write-Host "Time: $($_.TimeCreated)"
Write-Host "Device: $($_.Properties[0].Value)"0].Value)"
Write-Host "---"
}
# Check for correlation with Exchange events
$exchangeEvents = Get-WinEvent -FilterHashtable @{
LogName = 'Application'
ProviderName = 'MSExchangeIS', 'ESE'
Level = 2, 3 # Error, Warning
StartTime = (Get-Date).AddDays(-7)
} -ErrorAction SilentlyContinue
Write-Host "`n=== Exchange Storage-Related Events ===" -ForegroundColor Yellow
$exchangeEvents | Where-Object {$_.Message -match "I/O|disk|storage|write|read"} |
Select-Object TimeCreated, Id, Message | Select-Object -First 10 |
Format-Table -AutoSize -Wrap
Step 2: Identify Affected Disk/LUN
# List all disks and their details
Get-Disk | Select-Object Number, FriendlyName, SerialNumber, Size, PartitionStyle,
OperationalStatus, HealthStatus | Format-Table -AutoSize
# Get physical disk details
Get-PhysicalDisk | Select-Object FriendlyName, SerialNumber, MediaType, Size,
HealthStatus, OperationalStatus, SpindleSpeed | Format-Table -AutoSize
# Map disks to volumes and Exchange databases
Get-Volume | Where-Object {$_.DriveLetter} | ForEach-Object {
$vol = $_
$partition = Get-Partition -DriveLetter $vol.DriveLetter -ErrorAction SilentlyContinue
$disk = if ($partition) { Get-Disk -Number $partition.DiskNumber } else { $null }
[PSCustomObject]@{
DriveLetter = $vol.DriveLetter
Label = $vol.FileSystemLabel
SizeGB = [math]::Round($vol.Size/1GB,2)
DiskNumber = $partition.DiskNumber
DiskFriendlyName = $disk.FriendlyName
HealthStatus = $disk.HealthStatus
}
} | Format-Table -AutoSize
# Check which databases are on which volumes
Get-MailboxDatabase | Select-Object Name,
@{N='DbVolume';E={(Split-Path $_.EdbFilePath.PathName).Substring(0,2)}},
@{N='LogVolume';E={(Split-Path $_.LogFolderPath.PathName).Substring(0,2)}} |
Format-Table -AutoSize
Step 3: Monitor Real-Time Disk Performance
# Real-time disk latency monitoring
$server = $env:COMPUTERNAME
$duration = 60 # seconds
$interval = 5 # seconds
Write-Host "Monitoring disk latency for $duration seconds..." -ForegroundColor Cyan
$counters = @(
"\$server\PhysicalDisk(*)Avg. Disk sec/Read",
"\$server\PhysicalDisk(*)Avg. Disk sec/Write",
"\$server\PhysicalDisk(*)Current Disk Queue Length",
"\$server\PhysicalDisk(*)% Disk Time"
)
$samples = Get-Counter -Counter $counters -SampleInterval $interval -MaxSamples ($duration/$interval)
# Analyze results
$results = @{}
foreach ($sample in $samples) {
foreach ($cs in $sample.CounterSamples) {
if ($cs.InstanceName -ne "_total") {
$key = "$($cs.InstanceName)_$($cs.Path.Split('')[-1])"$cs.Path.Split('')[-1])"
if (-not $results[$key]) { $results[$key] = @() }
$results[$key] += $cs.CookedValue
}
}
}
Write-Host "`n=== Disk Latency Summary ===" -ForegroundColor Yellow
foreach ($key in $results.Keys | Sort-Object) {
$avg = ($results[$key] | Measure-Object -Average).Average
$max = ($results[$key] | Measure-Object -Maximum).Maximum
$color = "Green"
if ($key -match "sec/" -and $max -gt 0.05) { $color = "Yellow" }
if ($key -match "sec/" -and $max -gt 0.1) { $color = "Red" }
Write-Host ("{0}: Avg={1:N4}, Max={2:N4}"1:N4}, Max={2:N4}" -f $key, $avg, $max) -ForegroundColor $color
}
Write-Host "`nNote: Values for 'sec/Read' and 'sec/Write' are in seconds." -ForegroundColor Cyan
Write-Host "Target: < 0.020 (20ms), Alert: > 0.050 (50ms), Critical: > 0.100 (100ms)"020 (20ms), Alert: > 0.050 (50ms), Critical: > 0.100 (100ms)"Step 4: Check Storage Path and MPIO
# Check MPIO configuration
$mpioFeature = Get-WindowsFeature -Name Multipath-IO
if ($mpioFeature.Installed) {
Write-Host "MPIO Feature: Installed" -ForegroundColor Green
# Get MPIO devices
Get-MSDSMAutomaticClaimSettings
Get-MPIOAvailableHW
# Check paths
mpclaim -s -d
# Check load balance policy
Get-MSDSMGlobalDefaultLoadBalancePolicy
} else {
Write-Host "MPIO Feature: Not Installed" -ForegroundColor Yellow
Write-Host "For SAN storage, MPIO should be installed and configured"
}
# Check HBA information
Get-InitiatorPort | Select-Object PortAddress, ConnectionType, OperationalStatus | Format-Table
# For iSCSI
if (Get-Service -Name MSiSCSI -ErrorAction SilentlyContinue) {
$iscsiTargets = Get-IscsiTarget
Write-Host "`n=== iSCSI Targets ===" -ForegroundColor Cyan
$iscsiTargets | Format-Table NodeAddress, IsConnected
# Check iSCSI sessions
Get-IscsiSession | Select-Object TargetNodeAddress, IsConnected, IsPersistent,
NumberOfConnections | Format-Table
}
Pro Tip
When Event ID 129 occurs, immediately check your storage array's management console. Look for disk failures, rebuild operations, replication syncs, or other activities that might be consuming I/O capacity. The server-side view often doesn't show the root cause.
Quick Fix
Emergency Response Steps
When Event ID 129 is actively occurring, take these immediate steps:
# Step 1: Check if databases are still mounted
Get-MailboxDatabase | Get-MailboxDatabaseCopyStatus |
Select-Object Name, Status, CopyQueueLength, ReplayQueueLength |
Format-Table -AutoSize
# If databases are dismounted, don't remount until storage is stable!
# Step 2: In DAG environment, move databases to another server
# Check which server has best storage health
$dagServers = (Get-DatabaseAvailabilityGroup).Servers
foreach ($srv in $dagServers) {
Write-Host "=== $srv ===" -ForegroundColor Cyan
Invoke-Command -ComputerName $srv -ScriptBlock {
Get-WinEvent -FilterHashtable @{LogName='System';Id=129;StartTime=(Get-Date).AddHours(-1)} -ErrorAction SilentlyContinue |
Measure-Object | Select-Object -ExpandProperty Count
} | ForEach-Object { Write-Host "Event ID 129 count (last hour): $_" }
}
# Move databases away from problematic server
# Move-ActiveMailboxDatabase "Database01" -ActivateOnServer "EXCH02" -SkipLagChecks -Confirm:$false"Database01" -ActivateOnServer "EXCH02" -SkipLagChecks -Confirm:$false
# Step 3: Reduce I/O load temporarily
# Stop non-critical services
Stop-Service MSExchangeSearch -Force # Content indexing
Stop-Service MSExchangeTransportLogSearch -Force
# Step 4: If using virtualization, check for storage contention
# For VMware: Check datastore latency in vSphere
# For Hyper-V: Check physical host storage health
# Step 5: Contact storage team/vendor immediately
Write-Host "`n=== IMMEDIATE ACTIONS ===" -ForegroundColor Red
Write-Host "1. Contact storage administrator"
Write-Host "2. Check SAN/NAS management console"
Write-Host "3. Look for failed disks or degraded RAID"
Write-Host "4. Check for competing I/O workloads"
Write-Host "5. Do NOT remount databases until storage is stable"Critical: Event ID 129 is a serious storage event. Do not simply remount dismounted databases without first resolving the underlying storage issue. Forcing databases online on failing storage risks corruption and data loss.
Detailed Solutions
Solution 1: Increase Disk Timeout Value
Increase the disk timeout to prevent premature resets (use as temporary measure while fixing root cause):
# WARNING: This masks the symptom, not the cause
# Only use while actively working on storage fix
# Check current timeout value (default is 30 seconds)
$currentTimeout = (Get-ItemProperty "HKLM:\SYSTEM\CurrentControlSet\Services\Disk").TimeOutValue
Write-Host "Current disk timeout: $currentTimeout seconds"
# Increase timeout to 60 seconds (registry change)
Set-ItemProperty -Path "HKLM:SYSTEMCurrentControlSetServicesDisk" -Name "TimeOutValue" -Value 60
# For SAN storage, the timeout may be in the HBA driver settings
# Check vendor documentation for your specific HBA
# Verify the change
Get-ItemProperty "HKLM:SYSTEMCurrentControlSetServicesDisk" | Select-Object TimeOutValue
Write-Host "`nIMPORTANT:" -ForegroundColor Yellow
Write-Host "- This requires a reboot to take effect"
Write-Host "- This is a TEMPORARY measure while fixing storage"
Write-Host "- Extended timeout means longer hangs when issues occur"
Write-Host "- Always investigate and fix the root cause!"Warning: Increasing timeout is a band-aid, not a fix. It simply means Windows will wait longer before declaring a timeout. The underlying storage issue must still be resolved.
Solution 2: Fix Storage Array Issues
Address the root cause in your storage infrastructure:
SAN/NAS Checklist
- ✓ Check for failed or failing disks - replace immediately
- ✓ Verify RAID status - no degraded arrays
- ✓ Check controller CPU and cache utilization
- ✓ Verify no hot spots (all disks balanced)
- ✓ Check for replication/snapshot overhead
- ✓ Verify network paths (for iSCSI) or FC switch health
- ✓ Update storage firmware to latest stable version
# For iSCSI storage - check network health
Test-NetConnection -ComputerName "storage-target-ip" -Port 3260
# Check for network errors on storage NICs
Get-NetAdapterStatistics | Where-Object {$_.Name -match "iSCSI|Storage"} |
Select-Object Name, ReceivedUnicastPackets, ReceivedDiscardedPackets,
OutboundDiscardedPackets, OutboundPacketErrors | Format-Table -AutoSize
# For FC storage - check HBA status
# Use vendor tools (Emulex, QLogic, etc.)
# Or check Device Manager for HBA status
# Verify multipath health
if (Get-Command mpclaim -ErrorAction SilentlyContinue) {
Write-Host "=== MPIO Path Status ===" -ForegroundColor Cyan
mpclaim -s -d | Out-String | Write-Host
}
Solution 3: Update Drivers and Firmware
Ensure all storage-related components have current, compatible drivers:
# Check current storage driver versions
Get-WmiObject Win32_PnPSignedDriver |
Where-Object {$_.DeviceClass -eq "DiskDrive" -or $_.DeviceClass -eq "SCSIAdapter" -or $_.DeviceName -match "HBA|Fibre|iSCSI"} |
Select-Object DeviceName, DriverVersion, DriverDate | Format-Table -AutoSize
# Check storport miniport drivers
Get-WmiObject Win32_PnPSignedDriver |
Where-Object {$_.InfName -match "disk|storport"} |
Select-Object DeviceName, DriverVersion | Format-Table
# Driver update checklist:
Write-Host "=== Driver Update Checklist ===" -ForegroundColor Yellow
Write-Host "1. HBA drivers - Download from vendor (QLogic, Emulex, etc.)"
Write-Host "2. Storage firmware - Check SAN vendor support portal"
Write-Host "3. Windows storage drivers - Windows Update or vendor"
Write-Host "4. Multipath DSM - From storage vendor"
Write-Host ""
Write-Host "After updates:"
Write-Host "- Test in non-production first"
Write-Host "- Schedule maintenance window for production"
Write-Host "- Have rollback plan ready"
# Check for pending driver updates in Windows Update
Get-WindowsUpdateLog # Creates readable log from ETL files
Solution 4: Migrate to Better Storage
If current storage cannot meet Exchange demands, migrate to higher-performance storage:
# Plan migration to new storage
# Best practices for Exchange storage:
Write-Host "=== Exchange Storage Requirements ===" -ForegroundColor Cyan
Write-Host ""
Write-Host "Performance Requirements:"
Write-Host " - Database reads: < 20ms latency"
Write-Host " - Database writes: < 20ms latency"
Write-Host " - Log writes: < 10ms latency"
Write-Host " - IOPS: Plan for peak, not average"
Write-Host ""
Write-Host "Recommended Storage Types:"
Write-Host " 1. All-flash SAN (best performance)"-flash SAN (best performance)"
Write-Host " 2. Hybrid SAN with SSD tier"
Write-Host " 3. Direct-attached SSD (excellent for small deployments)"-attached SSD (excellent for small deployments)"
Write-Host " 4. NVMe storage (highest performance)"
Write-Host ""
Write-Host "Migration Steps:"
Write-Host " 1. Provision new storage with required capacity"
Write-Host " 2. Run JetStress to validate performance"
Write-Host " 3. Create new database on new storage"
Write-Host " 4. Move mailboxes to new database"
Write-Host " 5. Decommission old database"
# For DAG environments - add new database copies
# New-MailboxDatabase -Name "DB_NewStorage" -Server EXCH01 -EdbFilePath "S:\DB\DB.edb" -LogFolderPath "S:\Logs"
# Then move mailboxes:
# Get-Mailbox -Database "OldDatabase" | New-MoveRequest -TargetDatabase "DB_NewStorage"-Database "OldDatabase" | New-MoveRequest -TargetDatabase "DB_NewStorage"Danger Zone
Never ignore Event ID 129. Each occurrence represents a moment when your storage failed to respond within 30 seconds. Continued operation on failing storage risks database corruption that may not be immediately apparent but can cause data loss.
Verification Steps
Verify Storage Issue Resolution
# Comprehensive storage health verification
$server = $env:COMPUTERNAME
$checkHours = 24
Write-Host "=== Storage Health Verification ===" -ForegroundColor Cyan
Write-Host "Checking last $checkHours hours...`n"
# Check for Event ID 129
$events129 = Get-WinEvent -FilterHashtable @{
LogName = 'System'
Id = 129
StartTime = (Get-Date).AddHours(-$checkHours)
} -ErrorAction SilentlyContinue
if ($events129) {
Write-Host "Event ID 129 Count: $($events129.Count)"$events129.Count)" -ForegroundColor Red
Write-Host "STORAGE ISSUES PERSIST - Continue investigation" -ForegroundColor Red
} else {
Write-Host "Event ID 129 Count: 0"0" -ForegroundColor Green
Write-Host "No disk timeout events detected" -ForegroundColor Green
}
# Check disk latency
Write-Host "`n=== Current Disk Latency ===" -ForegroundColor Yellow
$latencyCounters = Get-Counter @(
"\$server\PhysicalDisk(*)Avg. Disk sec/Read",
"\$server\PhysicalDisk(*)Avg. Disk sec/Write"
) | Select-Object -ExpandProperty CounterSamples | Where-Object {$_.InstanceName -ne "_total"}
foreach ($counter in $latencyCounters) {
$latencyMs = [math]::Round($counter.CookedValue * 1000, 2)
$color = if ($latencyMs -lt 20) {"Green"} elseif ($latencyMs -lt 50) {"Yellow"} else {"Red"}
Write-Host "$($counter.InstanceName) - $($counter.Path.Split('')[-1]): $latencyMs ms"$counter.Path.Split('')[-1]): $latencyMs ms" -ForegroundColor $color
}
# Check Exchange database status
Write-Host "`n=== Exchange Database Status ===" -ForegroundColor Yellow
Get-MailboxDatabaseCopyStatus * | Select-Object Name, Status, CopyQueueLength, ReplayQueueLength |
Format-Table -AutoSize
# Check Exchange I/O counters
$exchangeIO = Get-Counter @(
"\$server\MSExchange Database ==> Instances(*)I/O Database Reads (Attached) Average Latency",
"\$server\MSExchange Database ==> Instances(*)I/O Database Writes (Attached) Average Latency"
) | Select-Object -ExpandProperty CounterSamples | Where-Object {$_.CookedValue -gt 0}
Write-Host "`n=== Exchange Database I/O Latency ===" -ForegroundColor Yellow
foreach ($counter in $exchangeIO) {
$latency = [math]::Round($counter.CookedValue, 2)
$color = if ($latency -lt 20) {"Green"} elseif ($latency -lt 50) {"Yellow"} else {"Red"}
Write-Host "$($counter.InstanceName): $latency ms"$latency ms" -ForegroundColor $color
}✓ Success Indicators
- • No Event ID 129 for 24+ hours
- • Disk latency < 20ms
- • All databases mounted
- • DAG replication healthy
⚠ Warning Signs
- • Occasional Event ID 129
- • Latency spikes during peak
- • Queue lengths building
- • RAID degraded state
✗ Failure Indicators
- • Continuous Event ID 129
- • Databases dismounting
- • Latency > 100ms
- • Disk errors in storage
Prevention Strategies
Proactive Monitoring
- ✓ Monitor Event ID 129
Alert on ANY occurrence - treat as critical
- ✓ Track disk latency trends
Alert if average exceeds 20ms
- ✓ Monitor storage array health
SNMP traps or vendor monitoring tools
- ✓ Regular health checks
Weekly review of storage metrics
Event ID 129 Alert Script
# Critical event monitoring script
# Run every 5 minutes via Task Scheduler
$server = $env:COMPUTERNAME
$checkMinutes = 10
$events = Get-WinEvent -FilterHashtable @{
LogName = 'System'
Id = 129
StartTime = (Get-Date).AddMinutes(-$checkMinutes)
} -ErrorAction SilentlyContinue
if ($events) {
$count = $events.Count
# CRITICAL ALERT
$alertBody = @"
CRITICAL: Disk I/O Timeout Detected
Server: $server
Event ID 129 Count: $count (last $checkMinutes minutes)
Action Required:
1. Check storage array immediately
2. Review disk health status
3. Check for failed/failing disks
4. Verify multipath status
This event indicates storage failure!
"129 Count: $count (last $checkMinutes minutes)
Action Required:
1. Check storage array immediately
2. Review disk health status
3. Check for failed/failing disks
4. Verify multipath status
This event indicates storage failure!
"@
# Send alert (customize for your environment)
$params = @{
To = "exchange-admins@domain.com"
From = "monitoring@domain.com"
Subject = "CRITICAL: Event ID 129 on $server"$server"
Body = $alertBody
SmtpServer = "smtp.domain.com"
Priority = "High"
}
Send-MailMessage @params
# Log the alert (assumes the "ExchangeMonitoring" event source was registered beforehand, e.g. with New-EventLog)
Write-EventLog -LogName Application -Source "ExchangeMonitoring" -EventId 9999 -EntryType Error -Message $alertBody
}
When to Escalate
Escalate Immediately When:
- → Multiple Event ID 129 events occur within an hour
- → Databases are dismounting or failing to mount
- → Storage array reports disk failures or degraded status
- → Root cause cannot be identified
- → Issue persists after driver/firmware updates
Need Urgent Exchange Storage Help?
Event ID 129 is a critical storage event that can lead to database corruption. Our Exchange specialists can help diagnose the issue, implement fixes, and ensure your data is protected.
15-minute average response time for storage emergencies
Can't Resolve DISK_IO_TIMEOUT?
Exchange errors can cause data loss or extended downtime. Our specialists are available 24/7 to help.
Emergency help - Chat with us
Medha Cloud Exchange Server Team
Microsoft Exchange Specialists
Our Exchange Server specialists have 15+ years of combined experience managing enterprise email environments. We provide 24/7 support, emergency troubleshooting, and ongoing administration for businesses worldwide.