Encountering the dreaded “SCSI_DeviceClusteringClearState” PSOD (Purple Screen of Death) during vMotion operations in your VMware ESXi environment? This is a common yet challenging issue that administrators face when running Microsoft Cluster Service (MSCS) or Oracle RAC virtual machines.
The PSOD has been reported primarily on ESXi 6.0 and 6.5 when an MSCS or Oracle RAC virtual machine is migrated with vMotion. It is triggered by a misconfiguration: non-RDM disks attached in physical bus sharing mode.
1. ESXi “SCSI_DeviceClusteringClearState” Error: Symptoms and Environment Analysis
Key Symptoms
When this PSOD occurs, you’ll typically see a backtrace similar to this:
Backtrace for current CPU #xx, worldID=xyxyxy, fp=0x2005
0xyyyzyyyxyzzy:[0xxxxxxyxxxxxx]SCSI_DeviceClusteringClearState@vmkernel#nover+0x8
0xyyyzyyyxyyyy:[0xxxzxxxxxxxxx]VSCSI_DestroyDevice@vmkernel#nover+0x2b8
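When triaging saved PSOD text, the signature above can be matched programmatically. The following is a minimal Python sketch, not a VMware tool: the function name `matches_known_psod` and the regular expression are illustrative assumptions based only on the two frames shown above.

```python
import re

# Frames named in the backtrace above, top of stack first.
SIGNATURE = ("SCSI_DeviceClusteringClearState", "VSCSI_DestroyDevice")

# Captures the symbol between "]" and "@vmkernel" in a backtrace frame.
FRAME_RE = re.compile(r"\](\w+)@vmkernel")

def matches_known_psod(backtrace: str) -> bool:
    """Return True if the signature frames appear consecutively, in order."""
    symbols = FRAME_RE.findall(backtrace)
    n = len(SIGNATURE)
    return any(tuple(symbols[i:i + n]) == SIGNATURE
               for i in range(len(symbols) - n + 1))

# Placeholder addresses copied from the excerpt above.
sample = """\
0xyyyzyyyxyzzy:[0xxxxxxyxxxxxx]SCSI_DeviceClusteringClearState@vmkernel#nover+0x8
0xyyyzyyyxyyyy:[0xxxzxxxxxxxxx]VSCSI_DestroyDevice@vmkernel#nover+0x2b8
"""
print(matches_known_psod(sample))
```

Matching on the frame order rather than on a single symbol name reduces the chance of flagging an unrelated crash that merely passes through one of these functions.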
Affected Environments
Component | Details |
---|---|
ESXi Versions | ESXi 6.0, 6.5 |
VM Types | MSCS VM, Oracle RAC VM, VVOLs |
SCSI Bus Sharing | Physical Mode |
Cluster Configuration | CAB (Cluster Across Box) |
Disk Types | Shared non-RDM disk (VMDK, VVOL) |
The issue is most likely to occur in the following scenarios:
- vMotion operations in physical bus sharing mode
- Clustering node VMs containing shared non-RDM disks
- CAB configurations with SCSI bus sharing set to Physical
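One way to spot the risky combination before a migration is to inspect a copy of the VM's configuration offline. The sketch below assumes that bus sharing appears in the .vmx file as `scsiN.sharedBus`, that disks appear as `scsiN:M.fileName`, and that RDM mapping files carry a `createType` of `vmfsRawDeviceMap` or `vmfsPassthroughRawDeviceMap` in their descriptor. Verify these assumptions against your own files; the file names used here are made up.

```python
import re

# Assumed createType values for RDM mapping files
# (virtual and physical compatibility mode respectively).
RDM_CREATE_TYPES = {"vmfsRawDeviceMap", "vmfsPassthroughRawDeviceMap"}

def parse_vmx(text: str) -> dict:
    """Parse 'key = "value"' lines from .vmx text into a dict (keys lowercased)."""
    pairs = re.findall(r'^\s*([\w.:]+)\s*=\s*"(.*)"\s*$', text, re.MULTILINE)
    return {key.lower(): value for key, value in pairs}

def risky_disks(vmx: dict, create_types: dict) -> list:
    """List disks on physically shared controllers that are not RDMs."""
    shared = {k.split(".")[0] for k, v in vmx.items()
              if k.endswith(".sharedbus") and v.lower() == "physical"}
    flagged = []
    for key, filename in sorted(vmx.items()):
        m = re.fullmatch(r"(scsi\d+):\d+\.filename", key)
        if m and m.group(1) in shared:
            if create_types.get(filename) not in RDM_CREATE_TYPES:
                flagged.append(filename)
    return flagged

# Hypothetical configuration: one plain VMDK and one physical RDM on scsi1.
SAMPLE_VMX = '''
scsi0.sharedBus = "none"
scsi0:0.fileName = "node1.vmdk"
scsi1.sharedBus = "physical"
scsi1:0.fileName = "quorum.vmdk"
scsi1:1.fileName = "data_rdm.vmdk"
'''
# createType per descriptor, collected separately (hypothetical values).
SAMPLE_CREATE_TYPES = {
    "quorum.vmdk": "vmfs",
    "data_rdm.vmdk": "vmfsPassthroughRawDeviceMap",
}
print(risky_disks(parse_vmx(SAMPLE_VMX), SAMPLE_CREATE_TYPES))
```

A disk whose descriptor type is unknown is flagged as well, which errs on the side of caution.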
2. ESXi “SCSI_DeviceClusteringClearState”: Root Cause Analysis
The core cause of this PSOD is a misconfiguration: the VM has non-RDM disks attached to a controller in physical bus sharing mode, and the VMkernel fails while handling them during vMotion.
Breaking it down further:
- SCSI-3 Persistent Reservations Conflict: MSCS uses SCSI-3 Persistent Reservations to control access to shared disks, but during vMotion, this reservation information isn’t properly transferred
- Non-RDM Disk Handling Error: Regular VMDKs or VVOLs are improperly handled in physical bus sharing environments
- VMkernel Device State Cleanup Failure: Exception occurs during SCSI device state cleanup after vMotion completion
3. Official Patch Resolution
ESXi Patches
VMware has provided official patches to address this issue:
ESXi Version | Patch Name | Reference |
---|---|---|
ESXi 6.5 | ESXi650-201811002 | VMware Official Documentation |
ESXi 6.0 | ESXi600-201909001 | VMware Official Documentation |
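If you track the last patch bundle applied to each host, a rough comparison against the table above can be scripted. The sketch below assumes that patch names follow the `ESXi<line>-<yyyymm><seq>` pattern seen in the table and that later names within a release line include earlier fixes. That ordering is a heuristic, not something the naming scheme guarantees, so treat the release notes and build numbers as authoritative.

```python
import re

# Fix releases taken from the table above, keyed by ESXi release line.
FIXED_IN = {"650": "ESXi650-201811002", "600": "ESXi600-201909001"}

def parse_patch(name: str) -> tuple:
    """Split 'ESXi650-201811002' into ('650', 201811, 2)."""
    m = re.fullmatch(r"ESXi(\d{3})-(\d{6})(\d{3})", name)
    if m is None:
        raise ValueError(f"unrecognized patch name: {name}")
    return m.group(1), int(m.group(2)), int(m.group(3))

def includes_fix(installed: str) -> bool:
    """Heuristic: True if the installed patch is at or after the fix release."""
    line, yyyymm, seq = parse_patch(installed)
    if line not in FIXED_IN:
        raise ValueError(f"release line {line} is not covered by the table")
    _, fix_yyyymm, fix_seq = parse_patch(FIXED_IN[line])
    return (yyyymm, seq) >= (fix_yyyymm, fix_seq)

print(includes_fix("ESXi650-201811002"))
```

Hosts on a release line that the table does not list raise an error instead of returning a guess.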
Patch Application Process
- Connect to vSphere Client
- Navigate to Host > Update Manager
- Download and install the respective patch
- Reboot the host
Important: Always back up your entire environment before applying patches, and perform updates during scheduled maintenance windows.
4. Workaround Solutions
For environments where immediate patch application isn’t feasible, consider these alternative approaches.
Method 1: Shared Storage Configuration Change
Reconfigure to use supported shared storage configurations for MSCS:
Recommended Configurations:
- Single Host Cluster: Use one or more shared eagerzeroedthick virtual disks
- Physical RDM: Use RDMs in physical compatibility mode
- Virtual RDM: Use RDMs in virtual compatibility mode
Method 2: SCSI Controller Separation
Boot Disk SCSI Controller:
- Bus Sharing: None
- Purpose: System disk (C:)
Cluster Shared Disk SCSI Controller:
- Bus Sharing: Physical
- Purpose: Cluster shared disks only
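The separation rule above can be expressed as a small check. This is a sketch under stated assumptions: the controller layout is supplied by hand as a dictionary, and the names `separation_problems`, `bus_sharing`, and `disks` are illustrative rather than part of any VMware API.

```python
def separation_problems(controllers: dict, shared_disks: set) -> list:
    """Report controllers that mix local disks with bus sharing, or that
    hold cluster shared disks without physical bus sharing."""
    problems = []
    for name, ctrl in sorted(controllers.items()):
        sharing = ctrl["bus_sharing"].lower()
        disks = set(ctrl["disks"])
        shared = disks & shared_disks
        local = disks - shared_disks
        if shared and sharing != "physical":
            problems.append(
                f"{name}: shared disks {sorted(shared)} need bus sharing Physical")
        if local and sharing != "none":
            problems.append(
                f"{name}: local disks {sorted(local)} sit on a shared controller")
    return problems

# Hypothetical layouts: the first follows Method 2, the second does not.
cluster_disks = {"quorum", "data"}
good_layout = {
    "SCSI0": {"bus_sharing": "None", "disks": ["boot.vmdk"]},
    "SCSI1": {"bus_sharing": "Physical", "disks": ["quorum", "data"]},
}
bad_layout = {
    "SCSI0": {"bus_sharing": "Physical", "disks": ["boot.vmdk", "quorum"]},
}
print(separation_problems(good_layout, cluster_disks))
print(separation_problems(bad_layout, cluster_disks))
```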
Method 3: vMotion Restriction
As a temporary measure, disable vMotion for MSCS VMs:
- Select the VM in vSphere Client
- Go to Configure > VM Options > vMotion
- Select Disabled
5. Optimized MSCS Configuration
RDM Configuration Best Practices
Component | Recommended Setting |
---|---|
RDM Mode | Physical Compatibility Mode |
Storage Protocol | FC, FCoE, Native iSCSI |
Path Policy | Round Robin (preferred), Fixed, MRU |
Virtual Hardware Version | 11 or higher |
vMotion Network | 10GbE minimum (1GbE not supported) |
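The table above lends itself to an automated checklist. The following sketch encodes each row as a simple rule; the `NodeConfig` fields are hypothetical names chosen for this example, and the values would need to be collected by hand or from your own inventory tooling.

```python
from dataclasses import dataclass

ALLOWED_PROTOCOLS = {"fc", "fcoe", "native iscsi"}
ALLOWED_PATH_POLICIES = {"round robin", "fixed", "mru"}

@dataclass
class NodeConfig:
    rdm_mode: str
    storage_protocol: str
    path_policy: str
    hardware_version: int
    vmotion_link_gbps: float

def deviations(cfg: NodeConfig) -> list:
    """Return the names of settings that deviate from the table above."""
    checks = {
        "rdm_mode": cfg.rdm_mode.lower() == "physical",
        "storage_protocol": cfg.storage_protocol.lower() in ALLOWED_PROTOCOLS,
        "path_policy": cfg.path_policy.lower() in ALLOWED_PATH_POLICIES,
        "hardware_version": cfg.hardware_version >= 11,
        "vmotion_link_gbps": cfg.vmotion_link_gbps >= 10,
    }
    return [name for name, ok in checks.items() if not ok]

# Hypothetical nodes: one compliant, one with three deviations.
compliant = NodeConfig("Physical", "FC", "Round Robin", 13, 10)
outdated = NodeConfig("Virtual", "FC", "Fixed", 10, 1)
print(deviations(compliant))
print(deviations(outdated))
```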
Detailed Configuration Steps
Step 1: SCSI Controller Separation
SCSI0: Boot disk (Bus Sharing: None)
SCSI1: Cluster shared disk (Bus Sharing: Physical)
Step 2: RDM Setup
- Select Physical Compatibility mode
- Configure Perennially Reserved flag
- Assign consistent SCSI IDs across all ESXi hosts
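Consistency across hosts is easy to get wrong by hand, so it can help to compare what each host reports. The sketch below models the "SCSI ID" as a LUN number per device, which is an assumption about what the step refers to; the host names and device identifiers are placeholders.

```python
from collections import defaultdict

def inconsistent_luns(host_luns: dict) -> dict:
    """Return devices whose LUN number differs between hosts,
    or that are missing on at least one host."""
    seen = defaultdict(dict)  # device -> {host: lun}
    for host, devices in host_luns.items():
        for device, lun in devices.items():
            seen[device][host] = lun
    return {
        device: by_host
        for device, by_host in seen.items()
        if len(set(by_host.values())) > 1 or len(by_host) < len(host_luns)
    }

# Placeholder inventory: device "naa.example-b" is presented differently.
inventory = {
    "esx01": {"naa.example-a": 1, "naa.example-b": 2},
    "esx02": {"naa.example-a": 1, "naa.example-b": 5},
}
print(inconsistent_luns(inventory))
```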
Step 3: Network Configuration
- Configure dedicated heartbeat network
- Ensure 10GbE network for vMotion
- Set up DRS Anti-affinity rules
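The intent of the anti-affinity rule is that no two nodes of the same cluster share an ESXi host. A quick way to audit this from an exported VM-to-host mapping is sketched below; the VM and host names are hypothetical, and the function does not talk to vCenter.

```python
from collections import Counter

def colocated_nodes(placement: dict, cluster_nodes: list) -> list:
    """Return the cluster nodes that share an ESXi host with another node."""
    per_host = Counter(placement[vm] for vm in cluster_nodes)
    return sorted(vm for vm in cluster_nodes if per_host[placement[vm]] > 1)

# Hypothetical placement: two nodes ended up on the same host.
placement = {"sql-node1": "esx01", "sql-node2": "esx01", "sql-node3": "esx02"}
print(colocated_nodes(placement, ["sql-node1", "sql-node2", "sql-node3"]))
```

An empty result means every node of the cluster runs on its own host.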
6. Monitoring and Prevention
Log Monitoring
Monitor the following logs regularly to catch early warning signs before PSOD occurrence:
# Monitor VMkernel logs
tail -f /var/run/log/vmkernel.log | grep -iE "scsi|cluster"
# Check vMotion-related logs
tail -f /var/run/log/vmkernel.log | grep -i "migrate"
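For logs that have been copied off the host, the same case-insensitive filters can be applied in bulk to get a quick count per keyword. This is a minimal sketch: the keywords mirror the two grep filters above, and the sample lines are made up for illustration and are not real vmkernel output.

```python
import re
from collections import Counter

# Same keywords as the grep filters above, matched case-insensitively.
PATTERNS = {
    "scsi": re.compile(r"scsi", re.IGNORECASE),
    "cluster": re.compile(r"cluster", re.IGNORECASE),
    "migrate": re.compile(r"migrate", re.IGNORECASE),
}

def summarize(lines) -> dict:
    """Count how many lines match each keyword."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return dict(counts)

# Made-up lines, used only to exercise the function.
sample_lines = [
    "example: SCSI reservation conflict on shared device",
    "example: Migrate: vMotion started for clustered VM",
    "example: unrelated network message",
]
print(summarize(sample_lines))
```

To run it against a saved log, pass an open file object, for example `summarize(open("vmkernel.log"))`, where the path is whatever you copied the log to.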
Regular Check Items
Check Item | Frequency | Method |
---|---|---|
Patch Level | Monthly | vSphere Update Manager |
SCSI Configuration | Quarterly | VM Settings Review |
Storage Health | Weekly | Array Log Check |
vMotion Performance | Real-time | vCenter Monitoring |
Backup Strategy
MSCS environments have special backup considerations:
- Use agent-based (in-guest) backup solutions, because snapshot-based VM backups are not supported on VMs that use physical bus sharing
- Implement cluster-aware backup solutions
- Maintain application-level backups
Compatibility Matrix
Check the latest compatibility information in the VMware Compatibility Guide.
While the “SCSI_DeviceClusteringClearState” PSOD may seem complex, it’s entirely manageable with the right understanding and a systematic approach. The most reliable solution is applying the official patches, but the workarounds presented here provide viable alternatives for environments where immediate patching isn’t possible.
Prevention is key. When setting up MSCS or Oracle RAC environments, follow VMware’s recommended configuration guidelines from the start and implement regular monitoring to catch issues early.
For additional technical support, contact VMware official support or visit the Broadcom Support Portal for expert assistance.