Determine which disk is about to fail in Windows Storage Spaces

I have a server in my office that runs several dev virtual machines. It runs Windows Server 2012 R2 with Hyper-V and Storage Spaces. There are 5 HDDs combined into one storage pool, with a single virtual disk on top of that pool:

  • Layout = Mirror
  • Physical disk redundancy = 1
    (every chunk is stored on 2 random disks)

Recently my virtual machines became noticeably slow, a clear sign that a disk is about to fail. But which one?

Storage Spaces

In the picture below I have already retired PhysicalDisk5, but Storage Spaces still thinks the disk is perfectly fine.

 

Disk properties indicate no issues.

PowerShell

Get-PhysicalDisk | Select-Object -Property FriendlyName, DeviceID, SerialNumber, Model
Get-PhysicalDisk | Select-Object -Property *
Get-PhysicalDisk | Select-Object -first 1 | Select-Object -Property *

PowerShell does not show any useful information either.

Solution

The solution was to use HDSentinel (or probably any other disk tool that can read the bad sector count). There is a portable 30-day trial version of it, so you can solve rare disk problems like this one at no extra cost: https://www.hdsentinel.com/download.php
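Before reaching for a third-party tool, it may also be worth checking the built-in reliability counters. The following is only a sketch: it assumes the Get-StorageReliabilityCounter cmdlet is available on your Windows version and that your drives and controller actually report these values (many counters can come back empty). I have not verified that it would have exposed the failing disk in this particular case.

```powershell
# Sketch only: list per-disk reliability counters, disks with the most read errors first.
# Which counters are populated depends on the drive and controller; empty values are common.
Get-PhysicalDisk | ForEach-Object {
    $disk = $_
    $disk | Get-StorageReliabilityCounter |
        Select-Object @{ Name = 'FriendlyName'; Expression = { $disk.FriendlyName } },
                      @{ Name = 'SerialNumber'; Expression = { $disk.SerialNumber } },
                      ReadErrorsTotal, ReadErrorsUncorrected, WriteErrorsTotal,
                      ReadLatencyMax, WriteLatencyMax, Temperature
} | Sort-Object -Property ReadErrorsTotal -Descending |
    Format-Table -AutoSize
```

A disk whose error counts or maximum latencies stand out from the rest is the first candidate to inspect more closely with a SMART tool.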

 

What to do

If all virtual disks are healthy, replacing the disk is straightforward.

  1. Get the disk model and serial number so you can identify the physical drive:
    Get-PhysicalDisk | Select-Object -Property FriendlyName, DeviceID, SerialNumber, Model
  2. Turn the server off and unplug the bad disk.
  3. Insert the new disk. It must have the same or greater capacity.
  4. In Storage Spaces, try to remove the disk marked "lost communication" and choose the new one.
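The last two steps can probably be done from PowerShell as well. This is a minimal sketch, not something taken from the original procedure: the pool name Pool1 is hypothetical, and it assumes the new disk is the only disk eligible for pooling and that the unplugged disk is reported with the operational status "Lost Communication".

```powershell
# Hypothetical pool name; use the value shown by Get-StoragePool.
$poolName = 'Pool1'

# The new disk: CanPool is true only for disks that are eligible to join a pool.
# Assumption: exactly one such disk exists. Check the output before continuing.
$newDisk = Get-PhysicalDisk -CanPool $true
$newDisk | Select-Object -Property FriendlyName, SerialNumber, Size
Add-PhysicalDisk -StoragePoolFriendlyName $poolName -PhysicalDisks $newDisk

# The unplugged disk. Assumption: it is reported as 'Lost Communication'.
$lostDisk = Get-PhysicalDisk |
    Where-Object { $_.OperationalStatus -eq 'Lost Communication' }

# Remove the missing disk from the pool (asks for confirmation),
# then rebuild the missing data copies on the remaining disks.
Remove-PhysicalDisk -StoragePoolFriendlyName $poolName -PhysicalDisks $lostDisk
Get-VirtualDisk | Repair-VirtualDisk
```

If the removal is refused, the GUI steps above are the path that was actually used here.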

 

If your pool is already Degraded, you need to be gentle and careful.

  1. Using PowerShell, mark the disk as retired.
  2. Right-click the disk in the Storage Spaces physical disks list and try to remove it.
  3. If the removal was successful, repair all virtual disks first.
    If not, there is not enough free space in the pool to move the data off that disk. If all SATA ports are occupied and you cannot insert a new disk, you are stuck: you need to move all data elsewhere and recreate the virtual disk :(
  4. Unplug the bad disk and replace it with a new one.
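The retire and repair steps above could look roughly like this in PowerShell. Treat it as a sketch under assumptions: the pool name and serial number are placeholders, and the order (retire, repair, then remove) is the usual PowerShell sequence rather than the exact GUI order listed above.

```powershell
# Placeholders: replace with your own pool name and the serial number of the bad disk.
$poolName  = 'Pool1'
$badSerial = 'SERIAL-OF-BAD-DISK'

$badDisk = Get-PhysicalDisk | Where-Object { $_.SerialNumber -eq $badSerial }

# Mark the disk as retired so that no new data is written to it.
$badDisk | Set-PhysicalDisk -Usage Retired

# Start rebuilding the data copies on the remaining disks.
Get-VirtualDisk | Repair-VirtualDisk -AsJob

# Check progress; rerun this until no repair jobs are listed as running.
Get-StorageJob

# Only after the repair has finished: remove the disk from the pool (asks for confirmation).
Remove-PhysicalDisk -StoragePoolFriendlyName $poolName -PhysicalDisks $badDisk
```

With a disk full of bad blocks the repair can run for days, so check Get-StorageJob and the virtual disk health before pulling the drive.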

 

Final thoughts

I will never use Physical disk redundancy = 1 again. Relying on the assumption that only one disk will fail at a time is not enough. In my case one disk failed and another was full of bad blocks. I was able to be patient and let the disk copy its data off veeeery slowly. It took a week, no kidding. But next time I might not be that lucky.
