24

I want to use badblocks to check my HDDs and would appreciate clarification of its operation.

Can someone please explain the best options to use with -b and -c? I have included their definitions from the man page, but am not sure whether larger sizes would be beneficial for modern disks with 64 MB of cache and 4k sectors.

-b block-size          Specify the size of blocks in bytes. The default is 1024.
-c number-of-blocks    The number of blocks which are tested at a time. The default is 64.

Secondly, is the write-mode test any more thorough than the non-destructive read-write mode?

Lastly, how many SMART sector re-allocations are acceptable? Should drives with non-zero reallocation counts be immediately replaced?

Giacomo1968
Yoren

6 Answers

23

Question 1:

With regards to the -b option: this depends on your disk. Modern, large disks have 4KB blocks, in which case you should set -b 4096. You can get the block size from the operating system, and it's also usually obtainable by either reading the disk's information off of the label, or by googling the model number of the disk. If -b is set to something larger than your block size, the integrity of badblocks results can be compromised (i.e. you can get false negatives: no bad blocks found when they may still exist). If -b is set to something smaller than the block size of your drive, the speed of the badblocks run can be compromised. I'm not sure, but there may be other problems with setting -b to something smaller than your block size: since it isn't verifying the integrity of an entire block, it might still be possible to get false negatives if it's set too small.
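A few common ways to read the sector size from the operating system on Linux (these are standard utilities, not something from the answer above; /dev/sdX is a placeholder for your disk):

# physical vs. logical sector size as reported by the kernel
lsblk -o NAME,PHY-SEC,LOG-SEC /dev/sdX
blockdev --getpbsz --getss /dev/sdX
cat /sys/block/sdX/queue/physical_block_size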

The -c option corresponds to how many blocks should be checked at once. Batch reading/writing, basically. This option does not affect the integrity of your results, but it does affect the speed at which badblocks runs. badblocks will (optionally) write, then read, buffer, check, repeat for every N blocks as specified by -c. If -c is set too low, this will make your badblocks runs take much longer than they otherwise would, as queueing and processing a separate IO request incurs overhead, and the disk might also impose additional overhead per-request. If -c is set too high, badblocks might run out of memory. If this happens, badblocks will fail fairly quickly after it starts. Additional considerations here include parallel badblocks runs: if you're running badblocks against multiple partitions on the same disk (bad idea), or against multiple disks over the same IO channel, you'll probably want to tune -c to something sensibly high given the memory available to badblocks so that the parallel runs don't fight for IO bandwidth and can parallelize in a sane way.
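As a rough illustration (my own example, not from the answer above), a run tuned for a 4k-sector disk might look like this; the device name and the -c value are arbitrary, and the memory figures are approximate:

# 4096 blocks of 4096 bytes per batch, i.e. roughly 16 MiB per buffer; the
# non-destructive (-n) test keeps extra copies of the data, so budget a few times that
badblocks -b 4096 -c 4096 -nsv /dev/sdX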

Question 2:

Contrary to what other answers indicate, the -w write-mode test is neither more nor less reliable than the non-destructive read-write test, but it is twice as fast, at the cost of being destructive to all of your data. I'll explain why:

In non-destructive mode, badblocks does the following:

  1. Read existing data, checksum it (read again if necessary), and store it in memory.
  2. Write a predetermined pattern (overridable with the -t option, though usually not necessary) to the block.
  3. Read the block back, verifying that the read data is the same as the pattern.
  4. Write the original data back to the disk.
    • I'm not sure about this, but it also probably re-reads and verifies that the original data was written successfully and still checksums to the same thing.

In destructive (-w) mode, badblocks only does steps 2 and 3 above. This means that the number of read/write operations needed to verify data integrity is cut in half. If a block is bad, the data will be erroneous in either mode. Of course, if you care about the data that is stored on your drive, you should use non-destructive mode, as -w will obliterate all data and leave badblocks' patterns written to the disk instead.
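In terms of the command line (my addition; the device name is a placeholder), the two modes are selected like this:

# non-destructive read-write test: saves and restores your data
badblocks -nsv /dev/sdX

# destructive write-mode test: overwrites the entire device with test patterns
badblocks -wsv /dev/sdX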

Caveat: if a block is going bad, but isn't completely gone yet, some read/write verification pairs may work, and some may not. In this case, non-destructive mode may give you a more reliable indication of the "mushiness" of a block, since it does two sets of read/write verification (maybe; see the bullet under step 4). Even if non-destructive mode is more reliable in that way, it's only more reliable by coincidence. The correct way to check for blocks that aren't fully bad but can't sustain multiple read/write operations is to run badblocks multiple times over the same data, using the -p option.
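For example (my own sketch, device name is a placeholder), to require several consecutive clean passes before badblocks declares the disk good:

# keep rescanning until 3 consecutive passes turn up no new bad blocks
badblocks -nsv -p 3 /dev/sdX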

Question 3:

If SMART is reallocating sectors, you should probably consider replacing the drive ASAP. Drives that lose a few sectors don't always keep losing them, but the cause is usually a heavily-used drive getting magnetically mushy, or failing heads/motors resulting in inaccurate or failed reads/writes. The final decision is up to you, of course: based on the value of the data on the drive and the reliability you need from the systems you run on it, you might decide to keep it up. I have some drives with known bad blocks that have been spinning with SMART warnings for years in my fileserver, but they're backed up on a schedule such that I could handle a total failure without much pain.
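As a quick way to check those counters (smartctl from smartmontools is the standard tool for this, and not something this answer showed; the device name is a placeholder):

# attribute 5 (Reallocated_Sector_Ct) plus pending/offline-uncorrectable counts
smartctl -A /dev/sdX | grep -Ei 'realloc|pending|uncorrect'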

Zac B
4

1) If your modern disk uses a sector size other than 512 bytes, then you need to set that size with the -b option (e.g. -b 4096). Without that option your check will run much slower, as each real sector will be tried multiple times (8 times in the case of a 4k sector). Also, as Olivier Dulac mentioned in a comment on the question, a block is indeed 1 block, and not 1/2 or 1/4th or even 2 (or more) blocks.

The -c option determines how many sectors are tried at once. It can have some effect on performance, and the size of that effect may depend on the specific disk model.

2) Write-mode test: in my understanding it will only tell you whether you have a hard-bad error or a soft-bad error (a.k.a. silent data degradation, bit rot, decay of storage media, UNC sectors).

3) I would not trust a SMART report at a single point in time. It is more important how the values change over time. There is also research by Google, Failure Trends in a Large Disk Drive Population, and some discussion of it. Here is a quote from the research:

Despite this high correlation, we conclude that models based on SMART parameters alone are unlikely to be useful for predicting individual drive failures.
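One minimal way to capture that trend (my own sketch; the schedule, device, and log path are arbitrary) is a cron entry that appends the attribute table to a log you can diff later:

# crontab entry: every Sunday at 03:00, append a timestamped SMART dump
0 3 * * 0    date >> /var/log/smart-sda.log && smartctl -A /dev/sda >> /var/log/smart-sda.log

The smartd daemon from the same smartmontools package can do this job more robustly, tracking attribute changes and sending mail when they move.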

Regarding the disk-replacement advice mentioned by others: you may not have a hard-bad disk problem but rather silent data degradation (bit rot, decay of storage media, UNC sectors). In that case it makes no sense to replace the disk; instead it is useful to read the same data and write it back to the disk. You could look here for how that can be resolved.
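One hedged way to do that read-then-rewrite of the existing data is simply a non-destructive badblocks pass, since it writes each block's original contents back after testing it (my addition; device name is a placeholder):

# each block is read, overwritten with test patterns, verified, then restored,
# which forces the drive to rewrite (and, if needed, remap) every sector
badblocks -nsv /dev/sdX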

If you have a hard-bad error, you could try to repartition the drive so that the bad area lies outside of any partition. For me that approach was useful, and such a bad drive was used for a long time without any problems.
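A sketch of fencing off a bad region with parted (my own example; the offsets are made up, and you would derive them from where badblocks reported errors):

# suppose the bad area sits roughly between 400 GB and 410 GB into the disk
# (example uses an msdos label; on GPT, mkpart takes a partition name instead of "primary")
parted /dev/sdX mkpart primary ext4 1MiB 399GB
parted /dev/sdX mkpart primary ext4 411GB 100%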

Broomerr
1

I would leave -b and -c at their defaults unless you have a specific reason to change them. You could probably set -b to 4096 if your disk has a 4k block size.

I would suggest you first run badblocks with the non-destructive rw test. If it finds any bad sectors, the disk is broken and should be replaced. If it does NOT find any bad blocks with the non-destructive test, but you still suspect the disk has bad blocks, then run the destructive rw test.
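If you go that route, it can also help to keep the results (my own addition, not part of this answer): badblocks can save the bad-block list to a file, which e2fsck can then mark as unusable on an ext2/3/4 filesystem, provided the -b value matches the filesystem block size:

# non-destructive test of one partition, recording any bad blocks found
badblocks -nsv -b 4096 -o /root/sdX1.bad /dev/sdX1

# mark those blocks as bad in the filesystem's bad-block inode
e2fsck -l /root/sdX1.bad /dev/sdX1

Running e2fsck -cc instead does both steps in one go, invoking badblocks in non-destructive read-write mode itself.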

Lastly, how many SMART sector re-allocations are acceptable? Should drives with non-zero reallocation counts be immediately replaced?

I would replace the drive as soon as sectors are being reallocated.

Thomas
0

This kind of goes to the difference between badblocks read mode (non-destructive) and write mode (destructive):

A drive will only reallocate a bad sector when a write fails. Read errors for files only get "corrected" when an attempt is made to re-write the file. Otherwise, the bad block remains part of the file on the assumption that you might be able to recover something. Read errors in the partition table can only be "corrected" by running badblocks in write mode and recreating the partition table.

So, read mode will tell you where the bad blocks are but can't do anything about them. Write mode tests the health of each sector and will cause the disk to re-allocate a bad block but at the expense of destroying the data. Take your pick.
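If you already know the LBA of a pending sector (e.g. from a SMART self-test log or kernel messages) and have written off its contents, a hedged alternative to a full write-mode run is to overwrite just that sector so the drive gets the chance to remap it; the sector number below is made up, and the write destroys whatever was stored there:

# confirm the sector really is unreadable, then force a rewrite of it
hdparm --read-sector 123456789 /dev/sdX
hdparm --yes-i-know-what-i-am-doing --write-sector 123456789 /dev/sdX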

0

To answer your last question about remapped sectors, it depends. I'm speaking from the context of a home user who (occasionally) monitors this kind of stuff.

  • How critical is the data stored on the drive?
  • What is lost if the drive suddenly goes belly up?
  • Is the data backed up elsewhere?
  • Is the drive a member of a RAID where loss of the drive has minimal impact?
  • Is the number of remapped sectors growing?

Here are two situations I faced. I had a RAID5 of six 200 GB drives. After a power failure that resulted in flickering lights, one drive showed 14 remapped sectors and logged several errors. I watched the drive and no more errors were logged and the remapped sector count remained stable. I concluded that the drive suffered due to a power transient and was not otherwise failing. I continued using it for years. The original RAID5 was retired but I have two of those drives in service with about 10 years of power-on hours. They have a handful of remapped sectors. I use two of them mirrored to store incremental backup dumps from my primary backup. That way the main backup is seeing (mostly) read operations and the writes are going to different devices. If one of these ancient drives fails, the other should keep on going. If both fail, I replace them with something else and rerun the backup script. The impact if one of these drives fails is near zero, so I don't worry about remapped sectors.

I had a 2TB HDD that was one of a pair of mirrored drives and which started to grow remapped sectors. At first it was dozens, then hundreds, then thousands. This was over a period of years. The other drive in the pair remained healthy and, in fact, the slowly failing drive was not dropped from the array. Eventually I replaced both drives with 6TB drives and the growing remapped sector count became a non-issue. I still have the drive and it still "works," even with about 4500 remapped sectors. I have put drives like this in a test system (as a RAID member) to see what happens when one actually dies. I have had a couple of opportunities to work with this and in all circumstances the replacement went without drama.

I did have a drive fail on my primary backup file server. It produced no advance warning; it just stopped responding to SATA commands. It was a member of a ZFS RAIDZ2 and I replaced it without any drama. In fact, on my test server I have replaced failing drives without power cycling or rebooting the server.

One more point to note: I have on-site and off-site backups of all important data. If any one system is lost, there are two copies of the data elsewhere.

HankB
0

As a few have suggested, 4096 is a good block size to use. But I would like to add that in some scenarios it might not be enough: you might need to use an even bigger block size just to avoid triggering a bug in the badblocks tool. For some reason the badblocks shipped with my 2019 Synology is apparently quite old (I can't get it to print its version). fdisk was reporting 512-byte sectors on my 18TB drive, so 4096 felt big enough, but the check finished significantly faster than on my 16TB drive. Then, multiplying the block count badblocks reported by the block size gave only a fraction of the drive's size. At first it felt to me like badblocks couldn't detect the full size of the drive.

But now it looks like I made a 32-bit integer overflow in some old version of the tool (despite being on a 64-bit architecture, my badblocks apparently uses a 32-bit number as the block counter).

ash-4.3# badblocks -b 4096 -p 3 -s -v -w /dev/sdq
Checking for bad blocks in read-write mode
From block 0 to 99614719
Testing with pattern 0xaa: 0.20% done, 0:03 elapsed. (0/0/0 errors)

99614719 blocks * 4096 bytes is only around 408 GB, far short of 18 TB.

I then tried a 512-byte block size (again it did not cover the whole drive), and then tried 8192:

ash-4.3# badblocks -b 8192 -p 3 -s -v -w /dev/sdq
Checking for bad blocks in read-write mode
From block 0 to 2197291007
Testing with pattern 0xaa:   1.04% done, 11:20 elapsed. (0/0/0 errors)

2197291007 blocks * 8192 bytes => about 18000 GB, i.e. the full 18 TB.

And log2(2197291007) ≈ 31.03, meaning that with an 8192-byte block size the block count already needs more than 31 bits. If I halved the block size to 4096, the block count would double and need more than 32 bits, and it looks like that is where it overflowed in my case.
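A quick way to sanity-check this before starting a multi-day run (my own sketch; it assumes the counter is an unsigned 32-bit integer, and blockdev is a standard utility) is to compute the block count for your chosen -b up front:

DEV=/dev/sdX        # device to test (placeholder)
BS=4096             # candidate -b value
BLOCKS=$(( $(blockdev --getsize64 "$DEV") / BS ))
# if the count needs more than 32 bits, pick a larger -b (or a newer badblocks)
[ "$BLOCKS" -ge 4294967296 ] && echo "$BLOCKS blocks would overflow a 32-bit counter"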

Again, this might be a non-issue on distros that do not ship archaic versions of the tools (I assume only my badblocks has a 32-bit counter and newer versions use a 64-bit one), or if you use smaller drives, but in certain edge cases you might need to consider a minimum block size just to avoid running into problems.