Erasing a Hard Disk
Why Would I Want to Erase my Hard Disk?
Data is stored inside of a file system on your hard disk.
Deleting a file using your operating system's file manager may move the
data outside of the file system, but not necessarily remove it from the
disk. For example, in Microsoft Windows, emptying the recycle bin
permanently deletes the files' entries in the file system while leaving
the files themselves untouched. The space where a file was located is then
marked as unused space, and the data will remain on the disk until it
is overwritten by another file. The reason your computer does this is
speed - it takes a lot less time to delete the few numbers making up
the file system entry than to delete the file itself, especially if
it's a really large file.
Data remanence isn't normally an issue in everyday use. In fact,
it's somewhat handy in that it's sometimes possible to recover
accidentally deleted files using software designed for that purpose. It
only becomes a problem when a drive has been used to store sensitive
information like account details or customer records and is then
repurposed or sold. Reformatting the disk will erase the file system
and write a new one on top of it, but all of the old files are still
there and able to be recovered by the aforementioned software until the
space occupied by the files is explicitly written over.
File System Corruption
Sometimes after a file system failure, reformatting a disk
isn't enough to get it working properly again. In some cases, the
program used to reformat the disk doesn't completely write over the old
file system. Zeroing the disk will sometimes fix the problem, and is a
less severe option than a low-level format.
The way I erase a disk is to boot the computer from a
"Live CD" such as Knoppix. Download it,
write it to a CD or flash drive, then boot your computer from that.
After bootup you'll see a menu; select Graphical Programs, then Full X
Session. Once everything's
loaded, click on the menu icon in the lower left corner of the screen,
click on Accessories, then Root Terminal.
Unless your computer only has one hard disk with one partition
on it, you'll need to know which disk you want to erase. Knoppix uses
device nodes to refer to drives:
/dev/sda is the primary drive, /dev/sda1 is the first partition on the
primary drive, /dev/sdb is the secondary drive, and so on. If you can't
figure out your drive's name, type df -h to bring up a list of all of
the mounted disks and partitions. Hard disks will be listed at the
bottom, and it should look something like this:
Filesystem      Size  Used Avail Use% Mounted on
...temporary filesystems edited out...
You'll be using the dd program to erase the disk.
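Since dd will write to whatever name you give it, it may be worth checking that the name really is a block device before going any further. This is only a minimal sketch: /dev/sda is a placeholder for the disk you identified, and is_block_device is a helper name made up for this example.

```shell
# Return success only if the argument names an existing block device.
# (is_block_device is a hypothetical helper, not a standard command.)
is_block_device() {
    [ -b "$1" ]
}

# Assumption: replace /dev/sda with the disk you picked out earlier.
target=/dev/sda

if is_block_device "$target"; then
    echo "$target is a block device"
else
    echo "$target is not a block device; double-check the name"
fi
```

The check doesn't tell you that it's the right disk, only that the name refers to a disk-like device rather than a typo or an ordinary file.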
Zeroing the Disk
Here's an example of how to write zeroes over all of
the primary hard disk.
dd if=/dev/zero of=/dev/sda bs=1M
The virtual device node /dev/zero produces a stream of zeroes and is
used as the input file, /dev/sda is the primary hard disk and is used
as the output file, and the block size is 1MB. Here's another example,
this time erasing only the first partition on the primary hard disk.
dd if=/dev/zero of=/dev/sda1 bs=1M
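If you'd like to see what the command does before pointing it at a real disk, you can rehearse it on a scratch file. The sketch below assumes a system with mktemp and a dd that accepts bs=1M; the scratch file stands in for /dev/sda, and nonzero_bytes is a helper name invented for the example.

```shell
# Count the bytes in a file that are not zero (hypothetical helper).
nonzero_bytes() {
    tr -d '\000' < "$1" | wc -c
}

# Scratch file standing in for the disk; nothing else is touched.
image=$(mktemp)

# Fill it with 4 MiB of random "old data".
dd if=/dev/urandom of="$image" bs=1M count=4 2>/dev/null

# Overwrite it with zeroes. conv=notrunc keeps the file at its existing
# size, the way a disk keeps its capacity; count=4 is needed here only
# because a file, unlike a disk, would otherwise keep growing.
dd if=/dev/zero of="$image" bs=1M count=4 conv=notrunc 2>/dev/null

# A result of 0 means every byte was overwritten.
echo "non-zero bytes left: $(nonzero_bytes "$image")"

rm -f "$image"
```

On a real disk there's no count argument; dd simply stops with a "no space left on device" message when it reaches the end.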
The procedure is the same for any other disk or partition. Please note,
however, that this process can take a long time. The last disk I used
this on was a 100GB ATA-100 disk, which took about an hour. Smaller
capacity disks and faster interfaces mean shorter wait times; larger
capacity disks and slower interfaces mean longer wait times. By default
dd doesn't show a progress indicator (recent GNU versions accept
status=progress), but you can estimate the amount of
time it will take by dividing the disk's capacity by its maximum
sustained write rate. For example, my ATA-100 disk had a capacity of
102,400MB and a maximum sustained write rate of 35MB/s:
102,400/35=2,925 seconds or 48 minutes.
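The same estimate can be worked out in the terminal. This is just a sketch of the division described above; estimate_wipe_seconds is a made-up helper name, and the figures are the ones from my ATA-100 disk.

```shell
# Estimate wipe time in whole seconds: capacity (MB) / write rate (MB/s).
# Shell arithmetic is integer-only, so the result is rounded down.
estimate_wipe_seconds() {
    capacity_mb=$1
    rate_mb_per_s=$2
    echo $(( capacity_mb / rate_mb_per_s ))
}

seconds=$(estimate_wipe_seconds 102400 35)
echo "$seconds seconds, or $(( seconds / 60 )) whole minutes"
# prints "2925 seconds, or 48 whole minutes"
```

Treat the result as a lower bound; a disk rarely holds its maximum sustained write rate from start to finish.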
Securely Erasing the Disk
In cases where the data on the disk is really valuable, you may want to
go one step further and overwrite everything with random data. It
sounds paranoid, but there is a reason for this. The controller inside
the disk drive uses low-level formatting to find and retrieve data.
When a sector produces an error, the controller marks it as bad and
ignores it. Heavily used disks may have a large number of bad sectors,
and continuous strings of bad sectors may contain readable data that
can't be erased, although it's likely to be fragmentary and lacking
context as to what type of data it is. The bad sectors can usually only
be read by software specific to that particular make and model of
drive, but if the data is valuable enough someone may be interested in
doing all that. Random data hides any unerasable data in a sea of
garbage. Here's how to fill the primary drive with random numbers.
dd if=/dev/urandom of=/dev/sda bs=1M
The only change is replacing /dev/zero with /dev/urandom, which is the
random number generator in Linux. Unfortunately, generating random
numbers takes longer than generating zeroes. The machine that I talked
about in the last section, an old IBM notebook, could fill the disk
with zeroes at 35MB/s, but random numbers at only 2.7MB/s. This
procedure on its 100GB disk took about ten hours.
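To get a feel for how much slower random data will be on your own machine, you can time both sources without writing to any disk. This is a rough sketch: sample_source is a hypothetical helper, the 64MB sample size is arbitrary, and it relies on dd printing its own timing summary when it finishes, as GNU dd does.

```shell
# Read a sample from a source device and throw it away, so nothing on
# disk is written. dd prints the elapsed time and rate on stderr.
# $1 = source device, $2 = sample size in 1M blocks
sample_source() {
    dd if="$1" of=/dev/null bs=1M count="$2"
}

echo "zeroes:"
sample_source /dev/zero 64

echo "random numbers:"
sample_source /dev/urandom 64
```

This only measures how fast the data can be generated. The real erase will run at whichever is slower, the generator or the disk's write rate.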
Special Case: SSDs
The solid state drives that have become popular in the last few years
are a special case. An SSD uses an array of flash EEPROM chips rather
than a magnetic disk as in a normal hard drive. Because EEPROMs can only
be written and erased a finite number of times, the drive's
controller uses a wear leveling algorithm to evenly distribute writes
across the various cells of the EEPROMs to ensure that no one cell
receives a large enough number of writes that it becomes unusable and
has to be removed from the array. Because of this, using dd to erase an
SSD may not work, since the controller decides which cells each write
actually lands on, and it uses up some of the drive's limited write
cycles. Fortunately,
most SSD makers provide a program that can be used to reset all of the
cells to their original state, a process that only takes a minute or so.
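If you aren't sure whether a drive is an SSD, Linux keeps a hint in sysfs. The sketch below assumes a Linux live system that exposes /sys/block/NAME/queue/rotational; disk_kind is a made-up helper name, and the optional second argument exists only so the function can be tried against a test directory.

```shell
# Report whether the kernel considers a disk rotational (a spinning
# hard disk) or non-rotational (typically an SSD or flash drive).
# $1 = disk name without /dev/, for example sda
# $2 = optional sysfs root, defaults to /sys/block
disk_kind() {
    flag_file="${2:-/sys/block}/$1/queue/rotational"
    if [ ! -r "$flag_file" ]; then
        echo "unknown"
        return 1
    fi
    if [ "$(cat "$flag_file")" = "1" ]; then
        echo "rotational"
    else
        echo "non-rotational"
    fi
}

# Assumption: sda is a placeholder for the disk you want to check.
echo "sda: $(disk_kind sda)"
```

The flag is only a hint. Drives behind USB adapters or RAID controllers may be reported incorrectly, so check the drive's label if in doubt.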
For all intents and purposes, writing random data to a modern disk once
is enough to render any remnant data on the disk unrecoverable. The
idea of writing multiple passes of random data to a disk originated
during a time when hard disks used stepper motors to position the head
over the data tracks, a less accurate method of positioning than
today's voice coil controlled drives, leaving data that was overwritten
once or more still readable. For any consumer level data, or even most
corporate data, a single pass with random numbers, or zeroing followed
by random numbers, should be enough. Outside of heroic measures like
removing the disk from the drive and reading it with a scanning
tunneling microscope, any data is effectively gone forever. For the
absolute most critical data, such as million dollar accounts, trade secrets,
and so on, it may be worth it to someone to do that and the drive will need
to be destroyed physically, but even then such an event is highly