Wayne Stegall


Copyright © 2014 by Wayne Stegall
Created April 16, 2014.  See Document History at end for details.


(Un)deleted Files

Your deleted files linger


Many people recently did their taxes on an internet-connected computer, then deleted their personal data to protect it from hackers.  But unless they took special precautions, it is still there.  Many people believe that a file is gone when they delete it from their computer.  Instead, all or nearly all of their deleted data remains.  How?

First, operating systems do not ordinarily erase data when a file is deleted.  Instead, the data is only unlinked from the file system and left intact until the unallocated space is needed to write new data.  A file is therefore not actually gone until it is subsequently written over.  Before that, it may sit a long time in the recycle bin before it is even unlinked.  In some cases the file system may act further to preserve the data.  Older file systems that always wrote new data to the lowest addresses of the hard disk created high odds that deleted files would be overwritten soon.  Modern file systems tend instead to write to unused areas because of schemes to minimize fragmentation.
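The unlink-without-erase behavior described above can be sketched with a toy model.  This is only an illustration, not how any real file system is implemented; the class ToyDisk, its methods, and the sample file contents are all hypothetical.

```python
class ToyDisk:
    """Toy model of a disk: deleting a file drops its directory entry only."""

    def __init__(self, n_blocks):
        self.blocks = [b""] * n_blocks      # raw block contents
        self.directory = {}                 # file name -> list of block indices
        self.free = list(range(n_blocks))   # unallocated blocks, in reuse order

    def write_file(self, name, chunks):
        indices = []
        for chunk in chunks:
            i = self.free.pop(0)            # take the next unallocated block
            self.blocks[i] = chunk
            indices.append(i)
        self.directory[name] = indices

    def delete_file(self, name):
        # Unlink only: the blocks go back on the free list, content untouched.
        indices = self.directory.pop(name)
        self.free.extend(indices)           # reused last, like a fresh-area-first policy
        return indices


disk = ToyDisk(4)
disk.write_file("taxes", [b"tax return part 1", b"tax return part 2"])
disk.delete_file("taxes")
disk.write_file("notes", [b"shopping list"])

print("taxes" in disk.directory)   # the file system no longer lists the file
print(disk.blocks[0])              # but its first block still holds the old data
```

Because freed blocks are placed at the end of the free list in this sketch, the new file lands in a previously unused block and the deleted data survives, which mirrors the fragmentation-avoiding behavior described in the text.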

However, even writing over unwanted data might not be enough to be rid of it.  Due to the nonlinear properties of magnetic storage, overwritten data may still be seen by specialized equipment.  The idea is that writing a value over the same value produces a stronger signal of that type than writing it over the opposite value.  Consider that the signal represented in figure 1 below contains both new data and overwritten data.  Can you read what is underneath?

Figure 1:  Binary magnetic sequence showing two layers of data
[Image: signal]

The following table gives the interpretation.

Table 1:  Translation of magnetic signal to visible and hidden layers.

Magnetic signal    Top Layer    Hidden Layer
Soft 0             0            1
Hard 0             0            0
Soft 1             1            0
Hard 1             1            1

Now both layers of data in figure 1 can be read.  The top layer is 01101001 and the erased layer underneath is 11001100.
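The lookup in Table 1 can be sketched in a few lines of Python.  The signal sequence used here is not read from the figure; it is reconstructed by applying Table 1 to the two bit strings quoted above, and the function names are hypothetical.

```python
# Table 1: magnetic signal level -> (top layer bit, hidden layer bit)
TABLE_1 = {
    "Soft 0": ("0", "1"),
    "Hard 0": ("0", "0"),
    "Soft 1": ("1", "0"),
    "Hard 1": ("1", "1"),
}

# Inverse mapping, used only to rebuild a signal from two known layers.
INVERSE = {bits: level for level, bits in TABLE_1.items()}


def encode(top, hidden):
    """Build the sequence of signal levels that carries both layers."""
    return [INVERSE[(t, h)] for t, h in zip(top, hidden)]


def decode(signal):
    """Split a sequence of signal levels into top and hidden bit strings."""
    top = "".join(TABLE_1[level][0] for level in signal)
    hidden = "".join(TABLE_1[level][1] for level in signal)
    return top, hidden


signal = encode("01101001", "11001100")
print(signal)
print(decode(signal))
```

A soft level marks a bit that was written over its opposite, and a hard level marks a bit written over the same value, so each level carries one bit from each layer.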

Perhaps that is not all.  A theorem in communications theory says that the data capacity of a channel is the product of its bandwidth and the logarithm of one plus its signal-to-noise ratio.[1]

C = B log2(1 + S/N) bits per second.

In digital terms this means that one bit of resolution can be assigned to each 6 dB of SNR, each bit representing a layer of information.  Ignoring bandwidth, a reasonable presumption of 48 dB of SNR for a magnetic system would imply a total of 8 layers of information.  Disk drives, however, may not utilize their full bandwidth.  Consider a likely manufacturing scenario where only large-capacity platters are manufactured and are then assigned to different capacity levels based on reliability testing.  A 320 GB drive may have a 1 TB platter that tested error-free only at the lower capacity.  In a case like this the roughly 3x excess bandwidth suggests 8 x 3 = 24 possible layers of data.  This would require the paranoid to overwrite their data 24 times.
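The arithmetic in this estimate can be written out as a short Python sketch.  It follows the rule of thumb used in the text (one layer per 6 dB of SNR, multiplied by the whole-number excess of platter capacity over rated capacity); the function names and the choice to round down are assumptions made for illustration.

```python
import math


def layers_from_snr(snr_db, db_per_bit=6.0):
    """Layers of data under the one-bit-per-6-dB rule of thumb."""
    return math.floor(snr_db / db_per_bit)


def overwrite_passes(snr_db, rated_gb, platter_gb, db_per_bit=6.0):
    """Layers multiplied by the whole-number excess capacity of the platter."""
    excess = math.floor(platter_gb / rated_gb)
    return layers_from_snr(snr_db, db_per_bit) * excess


print(layers_from_snr(48))              # 8 layers at 48 dB
print(overwrite_passes(48, 320, 1000))  # 8 x 3 = 24 passes for a 320 GB drive on a 1 TB platter
```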

The fact that hard disks provide an analog output implies that this kind of data retrieval is at least attemptable, if not an established practice among some.  However, because the mechanism for peeling the first two layers depends on the non-linearity of the magnetic medium, any attempt to peel more would require very advanced technology or might even be impossible.  Perhaps advanced digital signal processing algorithms would be required.  If anyone can do this or knows whether it is possible, it would be the National Security Agency, given their assignment to computer security and computer intelligence and their parallel skills in advanced cryptology.

What to do?

Protection from outside hackers who only have access to the file system can be secured by merely ensuring that data is actually written over.  Programs are available, like Linux's shred, that will overwrite a file before unlinking it from the file system.  Where someone can actually get your hard drive in hand, anything is possible that they are technically capable of.  Formatting the hard drive may not overwrite data but instead just unlink all the files and reset the file system, leaving the content of the files intact.  Some programs write alternating 0's and 1's, often in two layers, first hex 55 (01010101) then hex AA (10101010), and claim maximum security.  Presumably they believe that only one or two layers can be retrieved from under the data.  The fact that Linux's shred program has an option to overwrite a file 25 times suggests that someone else has anticipated the possibility that that many layers of data remain.  If nothing else, hard drive manufacturers usually offer diagnostic programs for download that have the ability to write zeros to the entire drive.
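The overwrite-then-unlink idea can be sketched in Python as follows.  This is a minimal illustration, not how shred itself is implemented; the function names are hypothetical, and the default passes are simply the hex 55 and hex AA patterns mentioned above.  It also assumes that the file system overwrites data in place, the same assumption shred's own documentation warns about, which may not hold on journaling or copy-on-write file systems or on solid-state drives.

```python
import os


def overwrite_file(path, patterns=(b"\x55", b"\xaa"), chunk_size=1 << 16):
    """Overwrite a file's contents in place, one full pass per pattern byte."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for pattern in patterns:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(pattern * n)      # n copies of the one-byte pattern
                remaining -= n
            f.flush()
            os.fsync(f.fileno())          # ask the OS to push this pass to disk


def overwrite_and_unlink(path, patterns=(b"\x55", b"\xaa")):
    """Overwrite the file first, then unlink it from the file system."""
    overwrite_file(path, patterns)
    os.remove(path)
```

With shred itself, the corresponding command would be along the lines of shred -n 25 -u filename, where -n sets the number of overwrite passes and -u removes the file afterward.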

Don't dispose of or give away your computer without erasing its hard drive.


[1] Ferrel G. Stremler, Introduction to Communication Systems (Reading, MA, 1982), p. 505.  The base of the logarithm in the equation only sets the units; base 2 gives the capacity in bits per second.

Document History
April 16, 2014  Created.