Deleted file instantly overwritten by 6-year-old file?


Nurick

Since moving from Windows XP to Windows 7 in March, I've been having very little luck with Recuva.  Files I try to recover are shown as overwritten nine out of ten times, seemingly regardless of the size, age or original location of the file I'm trying to recover.

 

Tonight I was trying to recover a 50MB file I had just deleted from a 16GB thumb drive.  Minutes after deleting the file (and without any further use of the drive) I wanted to get it back.  Using Deep Scan for all deleted files, 33 files were found.  The one I wanted, and 31 of the others, were all marked as unrecoverable because they were overwritten.  Ironically, the only deleted file marked recoverable (in excellent condition) was a two-year-old file that is by far the largest of the 33, at 770MB.  28 of the files -- including the one I wanted to get back -- were marked as overwritten by one specific 867MB file (an mp4 video, call it ABC.mp4) that is about 6 years old, and that I haven't accessed in at least a year.

 

As a test I then tried deleting two more files I no longer need (a 53MB mp4 and 321MB mp4) and then ran Recuva.  Sure enough they were both shown as unrecoverable because they were overwritten by ABC.mp4.

 

This is a FAT32 drive, but subsequent experimentation with an NTFS thumb drive showed a similar pattern.

 

Any thoughts on what might be going on?  I also tried a different recovery program, which similarly showed the just-deleted files as overwritten (although that program doesn't identify the overwriting file), so it may not be a Recuva-specific issue.

 

EDIT:  To experiment a little further, I created a 4KB text file, saved it, and immediately deleted it.  It is shown as recoverable.  But now the one previously recoverable file (the 770MB one) is also shown as unrecoverable because it was overwritten.  But NOT by ABC.mp4.  Instead the overwriting file is the 4KB text file I had just created and deleted (and there was 1.36GB of free space when I did so).


  • Moderators

This seems to be a characteristic of FAT drives. FAT uses a sequential list of cluster addresses (the FAT) which links the clusters used by a file together. On file deletion these cluster chains are broken/reset and I believe that this results in the message that an old file has overwritten a newer file. It's the FAT that says this, the file is not actually overwritten. I have not (yet) spent the many hours needed to grasp the idiosyncrasies of FAT to enable a more detailed explanation.

 

I've never seen this with NTFS.


Thanks.  Even though I'm still having a lot less success with Recuva and other recovery programs than I used to in XP (maybe related to how I migrated my data to Win7??), looking into it a bit further shows that the particularly strange behavior described in my first post seems unique to the FAT32 thumb drive.


  • Moderators

This is what I think happens. I'm very open to corrections or further info. It's also quite simplified to keep the length down and the interest up.

 

The FAT32 file system is essentially two parts, the root directory and the FAT itself. The FAT is a contiguous area constructed of 32-bit entries representing every cluster on the disk. The root directory contains entries for files and subdirectories.

 

A directory entry for a file holds the file name, length and starting cluster number. The FAT entry for this starting cluster holds the next cluster number, that entry holds the next cluster, and so on: the cluster entries are thus chained together until an end-of-chain (EOF) marker is reached. This is how the file's data is retrieved.
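To make the chain-following idea concrete, here is a minimal Python sketch. It is only an illustration: the FAT is modelled as a plain list (index = cluster number, value = next cluster), the function name `follow_chain` is made up for this example, and the tiny table is invented. The one real detail relied on is that FAT32 uses only the low 28 bits of each entry and treats values of 0x0FFFFFF8 and above as end-of-chain.

```python
EOC = 0x0FFFFFF8  # FAT32 end-of-chain markers are this value or higher

def follow_chain(fat, start_cluster):
    """Walk a (hypothetical, in-memory) FAT from start_cluster to end-of-chain."""
    chain = []
    seen = set()
    cluster = start_cluster
    while cluster not in seen:
        seen.add(cluster)
        chain.append(cluster)
        nxt = fat[cluster] & 0x0FFFFFFF  # top 4 bits of a FAT32 entry are reserved
        if nxt >= EOC:
            return chain                 # reached the end-of-chain marker
        if nxt < 2 or nxt >= len(fat):
            # 0 means "free": this is what a deleted file's entries look like
            raise ValueError(f"chain broken at cluster {cluster}")
        cluster = nxt
    raise ValueError("loop detected in cluster chain")

# Invented example: a file stored in clusters 3 -> 4 -> 7
fat = [0] * 12
fat[3], fat[4], fat[7] = 4, 7, 0x0FFFFFFF
print(follow_chain(fat, 3))
```

If the entries for clusters 3, 4 and 7 are zeroed, as happens on deletion, the same call raises an error at the very first cluster, which is why a recovery tool has to guess instead of following the chain.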

 

On file deletion the file name in the directory is flagged as deleted but the other data is preserved. However the cluster entries in the FAT are set to zero.

 

When attempting a deleted file recovery the first cluster in the FAT can be found from the directory entry. The following clusters in the FAT can be assumed to belong to the file continuing to the full file size. However if the cluster allocation was fragmented this assumption is incorrect as clusters from another file will be found before the full size of the file is recovered. The recovery program can skip these clusters and continue from the next zeroed cluster entry, but this entry and subsequent entries may not belong to the file that's being recovered.
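The guessing strategy described above can be sketched roughly as follows. This is not how Recuva works internally (I don't know that); it is just the "take free clusters from the start cluster until the file size is covered" assumption written out in Python. The function name, the list-based FAT and the sample numbers are all invented for illustration.

```python
def guess_clusters(fat, start_cluster, file_size, cluster_size, skip_allocated=True):
    """Guess which clusters a deleted file occupied.

    fat is a hypothetical in-memory table where 0 means "free".
    Free clusters are collected from start_cluster onwards until enough
    have been found to cover file_size.
    """
    needed = -(-file_size // cluster_size)  # ceiling division
    guessed = []
    cluster = start_cluster
    while len(guessed) < needed and cluster < len(fat):
        if fat[cluster] == 0:
            guessed.append(cluster)         # free, so assume it was ours
        elif not skip_allocated:
            break                           # play safe: stop at a live file
        cluster += 1
    return guessed

# Invented example: a live file occupies clusters 5-6, and the deleted
# file started at cluster 3 and needed four 4096-byte clusters.
fat = [0] * 12
fat[5], fat[6] = 6, 0x0FFFFFFF
print(guess_clusters(fat, 3, 4 * 4096 - 100, 4096))
print(guess_clusters(fat, 3, 4 * 4096 - 100, 4096, skip_allocated=False))
```

With skipping enabled the guess jumps over the live file and picks up clusters 7 and 8, which may or may not have belonged to the deleted file. With skipping disabled the guess stops short and the file cannot be fully recovered.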

 

As you can see, the further you go, the greater the possibility of recovery errors. Perhaps Recuva plays safe when it hits a fragmented file and reports the file as overwritten by the file it ran into.

 

This would explain why a new file is 'overwritten' by an old file: any other comments or theories are welcome.

 

A bit later...... I created a few files on a FAT32 flash drive, all around 10 MB and all in one extent. I shift/deleted one file which was allocated 2708 clusters at offset 164018. A Recuva scan showed that the deleted file now had 37 clusters at offset 32946, and was overwritten by a live file. (A repeat exercise on one of the other files gave the same result.)

 

I then checked the overwriting file. This had 169 clusters at offset 32814, ending at cluster 32983. The overwritten file had 37 clusters at offset 32946, also ending at cluster 32983.

 

So it appears that the start cluster of the deleted file is somehow being interpreted as 32946, which is nowhere even close to where it originally was, and is ending when it hits the eof marker for the overwriting file.

 

Everything I've assumed now seems nonsense. The more I see of FAT the more baffled I am. Or is Recuva giving me the runaround?

 

I really don't want to open a hex editor and look at the directory. I think I'm going to lie down now.


  • Moderators

Well, here I am, sitting in front of a hex editor, with reams of scribbled-on paper all around. Who knows what all those numbers mean.

 

Right, if you thought NTFS was complex, just look at the simple FAT.

 

As posted above, the directory entry for a file contains the starting cluster number. In this way we can go to the first cluster entry in the FAT and follow the cluster chain to retrieve all the data. The field holding the first cluster number is two bytes. This is fine in FAT8, 12 and 16. But FAT32 has cluster numbers which won't fit into two bytes, so a second field is used. In each directory entry offset 0x1A holds the two low order bytes, and offset 0x14 holds the high order bytes. (These fields are also little-endian, so the conversion and conjunction of these fields is brain-numbing.)
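For anyone who would rather not do the little-endian conversion by hand, here is a short Python sketch of reading the two fields and joining them. It assumes the standard 32-byte short directory entry; the function name `first_cluster` and the sample entry (zero-filled apart from the two cluster fields) are made up for the example.

```python
import struct

def first_cluster(entry):
    """Combine the high word (offset 0x14) and low word (offset 0x1A)
    of a 32-byte FAT32 directory entry into the first cluster number."""
    if len(entry) != 32:
        raise ValueError("a short directory entry is 32 bytes")
    (high,) = struct.unpack_from("<H", entry, 0x14)  # little-endian 16-bit
    (low,) = struct.unpack_from("<H", entry, 0x1A)   # little-endian 16-bit
    return (high << 16) | low

# Made-up entry: everything zero except the two cluster fields.
entry = bytearray(32)
struct.pack_into("<H", entry, 0x14, 0x0002)  # stored on disk as bytes 02 00
struct.pack_into("<H", entry, 0x1A, 0x6B8A)  # stored on disk as bytes 8A 6B
print(hex(first_cluster(bytes(entry))))
```

In a hex editor the low word appears as `8A 6B` and the high word as `02 00`, which is what makes reading it by eye so tedious.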

 

In my tests I created and shift/deleted a 10 MB text file. What I found was that on deletion the two bytes at offset 0x14 were set to zero, so a cluster address which was 0x00026B8A was truncated to 0x00006B8A. So when Recuva (or any software) follows the first cluster address it goes to a completely different address, and in my case right into the middle of a live file. Recuva then follows the cluster chain of the live file to the EOF marker. So instead of Recuva showing 2708 clusters at offset 158602 (as it did when the file was live) it now shows 79 clusters at offset 27530, making recovery impossible.
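The numbers above can be checked with a few lines of Python. The helper name `zero_high_word` is just for illustration; it mimics what I saw happen to the directory entry on deletion.

```python
def zero_high_word(cluster):
    """Mimic the observed effect of deletion: the high 16 bits
    (the field at offset 0x14) are lost, the low 16 bits survive."""
    return cluster & 0xFFFF

original = 0x00026B8A
truncated = zero_high_word(original)

print(original, truncated)    # decimal offsets before and after deletion
print(original - truncated)   # how far off the truncated address is
print(zero_high_word(0x1234) == 0x1234)  # addresses below 0x10000 are unaffected
```

The first line gives 158602 and 27530, matching the offsets Recuva showed before and after deletion. The difference is 0x20000 (131072) clusters, so the scan lands a long way from the real data.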

 

I assume that deleted files with a cluster address that originally was held in two bytes only will retain the correct address and be recoverable, but that would be very few files, or a small flash drive with a large cluster size.

 

This zeroing of the cluster address field is something I haven't seen discussed anywhere, otherwise I wouldn't be scratching my head here. The field at 0x14 is, or was, used for many other purposes, so maybe this is why it is done. Who knows?

 

The implication is that just about any deleted file on a FAT32 drive cannot be recovered, unless it is allocated at the very start of the drive and has a two-byte cluster address. Hmmm.

 

Jeez, I could get paid for doing this.


Wow!  I wonder if that means that the one file that was (initially) shown as recoverable (one of the largest, 770MB) is the first one I wrote to the flash drive?  I can't tell now because I've reformatted that flash drive to NTFS, but I do believe it was one of the older files.

 

Later, when I have more time and collect more datapoints (I saw some patterns in early experimentation), I'll write about the problems I am having with recovery in NTFS.  But a quick preview of one of the things I'm running into most is that the recovered just-deleted (from the recycle bin) file is often quickly found and marked in excellent condition -- but it is in fact reduced in size (usually to 544 bytes, at least for an mp4), and needless to say is corrupt and unusable. (For certain file types, notably pdf, I'm having much more success.)   And I'm only having the problem on my OS partition, not any of the other four partitions on the hdd.  If you or anyone else has any initial thoughts on this now (before I experiment further and post more detail, as a new thread), any input is of course welcome, but I'm pretty tied up with other things for the next week or two.


  • Moderators

When a file is sent to the recycler it is renamed to $Rnnnnn.doc, or whatever the file extension was. Another file is created called $Innnnn.doc containing the original file and folder name. The $I file is 544 bytes long, so this is what you are seeing. The $R file is the file itself.
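If you want to see what is inside one of those 544-byte $I files, the following Python sketch reads one. The layout used here is an assumption based on how the Vista/Windows 7 format is commonly described (8-byte header, 8-byte original size, 8-byte deletion time as a FILETIME, then 520 bytes of UTF-16 path); I have not checked it against official documentation, and later Windows versions are reported to use a different, variable-length layout. The function name and the sample data are made up.

```python
import struct
from datetime import datetime, timedelta, timezone

def parse_i_file(data):
    """Parse a 544-byte $I file (assumed Vista/Windows 7 layout)."""
    if len(data) != 544:
        raise ValueError("expected a 544-byte $I file")
    # Assumed layout: header, original file size, deletion time (FILETIME)
    header, size, filetime = struct.unpack_from("<QQQ", data, 0)
    # Assumed: the remaining 520 bytes hold the original path, UTF-16LE, NUL padded
    path = data[24:544].decode("utf-16-le").split("\x00", 1)[0]
    # FILETIME counts 100-nanosecond intervals since 1601-01-01 UTC
    deleted = datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(
        microseconds=filetime // 10
    )
    return {"header": header, "size": size, "deleted": deleted, "path": path}

# Made-up sample: a 56 MB mp4 with a dummy deletion time of zero.
name = "C:\\Videos\\original_filename.mp4".encode("utf-16-le")
sample = struct.pack("<QQQ", 1, 56 * 1024 * 1024, 0) + name.ljust(520, b"\x00")
info = parse_i_file(sample)
print(info["path"], info["size"])
```

To try it on a real file, read the bytes with `open(path, "rb").read()` and pass them in. The size field should then show the size of the original file, not 544.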


Thanks.  At least for recycle bin scans, I seem to get just one or the other for a deleted file, either the $Innnnn or the $Rnnnnn, but not both.

 

But moments ago a deep recycle bin scan turned up empty for all my partitions, except for two files (a text file and an mp4) I emptied from the bin a few minutes earlier.  Is this a normal occurrence?  Could that result be caused by the scheduled (weekly) defrag Windows 7 ran last night?  If so, is it best to make defrag less frequent?  (Fyi, after a few more minutes of experimentation -- and maybe Windows or other programs doing some writing behind the scenes -- only one of the two files is still showing, an mp4 which showed as original_filename.mp4, 56MB on the first scan, and now as $IY2KA78.mp4, 544 bytes as expected with the $I.)
