
Examining Data Runs of a Fragmented File in NTFS

Carlos Cajigas MSc, EnCE, CFCE, CDFE, A+

While examining an acquired image of a flash drive in a recent case, I came

across the need to manually recover a fragmented file from an NTFS formatted volume.

I needed to manually perform this process for two reasons. First, I needed to validate

my software and be confident that it was in fact producing correct results. Second, I

wanted to manually replicate the process so that I could develop a deeper

understanding of how a fragmented file is tracked by the Master File Table.

The goal

The plan is to recreate the steps that will lead to a file becoming fragmented in

an NTFS volume. Once we have successfully written a fragmented file in our test

media, we will look at its MFT record to examine the data runs contained in the data

attribute.

The test

To conduct our test, we will be using a 256MB flash drive. Since we are going to

be adding data to this media and then examining it with a hex viewer, the first thing that

we need to do to prepare this media is sterilize it. Sterilizing a drive is the process of

writing a known hex value to every sector of a piece of media so that it can overwrite

any and all data that previously resided on that piece of media. For the purposes of this

article, I used Active Kill Disk, which is a lightweight, powerful, and free utility. While the media

was being sterilized, I proceeded to the next step.
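As an aside, the sterilization step can also be scripted. Below is a minimal sketch in Python, not the method used in this article, assuming a raw device path such as \\.\PhysicalDrive1 on Windows and administrative privileges; it is destructive, so the target device must be chosen with care.

    # Minimal sketch: overwrite every sector of a device with the known value 0x00.
    # WARNING: destructive. The device path below is an assumption/placeholder.
    SECTOR_SIZE = 512
    CHUNK = SECTOR_SIZE * 2048            # write 1 MB of zeros at a time
    zeros = b"\x00" * CHUNK
    with open(r"\\.\PhysicalDrive1", "r+b", buffering=0) as device:
        while True:
            try:
                device.write(zeros)       # fails with OSError at the end of the device
            except OSError:
                break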

I navigated to the Desktop of my Windows 7 computer and created a folder

named “Test”.


Inside this folder, I created three txt files. These are the files that we will be copying to our test media. The files are named TEST1.txt, TEST2.txt, and TEST3.txt. Each of these files contains 1000 bytes of data. TEST1.txt has 1000 number ones (1). Yes, one thousand of them, one after the other. TEST2.txt has 1000 number twos (2), and TEST3.txt has 1000 number threes (3).
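For readers who want to reproduce the setup without typing a thousand digits by hand, here is a minimal sketch that generates the same three files (the original files in this article were created manually):

    # Create TEST1.txt, TEST2.txt, and TEST3.txt, each containing 1000 bytes:
    # 1000 ones, 1000 twos, and 1000 threes respectively.
    for digit in ("1", "2", "3"):
        with open(f"TEST{digit}.txt", "w") as f:
            f.write(digit * 1000)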


We will copy these files into the media in a specific order. The numbers in the

files will aid us in identifying the files when looking at the media through the hex viewer.

Test

Now that the media is sterilized, let’s format it. I pulled the media from the

computer and inserted it back into a USB port. Within a second or two, Windows 7 asked

me to format the media.

Since this is a test to be conducted on the NTFS file system, I formatted the drive

to NTFS. I chose an allocation unit size (cluster size) of 512 bytes, so that the bytes per

sector and cluster size would be the same, 512 bytes.


My computer successfully formatted the drive without errors. The operating

system assigned it the logical drive letter G. I right-clicked on the media and looked at the

properties.


Now that the drive is formatted, we will begin to write data to the media.

Copy the TEST1.txt file from the Test folder and paste it onto the media. Next, copy and

paste the TEST2.txt file onto the media.


At this point there should only be two files on the test media. Here is where it

gets interesting. In order to fragment TEST1.txt, we are going to add another 1000 ones

to the file. Adding another 1000 bytes of data into the file will double its size from 1000

bytes to 2000 bytes. Open the TEST1.txt file in Notepad and add another 1000 ones

to the file, save it, and close it.

This is what it should look like.
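The same edit can also be made without Notepad. A minimal sketch, assuming the flash drive is still mounted as G::

    # Append another 1000 ones to TEST1.txt, growing it from 1000 to 2000 bytes.
    with open("G:/TEST1.txt", "a") as f:
        f.write("1" * 1000)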

SIDE NOTE: Notice that Windows now reads the file as having 1.95 KB of data. Even though I know that there are exactly 2000 bytes of data in the file, Windows reads 1.95 KB rather than a rounded 2 KB. Actually, Windows is right. The reason Windows reads 1.95 KB is that, as Windows counts it, there are 1024 bytes in a kilobyte (KB). From the 2000 bytes of data in the file, Windows used 1024 bytes to make up 1.0 KB. The remaining 976 bytes get divided by 1024, which is 0.953125. Windows adds 0.95 to the 1.0 KB and displays 1.95 KB of data to us.
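A quick check of that arithmetic:

    # 2000 bytes expressed in 1024-byte kilobytes, the way Windows displays it
    size_bytes = 2000
    print(f"{size_bytes / 1024:.2f} KB")   # prints 1.95 KB (2000 / 1024 = 1.953125)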

Let’s continue with the test. Now, copy the TEST3.txt file from the test folder and

paste it into the test media. Now, go back to the TEST1.txt file and add another 1000

ones to the file. TEST1.txt should now have 3000 bytes of data.

This is what it should look like.


Examination

Our test media is now complete and ready for examination. Let’s look at the

media down at the hex level. For the purposes of this article, I used a demo version of

WinHex 16.3.

After firing up WinHex and opening our test media as a physical device, I learned

that WinHex reported that the TEST1.txt file starts in cluster 288 of the media, which

also happens to be sector 288. The TEST2.txt file starts in cluster 290 of the media,

which also happens to be sector 290. And lastly, the TEST3.txt file starts in cluster 294

of the media, which also happens to be sector 294. Notice that WinHex recognizes that

TEST1.txt is three times larger than TEST2.txt and TEST3.txt.

Let’s go to each one of the clusters and see what we find.


Cluster 288: First cluster of TEST1.txt

Cluster 289: Second cluster of TEST1.txt


Cluster 290: First cluster of TEST2.txt

Cluster 291: Second cluster of TEST2.txt


Cluster 292: Third cluster of TEST1.txt (Fragmented File)

Cluster 293: Fourth cluster of TEST1.txt (Fragmented File, continuation)


Cluster 294: First cluster of TEST3.txt

Cluster 295: Second cluster of TEST3.txt


Cluster 296: Fifth cluster of TEST1.txt (Fragmented File, continuation)

Cluster 297: Sixth cluster of TEST1.txt (Fragmented File, Final)
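This cluster-by-cluster check can also be scripted. A minimal sketch, assuming the volume is still mounted as G: and using the 512-byte cluster size chosen earlier (opening \\.\G: requires administrative privileges):

    # Read the first byte of clusters 288 through 297 directly from the G: volume.
    CLUSTER_SIZE = 512
    with open(r"\\.\G:", "rb") as volume:
        for cluster in range(288, 298):
            volume.seek(cluster * CLUSTER_SIZE)
            print(cluster, volume.read(1))   # prints b'1', b'2', or b'3'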


Results

TEST2.txt and TEST3.txt were each 1000 bytes in length and each occupied two

clusters on the media. TEST1.txt was 3000 bytes and it occupied six clusters on the

media. When we first wrote TEST1.txt onto the media it was only 1000 bytes in length.

At the time that it was first written to the media, it only occupied two clusters. When

TEST2.txt was written to the media, it was written immediately after TEST1.txt. We

then went back to TEST1.txt and added 1000 bytes, which caused the file to double in

length. When the operating system recognized that TEST2.txt was occupying the clusters immediately after TEST1.txt, it had no choice but to write the extra 1000 bytes of data to the next available clusters, which were 292 and 293. This action caused TEST1.txt to become fragmented. A fragmented file is a file whose data is written to the disk in a non-contiguous manner, hence the term. We then added TEST3.txt and again went back to TEST1.txt and added another 1000 bytes of data. Because the data in the TEST1.txt file was written to three different areas of the media, its MFT record should contain three data runs in its data attribute. Let’s look at the record’s data attribute.

Below is TEST1.txt’s MFT record. Highlighted in blue is the record’s data attribute. Notice the attribute type identifier of 0x80, which appears on disk as the little-endian bytes 80 00 00 00.


Here is a closer look.

The attribute does in fact contain three data runs. Here are all of the data runs.

Let’s analyze each one of the runs individually.

First run

The first run has a hex value of 0x21 02 20 01. The right (low) nibble of the header byte 0x21 indicates how many bytes are used to hold the number of contiguous clusters in the run (1 = one byte, and that byte is 0x02). The value of that byte, 0x02, tells us that the file is contiguous for 2 clusters. The left (high) nibble of the header byte 0x21 tells us that the next two bytes in the run (0x20 01) hold the starting cluster of the run. Those bytes are stored in little-endian order, so they are read as 0x0120, which is decimal 288. The run therefore starts at cluster 288 and covers the aforementioned 2 contiguous clusters.
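The nibble arithmetic for this run can be sketched in a few lines of Python (the variable names are mine, not part of any tool):

    run = bytes([0x21, 0x02, 0x20, 0x01])
    header = run[0]
    length_size = header & 0x0F        # low nibble: the length field is 1 byte
    offset_size = header >> 4          # high nibble: the offset field is 2 bytes
    length = int.from_bytes(run[1:1 + length_size], "little")            # 2 clusters
    start = int.from_bytes(run[1 + length_size:1 + length_size + offset_size],
                           "little", signed=True)                        # cluster 288
    print(length, start)               # -> 2 288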

Second run

The second run has a hex value of 0x11 02 04. The right (low) nibble of the header byte 0x11 indicates how many bytes are used to hold the number of contiguous clusters in the run (1 = one byte, and that byte is 0x02). The value of that byte, 0x02, tells us that the file is contiguous for 2 clusters. The left (high) nibble of the header byte 0x11 tells us that the last byte in the run (0x04) holds the offset, in clusters, at which the file’s data continues. In every run after the first, this offset is relative to the starting cluster of the previous run. Here the offset is 0x04, which is decimal 4, so the file’s data continues at offset 4 from cluster 288, which is cluster 292, for the aforementioned 2 contiguous clusters.


Third run

The third run also has a hex value of 0x11 02 04. As before, the right (low) nibble of the header byte 0x11 tells us that one byte holds the run length, and that byte, 0x02, tells us that the file is contiguous for 2 clusters. The left (high) nibble tells us that the last byte in the run (0x04) holds the offset, in clusters, relative to the starting cluster of the previous run. The offset is 0x04, decimal 4, so the file’s data continues at offset 4 from cluster 292, which is cluster 296, for the aforementioned 2 contiguous clusters.

No more runs follow; the run list ends with a terminating 0x00 byte.
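Putting it all together, here is a minimal sketch of a run-list decoder applied to the bytes observed above (the function and its name are mine, not part of any particular tool):

    def decode_runlist(data):
        """Decode an NTFS run list into (starting cluster, cluster count) pairs."""
        runs, pos, current_lcn = [], 0, 0
        while pos < len(data) and data[pos] != 0x00:     # 0x00 terminates the list
            header = data[pos]
            length_size, offset_size = header & 0x0F, header >> 4
            length = int.from_bytes(data[pos + 1:pos + 1 + length_size], "little")
            offset = int.from_bytes(
                data[pos + 1 + length_size:pos + 1 + length_size + offset_size],
                "little", signed=True)                   # relative to the previous run
            current_lcn += offset
            runs.append((current_lcn, length))
            pos += 1 + length_size + offset_size
        return runs

    # TEST1.txt's run list as seen in its data attribute
    runlist = bytes([0x21, 0x02, 0x20, 0x01, 0x11, 0x02, 0x04, 0x11, 0x02, 0x04, 0x00])
    print(decode_runlist(runlist))
    # -> [(288, 2), (292, 2), (296, 2)], i.e. clusters 288-289, 292-293 and 296-297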

Conclusion

The data belonging to TEST1.txt was written to clusters 288, 289, 292, 293, 296

and 297. The file’s data was written to a total of six clusters that were all accounted for

by the file’s Master File Table record.

If this test helped you understand data runs of a fragmented file, and you were

able to use it in the course of your investigation, we would like to hear from you. Please

post your comments or email the author of this article at [email protected].