[Tutorial] Manually recovering files from a FATX file system

burninrubber0

This tutorial primarily focuses on recovering fragmented files or other files that FATX-Tools can't properly handle. It is not aimed at beginners. We'll need FATX-Tools, a hex editor (010 Editor is used here), and an HDD image to work on.
Although FATX-Tools may be able to handle fragments in the future, there are still cases where manual recovery may be necessary, so this tutorial should remain relevant.
Note that I'm neither the most experienced nor most knowledgeable when it comes to file systems, so if you see anything that seems incorrect, please say something.

For this tutorial, we'll be using a public HDD image from a Warface devkit as an example. (Warning: Expands to 120 GB. Use HDDRawCopy.)

FATX Basics
Before manual recovery, you'll need to know some key things about the FATX file system. FATX is very similar to FAT32, so there's lots of documentation crossover, but this section will be limited to only the relevant parts.

Partition header
The partition header provides critical information about the FATX partition: its ID, sectors per cluster, and root cluster. For this tutorial, only the sectors per cluster value is relevant.


Here, the sectors per cluster value is 0x20. Because the sector size (defined in the partition table on devkits or in the kernel on retail) is 0x200, this means the size of each cluster is 0x4000 (16 KB).
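The arithmetic above can be checked with a couple of lines of Python. This is only a sketch of the calculation; the variable names are mine and don't come from FATX-Tools.

```python
SECTOR_SIZE = 0x200          # from the partition table (devkit) or the kernel (retail)
SECTORS_PER_CLUSTER = 0x20   # value read from the partition header in this example

cluster_size = SECTOR_SIZE * SECTORS_PER_CLUSTER
print(hex(cluster_size), cluster_size // 1024, "KB")  # 0x4000 16 KB
```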

Cluster map
The cluster map, or cluster chain map, is what the file system actually reads in normal use. It effectively acts as a table of contents, telling the file system the location of each individual cluster a file uses. This is needed because of fragmentation, where a file isn't stored in one contiguous block and instead gets split up across the file system.


Here we see an example of fragmentation, jumping from cluster 0x4D to 0x9A. Without the cluster map, this wouldn't be possible.
Unfortunately, the cluster map is often destroyed. This is because formatting the drive works by wiping the cluster map rather than by the slower process of destroying directory entries or data. There seems to be some data destruction (see "Skat", "Ploo"), but it's usually minimal.
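To make the idea of a cluster chain concrete, here is a minimal sketch of walking one. It assumes each map entry holds the number of the next cluster and that 0xFFFFFFFF marks the end of a file; the toy map and its starting cluster 0x4C are made up for illustration, and only the 0x4D to 0x9A jump mirrors the example above.

```python
END_OF_CHAIN = 0xFFFFFFFF  # assumed end-of-file marker for 4-byte map entries

def follow_chain(cluster_map, first_cluster, limit=1_000_000):
    """Return the clusters a file uses, in order, starting at first_cluster."""
    chain = [first_cluster]
    while len(chain) < limit:          # limit guards against a looping, damaged map
        nxt = cluster_map[chain[-1]]
        if nxt == END_OF_CHAIN:
            break
        chain.append(nxt)
    return chain

# Hypothetical toy map: 0x4C -> 0x4D -> 0x9A -> end of file.
toy_map = {0x4C: 0x4D, 0x4D: 0x9A, 0x9A: END_OF_CHAIN}
print([hex(c) for c in follow_chain(toy_map, 0x4C)])  # ['0x4c', '0x4d', '0x9a']
```

Once the map is wiped, every entry looks free, so there is nothing left to walk. That is why the rest of this tutorial works from directory entries and guesswork.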

Directory entry
Directory entries are what the file system uses to store information on individual files and directories: their attributes, name, type, cluster, size, and timestamps. For us, these are the most important parts of the file system: files can be properly recovered using them, but without them, all metadata is lost and differentiating between files can be difficult or impossible.


The first entry reads: Name is 6 bytes long; is a directory; has name DEVKIT; first cluster is 2; and was created, modified and accessed 2012-05-24 15:39:18.
Since FATX-Tools recovers this metadata properly, we only need to read the cluster number (offset 0x2C in each entry) for manual recovery.
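For reference, reading those fields out of a raw entry might look like the sketch below. Only the cluster offset (0x2C) comes from this tutorial. The 0x40-byte entry size, the size field at 0x30, big-endian byte order, 0xFF/0x00 name padding and the 0x10 directory attribute are my assumptions about the layout, so verify them against your own image.

```python
import struct

ENTRY_SIZE = 0x40      # assumed size of one directory entry
ATTR_DIRECTORY = 0x10  # assumed directory flag, as in FAT32

def parse_entry(raw):
    """Decode the fields needed for manual recovery from one directory entry."""
    name_len = raw[0]
    deleted = name_len == 0xE5                     # deleted entries carry 0xE5 here
    usable = 42 if deleted else min(name_len, 42)  # the name field spans 0x02..0x2B
    name = raw[2:2 + usable].split(b"\xff")[0].split(b"\x00")[0]
    first_cluster, size = struct.unpack_from(">II", raw, 0x2C)
    return {
        "name": name.decode("ascii", "replace"),
        "deleted": deleted,
        "is_directory": bool(raw[1] & ATTR_DIRECTORY),
        "first_cluster": first_cluster,
        "size": size,
    }

# Hand-built sample resembling the DEVKIT entry above (timestamps zeroed as placeholders).
sample = bytes([6, ATTR_DIRECTORY]) + b"DEVKIT".ljust(42, b"\xff") + struct.pack(">II", 2, 0) + bytes(12)
entry = parse_entry(sample)
print(entry["name"], entry["first_cluster"], entry["is_directory"])  # DEVKIT 2 True
```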

Refer to FAT and Design of the FAT system for more in-depth reading. The Free60Project wiki page is also a good resource, albeit unclear in certain places.

Recovery using FATX-Tools
FATX-Tools is needed to find out which files require manual recovery. Run main_gui.py and use the File>Open command to open the image, then right-click Partition1. Note the offset and length; they'll be needed later.

Orphan analysis
The Perform Orphan Analysis option detects files via the file system using directory entries. If all goes well, a tab named Analysis results will appear after the analysis completes. Click it and a cluster list should appear. Note the cluster numbers; we'll need them later in order to find the files. Use the recover all option in the right-click menu.


The main drawback to this option is it can't handle fragments properly due to the lack of a cluster map. This means any files recovered will be recovered as if they're a single block, causing corrupt recoveries in some cases. Even files that would recover correctly in a normal recovery could be corrupt when using this option. Though there are ways of automatically detecting fragmentation, none are so perfect as to resolve the issue entirely.
If the Recover File System option recovers the same files, use that instead. The files are less likely to be corrupt than files recovered with an orphan analysis.

Finding fragments
After recovering all files via an orphan analysis, they should all be in a folder together. Open 010 Editor and use Search>Find in Files, then select the directory of the recovered files and search for the following hex bytes:
Code:
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
This is not a catch-all, and turns up lots of false positives, but it works for finding directory entries incorrectly recovered with a file. In this case, we find data instead.
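If you don't have 010 Editor, the same search can be roughed out in Python. This is only a sketch: the function names are mine, and it reads each file fully into memory, which is fine for typical recovered files but not for the image itself.

```python
from pathlib import Path

def find_pattern(data, pattern):
    """Return every offset where pattern starts in data (overlapping hits skipped)."""
    hits, pos = [], data.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = data.find(pattern, pos + len(pattern))
    return hits

def scan_directory(root, pattern):
    """Map each file under root to the offsets where pattern occurs in it."""
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            hits = find_pattern(path.read_bytes(), pattern)
            if hits:
                results[str(path)] = hits
    return results

# Usage (hypothetical folder name; paste the hex string from the code box above):
# for name, offsets in scan_directory("recovered", bytes.fromhex("FFFF...")).items():
#     print(name, [hex(o) for o in offsets[:5]])
```

As with the 010 Editor search, expect false positives; the offsets are only starting points for a manual look.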


Note the names of detected files. We'll use Cluster1/DEVKIT/Warface/CryAISystem.dll as an example.

Finding the file in the HDD image
Open the HDD image in a hex editor. From here, we need to get the offset of the data. This is where the partition offset and length we noted come in. You can usually find the data offset by calculating partitionSize * mapEntrySize / clusterSize + 0x1000 + partitionOffset and rounding the result up to the next multiple of 0x1000. In this case, that's 0x183DAC6000 * 4 / 0x4000 + 0x1000 + 0x2B0EB0000, which works out to 0x2B26EEAC6, or 0x2B26EF000 after alignment.
Now that we have the data offset, it's time to find the directory entry for the file. Use the cluster number of the file we noted and do dataOffset + (clusterNumber * clusterSize) - clusterSize. For CryAISystem.dll, this is 0x2B26EF000 + (0xE * 0x4000) - 0x4000, which comes out to 0x2B2723000.
Now we need to look for the directory entry for CryAISystem.dll and note the cluster.
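The two formulas above are easy to get wrong by hand, so here is a small sketch that reproduces them with this image's numbers. The function names are mine; the constants are the ones quoted in the text.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def data_offset(partition_offset, partition_size, cluster_size, map_entry_size=4):
    """partitionSize * mapEntrySize / clusterSize + 0x1000 + partitionOffset, aligned to 0x1000."""
    map_size = partition_size * map_entry_size // cluster_size
    return align_up(map_size + 0x1000 + partition_offset, 0x1000)

def cluster_offset(data_start, cluster, cluster_size):
    """dataOffset + (clusterNumber * clusterSize) - clusterSize."""
    return data_start + cluster * cluster_size - cluster_size

start = data_offset(0x2B0EB0000, 0x183DAC6000, 0x4000)
print(hex(start))                               # 0x2b26ef000
print(hex(cluster_offset(start, 0xE, 0x4000)))  # 0x2b2723000
```

The same cluster_offset call works for any cluster number, whether it points at a directory or at a file's first cluster.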


The cluster number for this file is 0x111D and the file size is 0x38A000. Use the formula we used to find the directory entry earlier to find the file itself. You should end up at 0x2B6B5F000.


Yep, that's what we want to see - a XEX2 header is what you'd expect to find at the start of a DLL.
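If you'd rather not eyeball it, a magic-number check is a one-liner. The sketch below runs against an in-memory stand-in; with a real image you would open the file in binary mode and pass the offset found above. The helper name is mine.

```python
import io

def starts_with(stream, offset, magic):
    """Return True if the bytes at offset in stream equal magic."""
    stream.seek(offset)
    return stream.read(len(magic)) == magic

# Stand-in for the image: 0x10 bytes of padding, then an XEX2 header.
fake_image = io.BytesIO(b"\x00" * 0x10 + b"XEX2" + b"\x00" * 0x10)
print(starts_with(fake_image, 0x10, b"XEX2"))  # True

# Usage on a real image (hypothetical file name):
# with open("image.bin", "rb") as f:
#     print(starts_with(f, 0x2B6B5F000, b"XEX2"))
```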

Finding the fragmentation
Open the file the orphan analysis recovered in a hex editor and go to the first offset we noted in 010 Editor, 0x60AE4.


We can't tell from this whether it belongs or not, but because fragments are made of whole clusters, they should always be aligned to 0x4000. Be careful, though - that doesn't necessarily mean they will be. It's a guessing game, so check every 0x1000 to see if there's anything that looks like a start or end.
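Checking every 0x1000 by hand gets tedious, so a small helper that shows the bytes on either side of each boundary can speed up the eyeballing. This is just a convenience sketch with names of my own choosing; it doesn't detect anything by itself.

```python
def boundary_views(data, start, end, step=0x1000, width=16):
    """Yield (offset, bytes_before, bytes_after) for each step-aligned offset in [start, end]."""
    first = (start + step - 1) // step * step
    for off in range(first, min(end, len(data)) + 1, step):
        yield off, data[max(0, off - width):off], data[off:off + width]

# Usage on a recovered file (hypothetical path), looking around the first search hit:
# data = open("CryAISystem.dll", "rb").read()
# for off, before, after in boundary_views(data, 0x5C000, 0x68000):
#     print(hex(off), before.hex(), "|", after.hex())
```

A sudden change in what the data looks like across a boundary is the kind of "start or end" worth investigating further.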


There's a clear distinction between things at 0x64000. Let's check 0x4000 bytes before it to see if this is indeed the cluster end.


The PART header confirms this is the beginning of the cluster and that the foreign fragment is one cluster long. Note that the data could be anything and won't necessarily be a PART header.

Repairing fragments
Defragmenting the file manually
Go back to the image, select the first 0x60000 of the DLL and copy it into a new file. Then skip 0x4000 and copy the remainder over. The remainder size is fileSize - preFragmentSize, which here is 0x38A000 - 0x60000 = 0x32A000. For this file, that's all that's needed for repair. If there are multiple fragments, it's the same process multiple times. It's best to start from the last fragment and work your way backwards so the offsets don't change.
To check that it was recovered correctly, you can compare it to the file recovered via the Recover File System option. They should match.
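The copy-and-skip step can also be scripted. The sketch below assumes, as in this example, that the file simply continues right after each run of foreign clusters; the function name and the scaled-down demo data are mine.

```python
import io

def copy_skipping(src, start, file_size, foreign_ranges):
    """Read file_size bytes from src starting at start, skipping foreign runs.

    foreign_ranges holds (offset_from_start, length) pairs describing clusters
    on disk that belong to some other file.
    """
    out = bytearray()
    pos = start
    for rel_off, length in sorted(foreign_ranges):
        src.seek(pos)
        out += src.read(start + rel_off - pos)   # our data up to the foreign run
        pos = start + rel_off + length           # jump past the foreign clusters
    src.seek(pos)
    out += src.read(file_size - len(out))        # remainder: fileSize - preFragmentSize
    return bytes(out)

# Scaled-down demo: 4-byte "clusters", the third one belongs to another file.
disk = io.BytesIO(b"AAAA" b"BBBB" b"XXXX" b"CCCC" b"ZZZZ")
print(copy_skipping(disk, 0, 12, [(8, 4)]))  # b'AAAABBBBCCCC'
```

For CryAISystem.dll the call would use start 0x2B6B5F000, file size 0x38A000 and a single foreign range of (0x60000, 0x4000). Because the ranges are given relative to the on-disk start, there's no need to work backwards as there is when editing by hand.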


Because this image has a cluster map and the file is part of it, we can see the fragment in action. Where cluster 0x1135 would normally be, 0x1136 is, and where 0x1136 would be, there's an end-of-file marker for TK_1PG26RD_063F83G10QB7R.03JCR6DR7FGOG, the file using cluster 0x1135. The first cluster isn't listed because it's defined in the directory entry.
If you have to recover files with an orphan analysis, viewing the cluster map isn't an option, but for this tutorial it demonstrates what a fragment looks like.

Defragmenting the file by modifying the file system
This is more challenging, but ultimately the "correct" way of doing things. Because CryAISystem.dll already has a cluster map, I'll use the only other image available to me, a Burnout 5 one, to demonstrate how this works.


FATX-Tools doesn't see anything. Let's check the cluster map.


This cluster map has been wiped. We need to rebuild it so it finds the files from the old filesystem. To do this, we need to do two things: re-add the directory entries to the DEVKIT directory, and re-add the used clusters to the cluster map. We'll skip the orphan analysis etc as that was covered earlier.


The current DEVKIT directory has completely overwritten the old one, meaning there are no folder names left from the old FS. We'll have to make some up when creating the entries.


Directory entry written. Now we need to find which clusters its files use and add them to the cluster map.


There are the file entries. At this point, you'd normally be able to see them in FATX-Tools, but there are two issues: the first 192 entries have been overwritten, so we'll have to move the remaining ones to the beginning of the cluster; and the clusters are off by 5 for some reason, so we'll have to add 5 to each. These issues seem rare, so I won't cover them in-depth.



The files may be displaying, but the recovery won't be attempted if they're marked as deleted. Changing 0xE5 to the file name length resolves this.
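Patching the deleted marker by hand is fine for a few entries, but a sketch like this can do it in bulk. It assumes the name field runs from 0x02 to 0x2B and is padded with 0xFF or 0x00 bytes, which is my reading of the layout rather than something confirmed here; the size and cluster in the sample entry are made-up values.

```python
def undelete_entry(raw):
    """Return a copy of a directory entry with 0xE5 replaced by the real name length."""
    entry = bytearray(raw)
    if entry[0] != 0xE5:
        return bytes(entry)                      # not marked deleted, leave as-is
    name = bytes(entry[2:0x2C]).split(b"\xff")[0].split(b"\x00")[0]
    entry[0] = len(name)
    return bytes(entry)

# Hand-built deleted entry for illustration (fields after the name are placeholders).
deleted = bytes([0xE5, 0x00]) + b"TRK_UNIT134.BNDL".ljust(42, b"\xff") + bytes(20)
fixed = undelete_entry(deleted)
print(fixed[0], fixed[2:2 + fixed[0]])  # 16 b'TRK_UNIT134.BNDL'
```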
Finally, we need to add their entries in the cluster map so FATX-Tools can recover them.


Shown are the first two files in the cluster map, where the second file, TRK_UNIT134.BNDL, has a fragment at 0x98000. Recovery should now be feasible.
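Writing the chains by hand in a hex editor works, but here is a rough sketch of the same thing in code. It assumes 4-byte big-endian map entries, that the entry for cluster N sits at mapOffset + N * 4, and that 0xFFFFFFFF is the end-of-file marker. All three are assumptions to confirm against an intact image before touching a real one, and the demo uses an in-memory buffer rather than a real image.

```python
import io
import struct

END_OF_CHAIN = 0xFFFFFFFF  # assumed end-of-file marker

def chain_entries(clusters):
    """Return {cluster: value} linking each cluster of a file to the next one."""
    links = {cur: nxt for cur, nxt in zip(clusters, clusters[1:])}
    links[clusters[-1]] = END_OF_CHAIN
    return links

def write_chain(stream, map_offset, clusters, entry_size=4):
    """Write one file's chain into the cluster map starting at map_offset."""
    for cluster, value in chain_entries(clusters).items():
        stream.seek(map_offset + cluster * entry_size)
        stream.write(struct.pack(">I", value))

# Demo on a blank 0x40-byte "map": a file using clusters 2, 3 and 5 (fragmented).
fake_map = io.BytesIO(bytes(0x40))
write_chain(fake_map, 0, [2, 3, 5])
print(fake_map.getvalue()[8:24].hex())  # 0000000300000005 00000000 ffffffff, without the spaces
```

The first cluster never appears as a value in the map because it's stored in the directory entry; only its "next" link is written.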


And there they are. Checking the files reveals no fragmentation.

Corrupt files
True corruption happens for only one reason: overwriting. The reason a file is overwritten varies, but the most common case is it was deleted and written over as new data was written to the drive. There are, however, two instances where a partial recovery may be possible.

Not all clusters got overwritten
This is fairly self-explanatory. New data is only written if it needs to be and won't necessarily destroy the entire old file. If you know what kind of data it is, it's possible some of the file can be recovered and potentially repaired. Recovery relies entirely on your own knowledge of the data.

The file was fragmented
New data might be written as a single block, but there's no guarantee the old file was one too. When the old file was fragmented, even if FATX-Tools recovers only garbage, there could be a fragment somewhere else on the drive. If you know some of the data, it could be possible to search for and recover the truncated data. I recovered a critical file like this recently, so it can happen, albeit rarely.
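Searching a 120 GB image for a known byte string is slow in most hex editors, so a chunked search can help. This is a minimal sketch with names of my own choosing; it keeps a small overlap between chunks so a match spanning two chunks isn't missed.

```python
import io

def find_in_image(stream, needle, chunk_size=1 << 20):
    """Yield the absolute offset of every occurrence of needle in stream."""
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1
    base, buf = 0, b""                 # base is the absolute offset of buf[0]
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        pos = buf.find(needle)
        while pos != -1:
            yield base + pos
            pos = buf.find(needle, pos + 1)
        keep = buf[-overlap:] if overlap else b""
        base += len(buf) - len(keep)
        buf = keep

demo = io.BytesIO(b"abcXYZdefXYZ")
print(list(find_in_image(demo, b"XYZ", chunk_size=4)))  # [3, 9]
```

On a real image, open the file in binary mode and pass it in place of the demo stream. Hits that land on a cluster boundary relative to the data offset are the most promising candidates for the start of a fragment.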

Conclusion
Ultimately, the process for properly recovering fragmented or corrupt files is complicated at best and completely out of reach at worst. Finding fragments is a guessing game more than anything and the best way of recovering files is extremely time-consuming and can't be fully automated. As a result, this should really only be done when files crucial to running a build are recovered corrupt and there's no other choice.

Personally, I feel the majority of users, even experienced ones, won't do this even when they can, due to the complexity and "guessing" nature of this sort of recovery. I hope aerosoul, or anyone really, can automate something for this, but I don't foresee it happening for a number of reasons.

Finally, my New Year's resolution is to develop an enormous hatred of disk fragments. Have a great 2020!
 