Saturday, March 7, 2015

Parse the MFT in PowerShell


[This post has been deprecated.  During the development of PowerForensics, I decided to move away from the Sleuth Kit naming convention in favor of the artifacts' true names (ex. Get-IStat has become Get-MFTRecord and Get-ICat has become Get-ContentRaw).  For a better explanation of PowerForensics' capabilities, please follow my "On the Forensics Trail" series.]



Over the past year or so I have been thinking about the best way to implement a Digital Forensic framework from within PowerShell.  Recently, I wrote a PowerShell cmdlet in C# to parse the Windows Prefetch file for useful forensic artifacts, but I quickly realized that accessing files directly is not forensically sound.  

During my search for solutions, I came across clymb3r's blog post about his Invoke-NinjaCopy cmdlet and thought that I could use the same methodology for my forensic framework.  One of the components of Invoke-NinjaCopy is a DLL that parses the NTFS file system, and upon reviewing the DLL project I decided that porting it to a C# assembly library would be a logical starting point for PowerForensics.  Using Brian Carrier's book File System Forensic Analysis for context, I was able to reproduce the NTFS parsing code in C#.

PowerForensics works by opening a read handle to the logical volume (such as the C Drive), and parsing the NTFS structures within the volume's raw bytes.
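To make the idea of "parsing the NTFS structures within the volume's raw bytes" a little more concrete, here is a minimal Python sketch (not PowerForensics code, which is written in C#) that pulls the fields needed to locate the MFT out of an NTFS boot sector.  The function name and the returned dictionary keys are my own for illustration; the field offsets are those of the standard NTFS boot sector layout.

```python
import struct


def parse_ntfs_boot_sector(sector: bytes) -> dict:
    """Extract the fields needed to locate the MFT from an NTFS boot sector."""
    if len(sector) < 512:
        raise ValueError("boot sector must be at least 512 bytes")
    if sector[3:11] != b"NTFS    ":
        raise ValueError("not an NTFS volume (OEM ID mismatch)")

    bytes_per_sector = struct.unpack_from("<H", sector, 0x0B)[0]
    sectors_per_cluster = sector[0x0D]
    mft_start_cluster = struct.unpack_from("<q", sector, 0x30)[0]
    cluster_size = bytes_per_sector * sectors_per_cluster

    # Signed byte: a positive value counts clusters per file record,
    # a negative value n means the record is 2**abs(n) bytes long.
    raw = struct.unpack_from("<b", sector, 0x40)[0]
    record_size = raw * cluster_size if raw > 0 else 1 << -raw

    return {
        "cluster_size": cluster_size,
        "mft_offset": mft_start_cluster * cluster_size,  # byte offset of the MFT
        "record_size": record_size,
    }
```

The 512 bytes would come from the start of the volume, read through an administrator-level handle; the sketch itself only needs a bytes object, so it can be tried against a saved boot sector as well.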

NOTE: PowerShell must be run as administrator to access the Volume's file handle.

The rest of this post describes some of the initial capabilities presented by PowerForensics.

Get-IStat
While writing the code for PowerForensics, I realized that I was starting to reproduce some of the functionality inherent in The Sleuth Kit, so I decided to maintain a similar naming convention to those tools (ex. Get-FSStat, Get-IStat, Get-ICat) for continuity purposes.

Get-IStat is a cmdlet that returns the MFT Entry for a specified file.  The file can be specified by its Path or by its Index Number (which record it occupies in the MFT).  Additionally, the investigator can specify which Logical Volume they want to investigate through the VolumeName parameter (remember that the volume must be using NTFS as its file system).


FileRecord Object

Get-IStat returns a custom FileRecord object.  This object is built into PowerForensics and represents a File Record in the Master File Table.  In the context of this post, the important properties of the FileRecord object are the RecordNumber (the record's index into the MFT) and the Attribute array (the record's attribute objects).
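For readers curious about what sits underneath a FileRecord, the following Python sketch walks a raw MFT record and lists its attributes.  It is only an illustration under stated assumptions: the function and key names are hypothetical, it reads the record number from offset 0x2C (present in NTFS 3.1 and later), and it skips the update sequence (fixup) step that a real parser has to apply before trusting the bytes.

```python
import struct

ATTRIBUTE_NAMES = {0x10: "STANDARD_INFORMATION", 0x30: "FILE_NAME", 0x80: "DATA"}
END_MARKER = 0xFFFFFFFF


def list_attributes(record: bytes) -> dict:
    """Return the record number, in-use flag and attribute list of an MFT record."""
    if record[0:4] != b"FILE":
        raise ValueError("missing FILE signature")

    first_attr_offset = struct.unpack_from("<H", record, 0x14)[0]
    flags = struct.unpack_from("<H", record, 0x16)[0]
    record_number = struct.unpack_from("<I", record, 0x2C)[0]

    attributes = []
    offset = first_attr_offset
    while offset + 8 <= len(record):
        attr_type, attr_length = struct.unpack_from("<II", record, offset)
        if attr_type == END_MARKER or attr_length == 0:
            break  # end of the attribute list (or a corrupt length)
        attributes.append({
            "type": ATTRIBUTE_NAMES.get(attr_type, hex(attr_type)),
            "non_resident": bool(record[offset + 8]),
        })
        offset += attr_length  # each attribute header stores its own length

    return {
        "record_number": record_number,
        "in_use": bool(flags & 0x01),
        "attributes": attributes,
    }
```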


Upon drilling down into the attribute array, we can see the STANDARD_INFORMATION and FILE_NAME attributes (there are more that didn't make the screenshot).  Each attribute contains information that proves invaluable during a forensic investigation.  For example, an investigator can use the timestamps from the STANDARD_INFORMATION and FILE_NAME attributes for timeline analysis, or compare the two sets of timestamps to look for evidence of timestomping.
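The timestamp comparison can be sketched in a few lines of Python.  NTFS stores these values as FILETIMEs (100-nanosecond intervals since January 1, 1601 UTC).  The two checks below are commonly cited heuristics rather than proof of tampering, and the function names are hypothetical, not part of PowerForensics.

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a FILETIME (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def timestomp_indicators(si_created: int, fn_created: int) -> list:
    """Return heuristic reasons why a creation time deserves a closer look."""
    indicators = []
    # Common tools rewrite STANDARD_INFORMATION but leave FILE_NAME untouched.
    if si_created < fn_created:
        indicators.append("SI creation precedes FN creation")
    # Some tools only set whole seconds, leaving the sub-second ticks at zero.
    if si_created % 10_000_000 == 0:
        indicators.append("SI creation has no sub-second precision")
    return indicators
```

An empty list does not mean the timestamps are trustworthy; it only means neither of these two simple checks fired.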


Get-ICat

Great! Now we have the ability to read the MFT by parsing the raw bytes of the volume.  What if we were able to make a copy of the contents of the file via the raw disk?

PowerForensics includes the Get-ICat cmdlet, which parses the DATA attribute in the file's MFT Record and outputs the contents of the file as a byte array.  This byte array can be saved to a variable and passed to the Add-Content cmdlet to write it to a file (NOTE: Add-Content must be used with the Encoding parameter set to "byte").

NOTE: Get-ICat has not yet implemented all of the functionality of TSK's icat command.


Once we copy cmd.exe's raw bytes to a new file, we can use an MD5 hashing function (shown below) to ensure our dumped file is the same as the original.


Below you can see that both C:\Windows\System32\cmd.exe (the original) and C:\Users\Public\Desktop\cmd (our copy using PowerForensics) have the same MD5 hash:
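The same check is easy to reproduce outside of PowerShell.  Here is a small Python sketch using the standard hashlib module; the helper names are mine, and the paths in the usage note are simply the two files discussed above.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(original: str, copy: str) -> bool:
    """True when both files hash to the same MD5 digest."""
    return md5_of_file(original) == md5_of_file(copy)
```

Calling same_content(r"C:\Windows\System32\cmd.exe", r"C:\Users\Public\Desktop\cmd") would then confirm whether the raw copy matches the original.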


What about files that are locked by the Operating System, like the registry hives?  Because PowerForensics accesses the raw bytes on the HDD and parses the Master File Table itself, we are able to export these locked files while the OS is using them.

Below you will see that I am unable to use the System.IO.File ReadAllBytes method to read the SAM registry hive "because it is being used by another process".


Here we use Get-IStat to view the FileRecord object belonging to the SAM hive.  Notice that the DATA attribute's NonResident property is set to True.  This means that the file's contents are too large to fit in the MFT Record, which is 1024 bytes, and are stored elsewhere on the disk.  Conveniently, that other location is described by the StartCluster and EndCluster arrays.  We can multiply the StartCluster and EndCluster values by the size of a cluster (typically 4096 bytes) to find the byte offsets on disk where the file's contents begin and end.  These properties are used by Get-ICat to output a byte array containing the file's contents.
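The cluster arithmetic is worth spelling out.  The Python sketch below turns parallel StartCluster and EndCluster lists into byte ranges and reads them from a volume-like file object.  It assumes each EndCluster value marks the first cluster after the run (an exclusive end); I have not stated which convention PowerForensics uses, so treat that as an assumption, along with the hypothetical function names and the real_size parameter used to trim slack from the final cluster.

```python
def runs_to_byte_ranges(start_clusters, end_clusters, cluster_size=4096):
    """Convert cluster runs into (byte_offset, byte_length) pairs.

    Assumption: each end cluster is exclusive (first cluster past the run).
    """
    if len(start_clusters) != len(end_clusters):
        raise ValueError("start and end arrays must be the same length")
    ranges = []
    for start, end in zip(start_clusters, end_clusters):
        if end < start:
            raise ValueError("run ends before it starts")
        ranges.append((start * cluster_size, (end - start) * cluster_size))
    return ranges


def read_runs(volume, ranges, real_size):
    """Concatenate the runs from a seekable binary stream, trimmed to real_size."""
    data = bytearray()
    for offset, length in ranges:
        volume.seek(offset)
        data += volume.read(length)
    return bytes(data[:real_size])  # drop the slack in the last cluster
```

With 4096-byte clusters, a run starting at cluster 100 begins at byte 409,600 on the volume.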


Below we use Get-ICat and Add-Content to output the SAM file to our Desktop.


Although we cannot hash the original SAM file (because of the error discussed above), we can throw the outputted SAM file into a Hex Editor and see the registry hive file header.  Imagine when we add registry parsing to PowerForensics...
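Instead of eyeballing the header in a hex editor, the first few fields can be checked programmatically.  This Python sketch reads the start of a hive's base block: the "regf" signature, the two sequence numbers, the last-written FILETIME, and the major and minor format versions.  The function name and dictionary keys are hypothetical, and interpreting matching sequence numbers as a consistently written hive is a general rule of thumb rather than something PowerForensics reports.

```python
import struct


def parse_hive_header(data: bytes) -> dict:
    """Parse the first 28 bytes of a registry hive base block."""
    if len(data) < 28:
        raise ValueError("need at least 28 bytes of hive header")
    if data[0:4] != b"regf":
        raise ValueError("missing regf signature")

    # Layout after the signature: two uint32 sequence numbers, a uint64
    # FILETIME, then uint32 major and minor version numbers.
    primary_seq, secondary_seq, last_written, major, minor = struct.unpack_from(
        "<IIQII", data, 4
    )
    return {
        "sequences_match": primary_seq == secondary_seq,
        "last_written_filetime": last_written,
        "version": (major, minor),
    }
```

A hive copied from a live system may well show mismatched sequence numbers, since the OS can be mid-write when the raw bytes are read.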


You can download the source code for PowerForensics from my GitHub.  To use PowerForensics within PowerShell, download the DLL in the repo and load it with the Import-Module cmdlet (Ex. Import-Module <pathToDLL>).

I'm excited to hear your feedback and suggestions for further development.

- Invoke-IR - By Jared Atkinson -