Saturday, September 21, 2013

What's New in the Prefetch for Windows 8??

I am currently working on a PowerShell module that will provide a remote forensic capability for forensicators and incident responders alike.  I know many tools that parse file based forensic artifacts (Ex: Prefetch), but none of these tools parse remote artifacts.  Remote parsing provides defenders with many advantages, including a capability called Least Frequency of Occurrence (LFO).  If a defender can aggregate data from every prefetch file on every host in a large network, they can use LFO to spot anomalies, which may in turn point out malicious activity.


While creating a cmdlet, to parse Windows Prefetch files, I realized that there were many inconsistencies between the format of prefetch files on Windows XP and Windows 7.  I also realized that little research had been done regarding the structure of Windows 8 prefetch files, and assumed there would be small inconsistencies with this version as well.  I am writing a blog series to describe the methodology I used to learn Windows 8 prefetch file format, and to document some of the interesting finds along the way.


What is Prefetch?
Simply put, application prefetching is used to store information about frequently used applications to help Windows load them more quickly in the future.  This information is stored in "prefetch" files found in the %systemroot%\Prefetch\ directory.  At any given time the system can keep up to 128 (Windows XP/2003/Vista/7/2008) or 1024 (Windows 8/8.1/2012) individual prefetch files (Each one correlates to a single application).  These files contain information about the application like how many times it was run, when was it last run, what path it was run from, what external files it loaded in its first 10 seconds of execution, etc.


Understanding the Prefetch File Format (Windows 8)

This series of examples is using a cmd.exe prefetch file from Windows 8


File Signature (Offset: 0x00 8-bytes)




The first value found in the data structure is the File Signature at Offset 0x00.  The signature is a 8-byte value containing (0x1A,0x00,0x00,0x00,0x53,0x43,0x43,0x41), the first of which represents the OS (0x11 = XP, 0x17 = 7, 0x1A = 8).  The ASCII representation of the final 4 bytes is SCCA, which is commonly recognized as the Prefetch File Signature.


Application Name (Offset: 0x10 60-bytes)




Next, at Offset 0x10, we see a Unicode Formatted String that says "CMD.EXE".  This value represents the name of the application for which this prefetch record has been created.  This name can/should be compared to the name of the prefetch file itself to ensure the file name has not been tampered with.  One important detail is that this value is represented in 60 bytes, to allow for variable lengths of application names.  The standard is for the application name to be terminated by Unicode NULL (0x0000).


File Path Hash (Offset: 0x4C 4-bytes)




Have you ever wondered about those 8 random hex values in name of prefetch files?  Interestingly enough the value is a hash of the application's file path.  Remember earlier I said that one prefetch file is created for each application run?  Well...that is not entirely truthful.  A more precise explanation is that one prefetch file is created per application from a specific file path.  When cmd.exe is run from C:\Windows\system32 one prefetch record is created.  However, if I copy cmd.exe to C:\Windows and execute it, a second prefetch file will be created.

From a detection/investigation perspective this hash can be useful in finding malware hiding in plain sight.  Often times attackers will name their malware as common Windows applications, putting their version in a different directory than the real thing (Ex: Placing svchost in C:\Windows instead of C:\Windows\system32).  If they are careless and execute the malware without cleaning up their tracks, there will be a conspicuous second prefetch record for svchost that will provide investigators with some attack details.

Similarly to the application name, the file path hash is stored inside the prefetch file.  This can again be used to check for simple manipulation of the prefetch file name.

NOTE: The majority of non-unicode values are stored in Little Endian, so when parsing/reading the values make sure you flip the bytes.

Application Run Count (Offset: 0x0D 4-bytes) 


One of the more useful pieces of information found in the prefetch file is the run count, found at Offset 0xD0).  The run count tells a forensicator how many times the application has been executed. Remember this value is stored in Little Endian, so this example shows the run count to be one.

Last Access Timestamp (Offset: 0x80 8-bytes)


The last really useful piece of information found in the main section of the prefetch file is the last accessed timestamp.  This value is of extreme importance during an investigation because it tells the investigator when the program was last executed.  This value is stored as a "FILETIME" object, which is describe by Microsoft as being "the number of 100-nanosecond intervals that have elapsed since 12:00 A.M. January 1, 1601 Coordinated Universal Time (UTC)".  There are plenty of way to turn the hex value into an actual time (Ex: the hex editor I'm using automatically converts hex timestamps to the actual time, or you can use PowerShell to convert the value with .NET).

Christmas coming early...Thanks Microsoft!
Some of you may be familiar with the file structure of a prefetch file in previous versions of Windows.  If you are you are probably asking why the run count (Offset 0xD0 isn't directly following the timestamp (Offset 0x80) like it always has.  Well my friends, this is because Microsoft has decided to throw us a bone for whatever reason.  Prefetch files in Windows 8 contain not one timestamp, not two timestamps, but 8 Last File Accessed Timestamps!  Below I show a quick demonstration of how the Windows 8 prefetch files store the 8 timestamps.

2 Timestamps


Above is a screen shot of the same CMD.EXE prefetch file, except I ran the application an extra time to increase the run count (0xD0) to two times.  The important thing to notice is that the original timestamp has been shifted over to Offset 0x88, and the most recent timestamp has moved in to the 0x80 position.   Well this is interesting, but what about the other 64 bytes between the 2nd timestamp and the run count value?

8 Timstamps


After some trial and error testing I was able to determine that the Windows 8 prefetch files will store the 8 most recent last accessed timestamps.  Notice in the screenshot that the original time is still there, but it has been moved all the way down to the basement (Offset: 0xB8).  It is interesting to note that there is still 16 bytes of 0's.  What ever comes of that space?

What happens after 8?!?


Well that is a good question.  I have noticed that those 0's are changed after the timestamp slots a filled, but I have yet to determine what the value represents (If anyone knows or figures it out please let me know).  I am assuming it has something to do with the timestamps because the values seem to only appear after the 8 timestamp values have been created.

Conclusion
Well this concludes the first portion of my foray into the Windows 8 Prefetch File Structure.  My next post will discuss the different sections contained within prefetch files, and what information can be found in each section.

Thursday, August 15, 2013

R2D2 Memory Sample Analysis


Introduction
One of the skills I have been learning over the past year and a half or so is memory analysis.

For those of you that don't know the Volatility's Google Code page has made quite a few memory samples available for test analysis.  I stumbled upon them this evening, and I figured I'd try to tear one apart for an hour or so, and see where that got me.

The sample I decided to begin with, at random, was the one titled R2D2 (cool sounding name)

For my analysis I used Volatility 2.2 and SANS' Ubuntu SIFT Workstation VM.


Setup
Personally I don't like to look at long command lines, so the first order of business during memory analysis is to set up my Volatility environment variables (VOLATILITY_LOCATION & VOLATILITY_PROFILE).  Initially the profile (OS/Service Pack/Architecture) is not known, so I settle for setting the location with the following command:


Once the location is set we can start using Volatility!  The great news is that Volatility already knows what image we want to analyze because of the variable we just set.  The first plugin that should be run during memory analysis is imageinfo.  imageinfo provides us with the image's profile, which is a cruicial piece of information when using Volatility.  Specifying an image tells Volatility what data structures to expect when parsing the image.  To execute the imageinfo plugin you simply type:


Now that we have found the suggested profile we can specify our VOLATILITY_PROFILE environment variable:



Analysis
Ok, we have handled all the necessary setup, and now it is time to roll up our sleeves and dig into some R2D2 memory analysis.  I like to start my analysis looking for low hanging fruit (processes, network connections, services, and drivers).  These artifacts are the most tangible artifacts, and we tend to have the most experience with them.  I began with the pstree plugin to see a hierarchical view of the processes relative to their parent process:


Initially nothing stuck out from the process tree, so I moved on to psxview.  The psxview plugin queries process data structures using 4 or 5 different methods.  The output contains a list of processes, and tells the analyst whether each method was successful in detecting the process or not.  For example, if a process is found using the pslist method (follows the linked list from one EPROCESS structure to another), but not using the psscan method (search for EPROCESS structures regardless of linked list), then it is a good indication the system has been manipulated (possibly by a rootkit using DKOM techniques).  You can use the psxview plugin like so:


This sample appears to have no hidden processes.

It is always a good idea to see what network traffic was occurring at the time of the memory capture.  The connscan plugin will parse connection information from the image.


It appears there was an established connection from the localhost on local port 1026 to 172.16.98.1 on port 6666.  This connection was established by PID 1956 or explorer.exe (correlated from pstree output).  This connection is suspicious because explorer.exe does not typically make network connections, so this connection lets us know that explorer.exe is probably our process of interest.

Using the output of pstree we see that cmd.exe is a child process of explorer.exe, and it might be nice to see what commands were issued to that cmd shell.  One cool plugin is the cmdscan (or consoles for Windows 7 and Server 2008).  The cmdscan plugin parses the command history buffer located in csrss.exe (or conhost.exe on Windows 7 and Server 2008), and returns any commands that remain in the memory buffer.  Below is the command line usage for the cmdscan:


In the command history buffer were two commands "sc query malwar" and "sc query malware" (Just what we were looking for!).  It appears the attacker was checking on the registration details for a service named malware.  At this point the service name is a good indicator of malicious activity, but I've seen some pretty strange things in the past.

To enumerate that service information we can check the HKLM\SYSTEM\ControlSet001\Services registry key for the "malware" service's registration details.  We have to first find the virtual memory address of the System hive, which we can do using the hivelist plugin.


Once we have the virtual address (0xe1018388) we can enumerate the key ControlSet001\Services\malware using the printkey plugin.


Looking at the malware service we notice the service's image path was C:\WINDOWS\system32\drivers\winsys32.sys with a start type of 1 which according to Microsoft "represents a driver to be loaded at kernel initialization", or what appears to be the persistence mechanism.

Now we need to learn more about this apparent driver, and maybe even grab a copy of it for ourselves.  Using Volatility's svcscan we can gather more information about the "malware" service.


The results show that the driver associated with this service is recognized as \driver\malware.  We can now work toward grabbing a copy of the running driver.  We must execute a driverscan to access the starting memory address of the driver.


Once the address is obtained (0xf9eb4000) we can use moddump to dump the driver at the address specified using the  -b parameter.


We now have a copy of the suspected malicious driver (winsys32.sys), which we can pass to Virus Total (if we think it has been seen before) or our reverse engineers.  I submitted the driver to Virus Total, and it was flagged by 14/35 anti-virus programs as the R2D2 backdoor.


We can perform rudimentary analysis of the driver using the strings utility.


Below is an example of view the ascii strings contained within the malicious file.


Earlier we found a suspicious network connection coming from explorer.exe.  We know explorer.exe to be a legitimate process, so we may want to look for dll injection as a possible avenue of infection. We can list explorer.exe's loaded modules using the dlllist plugin.


One module (mfc42ul.dll) seems to stand out because its virtual address is so different than other modules (0x10000000 vs 0x70000000).  I am unsure if this is a valid indicator of dll injection or just a coincidence, but I submitted the module to Virus Total, and the dll was flagged as being associated with the same R2D2 backdoor.




NOTE: In the ascii strings output of the malicious driver (winsys32.sys) we find references to the malicious module (mfc42ul.dll).


We perform some quick triage of the dll and notice some strings that indicate HTTP functionality, and we also see a string "C3P0-r2d2-POE" which appears to be the string that got this malware its name.



That wraps up my analysis of the R2D2 backdoor.  Upon completion of my analysis, I stumbled upon evild3ad's blog post documenting his analysis of the same sample.  Please check it out!

If anyone has additional details I missed, or has any feedback to improve my methodology it would be greatly appreciated.


Monday, July 29, 2013

PowerShell Intro

Before covering how Incident Responders can leverage Windows PowerShell, I thought it would be appropriate to provide a small introduction for those who are not familiar with all the cool things that can be done through PowerShell.

What is PowerShell:
Windows PowerShell is Microsoft's answer to our cries for a better scripting environment and command shell.  Scripting in PowerShell is much more intuitive than batch, and although PowerShell does have some Microsoft centric quirks it more closely resembles other interpreted languages like Python.  Microsoft has made the entire .NET framework available through PowerShell, which allows a great deal of flexibility to those who want to take on development.  Through PowerShell, Microsoft without a doubt opened many doors for Forensicators and Incident Responders.

PowerShell introduces two features which make scripting significantly more easy.  The first feature is tab completion, those readers familiar with unix will understand how nice tab completion can be.  If you do not know the exact command syntax all you have to do is press tab and PowerShell will do the work for you.  The second, and in my opinion most important, feature is the object oriented nature of PowerShell.  PowerShell commands (Cmdlets) return .NET objects, which can be manipulated to display the desired results.  These new features alone make scripting in PowerShell easier and more logical than batch or VB scripting.

PowerShell History:
To better understand Windows PowerShell's origins refer to Jeffrey Snover's Monad Manifesto. The manifesto describes Monad platform, one part of which was the Monad Shell.  The Monad Shell is the brainchild which would eventually morph into PowerShell.

Availability:
PowerShell version 2.0 is available on Windows 7 and Windows Server 2008 by default.  Depending on your patching you may have version 2.0 if you are running Windows XP SP3, Windows Vista SP1, or Windows Server 2003 SP2. If you have a pre Windows 7 host that does not have PowerShell you can find the appropriate patches here.  PowerShell version 3.0 is available on Windows 8 and Windows Server 2012, and version 4.0 will be introduced with Windows 8.1 Beta and Windows Server 2012 R2.  To learn more about the additions to PowerShell in version 3.0 and 4.0 click the links embedded in their version numbers.

Environments:

PowerShell provides users two main environments, the Command Line Interface (CLI) and the Integrated Scripting Environment (ISE).

Windows PowerShell (CLI)

Windows PowerShell ISE
The CLI is your go to interface for PowerShell.  You can treat it exactly the same as the Windows Command Shell (in fact if you enter a native Windows command, PowerShell will automatically spawn a Command Shell and execute the command for you).  The CLI's utility begins to run out when you want to introduce logic or use more than one line which is why Microsoft gives us the ISE.  The ISE is enabled by default on Windows 7, but on Server 2008 you must install the Windows PowerShell Integrated Scripting Environment (ISE) feature.  The ISE provides, as the name implies, an environment that is fairly useful for writing, testing, and debugging scripts.  As seen in the picture above the ISE has three windows.  The window on the right is the script pane, where users can author scripts using some niceties like tab completion.  The bottom left window is the console pane which has practically identical use as the Windows PowerShell CLI.  The top left windows is the output pane, which is where output is written by default (PowerShell supports many different outputs such as: text, csv, html, and xml)

Cmdlet Naming Convention:
All PowerShell cmdlets follow a standard naming convention which is called the verb-noun convention.  Each cmdlet name will consist of a verb followed by a dash (-) and a singular noun.  Some example cmdlet names are Get-Process, Stop-Service, Set-Variable, and New-Object.  MSDN has a list of each approved verb and its meaning.

3 Cmdlets to Rule them All:
There are three cmdlets that will enable new users to use every other cmdlet available in Windows PowerShell.

The first cmdlet every PowerSheller should know is Get-Command.  This cmdlet returns a list of every cmdlet available in the current shell.  Get-Command can be used alone, or if you are looking for a specific type of cmdlet you can the -Verb parameter to specifically look for cmdlets that perform a certain action, and -Noun which will show all cmdlets pertaining to a specific type of object.

The second important cmdlet is Get-Help which is the equivalent of Linux's man command.  This cmdlet returns the help file which contains syntax, parameter descriptions, and usage examples.
Once you have found the cmdlet you want to use, and have determined the proper syntax it is important to understand what output you will receive.

As mentioned earlier PowerShell is object oriented, so each cmdlet will return an object with a series of properties and methods.  The cmdlet Get-Member accepts an object as a parameter, and will return the properties and methods of the object.  The most common syntax for Get-Member is Get-Process | Get-Member which will provide the output of Get-Process as the input object for Get-Member

PS Drives:
One final feature readers should be aware of is how PowerShell deals with structured system data.  PowerShell has a feature known as PSDrives through which it treats structured system data (i.e. the Windows Registry) as if it were the file system.  Additionally, PSDrives do not require single letter names, so you can name you data drive "Data" instead of "D:\".

To enumerate all PSDrives on the system use the cmdlet Get-PSDrive.  Below is some example output of the Get-PSDrive cmdlet.  As you can see there are many PSDrives available in the current PowerShell session such as the Local Machine and Current User registry hives, as well as, shell variables and aliases.  PowerShell treats all data as items, and the PSDrive concept allows administrators to learn one method that will work for files, registry, environment variables, etc.



Further Reading:
For those interested in learning a little more in depth about PowerShell basics there is an excellent book written by Don Jones called Learn Windows PowerShell 3 in a Month of Lunches. Don Jones is one of, if not the most, respected minds in regard to PowerShell, and he does an excellent job introducing this application to readers through a series of 1 hour lessons or lunches (I read it over a weekend because I was in a hurry to learn as much as possible).

- Invoke-IR - By Jared Atkinson -