Note: This is the first post in a series on pattern visualization.
Identifying patterns, excluding patterns, finding outliers, and linking one pattern to another is what we as analysts do every day. It is a combination of "hey, that's funny", "Eureka!", "I've got you now", and "Did you see that!". It takes a lot of hard work, more than a pinch of patience, a smidgen of luck, a game of connect the dots, and perhaps several harsh words as (insert name of tool here) crashes.
This ability comes mainly through experience, a lot of practice, and a lot of data swimming in front of our eyes.
For the new analyst, the ability to view raw data and identify exploitable data or data of evidentiary value generally arrives only after being drowned in case data for a year or more. Experienced analysts can do this because they recognize the patterns in their data.
This ability begins simply with being able to recognize the file signature for a file, then it progresses to a handful of files, and before long you are a human file signature analyzer.
|Image: PK signature for Zip and Office XLSX/DOCX/etc files.|
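For anyone who wants to see the idea in code, here is a minimal sketch of what the "human file signature analyzer" is doing in their head. The signature table and the function name `identify_signature` are my own illustrative choices, and the table is nowhere near complete; real tools carry far larger signature databases.

```python
# A tiny, illustrative signature table (an assumption for this sketch,
# not a complete database).
SIGNATURES = {
    b"PK\x03\x04": "ZIP container (also DOCX/XLSX and similar Office files)",
    b"\xff\xd8\xff": "JPEG image",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"%PDF-": "PDF document",
}


def identify_signature(data: bytes) -> str:
    """Return a description of the first matching header, or 'unknown'."""
    for magic, description in SIGNATURES.items():
        if data.startswith(magic):
            return description
    return "unknown"


# The first bytes of a ZIP-based file, as seen in a hex viewer: 50 4B 03 04 ...
print(identify_signature(bytes.fromhex("504b030414000600")))
```

Note that the PK header only tells you the container is a ZIP; telling a DOCX from an XLSX means looking inside the archive.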
The next step is understanding file system artifacts. Does "FILE", "BAAD", or "INDX" ring a bell? If so, then congratulations, you can identify NTFS file system artifacts.
|Image: NTFS MFT "FILE" record.|
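The same eyeball scan for NTFS tags can be roughed out in a few lines. This is only a sketch: the function name `scan_ntfs_signatures` is hypothetical, the 512-byte step assumes the records sit on sector boundaries, and a four-byte match is a lead to examine, not proof of a valid record.

```python
# Tags named in the text above.
NTFS_SIGNATURES = (b"FILE", b"BAAD", b"INDX")


def scan_ntfs_signatures(buf: bytes, step: int = 512):
    """Yield (offset, tag) for each aligned offset that begins with a known tag.

    The 512-byte step is an assumption (sector alignment); adjust as needed.
    """
    for offset in range(0, len(buf), step):
        tag = buf[offset:offset + 4]
        if tag in NTFS_SIGNATURES:
            yield offset, tag.decode("ascii")


# Synthetic buffer standing in for a chunk of raw disk data.
sample = bytearray(2048)
sample[0:4] = b"FILE"
sample[1024:1028] = b"INDX"

for offset, tag in scan_ntfs_signatures(bytes(sample)):
    print(f"offset {offset}: {tag}")
```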
Now how about "URL", "HASH", "REDR", and "LEAK"? If you recognize these, then you can identify the older Internet Explorer history records.
Once you can recognize the hex-level internals of various file formats, you will start recognizing the patterns of encoded data. You will be able to say to yourself (subvocalize; co-workers get concerned or annoyed when you start talking to yourself): "hey, that looks like a Windows timestamp", or "that's UTF-8 encoded Arabic text converted to URL encoding", or "that's a Base64 encoded JPEG".
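Two of those encoded patterns are easy to demonstrate. The sketch below uses sample values I made up for illustration: a JPEG header run through Base64, and a short percent-encoded Arabic word. The visual tells are the leading "/9j/" (what the JPEG bytes FF D8 FF become in Base64) and the run of "%D8" and "%D9" pairs (the UTF-8 lead bytes for the most common Arabic letters).

```python
import base64
from urllib.parse import unquote


def looks_like_base64_jpeg(text: str) -> bool:
    """Heuristic only: FF D8 FF encodes to '/9j/' in standard Base64."""
    return text.lstrip().startswith("/9j/")


# Illustrative JPEG/JFIF header bytes, then the same bytes as Base64 text.
jpeg_header = bytes.fromhex("ffd8ffe000104a464946")
encoded = base64.b64encode(jpeg_header).decode("ascii")
print(encoded, looks_like_base64_jpeg(encoded))

# Illustrative URL-encoded UTF-8 Arabic text; unquote() decodes as UTF-8 by default.
url_fragment = "%D8%B3%D9%84%D8%A7%D9%85"
print(unquote(url_fragment))
```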
|Image: Windows timestamp. NTFS MFT record file creation timestamp.|
Did you catch the three timestamps following the highlighted one in the image above?
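To check a suspected Windows timestamp, it helps to decode it. A Windows FILETIME is a 64-bit little-endian count of 100-nanosecond intervals since 1601-01-01 UTC. The sketch below is a minimal decoder; the function name `filetime_to_datetime` and the sample bytes are my own, not taken from the image.

```python
import struct
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond ticks from this epoch.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(raw: bytes) -> datetime:
    """Decode 8 little-endian bytes as a Windows FILETIME (UTC)."""
    (ticks,) = struct.unpack("<Q", raw)
    # Ten ticks per microsecond; sub-microsecond precision is dropped here.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# Illustrative value: the tick count that corresponds to the Unix epoch.
sample = struct.pack("<Q", 116444736000000000)
print(sample.hex(" "), "->", filetime_to_datetime(sample).isoformat())
```

The pattern to train your eye on: for dates from roughly 2000 through the late 2020s, the last two stored bytes are a value between C0 and DF followed by 01, which is why these timestamps stand out in a hex view.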
You may be asking: "Why does it matter if I have this ability or not?"
- Any script monkey can press buttons. The key to truly evolving beyond this lowly simian is the ability to visualize patterns in the data.
- Being able to identify patterns is critical for reverse engineering unknown file formats, recovery of data from corrupt files, and for processing data fragments.
- As an analyst you should be able to understand and validate/verify your tools. A large portion of the digital forensic process is about identifying patterns to verify or recover data:
  - File signature analysis
  - File carving
  - File system recovery
  - Partition recovery
  - etc.
- Also, keep in mind that the scripts or tools you are using cannot cover or be tested against every permutation of data. Often the developer of a tool is creating their software based on the partial reverse engineering of an undocumented file format. This can lead to the tool failing, leaving you to pick up the pieces. If you need actionable evidence or intelligence now, you can't wait for the developer's next release in another six months.
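As a taste of how pattern recognition underpins one of the processes listed above, here is a deliberately naive file carving sketch. It looks for a JPEG header (FF D8 FF) and the next footer (FF D9) in a raw buffer. The function name `carve_jpegs` is hypothetical, and the approach ignores fragmentation, embedded thumbnails, and footer bytes that occur by chance inside image data, which are exactly the cases where a tool's output needs to be verified by eye.

```python
def carve_jpegs(buf: bytes):
    """Yield (start, end) offsets of candidate JPEGs found by header/footer match."""
    header, footer = b"\xff\xd8\xff", b"\xff\xd9"
    pos = 0
    while True:
        start = buf.find(header, pos)
        if start == -1:
            return
        end = buf.find(footer, start + len(header))
        if end == -1:
            return
        # 'end' is exclusive: it points just past the footer bytes.
        yield start, end + len(footer)
        pos = end + len(footer)


# Synthetic buffer: padding, a fake "JPEG", more padding.
blob = b"\x00" * 10 + b"\xff\xd8\xff\xe0" + b"data" + b"\xff\xd9" + b"\x00" * 5
for start, end in carve_jpegs(blob):
    print(f"candidate JPEG at bytes {start}..{end}")
```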
To be continued ....