Sunday, January 27, 2013

Backtrack Forensics: bulk_extractor

Menu: Forensics -> Forensic Analysis Tools
Directory: /usr/local/bin/bulk_extractor

bulk_extractor is a tool that searches a disk image for regular expressions. It ships with quite a few predefined patterns, and we can also create our own: a single expression can be given on the command line, and multiple expressions can be read from a file. The tool runs in two phases: first it collects all the information from the disk, and then it creates histograms of what it found. It supports raw image (.dd), EnCase (.E01) and AFFLIB (.aff) files, and it can also be run directly against a disk. It is multithreaded. bulk_extractor can also create a wordlist of all the words found in the disk image, which can be used as a dictionary for password cracking.
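
To make the two phases a bit more concrete, here is a minimal Python sketch of the idea. It is not how bulk_extractor is implemented; the e-mail pattern, the function names and the sample bytes are all made up for illustration. Phase one records every match with its byte offset, phase two counts how often each distinct value occurs.

```python
import re
from collections import Counter

# Illustrative pattern only; bulk_extractor's own scanners are more elaborate.
EMAIL_RE = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def collect_features(data, pattern):
    """Phase 1: record every match together with its byte offset."""
    return [(m.start(), m.group(0)) for m in pattern.finditer(data)]


def build_histogram(features):
    """Phase 2: count how often each distinct feature occurs."""
    return Counter(value for _offset, value in features)


# Made-up "disk image" contents: binary junk with a few addresses in it.
sample = (
    b"\x00\x01alice@example.com\x00junk\x00"
    b"bob@example.org\xffalice@example.com\x00"
)

features = collect_features(sample, EMAIL_RE)
histogram = build_histogram(features)

for offset, value in features:
    print(offset, value.decode("ascii"))
for value, count in histogram.most_common():
    print(count, value.decode("ascii"))
```

A real image is far too large to hold in memory like this, so a practical version would have to read it in chunks and deal with matches that straddle chunk boundaries.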

Let's see how to use it:

bulk_extractor -o output_dir image - this will scan the image file, and put all the results in the output directory
bulk_extractor -o output_dir image -j 30 - set threads to 30
bulk_extractor -o output_dir image -j 30 -E pdf - turns off all scanners except pdf
bulk_extractor -o output_dir image -e wordlist - enables wordlist scanner
bulk_extractor -o output_dir image -f 'regex-goes-here' - enables regex scanning, results are written to find.txt
bulk_extractor -h - help

There are quite a few other options for tuning, for enabling or disabling individual scanners, and for scanning a directory structure.
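
If the scans need to be driven from a script, the options listed above can be assembled programmatically. The following is only a sketch: `build_command` and `run_scan` are hypothetical helper names, only the flags already shown in this post are used, and the image path is placed last on the assumption that the tool expects `bulk_extractor [options] imagefile`.

```python
import subprocess


def build_command(image, output_dir, threads=None, only_scanner=None,
                  enable_scanner=None, find_regex=None):
    """Assemble a bulk_extractor argument list from the options shown above."""
    cmd = ["bulk_extractor", "-o", output_dir]
    if threads is not None:
        cmd += ["-j", str(threads)]       # number of threads
    if only_scanner is not None:
        cmd += ["-E", only_scanner]       # turn off all scanners except this one
    if enable_scanner is not None:
        cmd += ["-e", enable_scanner]     # enable an extra scanner, e.g. wordlist
    if find_regex is not None:
        cmd += ["-f", find_regex]         # regex search, results go to find.txt
    cmd.append(image)                     # image path last (assumed ordering)
    return cmd


def run_scan(**kwargs):
    """Run the scan; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_command(**kwargs), check=True)


# Only prints the command; call run_scan(...) to actually start a scan.
print(build_command("image.dd", "output_dir", threads=30, only_scanner="pdf"))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when the regex contains spaces or special characters.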

Starting the scan (it will run for a few hours):


The results directory: there is a file for each scan type, containing all the matches found by that scanner.


Example of the domain file; the number on the left is the offset at which the match was found:
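
For post-processing, the result files are easy to read from a script. The sketch below assumes the usual layout of these files: `#` comment lines at the top, then one match per line with offset, feature and surrounding context separated by tabs. The function name and the sample lines are made up for illustration and are not output from a real scan.

```python
def parse_feature_line(line):
    """Split one result-file line into (offset, feature, context).

    Returns None for blank lines and '#' comment lines.
    """
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    parts = line.split("\t", 2)
    # The offset is kept as a string because it is not always a plain number
    # (matches found inside compressed data can carry a longer path).
    offset = parts[0]
    feature = parts[1] if len(parts) > 1 else ""
    context = parts[2] if len(parts) > 2 else ""
    return offset, feature, context


# Made-up sample lines in the assumed format.
sample_lines = [
    "# Feature-File-Version: 1.1\n",
    "1048576\texample.com\thttp://example.com/index.html\n",
    "2097152\tmail.example.org\tfrom mail.example.org by\n",
]

records = [r for r in map(parse_feature_line, sample_lines) if r is not None]
for offset, feature, _context in records:
    print(offset, feature)
```

To process a real file, the same function can be applied to each line of a file opened with `open(path, encoding="utf-8", errors="replace")`.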


Official Website: https://github.com/simsong/bulk_extractor
