Sunday, January 13, 2013

Backtrack Forensics: PDF analysis with peepdf

Menu: Forensics -> PDF Forensics Tools
Directory: /pentest/forensics/peepdf

This is by far the best PDF analysis tool I have seen for Linux; it's an all-in-one utility for analyzing PDF files. It shows all the objects and elements in the PDF, supports most encodings and filters, and can parse the PDF file. If you install Spidermonkey and Libemu, it provides Javascript and shellcode analysis too. You can do string searches and see the physical structure (offsets) of the file, the logical tree, the changelog, and object streams. It also offers basic PDF creation, filter and object modification, string and name obfuscation, and much more. Again, this is a great tool.

Usage:

It can be used in two modes, interactive and non-interactive; the first one is the more powerful. Before starting, it's worth running an update:

./peepdf.py -u


Then we can start the analysis. The most basic mode displays summary information about the PDF: the number of objects, suspicious elements, whether there is any JavaScript and where it is, the MD5 hash, etc.

./peepdf.py /root/forensics/pdf/msf.pdf
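
To get a feel for what such a summary is built from, here is a minimal Python sketch that computes the MD5 hash, a rough object count, and a count of suspicious names straight from the raw bytes. This is not peepdf's code: the function name quick_summary, the list of names, and the sample bytes are all assumptions made for illustration, and a real parser (like peepdf) also looks inside compressed and obfuscated objects, which this does not.

```python
import hashlib
import re

# Name tokens that commonly indicate active content in a PDF.
# Assumption: this short list is for illustration only, it is not peepdf's list.
SUSPICIOUS_NAMES = [b"/JS", b"/JavaScript", b"/OpenAction", b"/AA", b"/Launch"]


def quick_summary(data: bytes) -> dict:
    """Return the MD5 hash, a rough object count, and suspicious name counts."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # Require that the name is not followed by another letter or digit,
        # so a longer name that merely starts the same way is not counted.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        hits = len(re.findall(pattern, data))
        if hits:
            counts[name.decode()] = hits
    # Count "N G obj" headers found in plain view (objects hidden inside
    # compressed object streams are not seen by this simple scan).
    objects = len(re.findall(rb"\b\d+\s+\d+\s+obj\b", data))
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "objects": objects,
        "suspicious": counts,
    }


# Hypothetical, hand-written PDF fragment used only to exercise the function.
sample = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog /OpenAction 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
    b"%%EOF\n"
)

result = quick_summary(sample)
print(result["objects"], result["suspicious"])
```

For a file on disk, the same function can be fed with open(path, "rb").read().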


I'm using the malicious PDF created earlier with Metasploit; it exploits a getIcon vulnerability and carries a reverse_tcp Meterpreter as the payload. We can see that peepdf found right away that the PDF is probably malicious, and it also provided the CVE number. We can switch to interactive mode to discover more:

./peepdf.py -i /root/forensics/pdf/msf.pdf

It displays the same information again, but with more colors, and we get a PPDF> prompt from which we can run commands. Let's look at a couple of the options:

PPDF> help - displays the available commands


PPDF> info - displays the same basic information that we saw before
PPDF> metadata - displays metadata information of the file
PPDF> changelog - shows the changes made to the PDF across its versions
PPDF> tree - displays the logical structure of the file


PPDF> offsets - displays the physical structure (offsets) of the file



PPDF> object [ID] - shows the decoded content of an object
PPDF> rawobject [ID] - shows the raw content of an object


 

PPDF> stream [ID] - shows the decoded content of a stream
PPDF> rawstream [ID] - shows the raw content of a stream
PPDF> search [string] - searches for a string in the PDF and shows which objects contain it
PPDF> references to|in [ID] - shows the references to the object (to) or the references inside the object (in)


PPDF> js_analyze [ID] - analyzes the JavaScript in the object, if Spidermonkey is installed
PPDF> quit - exit the program
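
The difference between the raw and decoded views (rawstream vs. stream) is easiest to see with a compressed stream. The Python sketch below assumes a stream that uses only the common FlateDecode filter, which is plain zlib compression; it is an illustration of the idea, not how peepdf is implemented, and the function name and the JavaScript payload are made up for the example. peepdf itself handles many more filters and chains of filters.

```python
import zlib


def decode_flate_stream(raw: bytes) -> bytes:
    """Undo the FlateDecode filter (zlib) on the raw bytes of a stream."""
    return zlib.decompress(raw)


# Hypothetical JavaScript that a PDF might carry inside a stream.
script = b"app.alert('hello from a PDF stream');"

# What is stored in the file: compressed, unreadable bytes (the "raw" view).
raw_stream = zlib.compress(script)

# What an analyst wants to read: the original content (the "decoded" view).
decoded = decode_flate_stream(raw_stream)

print(repr(raw_stream[:8]))
print(decoded.decode("ascii"))
```

The first print shows binary noise, the second shows the readable script, which is why the decoded views are usually the ones worth reading first.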

Official website: http://code.google.com/p/peepdf/
