Thursday, July 19, 2012

Timeline Analysis - What's missing & What's coming..

If you missed my SANS 360 on timeline analysis...

What the heck is timeline analysis??  

Timeline creation and presentation is the concept of normalizing event data by time and presenting it in chronological order for review. This sequence of event data becomes a narrative, “a story”, of events over a period of time. Furthermore, it can be used to put events into context, interpret complex data, and identify anomalies or patterns. The concept of timeline creation and presentation is widely used across many practices, including Digital Forensics and Incident Response (DFIR).

For DFIR purposes, timeline creation and presentation primarily consists of recursively scanning through a file system (or linearly through a physical disk or partition image) and extracting forensic artifacts and their associated timestamp data. The data is then converted to a normalized, structured format so that it can be reviewed in chronological order.
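To make "normalizing by time" concrete, here is a minimal Python sketch. It is not how log2timeline works internally; the function names and the sample events are made up for illustration. It assumes artifacts arrive with either a Unix epoch timestamp or a Windows FILETIME value, converts both to UTC, and sorts the result chronologically.

```python
from datetime import datetime, timedelta, timezone

# Windows FILETIME counts 100-nanosecond ticks since 1601-01-01 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def from_unix(seconds):
    """Convert Unix epoch seconds to an aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def from_filetime(ticks):
    """Convert a Windows FILETIME value to an aware UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# Hypothetical mapping from a timestamp "kind" to its converter.
CONVERTERS = {"unix": from_unix, "filetime": from_filetime}


def build_timeline(raw_events):
    """raw_events: iterable of (kind, raw_value, source, description)."""
    rows = [
        (CONVERTERS[kind](value), source, description)
        for kind, value, source, description in raw_events
    ]
    # Chronological order is what turns a pile of artifacts into a timeline.
    return sorted(rows, key=lambda row: row[0])


if __name__ == "__main__":
    # Made-up sample events, for illustration only.
    sample = [
        ("unix", 1262304000, "web history", "visited example.com"),
        ("filetime", 128752416000000000, "registry", "key last written"),
    ]
    for when, source, description in build_timeline(sample):
        print(when.isoformat(), source, description)
```

Once every timestamp is in the same representation, events from unrelated sources can be read side by side in a single sequence.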

Creation and Filtering

A tool named “log2timeline”, by Kristinn Gudjonsson, is an example of a framework for automatic creation of forensic timeline data. If you are interested in learning more about timeline creation and analysis using log2timeline, I suggest starting with Kristinn's list of resources or taking the NEW SANS 508 class (here's a review I authored based on my experience). The main purpose of log2timeline is to provide a single interface to parse various log files and artifacts found in evidence, such as file system data, Windows event logs, Windows registry last written times, and Internet history. The data is then output to a structured format such as CSV, SQLite, or TLN.

After the timeline is created, it can be filtered using “l2t_process”. This tool allows a user to “reduce” the size of the timeline by creating a subset of data responsive to certain keywords or time/date restrictions. For instance, a 5 GB timeline file could be filtered to 3 GB by running l2t_process with a date filter to only show events that occurred between 2009 and 2010.
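The idea behind a date filter can be sketched in a few lines of Python. This is not l2t_process and does not use its options. It is an illustration that assumes a CSV with a `date` column in MM/DD/YYYY form; the function name `filter_by_date` and the sample rows are hypothetical.

```python
import csv
import io
from datetime import date, datetime


def filter_by_date(lines, start, end, date_column="date", date_format="%m/%d/%Y"):
    """Yield CSV rows whose date falls within [start, end], inclusive.

    The column name and date format are assumptions about the input file.
    """
    for row in csv.DictReader(lines):
        day = datetime.strptime(row[date_column], date_format).date()
        if start <= day <= end:
            yield row


if __name__ == "__main__":
    # Made-up sample data standing in for a much larger timeline file.
    sample = io.StringIO(
        "date,time,source,desc\n"
        "12/31/2008,23:59:59,EVT,too early\n"
        "06/15/2009,08:30:00,EVT,in range\n"
        "01/01/2011,00:00:00,EVT,too late\n"
    )
    for row in filter_by_date(sample, date(2009, 1, 1), date(2010, 12, 31)):
        print(row["date"], row["time"], row["desc"])
```

Because rows are read and yielded one at a time, the same approach works on a multi-gigabyte file without loading it all into memory.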

Presentation

At the time of writing, there is no commercial or open-source tool specifically designed for DFIR professionals to review the output of log2timeline or forensic timeline data in general. Therefore, DFIR professionals are limited to tools not specifically designed for forensic timeline data presentation, such as Microsoft Office Excel, Splunk, or grep. This limitation decreases productivity and increases the risk of error.

Some deficiencies of current presentation options
Microsoft Office Excel is a common method of reviewing forensic timeline data. Although Microsoft Excel is intuitive and robust, it has fundamental limitations. For example, the average output from log2timeline (based on a 320GB hard drive) is 5-10 million rows of data, equaling approximately 3-5GB. Microsoft Excel version 2010 has a limit of 1,048,576 rows and version 2003 has a limit of 65,536 rows. This severely limits DFIR professionals to viewing parts (“mini-timelines”) of the overall timeline, often based on filtering criteria (pivot points) such as date ranges, keyword searches, or source types. As a result, context can be lost by not having the entire timeline to review. It can also make reviewing timeline data an iterative process, because multiple mini-timelines have to be reviewed.
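To put those row limits in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the figures quoted above and ignores header rows; the helper name `sheets_needed` is just for illustration.

```python
import math

# Row limits quoted above for each Excel version.
EXCEL_ROW_LIMITS = {"Excel 2003": 65_536, "Excel 2010": 1_048_576}


def sheets_needed(timeline_rows, rows_per_sheet):
    """Minimum number of full-size chunks needed to hold the timeline."""
    return math.ceil(timeline_rows / rows_per_sheet)


if __name__ == "__main__":
    for version, limit in EXCEL_ROW_LIMITS.items():
        low = sheets_needed(5_000_000, limit)
        high = sheets_needed(10_000_000, limit)
        print(f"{version}: {low} to {high} mini-timelines")
```

Even with Excel 2010, a typical timeline has to be split into several pieces before it can be opened at all.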

Slide from my SANS 360 talk
On November 19, 2011 Klien&Co published an article documenting how to use Splunk to review timeline data. Splunk is a robust enterprise application that collects and indexes data from various data sources. Splunk stores both raw and rich data indexes in an efficient, compressed, filesystem-based datastore, with optional data signing and auditing to prove data integrity. However, Splunk is complicated to use, as it requires knowledge of the command-line interface (CLI) and specific training on the tool. It is also difficult to generate reports and to administer as a user.

grep, a CLI tool, is another option to parse and review forensic timeline data. However, for the average DFIR professional who is not familiar with the CLI, it can be a complicated and inefficient method.

The Need

A better #$%!$@ way to review timelines [period].

The goal of my first phase of development was to create a forensic presentation tool specifically for timeline data. This would be a robust Graphical User Interface application that does the following:
  • Import structured timeline data, such as a log2timeline CSV file, into a structured database. This would allow for fast indexed searches across large data sets.
  • Upon import, the application would allow the user to preserve source information. This will allow a practitioner to review data from multiple data sources in a SUPER timeline and easily correlate events across these different sources.
  • Subsequently, the forensic timeline data will be displayed for review in a Graphical User Interface (GUI) data grid similar to Microsoft Excel. It will have familiar features such as the ability to sort, filter, and color code rows by column headings or values. For instance, a user could import timeline data from 10 different hosts, filter to only show successful logons (based on evt log source types) between 2009 and 2010, and color code the results by host to make the review process easy on the eyes :-)
  • Unlike Excel, make filtering transparent: visually see and understand how the buttons you are pressing interact with the database and the results you are presented with -- a SQL query builder.
  • The interface would also be intuitive to the extent that a user could create user-defined tags, comments, and bookmarks for the purpose of reporting, filtering, and assisting review. For instance, a user could create the tag “evidence of IP theft” and subsequently select one or multiple rows in the data grid and associate them with this tag -- just like you can in eDiscovery!!
  • At any point, generate reports or export data from the grid view. For example, export a filtered subset of data back into the CSV format to open in Excel or send to someone else.
  • Ability to create custom queries, so the user is not limited by the GUI - think plugins!!!
  • Also, basic charting capability because "a picture can sometimes tell a thousand words".
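A few of the goals above (indexed import, preserving the evidence source, transparent filtering, and tagging) can be sketched with Python's built-in sqlite3 module. This is not the tool's actual code. The table layout, the function names, and the CSV columns (`date`, `time`, `sourcetype`, `host`, `desc`) are assumptions chosen for illustration.

```python
import csv
import sqlite3
from datetime import datetime

# Hypothetical schema: one row per event, plus a many-to-many tags table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    id INTEGER PRIMARY KEY,
    evidence TEXT NOT NULL,      -- which image or host the row came from
    timestamp TEXT NOT NULL,     -- 'YYYY-MM-DD HH:MM:SS', sorts as text
    sourcetype TEXT,
    host TEXT,
    description TEXT
);
CREATE INDEX IF NOT EXISTS idx_events_timestamp ON events (timestamp);
CREATE TABLE IF NOT EXISTS tags (
    event_id INTEGER NOT NULL REFERENCES events (id),
    tag TEXT NOT NULL,
    PRIMARY KEY (event_id, tag)
);
"""


def import_timeline(conn, lines, evidence):
    """Load CSV rows into the events table, recording their evidence source."""
    conn.executescript(SCHEMA)
    rows = []
    for row in csv.DictReader(lines):
        stamp = datetime.strptime(
            row["date"] + " " + row["time"], "%m/%d/%Y %H:%M:%S"
        )
        rows.append(
            (evidence, stamp.isoformat(sep=" "), row["sourcetype"],
             row["host"], row["desc"])
        )
    conn.executemany(
        "INSERT INTO events (evidence, timestamp, sourcetype, host, description)"
        " VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


def query_events(conn, start, end, sourcetype=None):
    """Return the SQL text and matching rows, so the filter stays visible."""
    sql = (
        "SELECT id, timestamp, host, description FROM events"
        " WHERE timestamp >= ? AND timestamp < ?"
    )
    params = [start, end]
    if sourcetype is not None:
        sql += " AND sourcetype = ?"
        params.append(sourcetype)
    sql += " ORDER BY timestamp"
    return sql, conn.execute(sql, params).fetchall()


def tag_events(conn, event_ids, tag):
    """Attach a user-defined tag to the selected rows; repeats are ignored."""
    conn.executemany(
        "INSERT OR IGNORE INTO tags (event_id, tag) VALUES (?, ?)",
        [(event_id, tag) for event_id in event_ids],
    )
    conn.commit()
```

A caller could open `sqlite3.connect("case.db")`, import one CSV per host, run `query_events(conn, "2009-01-01", "2011-01-01", sourcetype=...)`, and pass the returned ids to `tag_events`. Returning the SQL string alongside the rows is one simple way to show the user exactly what a filter did.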
The Solution
Let me start off by asking: does anyone know what it feels like to stare at code for 5 hours (on a Saturday afternoon when it's 80 degrees and sunny out, with no bathroom/food breaks, and all of your friends are at the beach) trying to figure out why your code is broken, then to find out it's because you're missing a single curly bracket somewhere? Well, that's been my life for the last 12 months since I started my coding project. If you don't believe me -- ask my friends, oh wait I don't have any anymore - this tool has ruined my life :-)
Picture of new GUI with undockable panes for multiple monitor setups

If you have not had an opportunity to watch the recorded video (1:06:38 mark) of my SANS DFIR Summit 360 talk or review the slides, that is where I introduced the proof-of-concept tool I have been coding. Here is a short video (no sound) of the tool in action (note this is the first release, and the GUI has changed significantly since).

The tool consists of:
  • WX GUI front-end
  • Python Code
  • SQLite backend
Shout out to my high school science teacher, Mr. Wilson, who introduced me to Python. I used Python because it's cross-platform. My development and testing platform is Windows 7. At the DFIR Summit, I gave Tom Yarrish a copy of my tool and within minutes he had it running on his MacBook Pro running OS X. Pretty cool..

You can see auto-highlighting by source type and POC charting here..
I will never understand why, in the year 2012, people still prefer to type things into green and black console windows. Therefore, I used WX as a GUI front-end. Why did I use WX? Simply because it's the first thing that came up in my Google search for "Python+GUI+programming". In hindsight I wish Google told me just to quit.

I also used SQLite3 as a back-end because A.) it's lightweight - no install required, B.) you know it's fast if high-frequency traders use it, and C.) it's scalable enough to review timeline data.
Overview of current process and development phases:
 


Overview of data flow:

The parts in red are what I am working on in Phase 2.

When can you get it?
 
I currently have someone doing a code review. It will be posted VERY soon on the log2timeline-tools Google Code page: http://code.google.com/p/l2t-tools/

As I stated in my SANS 360 talk, "it will be free to corporate and private but LE has to pay for this one.. you guys need to pay me back for all those parking tickets!"-- I might also post a donation page or something.. so I can buy myself a vacation or something. 


Also, I really look forward to feedback, positive or negative, so I can improve and include your thoughts in my future employer performance discussions, so I don't wind up becoming a Walmart Greeter :-)


16 comments:

  1. Well done! Looking forward to this being released! Thanks, Dave!

  2. This sounds awesome! Can't wait to give it a try. Thanks!

  3. Dave - excellent post and thank you so much for creating GUI front end to l2t!!! I am looking forward to the release date!!

    1. thx!! very soon working on documentation..

      http://code.google.com/p/l2t-tools/wiki/l2t_Review

  4. Nice work. It's come a long way and looks great!

    1. hehe, yea my life has also went downhill, thx!!

  5. Hey David ! Very much looking forward to the release.

    1. thx! super soon just working on documentation now..

      http://code.google.com/p/l2t-tools/wiki/l2t_Review

  6. This is going to be pretty helpful, thanks.

    1. Working on documentation now here.. and will be posted here soon as well..

      http://code.google.com/p/l2t-tools/wiki/l2t_Review

  7. This tool sounds really useful, I appreciate all the hard work that must have gone into it. Is there an ETA for release (don't see anything for download in l2t-tools).

  8. email me at david.nides@gmail.com

  9. hi , this tool seems cool, how can i download it?

  10. Hi,

    i came around this blog, because of the dfir mailing list. The gui looks great and i want to test it. But it isn't released yet, right?
    I'am a newbie in forensic's but if you need beta testers, maybe i can spend some time.
