Integrating Python

Integrating Python – Book Released

Integrating Python with Leading Computer Forensic Platforms

Integrating Python with Leading Computer Forensic Platforms takes a definitive look at how and why the integration of Python advances the field of digital forensics. In addition, the book includes practical, never-before-seen Python examples that can be immediately put to use. Noted author Chet Hosmer demonstrates how to extend four key forensic platforms using Python: EnCase by Guidance Software, MPE+ by AccessData, the open source Autopsy/SleuthKit by Brian Carrier, and US-LATT, the live acquisition and triage tool from WetStone Technologies. This book is for practitioners, forensic investigators, educators, students, private investigators, and anyone else advancing digital forensics to investigate cybercrime.

Additionally, the open source availability of the examples allows for sharing and growth within the industry. This book is the first to provide details on how to directly integrate Python into key forensic platforms.

  • Integrating Python provides hands-on tools, code samples, detailed instruction, and documentation that can be immediately put to use
  • Integrating Python shows how to integrate Python with popular digital forensic platforms, including EnCase, MPE+, The Open Source Autopsy/SleuthKit, and US-LATT
  • Presents complete coverage of how to use open source Python scripts to extend and modify popular digital forensic platforms

About the Author

Chet Hosmer is the Founder of Python Forensics, Inc., a non-profit organization focused on the collaborative development of open-source investigative technologies using the Python programming language. Chet serves as a visiting professor at Utica College in the Cybersecurity Graduate program, where his research and teaching focus on advanced steganography/data hiding methods and related defenses. He is also an Adjunct Faculty member at Champlain College in the Master of Science in Digital Forensic Science program, where he is researching and working with graduate students to advance the application of Python to solve hard problems facing digital investigators.

Chet makes numerous appearances each year to discuss emerging cyber threats, including appearances on National Public Radio’s Kojo Nnamdi show, ABC’s Primetime Thursday, NHK Japan, and ABC News Australia. He is also a frequent contributor to technical and news stories relating to cyber security and forensics, and has been interviewed and quoted by IEEE, The New York Times, The Washington Post, Government Computer News, Salon.com, DFI News, and Wired Magazine.

Chet is the author of four recent Elsevier/Syngress books.

Chet delivers keynote and plenary talks on various cyber security related topics around the world each year.

Posted in Announcement | 2 Comments

AccessData User Summit

AccessData User Summit – a Great Exchange of Knowledge

ADUS Keynote

AccessData User Summit Keynote Discussion

This year’s AccessData User Summit was held in Lake Mary, Florida, April 5-7, 2016.  The summit brought together a plethora of industry experts, speakers, and participants, all with one objective in mind: “to work together to advance cybercrime investigation methods and techniques”.  The event was one of the best orchestrated, organized, and executed – before, during, and after.

The breadth of industry experts, investigators, practitioners, and educators was just outstanding.  The opportunities for interchange and networking provided a platform for deep discussions on current and future cybercrime-related topics.

The technical sessions were hands-on, with each room outfitted with laptops for all participants, allowing direct interaction with the subject matter being presented.  The rooms were filled, and questions, new ideas, and knowledge were shared by all in a very positive and inclusive format.

If you were unable to attend, ADUS has provided some key video clips from the conference.

Keynote sessions were moderated by Kevin DeLong, who provided insight and pointed questions in an interview style, engaging the keynote speakers while drawing the audience into the discussion.

 

Posted in Announcement, Discussion | 1 Comment

Chet Hosmer to Present at Enfuse 2016

Chet Hosmer, Python Forensics founder, will present at Enfuse 2016.

Enfuse 2016: May 24, 2016, 3:30 PM, Caesars Palace, Las Vegas

The Enfuse 2016 talk will demonstrate new open source Python scripts to identify hidden content in multimedia files.

Presentation Description

Data hiding is becoming a critical issue when investigating cybercrime. Suspects are becoming more sophisticated in hiding and protecting evidence. Recently, new multimedia data hiding methods have been developed that aid in hiding evidence inside multimedia files without affecting the normal playing of the audio or video content. This Enfuse lecture/demonstration will combine EnCase® and Python to discover and extract hidden content. All participants will be provided the open source Python scripts and EnScripts to include in their forensic toolkits.

Presentation Objective

Demonstration of new multimedia data hiding methods. Deep dive into the complexity of multimedia hiding methods. Detailed review of the open source Python scripts and EnScripts used to discover and extract the hidden content. Live demonstration of the combined Python scripts and EnScripts. Takeaway: those attending this Enfuse presentation will receive copies of the open source toolkit.

Why Attend Enfuse 2016?

In over 100 top-notch sessions, reap the benefits of best practices, success stories, tools, and practical solutions. Turn your biggest challenges into your greatest accomplishments when you learn from experts, leaders in the field, and fellow professionals.

WHAT IS ENFUSE™?

Enfuse is a three-day security and digital investigations conference where specialists, executives, and experts break new ground for the year ahead. It’s a global event. It’s a community. It’s where problems get solved. Attend Enfuse to take your work—and your career—to a whole new level. Learn more about the change from CEIC to Enfuse in our FAQ.


Posted in Announcement | Leave a comment

Version 1.1.1 QuickFish

QuickFish Python Script Update

Key Updates

– Supports both the Python 2.7 and 3.3 environments

– Fixed bug when processing SHA224

– Supports two hashes simultaneously, e.g. MD5 and SHA512

– Added additional log and display messages

– Added Version and Release Date Information in the Log

– Added new progress indicator option

– Added hash matching feature based on input file

Special thanks to Lance for his persistence in getting me to update the script for Python 3 and for his detailed testing that turned up several bugs that are now corrected.

What is QuickFish?

For those not familiar with QuickFish, it is a much-improved version of the pfish.py script found in my Python Forensics book.


The new script is contained in a single file and performs a walk of the file system from a starting point defined by the user via a command line option.  For each file encountered during the walk, file attributes are collected along with the hash of the file.  Hashes supported include MD5, SHA1, SHA224, SHA256, SHA384, and SHA512.  In the new version you can specify any two of the hash methods.  The results are written to a comma separated value (CSV) file that can be opened by most popular spreadsheet programs for additional sorting and analysis.

Finally, an optional match file can be provided that contains a hash value and a note in the following format, one line per entry:

732A289B7DD8C7DD28D4D73ED2480BCF,Suspicious Image
40374D33463DFE213D31CCB0E1DEDC22,Proprietary Image

Each generated hash is compared to those found in the match file.  If a match occurs, the match is indicated in the report along with the associated note.
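
For reference, a typical invocation (assuming you have renamed the downloaded file to quickfish.py; the paths and match file name here are illustrative only) might look like this:

python quickfish.py --md5 --sha512a -d /evidence -r /reports -m match.txt -v

This selects MD5 as the primary hash and SHA512 as the alternate, walks the /evidence directory, writes fsreport.csv to /reports, compares each hash against the optional match file match.txt, and displays progress messages.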

Enjoy.

 

Download

Note: the script is distributed with a .txt file extension for safety; change the extension to .py after download.

'''
Copyright (c) 2014-2015 Chet Hosmer

Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the "Software"), to deal in the Software without restriction, 
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, 
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, 
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial 
portions of the Software.

'''
# 
# Python QUICK FISH File System Hash Program
# Author: C. Hosmer
#
#
# Revised: 10/02/2015  Version 1.1.1
#                     1) Fixed bug when processing SHA224
#                     2) Added additional log and display messages when
#                        the CSV File cannot be opened.  This is typically due
#                        to the file already being open in another application
#                        such as a spreadsheet
#                     3) Added Version and Release Date to Log
#                     4) Added new progress spinner argument -s, this displays a
#                        spinning progress indicator.  Good when scanning larger
#                        numbers of files and directories.  Note this is mutually 
#                        exclusive with the -v or verbose option
#
# Revised: 09/30/2015  Version 1.1
#                      1) Modified the source to work with either Python 2.7.x
#                      Python 3.x.  Thanks to the suggestions by Lance
#                      2) Fixed bug in Hash value lookup
#
# Revised: 11/1/2014 from the original pfish program in Python Forensics Book 
#                    ISBN: 978-0124186767
#                    New ideas and expansion inspired by Michelle Mullinix and Greg Dominguez
#                    Updates Include:
#                        a) Reduced the script to a single .py file for simple execution
#                        b) Allowed selection of one or (optionally) two hash types per run
#                        c) Supported all native hash types available in Python's hashlib
#                        d) Added the optional capability to include a hashmatch input file
#                           this will add two fields to the csv output file (Match and ID) 
#                           that contains the word FOUND when a match is identified
#                           along with the ID value associated with the hash from the input file. 
#                           Note the input file format for the hashmatch is strict  
#                           HASHVALUE,ID one entry per line.  

import logging    # Python Standard Library Logger
import time       # Python Standard Library time functions
import sys        # Python Standard Library system specific parameters
import os         # Python Standard Library - Miscellaneous operating system interfaces
import stat       # Python Standard Library - constants and functions for interpreting os results
import time       # Python Standard Library - Time access and conversions functions
import hashlib    # Python Standard Library - Secure hashes and message digests
import argparse   # Python Standard Library - Parser for command-line options, arguments
import csv        # Python Standard Library - reader and writer for csv files

# Support Functions Start Here, Main Script Entry is at the bottom

#
# Name: ParseCommand() Function
#
# Desc: Process and Validate the command line arguments
#           use Python Standard Library module argparse
#
# Input: none
#  
# Actions: 
#              Uses the standard library argparse to process the command line
#              establishes a global variable gl_args where any of the functions can
#              obtain argument information
#
def ParseCommandLine():

  parser = argparse.ArgumentParser('Python file system hashing .. QuickFish')

  group = parser.add_mutually_exclusive_group()
  group.add_argument('-v', "--verbose",  help="allows progress messages to be displayed", action='store_true')
  group.add_argument('-s', "--spinner",  help="displays progress indicator", action='store_true')

  # setup a group where the selection is mutually exclusive and required.

  group = parser.add_mutually_exclusive_group(required=True)
  group.add_argument('--md5',      help = 'specifies MD5 algorithm',      action='store_true')
  group.add_argument('--sha1',     help = 'specifies SHA1 algorithm',     action='store_true')  
  group.add_argument('--sha224',   help = 'specifies SHA224 algorithm',   action='store_true')  
  group.add_argument('--sha256',   help = 'specifies SHA256 algorithm',   action='store_true')  
  group.add_argument('--sha384',   help = 'specifies SHA384 algorithm',   action='store_true')  
  group.add_argument('--sha512',   help = 'specifies SHA512 algorithm',   action='store_true')   

  group = parser.add_mutually_exclusive_group(required=False)
  group.add_argument('--md5a',      help = 'specifies MD5 algorithm',      action='store_true')
  group.add_argument('--sha1a',     help = 'specifies SHA1 algorithm',     action='store_true')  
  group.add_argument('--sha224a',   help = 'specifies SHA224 algorithm',   action='store_true')  
  group.add_argument('--sha256a',   help = 'specifies SHA256 algorithm',   action='store_true') 
  group.add_argument('--sha384a',   help = 'specifies SHA384 algorithm',   action='store_true')  
  group.add_argument('--sha512a',   help = 'specifies SHA512 algorithm',   action='store_true')   

  parser.add_argument('-d', '--rootPath',   type= ValidateDirectory,         required=True, help="specify the root path for hashing")
  parser.add_argument('-r', '--reportPath', type= ValidateDirectoryWritable, required=True, help="specify the path where reports and logs will be written")   
  parser.add_argument('-m', '--hashMatch',  type= ValidateFileReadable,      required=False,help="specify the optional hashmatch input file path")   

  # create a global object to hold the validated arguments; these will then be available to all functions

  global gl_args
  global gl_hashType
  global gl_hashTypeAlt
  global gl_hashMatch
  global gl_hashDict
  global gl_verbose
  global gl_spinner

  gl_args = parser.parse_args()   

  if gl_args.verbose:
    gl_verbose = True
  else:
    gl_verbose = False

  if gl_args.spinner:
    gl_spinner= True
  else:
    gl_spinner = False

  # Determine the hash type(s) selected 

  # Mandatory

  if gl_args.md5:
    gl_hashType = 'MD5'

  elif gl_args.sha1:
    gl_hashType = 'SHA1'     

  elif gl_args.sha224:
    gl_hashType = 'SHA224'        

  elif gl_args.sha256:
    gl_hashType = 'SHA256'

  elif gl_args.sha384:
    gl_hashType = 'SHA384'    

  elif gl_args.sha512:
    gl_hashType = 'SHA512'

  else:
    gl_hashType = "Unknown"
    logging.error('Unknown Hash Type Specified')

  # Optional Type

  if gl_args.md5a:
    gl_hashTypeAlt = 'MD5'

  elif gl_args.sha1a:
    gl_hashTypeAlt = 'SHA1'     

  elif gl_args.sha224a:
    gl_hashTypeAlt = 'SHA224'        

  elif gl_args.sha256a:
    gl_hashTypeAlt = 'SHA256'

  elif gl_args.sha384a:
    gl_hashTypeAlt = 'SHA384'    

  elif gl_args.sha512a:
    gl_hashTypeAlt = 'SHA512'
  else:
    gl_hashTypeAlt = 'None'

  # Check for hashMatch Selection    
  if gl_args.hashMatch:
    # Create a dictionary from the input file
    gl_hashMatch = gl_args.hashMatch
    gl_hashDict = {}

    try:
      with open(gl_hashMatch) as fp:
        # for each line in the file extract the hash and id
        # then store the result in a dictionary
        # key, value pair
        # in this case the hash is the key and id is the value

        for line in fp:
          hashKey = line.split(',')[0].upper()
          hashID  = line.split(',')[1]
          # Strip the newline from the ID
          hashID  = hashID.strip()
          # Add the key value pair to the dictionary
          gl_hashDict[hashKey] = hashID

    except:
      logging.error("Failed to read in Hash List")
      DisplayMessage("Failed to read in Hash List")
  else:
    gl_hashMatch = False

  DisplayMessage("Command line processed: Successfully")

  return

# End ParseCommandLine============================================================      

#
# Name: ValidateDirectory Function
#
# Desc: Function that will validate a directory path as 
#           existing and readable.  Used for argument validation only
#
# Input: a directory path string
#  
# Actions: 
#              if valid will return the Directory String
#
#              if invalid it will raise an ArgumentTypeError within argparse
#              which will in turn be reported by argparse to the user
#

def ValidateDirectory(theDir):

  # Validate the path is a directory
  if not os.path.isdir(theDir):
    raise argparse.ArgumentTypeError('Directory does not exist')

  # Validate the path is readable
  if os.access(theDir, os.R_OK):
    return theDir
  else:
    raise argparse.ArgumentTypeError('Directory is not readable')

#End ValidateDirectory ===================================

#
# Name: ValidateDirectoryWritable Function
#
# Desc: Function that will validate a directory path as 
#           existing and writable.  Used for argument validation only
#
# Input: a directory path string
#  
# Actions: 
#              if valid will return the Directory String
#
#              if invalid it will raise an ArgumentTypeError within argparse
#              which will in turn be reported by argparse to the user
#

def ValidateDirectoryWritable(theDir):

  # Validate the path is a directory
  if not os.path.isdir(theDir):
    raise argparse.ArgumentTypeError('Directory does not exist')

  # Validate the path is writable
  if os.access(theDir, os.W_OK):
    return theDir
  else:
    raise argparse.ArgumentTypeError('Directory is not writable')

#End ValidateDirectoryWritable ===================================

#
# Name: ValidateFileReadable Function
#
# Desc: Function that will validate a file path as 
#       existing and readable.  Used for argument validation only
#
# Input: a file path
#  
# Actions: 
#              if valid will return the FilePath
#
#              if invalid it will raise an ArgumentTypeError within argparse
#              which will in turn be reported by argparse to the user
#

def ValidateFileReadable(theFile):

  # Validate the path is a file
  if not os.path.isfile(theFile):
    raise argparse.ArgumentTypeError('File does not exist')

  # Validate the path is readable
  if os.access(theFile, os.R_OK):
    return theFile
  else:
    raise argparse.ArgumentTypeError('File is not readable')

#End ValidateFileReadable ===================================

class Spinner:

  # Constructor

  def __init__(self):

    self.symbols = [' |', ' /', ' -', ' \\', ' |', ' \\', ' -', 'END'] 
    self.curSymbol = 0

    sys.stdout.write("\b\b\b%s " % self.symbols[self.curSymbol])
    sys.stdout.flush()

  def Spin(self):

    if self.symbols[self.curSymbol] == 'END':
      self.curSymbol = 0

    sys.stdout.write("\b\b\b%s " % self.symbols[self.curSymbol])        
    sys.stdout.flush()
    self.curSymbol += 1

# End Spinner Class

#
# Name: WalkPath() Function
#
# Desc: Walk the path specified on the command line
#           use Python Standard Library module os and sys
#
# Input: none, uses command line arguments
#  
# Actions: 
#              Uses the standard library modules os and sys
#              to traverse the directory structure starting at the root
#              path specified by the user.  For each file discovered, WalkPath
#              will call the Function HashFile() to perform the file hashing
#

def WalkPath():

  processCount = 0
  errorCount = 0

  # Create a proper report path 
  reportPath = os.path.join(gl_args.reportPath, "fsreport.csv")
  oCVS = _CSVWriter(reportPath, gl_hashType, gl_hashTypeAlt)

  # Create a loop that process all the files starting
  # at the rootPath, all sub-directories will also be
  # processed

  # Create a proper root path
  if gl_args.rootPath.endswith('\\') or gl_args.rootPath.endswith('/'):
    rootPath = gl_args.rootPath
  else:
    rootPath = gl_args.rootPath+'/'

  logging.info('Start Scan Path: ' + rootPath)

  if gl_args.spinner:
    # Create a Spinner Object for displaying progress
    obSPIN = Spinner()      

  for root, dirs, files in os.walk(rootPath):

    if gl_spinner:
      # Update progress indicator
      obSPIN.Spin()   

    # for each file obtain the filename and call the HashFile Function
    for file in files:
      fname = os.path.join(root, file)
      result = HashFile(fname, file, oCVS)

      # if hashing was successful then increment the ProcessCount
      if result is True:
        processCount += 1
      # if not successful, then increment the errorCount
      else:
        errorCount += 1       

  oCVS.writerClose()

  return(processCount)

#End WalkPath==================================================

#
# Name: HashFile Function
#
# Desc: Processes a single file which includes performing a hash of the file
#           and the extraction of metadata regarding the file processed
#           use Python Standard Library modules hashlib, os, and sys
#
# Input: theFile = the full path of the file
#           simpleName = just the filename itself
#  
# Actions: 
#              Attempts to hash the file and extract metadata
#              Call GenerateReport for successful hashed files
#
def HashFile(theFile, simpleName, o_result):

  # Verify that the path is valid
  if os.path.exists(theFile):

    #Verify that the path is not a symbolic link
    if not os.path.islink(theFile):

      #Verify that the file is real
      if os.path.isfile(theFile):

        try:
          #Attempt to open the file
          f = open(theFile, 'rb')
        except IOError:
          #if open fails report the error
          logging.warning('Open Failed: ' + theFile)
          return
        else:
          try:
            # Get the Basic File Attributes
            # Before attempting to open the file
            # This should preserve the access time on most OS's

            theFileStats =  os.stat(theFile)
            (mode, ino, dev, nlink, uid, gid, size, atime, mtime, ctime) = os.stat(theFile)

            # Attempt to read the file
            rd = f.read()

          except IOError:
            # if read fails, then close the file and report error
            f.close()
            logging.warning('File Access Error: ' + theFile)
            return
          else:

            #Print the simple file name
            DisplayMessage("Processing File: " + theFile)
            logging.info("Processing File: " + theFile)

            # Get the size of the file in Bytes
            fileSize = str(size)

            #Get MAC Times
            modifiedTime = time.ctime(mtime)
            accessTime   = time.ctime(atime)
            createdTime  = time.ctime(ctime)

            ownerID  = str(uid)
            groupID  = str(gid)
            fileMode = bin(mode)

            #process the file hashes

            if gl_args.md5:
              #Calculate and Print the MD5
              hash = hashlib.md5()
              hash.update(rd)
              hexMD5 = hash.hexdigest()
              hashValue = hexMD5.upper()
            elif gl_args.sha1:
              hash = hashlib.sha1()
              hash.update(rd)
              hexSHA1 = hash.hexdigest()
              hashValue = hexSHA1.upper()
            elif gl_args.sha224:
              hash = hashlib.sha224()
              hash.update(rd)
              hexSHA224 = hash.hexdigest()
              hashValue = hexSHA224.upper()                         
            elif gl_args.sha256:
              hash = hashlib.sha256()
              hash.update(rd)
              hexSHA256 = hash.hexdigest()
              hashValue = hexSHA256.upper()
            elif gl_args.sha384:
              hash = hashlib.sha384()
              hash.update(rd)
              hexSHA384 = hash.hexdigest()
              hashValue = hexSHA384.upper()
            elif gl_args.sha512:
              #Calculate and Print the SHA512
              hash=hashlib.sha512()
              hash.update(rd)
              hexSHA512 = hash.hexdigest()
              hashValue = hexSHA512.upper()
            else:
              logging.error('Hash not Selected')

            if gl_args.md5a:
              #Calculate and Print the MD5 alternate
              hash = hashlib.md5()
              hash.update(rd)
              hexMD5 = hash.hexdigest()
              hashValueAlt = hexMD5.upper()
            elif gl_args.sha1a:
              hash = hashlib.sha1()
              hash.update(rd)
              hexSHA1 = hash.hexdigest()
              hashValueAlt = hexSHA1.upper()
            elif gl_args.sha224a:
              hash = hashlib.sha224()
              hash.update(rd)
              hexSHA224 = hash.hexdigest()
              hashValueAlt = hexSHA224.upper()
            elif gl_args.sha256a:
              hash = hashlib.sha256()
              hash.update(rd)
              hexSHA256 = hash.hexdigest()
              hashValueAlt = hexSHA256.upper()
            elif gl_args.sha384a:
              hash = hashlib.sha384()
              hash.update(rd)
              hexSHA384 = hash.hexdigest()
              hashValueAlt = hexSHA384.upper()
            elif gl_args.sha512a:
              hash = hashlib.sha512()
              hash.update(rd)
              hexSHA512 = hash.hexdigest()
              hashValueAlt = hexSHA512.upper()
            else:
              hashValueAlt = "Not Selected"

            # Check if hash matching was selected
            if gl_hashMatch:
              # If yes then check to see if we have a match
              # and if we do save the result
              if hashValue in gl_hashDict: 
                DisplayMessage("Hash Match")
                foundValue = "Found"
                foundID = gl_hashDict[hashValue]
              elif hashValueAlt in gl_hashDict:
                DisplayMessage("Hash Match")
                foundValue = "Found"
                foundID = gl_hashDict[hashValueAlt]       
              else:
                foundValue = ""
                foundID    = ""
            else:
              # Matching not set
              foundValue = ""
              foundID    = ""                            

            # write one row to the output file

            resultList = [simpleName, foundValue, foundID, theFile, fileSize, modifiedTime, accessTime, createdTime, hashValue, hashValueAlt, ownerID, groupID, str(mode)]     
            o_result.writeCSVRow(resultList)

            DisplayMessage("================================")
            return True
      else:
        logging.warning('[' + repr(simpleName) + ', Skipped NOT a File' + ']')
        return False
    else:
      logging.warning('[' + repr(simpleName) + ', Skipped Link NOT a File' + ']')
      return False
  else:
    logging.warning('[' + repr(simpleName) + ', Path does NOT exist' + ']')        
  return False

# End HashFile Function ===================================

#==================================================

#
# Name: DisplayMessage() Function
#
# Desc: Displays the message if the verbose command line option is present
#
# Input: message type string
#  
# Actions: 
#              Uses the standard library print function to display the message
#
def  DisplayMessage(msg):

  if gl_verbose:
    print(msg)

  return   

#End DisplayMessage=====================================

# 
# Class: _CSVWriter 
#
# Desc: Handles all methods related to comma separated value operations
#
# Methods  constructor:     Initializes the CSV File
#                writeCSVRow:   Writes a single row to the csv file
#                writerClose:      Closes the CSV File

class _CSVWriter:

  def __init__(self, fileName, hashType, hashTypeAlt):
    try:
      # create a writer object and then write the header row
      if (sys.version_info > (3, 0)):
        self.csvFile = open(fileName, 'w',newline="\r\n")
      else:
        self.csvFile = open(fileName, 'w')

      tempList = ['File', 'Match', 'ID', 'Path', 'Size', 'Modified Time', 'Access Time', 'Created Time', hashType, hashTypeAlt, 'Owner', 'Group', 'Mode']
      outStr = ",".join(tempList)
      self.csvFile.write(outStr)
      self.csvFile.write("\n")
    except:
      logging.error('CSV File Open Failure')
      DisplayMessage("Error Opening CSV File")
      DisplayMessage("Make sure CSV File Location is Writable and Ensure the file is not open")
      quit()

  def writeCSVRow(self, outList):
    outStr = ",".join(outList)
    self.csvFile.write(outStr)
    self.csvFile.write("\n")

  def writerClose(self):
    self.csvFile.close()

# ------------ MAIN SCRIPT STARTS HERE -----------------

if __name__ == '__main__':

  QFISH_VERSION = '1.1.1'
  ReleaseDate   = "October 2, 2015"

  # Turn on Logging
  logging.basicConfig(filename='QUICKFISH.log',level=logging.DEBUG,format='%(asctime)s %(message)s')

  # Process the Command Line Arguments
  ParseCommandLine()

  # Record the Starting Time
  startTime = time.time()

  # Record the Welcome Message
  logging.info('')
  logging.info('Welcome to QUICKFISH')
  logging.info('Version: ' + QFISH_VERSION)
  logging.info('Release Date: '+ ReleaseDate)
  logging.info('\nStart Scan\n')
  logging.info('')
  DisplayMessage('Welcome to QUICKFISH Version: '+ QFISH_VERSION + ' Release Date: ' + ReleaseDate + '\n')

  # Record some information regarding the system
  logging.info('System:  '+ sys.platform)
  logging.info('Version: '+ sys.version)

  # Traverse the file system directories and hash the files
  filesProcessed = WalkPath()

  # Record the end time and calculate the duration
  endTime = time.time()
  duration = endTime - startTime

  logging.info('Files Processed: ' + str(filesProcessed) )
  logging.info('Elapsed Time: ' + str(duration) + ' seconds')
  logging.info('')
  logging.info('Program Terminated Normally')
  logging.info('')

  DisplayMessage('Files Processed: ' + str(filesProcessed) )
  DisplayMessage('Elapsed Time: ' + str(duration) + ' seconds')
  DisplayMessage('')
  DisplayMessage("Program End")

Posted in Announcement, Example, Source Code | Leave a comment

EnCase and Python

EnCase & Python–Extending Your Investigative Capabilities

Digital investigations are constantly changing as new technologies are utilized to create, store, or transfer vital data.  Augmenting existing forensic platforms with innovative methods of acquiring, processing, reasoning about, and providing actionable evidence is vital. Integrating open-source Python scripts with leading-edge forensic platforms like EnCase® provides great versatility and can speed new investigative methods and processing algorithms to address these emerging technologies.

Chet Hosmer, James Habben and Robert Bond provide new insights into this process.

Access the Webinar Replay

 

Posted in Announcement, Source Code | Leave a comment

en2PY EnScript / Python Blog File

en2py – Heuristic Indexing with Python …. Download

Source files related to the James Habben and Chet Hosmer blog post:

Digital Forensics Today – EnScript and Python: Exporting Many Files for Heuristic Processing – Part 1

The following download file contains the pyIndex.py source code along with the required matrix.txt file.

Download - Python Source Code

Requirements:

1) Python 2.7.x is installed

2) Install the Python 3rd party package stop-words from the command line (a short usage illustration of the package follows these steps):

….. pip install stop-words

3) Create the folder c:\python27\EnCase\Index\

4) Unzip the files pyIndex.py and matrix.txt into this folder

5) Then, from within EnCase, execute James Habben’s EnScript with the selected files you would like to heuristically index
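
If you are curious what the stop-words package provides, here is a small, hypothetical illustration (it is not taken from pyIndex.py) of using its get_stop_words call to filter common English words before indexing:

from stop_words import get_stop_words

# obtain the English stop word list provided by the package
stopWords = get_stop_words('en')

# keep only the words that are not in the stop word list
words = "the suspect met the buyer at the harbor".split()
keepers = [word for word in words if word not in stopWords]
print(keepers)     # e.g. ['suspect', 'met', 'buyer', 'harbor']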

Enjoy!

 

Posted in Announcement, Example, Source Code | Leave a comment

CEIC 2015 Python Heuristic Reasoning and EnCase

 CEIC 2015

Thanks to everyone who attended Heuristic Reasoning with Python and EnCase at CEIC 2015 with Chet Hosmer

The presentation covered the integration of Python scripts with EnCase, utilizing EnScripts as the conduit, and included scripts that:

  • Extracted proper names and their frequency from documents within an EnCase Case
  • Indexed probable words contained in binary and/or text documents
  • Mapped image Geo-locations along with the spatial distances between image locations
  • Demonstrated the use of Natural Language Processing to identify key evidence

As promised, here are the Python and EnScript open source files utilized during the presentation.  Enjoy.

Download Session Source Code


Posted in Announcement, Example, Source Code | Leave a comment

CEIC 2015 – May 20, 2015, 2:00-3:30 PM

CEIC 2015

Heuristic Reasoning with Python and EnCase

Speaker: Chet Hosmer, Python Forensics, Inc.

Session Description: CEIC 2015

Applying scripting languages to the art and science of digital investigations is certainly not new.  However, with the advancements in EnScript and EnCase App Central, paired with the rapidly growing interest in Python, 1+1 may very well equal 11. This lab-demo will demonstrate the integration of Python with EnCase and provide open source templates. More specifically, it will demonstrate how to apply natural language understanding and heuristic reasoning using Python, based on evidence directly collected and processed by EnCase.

Session Objective:

– Share/Demonstrate EnCase Python Integration Methods
– Define the value of using heuristics and natural language
– Demonstrate Python Language Heuristic and Natural Language methods
– Provide full open source for these new methods

Presenter:

Chet Hosmer is the Founder of Python Forensics, Inc., a non-profit organization focused on the collaborative development of open-source investigative technologies using the Python programming language.   Chet has made numerous appearances to discuss emerging cyber threats, including National Public Radio’s Kojo Nnamdi show, ABC’s Primetime Thursday, NHK Japan, TechTV’s CyberCrime, and ABC News Australia. He has also been a frequent contributor to technical and news stories relating to cyber security and forensics and has been interviewed and quoted by IEEE, The New York Times, The Washington Post, Government Computer News, Salon.com, and Wired Magazine.

Chet is the author of three recent Elsevier/Syngress books: Python Passive Network Mapping (ISBN-13: 978-0128027219), Python Forensics (ISBN-13: 978-0124186767), and Data Hiding, co-authored with Mike Raggo (ISBN-13: 978-1597497435).

Posted in Announcement | 2 Comments

HTCIA International Conference 2015

Join us at HTCIA International Conference 2015 in Orlando, Florida:

HTCIA International Conference 2015

Chet Hosmer will be presenting “Discovering Hidden Content in Multimedia Files ….. Using Python”

Tuesday September 1, 2015 at 8:00 AM

Chet will present new open source methods for discovering hidden content in multimedia files such as .MP4, .MP3, and .AVI.  The open source Python source code will be provided to all participants in the lecture/lab session.

Chet Hosmer is the Founder of Python Forensics, Inc., a non-profit organization focused on the collaborative development of open-source investigative technologies using the Python programming language.   Chet is also the founder of WetStone Technologies, Inc. and has been researching and developing technology and training surrounding forensics, digital investigation, and steganography for over two decades. He has made numerous appearances to discuss emerging cyber threats, including National Public Radio’s Kojo Nnamdi show, ABC’s Primetime Thursday, NHK Japan, TechTV’s CyberCrime, and ABC News Australia. He has also been a frequent contributor to technical and news stories relating to cyber security and forensics and has been interviewed and quoted by IEEE, The New York Times, The Washington Post, Government Computer News, Salon.com, and Wired Magazine.

Chet is the author of three recent Elsevier/Syngress books: Python Passive Network Mapping (ISBN-13: 978-0128027219), Python Forensics (ISBN-13: 978-0124186767), and Data Hiding, co-authored with Mike Raggo (ISBN-13: 978-1597497435).

Chet serves as a visiting professor at Utica College where he teaches in the Cybersecurity Graduate program. He is also an Adjunct Faculty member at Champlain College in the Masters of Science in Digital Forensic Science Program. Chet delivers keynote and plenary talks on various cyber security related topics around the world each year.

Posted in Announcement | Leave a comment

PFIC 2014

PFIC 2014: Thanks to everyone who attended the Advanced Python Forensics lab sessions at PFIC 2014 in Snowbird, Utah, November 12-14, 2014.

PFIC 2014 Chet Hosmer Python Presentation

 

 

 

 

 

The following Zip file contains the Python Source Code along with the presentation at PFIC-2014 in Snowbird, Utah.

Enjoy!

Posted in Example, Source Code | Leave a comment

Python Single Word / Proper Name Extraction

by Chet Hosmer

When examining ASCII text data during a forensic investigation, it is often useful to extract proper names and then rank them by the highest number of occurrences.  The Python language has built-in capabilities that will perform this extraction swiftly and easily.  To demonstrate, I created a single Python script that does just that.  You can run the script on your favorite platform (Windows, Linux, or Mac) using Python 2.7.x.

So first, what is a proper name and how does that differ from a proper noun?  Also, why is this useful during cybercrime investigations?  Current linguistics differentiates these by classifying the noun.  For example, common nouns such as the single words (mountain or river) are valid nouns, but are not actually very useful to the forensic examiner.  On the other hand, single-word proper names such as (Everest, Mississippi) are more interesting.  Even more thought-provoking are proper names such as (Robert, Jonathan, Kevin, Austin, Texas, Bourbon Street, the Pentagon, or the Chrysler Building).  In normal texts these proper names are likely capitalized and quite easy to strip, identify, count, and sort.

So how is this done?

The code excerpt found below is a function that I created to extract possible proper names from an input string. The function (ignoring comments) is just over 20 lines of code.

For a more in-depth look into searching, indexing and natural language examination using Python, check out Chapter 4 and Chapter 7 of my current book, Python Forensics.

####################
# Function
# Name: ExtractProperNames
# Purpose: Extract possible proper names from the passed string
# Input: string
# Return: Dictionary of possible Proper Names along with the number
#         of occurrences as a key, value pair
# Usage: theDictionary = ExtractProperNames('John is from Alaska')
####################

def ExtractProperNames(theString):

    # Prepare the string (strip formatting and special characters)
    # You can extend the set of allowed characters by adding to the string
    # Note this example assumes ASCII characters not unicode

    allowedCharacters ="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

    finalString = ''

    # Notice that you can write Python like English if you choose your 
    #    words carefully

    # Process each character in the theString passed to the function
    for eachCharacter in theString:
        # Check to see if the character is in the allowedCharacter string
        if eachCharacter in allowedCharacters:
            # Yes, then add the character to the finalString
            finalString = finalString + eachCharacter
        else:
            # otherwise replace the not allowed character 
            #    with a space
            finalString = finalString + ' '

    # Now that we only have allowed characters or spaces in finalString
    #     we can use the built in Python string.split() method
    # This one line will create a list of words contained in the finalString

    wordList = finalString.split()

    # Now, let's determine which words are possible proper names
    #     and create a list of them.

    # We start by declaring an empty list

    properNameList = []

    # For this example we will assume words are possible proper names
    #    if they are in title case and they meet certain length requirements
    # We will use a Min Length of 4 and a Max Length of 20  

    # To do this, we loop through each word in the word list
    #    and if the word is in title case and the word meets
    #    our minimum/maximum size limits we add the word to the properNameList
    # We utilize the Python built in string method string.istitle() 

    for eachWord in wordList:

        if eachWord.istitle() and len(eachWord) >= 4 and len(eachWord) <= 20:
            # if the word meets the specified conditions we add it
            #    to the properNamesList
            properNameList.append(eachWord)
        else:
            # otherwise we loop to the next word
            continue

    # Note this list will likely contain duplicates to deal with this
    #    and to determine the number of times a proper name is used
    #    we will create a Python Dictionary

    # The Dictionary will contain a key, value pair.
    # The key will be the proper name and value is the number of occurrences
    #     found in the text

    # Create an empty dictionary
    properNamesDictionary = {}

    # Next we loop through the properNamesList
    for eachName in properNameList:

        # if the name is already in the dictionary
        # the name has been processed so continue

        if eachName in properNamesDictionary:
            continue
        else:
            # otherwise we count the number of occurrences. 
            # We do this by using the List method list.count()
            cnt = properNameList.count(eachName)
            # then add the new entry to the dictionary
            # key   = eachName
            # value = count
            properNamesDictionary[eachName] = cnt

    # the function returns the created properNamesDictionary

    return properNamesDictionary

# End ExtractProperNames()

The complete program listing below utilizes this function to process a file and print out, in sorted order (highest occurrences first), each possible proper name found.  After the complete program listing, I apply the program to a well-known book.  The first 5 people who can determine the name of the book and its author by examining the extracted proper names will receive a free Python Forensic stylus pen.  Send your guess for the name of the text and author to cdh@python-forensics.org.

Full Program Listing

 

'''
Copyright (c) 2014 Chet Hosmer

Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the "Software"), to deal in the Software without restriction, 
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, 
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, 
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial 
portions of the Software.

usage: python ExtractProperNames.py -i [full pathname of text file]

'''
# import the Operating System Module this handles file system I/O operations and definitions
import os

# import the Standard Library Module of handling program arguments
import argparse

# PSEUDO CONSTANTS
MIN_WORD_SIZE = 4      # Minimum length of a proper name
MAX_WORD_SIZE = 20     # Maximum length of a proper name

# Name: ParseCommand() Function
#
# Desc: Process and Validate the command line arguments
#           use Python Standard Library module argparse
#
# Input: none
#  
# Actions: 
#              Uses the standard library argparse to process the command line
#
# For this program we expect one argument
# -i which defines the full path and file name of the input text file
#
# returns theArguments

def ParseCommandLine():

    parser = argparse.ArgumentParser('Proper Names Extractor')
    parser.add_argument('-i', '--inputFile', type= ValidateFileRead,  required=True, help="Input filename to extract from")

    theArgs = parser.parse_args()           

    return theArgs

# End ParseCommandLine()

#
# Name: ValidateFileRead Function
#
# Desc: Function that will validate that a file exists and is readable
#
# Input: A file name with full path
#  
# Actions: 
#              if valid will return path
#
#              if invalid it will raise an ArgumentTypeError within argparse
#              which will in turn be reported by argparse to the user
#

def ValidateFileRead(theFile):

    # Validate the path is valid
    if not os.path.exists(theFile):
        raise argparse.ArgumentTypeError('File does not exist')

    # Validate the path is readable
    if os.access(theFile, os.R_OK):
        return theFile
    else:
        raise argparse.ArgumentTypeError('File is not readable')

# End ValidateFileRead()

####################
# Function
# Name: ExtractProperNames
# Purpose: Extract possible proper names from the passed string
# Input: string
# Return: Dictionary of possible Proper Names along with the number
#         of occurrences as a key, value pair
# Usage: theDictionary = ExtractProperNames('John is from Alaska')
####################

def ExtractProperNames(theString):

    # Prepare the string (strip formatting and special characters)
    # You can extend the set of allowed characters by adding to the string
    # Note this example assumes ASCII characters not unicode

    allowedCharacters ="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

    finalString = ''

    # Notice that you can write Python like English if you choose your 
    #    words carefully

    # Process each character in the theString passed to the function
    for eachCharacter in theString:
        # Check to see if the character is in the allowedCharacter string
        if eachCharacter in allowedCharacters:
            # Yes, then add the character to the finalString
            finalString = finalString + eachCharacter
        else:
            # otherwise replace the not allowed character 
            #    with a space
            finalString = finalString + ' '

    # Now that we only have allowed characters or spaces in finalString
    #     we can use the built in Python string.split() method
    # This one line will create a list of words contained in the finalString

    wordList = finalString.split()

    # Now, let's determine which words are possible proper names
    #     and create a list of them.

    # We start by declaring an empty list

    properNameList = []

    # For this example we will assume words are possible proper names
    #    if they are in title case and they meet certain length requirements
    # We will use a Min Length of 4 and a Max Length of 20  

    # To do this, we loop through each word in the word list
    #    and if the word is in title case and the word meets
    #    our minimum/maximum size limits we add the word to the properNameList
    # We utilize the Python built in string method string.istitle() 

    for eachWord in wordList:

        if eachWord.istitle() and len(eachWord) >= 4 and len(eachWord) <= 20:
            # if the word meets the specified conditions we add it
            #    to the properNamesList
            properNameList.append(eachWord)
        else:
            # otherwise we loop to the next word
            continue

    # Note this list will likely contain duplicates to deal with this
    #    and to determine the number of times a proper name is used
    #    we will create a Python Dictionary

    # The Dictionary will contain a key, value pair.
    # The key will be the proper name and value is the number of occurrences
    #     found in the text

    # Create an empty dictionary
    properNamesDictionary = {}

    # Next we loop through the properNamesList
    for eachName in properNameList:

        # if the name is already in the dictionary
        # the name has been processed so continue

        if eachName in properNamesDictionary:
            continue
        else:
            # otherwise we count the number of occurrences. 
            # We do this by using the List method list.count()
            cnt = properNameList.count(eachName)
            # then add the new entry to the dictionary
            # key   = eachName
            # value = count
            properNamesDictionary[eachName] = cnt

    # the function returns the created properNamesDictionary

    return properNamesDictionary

# End Extract Proper Names Function

#######################
# Main program for Extract Proper Names
# 
# Input: 
#       inputFile:   full path and filename of the input text file
######################

def main(inputFile):

    try:
        # Note this method assumes the file can be completely
        # read into memory (for very large files a buffered approach
        # will be necessary)

        # Attempt to Open the File, then read the contents, and then close
        fp = open(inputFile, 'rb')
        fileContents = fp.read()
        fp.close()
    except:
        print 'File Handling Exception Reported'
        exit(0)

    # Call the ExtractProperNames function which returns
    #     a Python dictionary of possible proper names along with
    #     the number of occurrences of that name

    properNamesDictionary = ExtractProperNames(fileContents)

    # Now let's print out the dictionary sorted by value
    #     the value is the number of occurrences of the proper name

    # This approach will print out the possible proper names with
    #     the highest occurrence first

    for eachName in sorted(properNamesDictionary, key=properNamesDictionary.get, reverse=True):
        print eachName, properNamesDictionary[eachName],
        print " : ",

# End Main Function

#=================================================================
# Entry Point
#=================================================================

# Processes the user supplied arguments
#     and if successful calls the main function
#     with the appropriate argument

if __name__ == "__main__":

    args = ParseCommandLine()

    # Call main passing the user defined argument
    #    in this case the input file name

    main(args.inputFile)

Program Execution and Output

C:\TESTPROPER>python ExtractProperNames.py -i sample.txt

Ahab 501  :  Whale 285  :  Stubb 255  :  Queequeg 252  :  Captain 215  :  Starbuck 196  :  What 175  :  Pequod 172  :  There 149  :  Sperm 135  :  This 113  :  Flask 104  :  White 89  :  Dick 86  :  Moby 86  :  That 85  :  Nantucket 85  :  Jonah 84  :  Project 84  :  Gutenberg 84  :  They 79  :  Bildad 76  :  Peleg 74  :  Leviathan 64  :  With 59  :  Then 57  :  Indian 57  :  Well 56  :  Tashtego 54  :  When 52  :  Right 52  :  Look 51  :  Here 49  :  Though 49  :  English 47  :  Lord 44  :  Steelkilt 40  :  Thou 39  :  Some 39  :  Cape 38  :  King 37  :  Greenland 35  :  Such 34  :  American 34  :  Daggoo 34  :  Fish 33  :  Pacific 32  :  While 31  :  Besides 30  :  Where 29  :  Come 29  :  Whales 29  :  Parsee 28  :  Fedallah 27  :  Nevertheless 27  :  From 26  :  Upon 26  :  Thus 26  :  First 24  :  Dutch 24  :  Lakeman 24  :  Stand 24  :  Foundation 24  :  Good 23  :  Like 23  :  Nantucketer 22  :  Radney 22  :  These 21  :  South 21  :  Gabriel 20  :  England 20  :  Atlantic 19  :  Take 19  :  Ishmael 19  :  Because 19  :  Meantime 19  :  Horn 18  :  Christian 18  :  Perth 18  :  Death 18  :  French 18  :  Bedford 18  :  Yojo 17  :  Soon 17  :  After 17  :  Hussey 16  :  Dough 16  :  Fast 16  :  Line 16  :  Meanwhile 16  :  However 16  :  German 15  :  Give 15  :  Only 15  :  Elijah 15  :  Town 15  :  Roman 15  :  Whether 15  :  States 15  :  Queen 14  :  Loose 14  :  Coffin 14  :  Still 13  :  Bunger 13  :  True 13  :  Archive 13  :  Post 13  :  London 13  :  Avast 13  :  Cook 13  :  Nothing 13  :  Literary 13  :  Great 13  :  Will 12  :  Manxman 12  :  Down 12  :  United 12  :  Derick 12

… shortened for brevity

Crammer 1  :  Macy 1  :  Canaris 1  :  Floundered 1  :  Dusk 1  :  Reckon 1  :  Spell 1  :  Mahomet 1  :  Societies 1  :  Petrified 1  :  Cooks 1  :  Science 1  :  Mesopotamian 1  :  Refund 1  :  Sicilian 1  :  Gold 1  :  Philologically 1  :  Alive 1  :  Britain 1  :  Borean 1  :  Scorpio 1  :  Archipelagoes 1  :  Executive 1  :  Passed 1  :  Berkshire 1  :  Cattegat 1  :  Monstrous 1  :  Shiver 1  :  Dunkirk 1  :  Kick 1  :  Fata 1  :  Turning 1  :  Manhattoes 1  :  Whitehall 1  :  Hollanders 1  :  Mazeppa 1  :  Buckets 1  :  Hygiene 1  :  Five 1  :  Know 1  :  Improving 1  :  Jimmini 1  :  February 1  :  Vitus 1  :  Pillar 1  :  Morquan 1  :  Future 1  :  Lose 1  :  Baltimore 1  :  Dover 1  :  Ehrenbreitstein 1  :  Dericks 1  :  Icebergs 1  :  Sleep 1  :  Actium 1  :  Shooting 1  :

_______________________________________________

Chet Hosmer is an author, educator and researcher.  Chet is a co-founder of WetStone Technologies, Inc., a Visiting Professor at Utica College in the Cybersecurity graduate program, and an Adjunct Professor at Champlain College where he teaches in the Digital Forensics Graduate program.  He resides with his two-legged and four-legged family near Myrtle Beach, South Carolina.  He is the author of the popular Syngress titles Python Forensics and Data Hiding, with more to come!


Posted in Discussion, Example, Source Code | Leave a comment

Make Python Your “First” Language – for investigating cybercrime!

By Chet Hosmer

Many digital investigators, students, academics, examiners and researchers are frustrated by the current set of forensic tools available.  Don’t get me wrong, many of the toolkits are quite capable, but they also can be complex, expensive and come with a steep learning curve.  Furthermore, when the need arises to address new issues, handle special cases or to directly impact performance by unleashing multiple processing cores toward a specific problem, your control may be limited.  In addition, you may want to develop a deeper understanding of how digital evidence is acquired, examined and analyzed and add some of your own twists to the art of cybercrime investigation.

Enter Python Forensics

The Python programming language is an environment that can be learned and applied by “anyone”.  You simply need a computer (PC, Mac, Linux, iOS, Android, Raspberry Pi, an old MicroVAX you have laying around, or yes, even a Windows phone).  In addition, the open source nature has connected developers and researchers across the globe, spurring them on to innovate modules and libraries that address many challenges including, but certainly not limited to: space flight, weather prediction, financial modeling, movie production, and now digital investigations.  Python is used today by prominent organizations like Google, Disney, Dropbox, Industrial Light and Magic, the National Weather Service, NASA, IBM, and many others.

The language has built-in capabilities that directly relate to digital investigation.  For example the code below will perform a SHA 256 hash of a string – in three lines of code no less! This is one of the fundamental practices performed in digital investigation to protect the integrity of evidence and to perform searches for specific known files.

>>> import hashlib
>>> sha256 = hashlib.sha256()
>>> sha256.update("some data I would like hashed")
>>> print "SHA 256: "+sha256.hexdigest()

SHA 256: 994dcf28257fd644d4393e1fb56e26f3ed66e602b697b7bfaec1fc54bd475e2e
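
If you are running Python 3, the same idea works with two small changes: hashlib expects bytes rather than a string, and print is a function. A minimal equivalent, producing the same digest as above:

>>> import hashlib
>>> sha256 = hashlib.sha256()
>>> sha256.update("some data I would like hashed".encode('utf-8'))
>>> print("SHA 256: " + sha256.hexdigest())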

Or maybe you need to capture network packets to identify possible information leaks.  Python provides Standard Library support for a variety of network interface capabilities.  For example, the built-in socket library provides the necessary building blocks for creating simple or advanced scripts that interface with the network.

>>> import socket
>>> mySocket = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_TCP)
>>> recvBuffer, addr = mySocket.recvfrom(255)
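
To take that snippet one small step further, here is a minimal sketch (assuming a Linux host and root privileges, since raw socket behavior and permissions vary by platform) that pulls the source and destination addresses out of the fixed 20-byte IPv4 header at the front of each received buffer:

import socket

# open a raw socket and receive a single TCP packet (the buffer includes the IP header)
mySocket = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_TCP)
recvBuffer, addr = mySocket.recvfrom(255)

# bytes 12-15 and 16-19 of the IPv4 header hold the source and destination addresses
ipHeader = recvBuffer[0:20]
print("Packet: " + socket.inet_ntoa(ipHeader[12:16]) + " -> " + socket.inet_ntoa(ipHeader[16:20]))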

Python Forensics: A workbench for inventing and sharing digital forensic technology, my most recent book, dives into these and many other topics to provide you with a deep understanding of the fundamental concepts of applying the Python language to digital investigation challenges. The book also contains fully coded and documented Python applications that can be used out of the box or extended by you.

So what are you waiting for? Start building, learning and experimenting with your new first language “Python”! And be sure to keep in touch along your journey!!

_____________________________________________

Chet Hosmer is an author, educator and researcher.  Chet is a co-founder of WetStone Technologies, Inc., a Visiting Professor at Utica College in the Cybersecurity graduate program, and an Adjunct Professor at Champlain College where he teaches in the Digital Forensics Graduate program.  He resides with his two-legged and four-legged family near Myrtle Beach, South Carolina.  He is the author of the popular Syngress titles Data Hiding and Python Forensics, with more to come!

Posted in Example, General, Source Code | 1 Comment

Python Forensics – SQLite Investigations Part One

SQLite has grown in popularity over the past several years, especially for embedded applications that require local/client storage, such as web browsers, Dropbox and Skype. In addition, SQLite is embedded in the iPhone, iPod touch and iTunes applications, and Android, Microsoft and other popular operating system platforms also use SQLite in a variety of applications. This versatile database has some limitations, but for lightweight embedded applications it has become the 'go to' database.

For these reasons I'm often asked, "Can evidence be easily extracted from SQLite databases using a Python script? If so, how can I build one?" The answer, of course, is yes, and there are quite a few examples of Python code snippets that demonstrate the basics. However, in many cases these examples lack detailed explanations and are not directly targeted at forensic interrogation of the databases, which makes it difficult to apply the snippets within a forensic context.

So… I've decided to dedicate a blog series to Python SQLite forensics. The series will take a deep dive into examining SQLite databases with Python and will be presented in my normal style of walking you through every detail, following, of course, the same approach as my book "Python Forensics, A workbench for inventing and sharing digital forensic technology".

Databases contain a set of tables and associations defined as a schema. A nice example is the SQLite schema published by Mozilla, which shows the relationships among the information stored by a web browser.
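If you would like to peek at a schema yourself, SQLite stores the CREATE statement for every table in the built-in sqlite_master table. Here is a minimal sketch (main.db is just a placeholder path for whatever database you have handy):

import sqlite3

# a minimal sketch: print the CREATE statement for every table in the database
# ('main.db' is just a placeholder path)
db = sqlite3.connect('main.db')
for name, createSQL in db.execute("SELECT name, sql FROM sqlite_master WHERE type='table'"):
    print name
    print createSQL
    print
db.close()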

Part One – The Basics

In Part One, I'm going to start simple and create the basics necessary to dump a SQLite database to a set of .csv files, each representing a table in the database, complete with column headings and contents.

Capabilities

1. Allow the user to specify the SQLite database file to examine
2. Extract all the table names associated with the database
3. Extract the field / column headings of each table
4. Extract the contents of each table and create an associated .csv file
5. Provide complete exception handling
6. Provide fully documented source code with detailed comments

Special Note:  Make sure that you are using the latest sqlite3.dll.  You can download the latest Windows binaries from SQLite.org and then replace the existing copy in your Python27 installation (look in the DLLs folder).  This will ensure compatibility with the latest SQLite implementations.
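A quick way to confirm which SQLite library your Python installation is actually linked against is to ask the sqlite3 module directly:

import sqlite3

# report the version of the SQLite library compiled into this Python installation
print "SQLite library version: " + sqlite3.sqlite_version
# and the version of the sqlite3 DB-API module itself
print "sqlite3 module version:  " + sqlite3.version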

Executing the SQLite Script

Running the program from the command line with the -h or --help option shows the basic operation.

[Screenshot SQL-1: command-line help output]

Running the application against an actual SQLite database (in this case the main.db from Skype) delivers the following results. As you can see, the Skype database has many tables that could provide valuable information regarding user activity.

[Screenshot SQL-2: tables found in the Skype main.db and per-table processing messages]

Examining the results directory, you can see that .csv files were created for each database table.

[Screenshot SQL-3: results directory showing one .csv file per table]

Examining the Python Script

The Python script below contains detailed comments and information that should get you started interrogating SQLite database files with Python.

In Part Two, I will be examining the relationships between tables and providing basic search and information identification code, so stay tuned.

Note… you can also follow this blog feed to stay tuned in.

#
# Python Forensics
#
# SQLite Part I: Basic SQL Database Dump
#
# Dumps the table names and contents of each table
# by creating a Comma Separated Value (CSV) with the contents
# of each table
#
# Sample code with detailed comments.  
#
# usage: python sqlitePartOne.py -v -i .\main.db -o .\result
#
# Python Version 2.7.x
#
# Version 1.1  June 30, 2014
#

'''
Copyright (c) 2014 Chet Hosmer, Python Forensics, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the "Software"), to deal in the Software without restriction, 
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, 
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, 
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial 
portions of the Software.

'''

# Import the standard library module sqlite3
# This type of import allows you to abbreviate the interface
# to sql methods.   i.e. sql.connect  vs sqlite3.connect
import sqlite3 as sql

# import the system module from the standard library
import sys

# import the standard library csv to handle comma separated value file I/O
import csv

# import the Operating System module; this handles file system I/O operations and definitions
import os

# import the Standard Library module for handling program arguments
import argparse

#=====================================================================
#
# Local Classes and Method Definitions
#
#=====================================================================

# 
# Class: CSVWriter 
#
# Desc: Handles all methods related to comma separated value operations
#
# Methods  constructor:    Initializes the CSV File and writes the header row supplied (as a List)
#          writeCSVRow:    Writes a single row to the csv file
#          destructor:     Closes the CSV File

class CSVWriter:

    def __init__(self, csvFile, heading):
        try:
            # create a writer object and then write the header row
            # open in binary mode so the csv module doesn't insert blank rows on Windows (Python 2.x)
            self.csvFile = open(csvFile, 'wb')
            self.writer = csv.writer(self.csvFile, delimiter=',',quoting=csv.QUOTE_ALL)
            self.writer.writerow(heading)
        except:
            print "CSV File: Initialization Failed"
            sys.exit(1)

    def writeCSVRow(self, row):
        try:
            rowList = []
            for item in row:

                # only unicode values need ASCII encoding; plain str items pass through unchanged
                if type(item) == unicode:
                    item = item.encode('ascii','ignore')

                rowList.append(item)

            self.writer.writerow(rowList)

        except:
            print "CSV File Write: Failed" 
            sys.exit(1)

    def __del__(self):
        # Close the CSV File
        try:
            self.csvFile.close()
        except:
            print "Failed to close CSV File Object"
            sys.exit(1)

# End CSV Writer Class ====================================

#
# Display Class
# Replaces basic print function with two advantages
# 1. It will only print to the console if verbose was selected by the user
# 2. It will work with both Python 2.x and 3.x printing 
#

class Display():

    def __init__(self, verbose):
        self.verbose = verbose
        self.ver = sys.version_info

    def Print(self, msg):
        if self.verbose:

            if self.ver >= (3,0):
                print(msg)
            else:
                print msg        

# Display CLASS

# Name: ParseCommand() Function
#
# Desc: Process and Validate the command line arguments
#           use Python Standard Library module argparse
#
# Input: none
#  
# Actions: 
#              Uses the standard library argparse to process the command line
#
# For this program we expect 3 potential arguments
# -v which asks the program to provide verbose output
# -i which defines the full path and file name of the sqlite database to dump
# -o which defines the directory where the resulting table dumps should be stored
#
def ParseCommandLine():

    parser = argparse.ArgumentParser('SQL DB Dump')

    parser.add_argument('-v', '--verbose', help="enables printing of additional program messages", action='store_true')
    parser.add_argument('-i', '--sqlDB',   type= ValidateFileRead,  required=True, help="input filename of the sqlite database")
    parser.add_argument('-o', '--outPath', type= ValidateDirectory, required=True, help="output path for extracted tables")    

    theArgs = parser.parse_args()           

    return theArgs

# End ParseCommandLine()

#
# Name: ValidateFileRead Function
#
# Desc: Function that will validate that a file exists and is readable
#
# Input: A file name with full path
#  
# Actions: 
#              if valid will return path
#
#              if invalid it will raise an ArgumentTypeError within argparse
#              which will in turn be reported by argparse to the user
#

def ValidateFileRead(theFile):

    # Validate that the file exists
    if not os.path.exists(theFile):
        raise argparse.ArgumentTypeError('File does not exist')

    # Validate the path is readable
    if os.access(theFile, os.R_OK):
        return theFile
    else:
        raise argparse.ArgumentTypeError('File is not readable')

# End ValidateFileRead()

#
# Name: ValidateDirectory Function
#
# Desc: Function that will validate that the directory exists and is writable
#
# Input: Path to a Directory 
#  
# Actions: 
#              if valid will return path
#
#              if invalid it will raise an ArgumentTypeError within argparse
#              which will in turn be reported by argparse to the user
#

def ValidateDirectory(theDirectory):

    # Validate the path is a valid directory
    if not os.path.exists(theDirectory):
        raise argparse.ArgumentTypeError('Directory does not exist')

    # Validate the path is writable
    if os.access(theDirectory, os.W_OK):
        return theDirectory
    else:
        raise argparse.ArgumentTypeError('Directory is not writable')

# End ValidateDirectory()

#=====================================================================
# Main Function Starts Here
#=====================================================================

# Main program for SQL Dump
# 
# Input: 
#       verboseFlag: used to be loud or silent in processing
#       theDB:       full path and filename of the input sqlite database file
#       outPath:     the path of the designated results directory

def main(verboseFlag, theDB, outPath):

    p = Display(verboseFlag)
    p.Print("Python Forensics: SQLite Investigation Part One - Simple Database Dump")

    try:
        # attempt to connect to a database file
        # this example uses the skype main.db that I have copied into
        # my local directory for easy access

        db = None
        db = sql.connect(theDB)

        # sql requires a cursor 
        # A database cursor is a structure that enables you to traverse the records in a database
        # Cursors facilitate operations such as retrieval, addition and deletion of records contained
        # in a database

        dbCursor = db.cursor()    

        # Now let's utilize the cursor to execute a simple SQL command
        # that extracts the table names from the database

        dbCursor.execute("SELECT name FROM sqlite_master WHERE type='table';")

        # The next statement fetches all the results from the table query

        tableTuple = dbCursor.fetchall()

        # For good measure let's print the list of tables
        # associated with this database

        p.Print("Tables Found")

        for table in tableTuple:
            p.Print(table[0])

        # Now we have all the table names in the object tableTuple
        # We can iterate through each tuple entry

        for item in tableTuple:

            # For this particular tuple we are only interested in the first
            # entry which is the name of the table

            tableName = item[0]
            p.Print("Processing Table: "+tableName+"\n")

            # Now we can use the table name to extract data
            # contained in the table

            # quote the table name so names containing spaces or reserved words still work
            tableQuery = 'SELECT * FROM "' + tableName + '"'

            # We will use the cursor to execute the query
            # and then collect the row data using the fetchall() method

            dbCursor.execute(tableQuery)

            # Obtain the table description
            tableDescription = dbCursor.description      

            # Create a heading for each table
            tableHeading = []
            for item in tableDescription:
                tableHeading.append(item[0])

            oCSV = CSVWriter(outPath+os.sep+tableName+'.csv', tableHeading)

            rowData = dbCursor.fetchall()

            # Now we can iterate through the row data
            # and write the results to the associated CSV file

            for row in rowData:
                oCSV.writeCSVRow(row)

            # release the writer; the class destructor closes the CSV file
            del oCSV

    except:
        p.Print ("SQL Error")
        sys.exit(1)

    finally:

        if db:
            db.close()    

    p.Print("End Program")

# End Main program

#=================================================================
# Main Program Entry Point
#=================================================================

# Processes the user supplied arguments
# and if successful calls the main function
# with the appropriate arguments

if __name__ == "__main__":

    args = ParseCommandLine()

    # Call main passing the user defined arguments

    main(args.verbose, args.sqlDB, args.outPath)
Posted in Example, Source Code | 2 Comments

First Python Forensic Script Challenge Winner Selected

Congrats to John Carney from http://www.carneyforensics.com/ for submitting the best new Python Forensic Script Idea at the 2014 TechnoSecurity Conference held in Myrtle Beach, SC June 1-4.

Watch for solutions to the script challenge in coming weeks.

 

Posted in Announcement | Leave a comment

Python Forensics Book Launch

TechnoSecurity 2014

The Python Forensics book launched at TechnoSecurity 2014. Stop by booth 616 for a chance to win a free copy.

 

Posted in Announcement | 4 Comments

PFIC 2013 Python Labs

Thanks to everyone who attended the Python Labs at PFIC 2013.

As promised, I have included the lecture and Labs with full source code.

PYTHON PFIC-2013

Total Downloads: 4,762

Enjoy

Posted in Example, Source Code | 1 Comment

HTCIA International Conference 2013

Thanks to everyone who attended the Python-Forensics Lab at HTCIA International.

The presentation and lab are available for download. Enjoy.
Download Lab and Presentation
Download Count 1,102

Posted in Example, Source Code | Leave a comment

Ubuntu and Python a nice couple

I’m often asked: What is the best environment for developing Python applications?

The answer, of course, is that it depends, mostly on your preferences. The great thing about Python is that whether you are most comfortable on a Mac, Windows 8 or Linux, you can enjoy the same integrated development environment.

However, with the advent of Ubuntu 12.x LTS (Long Term Support version) it certainly rises to the top for Linux. This version is guaranteed to be supported with updates and security patches until April 2017. http://www.ubuntu.com/download/desktop

In addition, Python 2.7.3 comes installed as part of the base installation. Once Ubuntu is installed, the Ubuntu Software Center is also available; searching it for Python turns up a plethora of additional resources and downloads to enhance your Python experience.
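If you want to confirm exactly what shipped with your release, a quick check from the interpreter will tell you (the version and platform strings will of course vary by system):

import sys
import platform

# confirm the Python version and platform provided by the base installation
print sys.version
print platform.platform()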

Enjoy!

Posted in Announcement | Leave a comment

Using Hex and Binary Numbers in Python

One of the first questions forensic investigators ask when writing Python programs or scripts is: how do I handle hex and binary numbers and perform simple operations on them?

Python has built-in, intuitive capabilities for handling such numbers. Remember, Python is designed to be as easy to read as English.

Opening the Python shell we can see how easy this really is.

> python
Python 2.7.5 (default, May 15 2013, 22:43:36) [MSC v.1500 32 bit (Intel)]
Type "help", "copyright", "credits" or "license" for more information.

# First set the variable named value = to the decimal number 127
>>> value = 127
# displaying the number in hex is as easy as saying
# "show me the hex representation of the variable value", using the proper syntax of course
>>> hex(value)
'0x7f'
# I like to see my hex numbers in all caps, I know old school
# so I add on the upper() function as shown below
>>> hex(value).upper()
'0X7F'

# displaying the number in binary works the same way
>>> bin(value)
'0b1111111'
>>>
# what if we want to "Exclusive Or" two hex values together?
# we first set variable A = to a hex 20 and variable B = to a hex 40
>>> A = 0x20
>>> B = 0x40
# then we use the caret operator to create the new variable C
# (this operator represents "Exclusive Or" in most languages)
>>> C = A ^ B

# then we use the hex function once again to display the result
>>> hex(C).upper()
'0X60'

# and of course we then would like to display the variable C in binary
>>> bin(C)
'0b1100000'
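Going the other direction is just as painless. int() with an explicit base converts hex or binary strings back to integers, and format() gives fixed-width output when you want byte values to line up:

# convert hex or binary strings back to integers by giving int() a base
>>> int('0x7F', 16)
127
>>> int('0b1111111', 2)
127
# format() produces fixed-width output, handy for lining up byte values
>>> format(value, '02X')
'7F'
>>> format(value, '08b')
'01111111'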

As the saying goes, "as easy as pie".
One of the earliest uses of this idiom was in a comic story found in The Newport Mercury (a Rhode Island newspaper) back in 1887.

Posted in Example | Leave a comment

Python-Forensics @ Techno Security

A Python-Forensics lecture and demonstration, along with a mini training session, were held at the 15th annual Techno Security Conference in Myrtle Beach, SC.

Over 50 attendees participated and we had a great interchange of ideas.

Thanks to all that participated.

Posted in Announcement | 3 Comments