Version 1.1.1 QuickFish

QuickFish Python Script Update

Key Updates

– Supports both the Python 2.7 and Python 3.x environments

– Fixed bug when processing SHA224

– Supports two hashes simultaneously, e.g. MD5 and SHA512

– Added additional log and display messages

– Added Version and Release Date Information in the Log

– Added new progress indicator option

– Added hash matching feature based on input file

Special thanks to Lance for his persistence in getting me to update the script for Python 3 and for his detailed testing that turned up several bugs that are now corrected.

What is QuickFish?

For those not familiar with it, QuickFish is a much-improved version of the pfish.py script found in my Python Forensics book.


The new script is contained in a single file and performs a walk of the file system from a starting point defined by the user via a command-line option.  For each file encountered during the walk, file attributes are collected along with the hash of the file.  Supported hashes include MD5, SHA1, SHA224, SHA256, SHA384 and SHA512.  In the new version you can specify any two of the hash methods.  The results are written to a comma-separated value (CSV) file that can be opened by most popular spreadsheet programs for additional sorting and analysis.
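
For example, a typical run (assuming the downloaded file has been renamed quickfish.py, and using hypothetical evidence and report paths) might compute MD5 as the primary hash and SHA512 as the second hash:

python quickfish.py -v --md5 --sha512a -d /evidence -r /reports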

Finally, an optional match file can be provided that contains a hash value and a note, one entry per line, in the following format:

732A289B7DD8C7DD28D4D73ED2480BCF,Suspicious Image
40374D33463DFE213D31CCB0E1DEDC22,Proprietary Image

Each generated hash is compared to those found in the match file.  If a match occurs, the match is indicated in the report along with the associated note.
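
Internally the script simply loads the match file into a dictionary keyed by the upper-cased hash value, so each comparison is a constant-time dictionary lookup.  Here is a minimal sketch of that idea, simplified from the full script below (match.txt is a hypothetical match file in the HASHVALUE,ID format shown above):

# Build a lookup table from the match file: hash -> note
matchDict = {}
with open('match.txt') as fp:
    for line in fp:
        hashKey, hashID = line.strip().split(',', 1)
        matchDict[hashKey.upper()] = hashID

# Check a computed hash (example value taken from the sample match file above)
hashValue = '732A289B7DD8C7DD28D4D73ED2480BCF'
if hashValue in matchDict:
    print('FOUND: ' + matchDict[hashValue])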

Enjoy.

 

Download

Note: the download uses a .txt file extension for safety; change the extension to .py after downloading.

'''
Copyright (c) 2014-2015 Chet Hosmer

Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the "Software"), to deal in the Software without restriction, 
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, 
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, 
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial 
portions of the Software.

'''
# 
# Python QUICK FISH File System Hash Program
# Author: C. Hosmer
#
#
# Revised: 10/02/2015  Version 1.1.1
#                     1) Fixed bug when processing SHA224
#                     2) Added additional log and display messages when
#                        the CSV File cannot be opened.  This is typically due
#                        to the file already being open in another application
#                        such as a spreadsheet
#                     3) Added Version and Release Date to Log
#                     4) Added new progress spinner argument -s, this displays a
#                        spinning progress indicator.  Good when scanning larger
#                        numbers of files and directories.  Note this is mutually 
#                        exclusive with the -v or verbose option
#
# Revised: 09/30/2015  Version 1.1
#                      1) Modified the source to work with either Python 2.7.x
#                         or Python 3.x.  Thanks to the suggestions by Lance
#                      2) Fixed bug in Hash value lookup
#
# Revised: 11/1/2014 from the original pfish program in Python Forensics Book 
#                    ISBN: 978-0124186767
#                    New ideas and expansion inspired by Michelle Mullinix and Greg Dominguez
#                    Updates Include:
#                        a) Reduced the script to a single .py file for simple execution
#                        b) Allowed selection of one or (optionally) two hash types per run
#                        c) Supported all native hash types available in Python's hashlib
#                        d) Added the optional capability to include a hashmatch input file
#                           this will add two fields to the csv output file (Match and ID) 
#                           that contains the word FOUND when a match is identified
#                           along with the ID value associated with the hash from the input file. 
#                           Note the input file format for the hashmatch is strict  
#                           HASHVALUE,ID one entry per line.  

import logging    # Python Standard Library Logger
import sys        # Python Standard Library system specific parameters
import os         # Python Standard Library - Miscellaneous operating system interfaces
import stat       # Python Standard Library - constants and functions for interpreting os results
import time       # Python Standard Library - Time access and conversions functions
import hashlib    # Python Standard Library - Secure hashes and message digests
import argparse   # Python Standard Library - Parser for command-line options, arguments
import csv        # Python Standard Library - reader and writer for csv files

# Support Functions Start Here, Main Script Entry is at the bottom

#
# Name: ParseCommand() Function
#
# Desc: Process and Validate the command line arguments
#           use Python Standard Library module argparse
#
# Input: none
#  
# Actions: 
#              Uses the standard library argparse to process the command line
#              establishes a global variable gl_args where any of the functions can
#              obtain argument information
#
def ParseCommandLine():

  parser = argparse.ArgumentParser('Python file system hashing .. QuickFish')

  group = parser.add_mutually_exclusive_group()
  group.add_argument('-v', "--verbose",  help="allows progress messages to be displayed", action='store_true')
  group.add_argument('-s', "--spinner",  help="displays progress indicator", action='store_true')

  # setup a group where the selection is mutually exclusive and required.

  group = parser.add_mutually_exclusive_group(required=True)
  group.add_argument('--md5',      help = 'specifies MD5 algorithm',      action='store_true')
  group.add_argument('--sha1',     help = 'specifies SHA1 algorithm',     action='store_true')  
  group.add_argument('--sha224',   help = 'specifies SHA224 algorithm',   action='store_true')  
  group.add_argument('--sha256',   help = 'specifies SHA256 algorithm',   action='store_true')  
  group.add_argument('--sha384',   help = 'specifies SHA384 algorithm',   action='store_true')  
  group.add_argument('--sha512',   help = 'specifies SHA512 algorithm',   action='store_true')   

  group = parser.add_mutually_exclusive_group(required=False)
  group.add_argument('--md5a',      help = 'specifies MD5 algorithm',      action='store_true')
  group.add_argument('--sha1a',     help = 'specifies SHA1 algorithm',     action='store_true')  
  group.add_argument('--sha224a',   help = 'specifies SHA224 algorithm',   action='store_true')  
  group.add_argument('--sha256a',   help = 'specifies SHA256 algorithm',   action='store_true') 
  group.add_argument('--sha384a',   help = 'specifies SHA384 algorithm',   action='store_true')  
  group.add_argument('--sha512a',   help = 'specifies SHA512 algorithm',   action='store_true')   

  parser.add_argument('-d', '--rootPath',   type= ValidateDirectory,         required=True, help="specify the root path for hashing")
  parser.add_argument('-r', '--reportPath', type= ValidateDirectoryWritable, required=True, help="specify the path where reports and logs will be written")
  parser.add_argument('-m', '--hashMatch',  type= ValidateFileReadable,      required=False,help="specify the optional hashmatch input file path")   

  # create a global object to hold the validated arguments; these will then be available to all the functions

  global gl_args
  global gl_hashType
  global gl_hashTypeAlt
  global gl_hashMatch
  global gl_hashDict
  global gl_verbose
  global gl_spinner

  gl_args = parser.parse_args()   

  if gl_args.verbose:
    gl_verbose = True
  else:
    gl_verbose = False

  if gl_args.spinner:
    gl_spinner= True
  else:
    gl_spinner = False

  # Determine the hash type(s) selected 

  # Mandatory

  if gl_args.md5:
    gl_hashType = 'MD5'

  elif gl_args.sha1:
    gl_hashType = 'SHA1'     

  elif gl_args.sha224:
    gl_hashType = 'SHA224'        

  elif gl_args.sha256:
    gl_hashType = 'SHA256'

  elif gl_args.sha384:
    gl_hashType = 'SHA384'    

  elif gl_args.sha512:
    gl_hashType = 'SHA512'

  else:
    gl_hashType = "Unknown"
    logging.error('Unknown Hash Type Specified')

  # Optional Type

  if gl_args.md5a:
    gl_hashTypeAlt = 'MD5'

  elif gl_args.sha1a:
    gl_hashTypeAlt = 'SHA1'     

  elif gl_args.sha224a:
    gl_hashTypeAlt = 'SHA224'        

  elif gl_args.sha256a:
    gl_hashTypeAlt = 'SHA256'

  elif gl_args.sha384a:
    gl_hashTypeAlt = 'SHA384'    

  elif gl_args.sha512a:
    gl_hashTypeAlt = 'SHA512'
  else:
    gl_hashTypeAlt = 'None'

  # Check for hashMatch Selection    
  if gl_args.hashMatch:
    # Create a dictionary from the input file
    gl_hashMatch = gl_args.hashMatch
    gl_hashDict = {}

    try:
      with open(gl_hashMatch) as fp:
        # for each line in the file extract the hash and id
        # then store the result in a dictionary
        # key, value pair
        # in this case the hash is the key and id is the value

        for line in fp:
          hashKey = line.split(',')[0].upper()
          hashID  = line.split(',')[1]
          # Strip the newline from the ID
          hashID  = hashID.strip()
          # Add the key value pair to the dictionary
          gl_hashDict[hashKey] = hashID

    except:
      logging.error("Failed to read in Hash List")
      DisplayMessage("Failed to read in Hash List")
  else:
    gl_hashMatch = False

  DisplayMessage("Command line processed: Successfully")

  return

# End ParseCommandLine============================================================      

#
# Name: ValidateDirectory Function
#
# Desc: Function that will validate a directory path as 
#           existing and readable.  Used for argument validation only
#
# Input: a directory path string
#  
# Actions: 
#              if valid will return the Directory String
#
#              if invalid it will raise an ArgumentTypeError within argparse
#              which will in turn be reported by argparse to the user
#

def ValidateDirectory(theDir):

  # Validate the path is a directory
  if not os.path.isdir(theDir):
    raise argparse.ArgumentTypeError('Directory does not exist')

  # Validate the path is readable
  if os.access(theDir, os.R_OK):
    return theDir
  else:
    raise argparse.ArgumentTypeError('Directory is not readable')

#End ValidateDirectory ===================================

#
# Name: ValidateDirectoryWritable Function
#
# Desc: Function that will validate a directory path as 
#           existing and writable.  Used for argument validation only
#
# Input: a directory path string
#  
# Actions: 
#              if valid will return the Directory String
#
#              if invalid it will raise an ArgumentTypeError within argparse
#              which will in turn be reported by argparse to the user
#

def ValidateDirectoryWritable(theDir):

  # Validate the path is a directory
  if not os.path.isdir(theDir):
    raise argparse.ArgumentTypeError('Directory does not exist')

  # Validate the path is writable
  if os.access(theDir, os.W_OK):
    return theDir
  else:
    raise argparse.ArgumentTypeError('Directory is not writable')

#End ValidateDirectoryWritable ===================================

#
# Name: ValidateFileReadable Function
#
# Desc: Function that will validate a file path as 
#       existing and readable.  Used for argument validation only
#
# Input: a file path
#  
# Actions: 
#              if valid will return the FilePath
#
#              if invalid it will raise an ArgumentTypeError within argparse
#              which will in turn be reported by argparse to the user
#

def ValidateFileReadable(theFile):

  # Validate the path is a file
  if not os.path.isfile(theFile):
    raise argparse.ArgumentTypeError('File does not exist')

  # Validate the path is readable
  if os.access(theFile, os.R_OK):
    return theFile
  else:
    raise argparse.ArgumentTypeError('File is not readable')

#End ValidateFileReadable ===================================

class Spinner:

  # Constructor

  def __init__(self):
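    # Spinner animation frames; 'END' is a sentinel that Spin() uses to restart the cycle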

    self.symbols = [' |', ' /', ' -', ' \\', ' |', ' \\', ' -', 'END'] 
    self.curSymbol = 0

    sys.stdout.write("\b\b\b%s " % self.symbols[self.curSymbol])
    sys.stdout.flush()

  def Spin(self):

    if self.symbols[self.curSymbol] == 'END':
      self.curSymbol = 0

    sys.stdout.write("\b\b\b%s " % self.symbols[self.curSymbol])        
    sys.stdout.flush()
    self.curSymbol += 1

# End Spinner Class

#
# Name: WalkPath() Function
#
# Desc: Walk the path specified on the command line
#           use Python Standard Library module os and sys
#
# Input: none, uses command line arguments
#  
# Actions: 
#              Uses the standard library modules os and sys
#              to traverse the directory structure starting at the root
#              path specified by the user.  For each file discovered, WalkPath
#              will call the Function HashFile() to perform the file hashing
#

def WalkPath():

  processCount = 0
  errorCount = 0

  # Create a proper report path 
  reportPath = os.path.join(gl_args.reportPath, "fsreport.csv")
  oCVS = _CSVWriter(reportPath, gl_hashType, gl_hashTypeAlt)

  # Create a loop that processes all the files starting
  # at the rootPath, all sub-directories will also be
  # processed

  # Create a proper root path
  if gl_args.rootPath.endswith('\\') or gl_args.rootPath.endswith('/'):
    rootPath = gl_args.rootPath
  else:
    rootPath = gl_args.rootPath+'/'

  logging.info('Start Scan Path: ' + rootPath)

  if gl_args.spinner:
    # Create a Spinner Object for displaying progress
    obSPIN = Spinner()      

  for root, dirs, files in os.walk(rootPath):

    if gl_spinner:
      # Update progress indicator
      obSPIN.Spin()   

    # for each file obtain the filename and call the HashFile Function
    for file in files:
      fname = os.path.join(root, file)
      result = HashFile(fname, file, oCVS)

      # if hashing was successful then increment the ProcessCount
      if result is True:
        processCount += 1
      # if not successful, then increment the errorCount
      else:
        errorCount += 1       

  oCVS.writerClose()

  return(processCount)

#End WalkPath==================================================

#
# Name: HashFile Function
#
# Desc: Processes a single file which includes performing a hash of the file
#           and the extraction of metadata regarding the file processed
#           use Python Standard Library modules hashlib, os, and sys
#
# Input: theFile = the full path of the file
#           simpleName = just the filename itself
#  
# Actions: 
#              Attempts to hash the file and extract metadata
#              Writes a CSV record for successfully hashed files
#
def HashFile(theFile, simpleName, o_result):

  # Verify that the path is valid
  if os.path.exists(theFile):

    #Verify that the path is not a symbolic link
    if not os.path.islink(theFile):

      #Verify that the file is real
      if os.path.isfile(theFile):

        try:
          #Attempt to open the file
          f = open(theFile, 'rb')
        except IOError:
          #if open fails report the error
          logging.warning('Open Failed: ' + theFile)
          return
        else:
          try:
            # Get the Basic File Attributes
            # before reading the file contents
            # This should preserve the access time on most OS's

            theFileStats =  os.stat(theFile)
            (mode, ino, dev, nlink, uid, gid, size, atime, mtime, ctime) = os.stat(theFile)

            # Attempt to read the file
            rd = f.read()

          except IOError:
            # if read fails, then close the file and report error
            f.close()
            logging.warning('File Access Error: ' + theFile)
            return
          else:

            #Print the file being processed
            DisplayMessage("Processing File: " + theFile)
            logging.info("Processing File: " + theFile)

            # Get the size of the file in Bytes
            fileSize = str(size)

            #Get MAC Times
            modifiedTime = time.ctime(mtime)
            accessTime   = time.ctime(atime)
            createdTime  = time.ctime(ctime)

            ownerID  = str(uid)
            groupID  = str(gid)
            fileMode = bin(mode)

            #process the file hashes

            if gl_args.md5:
              #Calculate and Print the MD5
              hash = hashlib.md5()
              hash.update(rd)
              hexMD5 = hash.hexdigest()
              hashValue = hexMD5.upper()
            elif gl_args.sha1:
              hash = hashlib.sha1()
              hash.update(rd)
              hexSHA1 = hash.hexdigest()
              hashValue = hexSHA1.upper()
            elif gl_args.sha224:
              hash = hashlib.sha224()
              hash.update(rd)
              hexSHA224 = hash.hexdigest()
              hashValue = hexSHA224.upper()                         
            elif gl_args.sha256:
              hash = hashlib.sha256()
              hash.update(rd)
              hexSHA256 = hash.hexdigest()
              hashValue = hexSHA256.upper()
            elif gl_args.sha384:
              hash = hashlib.sha384()
              hash.update(rd)
              hexSHA384 = hash.hexdigest()
              hashValue = hexSHA384.upper()
            elif gl_args.sha512:
              #Calculate and Print the SHA512
              hash=hashlib.sha512()
              hash.update(rd)
              hexSHA512 = hash.hexdigest()
              hashValue = hexSHA512.upper()
            else:
              logging.error('Hash not Selected')

            if gl_args.md5a:
              #Calculate and Print the MD5 alternate
              hash = hashlib.md5()
              hash.update(rd)
              hexMD5 = hash.hexdigest()
              hashValueAlt = hexMD5.upper()
            elif gl_args.sha1a:
              hash = hashlib.sha1()
              hash.update(rd)
              hexSHA1 = hash.hexdigest()
              hashValueAlt = hexSHA1.upper()
            elif gl_args.sha224a:
              hash = hashlib.sha224()
              hash.update(rd)
              hexSHA224 = hash.hexdigest()
              hashValueAlt = hexSHA224.upper()
            elif gl_args.sha256a:
              hash = hashlib.sha256()
              hash.update(rd)
              hexSHA256 = hash.hexdigest()
              hashValueAlt = hexSHA256.upper()
            elif gl_args.sha384a:
              hash = hashlib.sha384()
              hash.update(rd)
              hexSHA384 = hash.hexdigest()
              hashValueAlt = hexSHA384.upper()
            elif gl_args.sha512a:
              hash = hashlib.sha512()
              hash.update(rd)
              hexSHA512 = hash.hexdigest()
              hashValueAlt = hexSHA512.upper()
            else:
              hashValueAlt = "Not Selected"

            # Check if hash matching was selected
            if gl_hashMatch:
              # If yes then check to see if we have a match
              # and if we do save the result
              if hashValue in gl_hashDict: 
                DisplayMessage("Hash Match")
                foundValue = "Found"
                foundID = gl_hashDict[hashValue]
              elif hashValueAlt in gl_hashDict:
                DisplayMessage("Hash Match")
                foundValue = "Found"
                foundID = gl_hashDict[hashValueAlt]       
              else:
                foundValue = ""
                foundID    = ""
            else:
              # Matching not set
              foundValue = ""
              foundID    = ""                            

            # write one row to the output file

            resultList = [simpleName, foundValue, foundID, theFile, fileSize, modifiedTime, accessTime, createdTime, hashValue, hashValueAlt, ownerID, groupID, str(mode)]     
            o_result.writeCSVRow(resultList)

            DisplayMessage("================================")
            return True
      else:
        logging.warning('[' + repr(simpleName) + ', Skipped NOT a File' + ']')
        return False
    else:
      logging.warning('[' + repr(simpleName) + ', Skipped Link NOT a File' + ']')
      return False
  else:
    logging.warning('[' + repr(simpleName) + ', Path does NOT exist' + ']')        
  return False

# End HashFile Function ===================================

#==================================================

#
# Name: DisplayMessage() Function
#
# Desc: Displays the message if the verbose command line option is present
#
# Input: message type string
#  
# Actions: 
#              Uses the standard library print function to display the message
#
def  DisplayMessage(msg):

  if gl_verbose:
    print(msg)

  return   

#End DisplayMessage=====================================

# 
# Class: _CSVWriter 
#
# Desc: Handles all methods related to comma separated value operations
#
# Methods:  constructor:    Initializes the CSV File
#           writeCSVRow:    Writes a single row to the csv file
#           writerClose:    Closes the CSV File

class _CSVWriter:

  def __init__(self, fileName, hashType, hashTypeAlt):
    try:
      # create a writer object and then write the header row
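      # Python 3's open() accepts a newline argument (here forcing \r\n line endings);
      # Python 2's open() does not, so the file is opened without it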
      if (sys.version_info > (3, 0)):
        self.csvFile = open(fileName, 'w',newline="\r\n")
      else:
        self.csvFile = open(fileName, 'w')

      tempList = ['File', 'Match', 'ID', 'Path', 'Size', 'Modified Time', 'Access Time', 'Created Time', hashType, hashTypeAlt, 'Owner', 'Group', 'Mode']
      outStr = ",".join(tempList)
      self.csvFile.write(outStr)
      self.csvFile.write("\n")
    except:
      logging.error('CSV File Open Failure')
      DisplayMessage("Error Opening CSV File")
      DisplayMessage("Make sure CSV File Location is Writable and Ensure the file is not open")
      quit()

  def writeCSVRow(self, outList):
    outStr = ",".join(outList)
    self.csvFile.write(outStr)
    self.csvFile.write("\n")

  def writerClose(self):
    self.csvFile.close()

# ------------ MAIN SCRIPT STARTS HERE -----------------

if __name__ == '__main__':

  QFISH_VERSION = '1.1.1'
  ReleaseDate   = "October 2, 2015"

  # Turn on Logging
  logging.basicConfig(filename='QUICKFISH.log',level=logging.DEBUG,format='%(asctime)s %(message)s')

  # Process the Command Line Arguments
  ParseCommandLine()

  # Record the Starting Time
  startTime = time.time()

  # Record the Welcome Message
  logging.info('')
  logging.info('Welcome to QUICKFISH')
  logging.info('Version: ' + QFISH_VERSION)
  logging.info('Release Date: '+ ReleaseDate)
  logging.info('\nStart Scan\n')
  logging.info('')
  DisplayMessage('Welcome to QUICKFISH Version: '+ QFISH_VERSION + ' Release Date: ' + ReleaseDate + '\n')

  # Record some information regarding the system
  logging.info('System:  '+ sys.platform)
  logging.info('Version: '+ sys.version)

  # Traverse the file system directories and hash the files
  filesProcessed = WalkPath()

  # Record the end time and calculate the duration
  endTime = time.time()
  duration = endTime - startTime

  logging.info('Files Processed: ' + str(filesProcessed) )
  logging.info('Elapsed Time: ' + str(duration) + ' seconds')
  logging.info('')
  logging.info('Program Terminated Normally')
  logging.info('')

  DisplayMessage('Files Processed: ' + str(filesProcessed) )
  DisplayMessage('Elapsed Time: ' + str(duration) + ' seconds')
  DisplayMessage('')
  DisplayMessage("Program End")
