LogFileParser
Parser for $LogFile on NTFS
Features
Decode and dump $LogFile records and transaction entries.
Decode NTFS attribute changes.
Optionally resolve all datarun list information available in $LogFile. Option: "Reconstruct data runs".
Recover transactions from slack space within $LogFile.
Choose to reconstruct missing or damaged headers of transactions found in slack. Option: "Rebuild header".
Optionally finetune the result with an LSN error level value. Option: "LSN error level".
Logs to csv and imports into an sqlite database with several tables.
Optionally import csv output of mft2csv into the db.
Choose among 6 different timestamp formats.
Choose timestamp precision: None, MilliSec or NanoSec.
Choose precision separator at millisec.
Choose precision separator at nanosec.
Choose region adjustment for timestamps. Default is to present timestamps in UTC 0.0.
Choose output separator. Option: "Set separator".
Configurable UNICODE or ANSI output. Option: "Unicode".
Configurable MFT record size (1024 or 4096). Option: "MFT record size".
Optionally decode individual transactions or partial transactions (fragments).
Option to reconstruct RCRD's from single or multiple transactions (fragments).
Option to configure a broken $LogFile. Useful with carved RCRD's as input.
Option to skip fixups (for a broken $LogFile, typically carved from memory).
Detailed verbose output into debug.log.
Configurable comma-separated list of LSNs to trigger ultra verbose information about specific transactions into debug.log.
Configuration for 32-bit OS.
Configuration for binary data extraction of resident data updates.
Autogenerated sql for importing output into a MySql database.
Option to skip all sqlite3 work to speed up total parsing.
Optional command line mode.
Supports errorlevel, suitable for batch scripting.
Background
NTFS is designed as a recoverable filesystem. This is done through logging of all transactions that alter the volume structure. Any change to a file on the volume therefore requires something to be logged to the $LogFile as well, so that it can be reversed in case of a system failure at any time. A lot of information is consequently written to this file, and since it is circular, new transactions overwrite older records. How much historical data can be retrieved from it is thus somewhat limited, depending on the type of volume and the size of the $LogFile. On the system drive of a frequently used system you will likely only get a few hours of history, whereas an external/secondary disk holding backup files would likely contain more historical information. And a 2 MB file will contain far less history than a 256 MB one. So in what size range can this file be configured? Anything from 256 KB and up. Setting the size to 2 GB can be done like this: "chkdsk D: /L:2097152" (the size is given in KB). How a large logfile impacts performance is beyond the scope of this text. Setting it lower than 2048 KB is normally not possible. However, it is possible by patching untfs.dll: http://code.google.com/p/mft2csv/wiki/Tiny_NTFS
Intro
This parser will decode and dump lots of transaction information from the $LogFile on NTFS. Several csv's are generated, as well as an sqlite database named ntfs.db containing all relevant information. The output is extremely detailed and very low level, meaning it requires some decent NTFS knowledge to understand. The currently handled Redo transaction types with meaningful output decode are:
InitializeFileRecordSegment
CreateAttribute
DeleteAttribute
UpdateResidentValue
UpdateNonResidentValue
UpdateMappingPairs
SetNewAttributeSizes
AddindexEntryRoot
DeleteindexEntryRoot
AddIndexEntryAllocation
DeleteIndexEntryAllocation
WriteEndOfIndexBuffer
SetIndexEntryVcnRoot
SetIndexEntryVcnAllocation
UpdateFileNameRoot
UpdateFileNameAllocation
SetBitsInNonresidentBitMap
ClearBitsInNonresidentBitMap
OpenNonresidentAttribute
OpenAttributeTableDump
AttributeNamesDump
DirtyPageTableDump
TransactionTableDump
UpdateRecordDataRoot
UpdateRecordDataAllocation
CompensationlogRecord
The list of currently supported attributes:
$STANDARD_INFORMATION
$ATTRIBUTE_LIST
$FILE_NAME
$OBJECT_ID
$SECURITY_DESCRIPTOR
$VOLUME_NAME
$VOLUME_INFORMATION
$DATA
$INDEX_ROOT
$INDEX_ALLOCATION
$REPARSE_POINT
$EA_INFORMATION
$EA
$LOGGED_UTILITY_STREAM
So basically all attributes are supported.
Explanation of the different output generated:
LogFile.csv: The main csv generated from the parser.
LogFile_DataRuns.csv The input information needed for reconstructing dataruns
LogFile_DataRunsResolved.csv The final output of reconstructed dataruns
LogFile_INDX_I30.csv All dumped and decoded index records (IndexRoot/IndexAllocation)
LogFileJoined.csv Same as LogFile.csv, but with filename information joined in from the $UsnJrnl or the csv of mft2csv.
MFTRecords.bin Dummy $MFT recreated based on found MFT records in InitializeFileRecordSegment transactions. Can use mft2csv on this one (remember to configure "broken MFT" and "Fixups" properly).
LogFile_lfUsnJrnl.csv Records of the $UsnJrnl that have been decoded from within the $LogFile.
LogFile_UndoWipe_INDX_I30.csv All undo operations for clearing of directory indexes (INDX).
LogFile_AllTransactionHeaders.csv All headers of decoded transactions.
LogFile_BitsInNonresidentBitMap.csv All decoded SetBitsInNonresidentBitMap operations.
LogFile_DirtyPageTable32bit.csv and LogFile_DirtyPageTable64bit.csv All entries in every decoded DirtyPageTableDump operation for both 32bit and 64bit OS.
LogFile_Mft_ObjectId_Entries.csv Decoded $ObjectId attributes.
LogFile_ObjIdO.csv All decodes from system file $ObjId:$O.
LogFile_OpenAttributeTable.csv All entries in every decoded OpenAttributeTableDump operation.
LogFile_QuotaO.csv All decodes from system file $Quota:$O.
LogFile_QuotaQ.csv All decodes from system file $Quota:$Q.
LogFile_RCRD.csv All headers of decoded RCRD records.
LogFile_ReparseR.csv All decodes from system file $Reparse:$R.
LogFile_SecureSDH.csv All decodes from system file $Secure:$SDH.
LogFile_SecureSII.csv All decodes from system file $Secure:$SII.
LogFile_SecurityDescriptors.csv Decoded security descriptors. Source can be from $SECURITY_DESCRIPTOR or $Secure:$SDS.
LogFile_SlackAttributeNamesDump.csv All entries from decoded AttributeNamesDump transactions found in slack space.
LogFile_SlackOpenAttributeTable.csv All entries from decoded OpenAttributeTableDump transactions found in slack space.
LogFile_TransactionTable.csv Decoded TransactionTableDump transactions.
LogFile_Filenames.csv All resolved filenames with MftRef, MftRefSeqNo and Lsn.
LogFile_TxfData.csv Decoded data from $DATA:$TXF_DATA in $LOGGED_UTILITY_STREAM.
LogFile_UpdateFileName_I30.csv All decodes of UpdateFileNameRoot and UpdateFileNameAllocation for both redo and undo operations.
LogFile_CompensationlogRecord.csv All decodes of CompensationlogRecord. Not relevant for nt5.x.
Ntfs.db An sqlite database file with tables almost equivalent to the above csv's. The database contains 5 tables: DataRuns, IndexEntries, LogFile, LogFileTmp (temp table used when recreating dataruns) and UsnJrnl.
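As a minimal sketch of inspecting the generated database with Python's standard sqlite3 module (assuming ntfs.db sits in the current directory; the exact table schemas are the parser's own and are not described here):

```python
import sqlite3

# Open the database produced by the parser and list its tables.
con = sqlite3.connect("ntfs.db")
tables = [row[0] for row in
          con.execute("SELECT name FROM sqlite_master WHERE type='table'")]
print(tables)  # expect names such as LogFile, DataRuns, IndexEntries, UsnJrnl
con.close()
```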
Timestamps
Defaults are presented in UTC 0.00 and with nanosecond precision. The default format is YYYY-MM-DD HH:MM:SS:MSMSMS:NSNSNSNS. These can be configured. The different timestamps refer to:
CTime means File Create Time.
ATime means File Modified Time.
MTime means MFT Entry Modified Time.
RTime means File Last Access Time.
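A raw NTFS timestamp is a 64-bit count of 100 ns intervals since 1601-01-01 UTC. As an illustrative sketch (not the parser's own code), the default format above can be produced like this:

```python
from datetime import datetime, timedelta, timezone

# NTFS FILETIME epoch
EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_str(filetime: int) -> str:
    """Render a FILETIME as YYYY-MM-DD HH:MM:SS:MSMSMS:NSNSNSNS,
    i.e. a 3-digit millisecond field followed by the remaining
    4 digits of 100 ns precision."""
    dt = EPOCH_1601 + timedelta(microseconds=filetime // 10)
    sub100ns = filetime % 10_000_000       # 7 decimal digits of sub-second
    ms, ns100 = divmod(sub100ns, 10_000)   # split into 3 + 4 digits
    return dt.strftime("%Y-%m-%d %H:%M:%S") + f":{ms:03d}:{ns100:04d}"

print(filetime_to_str(0))  # 1601-01-01 00:00:00:000:0000
```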
Reconstructing dataruns
Many operations on the filesystem will trigger a transaction into the $LogFile. Those relating to the $DATA attribute, i.e. a file's content, are so far identified as:
InitializeFileRecordSegment CreateAttribute UpdateMappingPairs SetNewAttributeSizes
They all leave different information in the $LogFile. Resident data modifications behave differently and cannot be reconstructed in the same way, at least on NTFS volumes originating from modern Windows versions.
InitializeFileRecordSegment is written when a new file is created. Thus it will have the $FILE_NAME attribute, as well as the original $DATA attribute content, including dataruns. Since the $LogFile is circular and older events get overwritten by newer ones, the challenge with the $LogFile is to get information far enough back in time. However, if InitializeFileRecordSegment is present, then we should be able to reconstruct everything, since all records written after it will also be available. We will also have the offset to the datarun list, a relative offset calculated from the beginning of the $DATA attribute. This is important information to have when calculating where in the datarun list an UpdateMappingPairs has done its modification.
CreateAttribute holds the attribute as it was when first created (if not written as part of InitializeFileRecordSegment). With this one too, we should be able to reconstruct dataruns, since we have all later transactions available to us. However, this one will not in itself provide us with the file name. Here too, we have the offset to the datarun list available, which is extremely useful when solving UpdateMappingPairs.
UpdateMappingPairs is a transaction written when modifications to the $DATA attribute's dataruns are performed (file content has changed). The information found in this transaction is not complete: it contains just the new values added to the existing datarun list. It also contains a relative offset that tells us where in the datarun list the changes have been written. This offset is used in combination with the offset to the datarun list as found in InitializeFileRecordSegment and CreateAttribute.
SetNewAttributeSizes is a transaction that contains information about any size-related modifications done to the $DATA attribute. It is tightly connected to UpdateMappingPairs.
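The mapping-pair (datarun) encoding that these transactions create and modify can be sketched as follows. This is an illustrative decoder, not the parser's own code: each run starts with a header byte whose low nibble gives the byte length of the cluster-count field and whose high nibble gives the byte length of the (signed, relative) cluster-offset field; a 0x00 byte terminates the list.

```python
def decode_dataruns(raw: bytes):
    """Decode an NTFS mapping-pair (datarun) list into
    (starting LCN, cluster count) tuples."""
    runs, pos, lcn = [], 0, 0
    while pos < len(raw) and raw[pos] != 0x00:   # 0x00 terminates the list
        header = raw[pos]
        len_sz, off_sz = header & 0x0F, header >> 4
        pos += 1
        count = int.from_bytes(raw[pos:pos + len_sz], "little")
        pos += len_sz
        # The offset is signed and relative to the previous run's LCN
        delta = int.from_bytes(raw[pos:pos + off_sz], "little", signed=True)
        pos += off_sz
        lcn += delta
        runs.append((lcn, count))
    return runs

# Example: header 0x21 = 1-byte count, 2-byte offset;
# 0x18 clusters starting at LCN 0x0563
print(decode_dataruns(bytes.fromhex("2118630500")))  # [(1379, 24)]
```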
