HAC/65 - The 6502 Inferencing Disassembler
- What is it?
- What can I do with it?
- How does it work?
- How do I get started?
- The Big Leagues
- FAQ (yet to be asked)
What is it?
HAC/65 is yet another addition to the surfeit of 6502 disassemblers available to hobbyists and computer historians (http://6502.org/tools/asm/). But there are a few notable features that set it apart from the pack:
- It has some smarts. Given some 6502 object code it will study it to determine which segments are code and which are data and fingerprint them so they can be easily spotted in other object code. It can even illuminate "dark code" lurking in a binary -- mysteriously uncalled code segments that may be of interest to historians, bug-hunters and conspiracy theorists.
- It is extensible. You can make it smarter by supplying extra knowledge in the form of stackable architecture overlays -- data files that let you assist the tool with known symbols, data structure locations, and program counter targets. This enables a powerful, iterative reverse-engineering workflow for researchers and opens new opportunities for automating comparative analysis of revisions and re-documentation of historical assembly code listings.
- It's implemented as a single Modern C++ executable with no additional runtime dependencies so it is fast, small, and future-proof. Its modular design allows the core analyzer component to be easily embedded into other native code applications.
- It's hosted on GitHub, not some fly-by-night domain that you can never remember. So it will be standing by here waiting for you day or night until the sun explodes or you finally lose interest in old 6502 hardware, whichever comes first.
- It's new, and newer is always better, right?
What can I do with it?
Its main purpose is to serve as a tool for exploring poorly documented or completely undocumented 6502 object code at the subroutine level. Although this is not a new pastime among legacy computer hardware enthusiasts, the author grew frustrated with the lack of good tools for rigorous comparative studies of legacy ROM code revisions and product evolution. And while he found decent tools for basic disassembly, none allowed a user to iteratively refine the results as knowledge of a particular product was gained. That ability is especially important when analyzing object code with limited access to the original source code, and when the most useful information comes from code snippets in old magazine articles and fragments of long-dead BBS and Internet forum conversations.
HAC/65 is an attempt to address these needs by producing three basic reports:
- A segmentation report shows the individual contiguous groupings of both instructions and non-instruction data bytes. Instruction (code) segments typically represent individual subroutines. Once the purpose of a particular subroutine is determined it can be assigned a label, which subsequent disassemblies then show in both called and caller code. Data segments typically represent one or more data structures. They too can be assigned labels and declared as such to the analyzer, which may help it discover additional code segments.
- A segment fingerprint report lists information about the segments ordered by fingerprint ID (hash code). Segments with the same fingerprint are likely candidates for being duplicate code within the same object code or across different object code files. This is useful for spotting new instruction sequences across ROM revisions, for example, where older segments may have been relocated within the address space but are otherwise unaltered.
- A disassembly report produces a more conventional disassembler output using a simple listing format that works well with differencing tools for spotting specific differences in instructions or data across object code files. Clever users could even automate the translation of the listing into source code for actual assemblers with relative ease.
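To make the fingerprint idea concrete, here is a minimal Python sketch of how relocated but otherwise unaltered segments could end up with the same hash. It is not HAC/65's actual fingerprinting scheme, which is not described here. The sketch assumes that 16-bit absolute operands are left out of the hash so that moved call targets do not change the result. The opcode table is a tiny illustrative subset, and the names `fingerprint` and `group_by_fingerprint` are hypothetical.

```python
import hashlib
from collections import defaultdict

# Instruction lengths for a small, illustrative subset of 6502 opcodes.
LENGTHS = {
    0xA9: 2,  # LDA #imm
    0xD0: 2,  # BNE rel
    0xE8: 1,  # INX
    0x60: 1,  # RTS
    0x8D: 3,  # STA abs
    0x20: 3,  # JSR abs
    0x4C: 3,  # JMP abs
}


def fingerprint(segment: bytes) -> str:
    """Hash a code segment while ignoring 16-bit absolute operands.

    Assumption (not from HAC/65): skipping absolute addresses lets a
    relocated copy of a segment hash to the same value. Opcodes missing
    from LENGTHS raise KeyError in this sketch.
    """
    digest = hashlib.sha1()
    i = 0
    while i < len(segment):
        opcode = segment[i]
        length = LENGTHS[opcode]
        digest.update(bytes([opcode]))
        if length == 2:
            # Keep immediates and relative branch offsets: they do not
            # change when the segment moves.
            digest.update(segment[i + 1:i + 2])
        i += length
    return digest.hexdigest()[:8]


def group_by_fingerprint(segments: dict) -> dict:
    """Map each fingerprint to the names of the segments that share it."""
    groups = defaultdict(list)
    for name, code in segments.items():
        groups[fingerprint(code)].append(name)
    return dict(groups)


# Three made-up segments: LDA #imm / JSR abs / RTS.
segments = {
    "old_copy": bytes([0xA9, 0x01, 0x20, 0x00, 0xF1, 0x60]),
    "moved_copy": bytes([0xA9, 0x01, 0x20, 0x40, 0xF3, 0x60]),  # JSR target moved
    "changed": bytes([0xA9, 0x02, 0x20, 0x40, 0xF3, 0x60]),     # immediate differs
}
groups = group_by_fingerprint(segments)
duplicates = [names for names in groups.values() if len(names) > 1]
print(duplicates)
```

Hashing only the position-independent parts is one possible design choice. A stricter scheme that hashes every byte would only match copies that were not relocated at all.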
How does it work?
There are many different techniques that can be used to disassemble object code. The author chose code ledge analysis which, while not as thorough as, say, control flow or simulation analysis, is relatively simple and produces sufficient results thanks to the limited complexity of the 6502 architecture and its applications.
In a nutshell, the analyzer tries to infer contiguous instruction segments by looking for program counter "landing-edges" and "leaping-edges", aka ledges. It uses an iterative process of identifying ledges and then analyzing the instructions between them to find new ledges. Once all ledges have been identified, in theory all code segments have been identified and any segments left over must be data segments. However, some of the leftover segments could actually be unreachable code segments, aka dark code. If the user chooses, those segments can be further analyzed using some simple heuristics to determine whether they are likely code or likely data and treated as such.
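The iterative process described above can be sketched in a few lines of Python. This is an interpretation for illustration, not the author's implementation. It assumes a landing-edge is an address the program counter can arrive at (an entry point or a JSR, JMP, or branch target) and a leaping-edge is an instruction after which execution does not fall through (JMP or RTS here). The opcode table covers only a handful of instructions, and the name `trace_ledges` is hypothetical.

```python
# Hypothetical mini opcode table: opcode -> (length in bytes, kind).
OPS = {
    0xA9: (2, "seq"),     # LDA #imm
    0x8D: (3, "seq"),     # STA abs
    0xE8: (1, "seq"),     # INX
    0xD0: (2, "branch"),  # BNE rel
    0x20: (3, "call"),    # JSR abs
    0x4C: (3, "jump"),    # JMP abs
    0x60: (1, "return"),  # RTS
}


def trace_ledges(image: bytes, origin: int, entry_points):
    """Return (code_addresses, landing_edges) reachable from entry_points."""
    end = origin + len(image)
    code = set()      # addresses of bytes identified as instructions
    landings = set()  # landing-edges examined so far
    work = list(entry_points)
    while work:
        pc = work.pop()
        if pc in landings or not origin <= pc < end:
            continue
        landings.add(pc)
        # Walk forward until a leaping-edge, known code, or an unknown byte.
        while origin <= pc < end and pc not in code:
            opcode = image[pc - origin]
            if opcode not in OPS:
                break
            length, kind = OPS[opcode]
            if pc + length > end:
                break
            code.update(range(pc, pc + length))
            operand = image[pc - origin + 1:pc - origin + length]
            if kind == "branch":
                # Relative offset is a signed byte counted from the next instruction.
                offset = operand[0] - 256 if operand[0] >= 128 else operand[0]
                work.append(pc + length + offset)
            elif kind in ("call", "jump"):
                work.append(operand[0] | (operand[1] << 8))  # little-endian target
            if kind in ("jump", "return"):
                break  # leaping-edge: nothing falls through
            pc += length
    return code, landings


# A made-up 16-byte object loaded at $F000.
image = bytes([
    0xA9, 0x01,        # F000 LDA #$01
    0x20, 0x08, 0xF0,  # F002 JSR $F008
    0x4C, 0x05, 0xF0,  # F005 JMP $F005
    0xE8,              # F008 INX
    0xD0, 0xFD,        # F009 BNE $F008
    0x60,              # F00B RTS
    0x48, 0x49,        # F00C two data bytes
    0xE8, 0x60,        # F00E INX / RTS that nothing calls ("dark code")
])
code, landings = trace_ledges(image, 0xF000, [0xF000])
leftover = [a for a in range(0xF000, 0xF000 + len(image)) if a not in code]
print([hex(a) for a in sorted(landings)], [hex(a) for a in leftover])
```

In this sketch the last four bytes are all reported as leftover. Telling the two data bytes apart from the uncalled INX / RTS pair would be the job of a separate heuristic pass, which is not modeled here.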
How do I get started?
First it should be understood that the tool is meant for the advanced hobbyist with multi-platform skills. The author has attempted to construct a quality product, for his personal machinations if for nobody else, but there has been little consideration for making it accessible to a broad audience and there likely won't be. He simply doesn't have the time! However, it is fully frontally exposed here on GitHub for anyone to use and integrate into their own projects with a very permissive license and no strings attached. The author is motivated purely by the desire to help preserve knowledge of legacy 6502 hardware of all kinds and would be happy to assist similar efforts.
Currently, to use it you will need:
- A 64-bit Linux distro. It has not yet been ported to Windows, Mac or anything else.
- The ability to edit files on Linux if you want to add or modify architecture overlay files.
You can get started right away by downloading just the pre-built executable.
Samples of reports using the ROM and overlay files provided with this project can be found in the samples directory.
To build it you will need:
- A recent 64-bit Linux distro. It was developed and tested on Ubuntu 18.04.1 LTS "bionic".
- git (or compatible), cmake and g++ 7.1 (minimum).
Building HAC/65
- Ensure you have the required prerequisites:
  $ sudo apt-get install git cmake g++
- Clone the project:
  $ git clone https://github.com/dhinson919/hac65.git
- Make the executable:
  $ cd hac65
  $ cmake .
  $ make
- The executable hac65 will be built into the current working directory.
Example session
The following example session illustrates features and uses of the tool.
First, let's run the tool without any args:
$ ./hac65
Error: usage: hac65 [options] object-file
Options:
-h Display this information
-v Display version
-S <digits> Starting position within object
-E <digits> Ending position within object
-A <aro-name> Top architecture overlay
-o <digits> Origin address
-i Illuminate dark code
-R [sfdo] Reporting options
s = segments
f = segment fingerprints
d = disassembly
o = overlays
As you can see it could not continue because of missing command line arguments. Specifically, you must at least supply the path to an object file to analyze. The object file can be located anywhere but if you intend to use architecture overlay files (.aro, see below) they must be located in the tool's working directory.
The project distribution comes with a few sample ROMs in the rom/ directory:
$ ls -l rom
total 40
drwxr-xr-x 2 hac65er hac65er 4096 Nov 4 13:53 ./
drwxr-xr-x 6 hac65er hac65er 4096 Nov 4 23:41 ../
-rw-r--r-- 1 hac65er hac65er 4096 Nov 4 13:53 1050-FLOPOS.rom
-rw-r--r-- 1 hac65er hac65er 4096 Nov 4 13:53 1050-revK.rom
-rw-r--r-- 1 hac65er hac65er 10240 Nov 4 13:53 800antsc.rom
-rw-r--r-- 1 hac65er hac65er 10240 Nov 4 13:53 800apal.rom
These are commonly available ROM images for Atari 400/800 OS "A" systems (NTSC and PAL) and Atari 1050 floppy drives. Both are 6502 architecture systems so they are object file candidates. We'll use them in the following examples.
Let's try to disassemble the 1050-revK revision and see what happens:
$ hac65 rom/1050-revK.rom
Error: encountered an out-of-object address ($FFFA) -- is the origin address set correctly? (see -o option)
HAC/65 will always try to be polite and helpful even when it knows we've done something stupid. In this case we've failed to supply a value for the disassembly origin. Why is this necessary? Notice in the file listing that 1050-revK.rom is 4k bytes in size. But as everybody knows the 6502 has a 64k address space. When provided an object file HAC/65 will, unless told otherwise, assume that the object code had an original starting address of $0000. That would mean the highest object address would be $0FFF (4k bytes starting at $0000). There would not be a problem if that were in fact true, but in this case the analyzer has been asked to resolve the address $FFFA, which is well outside of that address range. What's going on?
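The address arithmetic in the paragraph above can be checked with a short Python sketch. The first part uses only what is stated there: a 4k object with origin $0000 cannot contain $FFFA. The second part rests on an assumption made purely for illustration: that a ROM image is mapped so that it ends at the top of the 64k address space, where the 6502 keeps its hardware vectors ($FFFA through $FFFF). The function names are hypothetical, and the sketch says nothing about how the -o option expects its digits to be written.

```python
ADDRESS_SPACE = 0x10000  # the 6502 addresses 64k bytes


def object_range(origin: int, size: int):
    """First and last address covered by an object of `size` bytes."""
    return origin, origin + size - 1


def in_object(address: int, origin: int, size: int) -> bool:
    """True if `address` falls inside the object loaded at `origin`."""
    first, last = object_range(origin, size)
    return first <= address <= last


def top_of_memory_origin(size: int) -> int:
    """Origin that makes the object end at $FFFF (assumed mapping)."""
    return ADDRESS_SPACE - size


# With the default origin of $0000, a 4096-byte object stops at $0FFF,
# so $FFFA is out of range.
first, last = object_range(0x0000, 4096)
print(f"${first:04X}-${last:04X}", in_object(0xFFFA, 0x0000, 4096))

# Under the top-of-memory assumption the same object would start at $F000.
origin = top_of_memory_origin(4096)
print(f"${origin:04X}", in_object(0xFFFA, origin, 4096))
```

Applied to the 10240-byte images in the rom/ directory, the same assumption would give an origin of $D800.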
To explain this we have to digress for a moment. As mentioned earlier one feature that sets HAC/65 apart from other disassemblers is its use of architecture overlays. These are stackable sets of metadata used to describe a