I need to build a disassembler, any suggestions for resources/examples?
24 comments
[DATA EXPUNGED]
@neauoire I wrote a very simple one for my bytecode interpreter ages ago: https://github.com/uliwitness/Leonie/blob/master/common/LEOInterpreter.c maybe that's of help? I wrote some more complex ones, but they're not public. Basically you iterate over the bytes of compiled code you have, find out what instruction it is based on the first few bits, then, depending on that, parse the rest of that instruction.
@neauoire If you're going to display e.g. "80 00" as "LIT 00" or "#00" rather than the incorrect "LIT BRK", you'll need to have a little lookbehind on the first address on display. If you can find three bytes in a row that are non-zero when ANDed with 0x1f, then you're all set. Otherwise just look behind a reasonable distance and make a guess. Of course, sections that contain data rather than code will be decoded all wrong, so this LIT handling can only be an educated guess.
@alderwick that's the plan for LITs :) I'll make a first naive implementation and report back with my findings, I think I have an idea for labels too.
@neauoire as part of my 8-bit-alike microcomputer using AVR I included a 6502 VM, and the shell has a disassembler written in C: https://github.com/reidrac/dan64/blob/master/dasm/dasm.c Probably buggy, but it is small and it may give you some ideas, perhaps?
For your consideration, a #mos6502 disassembler in #lisp: "cl-6502 is a Common Lisp emulator, assembler and disassembler for the MOS 6502 processor" https://github.com/kingcons/cl-6502#readme (Because if it's in Lisp it might be short and sweet.)
Displaying hex bytes and shorts works pretty well :)
@cancel I'm not sure if I'm totally in love with this, it creates all sorts of little side effects, like editing is weird, mouse picking is weird. I might keep it like LIT2 and somehow display the body of the literal differently.
@cancel actually, yes, beetbug should display it like this. It's very nice when looking at code side by side and seeing the exact same formatting.
I think if there aren't too many editing side-effects, if you can, you should make it display on a single line.
@neauoire I don't think I can collapse them into one line without adding a bunch of code, because right now it just maps one address/byte -> one line. But I could make it display both: primarily the one-line version on the first line, with the raw opcodes/bytes greyed out on the first and second lines.
@cancel yes! that's a good idea. I've had to jump through hoops to increment the line id too, it's the part of that display mode that irks me the most. A fully formatted value on the first LIT byte would be ideal.
@neauoire This looks great! I'll have to attempt a mini version of this for the debugger.
@alderwick feel free to pick bits out of Dexe :) I don't handle the unused opcodes yet, but otherwise the rest should be pretty easy to copy-paste!
@cancel of course not :) This is for something that won't have the option of installing wine.
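For anyone wanting to try the LIT lookbehind described earlier in the thread, here is a rough sketch of how it might look. It assumes an encoding inferred from the examples above and not confirmed here: the low five bits (0x1f) select the opcode, a byte with those bits clear and 0x80 set is a literal opcode, and the 0x20 bit marks a two-byte (LIT2-style) literal. The names `is_lit`, `find_sync` and `classify` are made up for illustration. As noted above, data sections will still be misread, so the result is only an educated guess.

```python
def is_lit(byte):
    # Assumption: literal opcodes have the low five bits clear and 0x80 set
    # (e.g. 0x80 = LIT, 0xa0 = LIT2). 0x00 is BRK, not a literal.
    return byte & 0x1f == 0 and byte & 0x80 != 0


def lit_width(byte):
    # Assumption: the 0x20 bit selects a two-byte literal.
    return 2 if byte & 0x20 else 1


def find_sync(mem, start, max_lookbehind=16):
    """Walk backwards from start looking for three bytes in a row that are
    non-zero when ANDed with 0x1f. None of them can be a literal opcode, so
    the last of the three cannot be a literal operand and is taken as an
    instruction boundary. If nothing is found, fall back to a guess."""
    lo = max(0, start - max_lookbehind)
    for addr in range(start, lo + 1, -1):
        if all(mem[a] & 0x1f for a in (addr - 2, addr - 1, addr)):
            return addr
    return lo  # no sync point in range: just guess


def classify(mem, start, end, max_lookbehind=16):
    """Label each address in [start, end) as 'op' or 'lit' (literal operand)
    by decoding forward from the sync point."""
    addr = find_sync(mem, start, max_lookbehind)
    kinds = {}
    while addr < end:
        byte = mem[addr]
        kinds[addr] = "op"
        addr += 1
        if is_lit(byte):
            # The following one or two bytes are data, not opcodes.
            for _ in range(lit_width(byte)):
                if addr < end:
                    kinds[addr] = "lit"
                    addr += 1
    return {a: k for a, k in kinds.items() if a >= start}


# The view starts at address 4, which holds 0x00. Decoded naively it would
# show as BRK; the lookbehind finds the 0x80 at address 3 and marks it as
# a literal operand instead.
demo = bytes([0x01, 0x02, 0x03, 0x80, 0x00, 0xa0, 0x12, 0x34, 0x01])
print(classify(demo, 4, len(demo)))
```

A display layer could then print a formatted value on each literal's opcode row and grey out the rows marked `lit`, which keeps the one address -> one line mapping intact.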