An attempt at a disassembler for MicroPython bytecode, largely inspired by DC540's MicroPython Challenge.
$ python ./mdis.py -h
| argument | type | description |
|---|---|---|
| -h, -help | none | show the help menu |
| -b INTEGER | int to format | format an integer into the micropython bytecode format |
| -op INTEGER | int to get opcode of | integer |
| -f FILE | file | supply a file to print disassembly for |
| -fr FROM | 0 offset addr | supply a from address for instruction dump. constraint: must be a 0 offset address. requires to address (-t). |
| -t TO | 0 offset addr | supply a to address for instruction dump. constraint: must be a 0 offset address. requires from address (-fr). |
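The option table above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the actual contents of `mdis.py`: the function names `build_parser` and `range_is_valid`, the `dest` names, and the choice to parse addresses as decimal integers are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the option table (illustrative sketch only)."""
    # add_help=False lets us declare -h / -help ourselves with single dashes.
    parser = argparse.ArgumentParser(prog="mdis.py", add_help=False)
    parser.add_argument("-h", "-help", action="help", help="show the help menu")
    parser.add_argument("-b", metavar="INTEGER", type=int,
                        help="format an integer into the MicroPython bytecode format")
    parser.add_argument("-op", metavar="INTEGER", type=int,
                        help="int to get opcode of")
    parser.add_argument("-f", metavar="FILE",
                        help="supply a file to print disassembly for")
    # The dest names below are hypothetical, chosen for readability.
    parser.add_argument("-fr", dest="from_addr", metavar="FROM", type=int,
                        help="0-offset from address (requires -t)")
    parser.add_argument("-t", dest="to_addr", metavar="TO", type=int,
                        help="0-offset to address (requires -fr)")
    return parser


def range_is_valid(args):
    """-fr and -t must be given together, or both omitted."""
    return (args.from_addr is None) == (args.to_addr is None)
```

A caller would parse the arguments and reject a lone `-fr` or `-t`, for example with `parser.error(...)` when `range_is_valid(args)` returns `False`.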
There are few contribution guidelines. Contributions should follow a consistent code and documentation style, and should describe their changes in detail.
- cmd - cli commands
- core - disassembler core
- mio - base io / file io
- logger - logging
- parser - file parsing
- reader - file reading
- hexdump [3.3]
- termcolor [1.1.0]
- six [1.16.0]
There are a few things that are important to know with this very experimental project!
- The aforementioned 0-offset addresses are addresses that have no offset, similar to the addresses seen in an xxd dump.
- The bytes parsed are big-endian.
- Sometimes it won't dump an entire file.
- MicroPython deals with strings as shown here on page 139. This is currently unsupported.
- There is a hidden parser feature that is currently unused; it prints out EVERY SINGLE HEX BYTE in the file, without addresses.
- This project essentially provides a level of "pseudocode", but with the actual MicroPython instructions. Because of the limitations listed above, the output is not fully accurate, but it is reasonably close.
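To make the notes on 0-offset addresses and byte order concrete, here is a minimal sketch of an xxd-style dump over a from/to range, plus a big-endian read. The helper names `dump` and `read_be` and the sample bytes are hypothetical; they are not taken from this project's code.

```python
def dump(data, start=0, end=None, width=16):
    """Return xxd-style lines for data[start:end], labelled with 0-offset addresses."""
    end = len(data) if end is None else end
    lines = []
    for addr in range(start, end, width):
        chunk = data[addr:min(addr + width, end)]
        hex_bytes = " ".join(f"{b:02x}" for b in chunk)
        lines.append(f"{addr:08x}: {hex_bytes}")
    return lines


def read_be(data, addr, size):
    """Read `size` bytes at a 0-offset address as a big-endian unsigned integer."""
    return int.from_bytes(data[addr:addr + size], "big")


sample = bytes(range(0x20))  # placeholder bytes, not a real .mpy file
for line in dump(sample, start=0x10, end=0x18):
    print(line)  # 00000010: 10 11 12 13 14 15 16 17
print(hex(read_be(sample, 0x10, 2)))  # 0x1011
```

Addresses here are counted from the first byte of the file, which is how the `-fr` and `-t` arguments are described above.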
The file print.mpy is compiled with mpy-cross from a file containing the following code:
print('hi')
Since it originates from a Python file, we can look at the CPython disassembly of the original source for comparison:
"""
2 0 LOAD_NAME 0 (print)
2 LOAD_CONST 0 ('hi')
4 CALL_FUNCTION 1
6 POP_TOP
8 LOAD_CONST 1 (None)
10 RETURN_VALUE
"""
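A listing like the one above can be produced with the standard-library `dis` module. This is a small sketch; the exact opcodes and offsets vary between CPython versions, so newer interpreters may print a somewhat different sequence.

```python
import dis

source = "print('hi')"
code = compile(source, "<example>", "exec")

# Print the human-readable listing for the compiled module code.
dis.dis(code)

# Collect (opcode name, argument value) pairs for programmatic comparison.
ops = [(ins.opname, ins.argval) for ins in dis.get_instructions(code)]
```

The `ops` list is convenient when comparing the CPython instruction stream against a disassembly of the corresponding .mpy file.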
Disassembling the .mpy file, we get this output:
Similar, but not entirely accurate.
This output shows each opcode's hex value on the left. These values correspond to the opcode definitions in bc0.h.
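One way to relate hex values to opcode names is to build a lookup table from `#define` lines in a C header. The sketch below assumes each definition is a parenthesised hex literal, optionally added to a previously defined base name. The sample text, the `EX_` names, and their values are placeholders and are not the contents of bc0.h.

```python
import re

# Matches lines such as "#define NAME (0x11)" or "#define NAME (BASE + 0x01)".
DEFINE_RE = re.compile(
    r"#define\s+(\w+)\s+\(\s*(?:(\w+)\s*\+\s*)?(0x[0-9a-fA-F]+)\s*\)"
)


def parse_defines(header_text):
    """Map each defined name to its integer value, resolving 'BASE + offset' forms."""
    values = {}
    for line in header_text.splitlines():
        match = DEFINE_RE.match(line.strip())
        if not match:
            continue
        name, base, literal = match.groups()
        value = int(literal, 16)
        if base is not None:
            if base not in values:
                continue  # base not defined earlier; skipped in this sketch
            value += values[base]
        values[name] = value
    return values


def opcode_names(values, prefix):
    """Invert the mapping so that a byte value looks up its opcode name."""
    return {v: k for k, v in values.items() if k.startswith(prefix)}


# Hypothetical header excerpt: names and values are placeholders only.
SAMPLE = """
#define EX_BASE_A (0x10)
#define EX_OP_FIRST (EX_BASE_A + 0x01)
#define EX_OP_SECOND (0x63)
"""

table = opcode_names(parse_defines(SAMPLE), prefix="EX_OP_")
print(table.get(0x11))  # EX_OP_FIRST
```

Because opcode values can differ between MicroPython releases, a table generated from the header of the targeted version is less likely to drift than one typed in by hand.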