The Rosetta 2 instruction size expansion factor for an sqlite3 binary is ~1.64x (1.05MB of x86 instructions vs 1.72MB of ARM instructions). Surprisingly good, especially given Firestorm cores have six times the instruction cache of Ice Lake (192KiB vs 32KiB).
Something I'm not sure I said is that the goal is to have a single, equivalent ARM instruction for each x86 instruction. And, in real-world code, combining all those tricks allows Rosetta 2 to achieve that surprisingly often.
I said that typically, converting an x86 instruction to ARM will require an expansion, and I stand by it, but some of the counter-examples are rather entertaining. A lot of x86 instructions carry 32-bit immediates, and these take far fewer bits in the ARM encoding when most of those bits are unused.
For example, this 11-byte instruction carries two 32-bit fields, a displacement (1D8h) and an immediate (0):
48 C7 83 D8 01 00 00 00 00 00 00 | mov qword ptr [rbx+1D8h], 0
And gets translated to:
7F EC 00 F9 | str xzr, [x3,#0x1D8]
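To see where the 11 bytes go, here is a minimal Python sketch that splits the x86 encoding into its fields and rebuilds the ARM bytes from the standard AArch64 "STR (immediate, unsigned offset)" layout. The helper name encode_str_x_uimm is made up for illustration, and this only shows the arithmetic of the two encodings; it says nothing about how Rosetta 2 is actually implemented.

```python
import struct

def encode_str_x_uimm(rt: int, rn: int, byte_offset: int) -> bytes:
    """Encode AArch64 `STR Xt, [Xn, #byte_offset]` (64-bit, unsigned offset).

    Hypothetical helper for illustration only.
    """
    if byte_offset % 8 or not 0 <= byte_offset // 8 < 4096:
        raise ValueError("offset must be a multiple of 8 in [0, 32760]")
    imm12 = byte_offset // 8  # the offset is stored scaled by the 8-byte access size
    word = 0xF9000000 | (imm12 << 10) | (rn << 5) | rt
    return struct.pack("<I", word)  # one little-endian 32-bit instruction word

XZR = 31  # as the store's data register, number 31 means the zero register

# The x86 bytes from the example above: REX.W, opcode C7, ModRM, disp32, imm32.
X86_MOV = bytes.fromhex("48C783D801000000000000")
rex, opcode, modrm = X86_MOV[0], X86_MOV[1], X86_MOV[2]
disp32 = int.from_bytes(X86_MOV[3:7], "little")   # 0x1D8, the [rbx+1D8h] offset
imm32 = int.from_bytes(X86_MOV[7:11], "little")   # 0, the value being stored

# rbx is x3 in the translated code shown above; storing 0 becomes storing xzr.
arm = encode_str_x_uimm(XZR, 3, disp32)
print(arm.hex(" ").upper())                       # 7F EC 00 F9
print(len(X86_MOV), "bytes ->", len(arm), "bytes")
```

Eight of the eleven x86 bytes are the two 32-bit fields. On the ARM side the displacement shrinks to a 12-bit scaled field (1D8h / 8 = 3Bh), and the zero immediate disappears entirely because the zero register supplies it.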