4

How do assembly compilers compile instructions like:

mov eax, [bx+si]

or

mov eax, [ebp + 4*eax - 40]

disclaimer: I'm still fairly new to this stuff.

I'm having trouble understanding this at my current level of knowledge. I've exhausted Google, but I can only find explanations of what these instructions do, not how they compile. From my current understanding, the address can't be a predetermined calculation.

Does this generate a specific, different opcode and pass in the pieces, with the CPU handling the calculation automagically? If so, I've missed it in all the x86 references I've found, or maybe I'm not looking for the right thing, or not recognizing how the references present it.

Or does it inject a bunch of instructions at compile time to do the math just before the instruction itself? However, various sources use this technique to perform certain multiplications quicker than calling MUL (something along the lines of mov eax, [eax*4 + eax] being equivalent to multiplying by 5, but faster; forgive me if I'm remembering that wrong), which has led me to believe this cannot be how it is handled.

I've also failed to find a good guide on how x86 assembly gets compiled, only resources on programming in it. So if anyone knows any good resources, let me know. :)

Thanks!

dennmat

2 Answers

4

All processors have a set of addressing modes which must be modelled in the architecture definition.

Compiler backends match the intermediate representation against a set of patterns for the available addressing modes. Usually the IR explicitly computes the address; "peephole optimization" then looks for opportunities to merge that calculation into the instruction using a matching addressing mode.

Sometimes optimizers do quite the opposite: they hoist the calculation outside of a loop and then use a simple pointer-dereference addressing mode. The best choice is not obvious; it depends on how close to zero-cost the complex addressing-mode calculations are on the target.

Complex addressing modes can also reduce register pressure, since fewer registers are needed to hold calculated addresses.
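As a concrete sketch of the two shapes (hand-written here, not actual compiler output, with arbitrary register choices), here is an inner loop that sums a dword array, first with the index math folded into the addressing mode, then rewritten in the hoisted, pointer-increment style:

    ; sum of a dword array: base in ebx, index in esi (pre-zeroed), count in ecx, total in eax
    sum_indexed:
        add eax, [ebx + esi*4]   ; address math folded into the addressing mode
        inc esi
        dec ecx
        jnz sum_indexed

    ; same loop after hoisting the address calculation: ebx is now a moving pointer
    sum_pointer:
        add eax, [ebx]           ; simple base-register addressing mode
        add ebx, 4               ; advance the pointer explicitly
        dec ecx
        jnz sum_pointer

Which form is better depends on the target, as noted above; note also that the pointer form destroys ebx, while the indexed form leaves the array base intact.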


If you want more insight into how compilers do what they do, work through this Stanford University online material on compiler implementation. It gets fun around Handout 18 and Lecture 8. I'm going to assume you'll use a parser generator, so the theory of precisely how parsing works can be skipped for now.

doug65536
  • This makes much more sense now. Thank you. – dennmat May 12 '16 at 02:24
  • Awesome, thanks for the reference, combing through a tonne of different sources, sometimes with conflicting information has been tiring. I will for sure use that resource heavily. Thanks again. – dennmat May 12 '16 at 02:40
  • All addressing modes are equally cheap (other than code-size) on x86, except for Intel Sandybridge-family. (i.e. the CPUs everyone cares about). This seems like a good answer to a different question. The OP seemed to be asking about how addressing modes are encoded in machine language, or even if they were pseudo-ops that assemble to multiple instruction. This is a good answer to the question of how compilers choose which addressing modes to use, but not how the compiler output is assembled. – Peter Cordes May 12 '16 at 05:03
  • @PeterCordes Yeah, I answered the question, "how do compilers compile complex addressing modes". I don't remember all that assembler related stuff being there when I originally answered. I answered the first question and assumed the "`lea` trick" for ofs+a+b*2^[0-3] in one instruction was just a tangent unrelated to the actual question. – doug65536 May 12 '16 at 05:17
  • Ah, I see. You must have started answering right away, and the OP snuck in some edits in the 5-min window. When I saw the question, it was clear the OP had just written "compile" when he meant "assemble", and same for the noun, since the asm source code was there. Of course, if people knew enough to ask perfect questions, in most cases they'd already know where to find the answers quickly and easily. :P I prob. won't bother to edit the question just to fix that, though. – Peter Cordes May 12 '16 at 05:25
  • I never edited, but @doug65536 's assumption was correct that the "`lea` trick" was a tangent I went on. But his answer in combination with the comment confirmed that it was the cpu handling this. @PeterCordes your answer definitely added an extra layer of clarity and I appreciate all of it. But I have to ask, is your concern with the word 'compile' vs 'assemble' when specifically talking about assembly purely a semantic one? Or is there ambiguity when using compile in this context? – dennmat May 12 '16 at 14:08
  • @dennmat: If your source code is assembly language, then you assemble it to machine code with an assembler. If your source code is C (for example), then you *compile* it with a *compiler*, and the compiler output might be assembly language or machine code (see http://gcc.godbolt.org/, the compiler explorer). I don't think there's any ambiguity given the rest of the text, just a mis-usage. I don't see any reference to *generating* asm in the question, just how asm turns into machine code (assembling), but apparently you wanted to know the answer to both questions, not just the one you asked. – Peter Cordes May 12 '16 at 19:29
  • Ahh I see. Ya again, still new to this world of asm, so thanks for the heads up. And ya the question was asking about asm->machine code. I figured the best way to clarify my haziness on how a computer fully works from power on to OS, was to code it. So as a little hobby I'm writing a shitty "software" computer in python. Definitely learning a whole lot along the way. So this question popped up as I was making an assembler for the instruction set I made up (very loosely based off x86, and now realizing mine is very incomplete) and am implementing. Thanks for all the help it goes a long way. – dennmat May 13 '16 at 00:48
2

You're thinking of lea, not mov. (mov eax, [eax*4 + eax] is a load.) lea is a shift-and-add instruction that uses the asm syntax and machine-code encoding of addressing modes. And yes, it is worth using to replace multiple other instructions, or to replace an imul.
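A minimal hand-written illustration of the difference (register choice arbitrary):

    lea eax, [eax + eax*4]   ; eax = eax*5: pure arithmetic, no memory access
    mov eax, [eax + eax*4]   ; loads the dword at address eax*5 into eax

Same addressing-mode syntax and encoding in both, but lea stores the computed address itself instead of dereferencing it.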

See Intel's instruction-set manual for the details of how addressing modes are encoded in the machine code. The official manuals contain pretty much everything you'd need to know to write an assembler from scratch, and they're available for free as PDFs. Volume 2 is the instruction-set reference, with the operand-encoding info in an appendix. See also the other links in the tag wiki for tons of stuff, including optimization guides.

See also this summary of the available addressing modes.

Addressing modes with just a base register take only one byte (the ModRM byte), plus an optional disp8 or disp32 displacement. Indexed addressing modes need an extra byte (the Scale-Index-Base byte). The *1, *2, *4, *8 scale factors for the index register are shift counts, encoded in a 2-bit field of the SIB byte.
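Hand-assembling the second instruction from the question shows those pieces (worked out by hand from the rules above, so worth cross-checking against an assembler or disassembler):

    ; mov eax, [ebp + 4*eax - 40]   (32-bit mode) assembles to 4 bytes:
    ;   8B   opcode  MOV r32, r/m32
    ;   44   ModRM   mod=01 (disp8 follows), reg=000 (eax), rm=100 (SIB follows)
    ;   85   SIB     scale=10 (*4), index=000 (eax), base=101 (ebp)
    ;   D8   disp8   -40
    db 0x8B, 0x44, 0x85, 0xD8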

Besides code-size, there's also a performance cost to using indexed addressing modes on Intel Sandybridge-family CPUs: they can't micro-fuse. And on Haswell, the dedicated store-AGU can only handle "simple" addressing modes.

On other current x86 CPUs, all addressing modes perform the same other than code-size, so the address-math is free.

Peter Cordes