
I am doing some research into x86 assemblers and came across this Wikipedia article.

It notes that Intel syntax assemblers infer the width of the instruction/operand from the register width.

AT&T syntax assembly mnemonics postfix a letter to the "base" mnemonic to specify the operand width. As with the Intel syntax, the register width can still be specified by the register name.

For example the ax register is 16 bits wide, the eax register is 32 bits wide.

In Intel syntax, the mov instruction width is inferred from the width of the register.

In AT&T, movl is an instruction which operates on 32 bit wide data, mov (?) operates on 16 bit wide data. (At least this is my guess - I have not written anything in AT&T before.)

My question is: what happens if the register width and the operator (instruction suffix) width differ in AT&T syntax assembly code, and one attempts to compile it? Does the operator width take precedence, or is this simply an error, so that the code will not assemble?

Peter Cordes
FreelanceConsultant
  • Why don't you try it to find out? – Erik Eidt Aug 31 '21 at 15:00
  • att is not an assembler nor really a syntax it is a convention of destination last for x86. there are many at&t style assembly languages, and many of them are incompatible with each other. – old_timer Aug 31 '21 at 15:15
  • if you want to know how a tool works or a specific assembly language the assembly language (not talking intel vs at&t those are not unique/complete languages) you need to talk about a specific tool and version of that tool, the assembly language is assumed to be specific to that tool only and not compatible (despite claims from folks like nasm, clang, etc to compatibility with someone elses tool/language). – old_timer Aug 31 '21 at 15:17
  • assemblers assemble. not compile. – old_timer Aug 31 '21 at 15:17
  • Note that even in intel syntax you can use conflicting types and some at&t assemblers can also infer operand sizes. So it's not really an intel vs at&t difference. – Jester Aug 31 '21 at 15:17
  • your question has the same issue be it one of the many intel x86 assembly languages or the many AT&T x86 assembly languages and some of them use the byte ptr type solution and others use a movb type solution, etc. but your question is massively broad and unanswerable as written. – old_timer Aug 31 '21 at 15:18
  • wow that wikipedia article is massively wrong from the first line of text, someone needs to educate that author on the reality of assembly languages. – old_timer Aug 31 '21 at 15:22
  • @old_timer: AT&T *is* a whole syntax; your first comment is wrong. Besides destination-last (something other syntaxes like Plan9 / Goasm share with AT&T), AT&T also specifies the operand-size suffixes, the `%` and `$` decorations on registers and immediates, and the addressing-mode `disp(base, idx, scale)` syntax. Other syntaxes that share only some of these properties might be called "AT&T-like", but they're not AT&T. https://sourceware.org/binutils/docs/as/i386_002dVariations.html documents some of what AT&T syntax implies. – Peter Cordes Aug 31 '21 at 15:54
  • @old_timer: What you said *would* be correct about "Intel syntax" ([tag wiki](https://stackoverflow.com/tags/intel-syntax/info)); that is a whole family of syntaxes. Unless we limit the context to the GNU assembler and clang/LLVM, in which case `.intel_syntax noprefix` is a specific flavour of Intel syntax, but using a more specific name like "GAS Intel syntax" or "GNU Intel syntax" is essential. Again, this is not like AT&T, there is only 1 AT&T syntax, and AT&T assemblers are even [bug-compatible with the original AT&T Unix assembler](//sourceware.org/binutils/docs/as/i386_002dBugs.html). – Peter Cordes Aug 31 '21 at 16:00
  • by definition that cannot be true, the language is specific to the tool not the target as you well know. It only takes a person an afternoon to make this not true. only needs to be one case public or private tool to make this not true. after decades you think this is remotely the case? – old_timer Aug 31 '21 at 19:34
  • we need to educate folks on reality – old_timer Aug 31 '21 at 19:35

1 Answer


Assembly is assembled, not compiled, so “does not compile” is true for assembly in general.

But yes, generally speaking, a mismatched size suffix causes the assembler to reject your code with an error message. Usually this message is helpful, but sometimes it can be misleading. With a poorly written assembler, other results, including nonsensical ones, are possible. For example, some versions of DOS DEBUG.COM assemble

PUSH BYTE AL

into an invalid instruction: the bytes that encoding this instruction would logically produce, even though no such instruction exists. However, modern assemblers generally do not misbehave this way (though assembler bugs are not unheard of).
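As a rough sketch of the mismatch case, assuming the GNU assembler (GAS) in AT&T syntax (the question does not name a tool, so this is only one example): the first three lines use consistent sizes, and the commented-out lines are the kind of mismatch discussed above. As far as I know, a byte register combined with the `l` suffix is always rejected, while a 16-bit register combined with the `l` suffix has been handled differently across GAS versions and modes (an error in some, a warning in others), so it is worth checking with your own assembler.

```asm
# Sketch for GNU as (GAS), AT&T syntax, x86 in 32- or 64-bit mode.
# Assumption: the tool is GAS; other AT&T-style assemblers may differ.
        .text
        movl    %eax, %ebx      # 'l' suffix and 32-bit registers agree
        movw    %ax, %bx        # 'w' suffix and 16-bit registers agree
        mov     %ax, %bx        # no suffix: size inferred from the registers (16 bits)

# Mismatched forms, left commented out so the file assembles:
#       movl    %al, %ebx       # 8-bit source with 'l' suffix: rejected
#       movl    %ax, %ebx       # 16-bit source with 'l' suffix: diagnosed (error or warning)
```

Assembling this with something like `as example.s -o example.o` and then uncommenting one mismatched line at a time shows the diagnostic that your particular version produces.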

Also note that in some cases, seemingly mismatched instruction sizes are permitted. For example, it is permitted to write

movl %ds, %eax

even though %ds is a 16-bit register. This is because the CPU actually supports this operation with a 32-bit operand size, zero-extending %ds into %eax.
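For comparison, here is a small sketch of the forms around this special case, again assuming GAS in AT&T syntax. The comments describe the behaviour on modern CPUs as I understand it; treat them as an illustration rather than a specification.

```asm
# Sketch for GNU as (GAS), AT&T syntax. Assumption: a modern x86 CPU.
        .text
        movl    %ds, %eax       # 32-bit operand size: %eax = zero-extended %ds
        mov     %ds, %eax       # no suffix: 32 bits inferred from %eax, same operation
        movw    %ds, %ax        # 16-bit operand size: only %ax is written
```

The suffix here selects the operand size of the destination write; the segment register itself is always 16 bits wide.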

fuz