Let's say we are working on x86_64 and have the string "123456". As ASCII bytes (in hex, including the terminating zero byte), it becomes 31 32 33 34 35 36 00.

Now, which assembly instructions should I use to move the entire (even if fragmented) content of this string somewhere in a way that %rdi stores the address of that string (points to that)?

Since I can't simply move the hex representation of the string into a register, as one can with unsigned values, how do I do it?

Peter Cordes
  • 1
    Where does the string come from in the first place? If it's a literal you can probably just load the address of it. – 500 - Internal Server Error Nov 24 '21 at 12:37
  • 2
    I don't understand the last part of your question. You absolutely can move this string into a register, just like you can with integers. It will fit in a 64-bit register. – paddy Nov 24 '21 at 12:37
  • 3
    What is the *actual* problem you need to solve? What is the reason you need to "move the entire content"? Right now it feels very much like an [XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) – Some programmer dude Nov 24 '21 at 12:40
  • It's not quite clear what you want, please [edit] and show the value `rdi` should contain for the string `"123456"`. The question title doesn't really match the question. Are you talking about the _address_ of the string you want in the register or the _content_ of the string you want in the register? – Jabberwocky Nov 24 '21 at 12:43
  • If you play a little with the compiler and disassemble the output, the compiler will give you the answer. – alinsoar Nov 24 '21 at 12:49
  • 1
    What do you mean, the "hex representation" of a string? You mean in the asm source, that gets assembled to binary, like `mov eax, 0x123456`? Like compilers do to initialize small strings with mov-immediate https://godbolt.org/z/dKnTz8b9G ? (Or do you mean you want bytes that represent ASCII characters of an unsigned number, like you can get at runtime from [How to convert a binary integer number to a hex string?](https://stackoverflow.com/q/53823756)) – Peter Cordes Nov 24 '21 at 13:03

1 Answer

There are a couple of ways to do this. If you want to copy the entire string to another location first, you can do so with a simple byte-by-byte loop:

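xor ebx, ebx                  ; index = 0 (writing ebx also zeroes the upper half of rbx)
copy_loop:                    ; renamed from "loop", which is also an instruction mnemonic
 mov al, [string+rbx]         ; load one byte from the source string
 mov [copyoffset+rbx], al     ; store it at the same index in the destination
 inc rbx
 cmp al, 0x0                  ; the terminating 0 byte has been copied too
 jne copy_loop
... Insert other code here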

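Then you can use the lea instruction described below to load the address of the copy into rdi. Note that the loop itself only copies bytes in memory; it does not put anything string-related into rdi.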
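For a string this short, a loop is not strictly needed. As the comments point out, the seven bytes fit in one 64-bit register, so they can be loaded as an immediate and stored in one go. The following is only a sketch under assumptions that are not in the question: NASM syntax, a Linux x86-64 target, and the stack as the destination. The `write` and `exit` system calls are there only to make it a complete program.

```nasm
; Assumptions: NASM syntax, Linux x86-64, built with
;   nasm -f elf64 imm.asm && ld -o imm imm.o
section .text
global _start
_start:
    mov   rax, "123456"     ; rax = 0x0000363534333231, the two high bytes are 0
    push  rax               ; memory at rsp now holds 31 32 33 34 35 36 00 00
    mov   rdi, rsp          ; rdi = address of the copy on the stack

    ; Use the pointer: write(1, buf, 6) sends the six characters to stdout.
    mov   rsi, rdi          ; buf = address of the string
    mov   edi, 1            ; fd = stdout
    mov   edx, 6            ; count = 6 bytes, without the terminator
    mov   eax, 1            ; Linux x86-64 syscall number for write
    syscall

    xor   edi, edi          ; exit status 0
    mov   eax, 60           ; Linux x86-64 syscall number for exit
    syscall
```

Because x86-64 is little-endian, the lowest byte of rax (0x31, the character '1') ends up at the lowest address, so the bytes appear in memory in string order. This only works for strings of at most eight bytes including the terminator; anything longer needs several stores or a copy routine.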

If you just want to load the address of the string and don't care about moving it, you can just use lea:

lea rdi, [stringoffset]
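To show the lea approach in context, here is a minimal complete program. It is a sketch that assumes NASM syntax and a Linux x86-64 target, neither of which the question specifies; the label `msg` and the length-counting code are made up for illustration. `default rel` makes the lea RIP-relative, which is the usual choice in 64-bit code.

```nasm
; Assumptions: NASM syntax, Linux x86-64, built with
;   nasm -f elf64 demo.asm && ld -o demo demo.o
default rel                 ; make [label] operands RIP-relative

section .rodata
msg:    db "123456", 0      ; bytes 31 32 33 34 35 36 00

section .text
global _start
_start:
    lea   rdi, [msg]        ; rdi = address of the string (no bytes are copied)

    ; Use the pointer: count the bytes before the terminating 0.
    xor   eax, eax          ; rax = length so far
.count:
    cmp   byte [rdi+rax], 0
    je    .done
    inc   rax
    jmp   .count
.done:
    mov   edi, eax          ; exit status = length of the string
    mov   eax, 60           ; Linux x86-64 syscall number for exit
    syscall
```

After running it, the shell's `$?` holds the string length, which confirms that rdi really pointed at the string data.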

Edit: Changed rax to al so we only move one byte at a time

Zopazz
  • 1
    The first part doesn't move anything into a register, it only copies the string into copyoffset – Jabberwocky Nov 24 '21 at 12:51
  • You're using (RIP-relative?) LEA in your 2nd code block, but in your first code block you're assuming that the address will fit in a 32-bit absolute sign-extended disp32 as part of the addressing mode. (So you can use it with another register.) If that's the case, you should use `mov edi, OFFSET stringoffset` to put the address in RDI. (Unless this is for a high-half kernel so it fits in 32-bit sign-extended but not 32-bit zero-extended). See [How to load address of function or label into register](https://stackoverflow.com/q/57212012) – Peter Cordes Nov 24 '21 at 12:55
  • Also, as @Jabberwocky says, it's not clear what the point of showing a naive unoptimized strcpy loop is. The question's example is about a short string that could be copied with a pair of overlapping dword loads/stores, like glibc memcpy would do. If you're writing asm by hand, normally you care about performance so you'd use SSE2 `movdqu` / `pcmpeqb` / `pmovmskb` to check for end of string, with some alignment handling... see glibc's `strcpy` or other examples of optimized strcpy. – Peter Cordes Nov 24 '21 at 12:59
  • Of course this can be easily optimized. It's just an example of a simple way to do so, in the way I understood the question. If he includes libc he could literally just do a call to strcpy. It would also be way better to use the stack or the heap for this than some rw part of the binary. But with the way the question was asked, it seemed like he's new and just needs a simple example he can understand. – Zopazz Nov 24 '21 at 13:19
  • 1
    you should also change `cmp rax, 0x0` – Tommylee2k Nov 24 '21 at 16:14