AVX convert 64 bit integer to 64 bit float

Question

I would like to convert 4 packed 64 bit integers to 4 packed 64 bit floats using AVX. I've tried something like:

int64_t *ls = (int64_t *) _mm_malloc(256, 32);
ls[0] = a;
//...
ls[3] = d;

__m256i packed = _mm256_load_si256((__m256i const *)ls);

The debugger displays this as:

(gdb) print packed
$4 = {1234, 5678, 9012, 3456}

Okay so far, but the only cast/conversion operation that I can find is _mm256_castsi256_pd, which doesn't get me what I want:

__m256d pd = _mm256_castsi256_pd(packed);

(gdb) print pd
$5 = {6.0967700696809824e-321, 2.8053047370865979e-320, 4.4525196003213139e-320, 1.7074908720273481e-320}

What I'd really like to see is:

(gdb) print pd
$5 = {1234.0, 5678.0, 9012.0, 3456.0}

See also: [Best way to load a 64-bit integer to a double precision SSE2 register?](http://stackoverflow.com/q/15569015). Note that if you do not want to make assumptions about (or use ugly hacks to modify) the bits inside a packed-double vector, you can always perform two `CVTDQ2PD`, once using the lower 32-bit and then again using the upper 32-bit, and finally add the packed-double vector together. — rwong, Apr 17 '15 at 04:47
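The split-and-add idea from the comment above can be sketched roughly as follows. This is one possible reading of it, not the commenter's own code: the function name `int64x4_to_double` and the pointer-based interface are made up for illustration, and two details the comment leaves out are filled in as assumptions (the upper half is scaled by 2^32, and the lower half is treated as unsigned by flipping its top bit before the signed conversion). It uses the 256-bit form of CVTDQ2PD (`_mm256_cvtepi32_pd`) and the GCC/Clang `target("avx")` attribute, so it assumes an x86 build with one of those compilers.

```c
#include <immintrin.h>
#include <stdint.h>

/* Convert four int64_t values to four doubles using only AVX instructions.
   Each value v is split as v = hi * 2^32 + lo, where hi is the signed upper
   32 bits and lo is the unsigned lower 32 bits. */
__attribute__((target("avx")))
void int64x4_to_double(const int64_t in[4], double out[4]) {
    __m256i v = _mm256_loadu_si256((const __m256i *)in);

    /* View each 128-bit half as four 32-bit lanes. */
    __m128 a = _mm_castsi128_ps(_mm256_castsi256_si128(v));      /* l0 h0 l1 h1 */
    __m128 b = _mm_castsi128_ps(_mm256_extractf128_si256(v, 1)); /* l2 h2 l3 h3 */

    /* Gather the four low halves and the four high halves. */
    __m128i lo = _mm_castps_si128(_mm_shuffle_ps(a, b, _MM_SHUFFLE(2, 0, 2, 0)));
    __m128i hi = _mm_castps_si128(_mm_shuffle_ps(a, b, _MM_SHUFFLE(3, 1, 3, 1)));

    /* The conversion is signed, but lo is unsigned: flip its top bit
       (which subtracts 2^31 in signed terms) and add 2^31 back afterwards. */
    lo = _mm_xor_si128(lo, _mm_set1_epi32(INT32_MIN));
    __m256d lo_d = _mm256_add_pd(_mm256_cvtepi32_pd(lo),
                                 _mm256_set1_pd(2147483648.0));

    /* Scale the signed upper half by 2^32; this multiplication is exact. */
    __m256d hi_d = _mm256_mul_pd(_mm256_cvtepi32_pd(hi),
                                 _mm256_set1_pd(4294967296.0));

    /* The final addition is the only step that can round. */
    _mm256_storeu_pd(out, _mm256_add_pd(hi_d, lo_d));
}
```

Because both partial results are exact and only the last addition rounds, this should agree with a scalar `(double)` cast under the default rounding mode. Calling it on `{1234, 5678, 9012, 3456}` fills `out` with `{1234.0, 5678.0, 9012.0, 3456.0}`.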

score 5 · Accepted Answer · answered May 13 '13 at 00:05

All of the cast intrinsics perform a bitwise cast, which is why you're not seeing meaningful results with that.

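A vector conversion (the cvt intrinsics) between 64-bit integers and 64-bit floats does not exist in AVX or AVX2. (AVX-512DQ later added one, _mm256_cvtepi64_pd, which for 256-bit vectors also requires AVX-512VL.)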
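To make the difference between a cast and a conversion concrete, here is a small scalar sketch of what happens to a single lane. The helper names `bits_as_double` and `value_as_double` are invented for this illustration; the first mimics what _mm256_castsi256_pd does to each 64-bit element, the second is the numeric conversion the question is after.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the 64 bits of an integer as an IEEE-754 double.
   No numeric conversion happens; only the type of the bits changes. */
static double bits_as_double(int64_t x) {
    double d;
    memcpy(&d, &x, sizeof d);
    return d;
}

/* Numeric conversion: the value is preserved (rounded if necessary). */
static double value_as_double(int64_t x) {
    return (double)x;
}
```

With `x = 1234`, `bits_as_double` returns a tiny denormal (the 6.0967700696809824e-321 seen in the gdb output in the question), while `value_as_double` returns 1234.0.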

Cory Nelson

I was guessing that was the case, cheers for the confirmation. Time to solve the problem differently. – Michael Barker May 13 '13 at 00:31
Also, be aware that you cannot represent the same numbers with a 64-bit int and a 64-bit float. Most numbers in each format do not have an equivalent in the other. 64-bit floats are much bigger/smaller than an int, so you flat out can't even try. Going from int to float the best way possible (not a bitwise cast), you'll get approximations, but don't do anything important with them. – xaxxon May 13 '13 at 00:55
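The representability point can be checked directly. The sketch below uses a hypothetical helper, `survives_round_trip`, to test whether an integer comes back unchanged after a trip through double. A double has a 53-bit significand, so every integer with magnitude up to 2^53 converts exactly; beyond that, only some do.

```c
#include <stdint.h>

/* Returns 1 if converting x to double and back yields x again,
   i.e. x is exactly representable as a double. */
static int survives_round_trip(int64_t x) {
    double d = (double)x;
    /* Guard: values near INT64_MAX round up to 2^63, which is out of range
       for int64_t, so converting that back would be undefined behaviour. */
    if (d >= 9223372036854775808.0) return 0;
    return (int64_t)d == x;
}
```

For example, 1234 and 2^53 survive the round trip, while 2^53 + 1 does not (it rounds to 2^53). For the small values in the question the conversion is exact.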

score 2 · Answer 2 · answered May 13 '13 at 12:03

For what it's worth, I looked in Agner Fog's vectorclass to see how he does it. He simply stores the 64-bit integers to an array and casts each array value to a double. It's inefficient but it works.

From file "vectorf256.h":

// function to_double: convert integer vector elements to double vector (inefficient)
static inline Vec4d to_double(Vec4q const & a) {
    int64_t aa[4];
    a.store(aa);
    return Vec4d(double(aa[0]), double(aa[1]), double(aa[2]), double(aa[3]));
}

// function to_double: convert integer vector to double vector
static inline Vec4d to_double(Vec4i const & a) {
    return _mm256_cvtepi32_pd(a);
}
