A completely general solution for this is difficult to achieve if a wider floating-point format with at least twice the number of significand bits is not available. Normally, on common system platforms using C or C++ one could just use float rem = (float) fmod ((double)a * (double)b, 360.0); with float mapped to IEEE-754 binary32 type, double mapped to IEEE-754 binary64 types and therefore both the multiplication and fmod() being exact, with only a single rounding error incurred when rounding the result back to float.
A high-performance implementation is possible in the absence of double support if a few parameters and restrictions can be enforced. This answer assumes that a C or C++ program is executed on a system platform that supports IEEE-754 arithmetic and with the binary32 data type exposed as float, and that the default rounding mode in effect is "to nearest or even". It is further assumed that the fused multiply-add (FMA) operation is provided in hardware, and exposed (for the binary32 type) via the standard math function fmaf(). This latter assumption holds for most current systems based on the ARM, x86, and Power processor architectures, as well as commonly used GPUs.
It is further assumed that the product $a \cdot b$ is restricted to $[-360 \cdot2^{22}, 360 \cdot 2^{22}]$, approximately $[-1.5\cdot 10^{9}, 1.5\cdot 10^{9}]$, such that $\lfloor \frac{a \cdot b}{360}\rfloor$ is exactly representable in a float number. With this restriction in place, the desired result can be computed with an error of at most 2 ulp as follows.
With the help of FMA and the reciprocal $\frac{1}{360}$ one can quickly compute $\mathrm{nint}(\frac{a \cdot b}{360})$, where $\mathrm{nint}(\cdot)$ is the nearest integer function, and implement Kahan's accurate difference-of-products computation $ab-cd$. The remainder is computed as $a\cdot b - \mathrm{nint}(\frac{a \cdot b}{360}) \cdot 360$. To match fmod() semantics, the remainder must have the same sign as the dividend $ab$. This can be achieved by conditionally adding / subtracting $360$ if the condition is not met.
The final float result may round to exactly $360$. If this is problematic, an additional check should be added to the algorithm below to map this to zero.
/*
Compute a*b-c*d with error <= 1.5 ulp.
Claude-Pierre Jeannerod, Nicolas Louvet, and Jean-Michel Muller,
"Further Analysis of Kahan's Algorithm for the Accurate Computation
of 2x2 Determinants", Mathematics of Computation, Vol. 82, No. 284,
Oct. 2013, pp. 2245-2264
/
float diff_of_products_kahan (float a, float b, float c, float d)
{
float w = d c;
float e = fmaf (c, -d, w);
float f = fmaf (a, b, -w);
return f + e;
}
/* Fast computation of fmod (a * b, 360). This is only accurate as long as the
quotient round (a * b / 360) can be represented accurately by a 'float'. It
is possible for the final result to equal 360 due to rounding error. If this
is problematic, an additional correction could be added.
/
float fmod360 (float a, float b)
{
const float divisor = 360.0f;
const float rcp_divisor = 2.77777777e-3f; // 1 / 360
const float int_cvt_magic = 0x1.8p23f; // 223 + 2*22
float prod, revs, rem;
prod = a * b;
/* approximate quotient and round it to the nearest integer */
revs = fmaf (prod, rcp_divisor, int_cvt_magic);
revs = revs - int_cvt_magic;
/* compute preliminary remainder: a * b - revs * 360 */
rem = diff_of_products_kahan (a, b, revs, divisor);
/* remainder may have wrong sign as quotient rounded; fix as necessary */
if ((rem * prod) < 0.0f) { // rem does not have same sign as product a*b
rem = rem + copysignf (divisor, prod);
}
return rem;
}