It looks like a Discrete Fourier Transform can take N sampled data points and find weights for sinusoids of N different frequencies that exactly recreate the original N sampled data points: https://en.wikipedia.org/wiki/Discrete_Fourier_transform
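To make the "exactly recreate" claim concrete, here is a minimal sketch of the round trip in plain Python. The function names `dft` and `idft` and the sample values are my own, and I'm assuming the convention from the Wikipedia article (negative exponent in the forward transform, a 1/N factor in the inverse).

```python
import cmath

def dft(x):
    """Forward DFT: X[k] = sum over n of x[n] * exp(-2*pi*i*k*n/N)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

def idft(X):
    """Inverse DFT: x[n] = (1/N) * sum over k of X[k] * exp(+2*pi*i*k*n/N)."""
    N = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]

# Arbitrary sample values, chosen only for illustration.
samples = [3.0, -1.0, 4.0, 1.5, -5.0, 9.0, 2.0, -6.0]

# Transform, then transform back, and measure the worst-case mismatch.
rebuilt = idft(dft(samples))
max_error = max(abs(r - s) for r, s in zip(rebuilt, samples))
print(max_error)  # only floating-point rounding error remains
```

The mismatch is at the level of floating-point rounding, not "similar-ish".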
Having done a lot of reading, I have a good understanding of what it is doing, and why it should approximately recreate the input points. But I don't see why it should be exact. Most articles only explain why you would expect the result to be roughly similar.
It measures the "similarity" of the signal to waves of different frequencies by summing (or integrating) the product of the wave and the original signal. But why the product? Sure, when both are strongly positive, or both strongly negative, the product is a large positive value, and when they have opposite signs it is negative. Seems sensible.
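The product-based "similarity" described above can be sketched as follows: multiply the signal sample by sample with a cosine and a sine of frequency k, then add up the results. The helper name `correlate` and the test signal (a pure cosine at frequency 3) are made up for illustration.

```python
import math

def correlate(signal, k):
    """Sum of (signal * cosine) and (signal * sine) at frequency k."""
    N = len(signal)
    cos_sum = sum(s * math.cos(2 * math.pi * k * n / N) for n, s in enumerate(signal))
    sin_sum = sum(s * math.sin(2 * math.pi * k * n / N) for n, s in enumerate(signal))
    return cos_sum, sin_sum

N = 16
# Hypothetical test signal: amplitude 2, exactly 3 cycles over the N samples.
signal = [2.0 * math.cos(2 * math.pi * 3 * n / N) for n in range(N)]

# Score the signal against every frequency from 0 up to N/2.
scores = [correlate(signal, k) for k in range(N // 2 + 1)]
for k, (c, s) in enumerate(scores):
    print(k, round(c, 6), round(s, 6))
```

In this sketch the cosine sum comes out to 16 (that is, amplitude 2 times N/2) at k = 3 and essentially zero at every other frequency, which is the "perfect" behaviour I am asking about.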
...but why not sum/integrate something like the difference between the original signal and a particular sinusoid? Wouldn't that be a more accurate measure of how closely the signal matches that frequency?
But using the product is not only better than using the difference, it's actually perfect! It finds the intensities of the sinusoids needed to exactly recreate the input data points! Why? How?
I understand that it should be possible (N data points and N frequency intensities, so the same N degrees of freedom). But why is the formula involving the product the correct, exact answer?