I am trying to train a neural network to approximate the sin(x) function, but I want it to generalize outside the range of the training data. Specifically, I train the network on x values within [-π, π] and test it on a disjoint range, such as [π, 2π].
While the model fits the training data well, it fails to extrapolate to the test range. I understand that standard neural networks extrapolate poorly because they essentially interpolate within the training distribution, but I want to explore techniques that might help in this scenario.
I’ve considered using the validation loss as a signal for training (e.g., dynamically modifying the loss function or optimizer), but this might introduce data leakage, as the validation set informs the training process directly.
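Concretely, the kind of thing I have in mind is below (a rough, hypothetical sketch; the model, scheduler settings, and ranges are placeholders, not my actual setup). The validation loss is only ever consumed as a scalar by a learning-rate scheduler, so no gradients from the held-out data reach the weights:

```python
import math
import torch
import torch.nn as nn

# Placeholder model/optimizer; the point is only how the validation loss is used.
model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, factor=0.5, patience=20)
mse = nn.MSELoss()

x_tr = torch.linspace(-math.pi, math.pi, 200).unsqueeze(1)
y_tr = torch.sin(x_tr)
x_va = torch.linspace(math.pi, 1.5 * math.pi, 50).unsqueeze(1)  # held-out range
y_va = torch.sin(x_va)

for epoch in range(500):
    opt.zero_grad()
    mse(model(x_tr), y_tr).backward()
    opt.step()
    with torch.no_grad():                  # validation outputs never produce gradients
        val_loss = mse(model(x_va), y_va)
    sched.step(val_loss)                   # only this scalar influences training
```

Even in this form I'm not sure whether letting the validation loss steer the learning rate already counts as leakage, since the held-out range still influences how training proceeds.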
Here are the constraints and goals:
- No data leakage: The model should not have access to the test/validation outputs directly.
- No cheating: The network should learn the sin(x) function without relying on built-in knowledge of trigonometric functions.
- Generalization focus: The primary goal is to encourage the network to learn a truly generalizable representation of sin(x).
For example, a naive MLP like the sketch below fits [-π, π] almost perfectly but fails on [π, 2π] (the architecture and hyperparameters here are placeholders, not my exact setup):
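```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

# Train on [-pi, pi], test on the disjoint range [pi, 2*pi].
x_train = torch.linspace(-math.pi, math.pi, 1000).unsqueeze(1)
y_train = torch.sin(x_train)
x_test = torch.linspace(math.pi, 2 * math.pi, 500).unsqueeze(1)
y_test = torch.sin(x_test)

# Plain fully connected network with tanh activations.
model = nn.Sequential(
    nn.Linear(1, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(2000):
    optimizer.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    print("train MSE:", loss_fn(model(x_train), y_train).item())  # low
    print("test  MSE:", loss_fn(model(x_test), y_test).item())    # much higher
```

Outside the training range the tanh units saturate, so the output tends toward a roughly constant value instead of continuing the oscillation.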
My Questions:
- Is there a principled way to use the validation loss to guide training without introducing data leakage?
- Are there known techniques or architectures that could improve extrapolation in such tasks?
- Is this even feasible, or are there fundamental limitations in using neural networks for tasks requiring extrapolation?
Any suggestions, ideas, or references to relevant research would be greatly appreciated.
