
I am using TensorFlow to train a simple neural network (3 sequential dense layers). The problem is that the accuracy changes a lot every time I retrain it from scratch. I understand that, since the weights are initialized randomly, it will not always arrive at exactly the same accuracy; but I get a range of about 4% in the accuracy on the test set.
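
For reference, the model is essentially of the following shape (an illustrative sketch only; the layer sizes, activations, and data here are placeholders, not my exact setup):

import numpy as np
import tensorflow as tf

# Placeholder data just to make the sketch runnable; the real dataset is different.
n_features, n_classes = 20, 3
X_train = np.random.rand(1000, n_features)
y_train = tf.keras.utils.to_categorical(np.random.randint(n_classes, size=1000), n_classes)

# Three sequential dense layers.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="mse", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=10, validation_split=0.2)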

This variation makes it impossible to tell whether a different network configuration or a different preprocessing step performs better or worse because it is genuinely a good/bad idea, or just because I got lucky/unlucky with the random initial weights.

This is an example of the accuracies from 5 consecutive train+test runs. The columns are:

  1. Accuracy on the train set
  2. Accuracy on the validation split (20%)
  3. Accuracy on the test set
  4. MSE
        1.  2.  3.  4.
Run #1: 95  91  74  20
Run #2: 94  92  75  18
Run #3: 94  91  74  20
Run #4: 94  92  73  20
Run #5: 94  91  77  17

I also find it confusing that there is no correlation among the training, validation, and test accuracies.

I have tried different configurations of the ANN, longer and shorter training runs, bigger and smaller validation splits, different optimizers... nothing seems to give me a more stable accuracy across retrainings. The numbers posted here are the best I could get.

Is a 4% range normal? Is there a way to avoid these sub-optimal training runs? Could it be a problem related to local minima?

Mr. Goferito

3 Answers


I suggest applying a non-random weight initialisation in order to see the impact of the random initialization.

For instance, you can use the Nguyen-Widrow weight initialization.

import numpy as np

def initnw(layer):
    """
    Nguyen-Widrow initialization function

    :Parameters:
        layer: core.Layer object
            Initialization layer
    """
    ci = layer.ci
    cn = layer.cn
    w_fix = 0.7 * cn ** (1. / ci)
    w_rand = np.random.rand(cn, ci) * 2 - 1

    # Normalize
    if ci == 1:
        w_rand = w_rand / np.abs(w_rand)
    else:
        w_rand = np.sqrt(1. / np.square(w_rand).sum(axis=1).reshape(cn, 1)) * w_rand

    w = w_fix * w_rand
    b = np.array([0]) if cn == 1 else w_fix * np.linspace(-1, 1, cn) * np.sign(w[:, 0])

    # Scale to inp_active
    amin, amax = layer.transf.inp_active
    amin = -1 if amin == -np.Inf else amin
    amax = 1 if amax == np.Inf else amax

    x = 0.5 * (amax - amin)
    y = 0.5 * (amax + amin)
    w = x * w
    b = x * b + y

    # Scale to inp_minmax
    minmax = layer.inp_minmax.copy()
    minmax[np.isneginf(minmax)] = -1
    minmax[np.isinf(minmax)] = 1

    x = 2. / (minmax[:, 1] - minmax[:, 0])
    y = 1. - minmax[:, 1] * x
    w = w * x
    b = np.dot(w, y) + b

    layer.np['w'][:] = w
    layer.np['b'][:] = b

    return

Source: https://pythonhosted.org/neurolab/_modules/neurolab/init.html
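
Note that the function above comes from the neurolab library. If you are working in TensorFlow/Keras, a simpler way to get a comparable effect (a fixed, reproducible initialization) is to pass seeded initializers to each layer. This is only a sketch of the idea with seeded Glorot initializers, not the exact architecture from the question:

import tensorflow as tf

# Sketch: each Dense layer gets a Glorot initializer with a fixed seed, so every
# rebuild of the model starts from identical weights.
def build_model(n_features, n_classes, seed=0):
    return tf.keras.Sequential([
        tf.keras.layers.Dense(
            64, activation="relu", input_shape=(n_features,),
            kernel_initializer=tf.keras.initializers.GlorotUniform(seed=seed)),
        tf.keras.layers.Dense(
            32, activation="relu",
            kernel_initializer=tf.keras.initializers.GlorotUniform(seed=seed + 1)),
        tf.keras.layers.Dense(
            n_classes, activation="softmax",
            kernel_initializer=tf.keras.initializers.GlorotUniform(seed=seed + 2)),
    ])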

On the other hand, remember that the optimization algorithms used to find a (local) minimum (gradient descent, the Adam optimizer, etc.) have some stochastic behaviour of their own, for example in the choice of the starting point or in the random shuffling and sampling of mini-batches.
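
A practical way to see how much of the run-to-run variation comes from these random factors is to fix the relevant seeds before each training run. A minimal sketch, assuming TensorFlow 2.x:

import random
import numpy as np
import tensorflow as tf

def set_global_seeds(seed=0):
    # Fix the Python, NumPy and TensorFlow random generators so that weight
    # initialization and data shuffling are repeatable across runs.
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)

set_global_seeds(0)
# ... build and train the model as usual; two runs with the same seed should now
# start from the same weights and see the training batches in the same order.

On a GPU, some operations can still be non-deterministic unless deterministic ops are enabled, so seeding reduces, but may not completely eliminate, the variation.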

Nicolas Martin

A possible cause of the problem is that you are using the mean squared error (MSE) as the loss function for a classification problem.

Normally, for classification you would use categorical cross-entropy.
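
Concretely, that just means compiling the model with a cross-entropy loss instead of MSE. A minimal sketch, assuming one-hot encoded labels and illustrative layer sizes:

import tensorflow as tf

# Sketch: the same kind of 3-dense-layer model, compiled with cross-entropy.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",   # use "sparse_categorical_crossentropy" for integer labels
    metrics=["accuracy"],
)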

noe

If you're getting 95% accuracy on the training set but only 75% on the test set, this points to serious overfitting, which none of the measures you've listed are likely to address.

It's also suspicious that the validation results are so close to the training results but far from the test results. This often happens when the validation set changes between trainings, meaning there's effectively no validation set at all, or when you keep training over and over until you obtain the desired accuracy on the validation set, which is also a recipe for overfitting.
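
One way to rule this out is to carve the validation set out once, with a fixed random state, and pass it explicitly instead of relying on a fresh split every run. A sketch, assuming NumPy arrays and scikit-learn (X_train, y_train and model as in the question):

from sklearn.model_selection import train_test_split

# Split once, with a fixed random_state, so the validation set is identical in
# every run; the test set stays untouched until the very end.
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=0)

model.fit(X_tr, y_tr, validation_data=(X_val, y_val), epochs=50)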

IMil