Deep learning: training on large numbers

When I try to create a model to predict this data, I can't get a good loss. How can I optimize it?

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn.model_selection import train_test_split

X = np.array([32878265.2, 39635188.8, 738222697.41, 33921812.23, 39364408, 50854015, 50938146.63, 54062184.4, 32977734, 27267164, 30673902.72])

y = np.array([80712, 111654, 127836.61, 128710, 147907, 152862, 154962, 138503, 140238, 105121, 113211.8])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)

scaler = preprocessing.StandardScaler().fit(X_train.reshape(-1, 1))
X_scaled = scaler.transform(X_train.reshape(-1, 1))

tf.random.set_seed(42)


model = tf.keras.Sequential([
  # tf.keras.layers.Dense(10),
  tf.keras.layers.Dense(1, input_shape=[1]),
  tf.keras.layers.Dense(1),
])

model.compile(loss=tf.keras.losses.mae,
              optimizer=tf.keras.optimizers.Adam(),
              metrics=["mae"])

model.fit(X_scaled, y_train, epochs=5000,
          validation_data=(X_test, y_test))

Epoch 2466/5000 1/1 [==============================] - 0s 29ms/step - loss: 38000588.0000 - mae: 38000588.0000 - val_loss: 28384532.0000 - val_mae: 28384532.0000
Epoch 2467/5000 1/1 [==============================] - 0s 31ms/step - loss: 38000588.0000 - mae: 38000588.0000 - val_loss: 28384536.0000 - val_mae: 28384536.0000
Epoch 2468/5000 1/1 [==============================] - 0s 41ms/step - loss: 38000588.0000 - mae: 38000588.0000 - val_loss: 28384540.0000 - val_mae: 28384540.0000
Epoch 2469/5000 1/1 [==============================] - 0s 41ms/step - loss: 38000588.0000 - mae: 38000588.0000 - val_loss: 28384536.0000 - val_mae: 28384536.0000


Solution 1

Your NN model is just a linear regression: two Dense(1) layers with no activation compose into a single linear map. When you plot the data, you can see an outlier, which is the main obstacle to a good prediction. (The original answer included a scatter plot of X vs. y; one X value is roughly an order of magnitude larger than the rest.)
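
Since the plot itself isn't reproduced here, a minimal sketch to recreate it and spot the outlier (the labels and title are just illustrative):

import numpy as np
import matplotlib.pyplot as plt

X = np.array([32878265.2, 39635188.8, 738222697.41, 33921812.23, 39364408,
              50854015, 50938146.63, 54062184.4, 32977734, 27267164, 30673902.72])
y = np.array([80712, 111654, 127836.61, 128710, 147907, 152862, 154962,
              138503, 140238, 105121, 113211.8])

# One X value (~7.4e8) dwarfs the others (~2.7e7 to 5.4e7).
plt.scatter(X, y)
plt.xlabel("X")
plt.ylabel("y")
plt.title("Scatter of the training data")
plt.show()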

My guess is that you typed an extra digit in that value.
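
Below is a minimal sketch (not the answerer's code) of how the training could be set up once the data handling is cleaned up: both X and y are standardized with scalers fitted on the training split only, the same scalers are applied to the validation split (the original code passed unscaled X_test to validation_data), and the two stacked Dense(1) layers are collapsed into one. The intended corrected X value isn't known, so the outlier is left in place; the learning rate, epoch count, and random_state are arbitrary choices for illustration.

import numpy as np
import tensorflow as tf
from sklearn import preprocessing
from sklearn.model_selection import train_test_split

X = np.array([32878265.2, 39635188.8, 738222697.41, 33921812.23, 39364408,
              50854015, 50938146.63, 54062184.4, 32977734, 27267164, 30673902.72])
y = np.array([80712, 111654, 127836.61, 128710, 147907, 152862, 154962,
              138503, 140238, 105121, 113211.8])

X_train, X_test, y_train, y_test = train_test_split(
    X.reshape(-1, 1), y.reshape(-1, 1), test_size=0.1, random_state=42)

# Fit scalers on the training split only, then reuse them on the test split.
x_scaler = preprocessing.StandardScaler().fit(X_train)
y_scaler = preprocessing.StandardScaler().fit(y_train)

X_train_s, X_test_s = x_scaler.transform(X_train), x_scaler.transform(X_test)
y_train_s, y_test_s = y_scaler.transform(y_train), y_scaler.transform(y_test)

tf.random.set_seed(42)

# Two stacked Dense(1) layers without activations are still one linear map,
# so a single linear unit is equivalent and simpler.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])
model.compile(loss="mae", optimizer=tf.keras.optimizers.Adam(learning_rate=0.01))

model.fit(X_train_s, y_train_s, epochs=500, verbose=0,
          validation_data=(X_test_s, y_test_s))

# Predictions come back in scaled units; invert the y-scaler to compare
# against the original targets.
pred = y_scaler.inverse_transform(model.predict(X_test_s))
print(pred, y_test)

With both axes standardized, the reported MAE is in scaled units (on the order of 1) rather than the tens of millions seen in the original log; the remaining error is driven mostly by the outlier.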

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Kilian