Keras very low accuracy, saturates after a few epochs while training
I am very new to the data science domain and jumped directly into TensorFlow models. I have only worked through the examples on the TensorFlow website before; this is my first time building a project with it.
I am building a cricket score predictor using Keras and TensorFlow. I have a CSV of player details with the columns striker, non_striker, bowler, run_per_ball, run_per_ball_avg, and ball_count. The columns ball_count and run_per_ball are the model's labels; the rest are features. The full dataset is 51555 rows x 6 columns; after an 80:20 split, train_dataset is 41244 rows x 6 columns.
Here's my code. There is some extra stuff in it, but you will get the idea.
import pandas as pd
import numpy as np
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
import seaborn as sns
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dense, Dropout, BatchNormalization, Conv2D
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.metrics import categorical_crossentropy
from tensorflow.keras.metrics import KLDivergence
from tensorflow.keras.layers.experimental import preprocessing
df = pd.read_csv('dataset/output_total_run_ball_avg2.csv')
df = df.loc[:, ['striker', 'bowler', 'non_striker', 'run_per_ball', 'run_per_ball_avg', 'ball_count']]
df = df.sort_values(by=['run_per_ball_avg'])
wordList = []
wordMap = {}
def getNumber(word):
    # Map each distinct player name to a stable integer id, assigning a
    # new id the first time a name is seen.
    if word in wordMap:
        return wordMap[word]
    wordIndex = len(wordList)
    wordList.append(word)
    wordMap[word] = wordIndex
    return wordIndex
for name in df['striker'].drop_duplicates():
    df.loc[df['striker'] == name, ['striker']] = getNumber(name)
for name in df['bowler'].drop_duplicates():
    df.loc[df['bowler'] == name, ['bowler']] = getNumber(name)
for name in df['non_striker'].drop_duplicates():
    df.loc[df['non_striker'] == name, ['non_striker']] = getNumber(name)
df['striker'] = df.striker.astype(int)
df['bowler'] = df.bowler.astype(int)
df['non_striker'] = df.non_striker.astype(int)
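# Aside (a sketch, not what I actually ran): I believe the same shared
# name-to-integer mapping could be vectorized with pandas, e.g.:
#   players = pd.unique(df[['striker', 'non_striker', 'bowler']].values.ravel())
#   codes = {name: i for i, name in enumerate(players)}
#   for col in ['striker', 'non_striker', 'bowler']:
#       df[col] = df[col].map(codes)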
df.dtypes
sns.pairplot(df[['striker', 'bowler', 'non_striker', 'run_per_ball', 'run_per_ball_avg', 'ball_count']], diag_kind='kde')
train_dataset = df.sample(frac=0.8, random_state=0)
test_dataset = df.drop(train_dataset.index)
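# sample(frac=0.8, random_state=0) draws a reproducible 80% of the rows for
# training; drop(train_dataset.index) keeps the remaining 20% for testing.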
train_features = train_dataset.loc[:, ['striker', 'bowler', 'non_striker', 'run_per_ball_avg']]
test_features = test_dataset.loc[:, ['striker', 'bowler', 'non_striker', 'run_per_ball_avg']]
train_labels = train_dataset.loc[:, ['ball_count', 'run_per_ball']]
test_labels = test_dataset.loc[:, ['ball_count', 'run_per_ball']]
normalizer = preprocessing.Normalization()
normalizer.adapt(np.array(train_features))
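# Note: preprocessing.Normalization() learns a per-feature mean and variance
# in adapt(), so the integer player ids get standardized as well, even though
# they are arbitrary codes rather than ordered quantities.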
def build_and_compile_model(norm):
    model = keras.Sequential([
        norm,
        Dense(12, activation='relu'),
        BatchNormalization(),  # all arguments were left at their default values
        Dense(64, activation='selu'),
        BatchNormalization(),
        Dense(64, activation='elu'),
        BatchNormalization(),
        Dense(64, activation='selu'),
        BatchNormalization(),
        Dense(64, activation='elu'),
        Dense(1)
    ])
    model.compile(loss='mean_squared_error',
                  optimizer=SGD(learning_rate=0.00001),
                  metrics=['accuracy'])
    return model
dnn_model = build_and_compile_model(normalizer)
dnn_model.summary()
history = dnn_model.fit(
    train_features, train_labels,
    validation_split=0.2,
    verbose=2, epochs=3000)
When I train the model, performance is poor and saturates within a few epochs. Below is a glimpse of the output; the accuracy stays the same for at least the next 1000 epochs.
Epoch 1/3000
1032/1032 - 1s - loss: 15.5479 - accuracy: 0.0984 - val_loss: 13.3904 - val_accuracy: 0.1297
Epoch 2/3000
1032/1032 - 1s - loss: 12.3266 - accuracy: 0.1654 - val_loss: 11.0267 - val_accuracy: 0.2033
Epoch 3/3000
1032/1032 - 1s - loss: 10.3872 - accuracy: 0.2040 - val_loss: 9.4669 - val_accuracy: 0.2104
Epoch 4/3000
1032/1032 - 1s - loss: 9.1706 - accuracy: 0.2088 - val_loss: 8.5238 - val_accuracy: 0.2117
Epoch 5/3000
1032/1032 - 1s - loss: 8.4002 - accuracy: 0.2102 - val_loss: 7.9032 - val_accuracy: 0.2124
Epoch 6/3000
1032/1032 - 1s - loss: 7.9329 - accuracy: 0.2108 - val_loss: 7.5526 - val_accuracy: 0.2127
Epoch 7/3000
1032/1032 - 1s - loss: 7.6496 - accuracy: 0.2110 - val_loss: 7.3502 - val_accuracy: 0.2128
Epoch 8/3000
1032/1032 - 1s - loss: 7.4813 - accuracy: 0.2110 - val_loss: 7.2292 - val_accuracy: 0.2132
Epoch 9/3000
1032/1032 - 1s - loss: 7.3916 - accuracy: 0.2110 - val_loss: 7.1537 - val_accuracy: 0.2135
Epoch 10/3000
1032/1032 - 1s - loss: 7.3251 - accuracy: 0.2111 - val_loss: 7.1124 - val_accuracy: 0.2136
Epoch 11/3000
1032/1032 - 1s - loss: 7.3063 - accuracy: 0.2111 - val_loss: 7.0945 - val_accuracy: 0.2137
Epoch 12/3000
1032/1032 - 1s - loss: 7.2791 - accuracy: 0.2111 - val_loss: 7.0772 - val_accuracy: 0.2139
Here is the pairplot of my data from seaborn.
Things I have already tried after reading various sources:
- Tried optimizers: Adam and SGD, with learning rates from 0.001 down to 0.000001.
- Tried loss functions: mean_absolute_error, mean_squared_error (mse), categorical_crossentropy (one of these variants is sketched after this list).
- Normalised the inputs
- Mapped player names to individual integers
- Added/removed hidden layers (2 to 4)
- Used batch normalization
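For example, one variant of the compile call inside build_and_compile_model that I tried looked roughly like this (sketched from memory; the exact hyperparameters varied between runs):
model.compile(loss='mean_absolute_error',
              optimizer=Adam(learning_rate=0.001),
              metrics=['accuracy'])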
I have done a lot of tweaking, but nothing has helped so far. I have tried every method I found suggested online, so maybe I am doing something silly. Any help would mean a lot.
Those are my findings so far; if any other details are needed, I will post them here. Thanks to everyone who is willing to help.