What are some general tips to improve my MNIST classifier?

I have built a CNN from scratch in Python using NumPy to tackle the MNIST handwritten digit recognition problem. It consists of a convolutional layer (three 3x3 filters), a max-pooling layer (2x2 pooling) and the 10-label output layer. I'm using softmax as the output activation function and cross-entropy as the loss function. I've tried running it with a couple of different hyperparameters and so far the best accuracy I've gotten is 97%, when training on the whole train dataset (60000 …
Category: Data Science

MNIST - Vanilla Neural Network - Why Cost Function is Increasing?

I've been combing through this code for a week now trying to figure out why my cost function is increasing as in the following image. Reducing the learning rate does help but very little. Can anyone spot why the cost function isn't working as expected? I realise a CNN would be preferable, but I still want to understand why this simple network is failing. Please help:) import numpy as np import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data import matplotlib.pyplot …
Category: Data Science

Federated Learning: some clients with 0% accuracy

Suppose that I am doing a Federated Learning experiment using MNIST. As you know, MNIST has 10 classes. Federated Learning is especially useful in cases like collaborations between hospitals, because one hospital can have samples from different classes than another hospital. So I want to reproduce this non-IID setting. Suppose that I have 2 clients: the first client takes the first 5 digits of MNIST (0, 1, 2, 3 and 4) and the second client takes the last digits (5, …
Category: Data Science
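
The label-based split described in this question can be sketched roughly as follows. This is only an illustration under stated assumptions: `split_by_label` is a hypothetical helper, and the labels are random stand-ins for the real MNIST training labels, not the asker's data or code.

```python
import numpy as np


def split_by_label(labels, client_classes):
    """Return, for each client, the indices of samples whose label is in that client's class list."""
    return [np.flatnonzero(np.isin(labels, classes)) for classes in client_classes]


# Stand-in for the MNIST training labels (assumption: random digits 0-9).
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)

# Client 0 holds digits 0-4, client 1 holds digits 5-9, as in the question.
client_classes = [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
client_indices = split_by_label(labels, client_classes)

for client_id, idx in enumerate(client_indices):
    print(client_id, len(idx), np.unique(labels[idx]))
```

With this kind of split, each client's local data contains none of the other client's digits, which is the non-IID setting the question sets out to reproduce.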

MNIST data shape

While going through different tutorials on CNNs, autoencoders, and so on, I practised on the MNIST problem. The images are stored in a 3D array whose shape is (60000, 28, 28). In some tutorials, the first layer of the CNN uses the Flatten function keras.layers.Flatten(input_shape=()), but in other tutorials they transform the 3D array into a 4D array (60000, 28, 28, 1), which I suppose is identical to using the Flatten function? Am I right? Why are there two …
Category: Data Science
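
The two shapes mentioned in this question can be compared with a small NumPy sketch. The array here is a zero-filled placeholder with 100 images rather than the real 60000, and the variable names are illustrative assumptions.

```python
import numpy as np

# Placeholder for a batch of MNIST images (assumption: 100 images instead of 60000).
images = np.zeros((100, 28, 28), dtype=np.uint8)

# Adding a trailing channel axis keeps the 2D layout of each image.
with_channel = images[..., np.newaxis]

# Flattening turns each image into a single vector of 28 * 28 = 784 values.
flattened = images.reshape(len(images), -1)

print(with_channel.shape)  # (100, 28, 28, 1)
print(flattened.shape)     # (100, 784)
```

The two operations are not identical: the 4D array keeps the spatial structure (height, width, channel), which convolutional layers typically expect, while the flattened array discards it, which suits dense layers.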

Tensorflow - I don't get the right shapes - `ValueError: Shapes (100, 10, 10) and (100, 10) are incompatible`

I am working on MNIST classification code. The error keeps occurring in the code below. import tensorflow as tf (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data() print(x_train.shape) # (60000, 28, 28) print(y_train.shape) import matplotlib.pyplot as plt print("Y[0] : ", y_train[0]) plt.imshow(x_train[0], cmap=plt.cm.gray_r, interpolation = "nearest") x_train = x_train.reshape(-1,28*28) x_test = x_train / 255.0 x_test / 255.0 y_train = tf.keras.utils.to_categorical(y_train, 10) y_test = tf.keras.utils.to_categorical(y_test, 10) This is the code where the error occurs. model = tf.keras.Sequential() model.add(tf.keras.layers.Dense(units=10, input_dim=784, activation='softmax')) model.compile(loss='categorical_crossentropy', optimizer=tf.optimizers.Adam(learning_rate=0.001), …
Category: Data Science

How to pass a sequence of 4 images into LSTM and CNN-LSTM

I got an assignment and got stuck on it while going down the rabbit hole of learning PyTorch, LSTM, and CNN. Using the well-known MNIST dataset, I take combinations of 4 digits, and each combination falls into one of 7 labels, e.g.: 1111 is label 1 (constant trend); 1234 is label 2 (increasing trend); 4321 is label 3 (decreasing trend); ... 7382 is label 7 (decreasing trend, then increasing trend, then decreasing trend). The shape of my tensor after loading of …
Category: Data Science

Data set of vectors of SVG paths for digits

I have used the MNIST data set many times to train models for digit recognition based on optical character recognition (OCR). I am now trying to do the same but with a data set of SVG paths. I am trying to find an MNIST equivalent that is a digital path / SVG based data set. Here is a sample: the svg <path d="m233.5,119.4375c-1,-1 -3.025818,-1.320366 -5,-1c-3.121445,0.506538 -8.191559,0.090805 -15,2c-14.665848,4.112541 -23.266006,8.139008 -31,11c-6.291519, 2.327393 -11.679474,6.571106 -14,11c-1.467636,2.801086 -2,7 -2,10c0,4 -0.610916,8.03746 0,13c0.503769,4.092209 2.877655,8.06601 4,10c1.809723,3.118484 4.718994,6.310211 8,9c5.576645,4.571762 11.887314,5.376694 …
Category: Data Science

Should I put time on my Vanilla ANN for classifying MNIST Dataset

I am building a vanilla neural network in Python for my final year project, using just NumPy and Matplotlib, to classify the MNIST dataset. Here are the specifications of the model: one input layer + one hidden layer + one softmax layer. Number of nodes in each layer: [784, 800, 10]. Activation functions used: ReLU and softmax. I also normalized the train and test sets by dividing them by 255. I have used mini-batch gradient descent (mini-batch size = 4096). The model shows very low …
Category: Data Science

Accuracy over 100%

I am using OpenFL, the Intel framework for Federated Learning. If I run their tutorial example, the loss decreases and the accuracy stays in the range 0-100%, like this: [16:21:05] METRIC Round 4, collaborator env_one train result train_loss: 3.083468 experiment.py:112 [16:21:29] METRIC Round 4, collaborator env_one localy_tuned_model_validate result acc: 0.640100 experiment.py:112 [16:21:53] METRIC Round 4, collaborator env_one aggregated_model_validate result acc: 0.632200 experiment.py:112 METRIC Round 4, collaborator Aggregator localy_tuned_model_validate result acc: 0.640100 experiment.py:112 METRIC Round 4, collaborator Aggregator aggregated_model_validate result acc: …
Category: Data Science

How long does it typically take to train a model on MNIST data on a Mac Pro?

My code is below: # define a simple CNN model def baseline_model(): # create model model = Sequential() model.add(Conv2D(30, (5, 5), input_shape=(1, 28, 28), activation='relu')) model.add(MaxPooling2D(pool_size=(2, 2))) model.add(Conv2D(15, (3, 3), activation='relu')) model.add(MaxPooling2D(pool_size=(2, 2))) model.add(Dropout(0.2)) model.add(Flatten()) model.add(Dense(128, activation='relu')) model.add(Dense(50, activation='relu')) model.add(Dense(num_classes, activation='softmax')) # Compile model model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']) return model # build the model model = baseline_model() # Fit the model model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=500, batch_size=200) # Final evaluation of the model scores = model.evaluate(X_test, y_test, verbose=0) My Mac's capacity: …
Topic: mnist cnn
Category: Data Science

MNIST Digit dataset requires login

I want to download the MNIST handwritten digit data from the official site: https://yann.lecun.com/exdb/mnist/ But it wants me to enter a username and a password. How can I download the data? Does anybody know the credentials or a backup source? Thank you.
Category: Data Science

Why does my CNN program improve neither the training accuracy nor the validation accuracy despite the error function drastically decreasing?

I have written Python code to model a convolutional neural network (pastebin link) using only the most basic Python libraries (numpy and math, with sklearn and pandas used only for reading data). I will summarize the code's structure: Goal: To read MNIST dataset (total of 1797 8x8 grayscale images) and predict what number is written. Neural Network type: Basic convolutional; first layer will be three 3x3 filters with stride 1 (no padding), second layer will be three …
Category: Data Science

How to prepare data for cross-validation on the MNIST dataset?

How do I use k-fold cross-validation for the MNIST dataset? I read the scikit-learn documentation; in that example they used the whole iris dataset for cross-validation. from sklearn.model_selection import cross_val_score clf = svm.SVC(kernel='linear', C=1) scores = cross_val_score(clf, iris.data, iris.target, cv=5) scores For example, when importing the MNIST dataset in Keras: from keras.datasets import mnist (Xtrain,Ytrain),(Xtest,Ytest)=mnist.load_data() the dataset is already divided into test and train, so to apply cross-validation on the entire dataset do we need to …
Category: Data Science
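
One way to picture the setup this question asks about is to concatenate the predefined train and test parts and generate the folds by hand. The sketch below is a plain NumPy illustration under assumptions, not scikit-learn's implementation: `k_fold_indices` is a hypothetical helper, and the arrays are small zero-filled stand-ins for the real MNIST arrays.

```python
import numpy as np


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k shuffled, non-overlapping folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx


# Stand-ins for the arrays returned by the Keras loader (assumption: tiny sizes).
x_train, y_train = np.zeros((600, 28, 28)), np.zeros(600, dtype=int)
x_test, y_test = np.zeros((100, 28, 28)), np.zeros(100, dtype=int)

# Merge the predefined split so every sample can land in a validation fold.
x_all = np.concatenate([x_train, x_test])
y_all = np.concatenate([y_train, y_test])

splits = list(k_fold_indices(len(x_all), k=5))
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))
```

Whether to merge the predefined test set at all is a design choice: keeping it aside and cross-validating only on the training part preserves a held-out set for the final evaluation.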

Does MNIST generalise to European handwriting?

I want to use the MNIST dataset to teach my neural network to recognise numbers. My problem is: the data I am working with is "European". What I mean by that is: the seven always has a dash (it often has a dash in MNIST too, so that is less likely to cause a problem) and, more importantly, the 1 is always written with a hook in Europe, whereas Americans tend to use a straight line (almost 100% of MNIST …
Category: Data Science

What model to train to restore MNIST test dataset

I came across this problem and am not sure where to start. What model would work best for this problem and why? Imagine the digits in the test set of the MNIST dataset (http://yann.lecun.com/exdb/mnist/) got cut in half vertically and shuffled around. Implement a way to restore the original test set from the two halves, whilst maximising the overall matching accuracy.
Category: Data Science

What is wrong with this deep neural network?

I have recently written some simple neural network code for my toy dataset and it works fine, so I decided to take a big step forward and try to write code from scratch for the MNIST data. But the code only reaches an accuracy of 11% or even below (sometimes). I have googled for a solution and haven't found anything concrete that solves my problem. (I am new to neural networks too.) Before we get into …
Category: Data Science

ValueError: Data cardinality is ambiguous in colab

Initially, only x_train was reshaped, and an error occurred, so x_test was also reshaped. Then I got another error. It seems to be caused by an inconsistency with the y data, but modifying the code does not solve the error. This is my code: import tensorflow as tf (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data() print(x_train.shape) # (60000, 28, 28) print(y_train.shape) # (60000, ) import matplotlib.pyplot as plt print("Y[0] : ", y_train[0]) plt.imshow(x_train[0], cmap=plt.cm.gray_r, interpolation = "nearest") x_train = x_train.reshape(-1, 28*28) …
Category: Data Science

Novice machine learner wondering how to interpret big variance in batch error across batches in MNIST perceptron

I'm trying to get a better understanding of basic neural networks by implementing a little framework in C++. I've started with the classical MNIST exercise. I get to 91% accuracy on the test sample which I'm already pretty happy about. The thing is, the maximum accuracy is almost reached after just one epoch. The next epochs do not seem to improve the situation much. I am optimizing using stochastic gradient descent with a batch size of 40. During the training, …
Category: Data Science

Save reconstructed data points from a variational autoencoder in the original MNIST format

I have a VAE implementation that generates images from the latent distribution. I want to save those "images" in the same form as in the original dataset. For example, my VAE generates a data point using the following code: data_point = decoder.predict(sample_2).reshape(28,28,1) plt.figure(figsize=(4, 4)) plt.imshow(data_point, cmap = plt.cm.gray), plt.axis('off') plt.show() and I can see it as an image (number 4 from MNIST). If I look at the value of data_point, it's something like this: array([[[4.03011961e-13], [2.21622661e-13], [1.77334818e-13], [7.62046296e-13], [2.77884297e-13], [2.07368519e-13], [8.03054997e-13], [2.32846815e-12], [3.30792956e-13], [5.10265875e-13], …
Category: Data Science
