What is wrong in this Deep Neural network.?

Question

What is wrong in this Deep Neural network.?

Pranav Patel

2021年11月18日 20:59

I have recently written some simple Neural Network code just for my toy dataset and it works fine, so I have decided to take a big step forward and try to write code from scratch for MNIST data. But the code can only get the accuracy of 11% or even below (sometimes). I have googled for the solution and I haven't found any concrete results to solve my problem. (I am new the Neural Network too.)

Before we get into the code, my Neural Network structure is like this

Input layer: 784 Neurons
Hidden Layer (1): 15 Neurons
Output Layer: 10 Neurons

Activation Function:-

Hidden Layer (Sigmoid)
Output Layer (Softmax)

And for the loss I have taken Cross Entropy loss

To know where my equations come from visit this blog post.

Code

Required functions:

def sigmoid(x): 
    return 1.0 / (1.0 + np.exp(-x))

def derivation_sigmoid(x): 
    return x * (1.0 - x)

def softmax(x): return np.exp(x) / (np.sum(np.exp(x)))

def derivation_softmax(x): 
    tem = sum(np.exp(x)) ** 2    
    tem1 = sum(np.exp(x))    
    a = []    
    for i_ in range(len(x)):
        t1 = (np.exp(x[i_]) * (tem1 - np.exp(x[i_]))) / tem
        a.append(t1)    
    return a

def derivation_cross(x, y):    
    return -1 * ((x * (1 / y)) + (1 - x) * (1 / (1 - y)))


def extract_data(filename, num_images):   
    with gzip.open(filename) as bytestream:    
        bytestream.read(16)    
    buf = bytestream.read(IMAGE_SIZE * IMAGE_SIZE * num_images)    
    data =  np.frombuffer(buf, dtype=np.uint8).astype(np.float32)    # data =  (data - (PIXEL_DEPTH / 2.0)) / PIXEL_DEPTH    
        data =  data.reshape(num_images, IMAGE_SIZE, IMAGE_SIZE, 1)   
    return data

def extract_labels(filename, num_images):   
    with gzip.open(filename) as bytestream:    
        bytestream.read(8)   
    buf = bytestream.read(1 * num_images)    
    labels = np.frombuffer(buf, dtype=np.uint8).astype(np.int64)  
    return labels

Main code and libraries:

train_data_filename = "/path to the file"
train_data = extract_data(train_data_filename, 60000)
train_data_label = "/path to the file"
train_data_label_1 = extract_labels(train_data_label, 6000)



    input_neurons = train_data.shape[1] * train_data.shape[1]   # Number of Feature
    out_neurons = 10
    hidden_layer_nurons = 15
    lr = 0.01
    bias_hidden = np.ones((15, 1))
    bias_output = np.ones((10, 1))
    wih = np.random.uniform(-1, 1, size=(hidden_layer_nurons,input_neurons))
    who = np.random.uniform(-1, 1, size=(out_neurons, hidden_layer_nurons))



    def neural_network(wih, who, train_data, tarin_data_label_1):
        for e_ in range(0, 100000):
            for da_ in range(1, 2):
                # da_ = int(np.random.uniform(1, 3))
                c_error = []
                m1 = np.matrix((train_data[da_]))
                m2 = np.reshape(m1, (784, 1))
                m2 = ((m2 * 1) / 255)
                hidden_sigmoid = sigmoid((np.dot(wih, m2)))
                output = (np.dot(who, hidden_sigmoid))
                output_softmax = softmax((np.dot(who, hidden_sigmoid)))
                index = train_data_label_1[da_]
                for i in range(0, 10):
                    if i != index:
                        c_error.append([0])
                    else:
                        c_error.append([1])
                # c_error = np.matrix(c_error)
                a1 = np.array(c_error)
                a2 = np.array(output_softmax)

                # This is for Hidden - Output change weight
                cross_entropy_derivation = derivation_cross(a1, a2)
                # print cross_entropy_derivation
                softmax_derivation = derivation_softmax(output)
                sigmoid_derivation = derivation_softmax(hidden_sigmoid)
                sigmoid_derivation = np.reshape(sigmoid_derivation, (15, 1))

                m = np.reshape(softmax_derivation, (10, 1))
                d_output = cross_entropy_derivation * m
                who += (d_output.dot(hidden_sigmoid.T))

                # this is for Hidden - Input

                d_hidden = who.T.dot(d_output)
                d1 =  d_hidden * sigmoid_derivation
                wih += d1.dot(m2.T)
                test = np.argmax(output_softmax)
                print test, index


    neural_network(wih, who, train_data, tarin_data_label_1)

Topic mnist implementation deep-learning neural-network

Category Data Science

What is wrong in this Deep Neural network.?

About