No loss decay while training a neural network for the XOR operation in Torch

I have implemented a simple one-hidden-layer feed-forward neural network in Torch to learn the XOR operation. Below is my code:

require 'torch'
require 'nn'

m = nn.Sequential()
m:add(nn.Linear(2,2))
m:add(nn.Linear(2,1))
m:add(nn.Sigmoid())

torch.manualSeed(1)

m.modules[1].weights = torch.rand(2,2)
m.modules[2].weights = torch.rand(2,1)

criterion = nn.BCECriterion()

inputs = torch.Tensor(4,2)
inputs[1][1] = 0
inputs[1][2] = 0

inputs[2][1] = 0
inputs[2][2] = 1

inputs[3][1] = 1
inputs[3][2] = 0

inputs[4][1] = 1
inputs[4][2] = 1

targets = torch.Tensor(4,1)
targets[1][1] = 0
targets[2][1] = 1
targets[3][1] = 1
targets[4][1] = 0

function trainEpoch(m,criterion,inputs,targets)
    for i=1,inputs:size(1) do
        local input = inputs[i]
        local target = targets[i]
        local output = m:forward(input)
        --print(output)
        local loss = criterion:forward(output,target)
        print(loss)

        -- backward
        local gradOutput = criterion:backward(output,target)
        m:zeroGradParameters()
        local gradInput = m:backward(input,gradOutput)
        -- update parameters with plain SGD
        --m:updateGradParameters(0.9) -- momentum (requires the dpnn package)
        m:updateParameters(0.01) -- W = W - 0.01*dL/dW
    end
end

for i=1,10000 do
    trainEpoch(m,criterion,inputs,targets)
end

-- prediction
testinput = torch.Tensor(4,2)
testinput[1][1] = 0
testinput[1][2] = 0

testinput[2][1] = 0
testinput[2][2] = 1

testinput[3][1] = 1
testinput[3][2] = 0

testinput[4][1] = 1
testinput[4][2] = 1

for i=1,testinput:size(1) do
    local output = m:forward(testinput[i])
    print(output)
end

When I run the above code, the loss does not decay (it stays almost the same across all iterations), so the network does not predict the correct output. Can anyone help me find what I am doing wrong here?

I have also tried different manual seed values and different weight initialisations, but the loss still stays the same across iterations.

Finally I found the errors in my network:

1. I hadn't added a non-linear activation after the first linear layer.
2. There was no randomisation of the sample order while running stochastic gradient descent.

(I also dropped the manual weight initialisation: nn.Linear stores its parameters in the field weight, not weights, so those assignments never touched the actual parameters, and nn.Linear already initialises its weights randomly at construction.)
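To see why the first fix matters: without a non-linearity in between, two stacked linear layers compose into a single affine map, and XOR is not linearly separable. A minimal sketch (the name linearOnly is just for illustration):

require 'nn'

-- two stacked Linear layers with no activation in between
linearOnly = nn.Sequential()
linearOnly:add(nn.Linear(2,2))
linearOnly:add(nn.Linear(2,1))

-- the forward pass computes W2*(W1*x + b1) + b2 = (W2*W1)*x + (W2*b1 + b2),
-- i.e. the same function family as a single nn.Linear(2,1); no single line
-- can separate {(0,1),(1,0)} from {(0,0),(1,1)}, so even with the final
-- Sigmoid the decision boundary stays linear and the model cannot fit XOR.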

After updating these two things, it now works fine. Updated code:

require 'torch'
require 'nn'

m = nn.Sequential()
m:add(nn.Linear(2,2))
m:add(nn.Tanh())
m:add(nn.Linear(2,1))
m:add(nn.Sigmoid())

criterion = nn.BCECriterion()

inputs = torch.Tensor(4,2)
inputs[1][1] = 0
inputs[1][2] = 0

inputs[2][1] = 0
inputs[2][2] = 1

inputs[3][1] = 1
inputs[3][2] = 0

inputs[4][1] = 1
inputs[4][2] = 1

targets = torch.Tensor(4,1)
targets[1][1] = 0
targets[2][1] = 1
targets[3][1] = 1
targets[4][1] = 0

function trainEpoch(m,criterion,inputs,targets)
    for i=1,inputs:size(1) do
        -- pick a random training sample at each step (with replacement)
        local idx = math.random(1, inputs:size(1))
        local input = inputs[idx]
        local target = targets[idx]
        local output = m:forward(input)
        --print(output)
        local loss = criterion:forward(output,target)
        print(loss)

        -- backward
        local gradOutput = criterion:backward(output,target)
        m:zeroGradParameters()
        local gradInput = m:backward(input,gradOutput)
        -- update parameters with plain SGD
        --m:updateGradParameters(0.9) -- momentum (requires the dpnn package)
        m:updateParameters(0.01) -- W = W - 0.01*dL/dW
    end
end

for i=1,10000 do
    trainEpoch(m,criterion,inputs,targets)
end

-- prediction
testinput = torch.Tensor(4,2)
testinput[1][1] = 0
testinput[1][2] = 0

testinput[2][1] = 0
testinput[2][2] = 1

testinput[3][1] = 1
testinput[3][2] = 0

testinput[4][1] = 1
testinput[4][2] = 1

for i=1,testinput:size(1) do
    local output = m:forward(testinput[i])
    print(output)
end
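
As a side note, nn also ships a small trainer, nn.StochasticGradient, which shuffles the sample order on its own (its shuffleIndices flag defaults to true), so the manual index sampling in trainEpoch can be avoided. A minimal sketch reusing the m, criterion, inputs and targets defined above:

-- nn.StochasticGradient expects a dataset object with a size() method,
-- where dataset[i] returns the pair {input, target}
dataset = {}
function dataset:size() return 4 end
for i = 1, dataset:size() do
    dataset[i] = {inputs[i], targets[i]}
end

trainer = nn.StochasticGradient(m, criterion)
trainer.learningRate = 0.01
trainer.maxIteration = 10000 -- number of passes over the dataset
trainer:train(dataset)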
