Model does not learn after ternarization of weights, contrary to the paper mentioned below

I’m implementing the ‘Ternary Weight Networks’ paper by Fengfu Li and Bo Zhang (arXiv link: https://arxiv.org/abs/1605.04711).

I’m training a simple ConvNet with linear layers on the MNIST dataset. Without ternarization, the exact same model converges with high accuracy, but after ternarization of the linear layers, the model does not perform well at all. It either gets stuck in a local optimum (in which it predicts all classes with an equal probability of 0.1), or reaches up to 18% accuracy (when I set the learning rate to aggressively high values).

What could be the reason for this?

import numpy as np
import torch
import torch.nn as nn

class TernarizeOp():
    def __init__(self, model):
        count_targets = 0
        self.model = model
        for m in model.modules():
            if isinstance(m, nn.Linear):
                count_targets += 1
        self.ternarize_range = np.linspace(0, count_targets - 1, count_targets).astype('int').tolist()
        self.num_of_params = len(self.ternarize_range)
        self.saved_params = []
        self.target_modules = []

        for m in model.modules():
            if isinstance(m, nn.Linear):
                tmp = m.weight.data.clone()
                self.saved_params.append(tmp)  # tensor
                self.target_modules.append(m.weight)  # Parameter

    def TernarizeWeights(self):
        alpha = []
        for index in range(self.num_of_params):
            output,alpha_tmp = self.Ternarize(self.target_modules[index].data)
            self.target_modules[index].data = output
            alpha.append(alpha_tmp)
        return alpha

    def Ternarize(self, tensor):
        tensor = tensor.cuda()

        output = torch.zeros(tensor.size()).type(torch.cuda.FloatTensor)

        new_tensor = tensor.abs()
        delta = torch.mul(0.7, torch.mean(new_tensor, dim=1))
        new_tensor = torch.t(new_tensor)

        t = torch.greater_equal(new_tensor,delta).type(torch.cuda.FloatTensor)
        x = torch.greater(tensor,0).type(torch.cuda.FloatTensor)
        y = torch.less(tensor,0).type(torch.cuda.FloatTensor)
        y = torch.mul(y,-1)
        z = torch.add(x,y)
        t = torch.t(t)
        final = torch.mul(t,z)

        new_tensor = torch.t(new_tensor)

        final.cuda()
        alpha = torch.mean(torch.mul(final,new_tensor),dim=1)

        output = torch.add(output,final)

        return (output,alpha)
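
For comparison, the thresholding rule described in the paper can be sketched as follows. This is a minimal sketch, not the authors' code: the function name `ternarize_reference` is hypothetical, it runs on the CPU, and it assumes a single threshold and a single scaling factor per weight tensor, whereas the class above computes one per row. The threshold is delta = 0.7 * mean(|W|), and alpha is the mean of |W| taken only over the entries whose magnitude exceeds delta.

```python
import torch


def ternarize_reference(weight: torch.Tensor):
    """Hypothetical helper: return (ternary weights in {-1, 0, +1}, alpha)."""
    abs_w = weight.abs()
    # Threshold approximation from the paper: delta = 0.7 * E[|W|].
    # Assumption of this sketch: one delta for the whole tensor.
    delta = 0.7 * abs_w.mean()
    # 1 where the weight is kept, 0 where it is set to zero.
    mask = (abs_w > delta).to(weight.dtype)
    ternary = torch.sign(weight) * mask
    # alpha averages |W| over the kept entries only, not over all entries.
    kept = mask.sum().clamp(min=1)  # guard against an all-zero mask
    alpha = (abs_w * mask).sum() / kept
    return ternary, alpha


w = torch.tensor([[0.9, -0.05, -0.6],
                  [0.1, 0.8, -0.02]])
ternary, alpha = ternarize_reference(w)
print(ternary)
print(alpha)
```

Note that alpha here is computed from absolute values. In the class above, `final` already carries the sign, so `torch.mul(final, new_tensor)` yields signed values, and the mean over `dim=1` divides by the full row length rather than by the number of kept entries.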

The training loop calls this class as follows:

alpha = ternarize_op.TernarizeWeights()
l = []
l.append(imgs)
l.append(alpha)
output = model(l)

The forward method of my simple ConvNet takes this list l as its input for forward propagation.
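
One thing worth checking against the snippet above: the paper's training procedure uses the ternary weights for the forward and backward passes but applies the parameter update to the full-precision weights, while `saved_params` in the class is filled once and never restored. A minimal sketch of such a step for a single `nn.Linear` layer is shown below. The function name `ternary_train_step`, the per-tensor threshold, and folding alpha into the weights (instead of passing it to `forward`) are all assumptions of this sketch, not the original model's design.

```python
import torch
import torch.nn as nn


def ternary_train_step(layer, optimizer, loss_fn, x, y):
    """Hypothetical single training step with full-precision latent weights."""
    # Keep a full-precision copy of the weights for the update step.
    latent = layer.weight.detach().clone()

    # Ternarize: delta = 0.7 * mean(|W|), alpha = mean of |W| over kept entries.
    abs_w = latent.abs()
    delta = 0.7 * abs_w.mean()
    mask = (abs_w > delta).to(latent.dtype)
    alpha = (abs_w * mask).sum() / mask.sum().clamp(min=1)

    # Forward and backward passes run with the scaled ternary weights.
    with torch.no_grad():
        layer.weight.copy_(alpha * torch.sign(latent) * mask)
    optimizer.zero_grad()
    loss = loss_fn(layer(x), y)
    loss.backward()

    # Restore the full-precision weights, then apply the gradient to them.
    with torch.no_grad():
        layer.weight.copy_(latent)
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
layer = nn.Linear(4, 3)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
loss = ternary_train_step(layer, optimizer, nn.MSELoss(),
                          torch.randn(8, 4), torch.randn(8, 3))
print(loss)
```

Without the restore step, each update is applied to weights that are already in {-1, 0, +1} and is then re-ternarized on the next iteration, so small gradient steps can be discarded before they accumulate.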

Why does this not work?
