How to handle undefined or null data in a neural network

Let me preface this post by saying that I am very new to machine learning and neural networks. I am currently working on a classification neural network in TensorFlow whose input is several features of continuous data and whose output is a softmax array of confidence levels, one per group number. In some instances, the rows passed into the model contain undefined values. I understand from research and testing that all elements of an input tensor must be of the same type. I have looked into a couple of ways to handle these undefined values:

  1. I could simply set these undefined points equal to some constant like 0 or -1 (I believe this to be my best option, as it does not sacrifice other features)
  2. I could remove any row of data with an undefined value. I'm not a fan of this idea as I am working with high-dimensional data, so if I remove one row, my model would be missing out on quite a few columns worth of data.
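For concreteness, here is a minimal sketch of the two options above using NumPy. The array `X` and its values are made up for illustration, and it assumes undefined values are represented as `NaN`; it is not taken from my actual pipeline.

```python
import numpy as np

# Hypothetical toy batch: 4 rows, 3 continuous features.
# NaN marks an undefined value (an assumption about the encoding).
X = np.array([
    [0.5, np.nan, 1.2],
    [0.1, 0.7, 0.3],
    [np.nan, 0.4, np.nan],
    [0.9, 0.2, 0.8],
], dtype=np.float32)

# Option 1: replace every undefined value with a constant (here 0.0).
X_filled = np.where(np.isnan(X), np.float32(0.0), X)

# Option 2: drop every row that contains at least one undefined value.
complete_rows = ~np.isnan(X).any(axis=1)
X_dropped = X[complete_rows]
```

With option 1 all four rows survive; with option 2 only the two complete rows remain, which is the data loss described above.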

Beyond these two, I have been unable to find any additional information.

I have tested both of these ideas, and while they fix the issue, they do have some negative impacts on the accuracy of my model. My question is this: What are some other effective ways of handling undefined values when working with neural networks?

I understand that the question is relatively vague, and I apologize if it has any necessary information missing. Please let me know if there is anything I can clarify.



One option is to remove the input node that has a null value for that particular training instance, which is similar to dropout. The connections between that input node and the next layer are then absent for that instance and do not contribute to the prediction.
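A minimal sketch of this idea for a single dense layer, written in plain NumPy rather than TensorFlow. The function name `masked_dense`, the weights, and the use of `NaN` to mark a null input are all assumptions made for illustration.

```python
import numpy as np

def masked_dense(x, W, b):
    """Dense layer forward pass that ignores inputs whose value is NaN."""
    present = ~np.isnan(x)               # True where the feature is defined
    x_clean = np.where(present, x, 0.0)  # a zeroed input adds nothing through its weights
    return x_clean @ W + b

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))  # hypothetical weights: 3 inputs, 2 units
b = np.zeros(2)

x = np.array([0.5, np.nan, 1.2])  # the second feature is null for this instance
out = masked_dense(x, W, b)

# Only the defined inputs contribute to the result.
expected = 0.5 * W[0] + 1.2 * W[2]
```

Note that for a plain dense layer this gives the same output as filling the missing value with 0. To tell "missing" apart from a genuine 0, the `present` mask could also be passed to the network as extra input features, and dropout-style rescaling of the remaining inputs is another possible variation.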
