Using a neural network to learn regression in image processing

I have a camera system with some special optics that warp the field of view of the camera, dependent on two variables, $\theta_1$ and $\theta_2$. Given a specific configuration of these two variables, each pixel on my camera (which is 500x600 resolution) will see a specific coordinate on a screen in front of the camera. I can calculate this for each pixel, but it requires too many computations and is too slow. So, I want to learn a model that …
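
A small fully connected network is a natural fit here: it can take (θ1, θ2, pixel row, pixel column) and regress the two screen coordinates, with the slow exact computation used offline to generate training data. A minimal sketch, with all layer sizes assumed:

from tensorflow import keras

# Inputs: (theta1, theta2, pixel_row, pixel_col) -> outputs: (screen_x, screen_y)
model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(4,)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(2),  # linear output for 2-D regression
])
model.compile(optimizer='adam', loss='mse')

# X: rows of (theta1, theta2, row, col) scaled to comparable ranges;
# y: screen coordinates from the exact (slow) calculation, computed once offline.
# model.fit(X, y, epochs=50, batch_size=1024)
# After training, one forward pass can predict all 500x600 pixels as a single batch.
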
Category: Data Science

What enables transformers or very deep models to "plan" ahead for sequential decision making?

I was watching this amazing lecture by Oriol Vinyals. On one slide, there is a question asking whether very deep models plan. Transformer models, or models employed in applications like dialogue generation, do not have an explicit planning component, yet they behave as if they already had the dialogue planned. Dr. Vinyals mentioned that there are papers on "how transformers are building up knowledge to answer questions or do all sorts of very interesting analyses". Can anyone please point me to a few …
Category: Data Science

Intuitively, why do Non-monotonic Activations Work?

The swish/SiLU activation is very popular, and many would argue it has dethroned ReLU. However, it is non-monotonic, which seems to go against popular intuition (at least on this site: example 1, example 2). Reading the swish paper, the justification that the authors give is that non-monotonicity "increases expressivity and improves gradient flow... [and] may also provide some robustness to different initializations and learning rates." The authors provide an image to back up this claim, but at best this argument …
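
For reference, swish is f(x) = x · sigmoid(βx), and SiLU is the β = 1 case; its non-monotonicity is easy to verify numerically with a short sketch:

import numpy as np

def swish(x, beta=1.0):
    return x / (1.0 + np.exp(-beta * x))  # x * sigmoid(beta * x)

x = np.linspace(-5.0, 0.0, 10000)
y = swish(x)
# The curve dips below zero and rises again on the negative axis, so the
# function is not monotonic; for beta = 1 the minimum sits near x = -1.28.
print(x[np.argmin(y)], y.min())
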
Category: Data Science

How to keep only significant weights in an ANN

My weights are stored in a two-dimensional matrix. Row i refers to node i in the preceding layer, and the columns in that row are the neurons node i is connected to. I only want to keep some of these connections. How do I pick the 3 largest weights per row and store them in a separate array while keeping track of which neuron each belonged to? Moreover, is it established that some weights contribute more than others?
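
A NumPy sketch of the bookkeeping, assuming weights is the matrix described above and picking the 3 largest weights per row by magnitude (drop the np.abs if you want the largest signed values):

import numpy as np

# weights[i, j] = weight from node i in the preceding layer to neuron j
weights = np.random.randn(5, 8)  # stand-in matrix

k = 3
# Column (neuron) indices of the k largest-magnitude weights in each row
top_idx = np.argsort(np.abs(weights), axis=1)[:, -k:]
# The corresponding values, kept in the same row order
top_val = np.take_along_axis(weights, top_idx, axis=1)

for i in range(weights.shape[0]):
    print(f"node {i}: neurons {top_idx[i].tolist()}, weights {top_val[i]}")

As for the last question: the network-pruning literature (e.g. magnitude pruning, "Optimal Brain Damage") rests on exactly the observation that many weights can be removed with little loss in accuracy.
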
Category: Data Science

How to use text as input for a neural network (regression problem)? Predicting how many likes/claps an article will get

I am trying to predict the number of likes an article or a post will get using a NN. I have a dataframe with ~70,000 rows and 2 columns: "text" (predictor - strings of text) and "likes" (target - an integer count). I've been reading about the approaches taken in NLP problems, but I feel somewhat lost as to what the input for the NN should look like. Here is what I did so far: Text cleaning: removing …
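
One standard starting point is to map each text to a fixed-length sequence of token ids, learn an embedding, and regress onto the like count. A minimal Keras sketch (layer sizes assumed, with tiny stand-in data in place of the 70,000 rows):

import numpy as np
from tensorflow import keras

texts = np.array(["first example article", "another post about data"])  # stand-in
likes = np.array([10.0, 3.0])                                           # stand-in

max_tokens, seq_len = 20000, 200
vectorizer = keras.layers.TextVectorization(max_tokens=max_tokens,
                                            output_sequence_length=seq_len)
vectorizer.adapt(texts)

model = keras.Sequential([
    vectorizer,                               # raw string -> integer sequence
    keras.layers.Embedding(max_tokens, 64),   # token ids -> dense vectors
    keras.layers.GlobalAveragePooling1D(),    # sequence -> one vector per text
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(1),                    # linear output for the like count
])
model.compile(optimizer='adam', loss='mse')
# model.fit(texts, likes, validation_split=0.1, epochs=10)
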
Category: Data Science

Do high accuracy metrics on a small (but equally sampled) dataset mean a good model?

I have been training my CNN with 200 images per class for a classification problem. The problem is a binary classification one. With the amount of test data I have (25 per class), I am getting good accuracy, precision, and recall values. Does that mean my model is actually good?
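
Not necessarily: with 25 test images per class the metrics carry wide error bars. A quick way to see this is the normal-approximation confidence interval for a proportion:

import math

n = 50      # total test samples (25 per class)
acc = 0.92  # illustrative observed accuracy (assumed value)

# 95% normal-approximation (Wald) interval for a proportion
half_width = 1.96 * math.sqrt(acc * (1 - acc) / n)
print(f"accuracy = {acc:.2f} +/- {half_width:.2f}")  # ~0.92 +/- 0.08

So the "true" accuracy could plausibly sit anywhere in a band of roughly eight points either way; cross-validation or a larger test set would tighten the estimate.
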
Category: Data Science

What are deconvolutional layers?

I recently read Fully Convolutional Networks for Semantic Segmentation by Jonathan Long, Evan Shelhamer, and Trevor Darrell. I don't understand what "deconvolutional layers" do or how they work. The relevant part is Section 3.3, "Upsampling is backwards strided convolution": Another way to connect coarse outputs to dense pixels is interpolation. For instance, simple bilinear interpolation computes each output $y_{ij}$ from the nearest four inputs by a linear map that depends only on the relative positions of the input and output cells. In …
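
Mechanically, a "deconvolutional" layer is a transposed convolution: an upsampling operation whose kernel is learnable, and which can be initialized to reproduce exactly the bilinear interpolation the paper mentions. A minimal sketch using Keras's Conv2DTranspose:

import tensorflow as tf
from tensorflow import keras

x = tf.random.normal((1, 16, 16, 8))  # coarse feature map
# A stride-2 transposed convolution doubles the spatial resolution; it is the
# transpose (gradient) of a stride-2 convolution, hence "backwards strided".
upsample = keras.layers.Conv2DTranspose(filters=8, kernel_size=4,
                                        strides=2, padding='same')
print(upsample(x).shape)  # (1, 32, 32, 8)
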
Category: Data Science

Averaging multiple train-test splits to estimate performance when variability is high?

I have a small data set and I want to assess the effect of a certain type of cases on the overall model performance. For example, is the model biased against people of a certain age group? Using a single train-test split, the number of cases of a particular type becomes quite small, and I suspect any findings may be due to randomness. Would it make sense in this scenario to use multiple train-test splits, compute the average performances, and …
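
Yes; repeating the split and averaging is standard practice, and scikit-learn's repeated cross-validation does exactly this while also exposing the spread across repetitions. A sketch with stand-in data and estimator:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

X, y = make_classification(n_samples=200, random_state=0)  # stand-in small dataset
clf = LogisticRegression(max_iter=1000)                    # stand-in estimator

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring='accuracy')
print(scores.mean(), scores.std())  # average and variability over 50 evaluations
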
Category: Data Science

Keras model with 3 input images giving wrong output

I have created a Keras model that takes 3 images as input, passes each to its own CNN backbone (MobileNetV2), and fuses the results from the 3 individual streams. These fused outputs then go through a FCN and give probability values for 10 classes. Now when I pass 3 images to my model using model.predict(), I am getting an output of 3x10 (a list of 3 outputs with 10 values each). Here is the network snapshot and here is the output *[[0.04718336 0.07464679 …
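
An output of shape 3x10 usually means Keras saw a batch of three samples rather than one sample with three inputs. A multi-input model's predict expects a list of three arrays, each with its own batch axis; a runnable sketch with a tiny stand-in for the three-stream model (shapes assumed):

import numpy as np
from tensorflow import keras

inputs = [keras.Input(shape=(32, 32, 3)) for _ in range(3)]           # 3 image inputs
features = [keras.layers.GlobalAveragePooling2D()(i) for i in inputs]
fused = keras.layers.Concatenate()(features)                          # fuse streams
out = keras.layers.Dense(10, activation='softmax')(fused)
model = keras.Model(inputs, out)

imgs = [np.random.rand(1, 32, 32, 3) for _ in range(3)]  # batch axis of 1 on each
print(model.predict(imgs).shape)  # (1, 10): one row of 10 class probabilities
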
Category: Data Science

Val Loss and manually calculated loss produce different values

I have a CNN classification model that uses binary cross-entropy loss:

optimizer_instance = Adam(learning_rate=learning_rate, decay=learning_rate / 200)
model.compile(optimizer=optimizer_instance, loss='binary_crossentropy')

We are saving the best model, so the latest saved model is the one that achieved the best val_loss:

es = EarlyStopping(monitor='val_loss', mode='min', verbose=0,
                   patience=Config.LearningParameters.Patience)
modelPath = modelFileFolder + Config.LearningParameters.ModelFileName
checkpoint = keras.callbacks.ModelCheckpoint(modelPath, monitor='val_loss',
                                             save_best_only=True,
                                             save_weights_only=False, verbose=1)
callbacks = [checkpoint, es]
history = model.fit(x=training_generator,
                    batch_size=Config.LearningParameters.Batch_size,
                    epochs=Config.LearningParameters.Epochs,
                    validation_data=validation_generator,
                    callbacks=callbacks, verbose=1)

Over the course of the training, the logs show that the …
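
Two details commonly explain such a mismatch: the loss Keras logs during an epoch is a running average over batches, and a manual calculation must use the same reduction and clipping. A sketch of reproducing Keras's binary cross-entropy by hand, with stand-in arrays:

import numpy as np
import tensorflow as tf

y_true = np.array([1.0, 0.0, 1.0, 1.0])  # stand-in labels
y_pred = np.array([0.9, 0.2, 0.7, 0.6])  # stand-in predicted probabilities

keras_loss = tf.keras.losses.BinaryCrossentropy()(y_true, y_pred).numpy()

eps = 1e-7  # Keras clips probabilities by the backend epsilon
p = np.clip(y_pred, eps, 1 - eps)
manual = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
print(keras_loss, manual)  # the two should agree to several decimal places
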
Category: Data Science

Overfitting in CNN

I am training a VGG net on the STL-10 dataset. I am getting a Top-5 validation accuracy of about 98% and a Top-1 validation accuracy of about 83%, but both the Top-1 and Top-5 training accuracy reach 100%. Does this mean that the network is overfitting? Or not? Code:

def conv2d(inp, name, kshape, s):
    with tf.variable_scope(name) as scope:
        kernel = get_weights('weights', shape=kshape)
        conv = tf.nn.conv2d(inp, kernel, [1, s, s, 1], 'SAME')
        bias = get_bias('biases', shape=kshape[3])
        preact = tf.nn.bias_add(conv, bias)
        convlayer = tf.nn.relu(preact, name=scope.name)
        return convlayer

def maxpool(inp, name, k, s):
    return tf.nn.max_pool(inp, ksize=[1, k, k, 1], strides=[1, s, s, 1],
                          padding='SAME', name=name)

def loss(logits, labels):
    labels = tf.reshape(tf.cast(labels, tf.int64), [-1])
    # print labels.get_shape().as_list(), logits.get_shape().as_list()
    cross_entropy …
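
A 100% training accuracy against 83% Top-1 validation accuracy is a classic generalization gap, so some overfitting is likely. One standard mitigation in this TF1-style code is dropout between the fully connected layers; a hedged sketch (keep_prob fed as e.g. 0.5 during training and 1.0 during evaluation):

keep_prob = tf.placeholder(tf.float32, name='keep_prob')

def dropout(inp, name):
    # Randomly zeroes activations while training; identity when keep_prob = 1.0
    return tf.nn.dropout(inp, keep_prob, name=name)
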
Category: Data Science

Understanding the vertical and horizontal stacks in the conditional gated PixelCNN paper

I am confused about the importance of the vertical and horizontal stacks as a solution to the blind spot problem present in the original PixelCNN architecture discussed in this paper. The vertical and horizontal stack ideas were presented in this paper. After browsing, I found this link explaining the concept. In the vertical stack section of that web page, I still find that pixel f cannot see pixels c, d, and e. Any help is …
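
It may help to write out the two receptive fields explicitly: the vertical stack alone indeed never sees same-row pixels such as c, d, and e; those are contributed by the horizontal stack, and only the union of the two is free of the blind spot. A small sketch (grid size and pixel position assumed):

import numpy as np

H, W = 5, 5
r, c = 2, 3  # position of the query pixel, e.g. "f"

vertical = np.zeros((H, W), dtype=bool)
vertical[:r, :] = True        # vertical stack: all pixels in the rows above

horizontal = np.zeros((H, W), dtype=bool)
horizontal[r, :c] = True      # horizontal stack: pixels to the left, same row

context = vertical | horizontal  # the full causal context, with no blind spot
print(context.astype(int))
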
Category: Data Science

ValueError: Tensor Tensor("activation_5/Softmax:0", shape=(?, 2), dtype=float32) is not an element of this graph

There seems to be an issue with predicting using my Keras model. I had trained it using the following Keras code:

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(150, 150, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
model.add(Conv2D(32, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
model.add(Flatten())  # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(2))
model.add(Activation('softmax'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

However, when I predict with it on my local system after training, with input of shape (1, 150, 150, 3) …
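
This error typically appears when the model is loaded into one TensorFlow graph (e.g. at import time in a web server) but predict runs from another thread or graph. A hedged workaround sketch for TF1-era Keras, with the model path assumed:

import tensorflow as tf
from keras.models import load_model

model = load_model('model.h5')  # path assumed
graph = tf.get_default_graph()  # remember the graph the model was loaded into

def predict(x):
    # x: array of shape (1, 150, 150, 3)
    with graph.as_default():    # run the prediction inside that same graph
        return model.predict(x)
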
Category: Data Science

How to export SHAP waterfall values to a dataframe?

I am working on a binary classification using a random forest model and neural networks, in which I am using SHAP to explain the model predictions. I followed the tutorial and wrote the code below to get the waterfall plot shown below:

row_to_show = 20
data_for_prediction = ord_test_t.iloc[row_to_show]  # use 1 row of data here. Could use multiple rows if desired
data_for_prediction_array = data_for_prediction.values.reshape(1, -1)
rf_boruta.predict_proba(data_for_prediction_array)

explainer = shap.TreeExplainer(rf_boruta)
# Calculate Shap values
shap_values = explainer.shap_values(data_for_prediction)
shap.plots._waterfall.waterfall_legacy(explainer.expected_value[0],
                                       shap_values[0],
                                       ord_test_t.iloc[row_to_show])

This generated the plot as …
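
The waterfall plot is drawn from arrays that are already in hand, so exporting is a matter of pairing each feature with its SHAP value; a sketch using the names from the code above:

import pandas as pd

shap_df = pd.DataFrame({
    'feature': ord_test_t.columns,
    'feature_value': data_for_prediction.values,
    'shap_value': shap_values[0],
})
# Order rows the way the waterfall orders its bars: by absolute contribution
shap_df = shap_df.reindex(
    shap_df['shap_value'].abs().sort_values(ascending=False).index)
# shap_df.to_csv('shap_row20.csv', index=False)  # filename assumed
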
Category: Data Science

Neural network is not giving the expected output after training in Python

My neural network is not giving the expected output after training in Python. Is there any error in the code? Is there any way to reduce the mean squared error (MSE)? I tried training the network repeatedly, but it is not learning; instead it gives the same MSE and output every time. Here is the data I used: https://drive.google.com/open?id=1GLm87-5E_6YhUIPZ_CtQLV9F9wcGaTj2 Here is my code:

# load and evaluate a saved model
from numpy import loadtxt
from tensorflow.keras.models import load_model
…
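
A constant MSE across runs usually means the network is not learning at all, and unscaled inputs or targets are a frequent cause. A hedged sketch of normalizing before training, with stand-in arrays:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.random.rand(100, 4) * 50  # stand-in features on an arbitrary scale
y = np.random.rand(100) * 1000   # stand-in target on a large scale

x_scaler, y_scaler = MinMaxScaler(), MinMaxScaler()
X_scaled = x_scaler.fit_transform(X)
y_scaled = y_scaler.fit_transform(y.reshape(-1, 1))

# model.fit(X_scaled, y_scaled, ...), then invert the target scaling:
# predictions = y_scaler.inverse_transform(model.predict(X_scaled))
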
Category: Data Science

Learnable parameters in DNN

I've come across the term "learnable parameters" recently, and googling didn't help much, as most results describe learnable parameters in a CNN rather than a DNN. Is there any difference between the two? How would I compute the number of learnable parameters in a DNN? Could anyone please explain what those are, with an example? I'm new to machine learning, so I would appreciate some help on this.
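
For a plain fully connected network the rule is simple: each dense layer contributes (inputs + 1) × units parameters, the +1 being the bias. A worked example (layer sizes assumed):

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(64, activation='relu', input_shape=(10,)),  # (10+1)*64 = 704
    keras.layers.Dense(3, activation='softmax'),                   # (64+1)*3  = 195
])
model.summary()  # Total params: 899 learnable parameters (weights + biases)
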
Category: Data Science

Loss function to prevent estimator bias

I have a regression problem I'm trying to build a model for: predicting sales per person (>= 0) depending on some variables. I'm running different model types and gave deep neural networks a try. The loss functions I'm using are mean squared error and mean absolute error (or sometimes a mix). I often run into this issue, though: despite MSE and MAE being optimized, I end up with a very strong bias in the predictions, e.g. sum(training_all_predictions) / …
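
One direct option is to add the squared prediction bias as a penalty on top of MSE, so the optimizer is explicitly pushed to match the mean. A hedged sketch of a custom Keras loss (the weight lam is a hyperparameter to tune; the penalty is computed per batch):

import tensorflow as tf

def mse_with_bias_penalty(lam=0.1):
    def loss(y_true, y_pred):
        mse = tf.reduce_mean(tf.square(y_true - y_pred))
        # penalize a systematic offset between mean prediction and mean target
        bias = tf.square(tf.reduce_mean(y_pred) - tf.reduce_mean(y_true))
        return mse + lam * bias
    return loss

# model.compile(optimizer='adam', loss=mse_with_bias_penalty(lam=0.1))
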
Category: Data Science

Strange Behavior when Trying to Predict Tennis Millionaires with Keras (Validation Accuracy)

I'm trying to build an NN with Keras to predict the ATP players that will get more than US$1 million in prize money based on their weight and height (from a dataset I mined some weeks ago), but I have been getting weird behavior, especially in the validation accuracy. Sometimes it gets to 84-85%, which is reasonable since SVMs and GaussianNB seem to be able to hit only 83.3% at best (check this post for more info), but sometimes …
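
Swings like this on a small dataset are often just initialization and shuffling noise; fixing the random seeds makes runs comparable before drawing conclusions. A sketch for a TF2/Keras setup:

import os
import random
import numpy as np
import tensorflow as tf

def set_seeds(seed=42):
    os.environ['PYTHONHASHSEED'] = str(seed)
    random.seed(seed)         # Python's RNG (shuffles)
    np.random.seed(seed)      # NumPy (splits, some initializers)
    tf.random.set_seed(seed)  # TensorFlow ops and weight initializers

set_seeds()
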
Category: Data Science
