Fast AI Lesson 4 - MNIST. Confused about multiplying weights by pixels?

I’m on lesson 4 of the Fast AI "Deep Learning for Coders" course. I have been back through the same lesson a few times now, but I don’t think I’m quite getting a few things, and I want to understand what’s going on before moving on. This lesson is on MNIST, and Jeremy is recognising 3s vs 7s. So he has 12000 images (ignoring mini-batches) of about 800 pixels each, and his tensor has a shape of …
Category: Data Science
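
For the question above, here is a minimal sketch of what "multiplying weights by pixels" amounts to. Everything in it is an assumption made for illustration and not taken from the lesson: three toy "images" of four pixels each stand in for the real data, and the names `images`, `weights`, `bias`, and `predict` are hypothetical. Each image is a flat list of pixel values, and the model produces one score per image by multiplying every pixel by its own weight and summing the results.

```python
# Toy stand-in for the flattened image data: 3 images, 4 pixels each.
# (Hypothetical values; the real lesson uses far larger inputs.)
images = [
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]

# One weight per pixel position, shared by every image, plus one bias.
weights = [0.2, -0.1, 0.4, 0.3]
bias = 0.1


def predict(image, weights, bias):
    """Weighted sum of the pixels of one image, plus the bias."""
    return sum(pixel * w for pixel, w in zip(image, weights)) + bias


# One score per image: this is the row-by-row form of a matrix-vector product.
scores = [predict(image, weights, bias) for image in images]
print(scores)
```

With tensors, the same computation is usually written as a single matrix multiplication of the image matrix by the weight vector; the loop form above only spells out what that product does for each row.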

How to specify version for dependencies so that each one is compatible and stays within a size limit?

I am trying to deploy a web app to Heroku. The free tier is limited to 500 MB. I am using my resnet34 model as a .pkl file, and I create the model from it using the fastai library. This project requires torch and torchvision as dependencies. But if I do not specify a version, the latest version of torch is downloaded, which alone is 750 MB and exceeds the size limit. So, I specify torchvision version as 0.2.2 and specify the wheel for torch …
Category: Data Science

I need to plot only the training curve in the fastai library using the learner.recorder.plot_losses() function. FASTAI devs pls help

I have a task where I need to plot only the training loss, and not the validation loss, using the plot_losses function of the recorder attached to a fastai learner object, but I have not been able to implement this properly. I am using fastai v1 for this purpose due to project restrictions. Here is the GitHub code for the same: class Recorder(LearnerCallback): "A `LearnerCallback` that records epoch, loss, opt and metric data during training." def plot_losses(self, skip_start:int=0, …
Category: Data Science

Colab variable inspector stops working after importing from fastbook

As best as I could find, this question has not been asked before. I'm using Colab, and I use its variable inspector. I'm trying to do the FastAI exercises, and I noticed that when doing them, the variable inspector stops showing variables. For instance, I open a new notebook and start creating variables in cells: x=5, a=6 and so forth. These variables are shown in the inspector. But, once I run the line: from fastbook import * the variable …
Topic: fastai colab
Category: Data Science

How to get feature significance for tabular data in machine learning?

I'm using fastai to train a network on tabular data (https://docs.fast.ai/tutorial.tabular.html). I have a table with 5 columns, each of which is a specific attribute that describes a galaxy and helps to classify it into one of two types: elliptical and spiral. My question is: is it possible to find out which of these attributes is most helpful or least helpful for the training? I mean some kind of ranking.
Category: Data Science

Which function do the gradients that I use to adjust weights come from?

I have a question about the loss function and the gradient. I'm following the fastai (https://github.com/fastai/fastbook) course, and at the end of the 4th chapter I found myself wondering: which function do the gradients that I use to adjust weights come from? I do understand that the loss function is being differentiated. But which one? Can I see it? Or is it under the hood of PyTorch? Code of the step function: def step(self): self.w.data -= self.w.grad.data * self.lr self.b.data -= self.b.grad.data …
Category: Data Science
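
For the question above, here is a minimal sketch of where such gradients come from, written in plain Python without PyTorch. The assumptions are loud ones: the model is a one-weight linear model, the loss is mean squared error, and the names `loss`, `analytic_grads`, `numeric_grads`, and `step` are hypothetical and not taken from the book. The point it illustrates is that the gradients are the derivatives of the loss with respect to each parameter; in PyTorch, autograd computes them when `backward()` is called on the loss and stores them in each parameter's `.grad`.

```python
def loss(w, b, xs, ys):
    """Mean squared error of the linear model w*x + b (assumed loss)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def analytic_grads(w, b, xs, ys):
    """Derivatives of the loss above with respect to w and b, by hand."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


def numeric_grads(w, b, xs, ys, eps=1e-6):
    """Same derivatives estimated by central finite differences."""
    dw = (loss(w + eps, b, xs, ys) - loss(w - eps, b, xs, ys)) / (2 * eps)
    db = (loss(w, b + eps, xs, ys) - loss(w, b - eps, xs, ys)) / (2 * eps)
    return dw, db


def step(w, b, dw, db, lr):
    """Plain gradient-descent update, mirroring the step function quoted above."""
    return w - dw * lr, b - db * lr


# Toy data following y = 2x, starting from w = b = 0.
xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
w, b, lr = 0.0, 0.0, 0.01

old_loss = loss(w, b, xs, ys)
dw, db = analytic_grads(w, b, xs, ys)
w_new, b_new = step(w, b, dw, db, lr)
new_loss = loss(w_new, b_new, xs, ys)
print(old_loss, new_loss)
```

The finite-difference version is only there as a cross-check: it shows that the numbers an automatic differentiation engine would hand back are the slopes of the loss function itself, not of some separate hidden function.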

Classification with a lot of classes

I’m trying to make a model which will classify text into about 500 different classes. I think that I have to customize the architecture of the Pooling Classifier, which currently looks like this: (1): PoolingLinearClassifier( (layers): Sequential( (0): BatchNorm1d(1200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (1): Dropout(p=0.2, inplace=False) (2): Linear(in_features=1200, out_features=50, bias=True) (3): ReLU(inplace=True) (4): BatchNorm1d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (5): Dropout(p=0.1, inplace=False) (6): Linear(in_features=50, out_features=498, bias=True) ) I think that I have to change the (2): Linear layer to have more out_features because …
Category: Data Science

Fine-grained image classification

I have a dataset which has 4 classes (say A,B,C,D). The task requires fine-grained image classification. The problem I am facing is that for 2 of the classes (C,D), the model's performance is not so great. Out of around 200 test images belonging to class C, 46 are being predicted as belonging to class D. So class C's recall is poor and, as a result, class D's precision is poor. Now, I also looked at the validation set images belonging to …
Category: Data Science

How to handle imbalanced NLP text data set e.g. some classes only have 2 records

I am working on a dataset with around 2000 records. Around 80% of the records have categorical labels. There are around 200 categories; some categories have more than 20 records, whereas others have only two. Since this is a text dataset, I cannot oversample the minority categories with the kind of techniques I could use for images. I am using Fast AI, which is based on PyTorch. So what can I do about it?
Category: Data Science

Is it advisable to use a model which is underfit but gives very high accuracy?

I am training a model for a single-label classification task in Vision. In this training, I am using oversampling of all the classes, and MixUp augmentation, along with rotation and dihedral transformations to augment data. What happens is, the model, after being trained for 20 epochs, achieves $<8\%$ validation loss (CE Loss) and $98\%$ accuracy in predicting the labels of the images in the validation set. The problem is that the model underfits. While the accuracy is extremely high and …
Category: Data Science

fastai - using 'untar_data' function in kaggle kernel

I have recently started with fastai lesson 1 and I am using Kaggle to run the course notebooks. While going through the ‘lesson1-pets’ notebook we use untar_data(URLs.PETS) to get the data. What I want to understand is where this data gets downloaded to. As far as I can observe, after running the untar_data(URLs.PETS) function, it says downloading… and the data gets downloaded, but nothing gets added to the data section of the Kaggle kernel. Note: I am able to run the whole notebook …
Category: Data Science

Face recognition - How to make an image classifier with large number of classes?

I am planning to make an image classifier that identifies the face of every player in the English Premier League. I have a couple of questions (since until now I have only worked with small or academic datasets). My questions: How do I download this many different images? Since it's pretty hard to manually download the pictures individually, is there a way to automate it? I'm following this platform and am required to make a different class for each player. …
Category: Data Science

Improve performance of my CNN model

I am working on an image classification problem. There are 876 images in the training dataset and 600 in the test dataset. It is a multi-class classification of plants. Since this is my first CNN problem, I started working with TensorFlow and Keras to build my model and then started using transfer learning to improve my performance. The best model which I built so far using Keras is getting me a score of 0.68 (since the test dataset has …
Category: Data Science

Arguments in python fast.ai function that are not in the function definition?

I have been coming across function calls that use arguments that are not in the function definition. I would like to know how that works (i.e. how the compiler interprets this). For example, this function call: interp.plot_confusion_matrix(figsize=(12,12), dpi=60) uses the variables "figsize" and "dpi", but neither of them turn up in the definition of plot_confusion_matrix: help(interp.plot_confusion_matrix) gives: plot_confusion_matrix(normalize: bool = False, title: str = 'Confusion matrix', cmap: Any = 'Blues', norm_dec: int = 2, slice_size: int = None, **kwargs) -> …
Category: Data Science
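
For the question above, here is a minimal sketch of how `**kwargs` lets a call pass arguments that are not named in the function definition. The functions `make_figure` and `plot_matrix` are hypothetical stand-ins, not fastai or matplotlib code; the assumption is only that the outer function forwards its extra keyword arguments to an inner call that does name them.

```python
def make_figure(figsize=(6, 4), dpi=100):
    """Hypothetical inner function that actually names figsize and dpi."""
    return {"figsize": figsize, "dpi": dpi}


def plot_matrix(title="Confusion matrix", **kwargs):
    """Hypothetical outer function: any keyword argument not named in its
    signature is collected into the dict `kwargs`."""
    # Unpack the collected keywords into the inner call.
    figure = make_figure(**kwargs)
    figure["title"] = title
    return figure


# figsize and dpi are not in plot_matrix's signature, yet the call works:
# they travel through kwargs and are matched by name inside make_figure.
result = plot_matrix(figsize=(12, 12), dpi=60)
print(result)
```

This happens at call time in the interpreter: Python binds the named parameters first and puts every leftover keyword argument into the `kwargs` dictionary. A keyword that the inner function does not accept would raise a `TypeError` there, not in the outer function.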

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.