New to PyTorch and the PyTorch Forecasting library, and trying to predict multiple targets using the Temporal Fusion Transformer model. I have 7 targets in a list as my targets variable. I'm using MultiLoss as my loss function, with a list of 7 CrossEntropy loss functions (one per target variable): in the problem I'm trying to model, there are 7 possible outcomes per time step and I'm trying to find which option is most likely. I'm looking for a …
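For reference, a minimal sketch of how that loss wiring might look in pytorch-forecasting (assuming `training` is a TimeSeriesDataSet that already declares the 7 categorical targets):

```python
from pytorch_forecasting import TemporalFusionTransformer
from pytorch_forecasting.metrics import MultiLoss, CrossEntropy

# One CrossEntropy term per target; MultiLoss combines them into one objective.
tft = TemporalFusionTransformer.from_dataset(
    training,  # assumed: an existing TimeSeriesDataSet with all 7 targets
    loss=MultiLoss([CrossEntropy() for _ in range(7)]),
)
```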
I am trying to train and predict on the SVHN dataset (VGG architecture). I get very high validation/test accuracy by just taking the largest output class. However, the output values are large positive and negative numbers. Are they supposed to be passed through exp(output)/sum(exp(output)) to be converted to probabilities? Thank you!
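A short sketch of that conversion (softmax over the class dimension; `model` and `images` are stand-ins). Note that the argmax of the raw outputs and the argmax of the softmax probabilities pick the same class, since softmax is monotonic:

```python
import torch.nn.functional as F

logits = model(images)            # raw scores: can be large positive/negative
probs = F.softmax(logits, dim=1)  # exp(x) / sum(exp(x)) over the class dim
pred = probs.argmax(dim=1)        # identical to logits.argmax(dim=1)
```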
I have several parts of one image that share one caption... I need to do image captioning by evaluating every part of the image to decide which part the caption belongs to. So do I need to extract features from the parts of the image and pass them to the model with the caption, or how can I do it, please? For example, the dataset I have contains the parts of the image, which are divided into three parts: "beach, sea, …
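One possible first step (a sketch under assumptions, not necessarily the right pipeline for this dataset): extract one feature vector per image part with a pretrained CNN, then pair those vectors with the caption in whatever captioning or matching model follows. Here `parts` is a hypothetical list of PIL crops:

```python
import torch
import torchvision.models as models
import torchvision.transforms as T

preprocess = T.Compose([T.Resize((224, 224)), T.ToTensor()])
backbone = models.resnet50(pretrained=True)
backbone.fc = torch.nn.Identity()  # drop the classifier, keep 2048-d features
backbone.eval()

with torch.no_grad():
    feats = torch.stack([backbone(preprocess(p).unsqueeze(0)).squeeze(0)
                         for p in parts])  # (num_parts, 2048)
```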
I am trying to deploy a web app to Heroku. The free tier is limited to 500 MB. I am using my ResNet34 model as a .pkl file, and I create the model from it using the fastai library. This project requires torch and torchvision as dependencies, but not pinning the dependency downloads the latest version of torch, which alone is 750 MB and exceeds the memory limit. So, I specify the torchvision version as 0.2.2 and specify the wheel for torch …
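A common workaround (a sketch, with illustrative version numbers rather than the ones from this project) is to pin the CPU-only torch wheels, which are far smaller than the default CUDA builds:

```
# requirements.txt -- versions are illustrative; +cpu wheels skip the CUDA bits
-f https://download.pytorch.org/whl/torch_stable.html
torch==1.5.0+cpu
torchvision==0.6.0+cpu
fastai
```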
I am new to deep learning, so forgive me if this is an obvious mistake; I have tried to find similar questions online, yet none seem relevant to my problem. I am using PyTorch for image classification and my accuracy is stuck at ~40% even though my framework seems fine. Am I missing something major? My data has 5 columns: Age, Gender, Ethnicity, Image name, and "pixel" values (which are already scaled). I want to target the "Ethnicity" output, …
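For context, a minimal Dataset sketch for data laid out like this (everything here is assumed: that `df` is the 5-column dataframe, that the "pixels" column holds a flat, already-scaled array per row, and that 48×48 grayscale is the image shape):

```python
import torch
from torch.utils.data import Dataset

class EthnicityDataset(Dataset):
    def __init__(self, df):
        self.df = df

    def __len__(self):
        return len(self.df)

    def __getitem__(self, i):
        row = self.df.iloc[i]
        # reshape the flat pixel array into (channels, height, width)
        x = torch.tensor(row["pixels"], dtype=torch.float32).view(1, 48, 48)
        y = torch.tensor(row["Ethnicity"], dtype=torch.long)
        return x, y
```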
In my dataset, a data point is essentially a time series of 6 features per month over a year, so in all it results in 6*12 = 72 features. I need to find class outliers, so I perform dimensionality reduction (hoping the differences in the data are maintained), then apply k-means clustering and compute distances. For dimensionality reduction I have tried PCA and a simple autoencoder to reduce the dimension from 72 to 6, but the results are unsatisfactory. Can anyone please suggest any other …
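The pipeline as described, as a runnable sketch (X of shape (n_samples, 72) and the cluster count are assumptions; distance of each point to its own centroid serves as the outlier score):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

Z = PCA(n_components=6).fit_transform(X)           # 72 -> 6 dimensions
km = KMeans(n_clusters=5, random_state=0).fit(Z)   # n_clusters is a guess
dist = np.linalg.norm(Z - km.cluster_centers_[km.labels_], axis=1)
outliers = np.argsort(dist)[-10:]                  # 10 most distant points
```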
In the config file, VGG layer weights are initialized this way:

```python
from easydict import EasyDict as edict

MODEL = edict()
MODEL.VGG_LAYER_WEIGHTS = dict(conv3_4=1/8, conv4_4=1/4, conv5_4=1/2)
```

But how can it be initialized using a parser? I have tried the following:

```python
parser.add_argument('--VGG_LAYER_WEIGHTS', type=dict,
                    default=conv3_4=1/8, conv4_4=1/4, conv5_4=1/2,
                    help='VGG_LAYER_WEIGHTS')
```

But I got an error. Please help me write it correctly.
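One way that works (a sketch; argparse applies the `type` converter to string defaults too, so json.loads handles both the default and command-line values):

```python
import argparse
import json

parser = argparse.ArgumentParser()
# accept the dict as a JSON string; json.loads turns it into a real dict
parser.add_argument('--VGG_LAYER_WEIGHTS', type=json.loads,
                    default='{"conv3_4": 0.125, "conv4_4": 0.25, "conv5_4": 0.5}',
                    help='per-layer VGG loss weights as a JSON string')
args = parser.parse_args()
print(args.VGG_LAYER_WEIGHTS)  # {'conv3_4': 0.125, 'conv4_4': 0.25, 'conv5_4': 0.5}
```

Invoked as, e.g., `--VGG_LAYER_WEIGHTS '{"conv3_4": 0.125, "conv4_4": 0.25, "conv5_4": 0.5}'`.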
I set up a YOLOv4 PyTorch framework in Google Colab by running git clone https://github.com/roboflow-ai/pytorch-YOLOv4.git. I generated checkpoints by training. As we need a more robust model, I trained again, assigning the pretrained checkpoint, but the loss started at a high value, just like the first training run. The training command is:

```
!python train.py -b 2 -s 1 -l 0.001 -g 0 -pretrained ./Yolov4_epoch100_latest.pth -classes 1 -dir ./train -epochs 100
```

I am not sure if my pretrained checkpoint is used …
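A quick sanity check (a sketch, not code from the repo; `model` stands for the repo's Yolov4 network): load the checkpoint yourself and see whether its keys match the model's.

```python
import torch

ckpt = torch.load('./Yolov4_epoch100_latest.pth', map_location='cpu')
state = ckpt.get('state_dict', ckpt)   # some checkpoints nest the weights

# strict=False reports mismatches instead of raising; many missing or
# unexpected keys would mean the checkpoint names don't match the model
missing, unexpected = model.load_state_dict(state, strict=False)
print(len(missing), len(unexpected))
```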
I'm training a ResNet50 for image classification and I'm interested in decreasing the dimensionality of the embedding layer, in order to apply some clustering techniques. The suggested dimension is something in the range 64-256, so I thought I'd start from 128. I'm using PyTorch. After loading the pretrained ResNet50 from the official release I would usually do this:

```python
model = t.load(cfg.resnet_path)
model.fc = nn.Sequential(nn.Linear(in_features=2048, out_features=num_classes, bias=True))
```

Everything worked and I reached an accuracy of …
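A hedged sketch of one way to get a 128-d embedding: put a bottleneck before the classifier, then use the 128-d activations (the output of the first Linear) as the clustering features:

```python
import torch.nn as nn

embedding_dim = 128
model.fc = nn.Sequential(
    nn.Linear(2048, embedding_dim),       # 2048-d ResNet features -> 128-d embedding
    nn.ReLU(inplace=True),
    nn.Linear(embedding_dim, num_classes) # classifier head on top of the embedding
)
```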
I am currently attempting to reimplement a paper on fall detection (https://ieeexplore.ieee.org/abstract/document/9186597). It requires a background subtraction algorithm called Mask R-CNN. Are there any current implementations of this algorithm for background subtraction?
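Not the paper's implementation, but torchvision does ship a pretrained Mask R-CNN whose instance masks can serve as a foreground/background split; a sketch (`frame` is an assumed float tensor of shape (3, H, W) with values in [0, 1]):

```python
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()

with torch.no_grad():
    out = model([frame])[0]  # dict with 'boxes', 'labels', 'scores', 'masks'

person = (out['labels'] == 1) & (out['scores'] > 0.5)        # COCO label 1 = person
fg_mask = out['masks'][person].sum(dim=0).squeeze(0) > 0.5   # (H, W) boolean mask
```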
I've pretrained the RoBERTa model with new data using the 'simpletransformers' library:

```python
from simpletransformers.classification import ClassificationModel

OUTPUT_DIR = 'roberta_output/'
model = ClassificationModel('roberta', 'roberta-base', use_cuda=False, num_labels=22,
                            args={'overwrite_output_dir': True, 'output_dir': OUTPUT_DIR})
model.train_model(train_df)
result, model_outputs, wrong_predictions = model.eval_model(test_df)  # evaluation on test data
```

where 'train_df' is a pandas dataframe that consists of many samples (= rows) with two columns: the 1st column is the text data (input); the 2nd column is a category (= label, output). I need to create the same model and pretrain …
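If the goal is to recreate the trained model later, one hedged sketch is to point simpletransformers at the training output directory, which holds the saved weights:

```python
# load the fine-tuned weights back by passing the output dir as the model name
model = ClassificationModel('roberta', OUTPUT_DIR, use_cuda=False, num_labels=22)
predictions, raw_outputs = model.predict(['some new text'])
```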
I am trying to learn the basics of PyTorch so I can assemble my own CNNs. One thing I am also trying to learn is navigating the API documentation; specifically, at the moment I am trying to read through nn.Conv2d. I quote the documentation: Applies a 2D convolution over an input signal composed of several input planes. In the simplest case, the output value of the layer with input size $(N, C_{in}, H, W)$ and output $(N, C_{out}, H_{out}, W_{out})$ can be precisely described as $$ …
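A small shape check in the documentation's notation may help while reading (the sizes here are just examples):

```python
import torch
import torch.nn as nn

# N=4 images, C_in=3 channels, H=W=32; C_out=16 with a 3x3 kernel and
# padding=1, which keeps H_out=H and W_out=W
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
x = torch.randn(4, 3, 32, 32)   # (N, C_in, H, W)
print(conv(x).shape)            # torch.Size([4, 16, 32, 32]) = (N, C_out, H_out, W_out)
```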
There are some easy and comprehensive textbooks covering many deep learning concepts using PyTorch in detail, but I am a little dissatisfied with the weight given to RNNs compared to CNNs; some textbooks do not cover RNNs at all. I need to learn NLP using PyTorch in great detail, so I want a list of books, or a single book, that covers RNNs, LSTMs, etc. using PyTorch in great detail, i.e., from beginner to advanced. I …
I am trying to build a machine learning model in Python, using PyTorch and sklearn. My model is a bit complicated: I have between one and 8 input features but several target variables. My target variables are essentially time series; the shape of my target variable is 50x169, while the shape of my input features is between 50x1 and 50x8. I show three different series in the uploaded figure. I used algorithms like DecisionTreeRegressor and RandomForestRegressor …
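With the shapes described, sklearn's forests already handle multi-output targets natively; a sketch with random stand-in data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

X = np.random.rand(50, 8)      # inputs: 50 samples, up to 8 features
Y = np.random.rand(50, 169)    # targets: one 169-step series per sample

# RandomForestRegressor supports multi-output regression out of the box
model = RandomForestRegressor(n_estimators=100).fit(X, Y)
print(model.predict(X[:2]).shape)   # (2, 169): one full series per sample
```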
I am trying to understand RNNs. I have a good sense of how they work in theory. But then in PyTorch you have two extra dimensions in your input data: batch size (number of batches) and sequence length. The model I am working on is a simple one-to-one model: it takes in a letter and then estimates the following letter. The model is provided here. First, please correct me if I am wrong about the following: batch size is …
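For reference, a shape sketch of those two extra dimensions with nn.RNN (the sizes here are arbitrary examples):

```python
import torch
import torch.nn as nn

# with batch_first=True the input is (batch_size, seq_len, input_size):
# here, 8 sequences of 5 letters, each letter a 26-d one-hot-style vector
rnn = nn.RNN(input_size=26, hidden_size=64, batch_first=True)
x = torch.randn(8, 5, 26)
out, h_n = rnn(x)
print(out.shape)   # torch.Size([8, 5, 64]): one output per time step
print(h_n.shape)   # torch.Size([1, 8, 64]): final hidden state per sequence
```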
I am new to ResNet models. I want to implement a ResNet50 model for semantic segmentation. I am following the code from this video, but my numclasses is 21. I have a few questions: if I pass any RGB JPEG image into the model, I get an output of size (1, 21). What does this output represent? Since I am doing semantic segmentation, my images don't have any RGB channels, so what should I put for image_channels in self.conv1? …
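For contrast, a hedged sketch (not the video's code) of what a segmentation-flavored ResNet50 returns in torchvision: a score map with num_classes values per pixel, rather than a single (1, 21) vector per image:

```python
import torch
import torchvision

model = torchvision.models.segmentation.fcn_resnet50(num_classes=21)
model.eval()

x = torch.randn(1, 3, 224, 224)   # (N, C, H, W) RGB input
with torch.no_grad():
    out = model(x)['out']         # (1, 21, 224, 224): 21 class scores per pixel
mask = out.argmax(dim=1)          # (1, 224, 224): predicted class per pixel
```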
I recently downloaded the CamemBERT model to fine-tune it for my purposes. Upon unzipping the file the contents are: Upon loading the model.pt file using PyTorch:

```python
import torch
model = torch.load(model_saved_at)
```

I saw that the model was in OrderedDict format containing the following keys: args, model, optimizer_history, extra_state, last_optimizer_state. As the names suggest, most of them are OrderedDicts themselves, with the exception of args, which belongs to the class argparse.Namespace. Using vars() we can see args only contains some hyperparameters and values …
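A sketch of poking further at a checkpoint with this layout (these keys match a fairseq-style checkpoint, which CamemBERT was trained with; the actual weights live under the 'model' key as a state_dict):

```python
import torch

ckpt = torch.load(model_saved_at, map_location='cpu')
print(ckpt.keys())            # args, model, optimizer_history, ...
print(vars(ckpt['args']))     # training hyperparameters as a plain dict
state_dict = ckpt['model']    # mapping: parameter name -> weight tensor
print(list(state_dict)[:5])   # first few parameter names
```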
I have implemented the code from https://towardsdatascience.com/image-feature-extraction-using-pytorch-e3b327c3607a?gi=7b5fd7b03ed1 for image feature extraction. But it is confusing that both a 224×224 input image and a 448×448 input image work fine. As I understand it, pretrained VGG16 (without changing its trained weights) only takes 224×224 input images. I suppose the 1st layer, (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)), can take a larger size of images, but the pretrained weights cannot extend to a larger dimension of inputs. Am I right?
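A quick experiment that bears on the question: the convolutional part of torchvision's VGG16 runs on both sizes with the same pretrained weights, and only the feature-map size changes (the classic fixed-size constraint comes from the fully connected classifier, not the conv kernels):

```python
import torch
import torchvision.models as models

# the same 3x3 kernels slide over any spatial size
features = models.vgg16(pretrained=True).features.eval()

with torch.no_grad():
    print(features(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 512, 7, 7])
    print(features(torch.randn(1, 3, 448, 448)).shape)  # torch.Size([1, 512, 14, 14])
```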
How to get a sentence embedding using BERT?

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
sentence = 'I really enjoyed this movie a lot.'

# 1. Tokenize the sequence:
tokens = tokenizer.tokenize(sentence)
print(tokens)
print(type(tokens))

# 2. Add [CLS] and [SEP] tokens:
tokens = ['[CLS]'] + tokens + ['[SEP]']
print(" Tokens are \n {} ".format(tokens))

# 3. Pad the input:
T = 15
padded_tokens = tokens + ['[PAD]' for _ in range(T - len(tokens))]
print("Padded tokens are \n {} ".format(padded_tokens))
attn_mask = [1 if token != '[PAD]' else 0 for token in padded_tokens]
print("Attention Mask are \n {} ".format(attn_mask))
```
…
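A hedged sketch of the steps the excerpt cuts off (reusing `tokenizer`, `padded_tokens`, and `attn_mask` from above): convert the tokens to ids, run BertModel, and take the [CLS] position of the last hidden state as one common choice of sentence embedding:

```python
import torch
from transformers import BertModel

model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

ids = torch.tensor([tokenizer.convert_tokens_to_ids(padded_tokens)])
mask = torch.tensor([attn_mask])
with torch.no_grad():
    out = model(input_ids=ids, attention_mask=mask)

sentence_emb = out.last_hidden_state[:, 0]   # (1, 768): embedding at [CLS]
```

Mean-pooling the non-[PAD] token vectors is another common alternative to the [CLS] vector.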