I was trying to follow this notebook to fine-tune BERT for the text summarization task. Everything was fine until I came to this instruction in the Evaluation section to evaluate my model: model = EncoderDecoderModel.from_pretrained("checkpoint-500") An error appears: OSError: checkpoint-500 is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models' If this is a private repository, make sure to pass a token having permission to this repo with use_auth_token or log in with huggingface-cli login and …
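from_pretrained only falls back to the Hugging Face Hub when its argument is not an existing local directory, so the usual fix is to pass the path to the checkpoint folder the Trainer actually wrote. A minimal sketch (the output_dir name below is an assumption; use whatever TrainingArguments.output_dir was set to in the notebook):

```python
import os
from transformers import EncoderDecoderModel

# Hypothetical path: "<output_dir>/checkpoint-500" as written by the Trainer.
checkpoint_dir = "./bert2bert-summarization/checkpoint-500"

assert os.path.isdir(checkpoint_dir), f"{checkpoint_dir} is not a local folder"
model = EncoderDecoderModel.from_pretrained(checkpoint_dir)
```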
I want to fine-tune BERT by training it on a domain-specific dataset of my own. The domain is specialized and includes many terms that probably weren't in the original dataset BERT was trained on. I know I have to use BERT's tokenizer, as the model was originally trained on its embeddings. To my understanding, words unknown to the tokenizer will be mapped to the [UNK] token. What if some of these words are common in my dataset? Does it make sense …
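If some domain terms are frequent enough, one option is to add them to the tokenizer's vocabulary and resize the embedding matrix before fine-tuning. A minimal sketch (the example terms are placeholders; by default BERT's WordPiece tokenizer splits unseen words into sub-tokens rather than replacing them with [UNK]):

```python
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Hypothetical domain-specific terms to register as whole tokens.
new_tokens = ["myocarditis", "troponin"]
tokenizer.add_tokens(new_tokens)

# The new tokens get freshly initialised embedding rows, which are then
# learned during fine-tuning on the domain corpus.
model.resize_token_embeddings(len(tokenizer))
```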
In transfer learning, often only the last layer of the network is retrained using gradient descent. However, the last layer of a common neural network performs only a linear transformation, so why do we use gradient descent and not linear (or logistic) regression to finetune the last layer?
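For intuition, retraining only the last layer on frozen features is the same optimisation problem as fitting a (multinomial) logistic regression on those features, so a convex solver can indeed be used instead of gradient descent. A rough sketch with placeholder data and an assumed ResNet-18 backbone:

```python
import torch
import torchvision
from sklearn.linear_model import LogisticRegression

# Frozen pretrained backbone with its final fully connected layer removed,
# so it only produces feature vectors. (Older torchvision versions use
# pretrained=True instead of the weights argument.)
backbone = torchvision.models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = torch.nn.Identity()
backbone.eval()

def extract_features(images):
    with torch.no_grad():
        return backbone(images).numpy()

# Placeholder images/labels; fitting logistic regression on the frozen
# features is exactly "retraining only the last layer", just with a convex
# solver rather than gradient descent.
images = torch.randn(32, 3, 224, 224)
labels = torch.randint(0, 2, (32,)).numpy()
clf = LogisticRegression(max_iter=1000).fit(extract_features(images), labels)
```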
Fine-tuning is a concept commonly used in deep learning: we may take a pre-trained model and then fine-tune it for our specific task. Does that apply to simple models, such as logistic regression? For example, let's say I have a dataset with attribute variables of an animal and I want to classify whether or not it is a mammal. The labels on that dataset are only "mammal"/"not mammal". I then train a logistic regression model for this …
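One way this idea carries over to a plain logistic regression is warm-starting: keep the coefficients learned on the first dataset and continue optimising them on the new one. A small sketch with synthetic placeholder data (loss="log_loss" assumes scikit-learn >= 1.1; older versions call it "log"):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Placeholder data: X_pre/y_pre stand for the original dataset,
# X_new/y_new for the new "mammal" / "not mammal" labels.
rng = np.random.default_rng(0)
X_pre, y_pre = rng.normal(size=(1000, 5)), rng.integers(0, 2, 1000)
X_new, y_new = rng.normal(size=(100, 5)), rng.integers(0, 2, 100)

# Logistic regression trained with SGD; warm_start keeps the learned
# coefficients, so the second fit continues from them instead of restarting.
clf = SGDClassifier(loss="log_loss", warm_start=True, random_state=0)
clf.fit(X_pre, y_pre)   # "pre-training"
clf.fit(X_new, y_new)   # "fine-tuning" on the new task
```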
I'm fine-tuning the wav2vec-xlsr model. I've created a virtual env for that and installed CUDA 11.0 and tensorflow-gpu==2.5.0, but it gives the following error: ValueError: Mixed precision training with AMP or APEX (--fp16 or --bf16) and half precision evaluation (--fp16_full_eval or --bf16_full_eval) can only be used on CUDA devices. I want to fine-tune the model on a GPU. Any help?
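That error is raised by the PyTorch-based Hugging Face Trainer, not by TensorFlow, so the first thing to check is whether the installed torch build can actually see the GPU. A quick diagnostic sketch:

```python
import torch

print(torch.__version__)          # should report a +cuXXX build, not +cpu
print(torch.cuda.is_available())  # must be True for --fp16 to be usable

# If this prints False, install a CUDA-enabled PyTorch wheel matching the
# local CUDA toolkit, or disable mixed precision in the training arguments,
# e.g. TrainingArguments(..., fp16=torch.cuda.is_available()).
```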
I have a doubt regarding terminology. When dealing with huggingface transformer models, I often read about "using pretrained models for classification" vs. "fine-tuning a pretrained model for classification." I fail to understand what the exact difference between these two is. As I understand, pretrained models by themselves cannot be used for classification, regression, or any relevant task, without attaching at least one more dense layer and one more output layer, and then training the model. In this case, we would …
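One way to see the distinction in code: both start from the same pretrained checkpoint with a fresh classification head on top; the only difference is which parameters are updated. A sketch with bert-base-uncased as an assumed example:

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# "Using the pretrained model" as a fixed feature extractor: freeze the
# pretrained encoder and train only the newly added head.
for param in model.bert.parameters():
    param.requires_grad = False

# "Fine-tuning a pretrained model": skip the freezing loop above so the
# pretrained weights themselves are also updated on the downstream data.
```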
I understand this falls under the decision making aspect, rather than the probabilistic, but for the purposes of some work I am doing, I need the classifier to have very high precision, as I can't afford a false positive. I do not care about false negatives, and consequently, do not care about recall. Since it is currently a binary classifier, some might say to play with the decision probability threshold from its current 0.5 value, but I will eventually need …
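Until the probabilistic model itself changes, a threshold targeting a given precision can be chosen from a held-out validation set. A sketch with placeholder labels and scores:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Placeholder validation labels and predicted probabilities from the
# current binary classifier.
rng = np.random.default_rng(0)
y_val = rng.integers(0, 2, 500)
p_val = np.clip(y_val * 0.6 + rng.normal(0.2, 0.25, 500), 0.0, 1.0)

precision, recall, thresholds = precision_recall_curve(y_val, p_val)

# Pick the smallest threshold whose precision meets the requirement; recall
# is deliberately ignored since false negatives are acceptable here.
target_precision = 0.99
ok = precision[:-1] >= target_precision
threshold = thresholds[ok][0] if ok.any() else 1.0
y_pred = (p_val >= threshold).astype(int)
```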
BERT can be fine-tuned on a dataset for a specific task. Is it possible to fine-tune it on all of these datasets for different tasks at once, so that one model can then be used for all of those tasks, instead of fine-tuning a separate BERT model for each task?
I want to create a sequence classification BERT model. The input of the model will be two sentences. But I want to fine-tune the model on large-context data consisting of multiple sentences (where the number of tokens could exceed 512). Is it okay if the size of the training data and the size of the actual input data are different? Thanks
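BERT can only attend to 512 positions, so longer training documents have to be truncated (or split into chunks) by the tokenizer; as long as training and inference inputs go through the same tokenization, their raw lengths can differ. A small sketch:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Placeholder long inputs: each "sentence" is really a multi-sentence context.
text_a = "First piece of context. " * 100
text_b = "Second piece of context. " * 100

enc = tokenizer(
    text_a, text_b,
    truncation=True,     # truncates the pair down to max_length
    max_length=512,
    return_tensors="pt",
)
print(enc["input_ids"].shape)   # (1, 512)
```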
My project uses a metric to evaluate the performance of a regression model, and it is not one of the basic metrics in machine learning (MSE, MAE, ...). So how can I tune the model based on my metric?
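If scikit-learn is an option, any function of (y_true, y_pred) can be wrapped with make_scorer and used for hyperparameter tuning. A sketch with a made-up custom metric and model:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV

# Hypothetical custom metric: replace with the project-specific one.
def my_metric(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred) / (np.abs(y_true) + 1.0))

scorer = make_scorer(my_metric, greater_is_better=False)  # lower is better

X, y = make_regression(n_samples=200, n_features=5, random_state=0)
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    scoring=scorer,      # hyperparameters are tuned against the custom metric
    cv=3,
)
search.fit(X, y)
print(search.best_params_)
```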
I'm an online learning beginner. From my perspective, an online learning model is a model that can update its parameters as data flows in (I've seen an article pointing out that an incremental model is independent of time, while online learning emphasizes data arriving as a time series). Here I regard them as one thing. And in my view, most deep learning models can be fine-tuned, as we fine-tune a pre-trained BERT model; does that mean that a deep learning model being fine-tunable is equivalent to …
I fine-tuned a transformer for classification to compute similarity between names. This is a toy example of the training data:

    name0   name1   label
    Test    Test    y
    Test    Hi      n

I fine-tuned the transformer using the label, feeding it pairs of names, since its tokenizer allows feeding two pieces of text. I found a really weird behavior: at prediction time, there exist pairs that have a very high chance of being predicted as similar just because they have repeated …
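One way to probe this behaviour is to score a few hand-built pairs directly. A sketch (bert-base-uncased is a stand-in for the actual fine-tuned checkpoint, and the class index for "similar" is an assumption):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Replace with the path to the fine-tuned checkpoint.
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
model.eval()

def similarity(name0: str, name1: str) -> float:
    enc = tokenizer(name0, name1, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = model(**enc).logits.softmax(dim=-1)
    return probs[0, 1].item()   # assumed probability of the "similar" class

# Pairs with repeated tokens vs. ordinary pairs, to isolate the effect.
print(similarity("John John", "John John"))
print(similarity("John Smith", "Jane Doe"))
```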
I'm currently using an autoencoder CNN built upon the VGG-16 architecture that was designed by someone else. I want to replicate their results using their dataset first, but I'm finding that:
- Validation losses diverge from training losses fairly early on (I get to around 10 epochs and it already looks like it's overfitting).
- At its best, the validation losses aren't even close to being as low as the training losses.
- In general, the accuracy is still worse than reported in …
I have used a model I built myself and also fine-tuned two other models, ResNet50 and VGG16, but val_acc remains the same for all of them.

    import tensorflow as tf
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense,
                                         Dropout, Flatten, MaxPooling2D)

    model_1 = Sequential()
    model_1.add(Conv2D(32, kernel_size=(3,3), padding='same', activation='relu',
                       input_shape=(224,224,3)))
    model_1.add(MaxPooling2D(2,2))
    model_1.add(Dropout(0.3))
    model_1.add(Conv2D(64, kernel_size=(3,3), padding='same', activation='relu'))
    model_1.add(Flatten())
    model_1.add(Dropout(0.3))
    model_1.add(BatchNormalization())
    model_1.add(Dense(1, activation='sigmoid'))
    model_1.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

    # the generators already yield batches, so no batch_size is passed here
    history_1 = model_1.fit(train_gen, epochs=3, validation_data=val_gen)

Results:

    CPU Frequency: 2199995000 Hz
    Epoch 1/3
    13/13 [==============================] - 81s 6s/step - loss: 2.3583 - accuracy: 0.3001 - val_loss: 0.3717 - val_accuracy: 0.1900 …
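For comparison, a minimal transfer-learning setup for the ResNet50 variant might look like the sketch below (this is an assumption, since that code isn't shown in the question; train_gen and val_gen are the same binary-label generators used above). If val_accuracy is still flat with the pretrained base frozen, the labels and generators are worth checking before the architectures.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D

base = ResNet50(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                      # freeze the pretrained weights first

model_2 = Sequential([
    base,
    GlobalAveragePooling2D(),
    Dense(128, activation="relu"),
    Dense(1, activation="sigmoid"),
])
model_2.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
# history_2 = model_2.fit(train_gen, epochs=3, validation_data=val_gen)
```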
I would like to fine-tune the pre-trained RetinaNet model available in torchvision in order to create my own object detector. I'm trying to replicate what is done for Faster R-CNN at this link: https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html#finetuning-from-a-pretrained-model What I have done is the following:

    model = torchvision.models.detection.retinanet_resnet50_fpn(pretrained=True)
    num_classes = 2
    # get the number of input features and anchor boxes for the classifier
    in_features = model.head.classification_head.conv[0].in_channels
    num_anchors = model.head.classification_head.num_anchors
    # replace the pre-trained head with a new one
    model.head = RetinaNetHead(in_features, num_anchors, …
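For reference, a completed version of that head replacement might look like this (a sketch, assuming a torchvision version where classification_head.conv[0] is a plain Conv2d and RetinaNetHead takes in_channels, num_anchors, num_classes):

```python
import torchvision
from torchvision.models.detection.retinanet import RetinaNetHead

model = torchvision.models.detection.retinanet_resnet50_fpn(pretrained=True)
num_classes = 2  # one object class plus background

# Number of input channels and anchors expected by the existing head.
in_features = model.head.classification_head.conv[0].in_channels
num_anchors = model.head.classification_head.num_anchors

# Replace the pre-trained head with a new one sized for num_classes.
model.head = RetinaNetHead(in_features, num_anchors, num_classes)
```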
I am currently trying to use transfer learning on ResNet152 obtained from Keras Applications:

    tf.keras.applications.ResNet152(
        weights="imagenet",
        input_shape=(400, 250, 3)
    )

I know that to freeze all the layers I need to set the trainable attribute to False, but right now I only need to freeze certain layers. More specifically, I need to unfreeze the last three layers of this model but freeze the rest. So how do I do that?
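A sketch of one way to do this by iterating over the model's layers (include_top=False is an assumption here, since the default 224x224 classifier top would not accept a (400, 250, 3) input):

```python
import tensorflow as tf

base_model = tf.keras.applications.ResNet152(
    weights="imagenet",
    include_top=False,
    input_shape=(400, 250, 3),
)

# Make the model trainable overall, then freeze everything except the
# last three layers.
base_model.trainable = True
for layer in base_model.layers[:-3]:
    layer.trainable = False

# Quick check of which layers will actually be updated during training.
for layer in base_model.layers[-5:]:
    print(layer.name, layer.trainable)
```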
Firstly, thank you so much for looking at this post. I could really use some help. I have followed this guide as closely as possible: https://github.com/kingoflolz/mesh-transformer-jax I'm trying to fine-tune GPT-J with a small dataset of ~500 lines: You are important to me. <|endoftext|> I love spending time with you. <|endoftext|> You make me smile. <|endoftext|> feel so lucky to be your friend. <|endoftext|> You can always talk to me, even if it’s about something that makes you nervous or …
I found instructions for building this kind of custom model on Azure: Prepare data for Custom Speech. However, I would like to either fine-tune or do transfer learning on Google Colaboratory or in Docker. In that case, what machine learning framework do you recommend using? If you know of a GitHub repo or articles for this challenge, could you share them with me?