Train a spaCy model for semantic similarity

I'm attempting to train a spaCy model for the purposes of computing semantic similarity but I'm not getting the results I would anticipate. I have created two text files that contain many sentences that use a new term, "PROJ123456". For example, "PROJ123456 is on track." I've added each to a DocBin and saved them to disk as train.spacy and dev.spacy. I'm then running: python -m spacy train config.cfg --output ./output --paths.train ./train.spacy --paths.dev ./dev.spacy The config.cfg file contains: [paths] train …
Category: Data Science

Why does Light GBM model produce different results while testing?

Using the Light GBM regressor, I have trained my data and, using Grid Search, I got the best parameters, but while testing with the best parameters I am getting different results each time, which means the model produces different results for each test iteration. I ran the lightgbm twice with the same parameters, but got different results in validation. I found the only random seed parameter to be baggingSeed. After fixing baggingSeed, the problem also occurred. Should I fix any …
Category: Data Science

Can depth be used as a feature when predicting rock type from well log data?

I am trying to predict the lithofacies, i.e. the rock type, from well log data, a project very similar to the one described in this tutorial. A well log can be seen as a 1D curve tracking how a given property (e.g. gamma radiation or electrical resistivity) varies as a function of depth. The idea is to use these 1D arrays as the input features to train a Machine Learning model (e.g. SVM or Random Forest), to infer the facies …
Topic: training svm
Category: Data Science

Too many hours for Training Custom Object Detector

I am following this tutorial: https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html When I reach the training paragraph (Training the Model) and run this command: python3 model_main_tf2.py --model_dir=models/my_ssd_resnet50_v1_fpn --pipeline_config_path=models/my_ssd_resnet50_v1_fpn/pipeline.config I get messages like this: Instructions for updating: `seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead. W0128 18:24:29.575707 140532950755136 deprecation.py:341] From /usr/local/lib/python3.9/dist-packages/tensorflow/python/util/dispatch.py:1096: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version. Instructions for updating: `seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead. WARNING:tensorflow:From /usr/local/lib/python3.9/dist-packages/tensorflow/python/autograph/impl/api.py:465: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a …
Topic: cnn training
Category: Data Science

Is it recommended to train a NER model using a dataset that has all tokens annotated?

I'd like to train a model to predict the constant and variable parts in log messages. For example, considering the log message "Example log 1", the trained model would be able to identify "1" as the variable and "Example" and "log" as the constants. To train the model, I'm thinking of leveraging a training dataset that would have all tokens in all of the log entries annotated. For example, for a particular log entry in the dataset, we would have a …
Category: Data Science

Error: An operation has `None` for gradient with categorical_crossentropy

I am trying to train my discriminator network using Keras with the TensorFlow backend. The network is meant to classify the input into one of the 9 output labels. I am passing a 2D input (height, width, no channels) and a one-hot vector for the output. I was able to train the network independently using fit(). However, now that I have switched to train_on_batch, it is giving me the error mentioned. This is my discriminator code: def build_discriminator(time_steps, feature_size, input_spectrogram=None): …
Category: Data Science

Given is the result of the model performance. Help me with this MCQ

You also evaluate your model on the test set, and find the following:

Human-level performance: 0.1%
Training set error: 2.0%
Dev set error: 2.1%
Test set error: 7.0%

What does this mean? (Check the two best options.)

You have underfit to the dev set.
You should get a bigger test set.
You should try to get a bigger dev set.
You have overfit to the dev set.
Category: Data Science
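
The figures in this question can be compared by plain subtraction. The sketch below is only that arithmetic, not an answer to the quiz; the names `errors` and `error_gaps` are hypothetical and the numbers are copied from the question.

```python
# Error rates in percent, copied from the question above.
errors = {"human": 0.1, "train": 2.0, "dev": 2.1, "test": 7.0}

def error_gaps(e):
    """Return the gap between consecutive error levels, in percentage points."""
    return {
        "human_to_train": round(e["train"] - e["human"], 1),
        "train_to_dev": round(e["dev"] - e["train"], 1),
        "dev_to_test": round(e["test"] - e["dev"], 1),
    }

gaps = error_gaps(errors)
for name, gap in gaps.items():
    print(f"{name}: {gap} percentage points")
```

The dev-to-test gap is by far the largest of the three, which is the feature of these numbers the options are asking about.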

Is it beneficial to use a batch size > 1 even when all computing power can be used?

With regard to training a neural network, it is often said that increasing the batch size decreases the network's ability to generalize, as alluded to here. This is due to the fact that training on large batches causes the network to converge to sharp minima, as opposed to wide ones, as explained here. This raises the question: In situations where all available computing power can be used by training on a batch size of one, is there a benefit to …
Category: Data Science

Converting a negative loss term to inverse

I'm training a classifier using this loss function: $$ \mathcal{L} = \mathcal{L}_{CE} - \lambda_1 \mathcal{L}_{push} +\lambda_2 \mathcal{L}_{pull} $$ I need to maximize a certain value using $\mathcal{L}_{push}$ and that's why it has a negative coefficient. The problem is while I'm training the model the loss value became negative and I keep getting random accuracy results. I tried changing $- \lambda_1 \mathcal{L}_{push}$ to $\lambda_1 \frac{1}{ \mathcal{L}_{push}}$ to get numeric stability and results are not bad anymore. The thing is I'm not …
Category: Data Science
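
To make the two formulations in this question concrete, here is a minimal sketch that evaluates both for a growing push term. It assumes every loss term is positive, and the function names and the sample values for the losses and for $\lambda_1$, $\lambda_2$ are made up for illustration; nothing here comes from the asker's code.

```python
def loss_subtracted(l_ce, l_push, l_pull, lam1, lam2):
    """Original form: L_CE - lam1 * L_push + lam2 * L_pull."""
    return l_ce - lam1 * l_push + lam2 * l_pull

def loss_reciprocal(l_ce, l_push, l_pull, lam1, lam2):
    """Modified form: L_CE + lam1 / L_push + lam2 * L_pull (needs L_push > 0)."""
    return l_ce + lam1 / l_push + lam2 * l_pull

# Hypothetical values: the push term grows while the other terms stay fixed.
for l_push in (0.5, 5.0, 50.0):
    sub = loss_subtracted(0.5, l_push, 0.2, lam1=1.0, lam2=1.0)
    rec = loss_reciprocal(0.5, l_push, 0.2, lam1=1.0, lam2=1.0)
    print(f"L_push={l_push}: subtracted={sub:.3f}, reciprocal={rec:.3f}")
```

Under the positivity assumption, the subtracted form can decrease without limit as the push term grows, while the reciprocal form stays above zero. The two forms also weight the push term differently: the derivative with respect to $\mathcal{L}_{push}$ is $-\lambda_1$ in the first case and $-\lambda_1 / \mathcal{L}_{push}^2$ in the second.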

Training loss decreasing while Validation loss is not decreasing

I am wondering why the validation loss of this regression problem is not decreasing, even though I have tried several methods such as making the model simpler, adding early stopping, various learning rates, and also regularizers; none of them have worked properly. Any suggestions would be appreciated. Here is my code and my outputs: optimizer = keras.optimizers.Adam(lr=1e-3) model = Sequential() model.add(LSTM(units=50, activation='relu', activity_regularizer=tf.keras.regularizers.l2(1e-2), return_sequences=True, input_shape=(x_train.shape[1], x_train.shape[2]))) model.add(Dropout(0.2)) model.add(LSTM(units=50, activation='relu', activity_regularizer=tf.keras.regularizers.l2(1e-2), return_sequences=False)) model.add(Dropout(0.2)) model.add(Dense(y_train.shape[1])) model.compile(optimizer=optimizer, loss='mae') callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3) history = …
Category: Data Science

How to gather training data for simple voice commands?

I'm trying to build a machine learning model for recognizing simple voice commands like up, down, left, etc. On similar problems based on images, I'd just take the picture and assign a label to it. I can generate features and visualize them using librosa. And I hear CNNs are amazing at this task. So, I was wondering how I'd gather training data for such audio based systems, since I can't record an entire clip considering my commands are only going …
Category: Data Science

How are parameters selected in cross-validation?

Suppose I'm training a linear regression model using k-fold cross-validation. I'm training K times each time with a different training and test data set. So each time I train, I get different parameters (feature coefficients in the linear regression case). So I will have K parameters at the end of cross-validation. How do I arrive at the final parameters for my model? If I'm using it to tune hyperparameters as well, do I have to do another cross-validation after fixing …
Category: Data Science
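
The situation described in this question can be sketched in a few lines. The example below assumes one common practice: the K per-fold coefficient sets are used only to estimate held-out error, and the final coefficients come from one refit on all the data. The helpers `fit_line` and `k_fold_mse` and the toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x

def k_fold_mse(xs, ys, k):
    """Return the per-fold coefficients and the mean held-out squared error."""
    n = len(xs)
    coefs, fold_errors = [], []
    for fold in range(k):
        held_out = set(range(fold, n, k))  # every k-th sample is held out
        train = [(xs[i], ys[i]) for i in range(n) if i not in held_out]
        test = [(xs[i], ys[i]) for i in range(n) if i in held_out]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        coefs.append((a, b))
        fold_errors.append(sum((a * x + b - y) ** 2 for x, y in test) / len(test))
    return coefs, sum(fold_errors) / k

# Toy data, made up for illustration: y = 2x + 1 with a small alternating offset.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.1 if i % 2 else -0.1) for i, x in enumerate(xs)]

fold_coefs, cv_mse = k_fold_mse(xs, ys, k=5)  # K coefficient sets plus one score
final_a, final_b = fit_line(xs, ys)           # final model: one refit on all data
print(len(fold_coefs), round(cv_mse, 4), round(final_a, 3), round(final_b, 3))
```

In this sketch `cv_mse` is the number one would compare across hyperparameter settings; the per-fold coefficients in `fold_coefs` are discarded afterwards.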

What parameters to use when normalising training, validation, and testing data?

I know a similar post was made here, but I wanted to ask some follow up questions. I am conducting a cross-validation search to find values of a set of hyper-parameters and need to normalise the data. If we split up the data as follows:

1. 'Training' (call this set 'A' for now) and testing data
2. Split the 'training' into training (call this set 'B' for now) and validation sets

what parameters should be used when normalising the datasets? Am I …
Category: Data Science
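
As a rough illustration of the splits in this question, the sketch below follows one common convention: the normalisation statistics are computed on the data the model is fitted on (set 'B' during the search) and then reused unchanged on the held-out sets. The helper names and the toy values are hypothetical, and standardisation by mean and standard deviation is an assumed choice of normalisation.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Compute the mean and standard deviation from the fitting data only."""
    return mean(values), pstdev(values)

def transform(values, mu, sigma):
    """Standardise values using statistics that were computed elsewhere."""
    return [(v - mu) / sigma for v in values]

# Toy single-feature splits, made up for illustration.
train_b = [1.0, 2.0, 3.0, 4.0, 5.0]
validation = [2.5, 6.0]
test = [0.0, 3.0]

mu, sigma = fit_scaler(train_b)                       # statistics from 'B' only
train_scaled = transform(train_b, mu, sigma)
validation_scaled = transform(validation, mu, sigma)  # reuse, do not refit
test_scaled = transform(test, mu, sigma)
print(mu, round(sigma, 4), validation_scaled, test_scaled)
```

Held-out values may fall outside the range seen in training after scaling, as the second validation value does here; that is expected when the statistics are not refitted.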

NNs for fitting highly oscillatory functions

In a scientific computing application of neural networks, I have to maximize several neural networks with scalar output with respect to a target/loss function (coming from a weak form of a PDE). It is known from theoretical considerations that typically the functions that would be optimal with respect to the target function (i.e. the maximizers) are extremely oscillatory functions. I suppose that this is the reason why - according to my first numerical experiments - typical network architectures, initializations and …
Category: Data Science

Accuracy and Loss in MLP

I am trying to explore models for predicting whether a team will win or lose based on features about the team and their opponent. My training data is 15k samples with 760 numerical features. Each sample represents a game between two teams and the features are long and short term statistics about each team at the time of the game (e.g. average points over the last 10 games). My thought was to use a multilayer perceptron as a binary classifier. …
Category: Data Science

How can I train a model to modify a vector by rewarding the model based on the modified vector's nearest neighbors?

I am experimenting with a document retrieval system in which I have documents represented as vectors. When queries come in, they are turned to vectors by the same method as used for the documents. The query vector's k nearest neighbors are retrieved as the results. Each query has a known answer string. In order to improve performance, I am now looking to create a model that modifies the query vector. What I was looking to do was use a model …
Category: Data Science
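
The retrieval step described in this question can be sketched as follows. The question does not say which distance is used, so cosine similarity is an assumption here, and the function names, document ids, and vectors are all hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def k_nearest(query, documents, k):
    """Return the ids of the k documents most similar to the query, best first."""
    ranked = sorted(
        documents,
        key=lambda doc_id: cosine_similarity(query, documents[doc_id]),
        reverse=True,
    )
    return ranked[:k]

# Toy document vectors, made up for illustration.
documents = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
    "doc_d": [0.0, 0.0, 1.0],
}
query = [1.0, 0.2, 0.0]
top_2 = k_nearest(query, documents, k=2)
print(top_2)
```

A model that modifies the query vector would sit between building `query` and calling `k_nearest`; whether the known answer appears among the returned ids is the signal the question proposes to reward.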

Binary classification from local and global feature selection

I want to train a deep learning model on images. My question is which scenario should be chosen to train the model? Scenario 1: I train the images' local context on Output 1 and the images' global context on Output 2, and finally combine these two outputs to get a binary classification. Scenario 2: Train global and local context directly on the binary classification. This is what I mean by local and global context (this is just an example):
Category: Data Science

Coefficients values in filter in Convolutional Neural Networks

I'm starting to learn how convolutional neural networks work, and I have a question regarding the filters. Are these chosen manually or are they generated by the network in training? If it's the latter, are the coefficients in the filters chosen at random, and then as the network is trained they are "corrected"? Any help or insight you might be able to provide me in this matter is greatly appreciated!
Category: Data Science
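
In standard CNN training the filter coefficients are not chosen by hand: they start from random values and are adjusted by gradient descent. The toy sketch below shows that idea for a single 1D filter in plain Python. The signal, the target filter, the learning rate, and the step count are made-up values, and a real network would learn many filters through backpropagation in a framework rather than through this hand-written loop.

```python
import random

def correlate(signal, kernel):
    """Valid 1D cross-correlation, the operation a convolutional layer applies."""
    width = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(width))
        for i in range(len(signal) - width + 1)
    ]

random.seed(0)
signal = [random.uniform(-1.0, 1.0) for _ in range(50)]
target_kernel = [1.0, 0.0, -1.0]  # hypothetical filter the training should recover
target = correlate(signal, target_kernel)

kernel = [random.uniform(-0.5, 0.5) for _ in range(3)]  # random initial coefficients
learning_rate = 0.1
for _ in range(2000):
    output = correlate(signal, kernel)
    residual = [o - t for o, t in zip(output, target)]
    # Gradient of the mean squared error with respect to each coefficient.
    grad = [
        2.0 * sum(r * signal[i + j] for i, r in enumerate(residual)) / len(residual)
        for j in range(len(kernel))
    ]
    kernel = [w - learning_rate * g for w, g in zip(kernel, grad)]

print([round(w, 3) for w in kernel])
```

After training, `kernel` should be close to `target_kernel`, which is the "corrected" behaviour the question asks about.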
