Methods for augmenting binary datasets

I have a small dataset (~100 samples) with roughly 20 features, most of which are binary and a few (~5) numeric. I want to augment the training set to see whether I can improve test accuracy. What methods or code can I use to augment a mostly binary dataset?
Category: Data Science
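For a mixed binary/numeric table like this, the usual answer is SMOTE-NC from imbalanced-learn. The sketch below (NumPy only; the function name and column-index arguments are mine, not from the question) shows the underlying idea: interpolate numeric columns between neighbours and copy binary columns from one of the two endpoints.

```python
import numpy as np

def augment_mixed(X, num_idx, bin_idx, n_new, k=5, rng=None):
    """SMOTE-NC-style synthetic rows for mixed numeric/binary features.

    Numeric columns are interpolated between a sample and one of its k
    nearest neighbours; binary columns are copied from either endpoint at
    random, so they stay strictly 0/1. The caller supplies which column
    indices are numeric (num_idx) and which are binary (bin_idx).
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    new_rows = []
    for _ in range(n_new):
        i = rng.integers(len(X))
        # nearest neighbours by Euclidean distance on the numeric columns
        d = np.linalg.norm(X[:, num_idx] - X[i, num_idx], axis=1)
        nn = np.argsort(d)[1:k + 1]          # skip the point itself
        j = rng.choice(nn)
        lam = rng.random()
        row = X[i].copy()
        row[num_idx] = X[i, num_idx] + lam * (X[j, num_idx] - X[i, num_idx])
        row[bin_idx] = np.where(rng.random(len(bin_idx)) < 0.5,
                                X[i, bin_idx], X[j, bin_idx])
        new_rows.append(row)
    return np.vstack([X, new_rows])
```

With ~100 samples, cross-validate carefully: augment only the training folds, never the held-out fold.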

How to train a keras model on both original and augmented data from ImageDataGenerator?

I have a dataset of about 87000 images in a directory, with each class in a separate subfolder. I've tried the ImageDataGenerator class and the flow_from_directory() function for generating the images, and it worked fine, but I have a question: does flow_from_directory() yield only the augmented images? And if so, how can I train my model, which has overfit the training set, on both the original and the augmented data? Thanks
Category: Data Science
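One way to guarantee the model sees both raw and transformed images is to build the batches yourself. This is a minimal NumPy stand-in (the function and the `augment` callback are mine, not part of the Keras API) that leaves half of every batch untouched:

```python
import numpy as np

def mixed_batches(X, y, augment, batch_size=32, rng=None):
    """Yield batches containing both original and augmented samples.

    `augment` is any function mapping a batch of images to a transformed
    batch (a stand-in for whatever ImageDataGenerator would do). The first
    half of each yielded batch is left untouched, so the model keeps
    seeing raw images alongside their augmented versions.
    """
    rng = np.random.default_rng(rng)
    n = len(X)
    while True:
        idx = rng.choice(n, size=batch_size, replace=False)
        xb, yb = X[idx].copy(), y[idx]
        half = batch_size // 2
        xb[half:] = augment(xb[half:])    # augment only the second half
        yield xb, yb

# usage sketch: model.fit(mixed_batches(X_train, y_train, aug_fn),
#                         steps_per_epoch=len(X_train) // 32, ...)
```

Note that with small random transforms this often matters less than it seems: the identity transform is inside the range of most augmentations anyway.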

Why is validation accuracy going down after augmentation?

My main question is about augmentation. I assumed that augmenting is always better than having less data, but in my case the validation accuracy went down:

- train: 7000 images, validation: 3000 images → validation accuracy: 0.89
- train: 40000 images, validation: 17990 images → validation accuracy: 0.85

My augmentation code:

```python
def data_augmentation_folder(trainImagesPath, saveDir):
    # X_train = load_training_data(trainImagesPath, "train")
    print("=====================================================")
    X_train = cleanData(trainImagesPath)
    X_train = np.array(X_train)
    print(X_train[0].shape)
    for i in range(5):
        # print(i)
        datagen = ImageDataGenerator(rotation_range=15,
                                     width_shift_range=0.1,
                                     height_shift_range=0.1,
                                     shear_range=0.01,
                                     zoom_range=[0.9, 1.25],
                                     horizontal_flip=True,
                                     vertical_flip=False,
                                     fill_mode='reflect',
                                     …
```
Category: Data Science

Same validation accuracy, different train accuracy for two neural networks models

I'm performing emotion classification on the FER2013 dataset. I'm comparing the performance of different models, and when I tried ImageDataGenerator with a model I had already used, I ran into the following situation:

- Model without data augmentation: train_accuracy = 0.76, val_accuracy = 0.70
- Model with data augmentation: train_accuracy = 0.86, val_accuracy = 0.70

As you can see, validation accuracy is the same for both models, but train accuracy is significantly different. In this case: Should I go with …
Category: Data Science

Keras data augmentation: length of the data

I'm confused: when I add data augmentation, should I end up with more data or the same amount? I checked the length of x_train to confirm, but I got the same length before and after augmentation. Is that correct, or should the dataset have doubled?

```python
print(len(x_train))
# output: 5484

# after augmentation:
datagen = ImageDataGenerator(
    featurewise_center=True,             # set input mean to 0 over the dataset
    samplewise_center=True,              # set each sample mean to 0
    featurewise_std_normalization=True,  # divide inputs by std
    …
```
Category: Data Science
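What the question observes is expected: Keras-style generators transform batches on the fly and never enlarge the underlying array. A minimal stand-in (plain NumPy, not the actual Keras API) makes the point:

```python
import numpy as np

x_train = np.random.rand(5484, 32, 32, 3)

def flow(x, transform, batch_size=32):
    """Stand-in for datagen.flow(): yields transformed batches forever,
    but never touches or enlarges the underlying array."""
    while True:
        idx = np.random.randint(0, len(x), size=batch_size)
        yield transform(x[idx])

gen = flow(x_train, transform=lambda b: np.flip(b, axis=2))  # horizontal flip
batch = next(gen)

print(len(x_train))   # still 5484: augmentation happens per batch, in memory
print(batch.shape)    # (32, 32, 32, 3)
```

The "more data" effect comes from the model seeing a differently transformed variant of each image every epoch, not from a bigger stored array.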

Non-Real Time Data Augmentation for CNN Classification. What are the drawbacks?

When people talk about and use data augmentation, are they mostly referring to real-time data augmentation? In image classification, that means augmenting the data right before fitting the model, with a new augmented image generated every epoch. In that case only augmented images are used to train the model and the raw image is never used, so the size of the input doesn't actually change. But what about non-real-time data augmentation? By this, I mean …
Category: Data Science
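For contrast with the real-time case described above, non-real-time (offline) augmentation applies the transforms once, up front, and stores the results. A hedged sketch (function name and transforms are my own choices for illustration):

```python
import numpy as np

def offline_augment(X, y, transforms, copies_per_transform=1):
    """Non-real-time augmentation: apply each transform up front and
    concatenate the results with the raw images, so the stored dataset
    really does grow (unlike per-batch, on-the-fly augmentation)."""
    parts_x, parts_y = [X], [y]
    for t in transforms:
        for _ in range(copies_per_transform):
            parts_x.append(t(X))
            parts_y.append(y)
    return np.concatenate(parts_x), np.concatenate(parts_y)

X = np.random.rand(100, 28, 28)
y = np.random.randint(0, 2, size=100)
X_aug, y_aug = offline_augment(
    X, y,
    transforms=[lambda b: np.flip(b, axis=2),         # horizontal flip
                lambda b: np.rot90(b, axes=(1, 2))])  # 90-degree rotation
print(X_aug.shape)  # (300, 28, 28): raw images plus two augmented copies
```

The usual drawbacks are storage cost and a fixed, finite set of variants: every epoch replays the same augmented copies instead of drawing fresh ones.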

What is the difference between Keras API augmentation and the definition of data augmentation?

Data augmentation is defined as increasing the number of images, using rotations, crops, and flips, to avoid overfitting. The Keras API applies augmentation but does not increase the number of images. So what does Keras augmentation actually do to the images? Is the API's augmentation a form of image preprocessing? Does augmentation replace the original images with the new augmented ones?
Category: Data Science
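The resolution of the apparent contradiction is that Keras applies a freshly randomised transform to the same stored images each epoch; nothing is replaced and nothing is added. A NumPy stand-in (not the Keras implementation; the random shift is just an example transform):

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.random((100, 32, 32))

def random_augment(batch, rng):
    """Stand-in for what a Keras-style generator does each epoch: apply a
    randomly parameterised transform (here, a random horizontal shift) to
    the same underlying images. The originals are never replaced."""
    shift = rng.integers(-2, 3)
    return np.roll(batch, shift, axis=2)

epoch1 = random_augment(x_train, np.random.default_rng(1))

print(len(epoch1))                        # 100: the count never grows
print(np.shares_memory(epoch1, x_train))  # False: originals untouched
```

So the "increase" is effective rather than literal: over many epochs the model sees many distinct variants of each image.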

Are there techniques for creating synthetic data for a regression problem? I tried SMOTE and its variants, but those are for classification

In my data, "Volume" is the target variable and all the others are independent variables. I applied a LabelEncoder to Area_categ, wind_direction_labelencod, and current_label_encode, and now I want to apply a technique that increases my dataset's rows and columns, the way SMOTE balances classes in classification. Please suggest a solution; if it is possible with deep learning techniques, please do help us.
Category: Data Science
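SMOTE's interpolation idea does carry over to regression (this is the core of SMOTE-for-regression variants such as SMOTER): interpolate both the features and the continuous target between nearest neighbours. A NumPy sketch, with the function name and parameters my own:

```python
import numpy as np

def augment_regression(X, y, n_new, k=5, rng=None):
    """Interpolation-based synthetic rows for regression: pick a random
    row, pick one of its k nearest neighbours, and linearly interpolate
    both the feature vector and the continuous target between them."""
    rng = np.random.default_rng(rng)
    X, y = np.asarray(X, float), np.asarray(y, float)
    new_X, new_y = [], []
    for _ in range(n_new):
        i = rng.integers(len(X))
        d = np.linalg.norm(X - X[i], axis=1)
        j = rng.choice(np.argsort(d)[1:k + 1])   # one of the k nearest rows
        lam = rng.random()
        new_X.append(X[i] + lam * (X[j] - X[i]))
        new_y.append(y[i] + lam * (y[j] - y[i]))
    return np.vstack([X, new_X]), np.concatenate([y, new_y])
```

Caveat: interpolating label-encoded categoricals (like the encoded wind direction) produces meaningless in-between codes, so those columns should be copied from an endpoint rather than interpolated.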

Data augmentation in images

Suppose there is an ML network that takes grayscale images as input, while the images I have are RGB. Instead of converting the RGB images to grayscale, I treat each individual colour band as a distinct input to the network; that is, instead of feeding RGB image A to the network, I feed the R matrix of A as the first input, followed by the G matrix and then the B matrix. This leads to 3 times more …
Category: Data Science
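The channel-splitting scheme described above is a couple of lines of array manipulation. A sketch (names are mine; it assumes channel-last images and that each band inherits the original label):

```python
import numpy as np

def split_channels(X_rgb, y):
    """Turn each (H, W, 3) RGB image into three single-channel samples,
    each inheriting the original label, tripling the dataset as the
    question describes."""
    # move the channel axis first, then flatten (N, 3, H, W) -> (3N, H, W)
    X_split = np.transpose(X_rgb, (0, 3, 1, 2)).reshape(-1, *X_rgb.shape[1:3])
    y_split = np.repeat(y, 3)                  # R, G, B share one label
    return X_split, y_split

X = np.random.rand(10, 64, 64, 3)
y = np.arange(10)
Xs, ys = split_channels(X, y)
print(Xs.shape, ys.shape)  # (30, 64, 64) (30,)
```

Whether this counts as useful augmentation is debatable: the three bands of one photo are highly correlated, so the effective information gain is much less than 3x.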

Should synthetic data be oversampled as well?

I'm building a binary text classifier where the ratio between positives and negatives is 1:100 (100 / 10000). Using back translation as augmentation, I was able to generate 400 more positives. Then I decided to upsample to balance the data. Should I include only the original positive data points (100), or also the 400 I generated? I will definitely try both, but I wanted to know if there is any rule of …
Category: Data Science
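Mechanically, the two options differ only in which pool the upsampling draws from. A sketch of the "include synthetic" option (function name and the random-feature placeholders are mine; real inputs would be text embeddings or token ids):

```python
import numpy as np

def oversample(pos, syn_pos, neg, rng=None):
    """Upsample the positive class to match the negatives, drawing with
    replacement from the pool of real + synthetic (back-translated)
    positives. Whether to include syn_pos in the pool is exactly the
    question's judgment call; this sketch includes them."""
    rng = np.random.default_rng(rng)
    pool = np.concatenate([pos, syn_pos])
    idx = rng.integers(0, len(pool), size=len(neg))
    X = np.concatenate([pool[idx], neg])
    y = np.r_[np.ones(len(neg)), np.zeros(len(neg))]
    return X, y

pos = np.random.rand(100, 8)        # 100 real positives (placeholder features)
syn = np.random.rand(400, 8)        # 400 back-translated positives
neg = np.random.rand(10000, 8)      # 10000 negatives
X_bal, y_bal = oversample(pos, syn, neg, rng=0)
print(X_bal.shape)                  # (20000, 8): classes now 1:1
```

Including the synthetic positives in the pool reduces how many exact duplicates of the 100 real positives end up in training, which is usually the argument for doing so.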

When using data augmentation, is it OK to validate only with the original images?

I'm working on a multi-class deep learning problem and was getting severe overfitting. My model is supposed to classify sunglasses into 17 different brands, but I only had around 400 images per brand, so I created a folder with the data augmented 3x, generating images with these parameters:

```python
datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')
```

After doing so I got these results: I don't know if it's correct to do the validation only using the …
Category: Data Science
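Validating on original images only is not just OK, it is the recommended practice, provided the split happens *before* augmentation so no augmented copy of a validation image leaks into training. A sketch of that ordering (names and the flip transform are mine):

```python
import numpy as np

def split_then_augment(X, y, augment, val_frac=0.2, copies=3, rng=None):
    """Hold out an untouched validation set *before* augmenting, so
    validation scores reflect real images only; the augmented copies go
    into the training split alone."""
    rng = np.random.default_rng(rng)
    idx = rng.permutation(len(X))
    n_val = int(len(X) * val_frac)
    val, tr = idx[:n_val], idx[n_val:]
    X_tr = np.concatenate([X[tr]] + [augment(X[tr]) for _ in range(copies)])
    y_tr = np.tile(y[tr], copies + 1)          # labels repeat per copy
    return X_tr, y_tr, X[val], y[val]

X = np.random.rand(400, 32, 32, 3)
y = np.random.randint(0, 17, size=400)
X_tr, y_tr, X_val, y_val = split_then_augment(X, y, lambda b: np.flip(b, axis=2))
print(X_tr.shape, X_val.shape)  # (1280, 32, 32, 3) (80, 32, 32, 3)
```

Augmenting before splitting is the classic leakage mistake: near-duplicates of training images land in validation and the score becomes optimistic.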

How can I keep my CNN binary classification model from overfitting and underfitting?

I am working on the cats & dogs classification problem. My model is overfitting, and I have tried every technique I know to fix it, such as dropout, data augmentation, and L2 and L1 regularization, but nothing is working. Can you please help me? At the end of training, my train accuracy was 0.7868 and my validation accuracy was 0.7044. My images are 48x48 with 3 channels, and the batch size is 128. …
Category: Data Science

Data Augmentation Multi Outputs

This question has been asked several times here on SE, but I haven't been able to find the right answer. I'm trying to build a network with 1 input and 2 outputs. I don't have a lot of data, so I would like to use a generator for augmentation (preferably with imgaug). My code:

```python
seq = iaa.Sequential([
    ....
])
gen = ImageDataGenerator(preprocessing_function=seq.augment_image)
batch_size = 64

def generate_data_generator(generator, X, Y1, Y2):
    genX = gen.flow(X, batch_size=batch_size, seed=42)
    genY1 = gen.flow(Y1, batch_size=batch_size, seed=42)
    while …
```
Category: Data Science
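The key requirement is that the input batch and both label arrays stay aligned. A NumPy sketch of a 1-input / 2-output generator (my own names; it uses one shared index per batch, which is what the matching `seed=42` in the question's paired flows is trying to achieve):

```python
import numpy as np

def multi_output_batches(X, y1, y2, augment, batch_size=64, rng=None):
    """Generator for a 1-input / 2-output model: augment the input batch
    and yield it with *both* label arrays, kept aligned by drawing one
    shared index for all three."""
    rng = np.random.default_rng(rng)
    while True:
        idx = rng.choice(len(X), size=batch_size, replace=False)
        yield augment(X[idx]), [y1[idx], y2[idx]]

X = np.random.rand(200, 32, 32, 3)
y1 = np.random.randint(0, 5, size=200)   # e.g. a classification head
y2 = np.random.rand(200)                 # e.g. a regression head
gen = multi_output_batches(X, y1, y2, lambda b: np.flip(b, axis=2))
xb, (y1b, y2b) = next(gen)
print(xb.shape, y1b.shape, y2b.shape)  # (64, 32, 32, 3) (64,) (64,)
```

One caveat with the question's approach: running class labels through `gen.flow` only makes sense when Y1/Y2 are themselves images (e.g. masks); plain labels should be indexed, not augmented.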

Removing outliers from a multi-dimensional dataset & Data augmentation

Removing outliers from single-dimensional data is easy: drop the points that fall outside the IQR range. But how should outliers be detected and removed when the dataset has multiple dimensions? Here's my approach: the dataset consists of seven dimensions; laid out as a dataframe, that is seven columns, with each row holding the properties of a single data point. I looped …
Category: Data Science
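The per-column loop the question describes can be vectorised: compute each column's IQR fence and keep only rows that pass in every column. A sketch (function name is mine; this treats columns independently, so it will miss outliers that are only unusual as a *combination* of values, for which Mahalanobis distance or IsolationForest are the usual upgrades):

```python
import numpy as np

def iqr_filter(values):
    """Row-wise outlier removal over several columns: keep a row only if
    every one of its values lies inside that column's
    [Q1 - 1.5*IQR, Q3 + 1.5*IQR] range."""
    X = np.asarray(values, float)
    q1, q3 = np.percentile(X, [25, 75], axis=0)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    mask = np.all((X >= lo) & (X <= hi), axis=1)
    return X[mask], mask

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 7))
X[0] = 100.0                       # plant an obvious outlier row
clean, kept = iqr_filter(X)
print(X.shape, clean.shape)        # row 0 (and any natural outliers) dropped
```

Note that filtering column-by-column compounds: with seven columns, even well-behaved data loses a noticeable fraction of rows to chance exceedances.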

Baseline model and transfer learning

I've tried to find guidance on using transfer learning when building baseline models for ML projects (a CNN in my case), but found no clues about good practice on the matter. My reasoning is that a baseline model should not be pretrained, since pretraining complicates the baseline before there is any demonstrated need for it. But this wouldn't be the first time my logic was wrong when it comes to DS. …
Category: Data Science

Why do we call the Mixup method a data augmentation technique?

I am a bit confused about the Mixup data augmentation technique, so let me explain the problem briefly. What is Mixup? (For further detail you may refer to the original paper.) With classic augmentation techniques (e.g., jittering, scaling, magnitude warping) we double or quadruple the data: for instance, if the original dataset contained 4000 samples, there will be 8000 samples after augmentation. On the other hand, according to my understanding, in Mixup data augmentation we do …
Category: Data Science
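Mixup, as defined in the original paper (Zhang et al.), forms convex combinations of random pairs of samples and their labels with a Beta-distributed weight. It still counts as augmentation because each batch is mixed on the fly, so the model never trains on a stored, fixed dataset of raw samples, even though the sample count per batch never changes. A minimal sketch:

```python
import numpy as np

def mixup_batch(X, y_onehot, alpha=0.2, rng=None):
    """Mixup: convex-combine random pairs of samples and their one-hot
    labels with a Beta(alpha, alpha) weight. No new samples are stored;
    each training batch is mixed freshly, which is why Mixup is
    considered (real-time) data augmentation despite the count
    staying constant."""
    rng = np.random.default_rng(rng)
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(X))
    X_mix = lam * X + (1 - lam) * X[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return X_mix, y_mix

X = np.random.rand(128, 32, 32, 3)
y = np.eye(10)[np.random.randint(0, 10, size=128)]
X_mix, y_mix = mixup_batch(X, y, rng=0)
print(X_mix.shape)             # (128, 32, 32, 3): batch size unchanged
```

So the contrast in the question is real: classic offline augmentation grows the stored dataset (4000 → 8000), while Mixup enlarges the *effective* training distribution without adding rows.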
