I recently read Fully Convolutional Networks for Semantic Segmentation by Jonathan Long, Evan Shelhamer, and Trevor Darrell. I don't understand what "deconvolutional layers" do or how they work. The relevant part is section 3.3, "Upsampling is backwards strided convolution": Another way to connect coarse outputs to dense pixels is interpolation. For instance, simple bilinear interpolation computes each output $y_{ij}$ from the nearest four inputs by a linear map that depends only on the relative positions of the input and output cells. In …
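For intuition, the "deconvolution" the paper describes is a convolution whose forward pass is the backward pass of a strided convolution, i.e. a learnable upsampling. A minimal sketch in Keras (the layer and shape choices here are mine, not the paper's):

```python
# Minimal sketch: a "deconvolutional" layer is a learnable, strided
# transposed convolution. Shapes and sizes are illustrative assumptions.
import tensorflow as tf

coarse = tf.random.normal((1, 8, 8, 21))   # e.g. coarse class scores, 21 classes

# Upsample 2x with a transposed convolution; kernel_size=4, stride=2 is a
# common choice whose kernel can represent exact bilinear interpolation.
upsample = tf.keras.layers.Conv2DTranspose(
    filters=21, kernel_size=4, strides=2, padding="same", use_bias=False)

dense = upsample(coarse)
print(dense.shape)  # (1, 16, 16, 21): 8x8 scores upsampled to 16x16

# Unlike fixed bilinear interpolation, the kernel weights are trainable,
# so the network can learn the upsampling end-to-end by backpropagation.
```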
I am new to machine learning. I have 10,000 examples of 128x256 arrays of values in 0.0-1.0. Each example is a pair: a clean array and the same array with noise added. I am aiming to train a CNN (or an autoencoder?) with these examples. I am currently able to train one dense layer without errors. My first problem is that my prediction returns a 128x256 int array rather than floats. My larger question is about finding a starting …
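In case it helps frame the question: integer predictions usually point to integer-typed input arrays, and a small convolutional denoiser is one plausible starting architecture. A minimal sketch, assuming Keras and float32 data (the layer sizes are illustrative, not tuned):

```python
# Minimal denoising-CNN sketch; random stand-in data, illustrative sizes.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# noisy/clean pairs; casting to float32 avoids integer predictions caused
# by integer-typed input arrays
noisy = np.random.rand(100, 128, 256, 1).astype("float32")
clean = np.random.rand(100, 128, 256, 1).astype("float32")

model = models.Sequential([
    layers.Input(shape=(128, 256, 1)),
    layers.Conv2D(16, 3, padding="same", activation="relu"),
    layers.Conv2D(16, 3, padding="same", activation="relu"),
    # sigmoid keeps outputs in [0, 1], matching the data range
    layers.Conv2D(1, 3, padding="same", activation="sigmoid"),
])
model.compile(optimizer="adam", loss="mse")
model.fit(noisy, clean, epochs=2, batch_size=8)

denoised = model.predict(noisy[:1])
print(denoised.dtype)  # float32
```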
I am dealing with a spatio-temporal forecasting problem similar to the NYC Taxi Demand Prediction one. This case is a good example since it has already been covered in several papers using different models and techniques. The most recent ones I read were this one (GSTNet: Global Spatial-Temporal Network for Traffic Flow Prediction), where they try to predict taxi demand using a GSTNet model, as well as this one (Deep Multi-View Spatial-Temporal Network for Taxi Demand …
I have some problems with layer construction in Keras. Let me explain the whole problem: I have a feature matrix with dimensions 2023 (rows) x 65 (features). I tried to build a CNN, with Conv1D as the first layer. My code is:

```python
def cnn_model():
    model = Sequential()
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
    model.add(Dropout(0.25))
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
    model.add(Dropout(0.25))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(Dense(64, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam', metrics=['mse', 'mae'])
    model.fit(X, Y, epochs=100, batch_size=64, verbose=0)
    model.evaluate(X, Y)
    return model

scoring = make_scorer(score_func=pearson)
# evaluate model with standardized …
```
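For reference, Conv1D expects 3-D input of shape (samples, steps, channels), so a 2023 x 65 matrix generally has to be reshaped and the model given an explicit input shape. A minimal sketch of that fix, using random stand-in data:

```python
# Sketch of a working Conv1D setup for a 2023 x 65 feature matrix;
# the data here is random, for illustration only.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Input, Conv1D, Dropout, MaxPooling1D,
                                     Flatten, Dense)

X = np.random.rand(2023, 65).astype("float32")
Y = np.random.rand(2023).astype("float32")

# Conv1D needs (samples, steps, channels): treat the 65 features as a
# length-65 sequence with one channel
X = X.reshape(-1, 65, 1)

model = Sequential([
    Input(shape=(65, 1)),            # missing in the original code
    Conv1D(64, 3, activation='relu'),
    Dropout(0.25),
    Conv1D(64, 3, activation='relu'),
    Dropout(0.25),
    MaxPooling1D(2),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(1, activation='sigmoid'),
])
model.compile(loss='mse', optimizer='adam', metrics=['mse', 'mae'])
model.fit(X, Y, epochs=2, batch_size=64, verbose=0)
```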
I have generated a pre-trained word2vec model using the Gensim framework (https://radimrehurek.com/gensim/auto_examples/index.html#documentation). The dataset has 507 sentences, each labeled as positive or negative. After performing all text processing, I used Gensim to generate the pre-trained word2vec model. The model has 234 unique words, with each vector having 300 dimensions. However, I have a question: how can I use the generated word2vec embedding vectors as input to a CNN?
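One common pattern is to copy the trained vectors into a frozen Keras Embedding layer in front of the CNN. A minimal sketch, assuming gensim 4.x; the toy corpus, sequence length, and layer sizes are illustrative stand-ins, not the question's real data:

```python
# Feed gensim word2vec vectors into a CNN via a frozen Embedding layer.
import numpy as np
from gensim.models import Word2Vec
from tensorflow.keras import layers, models
from tensorflow.keras.initializers import Constant

sentences = [["good", "movie"], ["bad", "plot"]]          # toy corpus
w2v = Word2Vec(sentences, vector_size=300, min_count=1)
word_index = {w: i + 1 for i, w in enumerate(w2v.wv.index_to_key)}  # 0 = padding

# copy each word's word2vec vector into an embedding matrix
embedding_matrix = np.zeros((len(word_index) + 1, 300))
for word, idx in word_index.items():
    embedding_matrix[idx] = w2v.wv[word]

model = models.Sequential([
    layers.Input(shape=(50,)),        # sentences as padded sequences of word ids
    layers.Embedding(len(word_index) + 1, 300,
                     embeddings_initializer=Constant(embedding_matrix),
                     trainable=False),
    layers.Conv1D(128, 5, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(1, activation="sigmoid"),   # positive / negative
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```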
This question boils down to "how exactly do convolution layers work?" Suppose I have an $n \times m$ greyscale image, so the image has one channel. In the first layer, I apply a $3\times 3$ convolution with $k_1$ filters and padding. Then I have another convolution layer with $5 \times 5$ convolutions and $k_2$ filters. How many feature maps do I have? Type 1 convolution: the first layer gets executed. After that, I have $k_1$ feature maps (one for each …
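A quick shape check may help here: most frameworks implement the convolution in which each filter spans all input channels, so the second layer yields $k_2$ feature maps rather than $k_1 \cdot k_2$. The numbers below are illustrative:

```python
# Shape check for stacked convolutions; each filter spans all input channels.
import tensorflow as tf

n, m, k1, k2 = 32, 32, 8, 16
x = tf.random.normal((1, n, m, 1))                      # one grey channel

h = tf.keras.layers.Conv2D(k1, 3, padding="same")(x)    # k1 filters, 3x3
print(h.shape)   # (1, 32, 32, 8)  -> k1 feature maps

y = tf.keras.layers.Conv2D(k2, 5, padding="same")(h)    # each 5x5 filter has k1 input channels
print(y.shape)   # (1, 32, 32, 16) -> k2 feature maps, not k1 * k2
```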
I would like to use a neural network for image classification. I'll start with the pre-trained CaffeNet and train it for my application. How should I prepare the input images? In this case, all the images are of the same object but with variations (think: quality control). They are at somewhat different scales/resolutions/distances/lighting conditions (and in many cases I don't know the scale). Also, in each image there is a known area around the object of interest that should be ignored …
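For context, a rough sketch of typical CaffeNet-style preprocessing (227x227 BGR input with per-channel mean subtraction); filling the known ignore-region with the channel means is one simple option I'm assuming here, not something CaffeNet itself requires:

```python
# Rough CaffeNet-style preprocessing sketch; the masking step is an assumption.
import numpy as np
from PIL import Image

IMAGENET_MEAN_BGR = np.array([104.0, 117.0, 123.0])  # commonly used means

def preprocess(path, ignore_mask=None):
    img = Image.open(path).convert("RGB").resize((227, 227))
    arr = np.array(img, dtype=np.float32)[:, :, ::-1]  # RGB -> BGR
    if ignore_mask is not None:
        # assuming a boolean 227x227 mask: fill the known irrelevant area
        # with the means so it becomes exactly zero after subtraction
        arr[ignore_mask] = IMAGENET_MEAN_BGR
    arr -= IMAGENET_MEAN_BGR                            # mean subtraction
    return arr.transpose(2, 0, 1)                       # HWC -> CHW for Caffe
```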
I am interested in any data, publications, etc. about the smallest neural network that can achieve a certain level of classification performance. By small I mean few parameters, not few arithmetic operations (i.e., fast). I am interested primarily in convolutional neural networks for vision applications, using something simple like CIFAR-10 without augmentation as the benchmark. Top-performing networks on CIFAR in recent years have had anywhere from 0.7 million to 100 million parameters (!!), so clearly small size is not …
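As a point of reference for what "few parameters" means: each convolutional layer contributes (kernel_h * kernel_w * in_channels + 1) * out_channels weights. A tiny CIFAR-10-shaped model, with illustrative sizes, for scale:

```python
# Parameter counting for a tiny CNN; sizes are illustrative only.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),          # CIFAR-10 sized input
    tf.keras.layers.Conv2D(16, 3, padding="same"),     # (3*3*3+1)*16 = 448
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, padding="same"),     # (3*3*16+1)*32 = 4640
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),                         # (32+1)*10 = 330
])
print(model.count_params())  # 5418 parameters total
```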
Are there any published papers that show differences between the regularization methods for neural networks, preferably across different domains (or at least different datasets)? I am asking because I currently have the feeling that most people use only dropout for regularization in computer vision. I would like to check whether there is a reason (not) to use other kinds of regularization.
As far as I understand it, the pooling layer doesn't learn anything. It has several parameters, most importantly its pool_size and stride (see the Lasagne documentation for more), but none of those is learned. Is it possible to learn these two parameters (or one of them)? Are there papers about it? (I would guess that it is not possible to add this to the objective function in a meaningful way … but I'd like to see if people have thought about it.)
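To illustrate the premise, and one common workaround: pooling itself has zero trainable parameters, while a strided convolution performs a learned downsampling (its stride remains a hyperparameter either way). A minimal sketch:

```python
# Pooling has no trainable weights; a strided conv learns its downsampling.
import tensorflow as tf

x = tf.keras.layers.Input(shape=(32, 32, 16))

pool = tf.keras.layers.MaxPooling2D(pool_size=2)(x)
conv = tf.keras.layers.Conv2D(16, 2, strides=2)(x)   # learned alternative

m = tf.keras.Model(x, [pool, conv])
for layer in m.layers[1:]:
    print(layer.name, layer.count_params())
# max_pooling2d 0   <- nothing to learn
# conv2d 1040       <- (2*2*16+1)*16 weights learned in place of pooling
```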
I have some basic features which I encoded in a one-hot vector. The length of the feature vector is 400, and it is sparse. I have seen conv nets applied to dense feature vectors. Is there any problem with applying conv nets to sparse feature vectors?
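Mechanically, nothing prevents a Conv1D from consuming a sparse one-hot vector; it simply convolves over mostly-zero input (an Embedding layer that consumes the index directly is a common alternative). A minimal sketch with illustrative sizes:

```python
# Conv1D over sparse length-400 one-hot vectors; sizes are illustrative.
import numpy as np
import tensorflow as tf

x = np.zeros((32, 400, 1), dtype="float32")   # batch of sparse one-hot vectors
hot = np.random.randint(0, 400, size=32)
x[np.arange(32), hot, 0] = 1.0

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(400, 1)),
    tf.keras.layers.Conv1D(8, 5, padding="same", activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
print(model(x).shape)  # (32, 1)
```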