I am working with weather data that has a few independent variables, such as severity, severity_id, urgency_id, etc. Based on these values, I would like to classify alerts into class 0 or 1. For example, below is a row from the data source:

Alert | Severity | Sev_Id | Urg_Id | Event   | Sys_Rec (Target Variable)
-------------------------------------------------------------------------
dummy | Extreme  | 1      | 1      | STORM   | 1
dummy | Minor    | 3      | 5      | RIPTIDE | …
I would like to reduce multiclass classification targets to binary classification targets. Ideally, this mapping would happen within scikit-learn, so that the same transformation applies during both training and prediction. I looked at the transforming-the-prediction-target (y) documentation but did not see anything that would work. Ideally, it would be a classifier version of TransformedTargetRegressor. Something like this mapping: targets_multi = {'A', 'B', 'C', 'D'} and targets_binary = {0: {'A', 'B'}, 1: {'C', 'D'}}.
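To my knowledge scikit-learn has no built-in classifier analogue of TransformedTargetRegressor, but a small wrapper can bake the mapping into fit so it applies identically at training and prediction time. This is a sketch, not scikit-learn API: the class name `TransformedTargetClassifier` and the flat `mapping` dict (label → binary class) are my own.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.linear_model import LogisticRegression

class TransformedTargetClassifier(BaseEstimator, ClassifierMixin):
    """Map a multiclass target to binary inside fit, so the same
    mapping is applied consistently for training and prediction."""

    def __init__(self, estimator, mapping):
        self.estimator = estimator
        self.mapping = mapping          # e.g. {'A': 0, 'B': 0, 'C': 1, 'D': 1}

    def fit(self, X, y):
        y_binary = np.array([self.mapping[label] for label in y])
        self.estimator_ = clone(self.estimator).fit(X, y_binary)
        return self

    def predict(self, X):
        return self.estimator_.predict(X)

# usage: collapse {'A', 'B'} to class 0 and {'C', 'D'} to class 1
clf = TransformedTargetClassifier(LogisticRegression(),
                                  {'A': 0, 'B': 0, 'C': 1, 'D': 1})
```

Since predictions are already in the binary space, there is no inverse transform to worry about, unlike the regressor case.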
Is there any way to at least read the text from the .dat file? I have its corresponding .mdf file, so I know what data and columns are in it. How do I figure out the contents of my .dat file? All I am getting currently is gibberish, even when opening it in binary mode.

from asammdf import MDF

dat_file = r"C:\Users\HPO2KOR\Desktop\Work\data1.dat"
mdf_file = r"C:\Users\HPO2KOR\Desktop\Work\data1.mdf"

mdf = MDF(mdf_file)
df = mdf.to_dataframe()
df.head()

which …
In Keras, I would like to train a network with binary weights in the manner of Courbariaux et al., but I cannot figure out where the quantization (binarization) should occur within the code. A core aspect of the training method is this: at the beginning of each batch during training, the stored real-valued (e.g., float32) weights are converted to binary values (either by rounding or in a stochastic/probabilistic manner) and stored separately from the real-valued weights. The binary-valued weights are used …
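The two conversion modes described above can be sketched framework-independently in numpy; the function names and the hard-sigmoid probability are assumptions based on the description (deterministic sign rounding vs. stochastic Bernoulli sampling), not Keras API:

```python
import numpy as np

def binarize_deterministic(w):
    # sign binarization: +1 where w >= 0, else -1
    return np.where(w >= 0, 1.0, -1.0)

def binarize_stochastic(w, rng):
    # hard sigmoid of w gives P(w_b = +1); sample one Bernoulli per weight
    p = np.clip((w + 1.0) / 2.0, 0.0, 1.0)
    return np.where(rng.random(w.shape) < p, 1.0, -1.0)

rng = np.random.default_rng(0)
# the real-valued weights stay in memory and are what the optimizer updates;
# the binarized copy is used only for the forward/backward pass of the batch
w_real = rng.normal(0.0, 0.5, size=(3, 3))
w_bin = binarize_deterministic(w_real)
```

In a Keras training loop this would run at the start of each batch: binarize, swap the binary copy into the layer for the forward and gradient computation, then apply the gradient update to the retained real-valued weights.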
I presently have two algorithms that produce a numerical output. Using a threshold of 0.9, I get the classification output. Let's say they are: P (high precision, low recall) and R (high recall, low precision). Individually, they have poor F1 scores. Is the naive way of creating a classifier C as C(*) = x·P(*) + (1-x)·R(*), and optimizing for x and the threshold, a good approach to improve the F1 score? Or is there some alternate approach I should try? Note: I …
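The naive approach above is cheap to try: since there are only two free parameters (x and the threshold), a plain grid search on a validation set is enough. A minimal sketch, assuming the two algorithms' outputs are already available as score arrays in [0, 1]:

```python
import numpy as np

def f1(y_true, y_pred):
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def best_blend(scores_p, scores_r, y_true):
    """Grid-search the mixing weight x and the decision threshold t
    that maximize F1 for C = x * P + (1 - x) * R."""
    best = (0.0, 0.0, 0.0)              # (f1, x, t)
    for x in np.linspace(0.0, 1.0, 21):
        blended = x * scores_p + (1 - x) * scores_r
        for t in np.linspace(0.0, 1.0, 21):
            score = f1(y_true, (blended >= t).astype(int))
            if score > best[0]:
                best = (score, x, t)
    return best
```

Tuning both x and t on held-out data (not the training set) matters here, since F1 is sensitive to the threshold and a two-parameter search can easily overfit a small sample.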
As part of a group project at university, we are given a series of videos of cell cultures over a 24-hour period. A number of these cells (the "knockout" cells) have had a particular gene removed, one which is often absent or mutated in malignancy. We are using a blob-detection algorithm to identify the cell centers and radii, and further processing to match cells frame-to-frame and build up individual paths, which we then use to calculate various features. We …
I am training a binary classifier on a dataset using AUC as the score. The dataset has two main groups (we will refer to them as the good and the bad population). A property of this dataset is that the proportion of target = 1 is higher in the bad population. For this reason, even a fairly dumb classifier would give higher scores to the bad population and lower scores to the good population. In fact, the AUC of the classifier could be …
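The effect described can be made concrete with a small simulation: a score that encodes nothing but group membership gets a pooled AUC well above 0.5, while the AUC within each group is exactly 0.5. The prevalences (60% vs. 10%) are made-up numbers for illustration:

```python
import numpy as np

def auc(y, scores):
    # Mann-Whitney formulation of ROC AUC (ties count 0.5)
    pos, neg = scores[y == 1], scores[y == 0]
    greater = np.sum(pos[:, None] > neg[None, :])
    ties = np.sum(pos[:, None] == neg[None, :])
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

rng = np.random.default_rng(1)
# bad population: 60% positives; good population: 10% positives
y_bad = (rng.random(500) < 0.6).astype(int)
y_good = (rng.random(500) < 0.1).astype(int)
# a "dummy" score that only encodes group membership
s_bad, s_good = np.ones(500), np.zeros(500)

pooled = auc(np.concatenate([y_bad, y_good]),
             np.concatenate([s_bad, s_good]))
within_bad = auc(y_bad, s_bad)   # exactly 0.5: no signal inside the group
```

Reporting the AUC per group alongside the pooled AUC is one way to check whether a model has learned anything beyond the group effect.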
Let's say I have a dataset consisting of 100 features and a binary target variable. On exploring the data, I see that Feature 10 which is binary, seems to split the data in an 80:20 ratio with respect to the target variable. Does it then make sense to partition the data into "Feature 10 is 0" and "Feature 10 is 1" and try to build separate classifiers for both cases? Or is this what models like Random Forests do under …
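The manual-partition idea can be sketched directly: train one model per value of the binary feature and route predictions accordingly. The data below is synthetic and the target's relationship to the other features deliberately differs between the two partitions, which is the situation where this helps; a single tree-based model can learn the same partition by splitting on that feature first.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
X_other = rng.random((n, 5))
f10 = rng.integers(0, 2, n)                 # the binary "Feature 10"
X = np.column_stack([X_other, f10])
# hypothetical target: which feature matters depends on Feature 10
y = ((X[:, 0] > 0.5) & (f10 == 1) | (X[:, 1] > 0.5) & (f10 == 0)).astype(int)

# manual partition: one model per value of Feature 10
models = {v: LogisticRegression().fit(X[f10 == v], y[f10 == v])
          for v in (0, 1)}

def predict(X_new):
    flag = X_new[:, -1].astype(int)
    out = np.empty(len(X_new), dtype=int)
    for v in (0, 1):
        mask = flag == v
        if mask.any():
            out[mask] = models[v].predict(X_new[mask])
    return out
```

The trade-off is that each sub-model sees less data; whether that cost outweighs the benefit of the split is an empirical question best settled by cross-validation.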
I have time-series data where the dependent variable is binary, either 0 or 1. The value 0 means failure; it is rare, and I want to see whether I can get close to estimating the times when it will happen in the future. A naive, completely unoptimized application of Facebook Prophet yielded this (the handful of 0 values are at the bottom of the graph; most values are 1): Before I continue tweaking the model, are there any reasons …
I understand that gradient descent is local and deals only with the inputs to the neuron, what it outputs, and what it should output. In everything I've seen, gradient descent needs the activation function to be differentiable, so a threshold function cannot be used. Yet biological neurons either fire or they don't. The input to the neuron, in my understanding, is the equivalent of the membrane potential: once it passes a certain threshold, the neuron fires (one or …
I'm trying to create a layer in TensorFlow which works something like this: And my implementation looks something like this:

class BinaryLayer(Layer):
    def __init__(self):
        super(BinaryLayer, self).__init__()

    def build(self, input_shape):
        w_init = 0.5
        self.w = tf.Variable(name="kernel", initial_value=w_init(dtype='float32'), trainable=True)

    def call(self, inputs):
        return tf.math.greater(inputs, self.w)

But it gives me an error saying 'float' object is not callable. And I also think there will be another problem in the future: it will return boolean values, such as [[T F T] [T T F] [F F T]], but I …
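Both issues have a direct fix, sketched below under the assumption of TensorFlow 2.x: `w_init` is a plain Python float, so calling it as `w_init(dtype='float32')` raises `'float' object is not callable`; the value can be handed straight to `tf.Variable` (or wrapped in `tf.keras.initializers.Constant`). The boolean output can be cast back to floats:

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer

class BinaryLayer(Layer):
    def build(self, input_shape):
        # a plain float is not an initializer and cannot be called;
        # hand the value straight to tf.Variable instead
        self.w = tf.Variable(0.5, name="kernel", trainable=True)

    def call(self, inputs):
        # cast True/False to 1.0/0.0 so downstream layers receive floats
        return tf.cast(tf.math.greater(inputs, self.w), tf.float32)

layer = BinaryLayer()
out = layer(tf.constant([[0.2, 0.7, 0.4]]))  # -> [[0.0, 1.0, 0.0]]
```

Note that `tf.math.greater` has no useful gradient, so nothing upstream of this layer will train through it as written; a straight-through estimator (e.g. identity gradient for the comparison) is the usual workaround.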
I have a dataframe in R and am trying to determine the phi correlation coefficient between two binary (dichotomous: 0 or 1) variables, each one in a column (column1 and column2). I have installed the "psych" package and ran the code below, but it is not working:

x = matrix(c(dataframe$column1, dataframe$column2), ncol=2)
phi(x)

Any suggestions for the code?
We have a very imbalanced dataset (2% of class 1). To the best of our knowledge, there is no baseline in the literature for the problem we want to solve, so we thought of comparing our performance to a random classifier. We evaluate our model with a combination of precision and recall: we vary the threshold at which data points are classified as 1 and compute the resulting precision and recall. We could use the F1-score as well. What …
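For a random classifier the baseline has a closed form worth keeping in mind: at any threshold its expected precision equals the class prevalence (2% here) and its expected recall equals the fraction of points it flags. A quick simulation with assumed uniform random scores confirms this:

```python
import numpy as np

rng = np.random.default_rng(0)
n, prevalence = 100_000, 0.02
y = (rng.random(n) < prevalence).astype(int)
scores = rng.random(n)            # random classifier: scores carry no signal

threshold = 0.5
pred = (scores >= threshold).astype(int)
precision = y[pred == 1].mean()   # ≈ prevalence (0.02), at any threshold
recall = pred[y == 1].mean()      # ≈ 1 - threshold (0.5 here)
```

So the random-baseline precision-recall curve is a horizontal line at precision = 0.02, and the corresponding baseline F1 at recall r is 2·0.02·r / (0.02 + r).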
I would like to run quantization-aware training with a Keras model that has an LSTM layer. However, the LSTM layer itself seems not to be supported. Alan Chiao seems to suggest here that it is possible to use a `QuantizeConfig` to allow this layer to be used. However, I can't seem to write the right `QuantizeConfig` that allows the LSTM layer to be used. I attach below both the code for my model and the Quantize …
I am playing with a dimensionality-reduction step prior to clustering for a pretty large, sparse, binary matrix of almost 3000 columns and 50,000 rows. My idea is to embed the 3000 dimensions into a two-dimensional space with UMAP and then cluster the resulting 50,000 two-dimensional points with HDBSCAN. I've found that UMAP accepts a number of options, such as the metric, n_neighbors, min_dist and spread, but I cannot figure out what would be the best combination to give me distinct …
I have a binary dependent variable $t$ and categorical features. We can even simplify to binary features, since I can one-hot encode the categorical variables. In practice the one-hot encoding induces collinearity in the binary features, so for simplicity let's pretend we only have $D$ binary features. The purpose is to estimate the probability of $t=1$. In principle, I can use logistic regression. But, given the categorical nature of the input data, the features actually define a table of $2^D$ cells. …
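The table view can be made explicit: with $D$ binary features there are at most $2^D$ distinct input patterns, and the "saturated" alternative to logistic regression is simply the empirical frequency of $t=1$ in each cell. A sketch on synthetic data (the per-cell probabilities below are invented for illustration):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
D, n = 3, 10_000
X = rng.integers(0, 2, size=(n, D))

# hypothetical ground-truth P(t=1 | cell), one value per 2^D cell
true_p = dict(zip(product((0, 1), repeat=D),
                  (0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9)))
t = np.array([rng.random() < true_p[tuple(row)] for row in X], dtype=int)

# "saturated" model: an empirical estimate of P(t=1) for each cell
cells = {}
for row, target in zip(X, t):
    cells.setdefault(tuple(row), []).append(target)
estimates = {cell: float(np.mean(ts)) for cell, ts in cells.items()}
```

The trade-off is sample size per cell: with $n$ points spread over $2^D$ cells the per-cell estimates degrade quickly as $D$ grows, which is exactly where logistic regression's parameter sharing (one weight per feature rather than one per cell) pays off.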
I apologise if this is a bit long-winded, but it was suggested by another user that I post. I will start by saying that I am very new to the world of machine learning and deep learning. As such, the most important thing I am after is an understanding of what I am doing. I am trying to build an ANN for binary classification. I have a binary feature matrix of the form N x D, where N …
This may be a stupid question, but I am new to deep learning (and machine learning, for that matter) and I can't seem to find any literature to help with my question. All I can see when Googling many different queries (changing keywords to try to get a hit on my question) is about binary classification, and binary classification where the feature matrix consists of real numbers. I would like to know: is it possible to build a binary …