I have a sparse matrix of count data that I'm using as input to a neural network. I know that input data should usually be normalized (e.g. via min-max scaling, $z$-score standardization, etc.). But for features that are counts, what is a good approach? Should I $\log_2(x+1)$ transform the data and then do a $z$-score standardization? Is there a better approach?
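For concreteness, here is a minimal sketch of the transform I have in mind, assuming a SciPy sparse matrix of counts (the matrix here is randomly generated just for illustration):

```python
import numpy as np
from scipy import sparse

# Illustrative sparse count matrix (random; stands in for the real data).
X = sparse.random(100, 50, density=0.1, format="csr", random_state=0)
X.data = np.round(X.data * 100)

# log2(x + 1): applying it to .data only is safe because log2(0 + 1) = 0,
# so sparsity is preserved.
X_log = X.copy()
X_log.data = np.log2(X_log.data + 1)

# z-score standardization per feature. Note that mean-centering destroys
# sparsity; for very large matrices, scaling without centering
# (e.g. sklearn's StandardScaler(with_mean=False)) may be preferable.
X_dense = X_log.toarray()
X_std = (X_dense - X_dense.mean(axis=0)) / (X_dense.std(axis=0) + 1e-8)
```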
I am still trying to define this question precisely, so please give feedback and I will edit my question accordingly. I have $M$ alternative models that need to be compared. The only measure that needs to be taken into account is a positive value $n$ indicating how many independent sub-items (each independent and with the same weight in the total value) are supported by the data. The total number of items $N$ is not fixed, but it needs to be taken …
I want to test experimentally the efficiency improvement of a clustering algorithm when "statistical preprocessing" is applied, i.e., when the statistical frequency (counts) of similar/identical records is included in the dataframe. According to this paper: "Statistical preprocessing is mainly used to get the frequency of samples having the same features, which are then used as inputs of the DBSCAN algorithm to improve the efficiency of DBSCAN clustering. Statistical preprocessing counts repeated samples with the same features in the URL parameter and uses the statistics …"
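As a starting point, here is my rough sketch of what I understand the preprocessing to be, using scikit-learn's DBSCAN with its sample_weight parameter (the dataframe and the column names f1/f2 are placeholders):

```python
import pandas as pd
from sklearn.cluster import DBSCAN

# Placeholder data standing in for the real records.
df = pd.DataFrame({"f1": [1, 1, 2, 2, 2, 9],
                   "f2": [0, 0, 3, 3, 3, 9]})

# Statistical preprocessing: collapse duplicate rows and record their
# frequency, so DBSCAN only runs on the unique samples.
counts = df.groupby(list(df.columns)).size().reset_index(name="count")

# Feed the frequencies in as sample weights: a sample that occurs k times
# contributes weight k toward DBSCAN's min_samples density threshold.
X = counts[["f1", "f2"]].to_numpy()
labels = DBSCAN(eps=1.5, min_samples=3).fit_predict(
    X, sample_weight=counts["count"].to_numpy())
print(labels)
```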
I am trying to group my data by the 'ID' column, then count the frequency of 'Sequence' for each 'ID'. Here is a sample of the data frame:

ID   Sequence
101  1-2
101  3-1
101  1-2
102  4-6
102  7-8
102  4-6
102  4-6
103  1118-69
104  1-2
104  1-2

I am looking for counts like:

ID   Sequence  Count
101  1-2       2
     3-1       1
102  4-6       3
     7-8       1
103  1118-69   1
104  1-2       2

…
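For reference, a minimal pandas sketch of the grouping I am describing, assuming the sample above is in a DataFrame df:

```python
import pandas as pd

# Sample data as shown above.
df = pd.DataFrame({'ID': [101, 101, 101, 102, 102, 102, 102, 103, 104, 104],
                   'Sequence': ['1-2', '3-1', '1-2', '4-6', '7-8', '4-6',
                                '4-6', '1118-69', '1-2', '1-2']})

# Count each Sequence within each ID.
counts = df.groupby(['ID', 'Sequence']).size().reset_index(name='Count')
print(counts)
```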
For processes of discrete events occurring in continuous time with a time-independent rate, we can use count models like Poisson or Negative Binomial. For discrete events that can occur once per sample in continuous time, with a time-dependent rate, we have survival models like Cox Proportional Hazards. What can we use for discrete event data in continuous time where there is an explicit time-dependence that we want to learn? I understand that sometimes people use sequential models where each node is …
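To make this concrete, I believe (though this framing is my own assumption) that what I am describing is close to an inhomogeneous Poisson process with time-varying intensity $\lambda(t)$, whose log-likelihood for event times $t_1, \dots, t_n$ observed over $[0, T]$ is

$$\log L = \sum_{i=1}^{n} \log \lambda(t_i) - \int_0^T \lambda(t)\,dt,$$

so "learning the time-dependence" would amount to learning $\lambda(t)$, possibly as a function of covariates.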
Given three sets of data with a categorical integer x-axis over the same range (0-10):

from itertools import chain
from collections import Counter, defaultdict
from IPython.display import Image
import pandas as pd
import numpy as np
import seaborn as sns
import colorlover as cl
import matplotlib.pyplot as plt

data1 = Counter({8: 10576, 9: 10114, 7: 9504, 6: 7331, 10: 6845, 5: 5007, 4: 3037, 3: 1792, 2: 908, 1: 368, 0: 158})
data2 = Counter({5: 9030, 6: 8347, 4: 8149, 7: …
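One way I have considered comparing the counters is to reshape them into a long-form DataFrame and draw a grouped bar plot (data abbreviated here for illustration; I'm not sure this is the idiomatic approach):

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from collections import Counter

# Abbreviated stand-ins for the full counters above.
data1 = Counter({8: 10576, 9: 10114, 7: 9504})
data2 = Counter({5: 9030, 6: 8347, 4: 8149})

# Long-form table: one row per (x value, dataset) pair.
long_df = pd.concat(
    pd.DataFrame({"x": list(c.keys()),
                  "count": list(c.values()),
                  "dataset": name})
    for name, c in [("data1", data1), ("data2", data2)]
)

# Grouped bar chart: one colored bar per dataset at each x value.
sns.barplot(data=long_df, x="x", y="count", hue="dataset")
plt.show()
```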
I'm currently starting out in R and wondering how to count the number of observations per day, per node, per replicate in the dataset below, storing the result in a separate data set. The original dataset looks like this:

[screenshot of the original dataset]

I would like the resulting dataset to look like this:

[screenshot of the desired dataset]

Can someone help me figure out how to do this in R? Thanks
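To clarify the logic I'm after, here is the equivalent sketched in pandas (with assumed column names day, node, and replicate, since my real column names may differ); I'm looking for the R translation of this:

```python
import pandas as pd

# Placeholder data; 'day', 'node', and 'replicate' are assumed column names.
df = pd.DataFrame({'day': ['d1', 'd1', 'd1', 'd2'],
                   'node': [1, 1, 2, 1],
                   'replicate': ['a', 'a', 'a', 'b']})

# Count observations per (day, node, replicate) combination.
out = df.groupby(['day', 'node', 'replicate']).size().reset_index(name='n')
print(out)
```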
I have a stack of ATM cards and I want to count the number of cards in the stack. How should I proceed? I'm using Python 3.6.0 and OpenCV (cv2). I'm attaching a PNG file of the images. Kindly provide help in this direction.
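For what it's worth, here is a rough sketch of one approach I am considering, not a definitive solution: in a side-on photo of the stack, each boundary between cards should show up as a strong horizontal edge band, so counting those bands estimates the card count. The filename and thresholds below are placeholders.

```python
import cv2
import numpy as np

# Load the stack photo in grayscale; "card_stack.png" is a placeholder name.
img = cv2.imread("card_stack.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 50, 150)

# Row-wise edge strength; each card boundary produces a band of high values.
profile = edges.sum(axis=1).astype(float)
mask = profile > 0.5 * profile.max()

# Count contiguous runs of high-edge rows (each run ~ one card boundary;
# depending on the image, the card count may be runs or runs - 1).
runs = int(np.count_nonzero(mask[1:] & ~mask[:-1])) + int(mask[0])
print("estimated number of cards:", runs)
```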
Question: Is a Poisson model the best method for predicting counts among multiple levels within a nominal variable?

Details: Imagine data with 7000 observations, where the output is Obs.Count {numeric, 0, 1, 2, ..., 8} and the feature is location {factor, 13 levels}. When conducting Poisson regression, the output returns:

## function for glm
# p1 <- glm(Count ~ Loc, family = poisson, data = dat)

Call:
glm(formula = Count ~ Loc, family = "poisson", data = dat)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.49116  -1.32852   0.00775   1.02579   1.55985

Coefficients:
            Estimate Std. …
I am evaluating whether governance predictor variables are associated with the prevalence of groundwater fecal contamination in a developing-country context, as measured by TTC (thermotolerant coliform) counts per 100 mL of water. In my data TTC is distributed non-normally: there are many zeroes, and also many water sources with TTC of 125+ (our test kits could not measure TTC above this threshold). I ran countfit on TTC and various predictors, and it appeared to indicate that zero-inflated negative binomial regression (ZINB) was the appropriate regression …
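For illustration, here is a hedged sketch of fitting a ZINB model with Python's statsmodels on simulated data; all variable names are placeholders, and this ignores the censoring at 125 noted above:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

# Simulated stand-ins for the real data: one governance predictor and
# an overdispersed count outcome with extra zeros.
rng = np.random.default_rng(0)
governance = rng.normal(size=200)
ttc = rng.negative_binomial(1, 0.05, size=200)
ttc[rng.random(200) < 0.4] = 0  # inject excess zeros

# ZINB with the same covariate in both the count and inflation parts.
X = sm.add_constant(governance)
model = ZeroInflatedNegativeBinomialP(ttc, X, exog_infl=X, p=2)
result = model.fit(method="bfgs", maxiter=500, disp=False)
print(result.summary())
```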
I have an ngram-based language model that produces a long tag list for a given sentence. For example, the previous sentence, broken into bigrams and run through the model, might produce something like: {I have}=>C1, {have an}=>C2, {an ngram}=>C1, {ngram based}=>C3, etc., resulting in the counts C1=2, C2=1, C3=1 (for the segment shown above). It is easy enough to pick the winner by sorting either the raw counts or the percentages (which would control for sentence length). But I want a …
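A toy sketch of the counting step I described (tag_model here is a stand-in for my actual ngram model):

```python
from collections import Counter

# Placeholder for the real ngram-based model: maps a bigram to a tag.
def tag_model(bigram):
    tags = {"I have": "C1", "have an": "C2",
            "an ngram": "C1", "ngram based": "C3"}
    return tags.get(bigram, "C1")

words = "I have an ngram based".split()
bigrams = [" ".join(words[i:i + 2]) for i in range(len(words) - 1)]

# Tally tags: gives C1=2, C2=1, C3=1 for this segment.
counts = Counter(tag_model(b) for b in bigrams)

# Percentages control for sentence length.
total = sum(counts.values())
percentages = {tag: n / total for tag, n in counts.items()}
print(counts, percentages)
```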
How do I get from a dataframe with multiple columns that have similar values and need to be merged:

import pandas as pd

df1 = pd.DataFrame({'firstcolumn': ['ab', 'ca', 'da', 'ta', 'la'],
                    'secondcolumn': ['ab', 'ca', 'ta', 'da', 'sa'],
                    'index': [2011, 2012, 2011, 2012, 2012]})

to a crosstab that tells me, for each year, how many values were collected?

index  ab  ca  da  ta  sa  la
2011    2   0   1   1   0   0
2012    0   2   1   1   1   1

Also, how could I then plot the table?
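One way I imagine this could work (not sure it is idiomatic) is to melt both value columns into one and then crosstab against the year:

```python
import pandas as pd
import matplotlib.pyplot as plt

df1 = pd.DataFrame({'firstcolumn': ['ab', 'ca', 'da', 'ta', 'la'],
                    'secondcolumn': ['ab', 'ca', 'ta', 'da', 'sa'],
                    'index': [2011, 2012, 2011, 2012, 2012]})

# Merge both value columns into a single long-form column.
long = df1.melt(id_vars='index', value_vars=['firstcolumn', 'secondcolumn'])

# Count occurrences of each value per year.
table = pd.crosstab(long['index'], long['value'])
print(table)

# Plot the crosstab as a grouped bar chart, one group per year.
table.plot(kind='bar')
plt.show()
```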