exploratory-factor-analysis

Performing EDA on a dataset with missing features

user135735

May 15, 2022 05:32

I'm new to DS. I want to perform EDA on a dataset where the missing-value counts for my train and test sets are as follows. train: Test_0 0, Test_1 31, Test_2 0, Test_3 141, Test_4 0, Test_5 0, Test_6 0, Test_7 0, Test_8 1045, Test_9 0, Test_10 0, Test_11 0, Test_12 0, Test_13 0, Test_14 0, Test_15 2967, Class 0 (dtype: int64). test: Test_0 0, Test_1 7, Test_2 0, Test_3 46, Test_4 0, Test_5 0, Test_6 0, Test_7 0, Test_8 …

Topic: exploratory-factor-analysis visualization data-cleaning

Category: Data Science
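For the question above, a per-column missing-value table like the one quoted can be produced and extended with percentages in a few lines. This is only a minimal sketch: the toy frame, its values, and the helper name `missing_summary` are assumptions made for illustration, not the asker's data or code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; column names echo the question, values are made up.
train = pd.DataFrame({
    "Test_0": [1.0, 2.0, 3.0, 4.0],
    "Test_1": [0.5, np.nan, 0.7, 0.1],
    "Test_8": [np.nan, np.nan, 3.0, np.nan],
    "Class": [0, 1, 0, 1],
})

def missing_summary(df):
    """Return per-column missing counts and percentages, most-missing first."""
    counts = df.isna().sum()
    summary = pd.DataFrame({
        "n_missing": counts,
        "pct_missing": 100 * counts / len(df),
    })
    return summary.sort_values("n_missing", ascending=False)

summary = missing_summary(train)
print(summary)
```

Running the same helper on the test set makes it easy to compare whether the same columns are affected in both splits.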

Filling NaN values

Mohammed Atif Ali

April 20, 2022 13:00

As far as I know, before filling NaN values we have to check whether the data is missing because of MCAR, MAR or MNAR. This depends on how the features are correlated with each other, and only then can we decide which method to apply. So my question is: is it good practice to check the dependency between features with a chi-square independence test? If not, please suggest which techniques to use to fill NaN values. I will be …

Topic: chi-square-test exploratory-factor-analysis missing-data correlation statistics

Category: Data Science
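One way to apply a chi-square independence test to the question above is to test a missingness indicator against an observed categorical feature. The sketch below uses simulated data; the column names `group` and `income` and the missingness rates are assumptions. A significant association is evidence against MCAR, but this test cannot tell MAR from MNAR, since MNAR depends on the unobserved values themselves.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
n = 400

# Simulated data: "income" is missing more often in group "b",
# so missingness depends on an observed column.
group = rng.choice(["a", "b"], size=n)
p_missing = np.where(group == "b", 0.4, 0.05)
income = rng.normal(50, 10, size=n)
income[rng.random(n) < p_missing] = np.nan
df = pd.DataFrame({"group": group, "income": income})

# Contingency table: is "income" missing (rows) versus group (columns).
table = pd.crosstab(df["income"].isna(), df["group"])
chi2, p_value, dof, expected = chi2_contingency(table)

print(table)
print(f"chi2={chi2:.2f}, dof={dof}, p={p_value:.4g}")
```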

Industry analysis - multiple industries

Jtrying

April 1, 2022 16:56

I am trying to run logistic regression on marketing leads and use industry as a predictor of whether the lead converts (1/0). Often, when I enrich data from websites like Crunchbase, the associated company for a lead has multiple industry associations. I am looking for help in finding a method, as well as an R script, to separate these values and ultimately identify the industries that are the best predictors of conversion. If anyone can offer some guidance here it …

Topic: exploratory-factor-analysis logistic-regression data-cleaning

Category: Data Science
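The asker wants an R script; the sketch below shows the same idea in Python with pandas instead, so treat it as an illustration of the method rather than the requested script. It assumes the industries arrive as one delimited string per lead. The `;` separator, the column names, and the toy rows are all assumptions.

```python
import pandas as pd

# Hypothetical leads; each company may list several industries.
leads = pd.DataFrame({
    "industry": ["SaaS;Fintech", "Fintech", "Healthcare;SaaS", "Retail"],
    "converted": [1, 0, 1, 0],
})

# One 0/1 indicator column per industry (multi-label one-hot encoding).
indicators = leads["industry"].str.get_dummies(sep=";")

# Conversion rate among leads tagged with each industry.
rate = indicators.mul(leads["converted"], axis=0).sum() / indicators.sum()

print(indicators)
print(rate.sort_values(ascending=False))
```

The indicator columns can then be passed as predictors to a logistic regression, where each coefficient reflects one industry's association with conversion.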

How to predict strategy based on given data using Machine Learning?

Shaelander Chauhan

March 26, 2022 17:47

My basic goal is to predict a strategy based on given data, for instance: a) predict which formation in a football match will maximize my winning rate; b) predict which product combination will maximize my sales rate in a grocery store. How does one deal with such problems in machine learning? What approach is used for them?

Topic: exploratory-factor-analysis machine-learning

Category: Data Science

Does the sign of correlation matter in feature selection?

Tfovid

February 10, 2022 10:52

If I understand correctly, the correlation between features and the target can be used to quantify whether those features are relevant to keep, hence the ritual of plotting the correlation matrix as a key step in data exploration. However, does the sign of the correlation matter when it comes to feature selection? Isn't the only thing that matters the strength of the correlation (or anti-correlation)?

Topic: exploratory-factor-analysis feature-engineering correlation feature-selection

Category: Data Science
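The point raised in the question above can be checked on simulated data: a feature with a strong negative correlation is as informative to a linear model as one with an equally strong positive correlation, so a filter-style ranking would normally use the absolute value. The feature names and coefficients below are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 200

# Simulated features: one positively related, one negatively related, one irrelevant.
x_pos = rng.normal(size=n)
x_neg = rng.normal(size=n)
x_noise = rng.normal(size=n)
y = 2.0 * x_pos - 2.0 * x_neg + rng.normal(scale=0.5, size=n)

df = pd.DataFrame({"x_pos": x_pos, "x_neg": x_neg, "x_noise": x_noise, "y": y})

# Signed correlation of each feature with the target.
corr = df.corr()["y"].drop("y")

# Rank by strength only; the sign says which direction, not how relevant.
ranking = corr.abs().sort_values(ascending=False)

print(corr)
print(ranking)
```

Note that Pearson correlation only captures linear association, so a low absolute value does not by itself prove a feature is irrelevant.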

Practical Interpretation of PCAs for a supplier analysis

Zilfalon

January 14, 2022 14:31

I am using PCA to validate and research a set of 13 suppliers of products against a set of about 50 variables and performance indicators, compared with an ideal "wish" supplier, mostly based on G. Jankers Book on Factor Analysis for a Supplier Rating System. In RStudio I use my data to perform the PCA with prcomp. My question is about what practical statements can be made from the outcomes of the PCA and its factors. My goal is to identify the performance indicators, …

Topic: exploratory-factor-analysis interpretation pca

Category: Data Science

When would you use feature optimization method instead of exploratory analysis to identify best features?

PlatinumMaths

December 23, 2021 21:16

I have a dataset with around 70 features. I'm currently just plotting graphs and trying to identify key information. I also wish to later do a predictive model. What would be the best way to get the best features? Would it be wise to go through every column and try and spot trends and correlation? Or would it be sensible to just use a wrapper method or genetic algorithm search? Or just do a random forest classifier on the whole …

Topic: exploratory-factor-analysis feature-engineering

Category: Data Science

Factor Analysis vs PCA

Zexxxx

December 13, 2021 06:56

Could someone please explain when FA is used and when PCA is used? As I understand it, FA performs dimensionality reduction, but the main goal of PCA is the same. Which one should I use, and in which cases?

Topic: exploratory-factor-analysis pca dimensionality-reduction

Category: Data Science

Creating sub categories

ben121

July 29, 2021 09:25

I have data we have collected quarterly over the last two years from two organisations. They are collected via the use of 29 questions. For each organisation, there are about 500 answers per question. The number which is produced for each quarter, question and organisation is an average score (1-10). Example of 5 questions is below: The issue I am trying to solve is the second column. We use these tags to create a sub category or score. However, having …

Topic: exploratory-factor-analysis feature-selection

Category: Data Science

Determine which factor is responsible for a change in a top-line business metric?

Ben

July 21, 2021 12:47

Are there any techniques for determining which factor(s) is (are) responsible for a change in a top-line business metric? E.g., revenue drops - but was it because of a drop in global visitors, or perhaps a drop in conversion rate at the London store, or maybe there were heavy discounts on the weekend, etc. So far I've explored Value Driver Analysis, Sensitivity Analysis, Root Cause Analysis, Factor Analysis, but I'm not sure if they're useful. Example I have $n$ retail …

Topic: exploratory-factor-analysis

Category: Data Science

Why are correlation matrices used versus a matrix of R^2 values?

Donovin

June 29, 2021 17:46

I'm relatively new to DS, so forgive me if this is a dumb question or in the wrong forum. When evaluating features, it seems that almost everywhere a correlation matrix is used [df.corr(), cor(df, method="pearson")]. The way I understand it, a correlation matrix describes the strength and directionality of the linear relationship (strong negative through strong positive) between each feature/predictor and all others. However, if $R^2$ indicates the amount of variability explained by the linear relationship, between each …

Topic: exploratory-factor-analysis feature-engineering model-selection correlation feature-selection

Category: Data Science
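For the question above, it may help to see that for a simple linear regression with one predictor and an intercept, $R^2$ equals the square of the Pearson correlation $r$. A matrix of $R^2$ values would therefore carry the same strength information as the correlation matrix but lose the sign. A small numerical check on simulated data (the slope and noise level are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated negatively related pair.
x = rng.normal(size=100)
y = -1.5 * x + rng.normal(size=100)

# Pearson correlation coefficient.
r = np.corrcoef(x, y)[0, 1]

# R^2 from an ordinary least-squares line fit: 1 - SS_res / SS_tot.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)
r_squared = 1 - residuals.var() / y.var()

print(f"r = {r:.4f}, r^2 = {r**2:.4f}, R^2 = {r_squared:.4f}")
```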

Why is Regularization after PCA or Factor Analysis a bad idea?

Poo

August 12, 2020 12:26

I have done Factor Analysis on my data and applied various machine learning models to the result. I find that Ridge and Lasso Regression in particular give a high MSE compared to the other models. I want to know why this happens.

Topic: exploratory-factor-analysis ridge-regression pca regularization machine-learning

Category: Data Science

What conclusion can I get when the variable is influenced by other but there isn't any correlation?

Tlaloc-ES

August 2, 2020 17:47

I am doing an exploratory analysis. The target is a continuous variable and the attributes are all categorical (discrete values). To find out whether each attribute has any influence on the target, I am running an ANOVA test like this: fvalue, pvalue = stats.f_oneway(df[y], df[x]) pvalue < 0.5 If that condition is true, there is a dependency between the variables. For all variables I get a true dependency with ANOVA, but the values of the correlation are between …

Topic: exploratory-factor-analysis anova correlation statistics

Category: Data Science
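In the question above, `stats.f_oneway` is called with the target column and the attribute column as if they were two samples. `scipy.stats.f_oneway` expects one sample of target values per group, so a more usual setup splits the target by the levels of the categorical attribute. The sketch below shows that form on simulated data; the column names `x` and `y`, the group means, and the 0.05 threshold are assumptions (the excerpt uses 0.5).

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(3)

# Simulated data: categorical attribute "x", continuous target "y".
df = pd.DataFrame({
    "x": np.repeat(["low", "mid", "high"], 50),
    "y": np.concatenate([
        rng.normal(0.0, 1.0, 50),
        rng.normal(0.5, 1.0, 50),
        rng.normal(3.0, 1.0, 50),
    ]),
})

# One array of target values per level of the categorical attribute.
groups = [values.to_numpy() for _, values in df.groupby("x")["y"]]

# One-way ANOVA: do the group means differ?
fvalue, pvalue = stats.f_oneway(*groups)

alpha = 0.05  # conventional significance level, an assumption here
print(f"F = {fvalue:.2f}, p = {pvalue:.3g}, reject H0: {pvalue < alpha}")
```

A significant F test together with a low Pearson correlation is not contradictory: ANOVA detects any difference in group means, while correlation on integer-coded categories only measures a linear trend across an arbitrary ordering.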

Factor Analysis with Mixed Data Concurrent Approach with PCAmixdata in R

Poo

June 29, 2020 15:12

I am trying to perform Factor Analysis on mixed data using R with the PCAmixdata package. My dataset is large, with almost 115000 records and almost 40 features, both categorical and continuous. When I try to run PCAmixdata, I get a memory error saying that the total memory allocation has been reached, and I am not able to proceed. I wanted to know whether it is valid to split the dataset row-wise, for example 30000 records at a time, and combine the …

Topic: exploratory-factor-analysis pca dimensionality-reduction r

Category: Data Science

SEM (Structural Equation Modelling) with Exploratory Factor Analysis

Vineet Agarwal

January 23, 2020 01:24

Problem Statement: I need to do some Structural Equation Modelling at work to get the main factors in a marketing survey dataset. There are no assumed equations to perform SEM on, so what would be the best exploratory way to create those equations from the data? All the variables are very highly correlated with one another, so how can we deal with that? Please let me know if I can help you with any supplements.

Topic: exploratory-factor-analysis structural-equation-modelling data

Category: Data Science
