How to fix my CSV files? (ValueError: Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 1 is required)
I imported two CSV files into df1 and df2 and concatenated them into one DataFrame (df3, called data in the code below). When I call mutual_info_regression on it, I get a value error: ValueError: Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 1 is required. I have checked the dimensions of X, y, and discrete_features, and they all seem okay.
Since the same code works with other CSV files (I have tested this), I think the problem is with my CSV files and not the code.
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_regression

# Load both files and label each row with its source
df1 = pd.read_csv("WT_MDE.csv", index_col=0)
df1["Interact"] = 1
df2 = pd.read_csv("M_MDE.csv", index_col=0)
df2["Interact"] = 0

# Stack the two frames and split off the target column
data = pd.concat([df1, df2])
X = data.copy()
y = X.pop("Interact")

# Boolean mask: columns with float dtype are flagged as discrete
discrete_features = X.dtypes == float

def make_mi_scores(X, y, discrete_features):
    mi_scores = mutual_info_regression(X, y, discrete_features=discrete_features)
    mi_scores = pd.Series(mi_scores, name="MI Scores", index=X.columns)
    mi_scores = mi_scores.sort_values(ascending=False)
    return mi_scores

mi_scores = make_mi_scores(X, y, discrete_features)
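For reference, the dimension check I mentioned can be sketched roughly as follows. This is only an illustration: the two small DataFrames are made-up stand-ins for my CSV files (the columns a and b are hypothetical), and summarize_inputs is a helper name I introduced for this sketch, not part of pandas or scikit-learn.

```python
import pandas as pd

# Made-up stand-ins for the two CSV files (hypothetical columns "a" and "b").
df1 = pd.DataFrame({"a": [0.1, 0.2, 0.3], "b": [1.0, 2.0, 3.0]})
df1["Interact"] = 1
df2 = pd.DataFrame({"a": [0.4, 0.5], "b": [4.0, 5.0]})
df2["Interact"] = 0

# Same preparation steps as in the question.
data = pd.concat([df1, df2])
X = data.copy()
y = X.pop("Interact")
discrete_features = X.dtypes == float


def summarize_inputs(X, y, discrete_features):
    """Collect basic facts about the inputs passed to mutual_info_regression."""
    return {
        "X_shape": X.shape,                                   # (rows, feature columns)
        "y_shape": y.shape,                                   # one entry per row of X
        "mask_length": len(discrete_features),                # one flag per column of X
        "n_flagged_discrete": int(discrete_features.sum()),   # columns marked discrete
        "n_missing": int(X.isna().sum().sum()),               # NaN cells in X
    }


summary = summarize_inputs(X, y, discrete_features)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With my real files, X and y have matching row counts and the mask has one entry per column, which is why the shapes looked fine to me.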
Google Drive Link to The CSV Files
I would really appreciate it if anyone could help.