How to solve this IndexError?
I created a training dataframe (called Traindata in the text, data8 in the code) as follows:
import pandas as pd

dataFile = '/content/drive/Colab Notebooks/.../Normal_Anomalous_8Digits.csv'
data8 = pd.read_csv(dataFile)
Traindata looks like this. Output is the variable to be predicted, and it is not included in the test data.
        Col1      Col2      Output
0       0.001655  0.464986  1
1       0.943110  0.902166  0
2       0.071235  0.674283  1
...     ...       ...       ...
1007    0.698048  0.058458  1
1008    0.289333  0.702763  1

1009 rows × 3 columns
The model is then trained with the following commands:
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
model = BayesianNetwork([('Col1', 'Output'), ('Col2', 'Output')])
model.fit(data8, estimator=MaximumLikelihoodEstimator)
I created a test dataframe, dataP, as follows:
import pickle
import numpy as np

Col1 = pickle.load(open('/content/drive/Colab Notebooks/.../col1.pickle', 'rb'))
Col2 = pickle.load(open('/content/drive/ColabNotebooks/.../col2.pickle', 'rb'))
# Col1 has 184 rows while Col2 has 179, so Col1 is truncated to 179 rows to match.
dataT = np.array([Col1[:179], Col2]).T
dataP = pd.DataFrame(dataT, columns=['Col1', 'Col2'])
dataP.reset_index(drop=True, inplace=True)
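For reference, the truncate-and-combine step above can be sketched in a self-contained way. This is only an illustration: the arrays below are made-up stand-ins for the pickled data (their values are an assumption, only the lengths 184 and 179 come from the question), and data_p is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Stand-in arrays with invented values; the real ones come from the pickle files.
col1 = np.linspace(0.0, 1.0, 184)
col2 = np.linspace(0.5, 0.6, 179)

# Truncate both arrays to the shorter length so every row has two values.
n = min(len(col1), len(col2))
data_p = pd.DataFrame({"Col1": col1[:n], "Col2": col2[:n]})

print(data_p.shape)  # (179, 2)
```

Building the frame from a dict of equal-length arrays already yields a default RangeIndex, so a separate reset_index call should not be needed in this variant.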
dataP looks like this:
       Col1      Col2
0      0.832946  0.583372
1      0.783141  0.583948
2      0.745327  0.587644
3      0.762367  0.585629
4      0.783265  0.590721
..     ...       ...
174    0.686461  0.578358
175    0.689001  0.583951
176    0.683956  0.577511
177    0.687347  0.584231
178    0.695827  0.578313

[179 rows x 2 columns]
When I pass this dataframe to the model for prediction:
Y=model.predict(dataP)
it raises the following IndexError:
IndexError                                Traceback (most recent call last)
<ipython-input-16-489f2f25f1bc> in <module>()
----> 2 Y=model.predict(dataP)

5 frames
/usr/lib/python3.7/concurrent/futures/_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
I then checked the index of the dataframe:
print(dataP.index)
Output:
RangeIndex(start=0, stop=179, step=1)
Then I checked the data type of the index:
dataP.index.is_numeric()
dataP.index.is_integer()
Both calls return True.
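The NumPy message complains about a value used as an array index, which is not necessarily the row index of the dataframe, so it may also be worth looking at the column dtypes. A minimal sketch of such a check, using a small made-up frame (demo and describe_frame are hypothetical names, not part of the original code):

```python
import pandas as pd

def describe_frame(df):
    """Report whether the row index is integer-typed and list each column's dtype."""
    return {
        "index_is_integer": pd.api.types.is_integer_dtype(df.index),
        "column_dtypes": {col: str(dtype) for col, dtype in df.dtypes.items()},
    }

# Invented two-row frame shaped like dataP.
demo = pd.DataFrame({"Col1": [0.83, 0.78], "Col2": [0.58, 0.59]})
info = describe_frame(demo)
print(info)
```

For a frame like dataP this would show an integer index but float64 columns, which is at least consistent with the error coming from the column values rather than from the index.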
If the index of dataP consists of integers, why is this error raised? Any guidance would be appreciated.
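One more check that might help narrow this down, under the assumption (not confirmed by the traceback) that the fitted discrete network only knows the exact values seen in the training data as states: count how many test values never occur in the training columns. The frames and the helper unseen_values below are made-up illustrations.

```python
import pandas as pd

def unseen_values(train, test, columns):
    """For each column, count test values that never occur in the training data."""
    return {col: int((~test[col].isin(train[col])).sum()) for col in columns}

# Invented stand-ins for the training and test frames.
train = pd.DataFrame({"Col1": [0.1, 0.2, 0.3], "Col2": [0.5, 0.6, 0.7]})
test = pd.DataFrame({"Col1": [0.1, 0.9], "Col2": [0.55, 0.65]})

result = unseen_values(train, test, ["Col1", "Col2"])
print(result)  # {'Col1': 1, 'Col2': 2}
```

If most values in dataP turn out to be unseen, the continuous columns may need to be discretized into a shared set of bins before fitting and predicting, but that is a guess on my side.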
Regards,