Predicting Disease Drugs
I have a dataset in the format:
Keywords Disease/Drugs
bradycardia, insomnia, hypotension, hearinglos... NSAIDS Poisoning
vomiting, nausea, diarrhea, seizure, edema, an... NSAIDS Poisoning
pancreatitis, gi, symptoms, restlessness, leuk... Chronic abacavir use (Nucleoside Analog Revers..
ards, apnea, hepatotoxicity, dyspnea, pulmonar... Chronic stavudine and didanosine use (Nucleosi...
The dataset has many more rows, all in this format.
I split the Keywords column on the commas and exploded it, so that each keyword gets its own row:
Keywords Disease/Drugs
bradycardia NSAIDS Poisoning
insomnia NSAIDS Poisoning
pancreatitis Chronic stavudine and didanosine use (Nucleosi...
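For reference, the reshaping step can be sketched with pandas roughly as follows. The two-row sample frame and the label names "Disease A" and "Disease B" are made up for illustration (the real labels are truncated above), and the variable names are my own:

```python
import pandas as pd

# Hypothetical sample in the original wide format; the real data has many more rows.
df = pd.DataFrame({
    "Keywords": ["bradycardia, insomnia, hypotension", "vomiting, nausea"],
    "Disease/Drugs": ["Disease A", "Disease B"],
})

# Split each comma-separated string into a list, then give every keyword its own row.
long_df = df.assign(Keywords=df["Keywords"].str.split(",")).explode("Keywords")

# Remove the whitespace left over after the commas and renumber the rows.
long_df["Keywords"] = long_df["Keywords"].str.strip()
long_df = long_df.reset_index(drop=True)

print(long_df)
```

Each keyword keeps the Disease/Drugs label of the row it came from.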
I then built the prediction system using DecisionTreeClassifier, after encoding the input column Keywords.
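A minimal sketch of that setup is below. It assumes the encoding is a LabelEncoder (one integer per keyword), which would match the single-value input `[[t]]` used further down; the toy keywords and the labels "Disease A" and "Disease B" are placeholders, not rows from the real data:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical exploded data: one keyword per row, with placeholder labels.
keywords = ["bradycardia", "insomnia", "vomiting", "pancreatitis"]
labels = ["Disease A", "Disease A", "Disease A", "Disease B"]

# Assumption: each keyword is mapped to a single integer code.
encoder = LabelEncoder()
X = encoder.fit_transform(keywords).reshape(-1, 1)  # one feature column

model = DecisionTreeClassifier(random_state=0)
model.fit(X, labels)

# Encode one symptom and get its class probabilities, shape (1, n_classes).
t = encoder.transform(["bradycardia"])[0]
p_probability = model.predict_proba([[t]])
```

The columns of `p_probability` follow the order of `model.classes_`, so an index obtained from the probabilities can be mapped back to a disease/drug name through that attribute.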
Also, I found the top 10 predictions using:
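import numpy as np
# t is the encoded symptom; predict_proba returns an array of shape (1, n_classes)
p_probability = model.predict_proba([[t]])
best_n = np.argsort(p_probability, axis=1)[:, -10:]  # indices of the 10 most probable classes, in ascending order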
When I input a single symptom such as bradycardia, it shows the 10 best predictions.
When I input a list of 5 symptoms, it shows 50 best predictions, i.e. 10 for each symptom.
Since the symptoms in a list can share diseases/drugs, I want a system that, given a list of any number of symptoms, shows only the 10 best predictions overall.
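To make the desired behaviour concrete, here is a rough sketch of one possible way to do it: average the per-symptom probability rows into a single row and take the top n of that. Averaging is my own assumption, not something I know to be the right way to combine them. The toy data, the LabelEncoder, the placeholder labels, and the helper name `top_n_overall` are all hypothetical:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical exploded data with placeholder labels.
keywords = ["bradycardia", "insomnia", "vomiting", "nausea", "pancreatitis", "apnea"]
labels = ["Disease A", "Disease A", "Disease A", "Disease B", "Disease B", "Disease C"]

encoder = LabelEncoder()
X = encoder.fit_transform(keywords).reshape(-1, 1)
model = DecisionTreeClassifier(random_state=0).fit(X, labels)


def top_n_overall(symptoms, n=10):
    """Return the n classes with the highest mean probability over all symptoms."""
    encoded = encoder.transform(symptoms).reshape(-1, 1)
    proba = model.predict_proba(encoded)      # shape (n_symptoms, n_classes)
    mean_proba = proba.mean(axis=0)           # collapse to shape (n_classes,)
    order = np.argsort(mean_proba)[::-1][:n]  # best class first
    return [(str(model.classes_[i]), float(mean_proba[i])) for i in order]


# Three symptoms in, a single ranked list out (n=2 here because the toy data has 3 classes).
result = top_n_overall(["bradycardia", "insomnia", "nausea"], n=2)
print(result)
```

With this, the length of the output depends only on `n` and not on how many symptoms are passed in.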
Topic classifier decision-trees scikit-learn python predictive-modeling
Category Data Science