Can we extract data types of features/variables from a pickled model for Logistic Regression, Decision Tree, Random Forest?

I am trying to extract the data types of variables/features from a pickled ML model file. As far as I can see, the pickle file contains no information about the variables' data types, except for XGBoost. Is there any way to extract the data type information from the pickled model file for other ML algorithms?
Category: Data Science

The actual results and results from pickle files are not matching in pandas for DBSCAN clustering

I've built a DBSCAN clustering model. The original output and the output obtained after loading the model from the pickle file do not match. I am clustering the WT column based on the HD and MC columns (data = HD, MC; target = WT). Below, the cluster for the 1st record is 0, but after running it from the 'pkl' file, the predicted result is [-1]. Dataframe:

HD   MC     WT   Cluster
200  Other  4.5  0
150  Pep    5.6  0
100  Pla    35   -1
50   Same   15   0
…
Category: Data Science

PicklingError in pyspark (PicklingError: Can't pickle <class '__main__.Person'>: attribute lookup Person on __main__ failed)

I am unable to pickle the class below. I am using Databricks 6.5 ML (includes Apache Spark 2.4.5, Scala 2.11). import pickle class Person: def __init__(self, name, age): self.name = name self.age = age p1 = Person("John", 36) pickle.dump(p1,open('d.pkl','wb')) PicklingError: Can't pickle <class '__main__.Person'>: attribute lookup Person on __main__ failed
Category: Data Science

ValueError when upload pkl model file

I'm saving a .pkl file, but when I try to load it I get: ValueError: Buffer dtype mismatch, expected 'SIZE_t' but got 'long' The save and load methods: def save_model(self): filename = os.path.join(self.main_dir, self.trained_model_filename) print(filename) with open(filename, 'wb') as f: pickle.dump(self.model, f) def load_model(self): filename = os.path.join(self.main_dir, self.trained_model_filename) with open(filename, 'rb') as f: model = pickle.load(f) self.model = model When I looked into the problem, I read that it could happen due to differences in the environment, but I ran it …
Category: Data Science

How can I solve TypeError: cannot pickle 'dict_keys' object?

My code: class spatial_dataset(Dataset): def __init__(self, dic, root_dir, mode, transform=None): self.keys = dic.keys() self.values=dic.values() self.root_dir = root_dir self.mode =mode self.transform = transform def __len__(self): return len(self.keys) def load_ucf_image(self,video_name, index): path = self.root_dir + '/' + video_name.split('_')[0] + '/' + 'v_' + video_name + '/' print(path+'Image'+str(index)+'.jpg') img = Image.open(path+'Image'+str(index)+'.jpg') transformed_img = self.transform(img) img.close() return transformed_img def __getitem__(self, idx): print(idx) if self.mode == 'train': video_name, nb_clips = list(self.keys)[idx].split(' ') print(video_name, nb_clips) num_clip = int(nb_clips) print(num_clip) clips = [] clips.append(random.randint(1, int(num_clip/3))) clips.append(random.randint(int(num_clip/3), int(num_clip*2/3))) …
Category: Data Science
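
A minimal sketch of what typically triggers this error, assuming it comes from storing `dic.keys()` and `dic.values()` on an object that later gets pickled (for example when a data loader hands the dataset to worker processes). The dictionary contents below are hypothetical; only the `keys`/`values` pattern is taken from the question.

```python
import pickle

# Hypothetical stand-in for the dictionary passed to the dataset class.
dic = {"video_a 12": 0, "video_b 30": 1}

# A dict view (dict_keys / dict_values) cannot be pickled directly.
try:
    pickle.dumps(dic.keys())
    view_pickled = True
except TypeError:
    view_pickled = False

# Materialising the views as lists gives objects that are both
# picklable and indexable.
keys = list(dic.keys())
values = list(dic.values())
restored_keys = pickle.loads(pickle.dumps(keys))
```

In the class from the question, this would mean writing `self.keys = list(dic.keys())` and `self.values = list(dic.values())` in `__init__`, which also removes the need to call `list(self.keys)` on every `__getitem__`.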

Can anyone help me train on new data with an already saved pickle file?

Can anyone help me with the code to train on new data using an already saved pickle file? I've trained the model with RandomForestClassifier from sklearn and saved it as a .pickle file. Now I'm trying to retrain the model on new data with the same features, and I want to use the pickle file to do so. Can anyone help me with this in terms of code?
Category: Data Science

Drastic increase in accuracy while using pickle file with sklearn

I trained an xgboost classifier and it gave an accuracy of 49.99%, and I saved that model to a pickle file. When I ran the same data with the pickle file (.pkl), it gave an accuracy of 88.99%. I don't know why this is happening. Please help me out. bank_dataset = pd.read_csv(r"dataset.csv") missing_val = pd.DataFrame(bank_dataset.isnull().sum()) bank_dataset[' Balance'] = bank_dataset[' Balance'].fillna(bank_dataset[' Balance'].mean()) from sklearn.preprocessing import LabelEncoder le = LabelEncoder() objList = bank_dataset.select_dtypes(include = "object").columns for feat in objList: …
Category: Data Science

Convert bin model to pickle

I trained a word2vec model using the Gensim library and saved it in .bin format. Q1: Can we convert this trained model from bin format to pickle? Q2: Would it speed up the execution time?
Topic: pickle gensim
Category: Data Science

extract classifier properties from pickled file

I have a *.clf file which I get from fit() of sklearn. I fit my data with SVM or KNN and want to show its properties when using it for predictions. For example, I open an earlier pickled classifier file and when I print it I get something like this: SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf', max_iter=-1, probability=True, random_state=None, shrinking=True, tol=0.001, verbose=True) How can I get the value of, for example, gamma to print it out somewhere else except for …
Category: Data Science

Attribute error when loading pickle file in python jupyter notebook

I have created a config file with the following code for an object detection task and saved in the local disk. # Create the config C = Config() C.use_horizontal_flips = horizontal_flips C.use_vertical_flips = vertical_flips C.rot_90 = rot_90 C.record_path = record_path C.model_path = output_weight_path C.num_rois = num_rois C.base_net_weights = base_weight_path with open(config_output_filename, 'wb') as config_f: pickle.dump(C,config_f) I am trying to load this pickle file in another jupyter notebook. with open(config_output_filename, "rb") as f: C = pickle.load(f) # turn off any data …
Category: Data Science

No module named "" error when loading the pickle file

I created a model using the SVR (Support Vector Regression) algorithm and saved it in a pickle file: import pickle pickle.dump(model,open('carb patients data/Pickles/svr.pickle', 'wb')) In Jupyter Notebook it gives the error can't pickle _thread.RLock objects So I converted that Jupyter file into a .py file, downloaded it, and executed it using Python IDLE. Then it got saved in that particular location. But when I try to load my pickle file from another Jupyter Notebook it gives an error …
Category: Data Science

TypeError: 'GridSearchCV' object is not callable - how do I use a pickle of an SVM (Scikit-learn)?

I have created an SVM in Scikit-learn for classification. It works; it prints out either 1 or 0 depending on the class. I converted it to a pickle file and tried to use it, but I am receiving this error: TypeError: 'GridSearchCV' object is not callable (occurs during the last line of the program) How can I overcome this? Code: import pandas as pd from sklearn.feature_extraction.text import CountVectorizer from sklearn.naive_bayes import MultinomialNB, GaussianNB from sklearn import svm from sklearn.model_selection import …
Category: Data Science

How to save/load a Model (Pickle) with a specific path/directory

It seems like a very basic thing, but I couldn't find an answer to it. I want to save my model to a specific directory using pickle. The two algorithms below work fine for saving it in the same directory as the code itself, but I want to save all my models in a dedicated folder. I tried to just change the "filename" to "filepath" and, well, make it a path, but the world isn't that easy, it seems. Example …
Category: Data Science
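
One way to do this, sketched with the standard library only: build the full path with `os.path.join`, create the folder if it does not exist yet, and pass the opened file to `pickle`. The helper names `save_model` and `load_model`, the `models` folder, and the dictionary standing in for a trained model are all assumptions made for illustration.

```python
import os
import pickle
import tempfile


def save_model(model, directory, filename):
    """Pickle `model` to directory/filename, creating the directory if needed."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, filename)
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return path


def load_model(path):
    """Load a pickled object from `path`."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical stand-in for a trained model; any picklable object works.
model = {"coef": [0.5, 1.5], "intercept": 2.0}

# A temporary location is used here so the sketch runs anywhere;
# in practice `models_dir` would be the dedicated models folder.
models_dir = os.path.join(tempfile.mkdtemp(), "models")
saved_path = save_model(model, models_dir, "model.pkl")
restored = load_model(saved_path)
```

A common cause of failures with a hand-written path is that the target folder does not exist yet; `os.makedirs(..., exist_ok=True)` covers that case.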

Using a trained Model from Pickle

I trained a model that should predict a son's height based on his father's height, and saved it with pickle. I can now load the model and want to use it, but unfortunately a second variable is demanded from me (besides the height of the father). I think I did something wrong when training the model? I will post the part of the code where I think the error is; please ask if you need …
Category: Data Science

Transform test data when using a persistent model

I'm quite new to data science and only slowly following the necessary steps to get valid results using scikit-learn. As far as I understand you fit and transform the training data and only transform the test data (using the parameters retrieved by the earlier fitting). For my project a persistent model is necessary, for that I export the trained model using joblib. When applying the model on test data later, is there a way to retrieve the parameters (for transformation) …
Category: Data Science

Can I update my model using partial_fit after training my model using fit?

I trained my model with MLPClassifier using the fit method and saved it to a pickle file the first time. clf = MLPClassifier(hidden_layer_sizes=(50,50,5), max_iter=100, alpha=0.0001, solver='sgd', verbose=10, random_state=21,tol=0.000000001) clf.fit(X_old, y_old) joblib.dump(clf, 'saved_model_clf.pkl') A few hours later I receive new data online and update my model using partial_fit. clf_load = joblib.load('saved_model_clf.pkl') y_new = to_categorical(y_new,num_classes=3) #I knew classes from training set clf_load.partial_fit(X_new, y_new, classes=list(range(y_new.shape[1]))) joblib.dump(clf_load, 'saved_model_clf.pkl') After partial_fit I'm predicting my results clf_pred = joblib.load('saved_model_clf.pkl') predictions = clf_pred.predict(df[0:1]) I …
Category: Data Science

Should I load my model pkl object every time I predict?

I am predicting on data in real time: when I hit the URL it should predict, but I should not have to load my model again and again. Can someone help optimize the code? Can I load the model globally, or how can I keep my model object in memory between predictions? @app.route("/predict",methods = ['GET']) def predictData(): try: message = request.args.get('message', default = '') message = message.translate(str.maketrans(string.punctuation, ' '*len(string.punctuation))) text_clf_nb_fit = joblib.load('UpdatedAmtAlertClassifierModel.pkl') result = text_clf_nb_fit.predict([message]) Thanks in advance.
Topic: pickle python
Category: Data Science
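
A possible approach, sketched under assumptions: cache the loaded object so the file is read only on the first request. The sketch uses `pickle` and a plain dictionary as a stand-in for the classifier so that it runs without extra packages; `get_model`, `predict`, and `load_count` are hypothetical names, and in the question's setup `joblib.load` would take the place of `pickle.load`.

```python
import functools
import os
import pickle
import tempfile

load_count = 0  # counts how often the file is actually read (illustration only)


@functools.lru_cache(maxsize=None)
def get_model(path):
    """Load the pickled model once per path; later calls reuse the cached object."""
    global load_count
    load_count += 1
    with open(path, "rb") as f:
        return pickle.load(f)


def predict(path, message):
    """Stand-in for a request handler: look up the cached model and use it."""
    model = get_model(path)
    return model.get(message, "unknown")


# Hypothetical stand-in model: a dict mapping messages to labels.
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump({"balance low": "alert"}, f)

first = predict(model_path, "balance low")
second = predict(model_path, "hello")
```

An equally simple alternative is to load the model once at module level, outside the route function, so every request reuses the same object.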
