Plotting decision boundary from Random Forest model for multiclass MNIST dataset

I am using the MNIST dataset with 10 classes (the digits 0 to 9). I am using a compressed version with 49 predictor variables (x1, x2, ..., x49). I have trained a Random Forest model and created a test data set, which is a grid, on which I have used the trained model to generate predictions, both as class probabilities and as classes. I am trying to generalise the code here that generates a decision boundary when there are only two outcome …
Category: Data Science

Multiple regression (using machine learning - how to plot data)

I wonder how I can use machine learning to plot multiple linear regression in a figure. I have one dependent variable (prices of apartments) and five independent variables (floor, builtyear, roomnumber, square meter, kr/sqm). The task is first to use machine learning, which gives the predicted values and the actual values. Then you have to plot those values in a figure. I have used this code: x_train, x_test, y_train, y_test = tts(xx1, y, test_size=3) Outcome: LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False) regr.fit(x_train, y_train) …
Category: Data Science

How to plot using facet_wrap over multiple pages as .pdf files in R

I am using ggplot to compare 114 unique studies for a particular variable I'm interested in. This is what I have used: ggplot(steps, aes(x=factor(edu))) + geom_bar(aes(y = (..count..), group = id_study,)) + facet_wrap(~id_study,) Whilst this works, all 114 studies are plotted on one page and the formatting is all squashed. How do I split this over 4x4 pages? Many thanks, S. Edit: As there are 114 unique studies, I have 5 pages in total 1) ggplot(steps, aes(x=factor(edu))) + …
Category: Data Science

How to plot the bar charts of precision, recall, and f-measure?

I have used 4 machine learning models on a task and now I am struggling to plot their bar charts like the one shown in the image below. I am printing the classification report to get precision, recall, etc. My code is shown below: def Statistics(data): # Classification Report print("Classification Report is shown below") print(classification_report(data['actual labels'],data['predicted labels'])) # Confusion matrix print("Confusion matrix is shown below") cm=confusion_matrix(data['actual labels'],data['predicted labels']) plt.figure(figsize=(10,7)) sn.heatmap(cm, annot=True,cmap='Blues', fmt='g') plt.xlabel('Predicted') plt.ylabel('Truth') Statistics(data) How can I plot this type of chart …
Category: Data Science

Plotting SVM hyperplane margin

I'm trying to understand how to plot an SVM hyperplane and its margins using this example: https://scikit-learn.org/stable/auto_examples/svm/plot_svm_margin.html I got stuck at the part that plots the parallels: # plot the parallels to the separating hyperplane that pass through the # support vectors (margin away from hyperplane in direction # perpendicular to hyperplane). This is sqrt(1+a^2) away vertically in # 2-d. margin = 1 / np.sqrt(np.sum(clf.coef_ ** 2)) yy_down = yy - np.sqrt(1 + a ** 2) * margin yy_up = yy …
Category: Data Science
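
The `sqrt(1 + a ** 2)` factor quoted in the excerpt above can be checked numerically. The following is a minimal sketch in plain Python, assuming a made-up 2-D weight vector `(w0, w1)` and intercept `b` (hypothetical values, not taken from the scikit-learn example). It shifts the separating line vertically by `sqrt(1 + a ** 2) * margin` and confirms that the shifted lines sit where the decision function is approximately +1 and -1.

```python
import math

# Hypothetical 2-D linear SVM parameters (made up for illustration);
# with scikit-learn they would come from clf.coef_[0] and clf.intercept_[0].
w0, w1 = 3.0, 4.0
b = -2.0

# Separating hyperplane w0*x + w1*y + b = 0, rewritten as y = a*x - b/w1.
a = -w0 / w1


def hyperplane_y(x):
    """y coordinate of the separating line at a given x."""
    return a * x - b / w1


def decision_function(x, y):
    """Signed score w.x + b; it is 0 on the hyperplane."""
    return w0 * x + w1 * y + b


# Perpendicular distance from the hyperplane to each margin line.
margin = 1.0 / math.sqrt(w0 ** 2 + w1 ** 2)

# The same distance measured vertically (along the y axis) in 2-D.
vertical_shift = math.sqrt(1.0 + a ** 2) * margin

x = 1.5
y_up = hyperplane_y(x) + vertical_shift
y_down = hyperplane_y(x) - vertical_shift

# Which line gets +1 and which gets -1 depends on the sign of w1.
print(decision_function(x, y_up))
print(decision_function(x, y_down))
```

The reason the factor appears: a line with slope `a` makes an angle with the x axis whose cosine is `1 / sqrt(1 + a ** 2)`, so a perpendicular distance of `margin` corresponds to a vertical distance of `margin * sqrt(1 + a ** 2)`.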

Python: How to construct a joyplot with values taken from a column in pandas dataframe as y axis

I have a dataframe df in which the column extracted_day consists of dates ranging from 2022-05-08 to 2022-05-12. I have another column named gas_price, which consists of the price of the gas. I want to construct a joyplot such that for each date, it shows the gas_price on the y axis and has minutes_elapsed_from_start_of_day on the x axis. We may also use a ridgeplot or any other plot if this doesn't work. This is the code that I have written, but …
Category: Data Science

chart x-axis spacing terminology question

In the following hand-made charts I show some value for years. In the first chart I've evenly spaced each year. In the second chart I've spaced them relative to their actual year value within time (i.e., 2016 is closer to 2017 than to 2010). Is there a term for the spacing of the second chart? Imagine building software that has a toggle control to switch the view from A to B. What would you call it?
Category: Data Science

Many regression lines in a plot

How do you plot many regression lines in a plot? This concerns a textbook question from "Forecasting: Principles and Practice". The dataset concerns winning times of Olympic running events. Categories include year, length (distance), sex (gender). The question asks to make a regression line for each plot to see the average rate of change for each event. I could brute force this question but I want to find an efficient way to make multiple regression lines in a single plot. …
Category: Data Science
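
One way to avoid the brute-force approach mentioned in the excerpt above is to group the rows by event and fit one least-squares line per group in a loop. The sketch below assumes a made-up table of `(event, year, winning_time)` rows (hypothetical values, not the textbook data) and uses only the standard library; `fit_line` is a helper name introduced here for illustration.

```python
from collections import defaultdict

# Made-up (event, year, winning_time) rows, for illustration only;
# they are not taken from the textbook dataset.
rows = [
    ("100m", 1996, 10.0), ("100m", 2000, 9.9), ("100m", 2004, 9.8),
    ("200m", 1996, 20.0), ("200m", 2000, 19.9), ("200m", 2004, 19.6),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Collect the x and y values of each event separately.
groups = defaultdict(lambda: ([], []))
for event, year, time in rows:
    groups[event][0].append(year)
    groups[event][1].append(time)

# One (intercept, slope) pair per event; the slope is the average
# change in winning time per year.
fits = {event: fit_line(xs, ys) for event, (xs, ys) in groups.items()}
for event, (intercept, slope) in fits.items():
    print(f"{event}: {slope:+.4f} seconds per year")
```

Each fitted pair can then be drawn as its own line on a shared axis. In ggplot2, `geom_smooth(method = "lm")` combined with a grouping aesthetic or facets is a common way to get the same per-group fits without an explicit loop.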

How to interpret Keras training loss without comparing with validation loss?

I have several implementations of the same neural network, each one with different starting parameters. This is one of my plots comparing the training loss of the base experiment with the training loss of another experiment. I also have other examples: Can anyone point me to some instructions on how to understand these outputs from the Keras fit()? Note that I don't have any validation set. Thanks
Category: Data Science

Plot three series on the same plot grouping data by day and month

I have a dataset containing three years of data which I would like to plot and compare by day and month, but I am having a hard time with the final result. I am nearly there, but for some strange reason, while plotting I continue to get an annoying gap in between the data points, even if this does not seem to be included in the data series. The whole dataset is this: Day Visits 0 2018-04-01 1 1 2018-04-02 …
Category: Data Science

Time Series Plot for floating values

I have a DataFrame which looks as shown below. I am trying to make a line plot to look at the peaks of both columns (a, b). I have gotten as far as sns.set_style("darkgrid") plt.plot(wr['a'][:100]) plt.show() but the plot looks shabby. wr.set_index(['Date_x'],inplace=True) wr['a'][:100].plot() wr['b'][:100].plot() I am looking to have something like this. Any help is appreciated.
Category: Data Science

Plot six variables

I would like to plot a landscape spanned by six variables. The numerical target variable is explained by five numerical variables. Ultimately, the goal is to get a visual impression of the optima and of the parameter landscape itself. Any advice on how to proceed? I would prefer R or Python but I am open to alternatives.
Category: Data Science

Why does changing the cluster number change the plot in Kmeans?

This might be a dumb question, but I can't find the answer to it. I don't have a perfect mathematical understanding of kmeans, so apologies if it is. I'm just wondering why I see a different plot when I change the number of clusters in a kmeans plot. Here's the code that I'm using: set.seed(1) k <- kmeans(data, centers = x) plotcluster(data, k$cluster) I vary x to see what the plot looks like. Below are the results for x = …
Category: Data Science

How to plot GridSearchCV results?

How can I plot my results from GridSearchCV? clf = GridSearchCV(pipeline, parameters, cv=3,return_train_score=True) clf.fit(x, y) df = pd.DataFrame(clf.cv_results_) I'm trying to get a plot similar to the one here: https://matthewbilyeu.com/blog/2019-02-05/validation-curve-plot-from-gridsearchcv-results , but that uses the grid search object, and I have tried and failed to get the same result using just the grid search df (from above). Can anybody help with how I go about this?
Category: Data Science
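
A possible starting point for working from the results table alone is sketched below. It assumes the rows have the shape produced by `pd.DataFrame(clf.cv_results_).to_dict("records")`, with `param_<name>`, `mean_train_score` and `mean_test_score` columns; the parameter names (`param_C`, `param_gamma`) and all scores are made up for illustration, and `curve_for` is a hypothetical helper. For each value of one parameter it keeps the row with the best mean test score, which is one simple choice and not necessarily what the linked post does.

```python
# Hypothetical rows shaped like pd.DataFrame(clf.cv_results_).to_dict("records");
# the parameter names and the scores are made up for illustration.
records = [
    {"param_C": 0.1, "param_gamma": 0.01, "mean_train_score": 0.80, "mean_test_score": 0.78},
    {"param_C": 0.1, "param_gamma": 0.1, "mean_train_score": 0.84, "mean_test_score": 0.80},
    {"param_C": 1.0, "param_gamma": 0.01, "mean_train_score": 0.90, "mean_test_score": 0.86},
    {"param_C": 1.0, "param_gamma": 0.1, "mean_train_score": 0.95, "mean_test_score": 0.88},
    {"param_C": 10.0, "param_gamma": 0.01, "mean_train_score": 0.99, "mean_test_score": 0.85},
    {"param_C": 10.0, "param_gamma": 0.1, "mean_train_score": 1.00, "mean_test_score": 0.83},
]


def curve_for(records, param):
    """For each value of `param`, keep the row with the best mean test score.

    Returns the sorted parameter values with the matching train and test scores.
    """
    best = {}
    for row in records:
        value = row[param]
        if value not in best or row["mean_test_score"] > best[value]["mean_test_score"]:
            best[value] = row
    values = sorted(best)
    train = [best[v]["mean_train_score"] for v in values]
    test = [best[v]["mean_test_score"] for v in values]
    return values, train, test


values, train, test = curve_for(records, "param_C")
print(values, train, test)
```

The three returned lists can be passed to any line-plotting call, with `values` on the x axis and one line each for `train` and `test`. With a pipeline, the real column names would carry the step prefix that was used in the parameter grid.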

Plotting three lines on the same plot (with 4-hour frequency)

I want to plot a graph with datetime values. I saw many questions similar to mine, but those answers didn't work for me. Here is the code that I used: data = pd.read_csv('p1.csv') df1 = pd.DataFrame(df1, columns=['date', 'time', 'x1', 'x2','x3']) data['date_time'] = pd.to_datetime(data['date'] + ' ' + data['time']) data['date_time'] = pd.to_datetime(data['date_time'], format='%m/%d/%Y %H:%M:%S') data.set_index('date_time',inplace = True) plt.plot(data.index, data.x2) Plot graph : Can anyone suggest a solution for this? CSV file after reading it: I want to plot …
Category: Data Science

Showing standard deviation for training curve

I am training a neural network and I want to plot the evolution of different metrics (MSE…) during training. To get an idea of the variation between different training runs, I am using several models and plotting the average value and standard deviation. My problem is the following: I did not manage to find a good way to plot this curve and also did not find any good explanation on the internet. Let's denote by y the metric to be displayed, and …
Category: Data Science
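
A common way to show this kind of variation is a mean curve with a shaded band of one standard deviation on either side. The sketch below computes the band with the standard library only, assuming a made-up list of MSE histories with one list per training run and one value per epoch (hypothetical numbers, equal-length runs).

```python
import statistics

# Made-up MSE histories: one list per training run, one value per epoch.
runs = [
    [1.00, 0.60, 0.40, 0.30],
    [1.20, 0.70, 0.50, 0.30],
    [0.80, 0.50, 0.30, 0.30],
]

# zip(*runs) regroups the values so each tuple holds one epoch across all runs.
per_epoch = list(zip(*runs))

mean = [statistics.fmean(values) for values in per_epoch]
std = [statistics.stdev(values) for values in per_epoch]  # sample standard deviation

# Lower and upper edges of the band around the mean curve.
lower = [m - s for m, s in zip(mean, std)]
upper = [m + s for m, s in zip(mean, std)]

epochs = list(range(1, len(mean) + 1))
for epoch, m, s in zip(epochs, mean, std):
    print(f"epoch {epoch}: mean={m:.3f} std={s:.3f}")
```

With matplotlib, the band could then be drawn by plotting `mean` against `epochs` and calling `fill_between(epochs, lower, upper, alpha=0.3)` on the same axes. Whether to use the sample or the population standard deviation (`statistics.pstdev`) is a choice worth stating in the figure caption.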

Ordered categorical xlabel number - what to call xlabel

Say I have 105 brand names from a store, and I know the average return percentage for the products of the different brands. For example: Brand = Nike, return_rate = 30% Then I order all these brands and simply put in an integer instead of the name (since I can't put all brands on the xlabel). So now Nike is simply number 50: Brand = 50, return_rate = 30% The graph looks like this I have no clue what …
Category: Data Science

Plotly Express Choropleth Map Animation loading extremely long

I am making an animated choropleth map of regions in Czechia. When I run it without the animation, purely on one set of the data it takes 7.5 seconds. Here is the code for that. However when I tried making the animation I had to stop it after 16 hours without a solution. And the HTML file I was saving it in got almost to 1GB in size. Here is the code I use for the animation. fig = px.choropleth(df_anim, …
Category: Data Science
