visualization

Changes in the standard Heatmap plot - symmetric bar colors, show only diagonal values, and column names at x,y axis ticks

Serendipity

June 4, 2022 11:01

I have a heatmap image (correlation between all matrix columns) and I'm struggling to perform all the changes below within the same image: bar colors should be symmetric around zero (e.g., correlations of 1 and -1 should have the same color); change the correlation matrix to a diagonal matrix, since correlation values are symmetric, and show only the upper matrix triangle (mask out the lower triangle); show the correlation values in every cell of the diagonal matrix; x,y …

Topic: heatmap matplotlib correlation visualization python

Category: Data Science
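One way to approach the heatmap question above, as a minimal sketch rather than a definitive answer: mask the lower triangle with NaN, fix the color limits at -1 and 1 so the scale is symmetric around zero, and write each remaining value into its cell. The helper names `upper_triangle` and `plot_corr_heatmap` and the `coolwarm` colormap are assumptions made for illustration; they do not come from the question. If correlations of 1 and -1 should literally share one color, plotting absolute values would be an alternative reading.

```python
import math


def upper_triangle(corr):
    """Copy a square matrix, replacing cells below the diagonal with NaN."""
    n = len(corr)
    return [[corr[i][j] if j >= i else math.nan for j in range(n)]
            for i in range(n)]


def plot_corr_heatmap(corr, names):
    """Draw the upper triangle of a correlation matrix with annotated cells."""
    import matplotlib.pyplot as plt  # imported here so the helper above stays dependency-free

    masked = upper_triangle(corr)
    fig, ax = plt.subplots()
    # vmin/vmax are symmetric around zero, so 0 maps to the colormap midpoint.
    image = ax.imshow(masked, cmap="coolwarm", vmin=-1, vmax=1)
    # Column names on both axes.
    ax.set_xticks(range(len(names)))
    ax.set_xticklabels(names, rotation=45, ha="right")
    ax.set_yticks(range(len(names)))
    ax.set_yticklabels(names)
    # Annotate only the cells that were not masked out.
    for i, row in enumerate(masked):
        for j, value in enumerate(row):
            if not math.isnan(value):
                ax.text(j, i, f"{value:.2f}", ha="center", va="center")
    fig.colorbar(image, ax=ax)
    return fig
```

Calling `plot_corr_heatmap(corr, names)` with a nested list of correlations and a list of column names returns a matplotlib figure that can be shown or saved.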

A clear visualization of a two-way ANOVA

Reza Norouzian

June 3, 2022 17:03

To provide a full yet simple picture of a 3-level, one-way ANOVA, I use the following visualization, where variation within each group (the filled circles) and variation between the groups (black arrows) are easy to understand. But I'm wondering if it would be possible to extend the current visualization to a 2 x 3 two-way ANOVA (adding another way with two groups to the current visualization)? (Note: the dashed vertical lines denote each group's mean)

Topic: visualization statistics r

Category: Data Science

How to summarize very large neural networks?

Oskar

May 29, 2022 13:48

I am doing a lot of work with transfer learning at the moment (using keras and tensorflow if that is relevant). I am having a lot of issues in sufficiently summarizing the very large models. This post: How do you visualize neural network architectures? shows a lot of useful methods for visualizing architectures, and they are great for networks such as VGG16, but none of them are reasonable to include in a report if the models are very large (such as …

Topic: transfer-learning machine-learning-model deep-learning visualization machine-learning

Category: Data Science

How do you visualize neural network architectures?

Martin Thoma

May 29, 2022 07:15

When writing a paper or making a presentation about neural networks, one usually visualizes the network's architecture. What are good / simple ways to visualize common architectures automatically?

Topic: deep-learning neural-network visualization machine-learning

Category: Data Science

Visualizing large number of points as a 3D density map

deeep

May 28, 2022 18:05

The result of my computational simulation is a (time-dependent) system of a large number (~100k) of moving points in a confined space. Each point has its own Cartesian coordinates as well as a weight (w), in the form $(x_i,y_i,z_i;w_i)$. I'm looking for a software/app/package to create a snapshot of the 3D spatial density map of these points (something like this). As you can see in this figure, the points are not going to be displayed individually, but only a transparent cloud …

Topic: visualization

Category: Data Science

Geo Add-on Choropleth Widget not found

Sanderson Macedo

May 27, 2022 15:00

I am trying to create a Choropleth map using the Choropleth widget from the Geo Add-on in Orange. However, this widget is not appearing. Any ideas? My current version: Orange version 3.24.1 for Windows

Topic: heatmap orange3 orange geospatial visualization

Category: Data Science

How to arrange web scraped data in a table using R?

user151030

May 27, 2022 08:10

Original Code library(netstat) library(RSelenium) library(tidyverse) obj<-rsDriver(browser="chrome",chromever="101.0.4951.15",verbose=F,port=free_port()) remDr<-obj$client remDr$navigate('https://www.imdb.com/search/title/?year=2022&title_type=feature&') Title<-remDr$findElements(using='css','.lister-item-header a') lapply(Title,function(x) { x$getElementText()%>% unlist() }) o/p: [[1]] 1 "Doctor Strange in the Multiverse of Madness" [[2]] 1 "Senior Year" My attempts to arrange data in tabular form- 1.movies=data.frame(Title,stringsAsFactors=FALSE) view(movies) **Error in as.data.frame.default(x[[i]], optional = TRUE) : cannot coerce class ‘structure("webElement", package = "RSelenium")’ to a data.frame** 2.movies=data.frame(x,stringsAsFactors=FALSE) view(movies) **Error in data.frame(X, stringsAsFactors = FALSE) : object 'X' not found** 3.Part of original code tweaked- lapply(Title,function(x) { **t<-list(x$getElementText()%>% unlist())** }) l=data.frame("movie"=t,stringsAsFactors …

Topic: structured-data web-scraping visualization r

Category: Data Science

Visualizing 28 different variables with 28 different colors?

kmace

May 27, 2022 00:04

ColorBrewer seems to be very useful in selecting a color palette to represent factors that have up to 12 possible values. I have 28. Is it a horrible idea to represent 28 variables with color? If so, could you suggest an alternative visual indicator? Currently I'm using the colors for column side colors in a heatmap shown below. As you can see, the Strain column is not very informative:

Topic: visualization

Category: Data Science

How to perform (modified) t-test for multiple variables and multiple models on Python (Machine Learning)

Shounak Ray

May 25, 2022 03:06

I have created and analyzed around 16 machine learning models using WEKA. Right now, I have a CSV file which shows the models' metrics (such as percent_correct, F-measure, recall, precision, etc.). I am trying to conduct a (modified) Student's t-test on these models. I am able to conduct one (according to THIS link) where I compare only ONE variable common to only TWO models. I want to perform one (or multiple) t-tests with MULTIPLE variables and MULTIPLE models at once. …

Topic: visualization pandas python statistics machine-learning

Category: Data Science

Unable to generate useful insights on a highly cardinal data

dark_rush

May 24, 2022 06:21

I'm working on CRM data. I did some cleaning and encoding and ran a decision tree classifier, from which I plotted a feature_importance graph. From that I found that the Sales person column, which is highly cardinal (around 1300+ categories/sales persons), is one of the important features. Now I'm trying to generate some insights on this column with respect to the target column (binary values). I would like to know, in general, how to create insights from such a large categorical column. P.S: Other columns are …

Topic: data-science-model data-analysis visualization python machine-learning

Category: Data Science

How to represent the number of neurons in an LSTM for architecture schematic?

Capeboom

May 23, 2022 11:01

I'm trying to visualise a neural network schematic and found a great tool for building schematics here http://alexlenail.me/NN-SVG/index.html. I've edited the SVG file to change one of the dense layers into an LSTM layer, and the input to time series instead of singular neurons. At the bottom of the image there is some set notation detailing how many neurons are in each layer. I'm not too familiar with set notation. I'm not quite sure how to represent the LSTM layers …

Topic: lstm rnn neural-network visualization

Category: Data Science

Visualizing decision tree with feature names

torBhakt

May 23, 2022 00:02

from scipy.sparse import hstack X_tr1 = hstack((X_train_cc_ohe, X_train_csc_ohe, X_train_grade_ohe, X_train_price_norm, X_train_tnppp_norm, X_train_essay_bow, X_train_pt_bow)).tocsr() X_te1 = hstack((X_test_cc_ohe, X_test_csc_ohe, X_test_grade_ohe, X_test_price_norm, X_test_tnppp_norm, X_test_essay_bow, X_test_pt_bow)).tocsr() X_train_cc_ohe and all are vectorized categorical data, and X_train_pt_bow is bag of words vectorized text data. Now, I applied a decision tree classifier on this model and got this: I took max_depth as 3 just for visualization purposes. My question is: I would like to get feature names in my output instead of index as X2599, X4 etc. …

Topic: decision-trees visualization

Category: Data Science

Cluster Evaluation with Jaccard and Rand Index

Mirko

May 22, 2022 19:00

I've clustered my data according to 3 criteria into 3 groups. I used kmeans to obtain those clusters, so the label for each cluster is random and changes at each script run. To evaluate the consistency of my clusters I decided to use the Jaccard index, but I can't understand how to apply it properly. Let's say I have this data where alpha, beta and gamma are the 3 methods, and the Cluster Index is the value returned by K-means for …

Topic: model-evaluations jaccard-coefficient visualization python clustering

Category: Data Science
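Because k-means labels are arbitrary, one common way to compare two clusterings is to count pairs of points rather than compare labels directly. The sketch below is a minimal pure-Python illustration of the pair-counting Rand and Jaccard indices; the function names are assumptions made for this example, and returning 1.0 when no pair is ever co-clustered is a convention chosen here, not something taken from the question.

```python
from itertools import combinations


def pair_counts(labels_a, labels_b):
    """Classify every pair of points by whether each clustering groups it together."""
    both = only_a = only_b = neither = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            both += 1
        elif same_a:
            only_a += 1
        elif same_b:
            only_b += 1
        else:
            neither += 1
    return both, only_a, only_b, neither


def rand_index(labels_a, labels_b):
    """Fraction of pairs on which the two clusterings agree."""
    both, only_a, only_b, neither = pair_counts(labels_a, labels_b)
    return (both + neither) / (both + only_a + only_b + neither)


def jaccard_index(labels_a, labels_b):
    """Pairs grouped together by both, over pairs grouped together by at least one."""
    both, only_a, only_b, _ = pair_counts(labels_a, labels_b)
    grouped = both + only_a + only_b
    # Assumption: treat "no pair grouped by either clustering" as perfect agreement.
    return both / grouped if grouped else 1.0
```

Both functions depend only on which points share a cluster, so renaming the cluster labels between runs does not change the result.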

Visualize Softmax values in CNN prediction

Jane Mänd

May 19, 2022 04:02

What is the most convenient way to visualize Softmax values after calling the CNN prediction function? Do I have to collect different probability values and feed them to matplotlib, or are there any more convenient ways/libraries to do this? Below is one example of what I mean:

Topic: cnn visualization

Category: Data Science

How to plot segmented bar chart (stacked bar graph) with Python?

Paw in Data

May 17, 2022 16:06

cat = {'A':1, 'B':2, 'C':3} dog = {'A':2, 'B':2, 'C':4} owl = {'A':3, 'B':3, 'C':3} Suppose I have 3 dictionaries, each containing pairs of (subcategory, count). How can I plot a segmented bar chart (i.e., a stacked bar graph) using Python with x being 3 categories (cat, dog, owl) and y being proportion (of each subcategory)? What I have in mind looks like this:

Topic: bar-chart matplotlib visualization python

Category: Data Science
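For the stacked bar question above, a minimal sketch of one possible approach: convert each dictionary's counts to proportions, then stack one `bar` call per subcategory on top of the running totals. The three dictionaries come from the question; the helper names `proportions` and `plot_stacked` are assumptions made for illustration.

```python
cat = {'A': 1, 'B': 2, 'C': 3}
dog = {'A': 2, 'B': 2, 'C': 4}
owl = {'A': 3, 'B': 3, 'C': 3}


def proportions(counts):
    """Convert subcategory counts to shares that sum to 1."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


def plot_stacked(groups):
    """Draw one stacked bar per group, with one segment per subcategory."""
    import matplotlib.pyplot as plt  # imported here so proportions() has no dependencies

    labels = list(groups)
    shares = [proportions(groups[label]) for label in labels]
    subcategories = sorted({key for share in shares for key in share})
    fig, ax = plt.subplots()
    bottoms = [0.0] * len(labels)
    for sub in subcategories:
        heights = [share.get(sub, 0.0) for share in shares]
        # Each segment starts where the previous segments ended.
        ax.bar(labels, heights, bottom=bottoms, label=sub)
        bottoms = [b + h for b, h in zip(bottoms, heights)]
    ax.set_ylabel("Proportion")
    ax.legend(title="Subcategory")
    return fig
```

Calling `plot_stacked({'cat': cat, 'dog': dog, 'owl': owl})` returns a figure with three bars, each reaching a total height of 1.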

How to plot the bar charts of precision, recall, and f-measure?

Hamza

May 17, 2022 05:01

I have used 4 machine learning models on a task and now I am struggling to plot their bar charts just like shown below in the image. I am printing classification report to get precision, recall etc. My code is shown: def Statistics(data): # Classification Report print("Classification Report is shown below") print(classification_report(data['actual labels'],data['predicted labels'])) # Confusion matrix print("Confusion matrix is shown below") cm=confusion_matrix(data['actual labels'],data['predicted labels']) plt.figure(figsize=(10,7)) sn.heatmap(cm, annot=True,cmap='Blues', fmt='g') plt.xlabel('Predicted') plt.ylabel('Truth') Statistics(data) How can I plot this type of chart …

Topic: plotly matplotlib plotting visualization python

Category: Data Science

Are there any methods of supervised learning that return a bitmap instead of a set of parameters?

Zubetto

May 16, 2022 22:05

For example, the SVM or ANN methods search for a surface that best separates the data points. This surface is returned in vector or parametric form. Are there methods that return a spatial bitmap, each voxel of which contains a numeric value defining a class for all points lying within that voxel? I would like to share some of the results of my attempts in this direction. Since I'm relatively new in machine learning …

Topic: machine-learning-model classification visualization machine-learning

Category: Data Science

Coloring labels using scatterplot3d in R

trolkura

May 16, 2022 00:07

I am trying to visualize data using R and scatterplot3d. I have loaded data and used: colors <- c("#999999", "#E69F00", "#56B4E9" ) scatterplot3d(output$X2,output$X6 , output$X7 , color=colors, pch="X9") X9 is the label column in my dataset. It contains 3 categories: A, B, C. According to the documentation: color : colors of points in the plot, optional if x is an appropriate structure. Will be ignored if highlight.3d = TRUE. pch: plotting "character", i.e. symbol to use. Yet I still get …

Topic: visualization r

Category: Data Science

Performing EDA on a dataset with missing features

user135735

May 15, 2022 05:32

I'm new to DS. I want to perform EDA on such dataset, where these are the missing features stats of my train and test sets: train: Test_0 0 Test_1 31 Test_2 0 Test_3 141 Test_4 0 Test_5 0 Test_6 0 Test_7 0 Test_8 1045 Test_9 0 Test_10 0 Test_11 0 Test_12 0 Test_13 0 Test_14 0 Test_15 2967 Class 0 dtype: int64 test: Test_0 0 Test_1 7 Test_2 0 Test_3 46 Test_4 0 Test_5 0 Test_6 0 Test_7 0 Test_8 …

Topic: exploratory-factor-analysis visualization data-cleaning

Category: Data Science

Visualization with many lines, colors, and markers

Robyc

May 13, 2022 22:02

I have a bunch of plots like the one shown below. The data is from measurements performed at different times and on different days. In the plot (which is a cumulative distribution function, if that matters), the colors differentiate data relevant to different days; the markers are used to further differentiate the data within each day. The problem is that the plot is very crowded and a bit ugly. Some markers can barely be seen. Question: Any idea how I can …

Topic: matplotlib python-3.x visualization

Category: Data Science
