How to plot using facet_wrap over multiple pages as a .pdf file in R

I am using ggplot to compare 114 unique studies for a particular variable I'm interested in. This is what I have used: ggplot(steps, aes(x=factor(edu))) + geom_bar(aes(y = (..count..), group = id_study)) + facet_wrap(~id_study) Whilst this works, all 114 studies are plotted on one page and the formatting is all squashed. How do I split this over several pages with a 4x4 grid of panels on each? Many thanks, S. Edit: As there are 114 unique studies, I have 5 pages in total 1) ggplot(steps, aes(x=factor(edu))) + …
Category: Data Science
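
One possible approach to the paging question above, as a minimal sketch rather than a definitive answer: split the study ids into chunks of 16 and print one 4x4 faceted plot per chunk into a single multi-page PDF. The `steps` data below is a made-up stand-in with the same column names as in the question, the output file name is an assumption, and the `geom_bar()` call is simplified.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's `steps` data:
# 114 studies with 20 rows each and an `edu` level per row.
set.seed(1)
steps <- data.frame(
  id_study = rep(sprintf("study_%03d", 1:114), each = 20),
  edu      = sample(1:3, 114 * 20, replace = TRUE)
)

per_page <- 16  # 4 x 4 panels per page
ids      <- sort(unique(steps$id_study))

# Consecutive chunks of at most 16 study ids: 114 ids give 8 chunks.
pages <- split(ids, ceiling(seq_along(ids) / per_page))

# Assumed output location; every print() below starts a new PDF page.
out_file <- file.path(tempdir(), "steps_by_study.pdf")
pdf(out_file, width = 11, height = 8.5)
for (page_ids in pages) {
  page_data <- steps[steps$id_study %in% page_ids, ]
  p <- ggplot(page_data, aes(x = factor(edu))) +
    geom_bar() +
    facet_wrap(~id_study, nrow = 4, ncol = 4) +
    labs(x = "edu", y = "count")
  print(p)  # an explicit print() is needed inside a loop
}
invisible(dev.off())
```

Fixing both `nrow` and `ncol` keeps the panel size the same on the last, partly filled page. The ggforce package also offers `facet_wrap_paginate()`, which may be a more direct route if an extra dependency is acceptable.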

Plot six variables

I would like to plot a landscape spanned by six variables. The numerical target variable is explained by five numerical variables. Ultimately, the goal is to get a visual impression of the optima and of the parameter landscape itself. Any advice on how to proceed? I would prefer R or Python but I am open to alternatives.
Category: Data Science

How to plot a table with multiple columns as a box plot

I am trying to draw a box plot with the trinucleotide on the x axis (so 64 trinucleotides on the x axis), showing the frequency of each trinucleotide in each of 6 samples, and then color code the plot according to the sample. This is a snippet of the table and the code I have so far, as well as the type of graph I want. library(tidyverse) library(readxl) marte <- read_xlsx("TrinucleotideFrequency06182021.xlsx") marte <- gather (marte, "xzl.mmu.C57.testis.wt.adult.40S_crosslink.rep1+rept1.RPF.trimmed.gz.x_rRNA.x_hairpin.mm10v1.unique.+jxn.bed13.40S.sense.hybrid.utr3.1up.5end.PNLDC1.rep1.bed6", "xzl.mmu.C57.testis.wt.adult.40S_crosslink.rep2+rept2.RPF.R1.trimmed.gz.x_rRNA.x_hairpin.mm10v1.unique.+jxn.bed13.40S.sense.hybrid.utr3.1up.5end.PNLDC1.rep1.bed6", "xzl.mmu.C57.testis.wt.adult.40S_crosslink.rep3+rept3.RPF.R1.trimmed.gz.x_rRNA.x_hairpin.mm10v1.unique.+jxn.bed13.40S.sense.hybrid.utr3.1up.5end.PNLDC1.rep1.bed6", "xzl.mmu.C57.testis.wt.adult.80S_crosslink.rep1+rept1.RPF.trimmed.gz.x_rRNA.x_hairpin.mm10v1.unique.+jxn.bed13.RPF.sense.hybrid.utr3.1up.5end.PNLDC1.rep1.bed6", "xzl.mmu.C57.testis.wt.adult.80S_crosslink.rep2+rept2.RPF.R1.trimmed.gz.x_rRNA.x_hairpin.mm10v1.unique.+jxn.bed13.RPF.sense.hybrid.utr3.1up.5end.PNLDC1.rep1.bed6", "xzl.mmu.C57.testis.wt.adult.80S_crosslink.rep3+rept3.RPF.R1.trimmed.gz.x_rRNA.x_hairpin.mm10v1.unique.+jxn.bed13.RPF.sense.hybrid.utr3.1up.5end.PNLDC1.rep1.bed6",key="gene", …
Topic: ggplot2 rstudio r
Category: Data Science

Using glue to include information regarding selected observation

I would like my ggplot to display the state I have selected, for better clarity, but it seems that glue only picks up the first observation rather than producing my desired output. library(tidyverse) library(glue) death_state=read_csv("https://raw.githubusercontent.com/MoH-Malaysia/covid19-public/main/epidemic/deaths_state.csv") death_state=death_state%>% select(date,state,deaths_new,deaths_new_dod)%>% filter(between(date,max(date)-months(6),max(date)))%>% rename(Reported=deaths_new,Actual=deaths_new_dod)%>% pivot_longer(c(Reported,Actual),names_to="Reported/Actual",values_to="Deaths")%>% rename_with(str_to_title)%>% mutate(State=ifelse(State %in% c("W.P. Kuala Lumpur","Selangor","W.P. Putrajaya"), 'Klang Valley', State)) %>% group_by(Date,State,`Reported/Actual`) %>% summarise(Deaths = sum(Deaths), .groups = 'drop') Up to this point, the data frame produced is as follows: # A tibble: 5,180 x 4 Date State `Reported/Actual` Deaths <date> …
Topic: ggplot2 r
Category: Data Science

making a contingency table with TRUE and FALSE values

I have already made the following contingency table; however, it should contain only TRUE or FALSE, not all of the values that currently show up in the table. How can I change that? My code is the following: library(tidyverse) library(haven) read_xpt("~/downloads/DEMO_J.XPT") -> demo17 demo17%>% select (subjectID= SEQN, Lebensalter=RIDAGEYR, Geschlecht=RIAGENDR, Ethnie = RIDRETH3, Einwohner=WTMEC2YR, Ratio=INDFMPIR)%>% mutate(Geschlecht=fct_recode(factor(Geschlecht), "Männlich"="1", "Weiblich"="2"))%>% mutate(Ethnie=fct_recode(factor (Ethnie), "Mexican American"="1", "Other Hispanic"="2", "NH White"="3", "NH Black"="4", "NH Asian"="6", "Other"="7")) -> D2 read_xpt("~/downloads/BMX_J.XPT") -> bmx17 bmx17%>% select (subjectID = SEQN, Körpergröße= BMXHT, Gewicht …
Category: Data Science

How to plot a PCA table

I'm studying the PCA method with the PCAmixdata package because I have a dataset with numerical and categorical variables. This is my example code in R: library(dplyr) library(PCAmixdata) data <- starwars db_quali <- as.data.frame(starwars[,4:6]) db_quanti <- as.data.frame(starwars[,2:3]) pca_table <- PCAmix(X.quanti = db_quanti, X.quali = db_quali, rename.level=TRUE, graph = TRUE) Gender <- factor(data$gender) par(xpd=TRUE,mar=rep(8,4)) plot(pca_table ,choice="ind",label=FALSE, posleg=xy.coords(2,-10), main="Observations", coloring.ind = Gender) I know that the function ggplot can be used only with a data.frame, and at the moment I have a list. Is …
Category: Data Science

How to change legend labels in line plot with ggplot2?

Context: I am trying to change the legend labels for the Indices variable which contains "Positive" and "Negative" in "d_posneg" data frame. Problem: However, my attempts have not yet worked. At present this is the code line that I am attempting to rename labels with in the graph below (line 6 of the ggplot): scale_fill_discrete(name = "Indices", labels = c("Positive Emotion", "Negative Emotion")) + Question: Does anyone know how to solve this? See attached file for plot and code below …
Topic: ggplot2 r
Category: Data Science
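
A hedged sketch for the legend question above. The excerpt does not show how the lines are drawn, so this assumes they are mapped to the colour aesthetic. In that case the legend belongs to the colour scale and a `scale_fill_*()` call has nothing to relabel. The data frame below is a made-up stand-in for `d_posneg`, and every column name other than `Indices` is an assumption.

```r
library(ggplot2)

# Hypothetical stand-in for `d_posneg`; `time` and `score` are assumed names.
d_posneg <- data.frame(
  time    = rep(1:5, times = 2),
  score   = c(2, 3, 5, 4, 6, 5, 4, 3, 3, 2),
  Indices = rep(c("Positive", "Negative"), each = 5)
)

p <- ggplot(d_posneg, aes(x = time, y = score, colour = Indices)) +
  geom_line() +
  # Relabel the scale that actually draws the legend (colour, not fill).
  # Naming the labels ties each one to its level regardless of level order.
  scale_colour_discrete(
    name   = "Indices",
    labels = c(Positive = "Positive Emotion", Negative = "Negative Emotion")
  )
```

An unnamed `labels` vector is matched to the levels in their existing order, which is alphabetical for a plain character column. Naming the labels avoids silently swapping "Positive" and "Negative".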

Graph legend for plot in Base R for class differentiated data gives incorrect representation of actual category

I am new to R. While working on my university assignments, I found that legends for base R plots do not show correct information, so I switched to ggplot2 wherever legends were needed. I observed that although base R color-codes the data (for example, differentiated by CLASS, as our assignment required), the legend failed to show the right CLASS for each color, i.e. in the graph, if cyan actually represents CLASS A5 (given the position of the points), the legend …
Category: Data Science

Fit non-linear customised model

I have a data.frame that has two columns, $x=mz$ and $y=res$. There are about 2 million rows in the data frame. When I plot the graph I get the result below. What I'd like to do is find a way to define two quadratics to fit, to get the two curves badly sketched in orange. It would be nice to be able to do it in ggplot. I have tried to fit a stat_smooth but I haven't been able to come close …
Category: Data Science

ggplot2 for Cluster analysis (non-readable row names)

I have run a cluster analysis and ended up with a dendrogram; however, the row names are not readable (marked with a red rectangle). May I ask if there is a way to adjust this? library("reshape2") library("purrr") library("dplyr") library("dendextend") dendro <- as.dendrogram(aggl.clust.c) dendro.col <- dendro %>% set("branches_k_color", k = 5, value = c("darkslategray", "darkslategray4", "darkslategray3", "gold", "gold2")) %>% set("branches_lwd", 0.6) %>% set("labels_colors", value = c("darkslategray")) %>% set("labels_cex", 0.5) ggd1 <- as.ggdend(dendro.col) ggplot(ggd1, theme = theme_minimal()) + labs(x = "Num. observations", y = "Height", …
Category: Data Science

Plot two categorical variables against two numeric variable in ggplot

In my dataset, I have two numeric revenue features, one for each month, and two categorical features, one for region and the other for value segment. What I want to do is compare these two revenues column by column for each region and facet wrap by value segment. Is there any way to do that in ggplot2? Sample data: The image in my mind:
Topic: ggplot2 r
Category: Data Science
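
A minimal sketch of one way to approach the revenue comparison above: reshape the two revenue columns into long form so that the month becomes a variable, then dodge the bars by month and facet by value segment. The sample data and all column names are assumptions, since the original sample data is not shown.

```r
library(ggplot2)

# Hypothetical sample data; every column name here is an assumption.
sales <- data.frame(
  region      = rep(c("North", "South", "East"), times = 2),
  segment     = rep(c("High", "Low"), each = 3),
  revenue_jan = c(120, 90, 60, 40, 35, 20),
  revenue_feb = c(130, 85, 70, 45, 30, 25)
)

# Wide to long: one row per region, segment and month.
id_cols <- sales[c("region", "segment")]
sales_long <- rbind(
  data.frame(id_cols, month = "Jan", revenue = sales$revenue_jan),
  data.frame(id_cols, month = "Feb", revenue = sales$revenue_feb)
)
sales_long$month <- factor(sales_long$month, levels = c("Jan", "Feb"))

# Side-by-side bars per region, one panel per value segment.
p <- ggplot(sales_long, aes(x = region, y = revenue, fill = month)) +
  geom_col(position = "dodge") +
  facet_wrap(~segment)
```

The reshaping is done in base R here to keep the example dependency-free. `tidyr::pivot_longer()` does the same job more compactly when the tidyverse is already loaded.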

Error in hclustfun(distMatrixR, method = method) when I am making a heat map

I am trying to create a heat map with this data in pastebin. However, I am getting the error: Error in hclustfun(distMatrixR, method = method) : NA/NaN/Inf in foreign function call (arg 10) In addition: Warning message: In cor(t(x), use = "pa") : the standard deviation is zero which I am failing to troubleshoot. I have attached the code I currently have: library("gplots") library("heatmap3") library("RColorBrewer") library("readxl") colfunc <- colorRampPalette(c("mistyrose", "red")) Data <- read_excel("TE_polymophism.xlsx") # N=100 # Data <- Data[sample(nrow(Data), …
Topic: ggplot2 r
Category: Data Science
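
A hedged sketch of a likely cause of the error above. The warning says a standard deviation is zero inside `cor(t(x), use = "pa")`, which suggests that at least one row is constant. Its correlations are then NA, and `hclust()` refuses the resulting distance matrix. The toy matrix and row names below are made up, and the sketch assumes it is acceptable to drop constant rows before drawing the heat map.

```r
# Made-up numeric matrix; row_b is constant, so its standard deviation is zero.
expr <- rbind(
  row_a = c(1, 2, 3, 4),
  row_b = c(5, 5, 5, 5),
  row_c = c(4, 3, 1, 2),
  row_d = c(2, 4, 1, 3)
)

# Keep only rows that actually vary.
row_sd  <- apply(expr, 1, sd, na.rm = TRUE)
expr_ok <- expr[!is.na(row_sd) & row_sd > 0, , drop = FALSE]

# The same kind of correlation-based distance as in the warning message,
# now free of NA values, so hierarchical clustering succeeds.
row_dist  <- as.dist(1 - cor(t(expr_ok), use = "pairwise.complete.obs"))
row_clust <- hclust(row_dist)
```

If columns are clustered as well, the same check would need to be applied to the columns. The filtered matrix can then be passed to the heat map function in place of the original.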

How to plot multiple columns with ggplot in R?

I have a data frame with different categorical and numerical columns with the following schema: Id | num_col_1 | num_col_2 | num_col_3 | cat_col_1 | cat_col_2 Now I want to draw a combined plot with ggplot where I (box)plot certain numerical columns (num_col_2, num_col_2), with boxplot groups according to the cat_col_1 factor levels for each numerical column. Along the y axis is the spread of the respective selected columns (not other columns). So far I couldn't solve this combined task. Thank you.
Category: Data Science

Using ggplot2 to create a bar chart

So I'm trying to create a simple bar chart of Survive vs Not Survive for the common Titanic data set in R. I keep getting just the number of No's and Yes's, and not the frequencies or counts associated with each no and yes. This is obviously not what is wanted. I am trying to just practice with ggplot2 and make some graphs. What am I doing wrong here? The Bad Barchart: #install.packages("tidyverse") #install.packages("titanic") library(tidyverse) library(titanic) view(Titanic) titanic <- as.data.frame(Titanic) …
Category: Data Science
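
A minimal sketch for the bar chart question above, assuming the plot is built from `as.data.frame(Titanic)` as in the excerpt. That data frame has one row per combination of Class, Sex, Age and Survived, with the passenger count stored in `Freq`. A plain `geom_bar()` therefore counts rows (combinations) rather than passengers, and weighting by `Freq` gives the passenger totals.

```r
library(ggplot2)

# Built-in Titanic table as a data frame: Class, Sex, Age, Survived, Freq.
titanic <- as.data.frame(Titanic)

# Sum Freq within each Survived level instead of counting rows.
p <- ggplot(titanic, aes(x = Survived, weight = Freq)) +
  geom_bar() +
  labs(y = "Passengers")

# The same totals computed directly, for checking the bar heights.
totals <- aggregate(Freq ~ Survived, data = titanic, FUN = sum)
```

An equivalent alternative is `geom_col()` on the aggregated `totals` data frame with `aes(x = Survived, y = Freq)`.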

How can I plot a line for time series data with categorical intervals in R

I am working with single time-series measurements that I want to plot for the time window of about 1 week. This is the data I am working with. This is my R script: library(tidyverse) library(ggplot2) filesource <- "C:/ ... /testData.csv" df <-read.csv(filesource, header = TRUE) ggplot() + geom_line(data = df, aes(x = date, y = value, group = 1), color = "red") + ggtitle("Some Measure over Time") + xlab("Time") + ylab("Some Measure in %") This produces this plot. What I …
Category: Data Science

How to add RMSE value on a plot with ggplot

I added the r2 value and the formula of the regression function, but I also want the RMSE value on my plot. Maybe I need to add something, but I could not find a proper answer to this question either here or on Google... ggplot(data = AGB.rf$pred) + geom_point(mapping = aes(x = pred, y = obs, color = pred, shape=1))+ geom_smooth(mapping = aes(x = pred, y = obs), method="lm", se = FALSE)+ stat_cor(aes(x = pred, y = obs, label = ..rr.label..),label.y = 3000)+ …
Topic: rmse ggplot2 r
Category: Data Science
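
One way the RMSE label above could be added, as a sketch under stated assumptions: compute the RMSE from the observed and predicted columns and place it with `annotate()`. The data frame below is simulated and merely reuses the `pred` and `obs` column names from the question. The label position and formatting are arbitrary choices.

```r
library(ggplot2)

# Simulated stand-in for the prediction table; only the column names
# `pred` and `obs` come from the question.
set.seed(42)
pred_df <- data.frame(obs = runif(50, min = 0, max = 3000))
pred_df$pred <- pred_df$obs + rnorm(50, sd = 200)

# Root mean squared error between observed and predicted values.
rmse <- sqrt(mean((pred_df$obs - pred_df$pred)^2))

p <- ggplot(pred_df, aes(x = pred, y = obs)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Pin the label to the top-left corner of the panel.
  annotate(
    "text", x = -Inf, y = Inf, hjust = -0.1, vjust = 1.5,
    label = sprintf("RMSE = %.1f", rmse)
  )
```

Using `-Inf` and `Inf` for the position keeps the label in the corner regardless of the data range. Explicit coordinates work just as well if a specific spot is preferred.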

Overlay Bar Plot

I am trying to turn the first overlay bar plot into the second which allows for comparison of 2 variables instead of just one. Included my current code below which creates the first chart comparing one variable to its mean. Thanks in advance for taking a look! output$p1 <- renderPlot({ ggplot(rating()) + geom_col(aes(x = factor(RATING,levels =c('Cash','AAA','AA+','AA','AA-')), y = pct_rating), fill = "blue", width = 0.2) + geom_col(aes(x = RATING, y = mean_pct_rating), alpha = 0.3, fill = "red", width = …
Category: Data Science

How to get a (descriptive) overview of a large database?

I'm facing a data frame with ~20k observations and 151 variables across 2078 subjects. At first I am primarily interested in what the data looks like in relation to a single parameter. But I cannot plot 2078 subjects on the x-axis and make a bar plot out of it or so. What would be useful methods for such a situation? I prefer some visualizations but I think they won't be applicable. I'm afraid even non-visualization methods are not really …
Category: Data Science
