I am using ggplot to compare 114 unique studies on a particular variable I'm interested in. This is what I have used: ggplot(steps, aes(x=factor(edu))) + geom_bar(aes(y = (..count..), group = id_study,)) + facet_wrap(~id_study,) Whilst this works, all 114 studies are plotted on one page and the formatting is all squashed. How do I split this across several pages, each with a 4x4 grid of panels? Many thanks, S. Edit: As there are 114 unique studies, I have 5 pages in total 1) ggplot(steps, aes(x=factor(edu))) + …
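One possible way to paginate the facets with ggplot2 alone, as a minimal sketch: split the study IDs into chunks of 16 and build one 4x4 faceted plot per chunk. The `steps` data frame below is simulated purely for illustration, and `per_page`, `pages`, and `plots` are hypothetical names, not part of the original code.

```r
library(ggplot2)

# Simulated stand-in for the real `steps` data (assumption: 114 studies, an `edu` column)
set.seed(42)
steps <- data.frame(
  id_study = rep(1:114, each = 20),
  edu = sample(1:3, 114 * 20, replace = TRUE)
)

per_page <- 16  # 4 x 4 panels per page

# Split the sorted study IDs into consecutive chunks of at most 16
ids <- sort(unique(steps$id_study))
pages <- split(ids, ceiling(seq_along(ids) / per_page))

# Build one faceted plot per chunk
plots <- lapply(pages, function(page_ids) {
  ggplot(subset(steps, id_study %in% page_ids), aes(x = factor(edu))) +
    geom_bar() +
    facet_wrap(~id_study, nrow = 4, ncol = 4)
})
```

Each element of `plots` can then be printed or saved on its own, for example by opening a `pdf()` device, printing every plot in a loop, and closing it with `dev.off()` to get one page per chunk.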
I would like to plot a landscape spanned by six variables. The numerical target variable is explained by five numerical variables. Ultimately, the goal is to get a visual impression of the optima and of the parameter landscape itself. Any advice on how to proceed? I would prefer R or Python but I am open to alternatives.
I am trying to draw a box plot with the trinucleotide on the x axis (so 64 trinucleotides on the x axis) and the frequency of each trinucleotide in each of 6 samples on the y axis, then color-code the plot by sample. This is a snippet of the table and the code I have so far, as well as the type of graph I want. library(tidyverse) library(readxl) marte <- read_xlsx("TrinucleotideFrequency06182021.xlsx") marte <- gather (marte, "xzl.mmu.C57.testis.wt.adult.40S_crosslink.rep1+rept1.RPF.trimmed.gz.x_rRNA.x_hairpin.mm10v1.unique.+jxn.bed13.40S.sense.hybrid.utr3.1up.5end.PNLDC1.rep1.bed6", "xzl.mmu.C57.testis.wt.adult.40S_crosslink.rep2+rept2.RPF.R1.trimmed.gz.x_rRNA.x_hairpin.mm10v1.unique.+jxn.bed13.40S.sense.hybrid.utr3.1up.5end.PNLDC1.rep1.bed6", "xzl.mmu.C57.testis.wt.adult.40S_crosslink.rep3+rept3.RPF.R1.trimmed.gz.x_rRNA.x_hairpin.mm10v1.unique.+jxn.bed13.40S.sense.hybrid.utr3.1up.5end.PNLDC1.rep1.bed6", "xzl.mmu.C57.testis.wt.adult.80S_crosslink.rep1+rept1.RPF.trimmed.gz.x_rRNA.x_hairpin.mm10v1.unique.+jxn.bed13.RPF.sense.hybrid.utr3.1up.5end.PNLDC1.rep1.bed6", "xzl.mmu.C57.testis.wt.adult.80S_crosslink.rep2+rept2.RPF.R1.trimmed.gz.x_rRNA.x_hairpin.mm10v1.unique.+jxn.bed13.RPF.sense.hybrid.utr3.1up.5end.PNLDC1.rep1.bed6", "xzl.mmu.C57.testis.wt.adult.80S_crosslink.rep3+rept3.RPF.R1.trimmed.gz.x_rRNA.x_hairpin.mm10v1.unique.+jxn.bed13.RPF.sense.hybrid.utr3.1up.5end.PNLDC1.rep1.bed6",key="gene", …
I would like my ggplot to display the state I have selected, for better clarity, but glue seems to pick up only the first observation rather than producing my desired output. library(tidyverse) library(glue) death_state=read_csv("https://raw.githubusercontent.com/MoH-Malaysia/covid19-public/main/epidemic/deaths_state.csv") death_state=death_state%>% select(date,state,deaths_new,deaths_new_dod)%>% filter(between(date,max(date)-months(6),max(date)))%>% rename(Reported=deaths_new,Actual=deaths_new_dod)%>% pivot_longer(c(Reported,Actual),names_to="Reported/Actual",values_to="Deaths")%>% rename_with(str_to_title)%>% mutate(State=ifelse(State %in% c("W.P. Kuala Lumpur","Selangor","W.P. Putrajaya"), 'Klang Valley', State)) %>% group_by(Date,State,`Reported/Actual`) %>% summarise(Deaths = sum(Deaths), .groups = 'drop') Up until this point, the data frame produced is as follows: # A tibble: 5,180 x 4 Date State `Reported/Actual` Deaths <date> …
I have already made the following contingency table, but it should contain only TRUE or FALSE rather than showing every value. How can I change that? My code is the following: library(tidyverse) library(haven) read_xpt("~/downloads/DEMO_J.XPT") -> demo17 demo17%>% select (subjectID= SEQN, Lebensalter=RIDAGEYR, Geschlecht=RIAGENDR, Ethnie = RIDRETH3, Einwohner=WTMEC2YR, Ratio=INDFMPIR)%>% mutate(Geschlecht=fct_recode(factor(Geschlecht), "Männlich"="1", "Weiblich"="2"))%>% mutate(Ethnie=fct_recode(factor (Ethnie), "Mexican American"="1", "Other Hispanic"="2", "NH White"="3", "NH Black"="4", "NH Asian"="6", "Other"="7")) -> D2 read_xpt("~/downloads/BMX_J.XPT") -> bmx17 bmx17%>% select (subjectID = SEQN, Körpergröße= BMXHT, Gewicht …
I'm studying the PCA method with the package PCAmixdata because I have a dataset with numerical and categorical variables. This is my example code in R: library(dplyr) library(PCAmixdata) data <- starwars db_quali <- as.data.frame(starwars[,4:6]) db_quanti <- as.data.frame(starwars[,2:3]) pca_table <- PCAmix(X.quanti = db_quanti, X.quali = db_quali, rename.level=TRUE, graph = TRUE) Gender <- factor(data$gender) par(xpd=TRUE,mar=rep(8,4)) plot(pca_table ,choice="ind",label=FALSE, posleg=xy.coords(2,-10), main="Observations", coloring.ind = Gender) I know that the function ggplot can be used only with a data.frame, and at the moment I have a list. Is …
Context: I am trying to change the legend labels for the Indices variable which contains "Positive" and "Negative" in "d_posneg" data frame. Problem: However, my attempts have not yet worked. At present this is the code line that I am attempting to rename labels with in the graph below (line 6 of the ggplot): scale_fill_discrete(name = "Indices", labels = c("Positive Emotion", "Negative Emotion")) + Question: Does anyone know how to solve this? See attached file for plot and code below …
I am new to R. While working on my university assignments, I found that legends for base R plots do not show the correct information, so I switched to ggplot2 wherever legends were needed. I observed that although base R color-codes the data (for example by CLASS, as our assignment required), the legend fails to show the right CLASS for each color, i.e. if cyan in the graph actually represents CLASS A5 (given the position of the points), the legend …
I have a data.frame that has two columns, $x=mz$ and $y=res$. There are about 2 million rows in the data frame. When I plot the graph I get the result below. What I'd like to do is find a way to define two quadratics to fit, to get the two curves badly sketched in orange. It would be nice to be able to do it in ggplot. I have tried to fit a stat_smooth but I haven't been able to come close …
I have run a cluster analysis and ended up with a dendrogram; however, the row names are not readable (I marked them with a red rectangle). Is there a way to adjust this? library("reshape2") library("purrr") library("dplyr") library("dendextend") dendro <- as.dendrogram(aggl.clust.c) dendro.col <- dendro %>% set("branches_k_color", k = 5, value = c("darkslategray", "darkslategray4", "darkslategray3", "gold", "gold2")) %>% set("branches_lwd", 0.6) %>% set("labels_colors", value = c("darkslategray")) %>% set("labels_cex", 0.5) ggd1 <- as.ggdend(dendro.col) ggplot(ggd1, theme = theme_minimal()) + labs(x = "Num. observations", y = "Height", …
In my dataset, I have two numeric revenue features, one for each month, and two categorical features, one for region and the other for value segment. What I want to do is compare these two revenues column by column for each region, with a facet wrap by value segment. Is there any way to do that in ggplot2?
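A minimal sketch of one way this could be done, assuming the two revenue columns can be reshaped into long format so that the month becomes a fill aesthetic. The data frame `revenue` and all of its column names are hypothetical, since the original sample data is not available.

```r
library(ggplot2)
library(tidyr)

# Hypothetical sample data: one row per region and value segment
revenue <- data.frame(
  region = rep(c("North", "South", "East"), times = 2),
  value_segment = rep(c("High", "Low"), each = 3),
  revenue_jan = c(120, 90, 75, 40, 55, 30),
  revenue_feb = c(135, 80, 82, 45, 50, 38)
)

# Reshape so both monthly revenue columns share a single value column
revenue_long <- pivot_longer(
  revenue,
  cols = c(revenue_jan, revenue_feb),
  names_to = "month",
  values_to = "revenue"
)

# Side-by-side bars per region, one panel per value segment
p_revenue <- ggplot(revenue_long, aes(x = region, y = revenue, fill = month)) +
  geom_col(position = "dodge") +
  facet_wrap(~value_segment)
```

If the real data has several rows per region and segment, the values would need to be summarised first (for example with `aggregate()`), because dodged columns with duplicate keys are drawn on top of each other.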
I have the following plot. Is there any way in ggplot to display just the numbers 1 to 10 on the axis instead of all of them? The numbers from 10 onwards are not so important, but I need to display the ones before. Thank you
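As a sketch, assuming the axis in question is a continuous x axis, the labelled ticks can be restricted with the `breaks` argument of `scale_x_continuous()`. The data frame `df_ticks` is invented for illustration, since the original plot is not shown.

```r
library(ggplot2)

# Hypothetical data with x values running well past 10
df_ticks <- data.frame(x = 1:30, y = (1:30)^2)

# Only the values 1 to 10 get tick marks and labels; the data is not filtered
x_scale <- scale_x_continuous(breaks = 1:10)

p_ticks <- ggplot(df_ticks, aes(x = x, y = y)) +
  geom_point() +
  x_scale
```

For a discrete (factor) axis, the analogous setting would be the `breaks` argument of `scale_x_discrete()`, given the level names to keep.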
I am trying to create a heat map with this data in pastebin. However, I am getting the error: Error in hclustfun(distMatrixR, method = method) : NA/NaN/Inf in foreign function call (arg 10) In addition: Warning message: In cor(t(x), use = "pa") : the standard deviation is zero which I am failing to troubleshoot. I have attached the code I currently have: library("gplots") library("heatmap3") library("RColorBrewer") library("readxl") colfunc <- colorRampPalette(c("mistyrose", "red")) Data <- read_excel("TE_polymophism.xlsx") # N=100 # Data <- Data[sample(nrow(Data), …
I have a data frame with different categorical and numerical columns with the following schema: Id | num_col_1 | num_col_2 | num_col_3 | cat_col_1 | cat_col_2 Now I want to draw a combined plot with ggplot in which I (box)plot certain numerical columns (num_col_2, num_col_2), with the boxplots for each numerical column grouped by the cat_col_1 factor levels. The y axis shows the spread of the selected columns only (no other column). So far I couldn't solve this combined task. Thank you.
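One possible approach, sketched under assumptions: reshape the selected numeric columns to long format, put the column name on the x axis, and fill by `cat_col_1`. The data below is simulated to match the schema, and the choice of `num_col_2` and `num_col_3` as the selected columns is an assumption.

```r
library(ggplot2)
library(tidyr)

# Simulated data following the schema in the question
set.seed(1)
df_box <- data.frame(
  Id = 1:60,
  num_col_1 = rnorm(60),
  num_col_2 = rnorm(60, mean = 5),
  num_col_3 = rnorm(60, mean = 10),
  cat_col_1 = factor(rep(c("a", "b", "c"), times = 20)),
  cat_col_2 = factor(rep(c("x", "y"), times = 30))
)

# Keep only the selected numeric columns, stacked into one value column
df_long <- pivot_longer(
  df_box,
  cols = c(num_col_2, num_col_3),
  names_to = "variable",
  values_to = "value"
)

# One group of boxplots per numeric column, split by cat_col_1 levels
p_box <- ggplot(df_long, aes(x = variable, y = value, fill = cat_col_1)) +
  geom_boxplot()
```

If the selected columns are on very different scales, `facet_wrap(~variable, scales = "free_y")` with `x = cat_col_1` is an alternative layout.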
So I'm trying to create a simple bar chart of Survive vs Not Survive for the common Titanic data set in R. I keep getting just the number of No's and Yes's, and not the frequencies or counts associated with each no and yes. This is obviously not what is wanted. I am trying to just practice with ggplot2 and make some graphs. What am I doing wrong here? The Bad Barchart: #install.packages("tidyverse") #install.packages("titanic") library(tidyverse) library(titanic) view(Titanic) titanic <- as.data.frame(Titanic) …
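A likely cause, sketched here under the assumption that the plot uses `geom_bar()` on `as.data.frame(Titanic)`: that data frame has one row per combination of categories with the count stored in `Freq`, so `geom_bar()` counts rows rather than passengers. Mapping `Freq` to `y` with `geom_col()` is one way to plot the actual counts; the names `titanic_df`, `p_titanic`, and `survival_totals` are illustrative.

```r
library(ggplot2)

# Titanic ships with base R (datasets) as a 4-way contingency table;
# as.data.frame() yields one row per cell with its count in `Freq`
titanic_df <- as.data.frame(Titanic)

# Map the pre-computed counts to y; geom_col() stacks them per Survived level
p_titanic <- ggplot(titanic_df, aes(x = Survived, y = Freq)) +
  geom_col()

# The same totals as a table, useful for checking the bar heights
survival_totals <- aggregate(Freq ~ Survived, data = titanic_df, FUN = sum)
```

An equivalent alternative is `geom_bar(aes(weight = Freq))`, which keeps the counting stat but weights each row by its frequency.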
I am working with single time-series measurements that I want to plot for the time window of about 1 week. This is the data I am working with. This is my R script: library(tidyverse) library(ggplot2) filesource <- "C:/ ... /testData.csv" df <-read.csv(filesource, header = TRUE) ggplot() + geom_line(data = df, aes(x = date, y = value, group = 1), color = "red") + ggtitle("Some Measure over Time") + xlab("Time") + ylab("Some Measure in %") This produces this plot. What I …
I added the r2 value and the formula of the regression function, but I also want the RMSE value on my plot. Maybe I need to add something, but I could not find a proper answer to this question either here or on Google... ggplot(data = AGB.rf$pred) + geom_point(mapping = aes(x = pred, y = obs, color = pred, shape=1))+ geom_smooth(mapping = aes(x = pred, y = obs), method="lm", se = FALSE)+ stat_cor(aes(x = pred, y = obs, label = ..rr.label..),label.y = 3000)+ …
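One way this could be approached, as a sketch: compute the RMSE directly from the observed and predicted columns and place it with `annotate()`. The data frame `pred_df` is a simulated stand-in for `AGB.rf$pred` (assumed to have `pred` and `obs` columns), and the label position is arbitrary.

```r
library(ggplot2)

# Simulated stand-in for AGB.rf$pred: a data frame with `pred` and `obs`
set.seed(123)
pred_df <- data.frame(pred = runif(50, min = 500, max = 3000))
pred_df$obs <- pred_df$pred + rnorm(50, sd = 200)

# Root mean squared error between observed and predicted values
rmse <- sqrt(mean((pred_df$obs - pred_df$pred)^2))

# Scatter plot with a linear fit and the RMSE printed in the top-left corner
p_rmse <- ggplot(pred_df, aes(x = pred, y = obs)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  annotate(
    "text",
    x = min(pred_df$pred), y = max(pred_df$obs),
    hjust = 0, vjust = 1,
    label = sprintf("RMSE = %.1f", rmse)
  )
```

Note that this is the RMSE of the predictions against the observations, not the residual error of the fitted `lm` line; which of the two is wanted depends on the analysis.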
I am trying to turn the first overlay bar plot into the second, which allows for comparison of 2 variables instead of just one. I have included my current code below, which creates the first chart comparing one variable to its mean. Thanks in advance for taking a look! output$p1 <- renderPlot({ ggplot(rating()) + geom_col(aes(x = factor(RATING,levels =c('Cash','AAA','AA+','AA','AA-')), y = pct_rating), fill = "blue", width = 0.2) + geom_col(aes(x = RATING, y = mean_pct_rating), alpha = 0.3, fill = "red", width = …
I have tried generating higher-quality data visualization plots from R. I tried increasing the frame dimensions, but the output still maxes out at about 350 kB. How do I generate higher-quality images from R?
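For ggplot2 plots, one common route is `ggsave()` with an explicit size and `dpi`, sketched below. The plot, the file name, and the output location (a temporary directory) are illustrative choices, not taken from the question.

```r
library(ggplot2)

# Any ggplot object works here; mtcars is a built-in example data set
p_export <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Write to a temporary directory so the sketch has no side effects elsewhere
out_file <- file.path(tempdir(), "plot_300dpi.png")

# Physical size in inches times dpi gives the pixel dimensions (here 2400 x 1800)
ggsave(out_file, plot = p_export, width = 8, height = 6, units = "in", dpi = 300)
```

For base R graphics, the equivalent is opening a device such as `png()` with its `width`, `height`, `units`, and `res` arguments before plotting, then calling `dev.off()`. Vector formats such as PDF or SVG avoid the resolution question altogether.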
I'm facing a data frame with ~20k observations and 151 variables across 2078 subjects. At first I am primarily interested in what the data looks like with respect to a single parameter. But I cannot put 2078 subjects on the x-axis and make a bar plot or similar out of it. What would be useful methods for such a situation? I would prefer visualizations, but I think they won't be applicable. I'm afraid even non-visualization methods are not really …