ggplot2 - R running average for non-time data


This is the plot I have now.

It is generated by this code:

ggplot(data1, aes(x=pos,y=diff,colour=gt)) +
  geom_point() +
  facet_grid(~ chrom,scales="free_x",space="free_x") +
  theme(strip.text.x = element_text(size=40),
        strip.background = element_rect(color='lightblue',fill='lightblue'),
        legend.position="top",
        legend.title = element_text(size=40,colour="lightblue"),
        legend.text = element_text(size=40),
        legend.key.size = unit(2.5, "cm")) +
  guides(fill = guide_legend(title.position="top",
                             title = "legend:gt='ref'+'alt'"),
         shape = guide_legend(override.aes=list(size=10))) +
  scale_y_log10(breaks=trans_breaks("log10", function(x) 10^x, n=10)) +
  scale_x_continuous(breaks = pretty_breaks(n=3)) +
  geom_line(stat = "hline",
            yintercept = "mean",
            size = 1)

The last line, geom_line, draws a mean line for each panel.

But I want a more specific running average inside each panel.

I.e., if panel 1 ('chr01') has an x-axis range of 0 to 100,000,000, I want a mean value for each 1,000,000 range:

mean1 = mean(x=0 to x=1,000,000)

mean2 = mean(x=1,000,001 to x=2,000,000)

and so on.

One way to provide a running mean is geom_smooth() using the loess local regression method. To demonstrate the proposed solution, I created a fake genomic dataset using R functions. You can adjust the span parameter of geom_smooth to make the running mean smoother (closer to 1.0) or rougher (closer to 1/number of data points).

library(ggplot2)

# Create example data: three chromosomes of different lengths.
set.seed(27182)

y1 = rnorm(10000) +
     c(rep(0, 1000), dnorm(seq(-2, 5, length.out=8000)) * 3, rep(0, 1000))
y2 = c(rnorm(2000), rnorm(1000, mean=1.5), rnorm(1000, mean=-1, sd=2),
       rnorm(2000, sd=2))
y3 = rnorm(4000)
pos = c(sort(runif(10000, min=0, max=1e8)),
        sort(runif(6000,  min=0, max=6e7)),
        sort(runif(4000,  min=0, max=4e7)))
chr = rep(c("chr01", "chr02", "chr03"), c(10000, 6000, 4000))

data1 = data.frame(chrom=chr, pos=pos, diff=c(y1, y2, y3))

# Plot. degree=0 (a locally constant fit, i.e. a weighted running mean)
# is passed to loess() through method.args, which ggplot2 >= 2.0 expects.
p = ggplot(data1, aes(x=pos, y=diff)) +
    geom_point(alpha=0.1, size=1.5) +
    geom_smooth(colour="darkgoldenrod1", size=1.5, method="loess",
                method.args=list(degree=0), span=0.1, se=FALSE) +
    scale_x_continuous(breaks=seq(1e7, 3e8, 1e7),
                       labels=paste(seq(10, 300, 10)), expand=c(0, 0)) +
    xlab("position, megabases") +
    theme(axis.text.x=element_text(size=8)) +
    facet_grid(. ~ chrom, scales="free", space="free")

ggsave(filename="plot_1.png", plot=p, width=10, height=5, dpi=150)
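If you want the exact per-window means described in the question rather than a loess curve, a rough sketch in base R could look like the following. This is not part of the original answer: the helper name binned_means, its bin_width argument, and the toy data frame are all hypothetical, and the windows here are half-open ([0, 1e6), [1e6, 2e6), ...), which differs slightly from the 1,000,001 to 2,000,000 convention in the question.

```r
# Hypothetical helper (an assumption, not from the original post):
# mean of `diff` in fixed-width windows of `pos`, per chromosome.
binned_means <- function(df, bin_width = 1e6) {
  # Window index: 0 for [0, bin_width), 1 for [bin_width, 2 * bin_width), ...
  df$bin <- floor(df$pos / bin_width)
  # One row per (chrom, bin) pair holding the mean of diff in that window.
  out <- aggregate(diff ~ chrom + bin, data = df, FUN = mean)
  # Window boundaries, handy for drawing each mean as a horizontal segment.
  out$start <- out$bin * bin_width
  out$end <- out$start + bin_width
  out[order(out$chrom, out$bin), ]
}

# Tiny made-up data set, only to show the shape of the result.
toy <- data.frame(
  chrom = c("chr01", "chr01", "chr01", "chr02"),
  pos   = c(100, 900000, 1500000, 200),
  diff  = c(1, 3, 10, 5)
)
bm <- binned_means(toy)
print(bm)

# With ggplot2 loaded and a faceted plot `p` built from the same data,
# the window means could be overlaid with something like:
#   p + geom_segment(data = binned_means(data1),
#                    aes(x = start, xend = end, y = diff, yend = diff),
#                    colour = "red")
```

Because the result keeps the chrom column, facet_grid should place each segment in the matching panel. Windows that contain no points simply produce no row, so they leave a gap instead of a mean.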


