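Today's R package is ggpubr, a visualization package built on ggplot2. It is powerful: a single function call can produce a publication-ready figure. ggpubr covers many plot types, including density plots, histograms, bar plots, pie charts, lollipop charts, Cleveland dot plots, box plots, violin plots, strip charts, dot plots, scatter plots, line plots, and error plots. Looking forward to learning it? There is a lot to cover, so we will go through the package in three parts. Stay tuned!
Here is what today's part covers: density plots, histograms, bar plots, pie charts, lollipop charts, and Cleveland dot plots.
First, install the ggpubr package: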
# Install from CRAN:
install.packages("ggpubr")
library(ggpubr)
Density plot
# First, build the dataset
set.seed(1234)
df1 <- data.frame(sex = factor(rep(c("F", "M"), each = 200)),
                  weight = c(rnorm(200, 55), rnorm(200, 58)))
head(df1)
#   sex   weight
# 1   F 53.79293
# 2   F 55.27743
# 3   F 56.08444
# 4   F 52.65430
# 5   F 55.42912
# 6   F 55.50606
tail(df1)
#     sex   weight
# 395   M 58.52875
# 396   M 58.78939
# 397   M 58.45710
# 398   M 58.53883
# 399   M 58.01464
# 400   M 57.08351
# Basic style with a mean line and a rug. The density plot shows the distribution of weight
# (both sexes pooled in this basic version): weight on the x-axis, the automatically computed density on the y-axis
ggdensity(df1, x = "weight", fill = "lightgray", add = "mean", rug = TRUE)
# Set line color and fill by group
ggdensity(df1, x = "weight", add = "mean", rug = TRUE, color = "sex", fill = "sex", palette = c("#00AFBB", "#E7B800"))
# Use a predefined palette ("npg") instead of custom colors
ggdensity(df1, x = "weight", add = "mean", rug = TRUE, color = "sex", fill = "sex", palette = "npg")
# Restrict the x-axis range (xlim() drops observations outside the range, so ggplot2 may warn about removed rows)
ggdensity(df1, x = "weight", add = "mean", rug = TRUE, color = "sex", fill = "sex", palette = "npg") + xlim(53, 60)
# Facet by group and change the outline line type
ggdensity(df1, x = "weight", facet.by = "sex", linetype = "dashed", add = "mean", rug = TRUE, color = "sex", fill = "sex", palette = c("#00AFBB", "#E7B800"))
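To make the two ingredients of these plots concrete, the following base R sketch recomputes them by hand for df1: the per-group means that add = "mean" is expected to mark, and a kernel density estimate whose area is approximately 1. It assumes the plots rely on ggplot2's density stat, which by default uses the same stats::density() estimator; the names group_means, dens_f, and area_f are illustrative only.

```r
# Rebuild the example data (same seed as above)
set.seed(1234)
df1 <- data.frame(
  sex = factor(rep(c("F", "M"), each = 200)),
  weight = c(rnorm(200, 55), rnorm(200, 58))
)

# Per-group means: where the dashed mean lines are expected to fall
group_means <- tapply(df1$weight, df1$sex, mean)
print(group_means)

# Kernel density estimate for one group, using base R defaults
dens_f <- density(df1$weight[df1$sex == "F"])

# Approximate the area under the curve with a Riemann sum;
# for a density estimate it should be close to 1
area_f <- sum(dens_f$y) * diff(dens_f$x[1:2])
print(area_f)
```

Because the y-axis is a density rather than a count, the two groups remain comparable even if their sample sizes differ.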
Next, let's take a closer look at the ggdensity function used above.
Usage:
ggdensity(data, x, y = "..density..", combine = FALSE, merge = FALSE, color = "black", fill = NA, palette = NULL, size = NULL, linetype = "solid", alpha = 0.5, title = NULL, xlab = NULL, ylab = NULL, facet.by = NULL, panel.labs = NULL, short.panel.labs = TRUE, add = c("none", "mean", "median"), add.params = list(linetype = "dashed"), rug = FALSE, label = NULL, font.label = list(size = 11, color = "black"), label.select = NULL, repel = FALSE, label.rectangle = FALSE, ggtheme = theme_pubr(), ...)
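Arguments (space is limited, so explore the ones not used here on your own; some are covered again with later plot types):
data
A data frame
x
Name of the variable to plot
y
What the y-axis shows: density or count
combine
Logical; for multiple x variables, whether to draw them as a faceted multi-panel plot. Default is FALSE.
merge
For multiple x variables, whether to merge them into one plotting area. Default is FALSE.
color, fill
Outline color and fill color
palette
Custom color palette
size
Size of points and outlines
linetype
Line type
alpha
Transparency
title
Plot title
xlab
x-axis title
ylab
y-axis title
facet.by
Grouping variable(s) used for faceting
panel.labs
Labels for the facet panels
short.panel.labs
Logical; whether to shorten the panel labels. Default is TRUE.
add
Add a mean or median line; options are "mean" or "median"
add.params
Additional parameters (attributes) for the element added by add
rug
Logical; if TRUE, add a rug along the x-axis showing where the individual observations fall
label
Name of the column containing the labels
font.label
Font of the labels
repel
Logical; whether to use ggrepel to avoid overlapping labels
label.rectangle
Whether to draw a box around the labels
ggtheme
Plot theme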
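As a small illustration of several arguments from this list that the examples above did not exercise, the sketch below combines add = "median", add.params, alpha, title, xlab, and ylab in one call. The title and axis label strings are placeholders chosen for this example.

```r
library(ggpubr)

# Same example data as above
set.seed(1234)
df1 <- data.frame(
  sex = factor(rep(c("F", "M"), each = 200)),
  weight = c(rnorm(200, 55), rnorm(200, 58))
)

# Median lines instead of mean lines, drawn dotted, with lighter fill
p <- ggdensity(df1, x = "weight",
               color = "sex", fill = "sex",
               palette = c("#00AFBB", "#E7B800"),
               add = "median",
               add.params = list(linetype = "dotted"),
               alpha = 0.3,
               title = "Weight by sex",
               xlab = "Weight", ylab = "Density")
p
```

Since the result is an ordinary ggplot object, further ggplot2 layers can be added with + in the usual way.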
Histogram
We reuse df1. Compared with the density plot, the histogram puts the raw counts on the y-axis instead of the density.
gghistogram(df1, x = "weight", add = "mean", rug = TRUE, fill = "lightgray")
# Outline color by group
gghistogram(df1, x = "weight", add = "mean", rug = TRUE, color = "sex", palette = c("#00AFBB", "#E7B800"))
# Fill color by group
gghistogram(df1, x = "weight", add = "mean", rug = TRUE, fill = "sex", palette = c("#00AFBB", "#E7B800"))
# Set both outline and fill color
gghistogram(df1, x = "weight", add = "mean", rug = TRUE, color = "sex", fill = "sex", palette = c("#00AFBB", "#E7B800"))
# Change the number of bins on the x-axis
gghistogram(df1, x = "weight", add = "mean", rug = TRUE, color = "sex", fill = "sex", palette = c("#00AFBB", "#E7B800"), bins = 50)
# Add a density curve
gghistogram(df1, x = "weight", add = "mean", rug = TRUE, fill = "sex", palette = c("#00AFBB", "#E7B800"), add_density = TRUE)
# Put density on the y-axis
gghistogram(df1, x = "weight", y = "..density..", add = "mean", rug = TRUE, fill = "sex", palette = c("#00AFBB", "#E7B800"), add_density = TRUE)
# Facet by group
gghistogram(df1, x = "weight", facet.by = "sex", add = "mean", rug = TRUE, fill = "sex", palette = c("#00AFBB", "#E7B800"), add_density = TRUE)
# Set the labels of the facet panels
gghistogram(df1, x = "weight", facet.by = "sex", panel.labs = list(sex = c("Female", "Male")), add = "mean", rug = TRUE, fill = "sex", palette = c("#00AFBB", "#E7B800"), add_density = TRUE)
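The relation between the count and density scales used above can be checked by hand. The sketch below bins the weights with base R's hist() and converts counts to densities as count / (n × bin width). It assumes equal-width bins, and base R chooses its bin edges differently from ggplot2, so the exact bar heights will not match the plots.

```r
# Same example data as above
set.seed(1234)
df1 <- data.frame(
  sex = factor(rep(c("F", "M"), each = 200)),
  weight = c(rnorm(200, 55), rnorm(200, 58))
)

# Bin the data without drawing; breaks = 30 is only a suggestion to hist()
h <- hist(df1$weight, breaks = 30, plot = FALSE)

bin_width <- diff(h$breaks)[1]   # equal-width bins, so the first width is enough
n <- length(df1$weight)

# Convert counts to densities by hand
manual_density <- h$counts / (n * bin_width)

# The manual values agree with what hist() reports
print(all.equal(manual_density, h$density))

# On the density scale, the bar areas sum to 1
print(sum(manual_density * bin_width))
```

This is why a density curve can be overlaid directly once the y-axis is on the density scale: both then describe areas that sum to 1.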
The gghistogram function:
Usage:
gghistogram(data, x, y = "..count..", combine = FALSE, merge = FALSE, color = "black", fill = NA, palette = NULL, size = NULL, linetype = "solid", alpha = 0.5, bins = NULL, binwidth = NULL, title = NULL, xlab = NULL, ylab = NULL, facet.by = NULL, panel.labs = NULL, short.panel.labs = TRUE, add = c("none", "mean", "median"), add.params = list(linetype = "dashed"), rug = FALSE, add_density = FALSE, label = NULL, font.label = list(size = 11, color = "black"), label.select = NULL, repel = FALSE, label.rectangle = FALSE, ggtheme = theme_pubr(), ...)
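Arguments:
data
A data frame
x
Name of the variable plotted on the x-axis
y
What the y-axis shows: density or count ("..density.." or "..count..")
combine
Logical; for multiple x variables, whether to draw them as a faceted multi-panel plot. Default is FALSE.
merge
For multiple x variables, whether to merge them into one plotting area. Default is FALSE.
color, fill
Outline color and fill color
palette
Custom color palette
size
Size of points and outlines
linetype
Line type
alpha
Transparency
bins
Number of bins (intervals on the x-axis); defaults to 30
binwidth
Bin width (numeric). Values between 0 and 1 are a suggestion for dense data, not a hard limit.
title
Plot title
xlab
x-axis title
ylab
y-axis title
facet.by
Grouping variable(s) used for faceting
panel.labs
Labels for the facet panels
short.panel.labs
Logical; whether to shorten the panel labels. Default is TRUE.
add
Add a mean or median line ("mean" or "median")
add.params
Additional parameters (attributes) for the element added by add
rug
Whether to add a marginal rug
add_density
Whether to add a density curve
label
Name of the column containing the labels
font.label
Font of the labels
repel
Logical; whether to use ggrepel to avoid overlapping labels
label.rectangle
Whether to draw a box around the labels
ggtheme
Plot theme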
Bar plot
# First, build the dataset
df2 <- data.frame(dose = c("D0.5", "D1", "D2"), len = c(4.2, 10, 29.5))
head(df2)
#   dose  len
# 1 D0.5  4.2
# 2   D1 10.0
# 3   D2 29.5
# Basic bar plot
ggbarplot(df2, x = "dose", y = "len")
# Add the y values as labels
ggbarplot(df2, x = "dose", y = "len", label = TRUE, lab.pos = "out")
# Change the bar width
ggbarplot(df2, x = "dose", y = "len", width = 0.5)
# Flip the orientation of the axes
ggbarplot(df2, "dose", "len", orientation = "horiz")
# Specify the order of the bars
ggbarplot(df2, "dose", "len", order = c("D2", "D1", "D0.5"))
# Change fill and outline color, and place the labels inside the bars
ggbarplot(df2, "dose", "len", fill = "steelblue", color = "black", label = TRUE, lab.pos = "in", lab.col = "white")
# Color the outlines by the x-axis group "dose"
ggbarplot(df2, "dose", "len", color = "dose", palette = c("#00AFBB", "#E7B800", "#FC4E07"))
# Color both fill and outline by "dose"
ggbarplot(df2, "dose", "len", fill = "dose", color = "dose", palette = c("#00AFBB", "#E7B800", "#FC4E07"))
# Grouped bar plots
# Build the data
df3 <- data.frame(supp = rep(c("VC", "OJ"), each = 3), dose = rep(c("D0.5", "D1", "D2"), 2), len = c(6.8, 15, 33, 4.2, 10, 29.5))
print(df3)
##   supp dose  len
## 1   VC D0.5  6.8
## 2   VC   D1 15.0
## 3   VC   D2 33.0
## 4   OJ D0.5  4.2
## 5   OJ   D1 10.0
## 6   OJ   D2 29.5
# Stacked bars (the default position), with labels inside
ggbarplot(df3, "dose", "len", fill = "supp", color = "supp", label = TRUE, lab.col = "white", lab.pos = "in")
# Change the position adjustment: bars side by side
ggbarplot(df3, "dose", "len", fill = "supp", color = "supp", palette = "Paired", label = TRUE, position = position_dodge())
# Change the position adjustment: bars stacked as proportions
ggbarplot(df3, "dose", "len", fill = "supp", color = "supp", palette = "Paired", label = TRUE, position = position_fill())
# Add points and error bars, using the "ToothGrowth" dataset.
# ToothGrowth describes the effect of vitamin C on tooth growth in guinea pigs: 60 observations of 3 variables.
# [,1] len: tooth length (numeric). [,2] supp: supplement type (VC or OJ), a factor. [,3] dose: dose in milligrams per day.
data("ToothGrowth")
df4 <- ToothGrowth
head(df4)
#    len supp dose
# 1  4.2   VC  0.5
# 2 11.5   VC  0.5
# 3  7.3   VC  0.5
# 4  5.8   VC  0.5
# 5  6.4   VC  0.5
# 6 10.0   VC  0.5
# Each group on the x-axis now corresponds to several values on the y-axis
ggbarplot(df4, x = "dose", y = "len")
# Show the mean of each group
ggbarplot(df4, x = "dose", y = "len", add = "mean")
# Add error bars and adjust the label position; other error types (mean_sd, mean_ci, median_iqr, ...) are left for you to explore
ggbarplot(df4, x = "dose", y = "len", add = "mean_se", label = TRUE, lab.vjust = -1.6)
# Add jittered points
ggbarplot(df4, x = "dose", y = "len", add = c("mean_se", "jitter"))
# Add dots (dot plot)
ggbarplot(df4, x = "dose", y = "len", add = c("mean_se", "dotplot"))
# Bar plot with multiple groups
ggbarplot(df4, x = "dose", y = "len", color = "supp", add = "mean_se", palette = c("#00AFBB", "#E7B800"), position = position_dodge())
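To see what add = "mean_se" summarizes, the base R sketch below computes the mean and the standard error of the mean (sd / sqrt(n)) of len for each dose. It is an independent recomputation under the assumption that the bar height is the group mean and the error bar spans mean ± one standard error; summary_df and its column names are illustrative, not part of ggpubr.

```r
data("ToothGrowth")
df4 <- ToothGrowth

# Mean tooth length per dose: the expected bar heights
means <- tapply(df4$len, df4$dose, mean)

# Standard error of the mean per dose: sd / sqrt(n)
ses <- tapply(df4$len, df4$dose, function(v) sd(v) / sqrt(length(v)))

# Collect the summary, including the assumed error bar limits
summary_df <- data.frame(
  dose  = names(means),
  mean  = as.numeric(means),
  se    = as.numeric(ses),
  lower = as.numeric(means - ses),
  upper = as.numeric(means + ses)
)
print(summary_df)
```

Swapping sd(v) / sqrt(length(v)) for sd(v) gives the analogous hand computation for "mean_sd".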
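# Now for a more advanced example, using the "mtcars" dataset
data("mtcars")
df5 <- mtcars
View(df5)  # opens the data viewer (interactive sessions)
df5$cyl <- factor(df5$cyl)  # convert to a factor; otherwise the plotting calls below raise an error
df5$name <- rownames(df5)   # add a "name" column
head(df5[, c("name", "wt", "mpg", "cyl")])
#                                name    wt  mpg cyl
# Mazda RX4                 Mazda RX4 2.620 21.0   6
# Mazda RX4 Wag         Mazda RX4 Wag 2.875 21.0   6
# Datsun 710               Datsun 710 2.320 22.8   4
# Hornet 4 Drive       Hornet 4 Drive 3.215 21.4   6
# Hornet Sportabout Hornet Sportabout 3.440 18.7   8
# Valiant                     Valiant 3.460 18.1   6
# Bar plot of fuel economy (mpg) for each car, filled by cyl with the "npg" palette, values sorted in descending order
ggbarplot(df5, x = "name", y = "mpg",
          fill = "cyl",
          color = "white",
          palette = "npg",          # Nature Publishing Group palette
          sort.val = "desc",        # descending order
          sort.by.groups = FALSE,   # do not sort within groups
          x.text.angle = 60)
# Sort within groups
ggbarplot(df5, x = "name", y = "mpg",
          fill = "cyl",
          color = "white",
          palette = "aaas",         # Science (AAAS) palette
          sort.val = "asc",         # ascending order, as opposed to "desc"; compare the two plots
          sort.by.groups = TRUE,    # sort within each cyl group
          x.text.angle = 60)        # rotate the x-axis labels (x.text.angle = 90 also works)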
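The two sorting modes can be mimicked on the data itself with base R's order(). This sketch only illustrates the resulting orderings and makes no claim about how ggpubr implements sort.val and sort.by.groups internally; desc_all and asc_by_group are illustrative names.

```r
data("mtcars")
df5 <- mtcars
df5$cyl <- factor(df5$cyl)
df5$name <- rownames(df5)

# Global descending sort by mpg
# (analogous to sort.val = "desc", sort.by.groups = FALSE)
desc_all <- df5[order(df5$mpg, decreasing = TRUE), c("name", "mpg", "cyl")]
print(head(desc_all, 3))

# Ascending sort by mpg within each cyl group
# (analogous to sort.val = "asc", sort.by.groups = TRUE)
asc_by_group <- df5[order(df5$cyl, df5$mpg), c("name", "mpg", "cyl")]
print(head(asc_by_group, 3))
```

In the first ordering cars from different cyl groups interleave; in the second each cyl group forms one contiguous block, which is what makes the grouped plot easier to read by color.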
The ggbarplot function:
Usage:
ggbarplot(data, x, y, combine = FALSE, merge = FALSE, color = "black", fill = "white", palette = NULL, size = NULL, width = NULL, title = NULL, xlab = NULL, ylab = NULL, facet.by = NULL, panel.labs = NULL, short.panel.labs = TRUE, select = NULL, remove = NULL, order = NULL, add = "none", add.params = list(), error.plot = "errorbar", label = FALSE, lab.col = "black", lab.size = 4, lab.pos = c("out", "in"), lab.vjust = NULL, lab.hjust = NULL, sort.val = c("none", "desc", "asc"), sort.by.groups = TRUE, top = Inf, position = position_stack(), ggtheme = theme_pubr(), ...)
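Arguments:
data
A data frame
x,y
Names of the variables to plot
combine
For multiple variables, whether to draw them as a faceted multi-panel plot. Default is FALSE.
merge
For multiple variables, whether to merge them into one plotting area. Default is FALSE.
color
Outline color
fill
Fill color
palette
Custom color palette
size
Size of points and outlines
width
Bar width, between 0 and 1
title
Plot title
xlab
x-axis title; xlab = FALSE hides the title
ylab
y-axis title; ylab = FALSE hides the title
x.text.angle
Rotation angle of the x-axis text
orientation
Orientation of the axes
facet.by
Grouping variable(s) used for faceting
panel.labs
Labels for the facet panels
short.panel.labs
Logical; whether to shorten the panel labels. Default is TRUE.
select
Items to display
remove
Items to leave out
order
Order in which the items are displayed
add
Add another plot element; allowed values:
"none", "dotplot", "jitter", "boxplot", "point", "mean", "mean_se", "mean_sd", "mean_ci", "mean_range", "median", "median_iqr", "median_mad", "median_range"
add.params
Attributes for the element added by add: color, shape, size, fill, linetype
e.g. add.params = list(color = "red")
error.plot
Type of error plot; options are "pointrange", "linerange", "crossbar", "errorbar", "upper_errorbar", "lower_errorbar", "upper_pointrange", "lower_pointrange", "upper_linerange", "lower_linerange". For ggbarplot, the usage above shows "errorbar" as the default.
label
Whether to add labels to the bars (label = TRUE shows the y values, as in the examples above)
lab.col, lab.size
Text color and size of the labels
lab.pos
Position of the labels; options are:
"out" (for outside)
"in" (for inside).
Ignored when lab.vjust != NULL.
lab.vjust
Vertical adjustment of the label position.
Provide negative value (e.g.: -0.4) to put labels outside the bars or positive value to put labels inside (e.g.: 2).
lab.hjust
Horizontal adjustment of the label position.
sort.val
Whether and how to sort the values:
"none": no sorting
"asc": ascending
"desc": descending
sort.by.groups
Logical; if TRUE, sort within groups
top
Number of top elements to display
position
Position adjustment of the bars (for example position_stack() or position_dodge())
legend.title
Legend title
ggtheme
Plot theme; default is theme_pubr().
Allowed values include the official ggplot2 themes: theme_gray(), theme_bw(), theme_minimal(), theme_classic(), theme_void()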
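A short sketch that puts a few of the layout-related arguments together: position, orientation, legend.title, xlab/ylab, and ggtheme. The data are the df3 toy data from above; the label strings and the dodge width of 0.8 are arbitrary choices for this example.

```r
library(ggpubr)

# Toy data from the grouped bar plot example
df3 <- data.frame(
  supp = rep(c("VC", "OJ"), each = 3),
  dose = rep(c("D0.5", "D1", "D2"), 2),
  len  = c(6.8, 15, 33, 4.2, 10, 29.5)
)

# Side-by-side horizontal bars with a custom legend title and a ggplot2 theme
p <- ggbarplot(df3, x = "dose", y = "len",
               fill = "supp", color = "supp", palette = "Paired",
               position = position_dodge(0.8),
               orientation = "horiz",
               legend.title = "Supplement",
               xlab = "Dose", ylab = "Length",
               ggtheme = theme_minimal())
p
```

theme_minimal() comes from ggplot2, which is attached when ggpubr is loaded, so no extra library() call is needed here.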