Le Yan
HPC@LSU
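Datasets used here come from the datasets, gcookbook and ggplot2 packages.
To list the datasets in a package: library(help='<name>')
To get help on a dataset: ?<dataset>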
library(help='datasets')
Information on package 'datasets'
Description:
Package: datasets
Version: 3.3.3
Priority: base
Title: The R Datasets Package
Author: R Core Team and contributors worldwide
Maintainer: R Core Team <R-core@r-project.org>
Description: Base R datasets.
License: Part of R 3.3.3
Built: R 3.3.3; ; 2017-03-06 14:15:22 UTC; windows
Index:
AirPassengers Monthly Airline Passenger Numbers 1949-1960
BJsales Sales Data with Leading Indicator
BOD Biochemical Oxygen Demand
CO2 Carbon Dioxide Uptake in Grass Plants
ChickWeight Weight versus age of chicks on different diets
DNase Elisa assay of DNase
EuStockMarkets Daily Closing Prices of Major European Stock
...
First, let's examine the data:
str(pressure)
'data.frame': 19 obs. of 2 variables:
$ temperature: num 0 20 40 60 80 100 120 140 160 180 ...
$ pressure : num 0.0002 0.0012 0.006 0.03 0.09 0.27 0.75 1.85 4.2 8.8 ...
summary(pressure)
temperature pressure
Min. : 0 Min. : 0.0002
1st Qu.: 90 1st Qu.: 0.1800
Median :180 Median : 8.8000
Mean :180 Mean :124.3367
3rd Qu.:270 3rd Qu.:126.5000
Max. :360 Max. :806.0000
?pressure
pressure {datasets} R Documentation
Vapor Pressure of Mercury as a Function of Temperature
Description
Data on the relation between temperature in degrees Celsius and vapor pressure of mercury in millimeters (of mercury).
Usage
pressure
Format
A data frame with 19 observations on 2 variables.
[, 1] temperature numeric temperature (deg C)
[, 2] pressure numeric pressure (mm)
Source
Weast, R. C., ed. (1973) Handbook of Chemistry and Physics. CRC Press.
References
McNeil, D. R. (1977) Interactive Data Analysis. New York: Wiley.
We can use the plot() function in the base plot system to create a scatter plot:
# Simply specify the x and y variables.
plot(pressure$temperature,pressure$pressure)
Since there are only two variables, we can simply run:
plot(pressure)
The type argument of plot() can be used to specify the plot type.
Line plot with "l":
plot(pressure,type="l")
Or dot and line with “b”:
plot(pressure,type="b")
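To compare several plot types at a glance, the panels can be drawn side by side. The following is a minimal sketch using only base R and the pressure dataset; the choice of the four types and the 2 x 2 layout are illustrative assumptions, not part of the original slides.

```r
# Draw the same data with four values of "type" in a 2 x 2 grid.
types <- c(p = "points", l = "lines", b = "both", h = "vertical lines")

old_par <- par(mfrow = c(2, 2))   # split the device into 2 x 2 panels
for (t in names(types)) {
  plot(pressure, type = t,
       main = paste0('type = "', t, '" (', types[[t]], ")"))
}
par(old_par)                      # restore the previous layout
```

Saving and restoring the result of par() keeps later plots from inheriting the 2 x 2 layout.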
There are a few functions that can be used to add more elements/layers to the plot:
# Create the plot with title and axis labels.
plot(pressure,type="l",
main="Vapor Pressure of Mercury",
xlab="Temperature",
ylab="Vapor Pressure")
# Add points (base graphics uses "cex" for symbol size; "size" is not a graphical parameter)
points(pressure,cex=1.5,col='red')
# Add annotation
text(150,700,"Source: Weast, R. C., ed. (1973) Handbook\nof Chemistry and Physics. CRC Press.")
Use boxplot() for boxplots.
First, examine the mpg dataset:
str(mpg)
Classes 'tbl_df', 'tbl' and 'data.frame': 234 obs. of 11 variables:
$ manufacturer: chr "audi" "audi" "audi" "audi" ...
$ model : chr "a4" "a4" "a4" "a4" ...
$ displ : num 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
$ year : int 1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
$ cyl : int 4 4 4 4 6 6 6 4 4 4 ...
$ trans : chr "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
$ drv : chr "f" "f" "f" "f" ...
$ cty : int 18 21 20 21 16 18 18 18 16 20 ...
$ hwy : int 29 29 31 30 26 26 27 26 25 28 ...
$ fl : chr "p" "p" "p" "p" ...
$ class : chr "compact" "compact" "compact" "compact" ...
summary(mpg)
manufacturer model displ year
Length:234 Length:234 Min. :1.600 Min. :1999
Class :character Class :character 1st Qu.:2.400 1st Qu.:1999
Mode :character Mode :character Median :3.300 Median :2004
Mean :3.472 Mean :2004
3rd Qu.:4.600 3rd Qu.:2008
Max. :7.000 Max. :2008
cyl trans drv cty
Min. :4.000 Length:234 Length:234 Min. : 9.00
1st Qu.:4.000 Class :character Class :character 1st Qu.:14.00
Median :6.000 Mode :character Mode :character Median :17.00
Mean :5.889 Mean :16.86
3rd Qu.:8.000 3rd Qu.:19.00
Max. :8.000 Max. :35.00
hwy fl class
Min. :12.00 Length:234 Length:234
1st Qu.:18.00 Class :character Class :character
Median :24.00 Mode :character Mode :character
Mean :23.44
3rd Qu.:27.00
Max. :44.00
?mpg
Fuel economy data from 1999 and 2008 for 38 popular models of car
Description
This dataset contains a subset of the fuel economy data that the EPA makes available on http://fueleconomy.gov. It contains only models which had a new release every year between 1999 and 2008 - this was used as a proxy for the popularity of the car.
Usage
mpg
Format
A data frame with 234 rows and 11 variables
manufacturer
model
model name
displ
engine displacement, in litres
year
year of manufacture
cyl
number of cylinders
trans
type of transmission
drv
f = front-wheel drive, r = rear wheel drive, 4 = 4wd
cty
city miles per gallon
hwy
highway miles per gallon
fl
fuel type
class
"type" of car
boxplot(hwy ~ cyl, data=mpg)
# Use the title function to add title and labels.
title("Highway Mileage per Gallon",
xlab = "Number of cylinders",
ylab = "Mileage (per gallon)")
hist() can be used to create histograms:
hist(mpg$hwy)
hist(mpg$hwy, breaks=c(5,15,25,30,50))
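When the breaks are not equally spaced, as in the call above, hist() by default plots densities rather than counts, so that bar areas stay comparable. The sketch below illustrates this with the airquality dataset from the datasets package (chosen here only so the example runs without extra packages); the break points are arbitrary.

```r
temp <- airquality$Temp   # daily temperatures, no missing values

# plot = FALSE returns the histogram object without drawing it.
h_equal   <- hist(temp, breaks = seq(50, 100, by = 10), plot = FALSE)
h_unequal <- hist(temp, breaks = c(50, 60, 70, 75, 80, 100), plot = FALSE)

h_equal$equidist     # equally spaced breaks: counts are plotted by default
h_unequal$equidist   # unequal breaks: densities are plotted by default

# Densities integrate to 1 over the bins.
total_area <- sum(h_unequal$density * diff(h_unequal$breaks))
```

Passing freq=TRUE forces counts even with unequal breaks, though R warns that the bar areas are then misleading.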
The curve() function draws a function over a specified range.
curve(cos,-3*pi, 3*pi)
title("Cosine Function")
# The abline() function adds one or more straight lines to the current plot
abline(h=c(-1,0,1),
col = 2, lty = 2, lwd = 1.5)
When plot() is given a data frame without X and Y being specified, it generates a panel grid of plots.
str(airquality)
'data.frame': 153 obs. of 6 variables:
$ Ozone : int 41 36 12 18 NA 28 23 19 8 NA ...
$ Solar.R: int 190 118 149 313 NA NA 299 99 19 194 ...
$ Wind : num 7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
$ Temp : int 67 72 74 62 56 66 65 59 61 69 ...
$ Month : int 5 5 5 5 5 5 5 5 5 5 ...
$ Day : int 1 2 3 4 5 6 7 8 9 10 ...
plot(airquality)
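For a data frame with more than two columns, plot() ends up calling pairs(), so the grid can also be requested directly and restricted to the columns of interest. A minimal sketch, where the three selected columns are an arbitrary choice:

```r
# Select a few columns and draw the scatter plot matrix explicitly.
vars <- c("Ozone", "Wind", "Temp")
pairs(airquality[, vars], main = "airquality: selected variables")

# Observations with a missing value are left out of the affected panels.
n_missing_ozone <- sum(is.na(airquality$Ozone))
```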
Example: saving to a PNG file
png("test.png",width=5*240,height=3*240)
plot(pressure, type="l")
points(pressure,col="red")
dev.off()
Here is the saved graph:
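Other file devices follow the same open, draw, close pattern as png(). As a minimal sketch, the same figure can be written to a PDF; the output path is a hypothetical choice, and note that pdf() takes its width and height in inches rather than pixels.

```r
out <- file.path(tempdir(), "pressure.pdf")  # hypothetical output path

pdf(out, width = 5, height = 3)              # size in inches, not pixels
plot(pressure, type = "l")
points(pressure, col = "red")
invisible(dev.off())                         # close the device to write the file

file.exists(out)
```

Nothing is written to disk until dev.off() is called, so a forgotten dev.off() is the usual cause of an empty or missing file.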
The ggplot2 package in R is an implementation of the Grammar of Graphics.
The qplot() function from the ggplot2 package is similar to the plot() function in the base system.
Examine the data:
str(heightweight)
'data.frame': 236 obs. of 5 variables:
$ sex : Factor w/ 2 levels "f","m": 1 1 1 1 1 1 1 1 1 1 ...
$ ageYear : num 11.9 12.9 12.8 13.4 15.9 ...
$ ageMonth: int 143 155 153 161 191 171 185 142 160 140 ...
$ heightIn: num 56.3 62.3 63.3 59 62.5 62.5 59 56.5 62 53.8 ...
$ weightLb: num 85 105 108 92 112 ...
summary(heightweight)
sex ageYear ageMonth heightIn weightLb
f:111 Min. :11.58 Min. :139.0 Min. :50.50 Min. : 50.5
m:125 1st Qu.:12.33 1st Qu.:148.0 1st Qu.:58.73 1st Qu.: 85.0
Median :13.58 Median :163.0 Median :61.50 Median :100.5
Mean :13.67 Mean :164.1 Mean :61.34 Mean :101.0
3rd Qu.:14.83 3rd Qu.:178.0 3rd Qu.:64.30 3rd Qu.:112.0
Max. :17.50 Max. :210.0 Max. :72.00 Max. :171.5
?heightweight
heightweight {gcookbook} R Documentation
Height and weight of schoolchildren
Description
Height and weight of schoolchildren
Variables
sex
ageYear: Age in years.
ageMonth: Age in months.
heightIn: Height in inches.
weightLb: Weight in pounds.
Source
Lewis, T., & Taylor, L.R. (1967), Introduction to Experimental Ecology, Academic Press.
qplot(weightLb, heightIn, data=heightweight, geom="point")
qplot(weightLb, heightIn, data=heightweight, geom ="text", label=ageYear)
This is what is under the hood:
ggplot(heightweight, aes(x=weightLb, y=heightIn, color=sex, shape=sex)) +
geom_point(size=3.5) +
ggtitle("School Children\nHeight ~ Weight") +
labs(y="Height (inch)", x="Weight (lbs)") +
stat_smooth(method=loess, se=TRUE, color="black", fullrange=TRUE) +
annotate("text",x=145,y=75,label="Locally weighted polynomial fit with 95% CI",color="Green",size=6) +
scale_color_brewer(palette = "Set1", labels=c("Female", "Male")) +
guides(shape="none") +
theme_bw() +
theme(plot.title = element_text(size=20, hjust=0.5),
legend.position = c(0.9,0.2),
axis.title.x = element_text(size=20), axis.title.y = element_text(size=20),
legend.title = element_text(size=15),legend.text = element_text(size=15))
Don't Panic!!!
Grammar of Graphics components:
Use the ggplot function to indicate what data to use
Use geom_xxx functions to indicate what types of visual marks to use
Use aesthetic properties (the aes() function) to map variables to visual marks
ggplot(heightweight, # What data to use
aes(x=weightLb,y=heightIn)) + # Aesthetic specifies variables
geom_point() # Geom specifies visual marks
This is equivalent to:
qplot(weightLb, heightIn, data=heightweight, geom="point")
ggplot(mpg,aes(x=hwy)) + geom_histogram(binwidth=5, fill="white", color="black")
ggplot(faithfuld, aes(waiting, eruptions, z = density))+
geom_raster(aes(fill = density)) +
geom_contour(colour = "white")
With the maps package, one can create geographical graphs:
east_asia <- map_data("world",
region=c("Japan","China","North Korea","South Korea"))
ggplot(east_asia, aes(x=long,y=lat,group=group, fill=region)) +
geom_polygon(color="black") +
scale_fill_brewer(palette="Set2")
There are more than 30 geoms in ggplot2
geom
functions
ggplot(heightweight, aes(x=weightLb,y=heightIn)) +
geom_point(shape=13,size=5,color='red',alpha=0.5)
colors()
colors()
[1] "white" "aliceblue" "antiquewhite"
[4] "antiquewhite1" "antiquewhite2" "antiquewhite3"
[7] "antiquewhite4" "aquamarine" "aquamarine1"
[10] "aquamarine2" "aquamarine3" "aquamarine4"
[13] "azure" "azure1" "azure2"
[16] "azure3" "azure4" "beige"
[19] "bisque" "bisque1" "bisque2"
[22] "bisque3" "bisque4" "black"
[25] "blanchedalmond" "blue" "blue1"
[28] "blue2" "blue3" "blue4"
[31] "blueviolet" "brown" "brown1"
[34] "brown2" "brown3" "brown4"
[37] "burlywood" "burlywood1" "burlywood2"
[40] "burlywood3" "burlywood4" "cadetblue"
[43] "cadetblue1" "cadetblue2" "cadetblue3"
[46] "cadetblue4" "chartreuse" "chartreuse1"
[49] "chartreuse2" "chartreuse3" "chartreuse4"
[52] "chocolate" "chocolate1" "chocolate2"
[55] "chocolate3" "chocolate4" "coral"
[58] "coral1" "coral2" "coral3"
[61] "coral4" "cornflowerblue" "cornsilk"
[64] "cornsilk1" "cornsilk2" "cornsilk3"
[67] "cornsilk4" "cyan" "cyan1"
[70] "cyan2" "cyan3" "cyan4"
[73] "darkblue" "darkcyan" "darkgoldenrod"
[76] "darkgoldenrod1" "darkgoldenrod2" "darkgoldenrod3"
[79] "darkgoldenrod4" "darkgray" "darkgreen"
[82] "darkgrey" "darkkhaki" "darkmagenta"
[85] "darkolivegreen" "darkolivegreen1" "darkolivegreen2"
[88] "darkolivegreen3" "darkolivegreen4" "darkorange"
[91] "darkorange1" "darkorange2" "darkorange3"
[94] "darkorange4" "darkorchid" "darkorchid1"
[97] "darkorchid2" "darkorchid3" "darkorchid4"
[100] "darkred" "darksalmon" "darkseagreen"
[103] "darkseagreen1" "darkseagreen2" "darkseagreen3"
[106] "darkseagreen4" "darkslateblue" "darkslategray"
[109] "darkslategray1" "darkslategray2" "darkslategray3"
[112] "darkslategray4" "darkslategrey" "darkturquoise"
[115] "darkviolet" "deeppink" "deeppink1"
[118] "deeppink2" "deeppink3" "deeppink4"
[121] "deepskyblue" "deepskyblue1" "deepskyblue2"
[124] "deepskyblue3" "deepskyblue4" "dimgray"
[127] "dimgrey" "dodgerblue" "dodgerblue1"
[130] "dodgerblue2" "dodgerblue3" "dodgerblue4"
[133] "firebrick" "firebrick1" "firebrick2"
[136] "firebrick3" "firebrick4" "floralwhite"
[139] "forestgreen" "gainsboro" "ghostwhite"
[142] "gold" "gold1" "gold2"
[145] "gold3" "gold4" "goldenrod"
[148] "goldenrod1" "goldenrod2" "goldenrod3"
[151] "goldenrod4" "gray" "gray0"
[154] "gray1" "gray2" "gray3"
[157] "gray4" "gray5" "gray6"
[160] "gray7" "gray8" "gray9"
[163] "gray10" "gray11" "gray12"
[166] "gray13" "gray14" "gray15"
[169] "gray16" "gray17" "gray18"
[172] "gray19" "gray20" "gray21"
[175] "gray22" "gray23" "gray24"
[178] "gray25" "gray26" "gray27"
[181] "gray28" "gray29" "gray30"
[184] "gray31" "gray32" "gray33"
[187] "gray34" "gray35" "gray36"
[190] "gray37" "gray38" "gray39"
[193] "gray40" "gray41" "gray42"
[196] "gray43" "gray44" "gray45"
[199] "gray46" "gray47" "gray48"
[202] "gray49" "gray50" "gray51"
[205] "gray52" "gray53" "gray54"
[208] "gray55" "gray56" "gray57"
[211] "gray58" "gray59" "gray60"
[214] "gray61" "gray62" "gray63"
[217] "gray64" "gray65" "gray66"
[220] "gray67" "gray68" "gray69"
[223] "gray70" "gray71" "gray72"
[226] "gray73" "gray74" "gray75"
[229] "gray76" "gray77" "gray78"
[232] "gray79" "gray80" "gray81"
[235] "gray82" "gray83" "gray84"
[238] "gray85" "gray86" "gray87"
[241] "gray88" "gray89" "gray90"
[244] "gray91" "gray92" "gray93"
[247] "gray94" "gray95" "gray96"
[250] "gray97" "gray98" "gray99"
[253] "gray100" "green" "green1"
[256] "green2" "green3" "green4"
[259] "greenyellow" "grey" "grey0"
[262] "grey1" "grey2" "grey3"
[265] "grey4" "grey5" "grey6"
[268] "grey7" "grey8" "grey9"
[271] "grey10" "grey11" "grey12"
[274] "grey13" "grey14" "grey15"
[277] "grey16" "grey17" "grey18"
[280] "grey19" "grey20" "grey21"
[283] "grey22" "grey23" "grey24"
[286] "grey25" "grey26" "grey27"
[289] "grey28" "grey29" "grey30"
[292] "grey31" "grey32" "grey33"
[295] "grey34" "grey35" "grey36"
[298] "grey37" "grey38" "grey39"
[301] "grey40" "grey41" "grey42"
[304] "grey43" "grey44" "grey45"
[307] "grey46" "grey47" "grey48"
[310] "grey49" "grey50" "grey51"
[313] "grey52" "grey53" "grey54"
[316] "grey55" "grey56" "grey57"
[319] "grey58" "grey59" "grey60"
[322] "grey61" "grey62" "grey63"
[325] "grey64" "grey65" "grey66"
[328] "grey67" "grey68" "grey69"
[331] "grey70" "grey71" "grey72"
[334] "grey73" "grey74" "grey75"
[337] "grey76" "grey77" "grey78"
[340] "grey79" "grey80" "grey81"
[343] "grey82" "grey83" "grey84"
[346] "grey85" "grey86" "grey87"
[349] "grey88" "grey89" "grey90"
[352] "grey91" "grey92" "grey93"
[355] "grey94" "grey95" "grey96"
[358] "grey97" "grey98" "grey99"
[361] "grey100" "honeydew" "honeydew1"
[364] "honeydew2" "honeydew3" "honeydew4"
[367] "hotpink" "hotpink1" "hotpink2"
[370] "hotpink3" "hotpink4" "indianred"
[373] "indianred1" "indianred2" "indianred3"
[376] "indianred4" "ivory" "ivory1"
[379] "ivory2" "ivory3" "ivory4"
[382] "khaki" "khaki1" "khaki2"
[385] "khaki3" "khaki4" "lavender"
[388] "lavenderblush" "lavenderblush1" "lavenderblush2"
[391] "lavenderblush3" "lavenderblush4" "lawngreen"
[394] "lemonchiffon" "lemonchiffon1" "lemonchiffon2"
[397] "lemonchiffon3" "lemonchiffon4" "lightblue"
[400] "lightblue1" "lightblue2" "lightblue3"
[403] "lightblue4" "lightcoral" "lightcyan"
[406] "lightcyan1" "lightcyan2" "lightcyan3"
[409] "lightcyan4" "lightgoldenrod" "lightgoldenrod1"
[412] "lightgoldenrod2" "lightgoldenrod3" "lightgoldenrod4"
[415] "lightgoldenrodyellow" "lightgray" "lightgreen"
[418] "lightgrey" "lightpink" "lightpink1"
[421] "lightpink2" "lightpink3" "lightpink4"
[424] "lightsalmon" "lightsalmon1" "lightsalmon2"
[427] "lightsalmon3" "lightsalmon4" "lightseagreen"
[430] "lightskyblue" "lightskyblue1" "lightskyblue2"
[433] "lightskyblue3" "lightskyblue4" "lightslateblue"
[436] "lightslategray" "lightslategrey" "lightsteelblue"
[439] "lightsteelblue1" "lightsteelblue2" "lightsteelblue3"
[442] "lightsteelblue4" "lightyellow" "lightyellow1"
[445] "lightyellow2" "lightyellow3" "lightyellow4"
[448] "limegreen" "linen" "magenta"
[451] "magenta1" "magenta2" "magenta3"
[454] "magenta4" "maroon" "maroon1"
[457] "maroon2" "maroon3" "maroon4"
[460] "mediumaquamarine" "mediumblue" "mediumorchid"
[463] "mediumorchid1" "mediumorchid2" "mediumorchid3"
[466] "mediumorchid4" "mediumpurple" "mediumpurple1"
[469] "mediumpurple2" "mediumpurple3" "mediumpurple4"
[472] "mediumseagreen" "mediumslateblue" "mediumspringgreen"
[475] "mediumturquoise" "mediumvioletred" "midnightblue"
[478] "mintcream" "mistyrose" "mistyrose1"
[481] "mistyrose2" "mistyrose3" "mistyrose4"
[484] "moccasin" "navajowhite" "navajowhite1"
[487] "navajowhite2" "navajowhite3" "navajowhite4"
[490] "navy" "navyblue" "oldlace"
[493] "olivedrab" "olivedrab1" "olivedrab2"
[496] "olivedrab3" "olivedrab4" "orange"
[499] "orange1" "orange2" "orange3"
[502] "orange4" "orangered" "orangered1"
[505] "orangered2" "orangered3" "orangered4"
[508] "orchid" "orchid1" "orchid2"
[511] "orchid3" "orchid4" "palegoldenrod"
[514] "palegreen" "palegreen1" "palegreen2"
[517] "palegreen3" "palegreen4" "paleturquoise"
[520] "paleturquoise1" "paleturquoise2" "paleturquoise3"
[523] "paleturquoise4" "palevioletred" "palevioletred1"
[526] "palevioletred2" "palevioletred3" "palevioletred4"
[529] "papayawhip" "peachpuff" "peachpuff1"
[532] "peachpuff2" "peachpuff3" "peachpuff4"
[535] "peru" "pink" "pink1"
[538] "pink2" "pink3" "pink4"
[541] "plum" "plum1" "plum2"
[544] "plum3" "plum4" "powderblue"
[547] "purple" "purple1" "purple2"
[550] "purple3" "purple4" "red"
[553] "red1" "red2" "red3"
[556] "red4" "rosybrown" "rosybrown1"
[559] "rosybrown2" "rosybrown3" "rosybrown4"
[562] "royalblue" "royalblue1" "royalblue2"
[565] "royalblue3" "royalblue4" "saddlebrown"
[568] "salmon" "salmon1" "salmon2"
[571] "salmon3" "salmon4" "sandybrown"
[574] "seagreen" "seagreen1" "seagreen2"
[577] "seagreen3" "seagreen4" "seashell"
[580] "seashell1" "seashell2" "seashell3"
[583] "seashell4" "sienna" "sienna1"
[586] "sienna2" "sienna3" "sienna4"
[589] "skyblue" "skyblue1" "skyblue2"
[592] "skyblue3" "skyblue4" "slateblue"
[595] "slateblue1" "slateblue2" "slateblue3"
[598] "slateblue4" "slategray" "slategray1"
[601] "slategray2" "slategray3" "slategray4"
[604] "slategrey" "snow" "snow1"
[607] "snow2" "snow3" "snow4"
[610] "springgreen" "springgreen1" "springgreen2"
[613] "springgreen3" "springgreen4" "steelblue"
[616] "steelblue1" "steelblue2" "steelblue3"
[619] "steelblue4" "tan" "tan1"
[622] "tan2" "tan3" "tan4"
[625] "thistle" "thistle1" "thistle2"
[628] "thistle3" "thistle4" "tomato"
[631] "tomato1" "tomato2" "tomato3"
[634] "tomato4" "turquoise" "turquoise1"
[637] "turquoise2" "turquoise3" "turquoise4"
[640] "violet" "violetred" "violetred1"
[643] "violetred2" "violetred3" "violetred4"
[646] "wheat" "wheat1" "wheat2"
[649] "wheat3" "wheat4" "whitesmoke"
[652] "yellow" "yellow1" "yellow2"
[655] "yellow3" "yellow4" "yellowgreen"
Multiple layers can be added with geom_xxx functions:
ggplot(heightweight, aes(x=weightLb,y=heightIn)) +
geom_point() +
geom_quantile(quantiles = c(0.25,0.5,0.75)) +
geom_text(label=rownames(heightweight), vjust=-0.5)
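Aesthetic mappings are defined with the aes() function.
They can be specified in the ggplot function or in individual layers.
Those specified in ggplot are the default, but can be overridden in individual layers.
Example: use the "sex" variable to group data points by shape and color: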
ggplot(heightweight, aes(x=weightLb,y=heightIn)) +
geom_point(aes(shape=sex,color=sex))
Specify the shapes and colors manually (more on the scale functions later):
ggplot(heightweight, aes(x=weightLb,y=heightIn)) +
geom_point(aes(shape=sex,color=sex),size=4) +
scale_shape_manual(values=c(1,4)) +
scale_color_manual(values=c("blue","green"))
ggplot(heightweight, aes(x=weightLb,y=heightIn)) +
geom_point(aes(shape=sex,color=sex,size=ageYear))
Use the stat_smooth function to add a fitted model to the plot:
ggplot(heightweight, aes(x=weightLb,y=heightIn)) +
geom_point(aes(shape=sex,color=sex),size=4) +
scale_shape_manual(values=c(1,4)) +
scale_color_manual(values=c("blue","green")) +
stat_smooth(method = lm, level=0.95)
Moving the color=sex mapping to the ggplot function makes every layer inherit it, so stat_smooth fits one line per sex and produces two lines:
ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) +
geom_point(aes(shape=sex),size=4) +
scale_shape_manual(values=c(1,4)) +
scale_color_manual(values=c("blue","green")) +
stat_smooth(method = lm, level=0.95)
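The difference between the two placements can be checked by counting the groups that the smoothing layer sees. The sketch below assumes ggplot2 is installed and uses its mpg dataset (with drv in place of sex) so that it runs without gcookbook; layer_data() returns the data computed for a given layer.

```r
library(ggplot2)

# color mapped in the point layer only: the smoother sees one group
p_layer <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = drv)) +
  stat_smooth(method = lm, formula = y ~ x)

# color mapped in ggplot(): inherited by every layer, one fit per drv level
p_global <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  stat_smooth(method = lm, formula = y ~ x)

# Layer 2 is the smoother in both plots.
n_layer  <- length(unique(layer_data(p_layer, 2)$group))
n_global <- length(unique(layer_data(p_global, 2)$group))
```

Here n_layer counts a single fitted line, while n_global counts one line for each level of drv.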
To label data points, use either annotate or geom_text:
ggplot(heightweight, aes(x=weightLb,y=heightIn)) +
geom_point(aes(shape=sex,color=sex,size=ageYear)) +
annotate("text",x=150,y=68,label="Some label",color="darkgreen",size=12)
ggplot(heightweight, aes(x=weightLb,y=heightIn)) +
geom_point(aes(shape=sex,color=sex,size=ageYear)) +
geom_text(aes(label=ageYear),vjust=0.5)
Use a stat_xxx function to choose a common transformation to visualize.
We have seen the stat_smooth() function:
ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) +
geom_point(aes(shape=sex),size=4) +
scale_shape_manual(values=c(1,4)) +
scale_color_manual(values=c("blue","green")) +
stat_smooth(method = lm, level=0.95)
Another example: the stat_bin() function creates a frequency count:
ggplot(mpg,aes(x=hwy)) + stat_bin(binwidth = 5)
This is equivalent to:
ggplot(mpg,aes(x=hwy)) + geom_histogram(binwidth = 5)
Or:
ggplot(mpg,aes(x=hwy)) +
geom_histogram(stat="bin", binwidth = 5)
# The "bin" stat is the implied default for histogram
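One way to confirm that the stat and geom forms are equivalent is to compare the data each plot computes. This is a minimal sketch assuming ggplot2 is installed; layer_data() extracts the binned data for the first layer of each plot.

```r
library(ggplot2)

via_stat <- layer_data(ggplot(mpg, aes(x = hwy)) + stat_bin(binwidth = 5))
via_geom <- layer_data(ggplot(mpg, aes(x = hwy)) + geom_histogram(binwidth = 5))

# Both routes bin the same values into the same counts.
same_counts <- identical(via_stat$count, via_geom$count)

# Every observation lands in exactly one bin.
all_binned <- sum(via_geom$count) == nrow(mpg)
```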
Density plot with stat_density:
ggplot(mpg,aes(x=hwy)) + stat_density()
Or the same plot with geom_histogram:
ggplot(mpg,aes(x=hwy)) + geom_histogram(stat="density")
A ggplot plot can be saved in an object:
p <- ggplot(heightweight, aes(x=weightLb,y=heightIn))
p + geom_point(aes(shape=sex,color=sex,size=ageYear))
# Here we use the saved plot object "p"
p + geom_smooth(method=lm)
With ggplot2 one can use the ggsave() function to save a plot:
ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) +
geom_point(aes(shape=sex),size=4) +
scale_shape_manual(values=c(1,4)) +
scale_color_manual(values=c("blue","green")) +
stat_smooth(method = lm, level=0.99)
ggsave("hw.png",width=6,height=6)
To add a title, use either ggtitle or labs(title=):
p <- ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) +
geom_point(aes(shape=sex),size=4) +
scale_shape_manual(values=c(1,4)) +
scale_color_manual(values=c("blue","green"))
p + ggtitle("Height ~ weight of school children")
Note the title is left-aligned by default.
To add axis labels, use either (x|y)lab or labs(x=,y=):
p + ggtitle("Height ~ weight of school children") +
xlab("Weight (lbs)") + ylab("Height (inch)")
Use labs(<aes>=) to specify legend titles:
p + ggtitle("Height ~ weight of school children") +
xlab("Weight (lbs)") + ylab("Height (inch)") +
labs(color='Gender', shape='Gender')
Use the guides function to set the legend type for each aesthetic property.
Before:
p <- ggplot(heightweight, aes(x=weightLb,y=heightIn,color=ageYear)) +
geom_point(aes(shape=sex))
p
After:
p + guides(shape='none',color='legend')
ggplot2 provides a few pre-defined themes for users to choose from.
The classic theme:
p <- ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) +
geom_point(aes(shape=sex),size=4)
p + theme_classic()
The dark theme:
p + theme_dark()
More themes are available from the ggthemes package.
Example: Excel theme
p + theme_excel()
Individual elements can be adjusted with the theme() function.
Removing the grid lines:
p + theme_bw() +
theme(panel.grid = element_blank())
Or just removing the vertical ones:
p + theme_bw() +
theme(panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank())
Change the base size and font family:
p + theme_bw(base_size = 24, base_family = "Times")
Or fine tune each element:
p + theme_bw(base_size = 24, base_family = "Times") +
theme(legend.title = element_text(size=20,color="blue"),# Legend title
legend.text = element_text(size=18,color="red"), # Legend text
axis.title.x = element_text(size=18,color="red"), # X axis label
axis.title.y = element_blank()) # Remove Y axis label
The element_blank() function can be used to remove undesired elements.
p + theme_bw(base_size = 24, base_family = "Times") +
theme(legend.position = "bottom")
p + theme_bw(base_size = 24, base_family = "Times") +
theme(legend.position = c(0.9,0.1))
Elements that can be adjusted with the theme() function:
theme(line, rect, text, title, aspect.ratio, axis.title, axis.title.x,
axis.title.x.top, axis.title.y, axis.title.y.right, axis.text, axis.text.x,
axis.text.x.top, axis.text.y, axis.text.y.right, axis.ticks, axis.ticks.x,
axis.ticks.y, axis.ticks.length, axis.line, axis.line.x, axis.line.y,
legend.background, legend.margin, legend.spacing, legend.spacing.x,
legend.spacing.y, legend.key, legend.key.size, legend.key.height,
legend.key.width, legend.text, legend.text.align, legend.title,
legend.title.align, legend.position, legend.direction, legend.justification,
legend.box, legend.box.just, legend.box.margin, legend.box.background,
legend.box.spacing, panel.background, panel.border, panel.spacing,
panel.spacing.x, panel.spacing.y, panel.grid, panel.grid.major,
panel.grid.minor, panel.grid.major.x, panel.grid.major.y, panel.grid.minor.x,
panel.grid.minor.y, panel.ontop, plot.background, plot.title, plot.subtitle,
plot.caption, plot.margin, strip.background, strip.placement, strip.text,
strip.text.x, strip.text.y, strip.switch.pad.grid, strip.switch.pad.wrap, ...,
complete = FALSE, validate = TRUE)
The default theme is theme_grey().
Use theme_set() to change the default.
With old default:
p
With new default:
theme_set(theme_light())
p
mytheme <- theme_bw(base_size = 24, base_family = "Times") +
theme(legend.title = element_text(size=20,color="blue"),# Legend title
legend.text = element_text(size=18,color="red"), # Legend text
axis.title.x = element_text(size=18,color="red"), # X axis label
axis.title.y = element_blank()) # Remove Y axis label
p + mytheme
Functions that control the coordinate system:
coord_cartesian - the default cartesian coordinates
coord_flip - flip X and Y
coord_polar - polar coordinates
coord_trans - transform cartesian coordinates
Original:
g <- ggplot(mpg,aes(x=hwy)) + geom_histogram(binwidth=5, fill="white", color="black")
g
With flipped coordinates:
g + coord_flip()
Original:
g
With transformed Y coordinate:
g + coord_trans(y="sqrt")
Use the xlim() and ylim() functions to set the range of axes:
p + theme_light() +
xlim(0,200) +
ylim(50,100)
The scale_<aes>_(continuous|discrete|manual|identity|...) family of functions controls how data points are mapped to aesthetic values.
scale_x_continuous: scale for X, which is a continuous variable
p + theme_bw() +
ylim(50,100) +
scale_x_continuous(limits=c(0,200),
breaks=c(50,110,170),
labels=c("Thin","Medium\nSize","Chubby"))
p + theme_economist_white() +
scale_x_log10(breaks=c(10,20,50,100,200),
limits=c(5,500)) + # Plot X on a log10 scale
scale_y_reverse() # Reverse the Y scale
ggplot(mpg,aes(x=drv,y=cty,fill=drv)) +
geom_boxplot()
ggplot(mpg,aes(x=drv,y=cty,fill=drv)) +
geom_boxplot() +
scale_fill_discrete(limits=c("f","r","4"),
labels=c("Front","Rear","4 Wheel Drive"))
By default:
ggplot(mpg,aes(x=displ,y=hwy,size=cyl,
color=drv,shape=fl)) +
geom_point(aes(alpha=cty))
Re-scaled:
ggplot(mpg,aes(x=displ,y=hwy,size=cyl,color=drv, alpha=cty)) +
geom_point() +
scale_size_identity() + # Use the values of "cyl" variable for size
scale_color_manual(values=c("darkblue","rosybrown2","#24FA22")) +
scale_alpha_continuous(range=c(0.1,1))
Faceting in ggplot2 is managed by the functions facet_grid and facet_wrap.
facet_grid: creates a row of panels defined by the variable "drv":
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
facet_grid(. ~ drv)
facet_grid: creates a column of panels defined by the variable "fl":
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
facet_grid(fl ~ .)
facet_grid: creates a matrix of panels defined by the variables "fl" and "drv":
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
facet_grid(fl ~ drv)
facet_wrap: wraps a 1d sequence of panels into 2d:
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
facet_wrap(~class, nrow=3)
Open your browser and try: https://plot.ly/~lyan1/2/
The code:
library(plotly)
nmmaps<-read.csv("chicago-nmmaps.csv", as.is=T)
nmmaps$date<-as.Date(nmmaps$date)
nmmaps<-nmmaps[nmmaps$date>as.Date("1996-12-31"),]
nmmaps$year<-substring(nmmaps$date,1,4)
g <- ggplot(nmmaps, aes(date, temp, color=factor(season)))+ geom_point() +
scale_color_manual(values=c("dodgerblue4", "darkolivegreen4",
"darkorchid3", "goldenrod1"))
#ggplotly(g) # offline
api_create(g, filename = NULL, fileopt = "new", sharing = "public")
R Graphics Cookbook is a good reference.