[R] Basic plot functions

1. Basic graphic function plot()
- 1. 1. Options
2. points()
3. lines()
4. abline()
5. curve()
6. text()
7. polygon()
8. arrows()
9. legend()
10. boxplot()
11. hist()

1. Basic graphic function `plot()`

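R offers many visualization tools, such as the widely used ggplot2 package and the plotly package for interactive plots.
Before turning to those, this post covers the built-in plot() function and then the companion functions that are often needed alongside it.
This post draws on the Exploratory Data Analysis (탐색적자료분석) lecture notes of Professor 이석호 at Hankuk University of Foreign Studies.

plot() draws a scatter plot by default, but it is a generic function that can visualize many kinds of objects, not just scatter plots.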

Calling methods("plot") lists the kinds of objects that plot() knows how to draw.

```
methods("plot")
```
```
##  [1] plot.acf*           plot.data.frame*    plot.decomposed.ts*
##  [4] plot.default        plot.dendrogram*    plot.density*      
##  [7] plot.ecdf           plot.factor*        plot.formula*      
## [10] plot.function       plot.hclust*        plot.histogram*    
## [13] plot.HoltWinters*   plot.isoreg*        plot.lm*           
## [16] plot.medpolish*     plot.mlm*           plot.ppr*          
## [19] plot.prcomp*        plot.princomp*      plot.profile.nls*  
## [22] plot.R6*            plot.raster*        plot.spec*         
## [25] plot.stepfun        plot.stl*           plot.table*        
## [28] plot.ts             plot.tskernel*      plot.TukeyHSD*     
## see '?methods' for accessing help and source code
```

For example, plot.lm() is the plot method defined for the lm (linear model) class, so calling plot() on a linear model object automatically dispatches to it. In other words, plot() behaves differently depending on the class of the object it is given.
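To make the dispatch on class concrete, here is a minimal sketch that defines a plot method for a made-up class. The class name `myrange` and its fields `lo` and `hi` are hypothetical and exist only for this illustration.

```r
# A hypothetical S3 class holding a lower and an upper bound
obj <- structure(list(lo = 2, hi = 7), class = "myrange")

# plot() finds this method through the class attribute of its argument
plot.myrange <- function(x, ...) {
  plot(c(x$lo, x$hi), c(0, 0), type = "b", yaxt = "n",
       xlab = "value", ylab = "", ...)
  invisible(x)
}

plot(obj, main = "dispatched to plot.myrange")   # calls plot.myrange()

# A fitted linear model has class "lm", so plot(fit) would use plot.lm()
fit <- lm(dist ~ speed, data = cars)
class(fit)
```

The same mechanism is what lets one plot() call handle time series, densities, dendrograms, and the other classes in the list.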
The basic form of a scatter plot call is shown below.
- x : x coordinates, y : y coordinates
```
x = 1:10
y = x^2
plot(x, y)
```

1. 1. Options

The main options of plot() are as follows.

| Graph options | Meaning |
| --- | --- |
| `xlab`, `ylab` | x and y axis labels |
| `main` | plot title |
| `pch` | point symbol |
| `cex` | point size |
| `col` | color |
| `xlim`, `ylim` | range of values on the x and y axes |
| `type` | plot type |

Axis labels and plot title

The xlab = "", ylab = "", and main = "" arguments set the axis labels and the title.

```
plot(x, y, xlab = "x-axis label", ylab = "y-axis label", main = "Title")
```

Point symbol

The pch argument sets the plotting symbol.
pch can be a number or a single character; running example(points) shows which symbol each pch value stands for.
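As a quick way to see the numeric symbols side by side, the sketch below lays out pch values 0 to 25 on a grid and labels each one. The variable names (`pch_values`, `gx`, `gy`) are only for this illustration.

```r
# Lay out pch values 0 to 25 on a grid and label each symbol with its number
pch_values <- 0:25
gx <- pch_values %% 7          # column position
gy <- 3 - pch_values %/% 7     # row position, first row on top

plot(gx, gy, pch = pch_values, cex = 2, axes = FALSE,
     xlab = "", ylab = "", xlim = c(-0.5, 6.5), ylim = c(-0.5, 3.5),
     main = "pch = 0, ..., 25")
text(gx, gy - 0.35, labels = pch_values, cex = 0.8)
```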

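Point size

The cex argument sets the size of the points in a scatter plot. It is a scaling factor relative to the default of 1, so cex = 0.5 draws points at half the default size.

```
x <- rep(1:5, rep(5,5))
y <- rep(5:1, 5)
plot(x, y, cex = 0.5)
```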

Color

The col argument sets the color. The pch and cex options above are relevant when drawing points, whereas col applies to points, lines, and other elements alike.
```
plot(x, y, col = "red")
```
The color names that R recognizes can be listed with colors().
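The following is a small sketch of the ways a color can be specified: by name, as an RGB triple, or as a hex string. The data (`px`) and the particular colors are arbitrary choices for the example.

```r
# Built-in color names
head(colors())               # the first few named colors
"red" %in% colors()          # TRUE

# The same red as an RGB triple and as a hex string
col2rgb("red")
rgb(255, 0, 0, maxColorValue = 255)

# col is vectorized: one color per point
px <- 1:5
plot(px, px^2, pch = 19,
     col = c("red", "blue", "darkgreen", "orange", "purple"))
```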

Axis ranges

To change the x and y ranges that plot() picks by default, pass the xlim and ylim arguments.
Note that a fixed range can cut off part of the graph, so choose the limits with the smallest and largest data values in mind, for example by checking min() and max().
- xlim = c(lower, upper), ylim = c(lower, upper)

```
plot(x, y,
     xlim = c(0, 3),
     ylim = c(1, 4))
```
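One way to avoid cutting off data is to derive the limits from the data itself. The sketch below uses range(), which returns the minimum and maximum in one call; the vectors `vx` and `vy` and the one-unit padding are assumptions made up for the example.

```r
# Hypothetical data for illustration
vx <- c(2.5, 4.1, 3.3, 7.8)
vy <- c(10, 14, 9, 21)

# range() returns c(min, max), which is the shape xlim and ylim expect
range(vx)

# Pad the limits by one unit so that no point sits on the plot border
plot(vx, vy,
     xlim = range(vx) + c(-1, 1),
     ylim = range(vy) + c(-1, 1))
```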

Plot type

The type argument sets the kind of graph that is drawn.
- type = "p" : points (default)
- type = "l" : lines
- type = "b" : both points and lines
- type = "c" : the lines part alone of "b"
- type = "o" : both points and lines, overplotted
- type = "h" : histogram-like vertical lines
- type = "s" : stair steps
- type = "S" : other steps
- type = "n" : no plotting

```
par(mfrow = c(2, 3))     # divide the window: 2 rows and 3 columns

x = 1:10
y = x^2

plot(x, y, type = "l", main = "lines")
plot(x, y, type = "h", main = "histogram")
plot(x, y, type = "n", main = "no plotting")
plot(x, y, type = "b", main = "both points and lines")
plot(x, y, type = "s", main = "stair steps")
plot(x, y, main = "default : points")
```

Line type

The lty argument sets the line type.
- lty = 0, lty = "blank" : no line
- lty = 1, lty = "solid" : solid line (default)
- lty = 2, lty = "dashed" : dashes
- lty = 3, lty = "dotted" : dots
- lty = 4, lty = "dotdash" : alternating dots and dashes
- lty = 5, lty = "longdash" : long dashes

```
par(mfrow = c(2, 3))
plot(x, y, type = "l", lty = 0)
plot(x, y, type = "l", lty = 1)
plot(x, y, type = "l", lty = 2)
plot(x, y, type = "l", lty = 3)
plot(x, y, type = "l", lty = 4)
plot(x, y, type = "l", lty = 5)
```

Axis settings

The axis() function draws an axis with custom settings.

```
axis(side, at = NULL, labels = TRUE, tick = TRUE, pos = NA, outer = FALSE, ...)
```
- side : 1 = bottom, 2 = left, 3 = top, 4 = right
- at : positions at which tick marks are drawn
- labels : labels shown at the tick marks
- pos : coordinate at which the axis line is drawn

See ?axis for the full list of arguments.

```
plot(1:5, type = "l", main = "axis", axes = FALSE, xlab = "", ylab = "")
axis(side = 1, at = 1:5, labels = LETTERS[1:5], line = 2)
axis(side = 2, tick = FALSE, col.axis = "blue")
axis(side = 3, outer = TRUE)
axis(side = 3, at = c(1, 3, 5), pos = 3, col = "blue", col.axis = "red")
axis(side = 4, lty = 2, lwd = 2)
```

Arranging multiple plots

Setting the mfrow parameter through par() arranges several plots in one window.
- The plots are laid out in nr rows and nc columns.
```
par(mfrow = c(nr, nc))
```
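Because par() changes stay in effect for later plots, it can help to save the old setting and restore it afterwards. A minimal sketch, using the built-in cars data as a stand-in for any data:

```r
# par() returns the previous values of the settings it changes
op <- par(mfrow = c(1, 2))      # 1 row, 2 columns

plot(cars, main = "cars")
hist(cars$speed, main = "speed")

par(op)                          # back to the previous layout
par("mfrow")
```

Restoring with par(op) means the next plot() call fills the whole window again instead of one cell of the grid.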

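layout() offers another way to arrange plots. It takes a matrix whose entries say which plot occupies each cell; in the matrix below, plot 1 spans the whole first row and plots 2 and 3 share the second row.

```
m <- matrix(c(1, 1, 2, 3), ncol = 2, byrow = TRUE)
m
```
```
##      [,1] [,2]
## [1,]    1    1
## [2,]    2    3
```
```
layout(mat = m)
plot(cars, main = 'scatter plot of cars data')
hist(cars$speed, col = 'lightblue', border = 'white')
hist(cars$dist, col = 'darkgray', border = 'white')
```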

2. `points()`

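points() draws points. Whereas each successive plot() call starts a new graph, points() adds points to a plot that already exists.
- The example below plots Sepal.Width against Sepal.Length from the built-in iris data with plot(), then overlays Petal.Width and Petal.Length on the same graph with points(). points() does not rescale the axes, so petal values outside the range set by the first plot() call are not shown; widen xlim and ylim in plot() to include them.

```
plot(iris$Sepal.Width, iris$Sepal.Length, cex = 0.5, pch = 20,
     xlab = "Width", ylab = "Length", main = "iris")
points(iris$Petal.Width, iris$Petal.Length, cex = 0.5, pch = "+", col = "green")
```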

3. `lines()`

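Like points(), lines() adds to a graph already drawn by plot(), in this case a line that connects the given points in order.
Such lines are commonly used to show a trend in time series data, or to distinguish several groups of data by color or line type.

```
x <- seq(from = 0, to = 2*pi, by = 0.1)
y <- sin(x)

plot(x, y, type = "n")
lines(x, y, lty = 3)
```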
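To illustrate the point about telling several series apart by color and line type, here is a small sketch that overlays sine and cosine. The variable name `ang` and the chosen colors are arbitrary.

```r
# Two series on a shared x grid
ang <- seq(from = 0, to = 2 * pi, by = 0.1)

# Set up empty axes that fit both series, then add one line per series
plot(ang, sin(ang), type = "n", ylim = c(-1, 1), xlab = "x", ylab = "value")
lines(ang, sin(ang), col = "blue", lty = 1)
lines(ang, cos(ang), col = "red", lty = 2)
```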

4. `abline()`

abline() adds a straight line to a graph: a line of the form $y = \beta_0 + \beta_1 x$, a horizontal line $y = h$, or a vertical line $x = v$.
Unlike lines(), which connects a sequence of points, abline() draws a single straight line across the whole plot region.

```
plot(x, y)
abline(v = 3, lty = 2)  # vertical
abline(h = 0, lty = 3)  # horizontal
abline(a = -1, b = 1, col = "red")  # y = -1 + x
```
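A common use of the $y = \beta_0 + \beta_1 x$ form is drawing a fitted regression line. The sketch below assumes a simple linear model on the built-in cars data; abline() can take the fitted model directly through its reg argument.

```r
# Fit dist = b0 + b1 * speed on the built-in cars data
fit <- lm(dist ~ speed, data = cars)
coef(fit)                       # intercept (b0) and slope (b1)

plot(cars$speed, cars$dist, xlab = "speed", ylab = "dist")
abline(reg = fit, col = "red")  # line taken from the fitted model

# The same line, with intercept and slope given explicitly
abline(a = coef(fit)[1], b = coef(fit)[2], lty = 2)
```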

5. `curve()`

curve() draws the curve of a given function or expression over an interval.
```
curve(expr = sin, from = 0, to = 2*pi)
```
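Besides a function name such as sin, curve() also accepts an expression written in terms of x. A short sketch, with arbitrarily chosen expressions:

```r
# An expression written in terms of x
res <- curve(x^2, from = -2, to = 2, ylab = "y")

# add = TRUE overlays a second curve on the existing plot
curve(2 * x + 1, from = -2, to = 2, add = TRUE, lty = 2, col = "red")

# curve() invisibly returns the points it evaluated
str(res)
```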

6. `text()`

text() writes a string on a graph.
```
text(x, y, labels = "string to show", adj = NULL)
```
- adj sets how the text is aligned relative to the point (x, y). With adj = c(0, 0) the text appears to the upper right of the point, with c(0, 1) to the lower right, with c(1, 0) to the upper left, and with c(1, 1) to the lower left.
```
plot(4:6, 4:6, xlab = "", ylab = "", type = "n")
text(5, 5, "X")
text(5, 5, "00", adj = c(0, 0))
text(5, 5, "01", adj = c(0, 1))
text(5, 5, "10", adj = c(1, 0))
text(5, 5, "11", adj = c(1, 1))
```

7. `polygon()`

polygon() draws a polygon.
It is handy for shading a range of values, such as a confidence interval.

```
theta <- seq(-pi, pi, length.out = 12)
x <- cos(theta)
y <- sin(theta)
plot(1:6, type = "n", main = "polygon", xlab = "", ylab = "", axes = FALSE)

x1 = x + 2 ; y1 = y + 4.5
polygon(x1, y1)
text(2, 5.7, adj=0.5, 'default')

x2 = x + 2 ; y2 = y + 2
polygon(x2, y2, col='gold')
text(2, 3.2, adj=0.5, 'col=\'gold\'')

x3 = x + 5 ; y3 = y + 4.5
polygon(x3, y3, density = 10)
text(5, 5.7, adj=0.5, 'density=10')

x4 = x + 5 ; y4 = y + 2
polygon(x4, y4, lty = 2, lwd = 2)
text(5, 3.2, adj=0.5, 'lty=2, lwd=2')
```
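As a sketch of the confidence interval use mentioned above, the example below shades a 95% confidence band around a regression line. The model, the data (built-in cars), and the names `newx` and `band` are assumptions chosen for illustration.

```r
fit <- lm(dist ~ speed, data = cars)

# Predict on a sorted grid; the result has columns fit, lwr, upr
newx <- data.frame(speed = seq(min(cars$speed), max(cars$speed),
                               length.out = 50))
band <- predict(fit, newdata = newx, interval = "confidence", level = 0.95)

plot(cars$speed, cars$dist, xlab = "speed", ylab = "dist")

# Trace the lower bound left to right, then the upper bound right to left
polygon(c(newx$speed, rev(newx$speed)),
        c(band[, "lwr"], rev(band[, "upr"])),
        col = "lightblue", border = NA)
lines(newx$speed, band[, "fit"], col = "blue")
points(cars$speed, cars$dist)   # redraw the points hidden by the shading
```

The key step is reversing the second half of the coordinates, so that the outline runs around the band instead of crossing it.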

8. `arrows()`

arrows() draws arrows.

```
arrows(x0 = x start, y0 = y start, x1 = x end, y1 = y end, length = head length, angle = head angle, code = head type)
```
- code : 0 = no head, 1 = head at the start point, 2 = head at the end point (default), 3 = heads at both ends

```
plot(1:9, type = "n", axes = FALSE, xlab = "", ylab = "", main = "arrows")
arrows(1, 9, 4, 9, angle = 30, length = 0.25, code = 2)
arrows(1, 8, 4, 8, length = 0.5)
arrows(1, 7, 4, 7, length = 0.1)
arrows(1, 6, 4, 6, angle = 60)
arrows(1, 5, 4, 5, angle = 90)
arrows(1, 4, 4, 4, angle = 120)
arrows(1, 3, 4, 3, code = 0)
arrows(1, 2, 4, 2, code = 1)
arrows(1, 1, 4, 1, code = 3)
text(4.5, 9, adj = 0, 'angle=30, length=0.25, code=2 (default)')
text(4.5, 8, adj = 0, 'length=0.5')
text(4.5, 7, adj = 0, 'length=0.1')
text(4.5, 6, adj = 0, 'angle=60')
text(4.5, 5, adj = 0, 'angle=90')
text(4.5, 4, adj = 0, 'angle=120')
text(4.5, 3, adj = 0, 'code=0')
text(4.5, 2, adj = 0, 'code=1')
text(4.5, 1, adj = 0, 'code=3')
```

9. `legend()`

legend() adds a legend to a graph.

```
legend(x, y, legend)
```

Instead of x and y coordinates, the position can be given directly as a keyword such as "left", "topleft", "bottom", or "center".

```
plot(1:10, type = "n", xlab = "", ylab = "", main = "legend")
legend('bottomright', '(x,y)', pch = 1, title = 'bottomright')
legend('bottom', '(x,y)', pch = 1, title = 'bottom')
legend('bottomleft', '(x,y)', pch = 1, title = 'bottomleft')
legend('left', '(x,y)', pch = 1, title = 'left')
legend('topleft', '(x,y)', pch = 1, title = 'topleft')
legend('top', '(x,y)', pch = 1, title = 'top')
legend('topright', '(x,y)', pch = 1, title = 'topright')
legend('right', '(x,y)', pch = 1, title = 'right')
legend('center', '(x,y)', pch = 1, title = 'center')

legends = c('Legend1', 'Legend2')
legend(3, 8, legend = legends, pch = 1:2, col = 1:2)
legend(7, 8, legend = legends, pch = 1:2, col = 1:2, lty = 1:2)
legend(3, 4, legend = legends, fill = 1:2)
legend(7, 4, legend = legends, fill = 1:2, density = 30)
```
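In practice a legend has to match the colors and symbols used in the plot. One way to keep them in sync, sketched here on the iris data with arbitrarily chosen colors and symbols, is to define the mapping once and reuse it for both calls.

```r
# Map each species to a color and symbol once, then reuse it for the legend
sp   <- levels(iris$Species)
cols <- c("red", "darkgreen", "blue")
pchs <- c(1, 2, 3)
idx  <- as.integer(iris$Species)     # 1, 2 or 3 for each row

plot(iris$Sepal.Width, iris$Sepal.Length,
     col = cols[idx], pch = pchs[idx],
     xlab = "Sepal.Width", ylab = "Sepal.Length", main = "iris")
legend("topright", legend = sp, col = cols, pch = pchs, title = "Species")
```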

10. `boxplot()`

A box plot shows the distribution of the data, including the first quartile, the median, and the third quartile. The function is boxplot().
- The box plot of the Sepal.Width variable in the iris data:
```
boxplot(iris$Sepal.Width)
```
To read off the exact values shown in the box plot, inspect the value that boxplot() returns.
```
stats <- boxplot(iris$Sepal.Width)
```
```
stats
```
```
## $stats
##      [,1]
## [1,]  2.2
## [2,]  2.8
## [3,]  3.0
## [4,]  3.3
## [5,]  4.0
## 
## $n
## [1] 150
## 
## $conf
##          [,1]
## [1,] 2.935497
## [2,] 3.064503
## 
## $out
## [1] 4.4 4.1 4.2 2.0
## 
## $group
## [1] 1 1 1 1
## 
## $names
## [1] "1"
```
- The return value is a list. $stats holds (lower whisker, lower hinge, median, upper hinge, upper whisker), and $out holds the outliers.
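boxplot() also accepts a formula, which draws one box per group. The sketch below splits Sepal.Width by Species; the variable name `bp` is arbitrary.

```r
# One box per species; the formula reads "Sepal.Width split by Species"
bp <- boxplot(Sepal.Width ~ Species, data = iris, ylab = "Sepal.Width")

bp$names          # group labels, in plotting order
bp$stats          # one column of five summary values per group
bp$stats[3, ]     # the medians
```

With several groups, the returned $stats matrix has one column per group, and $group records which group each outlier in $out belongs to.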

11. `hist()`

Another graph that is useful for examining the distribution of data is the histogram, which is drawn with hist().
- The histogram of the Sepal.Width variable in the iris data:
```
hist(iris$Sepal.Width)
```
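Like boxplot(), hist() returns the values behind the picture. The sketch below asks for roughly 20 bins and inspects the result; the bin count and colors are arbitrary choices.

```r
# breaks is a suggestion for the number of bins; R picks "pretty" cut points
h <- hist(iris$Sepal.Width, breaks = 20, col = "lightblue", border = "white",
          main = "Sepal.Width", xlab = "Sepal.Width")

h$breaks          # bin boundaries actually used
h$counts          # number of observations per bin
sum(h$counts)     # equals the number of observations
```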
