6 Factors
6.1 Factors
A factor is a data type similar to a vector, except that its values can only be drawn from a fixed set of levels. If no levels are supplied, R uses the distinct values found in the data, in sorted order.
houses_vector <- c("Bungalow", "Flat", "Flat",
"Detached", "Flat", "Terrace", "Terrace")
houses_vector
## [1] "Bungalow" "Flat" "Flat" "Detached" "Flat" "Terrace"
## [7] "Terrace"
houses_factor <- factor(c("Bungalow", "Flat", "Flat",
"Detached", "Flat", "Terrace", "Terrace"))
houses_factor
## [1] Bungalow Flat Flat Detached Flat Terrace Terrace
## Levels: Bungalow Detached Flat Terrace
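As a minimal sketch of what the printed Levels line reflects, the levels and the underlying integer codes of a factor can be inspected with the base R functions levels and as.integer. The names house_levels and house_codes are introduced here only for illustration.

```r
# Recreate the factor from the example above
houses_factor <- factor(c("Bungalow", "Flat", "Flat",
                          "Detached", "Flat", "Terrace", "Terrace"))

# The set of permitted values (sorted by default)
house_levels <- levels(houses_factor)
house_levels
## [1] "Bungalow" "Detached" "Flat"     "Terrace"

# Each element is stored as an integer index into the levels
house_codes <- as.integer(houses_factor)
house_codes
## [1] 1 3 3 2 3 4 4
```

For instance, the second element is "Flat", the third level, so its code is 3.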
6.2 table
The function table can be used to obtain a tabulated count for each level.
houses_factor <- factor(c("Bungalow", "Flat", "Flat",
"Detached", "Flat", "Terrace", "Terrace"))
houses_factor
## [1] Bungalow Flat Flat Detached Flat Terrace Terrace
## Levels: Bungalow Detached Flat Terrace
table(houses_factor)
## houses_factor
## Bungalow Detached Flat Terrace
## 1 1 3 2
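The result of table can be stored and queried like a named vector. The sketch below assumes the same factor as above; house_counts, flat_count and house_shares are illustrative names, and prop.table is a base R helper that is not otherwise used in this section.

```r
# Recreate the factor and tabulate it
houses_factor <- factor(c("Bungalow", "Flat", "Flat",
                          "Detached", "Flat", "Terrace", "Terrace"))
house_counts <- table(houses_factor)

# Look up the count for a single level by name
flat_count <- house_counts[["Flat"]]
flat_count
## [1] 3

# Convert the counts to proportions of the total (they sum to 1)
house_shares <- prop.table(house_counts)
```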
6.3 Specified levels
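The set of levels, and their order, can be fixed when creating a factor by providing a levels argument. Values that are not among the levels (here "People Carrier" and "Hatchback") become NA and are not counted by table, while levels that never occur in the data are reported with a count of 0.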
houses_factor_spec <- factor(
c("People Carrier", "Flat", "Flat", "Hatchback",
"Flat", "Terrace", "Terrace"),
levels = c("Bungalow", "Flat", "Detached",
"Semi", "Terrace"))
table(houses_factor_spec)
## houses_factor_spec
## Bungalow Flat Detached Semi Terrace
## 0 3 0 0 2
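The two values that do not match any level are not shown in the table above. A short sketch of how they could be inspected follows; n_missing is an illustrative name, and useNA is an existing argument of table.

```r
# Recreate the factor with the specified levels
houses_factor_spec <- factor(
  c("People Carrier", "Flat", "Flat", "Hatchback",
    "Flat", "Terrace", "Terrace"),
  levels = c("Bungalow", "Flat", "Detached",
             "Semi", "Terrace"))

# Values outside the levels are stored as NA
houses_factor_spec
## [1] <NA>    Flat    Flat    <NA>    Flat    Terrace Terrace
## Levels: Bungalow Flat Detached Semi Terrace

# Count how many values did not match any level
n_missing <- sum(is.na(houses_factor_spec))
n_missing
## [1] 2

# table() skips NA by default; useNA = "ifany" adds a column for it
counts_with_na <- table(houses_factor_spec, useNA = "ifany")
```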
6.4 (Unordered) Factors
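In statistics terminology, (unordered) factors are categorical (i.e., binary or nominal) variables. Levels are not ordered, even when they are listed in a particular sequence, so order comparisons such as > are not meaningful and return NA with a warning.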
income_nominal <- factor(
c("High", "High", "Low", "Low", "Low",
"Medium", "Low", "Medium"),
levels = c("Low", "Medium", "High"))
income_nominal > "Low"
## Warning in Ops.factor(income_nominal, "Low"): '>' not meaningful for
## factors
## [1] NA NA NA NA NA NA NA NA
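Equality tests, by contrast, are well defined for unordered factors. A minimal sketch using the same data (is_low is an illustrative name):

```r
# Recreate the unordered factor
income_nominal <- factor(
  c("High", "High", "Low", "Low", "Low",
    "Medium", "Low", "Medium"),
  levels = c("Low", "Medium", "High"))

# Equality comparisons work without a warning
is_low <- income_nominal == "Low"
is_low
## [1] FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE

# is.ordered() confirms that the levels carry no ranking
is.ordered(income_nominal)
## [1] FALSE
```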
6.5 Ordered Factors
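In statistics terminology, ordered factors are ordinal variables. Levels are ordered, from the first level given to the last, so comparisons such as > are meaningful and follow the level order.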
income_ordered <- ordered(
c("High", "High", "Low", "Low", "Low",
"Medium", "Low", "Medium"),
levels = c("Low", "Medium", "High"))
income_ordered > "Low"
## [1] TRUE TRUE FALSE FALSE FALSE TRUE FALSE TRUE
sort(income_ordered)
## [1] Low Low Low Low Medium Medium High High
## Levels: Low < Medium < High
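As a further sketch, an ordered factor can also be created with factor and the argument ordered = TRUE, which is equivalent to calling ordered. Functions such as max then respect the level order. The names highest and at_least_medium are illustrative.

```r
# Same data as above, created with factor(..., ordered = TRUE)
income_ordered <- factor(
  c("High", "High", "Low", "Low", "Low",
    "Medium", "Low", "Medium"),
  levels = c("Low", "Medium", "High"),
  ordered = TRUE)

# The factor is flagged as ordered
is.ordered(income_ordered)
## [1] TRUE

# max() returns the highest level present in the data
highest <- max(income_ordered)
as.character(highest)
## [1] "High"

# Comparisons follow the level order Low < Medium < High
at_least_medium <- income_ordered >= "Medium"
at_least_medium
## [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE
```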