9 Data frames

9.1 Data frames

A data frame is equivalent to a named list in which every element is a vector of the same length. Each element forms one column, and the shared length gives the number of rows.

##    Name Age       Role
## 1 Maria  47  Professor
## 2  Pete  34 Researcher
## 3 Sarah  32 Researcher
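
A table like the one above could be built as in the following minimal sketch. The variable name employees is an assumption made for illustration, and stringsAsFactors = TRUE is set explicitly because R 4.0 and later no longer convert text columns to factors by default.

```r
# Hypothetical variable name; the values are copied from the printed table.
employees <- data.frame(
  Name = c("Maria", "Pete", "Sarah"),
  Age  = c(47, 34, 32),
  Role = c("Professor", "Researcher", "Researcher"),
  stringsAsFactors = TRUE  # text columns become factors
)

print(employees)

# A data frame is a list whose elements (columns) all have the same length
is_list_too    <- is.list(employees)
column_lengths <- sapply(employees, length)
print(column_lengths)
```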

Data frames are the most common way to represent tabular data in R. Matrices and lists can be converted to data frames.
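
The conversions mentioned above can be sketched with base R's as.data.frame. The object names and values here are made up for illustration.

```r
# A 3 x 2 matrix with named columns (values are illustrative)
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("x", "y")))
df_from_matrix <- as.data.frame(m)

# A named list whose elements all have the same length
l <- list(Name = c("Maria", "Pete"), Age = c(47, 34))
df_from_list <- as.data.frame(l)

print(df_from_matrix)
print(df_from_list)
```

The list conversion only works when the elements have compatible lengths, which mirrors the definition of a data frame.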

9.2 Selection

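Selection is similar to selection on vectors and lists: square brackets take a row index and a column index. The output below shows the first row, followed by the Name column.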

##    Name Age      Role
## 1 Maria  47 Professor
## [1] Maria Pete  Sarah
## Levels: Maria Pete Sarah
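
The two results above are consistent with selecting the first row and then the first column. A minimal sketch, assuming a data frame named employees (a hypothetical name) that holds the values from 9.1 with text columns stored as factors:

```r
# Hypothetical data frame holding the values from section 9.1
employees <- data.frame(
  Name = c("Maria", "Pete", "Sarah"),
  Age  = c(47, 34, 32),
  Role = c("Professor", "Researcher", "Researcher"),
  stringsAsFactors = TRUE
)

first_row    <- employees[1, ]  # row selection: the result is a one-row data frame
first_column <- employees[, 1]  # column selection: the result is a vector (here a factor)

print(first_row)
print(first_column)
```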

9.3 Selection (continued)

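A column can also be selected by name, as for a named list, and the resulting vector can be indexed in turn to reach a single cell. The output below shows the Name column, followed by its first element.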

## [1] Maria Pete  Sarah
## Levels: Maria Pete Sarah
## [1] Maria
## Levels: Maria Pete Sarah
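
Output of this shape is what the $ operator followed by vector indexing would produce. The sketch below again assumes a data frame named employees (a hypothetical name) with factor text columns.

```r
# Hypothetical data frame holding the values from section 9.1
employees <- data.frame(
  Name = c("Maria", "Pete", "Sarah"),
  Age  = c(47, 34, 32),
  Role = c("Professor", "Researcher", "Researcher"),
  stringsAsFactors = TRUE
)

all_names  <- employees$Name     # column by name, as for a named list
first_name <- employees$Name[1]  # then index the vector to reach one cell

print(all_names)
print(first_name)

# Double brackets select the same column, as they would for a list
same_column <- employees[["Name"]]
```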

9.4 Value assignment

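Values can be assigned to individual cells by combining filtering with the assignment operator <-. In the output below, Sarah's age has been changed from 32 to 33.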

##    Name Age       Role
## 1 Maria  47  Professor
## 2  Pete  34 Researcher
## 3 Sarah  33 Researcher
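
One way to obtain the updated table is to filter the rows with a logical condition and assign to the selected cell. This is a sketch under the same assumptions as before (a hypothetical data frame named employees); the original code may have selected the cell differently.

```r
# Hypothetical data frame holding the values from section 9.1
employees <- data.frame(
  Name = c("Maria", "Pete", "Sarah"),
  Age  = c(47, 34, 32),
  Role = c("Professor", "Researcher", "Researcher"),
  stringsAsFactors = TRUE
)

# Keep the rows where Name is "Sarah", pick the Age column, and overwrite the value
employees[employees$Name == "Sarah", "Age"] <- 33

print(employees)
```

Filtering by condition rather than by row number keeps the assignment correct even if the rows are later reordered.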

9.5 Column processing

Operations can be performed on whole columns at once, and the result can be stored as a new column. In the output below, Year_of_birth is derived from Age.

##    Name Age       Role Year_of_birth
## 1 Maria  47  Professor          1973
## 2  Pete  34 Researcher          1986
## 3 Sarah  33 Researcher          1987
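
The new column can be computed with ordinary vectorised arithmetic. In this sketch the reference year 2020 is an assumption inferred from the printed values (for example 2020 - 47 = 1973), and the names employees and reference_year are hypothetical.

```r
# Hypothetical data frame holding the values after the assignment in 9.4
employees <- data.frame(
  Name = c("Maria", "Pete", "Sarah"),
  Age  = c(47, 34, 33),
  Role = c("Professor", "Researcher", "Researcher"),
  stringsAsFactors = TRUE
)

reference_year <- 2020  # assumed; inferred from the output, not stated in the text

# The subtraction is applied to every element of the Age column
employees$Year_of_birth <- reference_year - employees$Age

print(employees)
```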

9.6 tibble

A tibble is a modern reimagining of the data.frame, provided by the tibble package within the tidyverse. Compared with a data.frame:

  • they do less
    • don’t change variable names or types
    • don’t do partial matching
  • complain more
    • e.g. when a variable does not exist

This forces you to confront problems earlier, typically leading to cleaner, more expressive code.
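
The differences listed above can be seen side by side in a short sketch. It assumes the tibble package is installed; the object names and values are made up for illustration.

```r
library(tibble)  # assumes the tibble package is installed

df <- data.frame(Name = c("Maria", "Pete"), `Year of birth` = c(1973, 1986))
tb <- tibble(Name = c("Maria", "Pete"), `Year of birth` = c(1973, 1986))

# Variable names: data.frame rewrites non-syntactic names, tibble keeps them
df_names <- names(df)  # "Name" "Year.of.birth"
tb_names <- names(tb)  # "Name" "Year of birth"

# Partial matching: data.frame silently completes "Na" to "Name"
df_partial <- df$Na

# tibble does not partially match: it returns NULL and raises a warning,
# which is captured here so the script keeps running
tb_warning <- tryCatch(tb$Na, warning = function(w) conditionMessage(w))

print(df_names)
print(tb_names)
print(tb_warning)
```

With a plain data.frame, a mistyped column name may go unnoticed; the tibble warning brings the mistake to the surface at the point where it happens.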