R Data Types & Data Structures

Vector, List, Matrix and Array

Ha Khanh Nguyen

Vector

  • In R, vectors come in two kinds: atomic vectors and lists.

Atomic Vector

  • There are 6 types of atomic vectors in R:
    • Double
    • Integer
    • Character
    • Logical
    • Complex
    • Raw

Double Vector

  • double means a regular number, with or without a decimal part.
  • The double data type is also called numeric.
  • R stores any number you type as a double vector of length 1.
## [1] 4.5
## [1] TRUE
## [1] "double"
## [1] 1 2 3 4 5 6
  • die is a double vector of length 6.
## [1] "double"
  • typeof() returns the type of the data (double, integer, character, logical, complex, or raw).
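The output above can be reproduced with a short sketch like the one below. The name die comes from the slide; the name x and the use of is.vector() for the TRUE line are assumptions, since the original code is not shown.

```r
# A number typed on its own is stored as a double vector of length 1.
x <- 4.5            # `x` is a hypothetical name
x
is.vector(x)        # assumed check; the slide only shows TRUE
typeof(x)

# c() combines values into a longer vector; plain numbers stay double.
die <- c(1, 2, 3, 4, 5, 6)
die
length(die)
typeof(die)
```

Note that writing 1:6 instead of c(1, 2, 3, 4, 5, 6) would give an integer vector, not a double one.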

Integer Vector

  • integer refers to numbers that can be written without a decimal component.
  • By default, R stores any number as double.
  • To specify that you want a number to be stored as integer:
## [1] -1  2  4
## [1] "integer"
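One common way to request integer storage is the L suffix. The sketch below assumes that is what produced the output above; the name int_vec is hypothetical.

```r
# The L suffix tells R to store each number as an integer.
int_vec <- c(-1L, 2L, 4L)   # `int_vec` is a hypothetical name
int_vec
typeof(int_vec)

# Without the suffix, the same values are stored as double.
typeof(c(-1, 2, 4))
```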

Character Vector

  • Character vector stores small pieces of text.
## [1] "Hello" "World"
## [1] "character"
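A minimal sketch that would give the output above, assuming the vector was built with c(); the name greeting is hypothetical.

```r
# Text goes inside quotes; each quoted string is one element.
greeting <- c("Hello", "World")   # `greeting` is a hypothetical name
greeting
typeof(greeting)
length(greeting)                  # counts strings, not characters
```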

Logical Vector

  • Logical vector stores TRUE and FALSE values, also known as the Boolean data type.
  • In arithmetic, TRUE is treated as 1 and FALSE as 0.
## [1]  TRUE  TRUE FALSE
## [1] 2
## [1] "logical"
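The TRUE = 1, FALSE = 0 rule can be seen with sum(), which is assumed here to be the function behind the 2 in the output above. The name logic is hypothetical.

```r
logic <- c(TRUE, TRUE, FALSE)   # `logic` is a hypothetical name
logic

# sum() counts the TRUE values because TRUE is treated as 1.
sum(logic)

# mean() gives the proportion of TRUE values for the same reason.
mean(logic)

typeof(logic)
```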

Complex and Raw Vector

  • Complex and raw vectors are not common in R.
  • Complex vector stores complex numbers.
## [1] 1+2i
## [1] "complex"
  • Raw vector stores raw bytes of data.
## [1] 00 00 00
## [1] "raw"
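A short sketch of both types, assuming the complex value was typed directly and the raw vector was created with raw(); the names comp and raw_vec are hypothetical.

```r
# The i suffix marks the imaginary part of a complex number.
comp <- 1 + 2i        # `comp` is a hypothetical name
comp
typeof(comp)

# raw(n) creates a raw vector of n zero bytes.
raw_vec <- raw(3)     # `raw_vec` is a hypothetical name
raw_vec
typeof(raw_vec)
```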

List

  • How are lists different from atomic vectors?
Atomic Vectors                                   | Lists
- Group data into a one-dimensional array        | - Group data into a one-dimensional array
- Group individual values of the same data type  | - Group R objects, such as atomic vectors and other lists
## [1] "1"    "one"  "TRUE"
## [[1]]
## [1] 1
## 
## [[2]]
## [1] "one"
## 
## [[3]]
## [1] TRUE
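The two outputs above show the difference in practice. The sketch below assumes they came from passing the same three values to c() and to list(); the names vec and lst are hypothetical.

```r
# c() must produce one data type, so every value is coerced to character.
vec <- c(1, "one", TRUE)      # `vec` is a hypothetical name
vec
typeof(vec)

# list() keeps each value as its own object with its own type.
lst <- list(1, "one", TRUE)   # `lst` is a hypothetical name
lst
typeof(lst[[1]])
typeof(lst[[3]])
```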

Creating a List

  • list() creates a list in the same way c() creates a vector.
  • Elements of a list can differ in length, dimension, and type of object.
## [[1]]
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12
## 
## [[2]]
## [1] "2020"
## 
## [[3]]
## [[3]][[1]]
## [1] "january"
## 
## [[3]][[2]]
## [1] 2 3 4 5 6 7 8
## [[1]]
## [1] "stat385"    "spring2020"
## 
## [[2]]
## [1] 1 2 3 4 5
## 
## [[3]]
## [[3]][[1]]
## [1] "homework"
## 
## [[3]][[2]]
## [1] "projects"
## [[1]]
## [[1]][[1]]
## [1] "Ha"
## 
## [[1]][[2]]
## [1] 24
## 
## 
## [[2]]
## [[2]][[1]]
## [[2]][[1]][[1]]
## [1] "Alex"
## 
## [[2]][[1]][[2]]
## [1] 20
## 
## 
## [[2]][[2]]
## [[2]][[2]][[1]]
## [1] "Dave"
## 
## [[2]][[2]][[2]]
## [1] 21
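The three printed lists above could be built roughly as follows. This is a reconstruction, not the original code: big_list and stat385 are names used later in the slides, while course is a hypothetical name for the second list, and as.numeric() is an assumption made so that the first element is of type double.

```r
# A list holding a double vector, a string, and a nested list.
big_list <- list(as.numeric(1:12), "2020", list("january", 2:8))
big_list

# `course` is a hypothetical name for the second list shown above.
course <- list(c("stat385", "spring2020"), 1:5, list("homework", "projects"))
course

# Lists can be nested several levels deep.
stat385 <- list(
  list("Ha", 24),
  list(list("Alex", 20), list("Dave", 21))
)
stat385
```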

Selecting Values in a List

  • How do we select an element of a list?
## [[1]]
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12
  • big_list[1] returns a list of length 1 containing an atomic vector of type double.
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12
  • big_list[[1]] returns an atomic vector of type double.

  • Notes on reading the printed output:
    • [[1]] marks the first element of a list.
    • [1] marks the first element of a vector.
  • Do you notice a difference in the code below?

## [1] "stat385"    "spring2020"
## [1] "stat385"
## [[1]]
## [1] "stat385"    "spring2020"
## [[1]]
## [1] "stat385"    "spring2020"
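The difference between [ ] and [[ ]] can be checked with typeof(). The sketch below is a reconstruction under assumptions: big_list is rebuilt so the block runs on its own, and course is a hypothetical name for the list that starts with "stat385" and "spring2020".

```r
big_list <- list(as.numeric(1:12), "2020", list("january", 2:8))

# Single brackets keep the list wrapper; double brackets extract the element.
typeof(big_list[1])
typeof(big_list[[1]])

# `course` is a hypothetical name for the list printed above.
course <- list(c("stat385", "spring2020"), 1:5, list("homework", "projects"))

course[[1]]      # the character vector itself
course[[1]][1]   # first element of that vector
course[1]        # a list of length 1
course[1][1]     # still the same list of length 1
```

Applying [1] to a list of length 1 returns the same list, which is why the last two results are printed identically.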
  • Using [[]] and [], we can access any element in a list.
## [[1]]
## [[1]][[1]]
## [1] "Alex"
## 
## [[1]][[2]]
## [1] 20
## 
## 
## [[2]]
## [[2]][[1]]
## [1] "Dave"
## 
## [[2]][[2]]
## [1] 21
## [[1]]
## [1] "Alex"
## 
## [[2]]
## [1] 20
## [1] "Alex"
  • Question: How do we get Ha’s age from stat385?
## [[1]]
## [1] "Ha"
## 
## [[2]]
## [1] 24
## [1] 24
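Chaining [[ ]] walks down a nested list one level at a time. The sketch below rebuilds stat385 from the printed output (an assumption about how it was created) and shows index chains that would give the results above.

```r
stat385 <- list(
  list("Ha", 24),
  list(list("Alex", 20), list("Dave", 21))
)

stat385[[2]]             # the list holding Alex and Dave
stat385[[2]][[1]]        # Alex's record
stat385[[2]][[1]][[1]]   # Alex's name

# Ha's age: first record, second element.
stat385[[1]]
stat385[[1]][[2]]
```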

Matrix

Creating a Matrix

  • Matrix stores values of the same data type in a two-dimensional array.
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
  • To create a matrix, supply matrix() with an atomic vector to reorganize into a matrix and specify how many rows/columns should be in the matrix.
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
  • What about specifying both the number of rows and columns?
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
  • By default, matrix() will fill up the matrix column by column. We can fill the matrix row by row by including the argument byrow = TRUE:
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
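The matrix() calls below are a sketch of how the outputs above could be produced. All variable names are hypothetical, and the 2 x 2 case assumes the data vector had exactly four values.

```r
# Give the data and either the number of rows or the number of columns.
m_rows <- matrix(1:6, nrow = 2)
m_cols <- matrix(1:6, ncol = 3)
m_rows

# Both dimensions can be given; here the data length matches 2 * 2.
m_both <- matrix(1:4, nrow = 2, ncol = 2)
m_both

# byrow = TRUE fills across each row instead of down each column.
m_byrow <- matrix(1:6, nrow = 2, byrow = TRUE)
m_byrow
```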

Selecting an Element in a Matrix

  • To select element(s) in a matrix, use [ , ]:
## [1] 1
## [1] 5
## [1] 1 2 3
## [1] 3 6
##      [,1] [,2]
## [1,]    1    2
## [2,]    4    5
## [1] 1 3
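The outputs above are consistent with indexing the row-filled matrix from the previous slide. The sketch below assumes that matrix and uses the hypothetical name m; the first index selects rows and the second selects columns.

```r
m <- matrix(1:6, nrow = 2, byrow = TRUE)   # `m` is a hypothetical name

m[1, 1]         # row 1, column 1
m[2, 2]         # row 2, column 2
m[1, ]          # all of row 1
m[, 3]          # all of column 3
m[, 1:2]        # columns 1 and 2, still a matrix
m[1, c(1, 3)]   # row 1, columns 1 and 3
```

Leaving an index blank selects everything along that dimension.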

Array

Creating an Array

  • The array() function creates an \(n\)-dimensional array.
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    3    5    7    9   11
## [2,]    2    4    6    8   10   12
  • What happens if we give more data points than the dimension allows?
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    3    5    7    9   11
## [2,]    2    4    6    8   10   12
  • my_array is a two-dimensional array, which is a matrix.
  • What about a three-dimensional array?
## , , 1
## 
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
## 
## , , 2
## 
##      [,1] [,2]
## [1,]    5    7
## [2,]    6    8
## 
## , , 3
## 
##      [,1] [,2]
## [1,]    9   11
## [2,]   10   12
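The three outputs above could come from calls like the ones below. my_array is the name used on the slide; truncated and arr3 are hypothetical names, and the data vector 1:20 is an assumed example of supplying more values than the dimensions can hold.

```r
# dim = c(2, 6) gives 2 rows and 6 columns.
my_array <- array(1:12, dim = c(2, 6))
my_array
is.matrix(my_array)

# Extra values beyond 2 * 6 = 12 are dropped.
truncated <- array(1:20, dim = c(2, 6))
truncated

# Three dimensions: 2 rows, 2 columns, 3 layers.
arr3 <- array(1:12, dim = c(2, 2, 3))
arr3
```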

Selecting an Element in an Array

  • For a three-dimensional array, use [ , , ]:
## [1] 1
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
## [1] 1 2
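The indexing calls below would give the three results above, assuming the 2 x 2 x 3 array from the previous slide. The name arr3 is hypothetical.

```r
arr3 <- array(1:12, dim = c(2, 2, 3))   # `arr3` is a hypothetical name

arr3[1, 1, 1]   # row 1, column 1, layer 1
arr3[, , 1]     # the whole first layer, a 2 x 2 matrix
arr3[, 1, 1]    # column 1 of the first layer
```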

Summary

  • Elements of an atomic vector, a matrix, or an array have to be of the same data type.
  • Elements of a list can be different kinds of R objects.
  • Atomic vector and list store data in a one-dimensional array.
  • Matrix stores data in a two-dimensional array.
  • Array can store data in an \(n\)-dimensional array.

Extra Question

  • What is the difference between data types and R objects?
    • Data types: basic elements of a programming language.
      • Think double, integer, character, logical, complex, and raw.
    • R objects: bigger structure, containing elements that belong to one of the data types.
      • Think atomic vector, list, matrix, array, dataframe, etc.

Attributes

  • Attribute = extra information about the object.
  • Attributes won’t affect the values of the object.
  • Think of attributes as “metadata”.
  • Use attributes() to see which attributes an object has.
## NULL
  • NULL means the object has no attributes.
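A minimal sketch, assuming the object being inspected is the die vector from earlier in the slides:

```r
die <- c(1, 2, 3, 4, 5, 6)

# A plain atomic vector carries no attributes, so this returns NULL.
attributes(die)
is.null(attributes(die))
```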

Names

  • Names is one of the most common attributes of an R object.
  • names() is the helper function associated with the names attribute.
## NULL
## $names
## [1] "one"   "two"   "three" "four"  "five"  "six"
##   one   two three  four  five   six 
##     1     2     3     4     5     6
##   one   two three  four  five   six 
##     2     3     4     5     6     7
## small small small   big   big   big 
##     1     2     3     4     5     6
## [1] 1 2 3 4 5 6
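The sequence of outputs above can be reproduced with the sketch below, assuming the object is the die vector. The name plus_one is hypothetical and is added only so the result of the arithmetic can be inspected.

```r
die <- c(1, 2, 3, 4, 5, 6)
names(die)                    # no names yet

# Assigning to names() attaches a name to each element.
names(die) <- c("one", "two", "three", "four", "five", "six")
attributes(die)
die

# Names are metadata: arithmetic changes the values, not the names.
plus_one <- die + 1           # `plus_one` is a hypothetical name
plus_one

# Names can be replaced, and need not be unique.
names(die) <- c("small", "small", "small", "big", "big", "big")
die

# Assigning NULL removes the names attribute.
names(die) <- NULL
die
```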

Dimension

  • Another common attribute is dimension.
  • dim() returns the dimensions of an R object.
## NULL
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
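The output above is consistent with setting the dim attribute on the die vector, which is the assumption made in this sketch.

```r
die <- c(1, 2, 3, 4, 5, 6)
dim(die)              # a plain vector has no dim attribute

# Assigning c(2, 3) reshapes the vector into 2 rows and 3 columns.
dim(die) <- c(2, 3)
die
is.matrix(die)
```

The values are unchanged; only the attribute tells R to treat them as a matrix, filled column by column.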

To-do

  • Make sure your R and RStudio are working.
  • Redo in-class examples on your own.
  • Continue reading Chapter 3 of Hands-On Programming with R.
