In this lecture, we will talk about loops. Loops are a powerful way to perform a task repeatedly (for example, thousands of times). Computers are very good at repetitive tasks, whereas we humans are pretty bad at them. Although the examples here are in R, the logic applies to almost any programming language.
A loop repeats two steps: evaluation and execution. In the evaluation step, we tell the computer to test a specific condition, which has two possible outcomes: satisfied or not satisfied. For each outcome, we then tell the computer what to do.
Three types of loops in R:
repeat
while
for
repeat
repeat is the simplest loop: it repeats the same expression over and over. To stop repeating the expression, use the keyword break. To skip to the next iteration of a loop, use the command next.
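As a minimal sketch (the counter i and the step of 5 are illustrative choices, not taken from the text), a repeat loop with a stopping condition might look like this:

```r
# Print multiples of 5 and stop once the counter reaches 25.
i <- 0
repeat {
  i <- i + 5
  print(i)
  if (i >= 25) break  # without this line the loop would never end
}
```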
## [1] 5
## [1] 10
## [1] 15
## [1] 20
## [1] 25
If you do not include a break command, the R code will be an infinite loop.
while
Another useful construction is the while loop, which repeats an expression while a condition is true:
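A minimal sketch of a while loop, again using an illustrative counter i that grows in steps of 5:

```r
# The condition is evaluated before every iteration;
# the body runs only while it is TRUE.
i <- 5
while (i <= 25) {
  print(i)
  i <- i + 5
}
```

When the loop ends, i holds the first value that failed the condition (30 here).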
## [1] 5
## [1] 10
## [1] 15
## [1] 20
## [1] 25
You can also use break and next inside while loops. The break statement is used to stop iterating through a loop. The next statement skips to the next loop iteration without evaluating the remaining expressions in the loop body.
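To see both statements at work, here is a small sketch; the names i and kept, and the cutoff of 9, are assumptions made only for illustration:

```r
# Collect the odd numbers below 10.
i <- 0
kept <- c()
while (TRUE) {
  i <- i + 1
  if (i %% 2 == 0) next  # even number: skip the rest of the body
  if (i > 9) break       # past the cutoff: leave the loop entirely
  kept <- c(kept, i)
}
print(kept)
```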
for
Results are not printed inside a loop unless you explicitly call the print function. Add print to display them.
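A short sketch of the difference, assuming we want to show the squares of 1 to 5. The first loop computes each square but shows nothing; the second uses print, which shows a quoted string with an index prefix; the third uses cat, which writes plain text:

```r
# Nothing is displayed: the value of i^2 is computed and discarded.
for (i in 1:5) {
  i^2
}

# print() displays each result as a quoted string.
for (i in 1:5) {
  print(paste0(i, "^2 = ", i^2))
}

# cat() writes the text as is, so a newline must be added by hand.
for (i in 1:5) {
  cat(i, "^2 = ", i^2, "\n", sep = "")
}
```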
## [1] "1^2 = 1"
## [1] "2^2 = 4"
## [1] "3^2 = 9"
## [1] "4^2 = 16"
## [1] "5^2 = 25"
## 1^2 = 1
## 2^2 = 4
## 3^2 = 9
## 4^2 = 16
## 5^2 = 25
a <- c(2, 6)
for (i in 1:2) {
  print(paste('i', '=', i, sep=' '))
  print(paste(c('a', '=', a), collapse=' '))
  cat('execution\n')
  a[i] <- a[i] + 10
  print(paste(c('a', '=', a), collapse=' '))
  print(paste('i', '=', i, sep=' '))
  cat('next\n\n')
}
## [1] "i = 1"
## [1] "a = 2 6"
## execution
## [1] "a = 12 6"
## [1] "i = 1"
## next
##
## [1] "i = 2"
## [1] "a = 12 6"
## execution
## [1] "a = 12 16"
## [1] "i = 2"
## next
## [1] 12 16
## [1] 12 16
The variable var that is set in a for loop is changed in the calling environment: after the loop finishes, it keeps the last value it was assigned.
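A minimal sketch of this behavior, where the starting value of 100 is an arbitrary assumption:

```r
i <- 100          # value before the loop
for (i in 1:2) {
  # the body can be empty; the loop still assigns to i
}
print(i)          # i is now 2, not 100
```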
## [1] 2
## [1] "2" "6"
## [1] "2" "6"
## [1] "Year 2016"
## [1] "Year 2017"
## [1] "Year 2018"
## [1] "Year 2019"
## [1] "Year 2020"
## [1] "Year 2021"
## [1] "Year 2022"
## [1] "Year 2023"
## [1] "Year 2024"
## [1] "Year 2025"
## [1] "Year 2016" "Year 2017" "Year 2018" "Year 2019" "Year 2020" "Year 2021"
## [7] "Year 2022" "Year 2023" "Year 2024" "Year 2025"
Besides loops such as for(), base R includes a set of functions that apply a function to each element (or a subset of elements) of an object.
To apply a function to parts of an array (or matrix), use the apply function:
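A minimal sketch, assuming a 5 x 4 matrix filled with 1:20. The second argument of apply selects the margin: 1 means rows, 2 means columns.

```r
x <- matrix(1:20, nrow = 5)

# Maximum of each row (margin 1) and of each column (margin 2).
row_max <- apply(x, 1, max)
col_max <- apply(x, 2, max)

print(row_max)
print(col_max)
```

Each call replaces an explicit for loop over the rows or columns of x.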
## [,1] [,2] [,3] [,4]
## [1,] 1 6 11 16
## [2,] 2 7 12 17
## [3,] 3 8 13 18
## [4,] 4 9 14 19
## [5,] 5 10 15 20
## [1] 16
## [1] 17
## [1] 18
## [1] 19
## [1] 20
## [1] 16 17 18 19 20
## [1] 5
## [1] 10
## [1] 15
## [1] 20
## [1] 5 10 15 20
A more complicated example:
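One way to set this up, assuming a 3 x 3 x 3 array filled with 1:27 and paste as the function to apply (both are assumptions chosen for illustration):

```r
x <- array(1:27, dim = c(3, 3, 3))

# Collapse everything that shares the same first index into one string.
by_first <- apply(x, 1, paste, collapse = ", ")

# Margins can be combined: one string per (row, column) pair,
# collapsing over the third dimension only.
by_cell <- apply(x, c(1, 2), paste, collapse = ", ")

print(by_first)
print(by_cell)
```

The extra argument collapse = ", " is passed on to paste by apply.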
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 10 13 16
## [2,] 11 14 17
## [3,] 12 15 18
##
## , , 3
##
## [,1] [,2] [,3]
## [1,] 19 22 25
## [2,] 20 23 26
## [3,] 21 24 27
## [1] "1, 4, 7, 10, 13, 16, 19, 22, 25" "2, 5, 8, 11, 14, 17, 20, 23, 26"
## [3] "3, 6, 9, 12, 15, 18, 21, 24, 27"
## [1] "1, 2, 3, 10, 11, 12, 19, 20, 21" "4, 5, 6, 13, 14, 15, 22, 23, 24"
## [3] "7, 8, 9, 16, 17, 18, 25, 26, 27"
## [1] "1, 2, 3, 4, 5, 6, 7, 8, 9" "10, 11, 12, 13, 14, 15, 16, 17, 18"
## [3] "19, 20, 21, 22, 23, 24, 25, 26, 27"
## [,1] [,2] [,3]
## [1,] "1, 10, 19" "4, 13, 22" "7, 16, 25"
## [2,] "2, 11, 20" "5, 14, 23" "8, 17, 26"
## [3,] "3, 12, 21" "6, 15, 24" "9, 18, 27"
To apply a function to each element in a vector or a list and return a list, you can use the function lapply. The function lapply requires two arguments: an object X and a function FUN. (You may specify additional arguments that will be passed to FUN.)
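A minimal sketch of these calls; the list d and the functions used are assumptions made for illustration:

```r
# One list element per input element.
powers <- lapply(1:5, function(i) 2^i)

# With a named list, the names are kept in the result.
d <- list(x = 1:5, y = 6:10)
powers_d <- lapply(d, function(v) 2^v)
means <- lapply(d, mean)

# Additional arguments after FUN are passed on to it (here na.rm to mean).
means_no_na <- lapply(d, mean, na.rm = TRUE)

print(powers)
print(means)
```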
## [[1]]
## [1] 2
##
## [[2]]
## [1] 4
##
## [[3]]
## [1] 8
##
## [[4]]
## [1] 16
##
## [[5]]
## [1] 32
## $x
## [1] 2 4 8 16 32
##
## $y
## [1] 64 128 256 512 1024
## $x
## [1] 3
##
## $y
## [1] 8
Sometimes, you might prefer to get a vector, matrix, or array instead of a list. To do this, use the sapply function. This function works exactly the same way as lapply, except that it returns a vector or matrix (when appropriate).
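A short sketch of the difference, using an assumed list d with two numeric vectors:

```r
d <- list(x = 1:5, y = 6:10)

# Every result has length 5, so sapply binds them into a 5 x 2 matrix.
m <- sapply(d, function(v) 2^v)

# Every result has length 1, so sapply returns a named vector.
s <- sapply(d, mean)

print(m)
print(s)
```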
## x y
## [1,] 2 64
## [2,] 4 128
## [3,] 8 256
## [4,] 16 512
## [5,] 32 1024
## x y
## [1,] 2 64
## [2,] 4 128
## [3,] 8 256
## [4,] 16 512
## [5,] 32 1024
Another related function is mapply, the “multivariate” version of sapply.
mapply(paste,
       c(1, 2, 3, 4, 5),
       c("a","b","c","d","e"),
       c("A","B","C","D","E"),
       MoreArgs = list(sep = "-")
)
## [1] "1-a-A" "2-b-B" "3-c-C" "4-d-D" "5-e-E"
The functions mentioned above are all from base R. The popular tidyverse packages also have functions that are similar to the apply family of functions.
Data frames: work with a data frame row by row.
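A minimal sketch using dplyr's rowwise, assuming a data frame of random numbers with columns x, y, and z (the seed and the column m are illustrative choices):

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only to make the sketch reproducible
df <- data.frame(x = runif(6), y = runif(6), z = runif(6))

# rowwise() makes the following mutate() operate one row at a time,
# so mean() sees only the three values of the current row.
df_means <- df %>%
  rowwise() %>%
  mutate(m = mean(c(x, y, z)))

print(df_means)
```

Without rowwise(), mean(c(x, y, z)) would be computed once over all values in the three columns. Calling ungroup() afterwards drops the row-wise grouping.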
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## x y z
## 1 0.2543282 0.7899347 0.7909086
## 2 0.5812300 0.4571883 0.3104377
## 3 0.9034680 0.6047601 0.5428415
## 4 0.3123348 0.2626777 0.5094999
## 5 0.6594451 0.5237566 0.6582903
## 6 0.9122501 0.3042793 0.1354054
## # A tibble: 6 × 4
## # Rowwise:
## x y z m
## <dbl> <dbl> <dbl> <dbl>
## 1 0.254 0.790 0.791 0.612
## 2 0.581 0.457 0.310 0.450
## 3 0.903 0.605 0.543 0.684
## 4 0.312 0.263 0.509 0.362
## 5 0.659 0.524 0.658 0.614
## 6 0.912 0.304 0.135 0.451
Work with some columns that meet specific conditions.
## 'data.frame': 6 obs. of 5 variables:
## $ x: num 0.254 0.581 0.903 0.312 0.659 ...
## $ y: num 0.79 0.457 0.605 0.263 0.524 ...
## $ z: num 0.791 0.31 0.543 0.509 0.658 ...
## $ a: chr "a" "b" "c" "d" ...
## $ b: chr "A" "B" "C" "D" ...
# convert columns with characters to factors
for (i in seq_len(ncol(df))) {
  if (is.character(df[, i])) {
    df[, i] <- as.factor(df[, i])
  }
}
str(df)
## 'data.frame': 6 obs. of 5 variables:
## $ x: num 0.254 0.581 0.903 0.312 0.659 ...
## $ y: num 0.79 0.457 0.605 0.263 0.524 ...
## $ z: num 0.791 0.31 0.543 0.509 0.658 ...
## $ a: Factor w/ 6 levels "a","b","c","d",..: 1 2 3 4 5 6
## $ b: Factor w/ 6 levels "A","B","C","D",..: 1 2 3 4 5 6
# dplyr version: convert the factors back to characters
df = mutate_if(.tbl = df, .predicate = is.factor, .funs = as.character)
str(df)
## 'data.frame': 6 obs. of 5 variables:
## $ x: num 0.254 0.581 0.903 0.312 0.659 ...
## $ y: num 0.79 0.457 0.605 0.263 0.524 ...
## $ z: num 0.791 0.31 0.543 0.509 0.658 ...
## $ a: chr "a" "b" "c" "d" ...
## $ b: chr "A" "B" "C" "D" ...
Work with vectors, lists, or several of them at once.
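For a single vector or list, the purrr package provides map and its typed variants. A minimal sketch, with inputs chosen only for illustration:

```r
library(purrr)

# map_dbl() returns a numeric vector instead of a list.
squares <- map_dbl(1:5, function(v) v^2)

# map_chr() returns a character vector.
upper <- map_chr(c("a", "b", "c"), toupper)

# map() always returns a list, like lapply().
big <- map(1:5, function(v) v > 2)

print(squares)
print(upper)
```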
## [1] 1 4 9 16 25
## [1] 1 4 9 16 25
## [1] "A" "B" "C"
## [[1]]
## [1] TRUE
##
## [[2]]
## [1] TRUE
##
## [[3]]
## [1] FALSE
##
## [[4]]
## [1] TRUE
##
## [[5]]
## [1] FALSE
Two variables.
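With two inputs, purrr's map2 family walks through both in parallel. A minimal sketch, with assumed input vectors:

```r
library(purrr)

# Pair up the elements of two vectors and paste each pair together.
joined <- map2_chr(c("a", "b", "c"), c("A", "B", "C"),
                   function(lo, up) paste(lo, up, sep = "-"))

# Element-wise difference of two numeric vectors.
diffs <- map2_dbl(c(1, 2, 3), c(1, 2, 3), function(a, b) a - b)

print(joined)
print(diffs)
```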
## [1] "a-A" "b-B" "c-C"
## [1] 0 0 0
More than two variables.
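For any number of inputs, purrr's pmap takes a list of vectors. A minimal sketch; the values of x, y, and z are assumptions chosen to be consistent with the results shown in this section:

```r
library(purrr)

x <- c(1, 1, 1)
y <- c(10, 20, 30)
z <- c(100, 200, 300)

# An unnamed list is matched to the function's arguments by position.
totals <- pmap(list(x, y, z), sum)

# A named list is matched by name, so the order in the list does not matter.
by_name <- pmap_dbl(list(second = y, first = x, third = z),
                    function(first, second, third) (first + third) * second)

print(totals)
print(by_name)
```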
## [[1]]
## [1] 111
##
## [[2]]
## [1] 221
##
## [[3]]
## [1] 331
# Matching arguments by position
purrr::pmap(list(x, y, z), function(first, second, third) (first + third) * second)
## [[1]]
## [1] 1010
##
## [[2]]
## [1] 4020
##
## [[3]]
## [1] 9030