Control structures in R allow you to control the flow of execution of a series of R expressions. Basically, control structures allow you to put some “logic” into your R code, rather than just always executing the same R code every time. Control structures allow you to respond to inputs or to features of the data and execute different R expressions accordingly.
Commonly used control structures are
if and else: testing a condition and acting on it
for: execute a loop a fixed number of times
while: execute a loop while a condition is true
repeat: execute an infinite loop (must break out of it to stop)
break: break the execution of a loop
next: skip an iteration of a loop
Most control structures are not used in interactive sessions, but rather when writing functions or longer expressions. However, these constructs do not have to be used in functions and it’s a good idea to become familiar with them before we delve into functions.
13.1 if-else
The if-else combination is probably the most commonly used control structure in R (or perhaps any language). This structure allows you to test a condition and act on it depending on whether it’s true or false.
For starters, you can just use the if statement.
if(<condition>) {
        ## do something
}
## Continue with rest of code
The above code does nothing if the condition is false. If you have an action you want to execute when the condition is false, then you need an else clause.
if(<condition>) {
        ## do something
} else {
        ## do something else
}
You can have a series of tests by following the initial if with any number of else ifs.
if(<condition1>) {
        ## do something
} else if(<condition2>) {
        ## do something different
} else {
        ## do something different
}
Here is an example of a valid if/else structure.
## Generate a uniform random number
x <- runif(1, 0, 10)
if(x > 3) {
        y <- 10
} else {
        y <- 0
}
The value of y is set depending on whether x > 3 or not. This expression can also be written a different, but equivalent, way in R.
y <- if(x > 3) {
        10
} else {
        0
}
Neither way of writing this expression is more correct than the other. Which one you use will depend on your preference and perhaps those of the team you may be working with.
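As a quick check of that equivalence, here is a small sketch that evaluates both forms on the same x and compares the results. The names y1, y2, and y3 and the seed value are made up for this illustration. It also shows one detail of the expression form: when there is no else clause and the condition is false, the if expression evaluates to NULL.

```r
## Fix the seed so the example is reproducible (the seed value is arbitrary)
set.seed(10)
x <- runif(1, 0, 10)

## Statement form: assign inside each branch
if(x > 3) {
        y1 <- 10
} else {
        y1 <- 0
}

## Expression form: assign the value of the whole if-else
y2 <- if(x > 3) {
        10
} else {
        0
}

identical(y1, y2)   ## TRUE, whichever branch was taken

## Without an else clause, a false condition yields NULL
y3 <- if(FALSE) 10
is.null(y3)         ## TRUE
```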
Of course, the else clause is not necessary. You could have a series of if clauses that always get executed if their respective conditions are true.
if(<condition1>) {

}

if(<condition2>) {

}
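The difference between a chain of else ifs and a series of separate if clauses is easiest to see with a value that satisfies more than one condition. The sketch below uses an arbitrary value of x and two made-up character vectors, chained and separate, to record which branches ran.

```r
x <- 7

## Chained tests: only the first branch whose condition is true runs
chained <- character(0)
if(x > 5) {
        chained <- c(chained, "greater than 5")
} else if(x > 2) {
        chained <- c(chained, "greater than 2")
} else {
        chained <- c(chained, "2 or less")
}

## Separate tests: every clause whose condition is true runs
separate <- character(0)
if(x > 5) {
        separate <- c(separate, "greater than 5")
}
if(x > 2) {
        separate <- c(separate, "greater than 2")
}

print(chained)    ## one entry
print(separate)   ## two entries
```

With x equal to 7, the chained version stops after the first match, while the separate version records both matches.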
13.2 for Loops
For loops are pretty much the only looping construct that you will need in R. While you may occasionally find a need for other types of loops, in my experience doing data analysis, I’ve found very few situations where a for loop wasn’t sufficient.
In R, for loops take an iterator variable and assign it successive values from a sequence or vector. For loops are most commonly used for iterating over the elements of an object (list, vector, etc.).
> for(i in 1:10) {
+         print(i)
+ }
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
This loop takes the i variable and in each iteration of the loop gives it values 1, 2, 3, …, 10, executes the code within the curly braces, and then the loop exits.
The following three loops all have the same behavior.
> x <- c("a", "b", "c", "d")
> 
> for(i in 1:4) {
+         ## Print out each element of 'x'
+         print(x[i])
+ }
[1] "a"
[1] "b"
[1] "c"
[1] "d"
The seq_along() function is commonly used in conjunction with for loops in order to generate an integer sequence based on the length of an object (in this case, the object x).
> ## Generate a sequence based on length of 'x'
> for(i in seq_along(x)) {
+         print(x[i])
+ }
[1] "a"
[1] "b"
[1] "c"
[1] "d"
It is not necessary to use an index-type variable.
> for(letter in x) {
+         print(letter)
+ }
[1] "a"
[1] "b"
[1] "c"
[1] "d"
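One reason seq_along() is often preferred over writing 1:length(x) by hand (a pattern not used above, shown here only for contrast) is how the two behave when the object is empty. The sketch below counts how many times each loop body runs; the counter names n_colon and n_seq are made up for the illustration.

```r
x <- character(0)   ## an empty vector

## 1:length(x) is 1:0, which counts down and gives c(1, 0)
n_colon <- 0
for(i in 1:length(x)) {
        n_colon <- n_colon + 1
}

## seq_along(x) is an empty sequence, so the body never runs
n_seq <- 0
for(i in seq_along(x)) {
        n_seq <- n_seq + 1
}

n_colon   ## 2
n_seq     ## 0
```

The first loop runs twice on an object with no elements, which is rarely what you want.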
For one-line loops, the curly braces are not strictly necessary.
> for(i in 1:4) print(x[i])
[1] "a"
[1] "b"
[1] "c"
[1] "d"
However, I like to use curly braces even for one-line loops, because that way if you decide to expand the loop to multiple lines, you won’t be burned because you forgot to add curly braces (and you will be burned by this).
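To make the danger concrete, here is a sketch of what happens when a one-line loop is expanded to two lines without adding braces. The counters count_no_braces and count_braces are made up for the illustration. Indentation means nothing to R, so only the first expression belongs to the loop.

```r
x <- c("a", "b", "c", "d")

## Without braces, only the first expression belongs to the loop
count_no_braces <- 0
for(i in 1:4)
        letter <- x[i]
        count_no_braces <- count_no_braces + 1   ## runs once, after the loop

## With braces, both expressions run on every iteration
count_braces <- 0
for(i in 1:4) {
        letter <- x[i]
        count_braces <- count_braces + 1
}

count_no_braces   ## 1
count_braces      ## 4
```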
13.3 Nested for loops
for loops can be nested inside of each other.
x <- matrix(1:6, 2, 3)

for(i in seq_len(nrow(x))) {
        for(j in seq_len(ncol(x))) {
                print(x[i, j])
        }
}
Nested loops are commonly needed for multidimensional or hierarchical data structures (e.g. matrices, lists). Be careful with nesting though. Nesting beyond 2 to 3 levels often makes it difficult to read/understand the code. If you find yourself in need of a large number of nested loops, you may want to break up the loops by using functions (discussed later).
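As a rough sketch of what breaking up a nested loop can look like, the code below computes row totals of the same matrix twice: once with two nested loops, and once with the inner loop moved into a helper function. The function name sum_row is hypothetical and used only here; how to write functions is covered later.

```r
x <- matrix(1:6, 2, 3)

## Nested version: accumulate the sum of each row
row_totals <- numeric(nrow(x))
for(i in seq_len(nrow(x))) {
        for(j in seq_len(ncol(x))) {
                row_totals[i] <- row_totals[i] + x[i, j]
        }
}

## Same computation with the inner loop moved into a helper function
## (sum_row is a hypothetical name used only for this sketch)
sum_row <- function(m, i) {
        total <- 0
        for(j in seq_len(ncol(m))) {
                total <- total + m[i, j]
        }
        total
}

row_totals2 <- numeric(nrow(x))
for(i in seq_len(nrow(x))) {
        row_totals2[i] <- sum_row(x, i)
}

row_totals    ## 9 12
row_totals2   ## 9 12
```

The outer loop in the second version has only one level of nesting left to read.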
13.4 while Loops
While loops begin by testing a condition. If it is true, then they execute the loop body. Once the loop body is executed, the condition is tested again, and so forth, until the condition is false, after which the loop exits.
> count <- 0
> while(count < 10) {
+         print(count)
+         count <- count + 1
+ }
[1] 0
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
While loops can potentially result in infinite loops if not written properly. Use with care!
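One defensive pattern, sketched here under my own assumptions rather than as a standard recipe, is to add a second condition that caps the number of iterations. In the code below the loop body deliberately forgets to update count, so the cap is the only thing that stops the loop. The names iterations and max_iterations and the limit of 1000 are arbitrary choices.

```r
count <- 0
iterations <- 0
max_iterations <- 1000   ## arbitrary safety limit chosen for this sketch

while(count < 10 && iterations < max_iterations) {
        ## Bug (deliberate, for illustration): 'count' is never incremented,
        ## so 'count < 10' alone would stay true forever
        iterations <- iterations + 1
}

iterations   ## 1000
```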
Sometimes there will be more than one condition in the test.
> z <- 5
> set.seed(1)
> 
> while(z >= 3 && z <= 10) {
+         coin <- rbinom(1, 1, 0.5)
+         
+         if(coin == 1) {  ## random walk
+                 z <- z + 1
+         } else {
+                 z <- z - 1
+         }
+ }
> print(z)
[1] 2
Conditions are always evaluated from left to right. For example, in the above code, if z were less than 3, the second test would not have been evaluated, because the && operator stops as soon as the overall result is known.
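You can observe this behavior directly with a test that records whether it was called. In the sketch below, second_test() and the flag second_test_ran are hypothetical names invented for the demonstration. For contrast, the single & operator evaluates both sides.

```r
second_test_ran <- FALSE

## Hypothetical helper that records whether it was called
second_test <- function(z) {
        second_test_ran <<- TRUE
        z <= 10
}

z <- 2

## With &&, a false left-hand side means the right-hand side is skipped
result <- z >= 3 && second_test(z)
result                  ## FALSE
ran_with_double <- second_test_ran
ran_with_double         ## FALSE

## With &, both sides are evaluated
result2 <- (z >= 3) & second_test(z)
result2                 ## FALSE
second_test_ran         ## TRUE
```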
13.5 repeat Loops
repeat initiates an infinite loop right from the start. These are not commonly used in statistical or data analysis applications but they do have their uses. The only way to exit a repeat loop is to call break.
One possible paradigm might be in an iterative algorithm where you may be searching for a solution and you don’t want to stop until you’re close enough to the solution. In this kind of situation, you often don’t know in advance how many iterations it’s going to take to get “close enough” to the solution.
x0 <- 1
tol <- 1e-8

repeat {
        x1 <- computeEstimate()

        if(abs(x1 - x0) < tol) {  ## Close enough?
                break
        } else {
                x0 <- x1
        }
}
Note that the above code will not run if the computeEstimate() function is not defined (I just made it up for the purposes of this demonstration).
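If you want something you can actually run, here is one possible stand-in. It assumes that computeEstimate() takes the current estimate as an argument, which differs from the made-up call above, and it uses one Babylonian (averaging) step toward the square root of 2 as the update. Both choices are mine, purely for illustration.

```r
## Stand-in for computeEstimate(): one Babylonian step toward sqrt(2).
## Taking the current estimate as an argument is an assumption of this sketch.
computeEstimate <- function(x) {
        (x + 2 / x) / 2
}

x0 <- 1
tol <- 1e-8

repeat {
        x1 <- computeEstimate(x0)

        if(abs(x1 - x0) < tol) {  ## Close enough?
                break
        } else {
                x0 <- x1
        }
}

x1   ## close to sqrt(2)
```

This particular update settles down quickly, so the loop exits after a handful of iterations.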
The loop above is a bit dangerous because there’s no guarantee it will stop. You could get in a situation where the values of x0 and x1 oscillate back and forth and never converge. Better to set a hard limit on the number of iterations by using a for loop and then report whether convergence was achieved or not.
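A minimal sketch of that safer pattern follows. To show the limit doing its job, it uses a hypothetical update rule that flips between 1 and 0 forever. The names max_iter and converged, the limit of 50, and the one-argument form of computeEstimate() are all assumptions made for this example.

```r
## Hypothetical update rule that flips between 1 and 0 and never settles
computeEstimate <- function(x) {
        1 - x
}

x0 <- 1
tol <- 1e-8
max_iter <- 50        ## hard limit, chosen arbitrarily for this sketch
converged <- FALSE

for(i in seq_len(max_iter)) {
        x1 <- computeEstimate(x0)

        if(abs(x1 - x0) < tol) {  ## Close enough?
                converged <- TRUE
                break
        }
        x0 <- x1
}

## Report whether convergence was achieved
if(converged) {
        message("Converged after ", i, " iterations")
} else {
        message("No convergence after ", max_iter, " iterations")
}
```

Here the loop gives up after 50 iterations and says so, instead of running forever.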
13.6 next, break
next is used to skip an iteration of a loop.
for(i in 1:100) {
        if(i <= 20) {
                ## Skip the first 20 iterations
                next
        }
        ## Do something here
}
break is used to exit a loop immediately, regardless of what iteration the loop may be on.
for(i in 1:100) {
        print(i)

        if(i >= 20) {
                ## Stop loop after 20 iterations
                break
        }
}
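The two statements are often used together. The sketch below, with a made-up running total and an arbitrary cutoff of 50, skips even numbers with next and stops with break as soon as adding the next odd number would push the total past the cutoff.

```r
total <- 0

for(i in 1:100) {
        if(i %% 2 == 0) {
                ## Skip even numbers
                next
        }
        if(total + i > 50) {
                ## Stop before the total exceeds 50
                break
        }
        total <- total + i
}

total   ## 49, the sum of 1, 3, 5, 7, 9, 11, 13
i       ## 15, the value on which the loop stopped
```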
13.7 Summary
Control structures like if, while, and for allow you to control the flow of an R program.
Infinite loops should generally be avoided, even if (you believe) they are theoretically correct.
Control structures mentioned here are primarily useful for writing programs; for command-line interactive work, the “apply” functions are more useful.
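As a small taste of that last point, here is the same computation written once with a for loop and once with sapply(), one of the “apply” functions. The vector and the variable names are arbitrary, and this is only a sketch.

```r
x <- c(1, 4, 9, 16)

## Loop version: preallocate the result, then fill it in
roots_loop <- numeric(length(x))
for(i in seq_along(x)) {
        roots_loop[i] <- sqrt(x[i])
}

## "apply" version: sapply() calls the function on each element
roots_apply <- sapply(x, sqrt)

all.equal(roots_loop, roots_apply)   ## TRUE
```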