13 Control Structures | R Programming for Data Science (2024)

Control structures in R allow you to control the flow of execution of a series of R expressions. Basically, control structures allow you to put some “logic” into your R code, rather than just always executing the same R code every time. Control structures allow you to respond to inputs or to features of the data and execute different R expressions accordingly.

Commonly used control structures are

  • if and else: testing a condition and acting on it

  • for: execute a loop a fixed number of times

  • while: execute a loop while a condition is true

  • repeat: execute an infinite loop (must break out of it to stop)

  • break: break the execution of a loop

  • next: skip an iteration of a loop

Most control structures are not used in interactive sessions, but rather when writing functions or longer expressions. However, these constructs do not have to be used in functions and it’s a good idea to become familiar with them before we delve into functions.

13.1 if-else

The if-else combination is probably the most commonly used control structure in R (or perhaps any language). This structure allows you to test a condition and act on it depending on whether it’s true or false.

For starters, you can just use the if statement.

if(<condition>) {
        ## do something
}
## Continue with rest of code

The above code does nothing if the condition is false. If you have an action you want to execute when the condition is false, then you need an else clause.

if(<condition>) {
        ## do something
} else {
        ## do something else
}

You can have a series of tests by following the initial if with any number of else ifs.

if(<condition1>) {
        ## do something
} else if(<condition2>) {
        ## do something different
} else {
        ## do something different
}
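As a minimal sketch of such a chain, the snippet below labels a number using made-up thresholds. The variable names `temp` and `label` and the cutoffs 10 and 25 are assumptions chosen only for illustration.

```r
## Hypothetical example: label a temperature (degrees Celsius).
## The cutoffs are arbitrary and only illustrate the branching.
temp <- 18

if(temp < 10) {
        label <- "cold"
} else if(temp < 25) {
        ## Reached only when the first condition was FALSE
        label <- "mild"
} else {
        ## Reached only when every earlier condition was FALSE
        label <- "hot"
}

print(label)  ## [1] "mild"
```

Only the first branch whose condition is true gets executed; the remaining tests are skipped.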

Here is an example of a valid if/else structure.

## Generate a uniform random number
x <- runif(1, 0, 10)
if(x > 3) {
        y <- 10
} else {
        y <- 0
}

The value of y is set depending on whether x > 3 or not. This expression can also be written a different, but equivalent, way in R.

y <- if(x > 3) {
        10
} else {
        0
}

Neither way of writing this expression is more correct than the other. Which one you use will depend on your preference and perhaps those of the team you may be working with.

Of course, the else clause is not necessary. You could have a series of if clauses that always get executed if their respective conditions are true.

if(<condition1>) {
}
if(<condition2>) {
}
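Here is a small sketch of independent if clauses. The variable names and the two conditions are made up for illustration; the point is that each test is checked on its own, so more than one body can run.

```r
x <- 7
notes <- character(0)

## Each condition is tested independently of the other
if(x > 3) {
        notes <- c(notes, "greater than 3")
}
if(x %% 2 == 1) {
        notes <- c(notes, "odd")
}

print(notes)  ## [1] "greater than 3" "odd"
```

With an else if chain, by contrast, at most one of the bodies would have been executed.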

13.2 for Loops

For loops are pretty much the only looping construct that you will need in R. While you may occasionally find a need for other types of loops, in my experience doing data analysis, I’ve found very few situations where a for loop wasn’t sufficient.

In R, for loops take an iterator variable and assign it successive values from a sequence or vector. For loops are most commonly used for iterating over the elements of an object (list, vector, etc.).

> for(i in 1:10) {
+         print(i)
+ }
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10

This loop takes the i variable and in each iteration of the loop gives it values 1, 2, 3, …, 10, executes the code within the curly braces, and then the loop exits.

The following three loops all have the same behavior.

> x <- c("a", "b", "c", "d")
> 
> for(i in 1:4) {
+         ## Print out each element of 'x'
+         print(x[i])
+ }
[1] "a"
[1] "b"
[1] "c"
[1] "d"

The seq_along() function is commonly used in conjunction with for loops in order to generate an integer sequence based on the length of an object (in this case, the object x).

> ## Generate a sequence based on length of 'x'
> for(i in seq_along(x)) {
+         print(x[i])
+ }
[1] "a"
[1] "b"
[1] "c"
[1] "d"

It is not necessary to use an index-type variable.

> for(letter in x) {
+         print(letter)
+ }
[1] "a"
[1] "b"
[1] "c"
[1] "d"
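One reason seq_along() is often preferred over writing 1:length(x) by hand shows up with empty objects. The sketch below uses made-up counter variables to record how many times each loop body runs when the vector has length zero.

```r
empty <- character(0)

## 1:length(empty) is 1:0, i.e. the vector c(1, 0), so the body runs twice
n_colon <- 0
for(i in 1:length(empty)) {
        n_colon <- n_colon + 1
}

## seq_along(empty) is integer(0), so the body never runs
n_seq <- 0
for(i in seq_along(empty)) {
        n_seq <- n_seq + 1
}

print(n_colon)  ## [1] 2
print(n_seq)    ## [1] 0
```

The same reasoning applies to seq_len(n) versus 1:n when n might be zero.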

For one-line loops, the curly braces are not strictly necessary.

> for(i in 1:4) print(x[i])
[1] "a"
[1] "b"
[1] "c"
[1] "d"

However, I like to use curly braces even for one-line loops, because that way if you decide to expand the loop to multiple lines, you won’t be burned because you forgot to add curly braces (and you will be burned by this).
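To see how a missing pair of braces can burn you, consider this sketch with two made-up counters. The indentation suggests that both assignments belong to the loop, but without braces only the first expression after for() is the loop body.

```r
n_loop <- 0
n_after <- 0

for(i in 1:4)
        n_loop <- n_loop + 1
        n_after <- n_after + 1  ## indented, but NOT part of the loop

print(n_loop)   ## [1] 4
print(n_after)  ## [1] 1
```

Wrapping both assignments in curly braces would make each counter end up at 4.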

13.3 Nested for loops

for loops can be nested inside of each other.

x <- matrix(1:6, 2, 3)
for(i in seq_len(nrow(x))) {
        for(j in seq_len(ncol(x))) {
                print(x[i, j])
        }
}

Nested loops are commonly needed for multidimensional or hierarchical data structures (e.g. matrices, lists). Be careful with nesting though. Nesting beyond 2 to 3 levels often makes it difficult to read/understand the code. If you find yourself in need of a large number of nested loops, you may want to break up the loops by using functions (discussed later).
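As a small sketch of a nested loop doing some actual work, the code below accumulates the sum of each row of a matrix. The name `row_totals` is made up for this example, and the built-in rowSums() is used only to check the result.

```r
x <- matrix(1:6, 2, 3)

## One running total per row, starting at zero
row_totals <- numeric(nrow(x))

for(i in seq_len(nrow(x))) {
        for(j in seq_len(ncol(x))) {
                ## Add the element in row i, column j to row i's total
                row_totals[i] <- row_totals[i] + x[i, j]
        }
}

print(row_totals)  ## [1]  9 12
print(rowSums(x))  ## [1]  9 12
```

In practice you would simply call rowSums(x); the loop version is here only to show how the two indices move through the matrix.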

13.4 while Loops

While loops begin by testing a condition. If it is true, then they execute the loop body. Once the loop body is executed, the condition is tested again, and so forth, until the condition is false, after which the loop exits.

> count <- 0
> while(count < 10) {
+         print(count)
+         count <- count + 1
+ }
[1] 0
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9

While loops can potentially result in infinite loops if not written properly. Use with care!
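One defensive pattern is to add an iteration cap to the loop condition, so the loop stops even if the main condition never becomes false. The following is a sketch only; `max_steps`, `value`, and the cutoff of 1e-3 are assumptions made for this example.

```r
value <- 1
steps <- 0
max_steps <- 100  ## hypothetical safety cap on the number of iterations

## Halve 'value' until it drops to 1e-3 or below, but never loop
## more than 'max_steps' times
while(value > 1e-3 && steps < max_steps) {
        value <- value / 2
        steps <- steps + 1
}

print(steps)              ## [1] 10
print(steps < max_steps)  ## TRUE: the loop ended on its own, not via the cap
```

Checking afterwards whether the cap was reached tells you whether the loop ended for the reason you intended.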

Sometimes there will be more than one condition in the test.

> z <- 5
> set.seed(1)
> 
> while(z >= 3 && z <= 10) {
+         coin <- rbinom(1, 1, 0.5)
+         
+         if(coin == 1) {  ## random walk
+                 z <- z + 1
+         } else {
+                 z <- z - 1
+         }
+ }
> print(z)
[1] 2

Conditions joined by && are always evaluated from left to right, and evaluation stops as soon as the overall result is known. For example, in the above code, if z were less than 3, the second test would not have been evaluated.
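The left-to-right behavior can be checked directly. In this sketch the second test is wrapped in curly braces together with a made-up flag variable, so we can see whether it was ever evaluated. For contrast, the element-wise & operator is shown as well; it evaluates both sides.

```r
z <- 2

## '&&' stops at the first FALSE, so the braced block is skipped
second_test_ran <- FALSE
result <- z >= 3 && {
        second_test_ran <- TRUE
        z <= 10
}
print(result)           ## [1] FALSE
print(second_test_ran)  ## [1] FALSE

## '&' evaluates both sides regardless of the first result
both_sides_ran <- FALSE
result_vec <- (z >= 3) & {
        both_sides_ran <- TRUE
        z <= 10
}
print(both_sides_ran)   ## [1] TRUE
```

This is one reason && is the usual choice inside if and while conditions.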

13.5 repeat Loops

repeat initiates an infinite loop right from the start. These are not commonly used in statistical or data analysis applications but they do have their uses. The only way to exit a repeat loop is to call break.

One possible paradigm might be in an iterative algorithm where you may be searching for a solution and you don’t want to stop until you’re close enough to the solution. In this kind of situation, you often don’t know in advance how many iterations it’s going to take to get “close enough” to the solution.

x0 <- 1
tol <- 1e-8
repeat {
        x1 <- computeEstimate()
        if(abs(x1 - x0) < tol) {  ## Close enough?
                break
        } else {
                x0 <- x1
        }
}

Note that the above code will not run if the computeEstimate() function is not defined (I just made it up for the purposes of this demonstration).

The loop above is a bit dangerous because there’s no guarantee it will stop. You could get in a situation where the values of x0 and x1 oscillate back and forth and never converge. It is better to set a hard limit on the number of iterations by using a for loop and then report whether convergence was achieved or not.
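Here is one way the bounded version might look. Since computeEstimate() is not a real function, this sketch swaps in a concrete update rule, the Babylonian iteration for a square root, purely as a stand-in; `target`, `max_iter`, and `converged` are names made up for this example.

```r
target <- 2      ## approximate sqrt(2)
x0 <- 1
tol <- 1e-8
max_iter <- 50   ## hard limit on the number of iterations
converged <- FALSE

for(iter in seq_len(max_iter)) {
        ## Stand-in for computeEstimate(): one Babylonian update step
        x1 <- (x0 + target / x0) / 2

        if(abs(x1 - x0) < tol) {  ## Close enough?
                converged <- TRUE
                break
        }
        x0 <- x1
}

if(converged) {
        message("Converged after ", iter, " iterations: ", x1)
} else {
        message("No convergence after ", max_iter, " iterations")
}
```

Either way the loop is guaranteed to stop, and the `converged` flag tells you which of the two outcomes occurred.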

13.6 next, break

next is used to skip an iteration of a loop.

for(i in 1:100) {
        if(i <= 20) {
                ## Skip the first 20 iterations
                next
        }
        ## Do something here
}

break is used to exit a loop immediately, regardless of what iteration the loop may be on.

for(i in 1:100) {
        print(i)
        if(i > 20) {
                ## Stop once i exceeds 20 (i.e., right after printing 21)
                break
        }
}
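The two statements can also be used in the same loop. As a sketch with a made-up task, the code below adds up odd numbers only (skipping even ones with next) and leaves the loop with break as soon as the running total exceeds 50.

```r
total <- 0

for(i in 1:100) {
        if(i %% 2 == 0) {
                ## Even number: skip to the next iteration
                next
        }
        total <- total + i
        if(total > 50) {
                ## Running total is large enough: leave the loop
                break
        }
}

print(i)      ## [1] 15
print(total)  ## [1] 64
```

Note that after a break the iterator variable keeps the value it had when the loop was exited.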

13.7 Summary

  • Control structures like if, while, and for allow you to control the flow of an R program

  • Infinite loops should generally be avoided, even if (you believe) they are theoretically correct.

  • Control structures mentioned here are primarily useful for writing programs; for command-line interactive work, the “apply” functions are more useful.
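To give a flavor of that last point, here is a sketch comparing a for loop with sapply(), one of the “apply” functions. The vector and the variable names are made up for illustration.

```r
x <- c(1, 4, 9, 16)

## Loop version: preallocate the result, then fill it in element by element
roots_loop <- numeric(length(x))
for(i in seq_along(x)) {
        roots_loop[i] <- sqrt(x[i])
}

## sapply() version: apply sqrt() to each element in one call
roots_apply <- sapply(x, sqrt)

print(roots_loop)   ## [1] 1 2 3 4
print(roots_apply)  ## [1] 1 2 3 4
```

For a vectorized function like sqrt() you could of course just write sqrt(x); the comparison is only meant to show the shape of the two approaches.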
