The backbone of programs
13 min read · Jan 22, 2020
If you want to start programming, I must admit that the outlook is not good: different operating systems, so many programming languages, and endless ways of reaching the same results. These are the kinds of situations that will make you either run away (as fast and as far as you can, hoping you’ll never bump into programming again) or face the beast. If you decide to face it, you won’t know where to start or how to handle it. You’ll only have questions and probably not even one answer, but you know what? This is a great way to start. Actually, this is how I started programming. I believed that if I ever wanted to understand programs, I had to answer this question:
How do programs work, and how can you build them?
Certainly not a simple issue, but come on, harder questions have been answered, right? Take a look at Marvin Minsky. Minsky, who is considered one of the fathers of AI, wrote “The Society of Mind” to answer one of the most difficult questions of our time: what is the human mind, and how does it work? With a revolutionary perspective, Minsky suggested that our minds consist of the aggregation of small-minds (or more basic components) that have evolved to perform highly specific tasks. According to him, most of these tiny-minds lack the attributes we think of as intelligence, are severely limited, and can only reach feelings, thoughts or purposeful action through interaction with other components.
This idea contains a powerful concept: complex matters can be thought of as groups of simpler subjects, which can in turn be partitioned into more basic things, until you reach a proper understanding. If you think about it this way, any efficient system can be explained as a set of simpler functions that, when put together, achieve better results than they would individually.
For this reason, if I wanted to understand computer programs (and learn how to make them) I needed to understand their building blocks. You see, when a program runs, the code is read by the computer line by line (from top to bottom, and from left to right), just like you would read a book. At some point, the program may reach a situation where it needs to make a decision, such as jumping to a different part of the program or re-running a certain piece of code. The constructs that make these decisions and affect the flow of the program’s code are known as Control Structures.
Control Structures can be considered the building blocks of computer programs. They are commands that enable a program to “make decisions”, following one path or another. A program is usually not limited to a linear sequence of instructions, since during its execution it may branch, repeat code or bypass sections. Control Structures are the blocks that analyze variables and choose directions in which to go based on given parameters.
The basic Control Structures in programming languages are:
- Conditionals (or Selection): which are used to execute one or more statements if a condition is met.
- Loops (or Iteration): whose purpose is to repeat a statement a certain number of times or while a condition is fulfilled.
Now let’s take a look at each of these concepts. Below we will dive deeper using the R programming language (one of the most widely used languages for data science), but the same ideas and explanations apply to any other programming language.
“Conditionals” are at the very core of programming. The idea behind them is that they allow you to control the flow of the code that is executed based on different conditions in the program (e.g. an input taken from the user, or the internal state of the machine the program is running on). In this article we will explore the Conditionals Control Structures of “If statements” and “If-Else statements”.
1) If Statements
“If statements” execute one or more statements when a condition is met. If the condition tests TRUE, the statements are executed; if it tests FALSE (the condition is not met), nothing happens.
The syntax of “If statements” is:
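As a rough sketch, the general shape in R is shown in the comments below, followed by a tiny runnable instance. The names `condition` and `message_shown` are placeholders chosen for illustration, not part of any fixed syntax.

```r
# General form:
#   if (condition) {
#     statements
#   }

condition <- TRUE   # placeholder: any expression yielding a single TRUE or FALSE

if (condition) {
  # This block runs only because `condition` is TRUE
  message_shown <- "condition was TRUE"
  print(message_shown)
}
```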
Example
To show a simple case, let’s say you want to verify if the value of a variable (x) is positive:
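A minimal sketch of such a check in R could look like this (the exact code is an assumption, reconstructed from the description and output in this section):

```r
# Assign a value to x
x <- 4

# Run the print statement only if x is greater than or equal to 0
if (x >= 0) {
  print("variable x is a positive number")
}
```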
In this example, we first assign the value 4 to the variable (x) and use the “If statement” to verify whether that value is greater than or equal to 0. If the test results TRUE (as in this case), the program prints the sentence: “variable x is a positive number”.
Output
[1] "variable x is a positive number"
But since the “If statement” only executes a statement if the tested condition is TRUE, what would have happened if the variable value had been negative? To execute a statement when the tested condition results FALSE, we need to use an “If-Else statement”.
2) If-Else Statements
This Control Structure allows a program to follow alternative paths of execution, depending on whether a condition is met or not.
The syntax of “If-Else statements” is:
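As a rough sketch, the general shape in R is shown in the comments below, followed by a small runnable instance. The names `condition` and `branch` are placeholders for illustration. Note that in R the `else` keyword should sit on the same line as the closing brace of the “if” block.

```r
# General form:
#   if (condition) {
#     statements run when the condition is TRUE
#   } else {
#     statements run when the condition is FALSE
#   }

condition <- FALSE   # placeholder condition

if (condition) {
  branch <- "if"
} else {
  branch <- "else"   # this branch runs because `condition` is FALSE
}

print(branch)
```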
The “else part” of the instruction is optional and only evaluated if the condition tests FALSE.
Example 1
Following our example, we extend the previous conditional “If statement” by adding the “else part” to test whether the value of a variable is positive or negative, and perform an action whether the test result is TRUE or FALSE.
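A minimal sketch of this in R could look like the following (the exact code is an assumption, reconstructed from the description and output in this section):

```r
# Assign a negative value to x
x <- -4

if (x >= 0) {
  # Runs when the condition is TRUE
  print("variable x is a positive number")
} else {
  # Runs when the condition is FALSE, as it is here
  print("variable x is a negative number")
}
```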
In this example, we assign the value -4 to the variable (x) and use the “If-Else statement” to verify whether that value is greater than or equal to 0. If the test results TRUE, the program prints the sentence: “variable x is a positive number”. But if the test results FALSE (as in this case), the program continues to the alternative expression and prints the sentence: “variable x is a negative number”.
Output
[1] "variable x is a negative number"
Example 2
Let’s say you need to define more than 2 conditions, as in the event of grading an exam. In that case you can grade A, B, C, D or F (5 options), so, how would you do it?
“If-Else statements” can have multiple alternative statements. In the example below we define an initial score and an “If-Else statement” with 5 rating categories. This piece of code goes through each condition in order until it reaches one that tests TRUE.
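One way this could be sketched in R is shown below. The score of 75 and the grade cut-offs (90, 80, 70, 60) are assumptions chosen for illustration; any score between 70 and 79 would give the same grade under these cut-offs.

```r
# Hypothetical initial score
score <- 75

# Conditions are checked from top to bottom; the first TRUE one wins
if (score >= 90) {
  grade <- "A"
} else if (score >= 80) {
  grade <- "B"
} else if (score >= 70) {
  grade <- "C"
} else if (score >= 60) {
  grade <- "D"
} else {
  grade <- "F"
}

print(grade)
```

Because the conditions are tested in order, each branch only needs a lower bound: by the time `score >= 70` is checked, we already know the score is below 80.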
Output
[1] "C"
“Loop statements” are nothing more than the automation of multi-step processes by organizing sequences of actions, and grouping the parts that need to be repeated. Also a central part of programming, iteration (or Looping) gives computers much of their power. They can repeat a sequence of steps as often as necessary, and appropriate repetitions of simple steps can solve complex problems.
In general terms, there are two types of “Looping techniques”:
- “For Loops”: are the ones that execute for a prescribed number of times, as controlled by a counter or an index.
- “While Loops” and “Repeat Loops”: are based on the onset and verification of a logical condition. The condition is tested at the start or the end of the loop construct.
Let’s take a look at them:
1) For Loops
In this Control Structure, statements are executed one after another in consecutive order over a sequence of values that is evaluated only once, when the “For Loop” is initiated (it is never re-evaluated). The number of iterations is therefore fixed and known in advance.
If the evaluation of the condition on a variable (which can assume values within a specified sequence) results TRUE, one or more statements will be executed sequentially over that sequence of values. Here the “condition” is simply whether the sequence still has an element left to process: while it does, the variable takes the next value, the statements are executed, and the check is repeated. The “variable in sequence” section performs this test on each value of the sequence until it covers the last element.
If the condition is not met and the outcome is FALSE (e.g. the “variable in sequence” part has finished going through all the elements of the sequence), the loop ends. If the test results FALSE on the first iteration (for example, because the sequence is empty), the body of the “For Loop” is never executed.
The syntax of “For Loops” is:
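As a rough sketch, the general shape in R is shown in the comments below, followed by a small runnable instance. The names `value` and `total` are placeholders for illustration.

```r
# General form:
#   for (variable in sequence) {
#     statements
#   }

total <- 0

# `value` takes each element of the sequence in turn
for (value in c(1, 2, 3)) {
  total <- total + value
}

print(total)
```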
Example 1
To show how “For Loops” work, first we will create a sequence by concatenating different names of fruits to create a list (called “fruit_list”):
We will use this fruit list as the “sequence” in a “For Loop”, and make the “For Loop” run a statement once (print the name of each value) for each provided value in the sequence (the different fruits in the fruit list):
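A minimal sketch of both steps in R could look like this. It assumes the list is built with `c()`, which in R produces a character vector; the loop variable name `fruit` is also an assumption.

```r
# Concatenate the fruit names into one sequence
fruit_list <- c("Apple", "Kiwi", "Orange", "Banana")

# Print each element of the sequence, one per iteration
for (fruit in fruit_list) {
  print(fruit)
}
```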
This way, the outcome of the “For Loop” is as follows:
[1] "Apple"
[1] "Kiwi"
[1] "Orange"
[1] "Banana"
OK, so we printed the name of each value in the list. Not a big deal, right? The good thing is that “For Loops” can be used to produce more interesting results. Take a look at the following example.
Example 2
What if we want to modify values, or perform calculations sequentially? You can use “For Loops” to perform mathematical operations sequentially over each value of a vector (elements of the same type, which in this case will be numerical).
In this example, we will create a sequence of numbers (from 1 to 10), and set a “For Loop” to calculate and print the square root of each value in that sequence:
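A minimal sketch in R, assuming the sequence is stored in a variable called `numbers` (a name chosen here for illustration):

```r
# Sequence of integers from 1 to 10
numbers <- 1:10

# Compute and print the square root of each value
for (n in numbers) {
  print(sqrt(n))
}
```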
In this case, the outcome of the “For Loop” is:
[1] 1
[1] 1.414214
[1] 1.732051
[1] 2
[1] 2.236068
[1] 2.449490
[1] 2.645751
[1] 2.828427
[1] 3
[1] 3.162278
You can use any type of mathematical operator over a numerical sequence, and as we will see later in this article, make all sorts of combinations between different Control Structures to reach more complex results.
2) While Loops
In “While Loops” a condition is first evaluated, and if the result of testing that condition is TRUE, one or more statements are repeatedly executed until that condition becomes FALSE.
Unlike “If statements”, in which a condition tested as TRUE executes an expression only once and ends, “While Loops” are iterative statements that execute some expression over and over again until the condition becomes FALSE. If the condition never becomes FALSE, the “While Loop” will go on forever (an infinite loop) and the program will hang until it is interrupted. Conversely, if the condition tests FALSE at the beginning of the loop, the expression will never be executed.
The syntax of “While Loops” is:
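As a rough sketch, the general shape in R is shown in the comments below, followed by a small runnable instance. The variable name `countdown` is a placeholder for illustration.

```r
# General form:
#   while (condition) {
#     statements
#   }

countdown <- 3

# The condition is checked before every iteration
while (countdown > 0) {
  print(countdown)
  countdown <- countdown - 1   # moves the loop toward its exit
}
```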
Example 1
Let’s see an example. First we will create a variable (x) and assign it the value of 1. Then we set a “While Loop” to iteratively test a condition over that variable until the condition test results FALSE:
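A minimal sketch of this loop in R could look like the following (the exact code is an assumption, reconstructed from the description and output in this section):

```r
# Start the counter at 1
x <- 1

# Keep going while x is less than 10
while (x < 10) {
  print(x)
  x <- x + 1   # increment x so the condition eventually becomes FALSE
}
```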
This is how it works: the initial value of the variable (x) is 1, so when we test the condition “is the variable (x) less than 10?”, the result evaluates to TRUE and the expression is executed, printing the value of the variable (x), which in the first iteration is 1. But then something happens: the variable (x) is incremented by 1 before the iteration ends, so in the next iteration the value of x will be 2.
This variable reassignment is important because it eventually makes the condition FALSE and lets the loop exit (when x = 10). Failing to change the initial conditions in a “While Loop” will result in an infinite loop, and the program will never finish on its own.
Output
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
Example 2
Have you heard of the Fibonacci sequence? This is a series of numbers in which each number is found by adding up the two numbers before it: 0, 1, 1, 2, 3, 5, 8, 13, 21,… This sequence can be found in several natural phenomena, and has different applications in finance, music, architecture, and other disciplines.
Let’s calculate it using a “While Loop”.
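One way to sketch this in R is shown below. The variable names `a`, `b` and `next_value` are assumptions chosen for illustration, and the limit of 100 is taken from the description in this section.

```r
# First two elements of the series
a <- 0
b <- 1

# Print elements while they are below 100
while (a < 100) {
  print(a)
  next_value <- a + b   # the next element is the sum of the previous two
  a <- b
  b <- next_value
}
```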
In this case we set a maximum value in the series as the stop condition, so that the loop prints the Fibonacci series only for numbers below 100. When a series element (whichever it is) reaches 100 or more, the loop cycle ends.
[1] 0
[1] 1
[1] 1
[1] 2
[1] 3
[1] 5
[1] 8
[1] 13
[1] 21
[1] 34
[1] 55
[1] 89
Example 3
Another way of generating the Fibonacci series with a “While Loop” is, instead of setting the maximum value of the series as a stop condition, setting the number of series elements you want to generate.
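A possible sketch in R is shown below. The vector name `fib` and the helper variable `n` are assumptions chosen for illustration; the target of 10 elements comes from the description in this section.

```r
# Start the series with its first two elements
fib <- c(0, 1)

# Keep appending until the series has 10 elements
while (length(fib) < 10) {
  n <- length(fib)
  fib <- c(fib, fib[n] + fib[n - 1])   # sum of the last two elements
}

print(fib)
```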
This “While Loop” appends the next element of the series to the end of the series built so far, until reaching a stop condition. In this case, when the series reaches 10 elements (no matter which values), the loop cycle ends.
Output
[1] 0 1 1 2 3 5 8 13 21 34
3) Repeat Loops
Closely linked to “While Loops”, “Repeat Loops” also execute statements iteratively, but they keep going until a stop condition is met. This way, statements are executed at least once, no matter what the result of the condition is, and the loop is exited only when a certain condition becomes TRUE.
The syntax of “Repeat Loops” is:
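As a rough sketch, the general shape in R is shown in the comments below, followed by a small runnable instance. The names `stop_condition` and `attempts` are placeholders for illustration.

```r
# General form:
#   repeat {
#     statements
#     if (stop_condition) {
#       break
#     }
#   }

attempts <- 0

repeat {
  attempts <- attempts + 1   # the body always runs at least once
  if (attempts >= 3) {
    break                    # leave the loop once the stop condition is TRUE
  }
}

print(attempts)
```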
“Repeat Loops” use “Break statements” as a stop condition. “Break statements” are combined with the test of a condition to interrupt cycles within loops, since when the program hits a break, it will pass control to the instruction immediately after the end of the loop (if any).
“Repeat Loops” will run forever if the break condition is never met. See these 2 examples:
Example 1
First we create a variable (x) and assign it the value of 5. Then we set a “Repeat Loop” to iteratively print the value of the variable, modify the value of the variable (increase it by 1), and test a condition over that variable (if it equals 10) until the condition test results TRUE.
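A minimal sketch of this loop in R could look like the following (the exact code is an assumption, reconstructed from the description and output in this section):

```r
# Start at 5
x <- 5

repeat {
  print(x)        # print the current value
  x <- x + 1      # increase it by 1
  if (x == 10) {  # stop once x reaches 10
    break
  }
}
```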
The “breaking condition” triggers when the variable (x) reaches 10, and the loop ends.
Output
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
Example 2
Now let’s suppose we produce a list of random numbers, for which we don’t know the order or sequence of generation.
In this example we will use a “Repeat Loop” to generate a sequence of normally distributed random numbers (you can generate random numbers from any other distribution, we just picked this one), and break the sequence once one of those numbers is bigger than 1. Since we don’t know which numbers will come first, we don’t know how long the sequence will be: we just know the breaking condition.
First, we use the “set.seed” instruction to fix the random numbers (so that the same random numbers are always generated) and make this example reproducible.
Then we initiate the “Repeat Loop” by generating a normally distributed random number, printing it, and checking if that number is bigger than 1. Only when this condition becomes TRUE (which may or may not happen with the first generated number) does the loop reach the break statement and end.
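A sketch of these steps in R is shown below. The seed value 123 is an arbitrary assumption, so the numbers printed will differ from the output listed in this article; only the structure of the loop is meant to match.

```r
# Fix the random number generator so every run gives the same numbers
set.seed(123)   # arbitrary seed, chosen for illustration

repeat {
  x <- rnorm(1)   # draw one number from a standard normal distribution
  print(x)
  if (x > 1) {    # stop as soon as a value above 1 appears
    break
  }
}
```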
Output
[1] -0.9619334
[1] -0.2925257
[1] 0.2587882
[1] -1.152132
[1] 0.1957828
[1] 0.03012394
[1] 0.08541773
[1] 1.11661
This shows once again the importance of setting a proper breaking condition. Failing to do so will result in an infinite loop.
We’ve seen and explained these concepts in isolation, but “Control Structures” can be combined any way you want: Loops may contain several internal Loops, and Conditionals may contain Loops and other Conditionals. The options are endless. (In fact, when reviewing “Repeat Loops” we found that the examples contained nested “If statements”.)
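As a small illustration of this kind of combination, the sketch below places an “If statement” inside a “For Loop” to collect the even numbers between 1 and 10. The task and the names `evens` and `n` are assumptions chosen for illustration.

```r
# Start with an empty result
evens <- c()

for (n in 1:10) {
  # The conditional decides, on each iteration, whether to keep the value
  if (n %% 2 == 0) {
    evens <- c(evens, n)
  }
}

print(evens)
```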
You can develop advanced solutions just by combining the “Control Structures” we explained in this article. As Minsky suggested, we can reach complex outcomes as a result of the interaction of simpler components. Control Structures constitute the basic blocks of decision-making processes in computing. They change the flow of programs and enable us to construct complex sets of instructions out of simpler building blocks.
My advice is: learn about them.
It will ease your path to coding and understanding of programs, and will help you to find new ways of solving problems.