R Programming

What is R?

R is an interactive language and environment for statistical computing and graphics.

In an oversimplified sense, think of a programmable calculator, then think of R as an advanced programmable calculator. The difference is that you need to know how to “talk” the R language in order to tell it what you want. The “talk” happens through writing code.

Each language has its own specific syntax. This is simply a set of rules that lets the writer (you) and the reader (the computer) make sense of the written sentences. Even on a calculator, you cannot write 3+3+. This will throw an error, because the statement/sentence is syntactically incorrect.

What is RStudio?

RStudio is an integrated development environment (IDE) for R. Think of it as a software application that makes it easy to run R. Although the original objective of RStudio was to make running R easy, it has since expanded so that it can be used to build various tools. These notes (website), for example, were created using Quarto in RStudio.

RStudio has 4 panes:

- Editor pane - This is where you write your code.
- Console pane - This is where your code is run.
- Environment pane - Gives you an overview of the variables currently stored in memory.
- Plot pane - Shows the graphs plotted.

For easier access to your code later, make sure to write it in the editor pane.

In this course we will learn R and its syntax.

Math in R

You can do any normal calculation in R, the same way you would on a calculator.

Try it out.
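
As a starting point, here are a few calculations you might type into the console. The numbers are arbitrary and chosen only for illustration; the comment after each line shows the value R reports.

```r
3 + 3          # 6
12 - 4         # 8
(12 - 4) / 2   # 4
2.5 * 4        # 10
```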

Math Operators

These are functions used to do basic math operations. They are subdivided into three categories:

Arithmetic Operators: used to carry out mathematical operations

Operator	Description
+	Addition
-	Subtraction
*	Multiplication
/	Division
^ or **	Exponent
%%	Modulus (Remainder from division)
%/%	Integer Division

```
5**3
```
```
[1] 125
```
```
5^3
```
```
[1] 125
```
```
5%%3
```
```
[1] 2
```
```
5%/%3
```
```
[1] 1
```
Others include
- %*% - Matrix multiplication
- %o% - Outer multiplication
- %x% - Kronecker multiplication
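
A minimal sketch of the three matrix operators listed above. The names u, v and m are made up for this illustration, and the snippet uses c() and matrix() to build vectors and a matrix.

```r
# Two small vectors and a 2 x 2 matrix, used purely for illustration
u <- c(1, 2)
v <- c(3, 4)
m <- matrix(1:4, nrow = 2)   # columns are (1, 2) and (3, 4)

m %*% m         # matrix product: rows (7, 15) and (10, 22)
u %o% v         # outer product: rows (3, 4) and (6, 8)
diag(2) %x% m   # Kronecker product: a 4 x 4 matrix with m repeated on the diagonal blocks
```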

Relational Operators: used to compare two values.

Operator	Description
<	Less than
>	Greater than
<=	Less than or equal to
>=	Greater than or equal to
==	Equal to
!=	Not equal to

Logical Operators: used to combine or negate logical values.

Operator	Description
&	AND
|	OR
!	NOT
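
The relational and logical operators can be tried out directly. The values below are arbitrary examples; each comment gives the result R returns.

```r
5 < 3               # FALSE
5 >= 5              # TRUE
5 != 3              # TRUE
(5 > 3) & (2 > 3)   # FALSE: AND needs both sides to be TRUE
(5 > 3) | (2 > 3)   # TRUE: OR needs at least one side to be TRUE
!(5 > 3)            # FALSE: NOT flips the logical value
```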

Most of the operators above are binary, i.e. they work on two inputs, e.g. x + y where x and y are your inputs. Some also accept just one input and are then considered unary operators, e.g. -x to negate x and !x to negate the logical value of x.

-3

[1] -3

!TRUE

[1] FALSE

For math computation, the order of operations (PEMDAS/BODMAS) is strictly observed.
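
A short illustration of the order of operations, using arbitrary numbers. Two cases are worth noticing: repeated exponents are grouped from right to left, and the exponent is applied before a leading minus sign.

```r
2 + 3 * 4     # 14: multiplication before addition
(2 + 3) * 4   # 20: parentheses first
2^3^2         # 512: exponents group right to left, i.e. 2^(3^2)
-2^2          # -4: the exponent is applied before the unary minus
(-2)^2        # 4
```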

Named Math functions

But R is an advanced calculator. How can I compute trigonometric values? The only downside with R is that you need to know the names of the functions you want. For math functions this is simple, as they are named almost exactly as they are written in math. Look at the list of math functions below:

abs, sign, sqrt, ceiling, floor, trunc, cummax, cummin, cumprod, cumsum, log, log10, log2, log1p, acos, acosh, asin, asinh, atan, atanh, exp, expm1, cos, cosh, cospi, sin, sinh, sinpi, tan, tanh, tanpi, gamma, lgamma, digamma, trigamma

From the list above, it is easy to tell what the sqrt function does, i.e. it is the \(\sqrt{~~}\) function. We can tell what exp, sin, cos and tan are. But what about atan, asin, tanh etc.?

This means that if you need to compute \(\sin^{-1}(x)\), you have to know that the inverse sine function is called asin in R, where the a stands for arc, i.e. the arc sine function. In other languages, the same function may have a different name, e.g. NumPy in Python calls it arcsin instead.

From now on, you are expected to know the name of a function before using it. If you are not sure what a function does, you can look up its help page (e.g. ?asin) or search online.
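
As a small sketch of the naming pattern, here are a few of the less obvious functions from the list applied to simple inputs. Note that the inverse trigonometric functions return angles in radians.

```r
atan(1)              # 0.7853982, i.e. pi/4 radians
atan(1) * 180 / pi   # 45: the same angle converted to degrees
tanh(0)              # 0: the hyperbolic tangent
log1p(0)             # 0: log1p(x) computes log(1 + x)
```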

Exercise 1

Compute the following using R:
1. \(\log_{10}100\)
2. \(\log_{e}e^2\)
3. \(\log_2 8\)
4. \(\sin(30^\circ)\) hint: R's trigonometric functions expect angles in radians.
5. \(\sin^{-1}(0.5)\) in degrees. hint: look at part 4 above.
6. \(\sqrt{4}\) and \(\sqrt{-4}\)
7. \(\sqrt[3]{8}\) and \(\sqrt[3]{-8}\) hint: \(\sqrt[3]{-8} = -2\)
8. \(0^0\). Is this correct?

Round off the following numbers using R:
1. \(980, 950, 930\) to the nearest 100 Hint: round(123, -2) rounds to nearest 100
2. \(98, 95, 93\) to the nearest 10
3. \(9.8, 9.5, 9.3\) to the nearest 1
4. \(0.98, 0.95, 0.93\) to the nearest 0.1
5. \(0.098, 0.095, 0.093\) to the nearest 0.01

Determine the final output of the following operations and check your answers against those produced by R:
```
    (3 > 4) | TRUE
    3 > 4 | TRUE
    (3 > 4) & TRUE
    (3 > 4) | FALSE
    3 >= 3
    3 != 4
```

R Basic Data Types

Numeric (double, integer)

Double - real numbers.

Integer - whole numbers. To create an integer, append the letter L to the number.

typeof(5)

[1] "double"

typeof(5L)

[1] "integer"
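
A brief sketch of how the two numeric types interact in arithmetic. The values are arbitrary; the comments show what typeof reports.

```r
typeof(5L + 2L)   # "integer"
typeof(5L * 2)    # "double": mixing an integer with a double gives a double
typeof(5L / 2L)   # "double": division always returns a double
is.integer(5)     # FALSE: without the L suffix, 5 is a double
```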

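By default, numbers are stored internally as doubles, which can hold fractional values. Doubles have finite precision, so some decimal fractions cannot be represented exactly: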

0.1+0.2 == 0.3

[1] FALSE
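
Because of this, doubles are usually compared with a tolerance rather than with ==. One common approach, shown here as a sketch, is the base R function all.equal wrapped in isTRUE; the explicit tolerance of 1e-9 in the last line is an arbitrary choice for illustration.

```r
0.1 + 0.2 == 0.3                    # FALSE: exact comparison fails
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE: equal up to a small tolerance
abs((0.1 + 0.2) - 0.3) < 1e-9       # TRUE: the same idea written by hand
```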

Character

strings, names

typeof("a")

[1] "character"

Logical

typeof(TRUE)

[1] "logical"

Complex

The imaginary part must end with an i for the number to be considered complex.

typeof(5+1i)

[1] "complex"
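
As a short illustration, base R provides Re, Im and Mod to pull a complex number apart. The value 3+4i is an arbitrary example.

```r
Re(3 + 4i)    # 3: the real part
Im(3 + 4i)    # 4: the imaginary part
Mod(3 + 4i)   # 5: the modulus, sqrt(3^2 + 4^2)
typeof(1i)    # "complex"
```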

The other main type is raw, which holds raw bytes.
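
Raw values are rarely needed in everyday work, but a quick sketch shows what they look like. Raw bytes are printed in hexadecimal.

```r
typeof(as.raw(255))   # "raw"
as.raw(255)           # prints ff
charToRaw("A")        # prints 41, the byte (in hexadecimal) that encodes "A"
```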

Variables

Variables are used to store data, whose value can be changed according to our needs. The unique name given to a variable is called an identifier, as it identifies the data stored in memory.

Variables can usually act as both lvalues and rvalues, i.e. they can appear on the left side of the assignment operator (to receive a value) and on the right side (to supply a value).
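
A minimal sketch of the same variable appearing on both sides of an assignment. The names price and total are made up for this example.

```r
price <- 20          # price on the left: it receives a value
total <- price * 3   # price on the right: its stored value is read
total                # 60
```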

Naming Variables

You usually decide on the names to use for your variables. The rules for coming up with a variable name are:

- Identifiers can be a combination of letters, digits, period (.) and underscore (_) ONLY.
- An identifier must start with a letter or a period. If it starts with a period, it cannot be followed by a digit.
- Reserved words and constants in R cannot be used as identifiers.
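
One way to check these rules, sketched here under the assumption that base R's make.names applies them, is to see whether make.names leaves a candidate name unchanged. It rewrites anything that is not a syntactically valid name. The candidate names are invented for illustration.

```r
candidates <- c("day_one", ".rate", "total2", "2fast", ".2x", "my var", "if")

# A name is valid when make.names has nothing to repair
is_valid <- make.names(candidates) == candidates
is_valid
# TRUE TRUE TRUE FALSE FALSE FALSE FALSE
```

The last four fail because they start with a digit, start with a period followed by a digit, contain a space, and are a reserved word, respectively.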

By convention, variable and function names should be lowercase. Use an underscore (_) to separate words within a name. Generally, variable names should be nouns and function names should be verbs. Strive for names that are concise and meaningful.

    # Good
    day_one
    day_1

    # Bad
    first_day_of_the_month
    DayOne
    dayone
    djm1

The Assignment Operator

To make use of variables, we need to be able to assign values to them. This is done with the help of the assignment operator. Many languages restrict the assignment operator to a single symbol, =. That is not the case with R, which has several assignment operators.

The left assignment operator: <- or =

x <- 3
y = 2
a <- b <- 4 # assigning 4 to both a and b
d = e = 5 # assigning 5 to both d and e

The right assignment operator: ->
```
10 -> x # assigning 10 to x
```

Example of using a variable

x <- 10 # create a variable x with the value 10
x # implicitly print the value of x. We could also use print(x)

[1] 10

x * 2 # Multiply x by 2 ie 10*2

[1] 20

x <- x + 2 # increment x by 2
x #x is now 12

[1] 12

Note: Refrain from using built-in function names as variable names, e.g. c <- 3. c is a function in R and hence should not be used as a variable name.

Note: There is also an assign function which can be used to assign values to variables. The variable name needs to be written in literal form, i.e. in quotes.

assign("var_1", 3)
var_1

[1] 3

So far we have avoided using literal strings/characters on the left of an assignment. They too can be used in assignment, although this is bad practice.

"var_2" <- 39 # DO NOT USE THIS THOUGH IT WORKS
var_2

[1] 39

Reserved Words

While variable names could be almost anything, there are words reserved in R such that they can neither be changed nor used as variable names.

if	else	repeat	while	function
for	in	next	break	TRUE
FALSE	NULL	Inf	NaN	NA
NA_integer_	NA_real_	NA_complex_	NA_character_	..., ..1, ..2

TRUE <- 1

Error in TRUE <- 1: invalid (do_set) left-hand side to assignment

if <- 2

Error: <text>:1:4: unexpected assignment
1: if <-
       ^

Constants

These are rvalues. They cannot be on the left-hand side of the assignment operator. Though common in lower-level languages, R does not have many constants. Examples include numbers, e.g. 5; literal strings/characters, e.g. 'hello'; complex numbers, i.e. a number suffixed with the letter i, e.g. 5i, 3+9i; integers, e.g. 5L; hexadecimals, i.e. numbers preceded by 0X or 0x, e.g. 0xff; and logical values, e.g. TRUE.

5

[1] 5

3+9i

[1] 3+9i

0xff

[1] 255

TRUE

[1] TRUE

The value \(\pi\), which is a constant in nature, is just a normal variable in R. It can be changed. Hence be careful when dealing with these types of values.

pi

[1] 3.141593

pi <- 4
pi # pi changed

[1] 4

rm(pi) # remove the currently stored variable pi and revert back to the original pi
pi

[1] 3.141593

NOTE: The variables F and T store the logical values FALSE and TRUE respectively. Though logical, they can be changed. Hence refrain from using them in place of FALSE and TRUE, and refrain from naming your own variables F or T.

T

[1] TRUE

T <- FALSE # Change the T
T # changed T

[1] FALSE

rm(T) #remove the variable T and revert back to the original T
T

[1] TRUE

Writing Basic Functions

A function in R is an object containing multiple interrelated statements that are run together in a predefined order every time the function is called.

A simple function is defined with the keyword function and then stored under a variable name:

square_10 <- function() {
  10^2
}

The simple function above, when called, will return 100:

square_10()

[1] 100

It does not make sense to write a function that always returns a constant; we would rather define the constant itself. To make use of what functions offer, we need to define the function with parameters. This enables the function to evaluate the parameters in a predefined manner. A parameter is just a variable, i.e. a placeholder, whose value is passed into the function when the function is called.

Example: A function to square any number, not necessarily 10

square <- function(x){ # x is your parameter.
  x^2
}

square(10)

[1] 100

square(5)

[1] 25

rect_area <- function(len, width){ # takes 2 parameters
  len * width
}
rect_area(10, 5)

[1] 50

Take note that when calling any function in R, whether user-defined or built-in, we use parentheses, e.g. mean(a), etc.

Functions entail a lot more detail, which will be discussed at a later date.
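
To illustrate why the parentheses matter, the sketch below redefines the square function from earlier so that it is self-contained. With parentheses the function is called; without them, the name simply refers to the function object, and typing it at the console prints the definition instead of running it.

```r
square <- function(x) {
  x^2
}

square(4)             # 16: the parentheses call the function
square(x = 4)         # 16: the argument can also be passed by name
is.function(square)   # TRUE: without parentheses, square is the function object itself
```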

Comments

Comments are often an important part of a program as they describe what each part of the program does. It is often necessary to include them so that your code can be understood by someone else, or even by yourself when reviewing it later on. In R, as in Python, comments are preceded by a hash/pound symbol, i.e. #. Anything on a line from the hash onwards is considered commented out and is ignored by the interpreter.

1+1 # This is a comment

[1] 2

#This whole line is a comment on finding sin of degrees
sin(30*pi/180)

[1] 0.5

Exercise 2

1. A cylinder has a radius of r cm and a height of h cm. Write a function to obtain the surface area when completely covered: \(SA = 2\pi rh + 2\pi r^2\). Compute the SA when radius = 10 cm and height = 18 cm.
2. Let x = 3. What is the value of !x? Explain why that is the case. What numerical value can x take such that !x results in TRUE?
3. There are other assignment operators in R, i.e. <<- and ->>. What is the difference between these and the ones discussed above? Run ?`=` in R and read the help page to the end. See whether you can answer the question.
4. What are the differences between the “=” and “<-” assignment operators?