What is R? R is an interactive language and environment for statistical computing and graphics.
In an oversimplified sense, think of a programmable calculator, then think of R as an advanced programmable calculator. The difference is that you need to know how to “talk” the R language in order to tell it what you want. The “talk” occurs through writing.
Each language has its own specific syntax. This is simply a set of rules that lets the writer (you) and the reader (the computer) make sense of the written sentences. Even on a calculator, you cannot write 3+3+. This will throw an error, because the statement/sentence is syntactically incorrect.
What is RStudio?
RStudio is an integrated development environment (IDE) for R. Think of it as a software application that makes it easy to run R. Although the original objective of RStudio was to make running R easy, it has expanded so that we can use it to build various tools. These notes (website), for example, were created using Quarto in RStudio.
RStudio has 4 panes:
Editor pane - This is where you write your code.
Console pane - This is where you run your code.
Environment pane - Gives you an overview of the variables currently stored in memory.
Plot pane - Shows the graphs plotted.
For easier access to your code later, make sure you write it in the editor pane.
In this course we will learn R and its syntax.
You can do any normal calculations in R, the same way you do in a calculator.
Try it out.
These are functions used to do basic math operations. They are subdivided into the following categories:
Arithmetic Operators: used to carry out mathematical operations
Operator | Description |
---|---|
+ | Addition |
- | Subtraction |
* | Multiplication |
/ | Division |
^ or ** | Exponent |
%% | Modulus (Remainder from division) |
%/% | Integer Division |
5**3
[1] 125
5^3
[1] 125
5%%3
[1] 2
5%/%3
[1] 1
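As a small sketch of how the remaining operators behave (the numbers 7 and 2 are picked only for illustration), note that / keeps the fractional part, while %/% and %% split a division into its whole-number quotient and its remainder:

```r
7 + 2     # addition: 9
7 - 2     # subtraction: 5
7 * 2     # multiplication: 14
7 / 2     # ordinary division keeps the fraction: 3.5
7 %/% 2   # integer division drops the fraction: 3
7 %% 2    # remainder left over after the integer division: 1

# Quotient and remainder together rebuild the original number
(7 %/% 2) * 2 + 7 %% 2   # 7
```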
Others include
Relational Operators: Used to compare between two values.
Operator | Description |
---|---|
< | Less than |
> | Greater than |
<= | Less than or equal to |
>= | Greater than or equal to |
== | Equal to |
!= | Not equal to |
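A brief illustration of the relational operators, using the arbitrary values 7 and 10. Each comparison returns a logical value, TRUE or FALSE:

```r
7 < 10    # TRUE
7 > 10    # FALSE
7 <= 7    # TRUE, because the two values are equal
7 >= 10   # FALSE
7 == 10   # FALSE
7 != 10   # TRUE
```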
Logical Operators: Used to combine or negate logical values.
Operator | Description |
---|---|
& | AND |
\| | OR |
! | NOT |
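The logical operators can be tried directly on TRUE and FALSE, or on the results of comparisons. The values below are chosen only for illustration:

```r
TRUE & FALSE         # AND is TRUE only when both sides are TRUE: FALSE
TRUE | FALSE         # OR is TRUE when at least one side is TRUE: TRUE
!FALSE               # NOT flips the value: TRUE

# Combining the results of comparisons
(5 > 2) & (5 < 8)    # both comparisons hold: TRUE
(5 > 9) | (5 == 5)   # only the second comparison holds: TRUE
```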
Most of the functions/operators above are binary, i.e. they work on two inputs, e.g. x + y where x and y are your inputs. Some can also accept just one input and are then considered unary operators, e.g. -x to negate x and !x to negate the logical value of x.
-3
[1] -3
!TRUE
[1] FALSE
For math computation, the order of operations (PEMDAS/BODMAS) is strictly observed.
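A few expressions that show the order of operations at work; the numbers are arbitrary. Note the last pair: the exponent is applied before the unary minus, so parentheses are needed to square a negative number.

```r
2 + 3 * 4     # multiplication before addition: 14
(2 + 3) * 4   # parentheses first: 20
2 * 3^2       # exponent before multiplication: 18
(2 * 3)^2     # 36
-2^2          # exponent before the unary minus: -4
(-2)^2        # 4
```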
But R is an advanced calculator. How can I compute trigonometric values? The only downside with R is that you need to know the names of the functions you want. For math functions this is simple, as they are named the same way they are written in math. Look at the list of math functions below:
abs, sign, sqrt, ceiling, floor, trunc, cummax, cummin, cumprod, cumsum, log, log10, log2, log1p, acos, acosh, asin, asinh, atan, atanh, exp, expm1, cos, cosh, cospi, sin, sinh, sinpi, tan, tanh, tanpi, gamma, lgamma, digamma, trigamma
From the list above, it is easy to tell what the sqrt function does, i.e. it is the \(\sqrt{~~}\) function. We can also tell what exp, sin, cos, tan are. But what about atan, asin, tanh, etc.?
This means that if you need to compute \(\sin^{-1}(x)\), you have to know that the inverse sine function is represented as asin in R, where the a stands for arc, i.e. the arc sine function. In other languages the same function may have a different name, e.g. Python's NumPy library calls it arcsin.
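A short sketch of calling some of the math functions listed earlier. The inputs are chosen for illustration only; note that the trigonometric functions work in radians.

```r
sqrt(16)       # square root: 4
abs(-3.7)      # absolute value: 3.7
floor(3.7)     # round down: 3
ceiling(3.2)   # round up: 4
exp(0)         # e raised to the power 0: 1
log(exp(2))    # the natural log undoes exp: 2
sin(pi / 2)    # sine of 90 degrees, given in radians: 1
asin(1)        # inverse sine, i.e. pi/2 radians: 1.570796
```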
From now on, you are expected to know the names of the functions you need before using them. If you are not sure what a function is called, you can Google it.
Compute the following using R:
Round off the following numbers using R:
\(980, 950, 930\) to the nearest 100. Hint: round(123, -2) rounds to the nearest 100.
\(98, 95, 93\) to the nearest 10
\(9.8, 9.5, 9.3\) to the nearest 1
\(0.98, 0.95, 0.93\) to the nearest 0.1
\(0.098, 0.095, 0.093\) to the nearest 0.01
Determine the final output of the following operations and check your answers against those produced by R:
(3 > 4) | TRUE
3 > 4 | TRUE
(3 > 4) & TRUE
(3 > 4) | FALSE
3 >= 3
3 != 4
Doubles - Real numbers.
Integer - Whole numbers. To create one, append the letter L to the number.
typeof(5)
[1] "double"
typeof(5L)
[1] "integer"
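By default, numbers are stored internally as doubles, which gives them more space, i.e. a more precise representation. That precision is still finite, so some decimal results are only approximate: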
0.1+0.2 == 0.3
[1] FALSE
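Because of this, comparing doubles with == can be fragile. One common approach, sketched here with base R's all.equal function, is to compare within a small tolerance instead:

```r
0.1 + 0.2 == 0.3                    # FALSE: the two doubles differ very slightly
print(0.1 + 0.2, digits = 17)       # printing more digits reveals the difference
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE: equal within a small tolerance
```

all.equal returns TRUE when the values are nearly equal and a descriptive message otherwise, which is why it is wrapped in isTRUE here.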
Character - Strings, names.
typeof("a")
[1] "character"
Logical - TRUE or FALSE values.
typeof(TRUE)
[1] "logical"
Complex - Must contain an i at the end for it to be considered complex.
typeof(5+1i)
[1] "complex"
The other main type is raw, which stores raw bytes.
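As a minimal illustration of the raw type, base R's as.raw function converts a number between 0 and 255 into a single byte, which R prints in hexadecimal:

```r
typeof(as.raw(255))   # "raw"
as.raw(255)           # printed in hexadecimal: ff
```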
Variables are used to store data, and their values can be changed as needed. The unique name given to a variable is its identifier, as it identifies the data stored in memory.
Variables are usually both lvalues and rvalues, i.e. they can appear on the left side of the assignment operator and also on the right side.
You usually decide on the names to use for your variables. The rules for coming up with a variable name are:
Identifiers can be a combination of letters, digits, period (.) and underscore (_) ONLY.
It must start with a letter or a period. If it starts with a period, it cannot be followed by a digit.
Reserved words and Constants in R cannot be used as identifiers.
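One way to experiment with these rules is base R's make.names function, which rewrites a string into a syntactically valid name. As a rough check (an assumption for illustration, not an official validity test), a name that make.names leaves unchanged follows the rules above. The example names are made up.

```r
make.names("day_1") == "day_1"       # TRUE: letters, digits and underscore
make.names(".value") == ".value"     # TRUE: period followed by a letter
make.names(".2value") == ".2value"   # FALSE: period followed by a digit
make.names("2value") == "2value"     # FALSE: starts with a digit
make.names("my-var") == "my-var"     # FALSE: hyphen is not allowed
make.names("if") == "if"             # FALSE: reserved word
```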
Variable and function names should be lowercase. Use an underscore (_) to separate words within a name. Generally, variable names should be nouns and function names should be verbs. Strive for names that are concise and meaningful.
# Good
day_one
day_1
# Bad
first_day_of_the_month
DayOne
dayone
djm1
In order to make use of variables, we need to be able to assign values to them. This is done with the help of the assignment operator. Often a language restricts the assignment operator to only one symbol, =. That is not the case with R. In R we have several assignment operators.
The left assignment operators: <- or =
x <- 3
y = 2
a <- b <- 4 # assigning 4 to both a and b
d = e = 5 # assigning 5 to both d and e
The right assignment operator ->
10 -> x # assigning 10 to x
Example of using a variable
x <- 10 # create a variable x with the value 10
x # implicitly print the value of x. We could also use print(x)
[1] 10
x * 2 # Multiply x by 2 ie 10*2
[1] 20
x <- x + 2 # increment x by 2
x # x is now 12
[1] 12
Note: Refrain from using inbuilt function names as variable names, e.g. c <- 3. c is a function in R and hence should not be used as a variable name.
Note: There is an assign function which can also be used to assign values to variables. The variable name needs to be written as a string, i.e. in quotes:
assign("var_1", 3)
var_1
[1] 3
So far we have avoided using literal strings/characters as variable names. They too can be used in assignment, although this is bad practice.
"var_2" <- 39 # DO NOT USE THIS THOUGH IT WORKS
var_2
[1] 39
While variable names could be almost anything, some words are reserved in R: they cannot be changed, nor can they be used as variable names.
if | else | repeat | while | function |
---|---|---|---|---|
for | in | next | break | TRUE |
FALSE | NULL | Inf | NaN | NA |
NA_integer_ | NA_real_ | NA_complex_ | NA_character_ | ..., ..1, ..2 |
TRUE <- 1
Error in TRUE <- 1: invalid (do_set) left-hand side to assignment
if <- 2
Error: <text>:1:4: unexpected assignment
1: if <-
^
Constants are rvalues: they cannot be on the left-hand side of the assignment operator. Though common in lower-level languages, R does not have many constants. Examples include numbers, e.g. 5; literal strings/characters, e.g. 'hello'; complex numbers, i.e. numbers suffixed with the letter i, e.g. 5i, 3+9i; integers, e.g. 5L; hexadecimals, i.e. numbers preceded by 0X or 0x, e.g. 0xff; and logical values, e.g. TRUE.
5
[1] 5
3+9i
[1] 3+9i
0xff
[1] 255
TRUE
[1] TRUE
The value \(\pi\), which is a constant in nature, is just a normal variable in R. It can be changed, so be careful when dealing with these types of values:
pi
[1] 3.141593
pi <- 4
pi # pi changed
[1] 4
rm(pi) # remove the stored variable pi and revert back to the original pi
pi
[1] 3.141593
NOTE: The variables F and T store the logical values FALSE and TRUE respectively. Though logical, they can be changed. Hence refrain from using them, and refrain from naming your own variables F or T.
T
[1] TRUE
T <- FALSE # Change the T
T # changed T
[1] FALSE
rm(T) #remove the variable T and revert back to the original T
T
[1] TRUE
A function in R is an object containing multiple interrelated statements that are run together in a predefined order every time the function is called.
A simple function is defined by the keyword function and then stored in a variable name, e.g.:
square_10 <- function() {
  10^2
}
The simple function above, when called, will return 100:
square_10()
[1] 100
It does not make sense to write a function that always returns a constant; we would rather just define the constant itself. To make real use of a function, we need to define it with some parameters. This enables the function to evaluate the parameters in a predefined manner. A parameter is just a variable, i.e. a placeholder, whose value is passed into the function when the function is called.
Example: A function to square any number, not necessarily 10
square <- function(x){ # x is your parameter.
  x^2
}
square(10)
[1] 100
square(5)
[1] 25
rect_area <- function(len, width){ # takes 2 parameters
  len * width
}
rect_area(10, 5)
[1] 50
Take note that when calling any function in R, whether user-defined or inbuilt, we use parentheses, e.g. mean(a).
There are a lot more details to functions, but those will be discussed at a later date.
A cylinder has a radius of r cm and a height of h cm. Write a function to obtain the surface area when completely covered: \(SA = 2\pi rh + 2\pi r^2\). Compute the SA when radius = 10 cm and height = 18 cm.
Let x = 3. What is the value of !x? Elaborate on why that is the case. What numerical value can x take such that !x results in TRUE?
There are other assignment operators in R, i.e. <<- and ->>. What is the difference between these and the ones discussed above? Run ?`=` in R and read the help page to the end. See whether you can answer the question.
What are the differences between “=” and “<-” assignment operators?
Comments
Comments are often an important part of a program, as they describe what each part of the program does. It is often necessary to include them so that your code can be understood by someone else, or even by yourself when reviewing it later. In R, as in Python, comments are preceded by the hash (also called sharp or pound) symbol, i.e. #
Thus everything on a line from the hash onwards is considered commented out, as it will not be parsed by the interpreter.
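A small sketch of comments in practice. The variable names radius and area are made up for illustration:

```r
# This whole line is a comment and is ignored by R
radius <- 10            # a comment can also follow code on the same line
area <- pi * radius^2   # area of a circle with that radius
# area <- 0             # commented-out code is not run
area                    # still holds pi * 10^2
```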