R Programming Language

23,231 views 35 slides Oct 12, 2021

About This Presentation

This presentation is a brief case study of the R programming language. It covers the scope of R, its uses, and its advantages and disadvantages.


Slide Content

Programming Language

History and Introduction
R is a programming language and free software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing. R is widely used by statisticians, data analysts and researchers for developing statistical software and for data analysis. It compiles and runs on a wide variety of UNIX platforms, Windows and macOS. The copyright for the primary source code of R is held by the R Foundation, and the code is published under the GNU General Public License version 2.0.

History and Introduction
R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand. It is currently developed and maintained by the R Core Team.
Applications of the R programming language include:
1. Statistical computing
2. Machine learning
3. Data science
R can be downloaded and installed from the CRAN (Comprehensive R Archive Network) website. R is cross-platform and portable: a program written on one platform can generally be moved to another platform and run there.

History and Introduction
Top-tier companies using R: companies all over the world use R for statistical analysis. Some of them are listed below.

Company      Application
Facebook     Behavior analysis related to status updates and profile pictures
Google       Advertising effectiveness and economic forecasting
Twitter      Data visualization and semantic clustering
Microsoft    Acquired Revolution Analytics (the company behind Revolution R) and uses R for a variety of purposes
Uber         Statistical analysis
Airbnb       Scaling data science

Evolution of R
R is a dialect of the S language: it is an implementation of the S programming language combined with lexical scoping semantics inspired by Scheme. The S language was created by John Chambers in 1976 at Bell Labs. A commercial version of S was offered as S-PLUS starting in 1988.
[Diagram: S version 1, S version 2, S version 3, S version 4, R]

Features of R
Some of the important features of R:
1. R is a simple, effective and well-developed programming language that includes conditionals, loops, user-defined and recursive functions, and input and output facilities.
2. R provides a large, coherent and integrated collection of tools for data analysis.
3. R has effective data handling and storage facilities.
4. R provides a suite of operators for calculations on arrays, lists, vectors and matrices.
5. R provides graphical facilities for data analysis and display, either on screen or in print.

Features of R
Fast Calculation: R can be used to perform complex mathematical and statistical calculations on a wide variety of data objects.
Extreme Compatibility: R is an interpreted language, which means that it does not need a compiler to turn the code into a program.
Open Source: R is an open-source software environment. You can make improvements and add packages for additional functionality.
Cross-Platform Support: R is machine-independent and can be used on many different operating systems.
Wide Packages: CRAN houses more than 10,000 packages and extensions that help solve all sorts of problems in data science.
Large Standard Library: R can produce static graphics with production-quality visualizations and has extended libraries that provide interactive graphics capabilities.

Syntax of R
Once the R environment is set up, start the R command prompt by typing R in a terminal (command prompt).
Hello World program:
    > myString <- "Hello World!"
    > print(myString)
    [1] "Hello World!"
The [1] at the start of the output line is the index of the first element printed on that line; here the result is a character vector of length one.
The syntax of R is discussed under the following topics: data types, variables, keywords, operators and data structures.

Data Types
R has six basic data types:

Data type    Examples
Integer      2L, 5L, 8L
Numeric      6, 2, 1, 9
Logical      TRUE, FALSE
Raw          raw bytes
Complex      z <- 3+7i
Character    'A', "Aditya", "AB12"
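The types in the table can be checked at the console with typeof(). The following is a minimal sketch; the sample values and the use of charToRaw() to obtain a raw value are illustrative choices, not taken from the slides.

```r
# Illustrative values, one per basic data type (assumed for this sketch)
values <- list(
  integer   = 2L,
  numeric   = 6,
  logical   = TRUE,
  raw       = charToRaw("A"),
  complex   = 3 + 7i,
  character = "Aditya"
)

# typeof() reports the internal storage type of each value
types <- vapply(values, typeof, character(1))
print(types)
```

Note that typeof() reports a plain number such as 6 as "double", which is the storage type behind R's numeric class.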

Variables
Rules for naming variables in R:
1. A variable name must be a combination of letters, digits, periods (.) and underscores (_).
2. It must start with a letter or a period; if it starts with a period, the period must not be followed by a digit.
3. Reserved words cannot be used as variable names.

Valid variables    Invalid variables
myValue            .1nikku
.my.value.one      TRUE
my_value_one       vik@sh
Data4              _temp
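One way to test these rules is with make.names(), which rewrites a string into a syntactically valid name. In the sketch below, a candidate that comes back unchanged is treated as valid; the variable names candidates and is_valid are introduced only for illustration.

```r
# Candidate names from the valid and invalid columns above
candidates <- c("myValue", ".my.value.one", "my_value_one", "Data4",
                ".1nikku", "TRUE", "vik@sh", "_temp")

# make.names() repairs invalid names; an unchanged result means
# the original string was already a valid variable name
is_valid <- make.names(candidates) == candidates
names(is_valid) <- candidates
print(is_valid)
```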

Keywords
Reserved words are a set of words that have special meaning and cannot be used as identifiers. R is case-sensitive, so they must be written exactly as shown.
Reserved keywords in R: if, else, repeat, while, function, for, in, next, break, TRUE, FALSE, NULL, Inf, NaN

Operators
In any programming language, an operator is a symbol that represents an action. R has several operators for tasks including arithmetic, logical and bitwise operations. Operators in R can mainly be classified into the following categories:
1. Arithmetic operators: + , - , * , / , %% , %/%
2. Logical operators: ! , & , && , | , ||
3. Assignment operators: <- , <<- , = , -> , ->>
4. Relational operators: < , > , <= , >= , != , ==
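A few of the less familiar operators are sketched below. The values 7 and 3 and all variable names are assumptions chosen for illustration.

```r
a <- 7
b <- 3

remainder <- a %% b     # modulo: remainder of 7 / 3
quotient  <- a %/% b    # integer division of 7 by 3

# & and | compare vectors element by element
elementwise <- c(TRUE, FALSE, TRUE) & c(TRUE, TRUE, FALSE)

# && and || work on single values and stop as soon as the result is known
single <- (a > b) && (b > 0)

# -> assigns from left to right
10 -> c_value

print(remainder)
print(quotient)
print(elementwise)
print(single)
print(c_value)
```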

Functions
Functions group sets of instructions that you want to use repeatedly. There are two types of functions: built-in and user-defined.

Built-In
Built-in functions are provided by R and its standard libraries, so they can be used directly. R has many built-in functions that make programming faster and easier. For example:
1. The sum(a, b) function returns a + b.
    > print(sum(10, 20))
    [1] 30
2. The seq(a, b) function returns the sequence from a to b.
    > print(seq(5, 15))
    [1]  5  6  7  8  9 10 11 12 13 14 15

User Defined
User-defined functions are functions that we define in our own code and use repeatedly. They can be defined in two ways:
1. Without arguments
    myFunction <- function() {
      # This will be printed on calling this function
      print("Without Arguments")
    }
2. With arguments
    myFunction <- function(a, b) {
      # This function prints the sum of the passed arguments
      print(a + b)
    }
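The functions above print their result. A function can also return a value and give an argument a default, as in this minimal sketch; add_numbers is a hypothetical name used only here.

```r
# add_numbers is a hypothetical function for illustration
add_numbers <- function(a, b = 0) {
  # The value of the last evaluated expression is returned
  a + b
}

total  <- add_numbers(10, 20)   # both arguments supplied
only_a <- add_numbers(5)        # b falls back to its default of 0

print(total)
print(only_a)
```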

Conditional Statements
Conditional statements in R are used to make decisions based on conditions. Without a surrounding condition, statements simply execute one after another. Three types of conditional statements are discussed here:
1. if-else statements
2. if-else if-else statements
3. switch statements

If-else
[Flowchart: Start, "Condition true?"; yes: execute if block; no: execute else block; End]
Syntax:
    if (condition) {
      expression 1
    } else {
      expression 2
    }
Example:
    if (a > b) {
      print("a is greater than b")
    } else {
      print("a is not greater than b")
    }
The keywords if and else must be written in lowercase.
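The if-else if-else form listed earlier follows the same pattern with an extra branch. The sketch below wraps it in a hypothetical helper named compare so that each branch can be tried.

```r
# compare is a hypothetical helper used only for this sketch
compare <- function(a, b) {
  if (a > b) {
    "a is greater than b"
  } else if (a < b) {
    "a is less than b"
  } else {
    "a is equal to b"
  }
}

print(compare(5, 3))
print(compare(2, 9))
print(compare(4, 4))
```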

Switch statement
The switch() function takes two kinds of arguments: a value, followed by a list of items. The value is evaluated and the corresponding item is returned. If the value matches more than one item in the list, switch() returns the first item that matched.
Examples:
    > switch(2, "Delhi", "Jaipur", "Mumbai")
    [1] "Jaipur"
    > a <- 3
    > switch(a, "red", "blue", "green", "yellow")
    [1] "green"
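switch() can also match by name when the value is a character string, and a final unnamed item then acts as the default. A minimal sketch, in which the city codes and the helper city_for are made up for illustration:

```r
# city_for and its codes are hypothetical, for illustration only
city_for <- function(code) {
  switch(code,
         DL = "Delhi",
         JP = "Jaipur",
         MB = "Mumbai",
         "unknown")   # unnamed last item is the default
}

by_position <- switch(2, "Delhi", "Jaipur", "Mumbai")
by_name     <- city_for("MB")
fallback    <- city_for("XX")

print(by_position)
print(by_name)
print(fallback)
```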

Loops
Loops are used when we need to execute a particular block of code repeatedly. R has three types of loops:
1. For loop
2. While loop
3. Repeat loop

For Loop
A for loop is used to iterate over a vector in R.
[Flowchart: for each item in sequence, "Last item reached?"; no: body of for loop; yes: exit loop]
Example: count the even numbers in a vector.
Program:
    x <- c(2, 5, 3, 9, 8, 11, 6)
    count <- 0
    for (i in x) {
      if (i %% 2 == 0) {
        count <- count + 1
      }
    }
    print(count)
Output:
    [1] 3
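Because R's operators work on whole vectors, the same count can often be written without an explicit loop. The sketch below compares the two forms; count_loop and count_vec are names introduced for illustration.

```r
x <- c(2, 5, 3, 9, 8, 11, 6)

# Loop version, as in the example above
count_loop <- 0
for (i in x) {
  if (i %% 2 == 0) {
    count_loop <- count_loop + 1
  }
}

# Vectorized version: x %% 2 == 0 gives one TRUE/FALSE per element,
# and sum() counts the TRUE values
count_vec <- sum(x %% 2 == 0)

print(count_loop)
print(count_vec)
```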

While Loop
In R, a while loop repeats a block of code as long as a specific condition remains true.
[Flowchart: Start, "Condition true?"; yes: execute code of while block; no: execute code outside while block]
Program:
    i <- 1
    while (i < 5) {
      print(i)
      i <- i + 1
    }
Output:
    [1] 1
    [1] 2
    [1] 3
    [1] 4

Repeat Loop
A repeat loop is used to iterate over a block of code multiple times. There is no condition check in a repeat loop to exit the loop. We must put a condition explicitly inside the body of the loop and use the break statement to exit. Failing to do so results in an infinite loop.
[Flowchart: enter loop, body of loop, "Break?"; yes: exit; no: remaining body of loop]
Example:
    x <- 1
    repeat {
      print(x)
      x <- x + 1
      if (x == 4) {
        break
      }
    }
Output:
    [1] 1
    [1] 2
    [1] 3

Data Structures
A data structure is a particular way of organizing data in a computer so that it can be used effectively. The idea is to reduce the space and time complexity of different tasks. Data structures in R are tools for holding multiple values. The most essential data structures in R are:
1. Vectors
2. Arrays
3. Factors
4. Lists
5. Matrices
6. Data frames

Vector
A vector is one of the basic data structures of R. It supports the integer, double, character, logical, complex and raw data types. The elements of a vector are known as its components.
Vector creation: a vector can be created using either of these two methods.
1. Using the colon (:) operator
    a <- 2:8
    print(a)    # 2 3 4 5 6 7 8
2. Using the seq() function
    a <- seq(2, 10, by = 2)
    print(a)    # 2 4 6 8 10

Vector
Vector operations:
1. Combining vectors
    a <- c(4, 3, 5)
    b <- c('x', 'y', 'z')
    c <- c(a, b)
    print(c)    # "4" "3" "5" "x" "y" "z" (the numbers are coerced to character)
2. Arithmetic operations
    a <- c(1, 2, 3)
    b <- c(4, 5, 6)
    d <- a + b
    print(d)    # 5 7 9
3. Numeric indexing
    a <- c(4, 3, 5)
    print(a[2])    # 3
4. Duplicate indexing
    a <- c(4, 3, 5)
    print(a[c(1, 2, 2, 3, 3)])    # 4 3 3 5 5
5. Logical indexing
    a <- c(4, 3, 5)
    print(a[c(TRUE, FALSE, TRUE)])    # 4 5
6. Range indexing
    a <- c(1, 2, 3, 4, 5, 6, 7)
    print(a[2:6])    # 2 3 4 5 6

Array
Arrays allow us to store data in multiple dimensions and use it efficiently.
Array creation syntax:
    Array_Name <- array(data, dim = c(row_size, column_size, matrices), dimnames = dim_names)
Array operations:
1. Accessing array elements: indexing is similar to other programming languages such as C, C++ and Java, except that R indices start at 1. A three-dimensional array takes one index per dimension.
    print(Arr[2, 2, 1])
2. Arithmetic operations on arrays with the same dimensions:
    Arr3 <- Arr2 + Arr1
    Arr3 <- Arr1 - Arr2
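The creation syntax and both operations can be tried with a small concrete array. In this sketch the data values, dimension names and variable names are all assumptions chosen for illustration.

```r
# 2 rows, 3 columns, 2 matrices; the values 1..12 are illustrative
arr1 <- array(1:12, dim = c(2, 3, 2),
              dimnames = list(c("r1", "r2"),
                              c("c1", "c2", "c3"),
                              c("m1", "m2")))

# One index per dimension: row 2, column 2, first matrix
element <- arr1[2, 2, 1]

# Arithmetic works element by element on arrays of the same shape
arr2 <- array(12:1, dim = c(2, 3, 2))
arr3 <- arr1 + arr2

print(element)
print(arr3)
```

Arrays are filled column by column, so the first matrix of arr1 holds 1 and 2 in its first column, 3 and 4 in its second, and 5 and 6 in its third.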

Data Frame
A data frame is a table, or two-dimensional array-like structure.
Important considerations:
1. The column names should be non-empty.
2. The row names should be unique.
3. The data stored in a data frame can be of numeric, factor or character type.
4. Each column should contain the same number of data items.
Data frame creation:
    products <- data.frame(
      product_number = 1:4,
      product_name = c("Apple", "Samsung", "Redmi", "Oppo"))
    print(products)
Output:
      product_number product_name
    1              1        Apple
    2              2      Samsung
    3              3        Redmi
    4              4         Oppo
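Once created, a data frame can be inspected by column, by row, or by its dimensions. A minimal sketch using the same products table; the result variable names are introduced for illustration.

```r
products <- data.frame(
  product_number = 1:4,
  product_name   = c("Apple", "Samsung", "Redmi", "Oppo"),
  stringsAsFactors = FALSE   # keep names as character, not factor
)

names_col  <- products$product_name   # one column, as a vector
second_row <- products[2, ]           # one row, still a data frame
n_rows     <- nrow(products)
n_cols     <- ncol(products)

print(names_col)
print(second_row)
```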

Lists
A list is a data structure whose components can be of mixed data types. A list can be created using the list() function.
Example:
    x <- list(a = "amba", b = 9.23, c = TRUE)    # list storing 3 different data types
Accessing list elements (double brackets return the element itself; single brackets such as x['b'] return a one-element list):
    print(x[['b']])    # 9.23
    print(x[['a']])    # "amba"
Manipulating list elements:
    x[['a']] <- "nitin"
    print(x[['a']])    # "nitin"
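The difference between single and double brackets on a list is easy to miss, so here is a small sketch of it using the same list; the variable names on the left are introduced for illustration.

```r
x <- list(a = "amba", b = 9.23, c = TRUE)

sub_list   <- x["b"]     # single brackets: a list containing b
value      <- x[["b"]]   # double brackets: the element itself
same_value <- x$b        # $ is shorthand for [[ ]] with a name

x[["a"]] <- "nitin"      # replace an element in place

print(class(sub_list))
print(value)
print(x$a)
```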

Matrices
In R, a two-dimensional rectangular data set is known as a matrix. A matrix is created by passing an input vector to the matrix() function. We can perform addition, subtraction, multiplication and division on matrices.
Creating a matrix:
    Matrix1 <- matrix(2:7, nrow = 2, ncol = 3)
    print(Matrix1)
Output:
         [,1] [,2] [,3]
    [1,]    2    4    6
    [2,]    3    5    7
Accessing elements:
    Matrix1[2, 3]    # 7
Assigning a value:
    Matrix1[2, 3] <- 1

Matrices
Operations on matrices:
1. Addition: Matrix3 <- Matrix1 + Matrix2
2. Subtraction: Matrix3 <- Matrix2 - Matrix1
3. Multiplication by a constant: 7 * Matrix1
4. Identity matrix: diag(5)
5. Transposition: t(Matrix1)
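In R the * operator multiplies matrices element by element, while the matrix product uses %*%. The sketch below shows both on the 2 x 3 matrix from the earlier example; the names m1, elementwise and product are assumptions for illustration.

```r
m1 <- matrix(2:7, nrow = 2, ncol = 3)

# Element-wise multiplication: each entry is squared
elementwise <- m1 * m1

# Matrix product: (2 x 3) times (3 x 2) gives a 2 x 2 result
product <- m1 %*% t(m1)

print(elementwise)
print(product)
```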

Factors
Factors are data objects used to categorise data and store it as levels. For example, a data field such as marital status may contain only the values single, married, separated, divorced or widowed.
    > x
    [1] single  married married single
    Levels: married single
Here, factor x has four elements and two levels. We can check whether a variable is a factor using the class() function.
    > class(x)
    [1] "factor"
    > levels(x)
    [1] "married" "single"
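The slide shows the factor x but not how it was built. One way to construct an equivalent factor is sketched below; the intermediate names status, codes and counts are assumptions for illustration.

```r
# Character data to be categorised
status <- c("single", "married", "married", "single")

# factor() stores the distinct values as levels, sorted alphabetically
x <- factor(status)

codes  <- as.integer(x)   # position of each element's level
counts <- table(x)        # number of elements per level

print(x)
print(levels(x))
print(counts)
```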

RStudio
Interacting with RStudio: RStudio is a free and open-source integrated development environment (IDE) for R, a programming language for statistical computing and graphics. RStudio was founded by JJ Allaire, creator of the programming language ColdFusion. There are four main sections in the RStudio IDE:
1. Code editor
2. Workspace and history
3. R console
4. Plots and files

RStudio
RStudio is available in two editions:
1. RStudio Desktop, where the program is run locally as a regular desktop application.
2. RStudio Server, where RStudio runs on a remote server and is accessed through a web browser.
Prepackaged distributions of RStudio Desktop are available for Windows, macOS and Linux. RStudio is written in the C++ programming language and uses the Qt framework for its graphical user interface.

Why learn R?

Thank You…!