Data Manipulation_dplyr_string in R programming

shubhragoyal11 1 views 27 slides Sep 16, 2025


Welcome to the Digital Regenesys course in R Programming

We will be starting shortly …

Data Manipulation in R

Using the dplyr package (refer to dplyr_Demo.R)
Using strings (refer to all files in the folder String_Operations)

dplyr package

Function - filter()

Keeps only the rows that satisfy the given condition.

    library(dplyr)

    # Create a sample data frame
    df <- data.frame(
      name = c("John", "Sarah", "Mike", "Emily"),
      age = c(23, 28, 19, 32),
      gender = c("M", "F", "M", "F")
    )

    # Filter the data frame to include only individuals who are over 25 years old
    filtered_df <- filter(df, age > 25)

    # Print the filtered data frame
    print(filtered_df)

Output:

       name age gender
    1 Sarah  28      F
    2 Emily  32      F


Function - arrange()

    library(dplyr)

    # Create a sample data frame
    df <- data.frame(
      name = c("John", "Sarah", "Mike", "Emily"),
      age = c(23, 28, 19, 32),
      gender = c("M", "F", "M", "F")
    )

    # Arrange the data frame by age in ascending order
    arranged_df <- arrange(df, age)

    # Print the arranged data frame
    print(arranged_df)

Output:

       name age gender
    1  Mike  19      M
    2  John  23      M
    3 Sarah  28      F
    4 Emily  32      F


Function - select() & rename()

    library(dplyr)

    # Create a sample data frame
    df <- data.frame(
      name = c("John", "Sarah", "Mike", "Emily"),
      age = c(23, 28, 19, 32),
      gender = c("M", "F", "M", "F")
    )

    # Select only the 'name' and 'age' columns from the data frame
    selected_df <- select(df, name, age)

    # Rename the 'name' column to 'full_name'
    renamed_df <- rename(selected_df, full_name = name)

    # Print the renamed data frame
    print(renamed_df)

Output:

      full_name age
    1      John  23
    2     Sarah  28
    3      Mike  19
    4     Emily  32


Function - mutate() and transmute()

    library(dplyr)

    # Create a sample data frame
    df <- data.frame(
      name = c("John", "Sarah", "Mike", "Emily"),
      age = c(23, 28, 19, 32),
      gender = c("M", "F", "M", "F")
    )

    # Use mutate to add a new column to the data frame that calculates the age in months
    mutated_df <- mutate(df, age_in_months = age * 12)

    # Print the mutated data frame
    print(mutated_df)

    # Use transmute to create a new data frame that only includes the name and age in months columns
    transmuted_df <- transmute(df, name, age_in_months = age * 12)

    # Print the transmuted data frame
    print(transmuted_df)

Output:

       name age gender age_in_months
    1  John  23      M           276
    2 Sarah  28      F           336
    3  Mike  19      M           228
    4 Emily  32      F           384

       name age_in_months
    1  John           276
    2 Sarah           336
    3  Mike           228
    4 Emily           384


Function - summarise()

    library(dplyr)

    # Create a sample data frame
    df <- data.frame(
      name = c("John", "Sarah", "Mike", "Emily"),
      age = c(23, 28, 19, 32),
      gender = c("M", "F", "M", "F")
    )

    # Calculate the mean age of the data frame
    summary_df <- summarise(df, mean_age = mean(age))

    # Print the summary data frame
    print(summary_df)

Output:

      mean_age
    1     25.5


Function - sample_n() and sample_frac()

    library(dplyr)

    # Create a sample data frame
    df <- data.frame(
      name = c("John", "Sarah", "Mike", "Emily"),
      age = c(23, 28, 19, 32),
      gender = c("M", "F", "M", "F")
    )

    # Sample 2 random rows from the data frame
    sample_n_df <- sample_n(df, 2)

    # Print the sampled data frame
    print(sample_n_df)

    # Sample 50% of the rows from the data frame
    sample_frac_df <- sample_frac(df, 0.5)

    # Print the sampled data frame
    print(sample_frac_df)

Output (the rows are drawn at random, so they differ from run to run):

       name age gender
    2 Sarah  28      F
    4 Emily  32      F

       name age gender
    1  John  23      M
    4 Emily  32      F


String Manipulation (refer to all files in the folder String_Operations)

String manipulation is the process of handling and analyzing strings. Topics covered:

Concatenation of strings
Calculating the length of strings
Case conversion of strings
Character replacement
Splitting a string
Working with substrings

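Concatenation

paste() function: returns the joined string
    Syntax: paste(..., sep = " ", collapse = NULL)
cat() function: prints the joined string to the console, or writes it to a file
    Syntax: cat(..., sep = " ", file = "")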
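As a minimal sketch of the two functions above (the strings are illustrative values, not taken from the String_Operations files), the following shows how sep and collapse change the result of paste(), and that cat() prints rather than returns a string:

```r
# Illustrative values, assumed for this example only
first <- "Data"
second <- "Manipulation"

# paste() joins its arguments; sep is placed between them (a space by default)
joined <- paste(first, second)
dashed <- paste(first, second, sep = "-")

# collapse joins the elements of a single vector into one string
collapsed <- paste(c("a", "b", "c"), collapse = "+")

print(joined)
print(dashed)
print(collapsed)

# cat() writes to the console and returns NULL, so it is used for display only
cat(first, second, sep = " ")
cat("\n")
```

Use paste() when the joined string is needed later in the program, and cat() when it only has to be shown.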

Calculating length of strings

length() returns the number of strings (elements) in a character vector
nchar() returns the number of characters in each string
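The difference between the two is easy to miss, so here is a small sketch on an assumed character vector (the values are made up for illustration):

```r
# Illustrative vector of three strings
words <- c("R", "dplyr", "string")

# length() counts the elements of the vector
n_strings <- length(words)

# nchar() counts the characters of each element separately
n_chars <- nchar(words)

print(n_strings)
print(n_chars)
```

Note that length("dplyr") is 1, because a single string is a vector with one element.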

Case conversion

toupper() converts all characters of a string to upper case
tolower() converts all characters of a string to lower case
casefold() converts all characters to lower case by default, or to upper case when called with upper = TRUE
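A short sketch of the three functions on an assumed sample string:

```r
# Illustrative string, assumed for this example
text <- "Hello World"

upper <- toupper(text)
lower <- tolower(text)

# casefold() lowers by default; upper = TRUE switches it to upper case
folded_lower <- casefold(text)
folded_upper <- casefold(text, upper = TRUE)

print(c(upper, lower, folded_lower, folded_upper))
```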

Character replacement

chartr(oldchar, newchar, string)
Every instance of each old character is replaced by the corresponding new character in the specified set of strings
oldchar must not be longer than newchar; otherwise R raises an error
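The sketch below illustrates the character-by-character mapping and the length rule. The input strings are assumptions chosen for illustration, and tryCatch() is used only to show the error case without stopping the script:

```r
# Replace every "a" with "o"
single <- chartr("a", "o", "banana")

# Several characters at once: a -> x, b -> y, c -> z (matched by position)
multiple <- chartr("abc", "xyz", "aabbcc")

# When the old set is longer than the new set, chartr() signals an error;
# here the error is caught and replaced by a marker string
too_long <- tryCatch(
  chartr("ab", "x", "cab"),
  error = function(e) "error"
)

print(c(single, multiple, too_long))
```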

Splitting the string

strsplit()
Syntax: strsplit(x, split)
x: character vector, each element of which is split
split: the character (or regular expression) to split on; it is removed from the result, and the string is cut at each place it occurs
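As a hedged illustration with made-up input strings: strsplit() returns a list with one element per input string, so the pieces of a single string are taken with [[1]]. Because split is treated as a regular expression by default, fixed = TRUE is used here when splitting on a literal dot.

```r
# Illustrative sentence, assumed for this example
sentence <- "dplyr makes data manipulation easy"

# The result is a list of length 1; [[1]] extracts the character vector of pieces
pieces <- strsplit(sentence, " ")[[1]]

# Splitting a vector of strings gives one list element per string
date_parts <- strsplit(c("2025-09-16", "2024-01-02"), "-")

# "." is a regex wildcard, so fixed = TRUE is needed to split on a literal dot
dotted <- strsplit("a.b.c", ".", fixed = TRUE)[[1]]

print(pieces)
print(date_parts)
print(dotted)
```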

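Working with substrings

substr() or substring()
Syntax: substr(string, start, end)
string: character vector
start: starting index of the substring
end: ending index of the substring
substring() calls the same positions first and last, and last may be omitted to read to the end of the string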
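A brief sketch of both functions on an assumed sample word. Indices in R start at 1 and both ends are inclusive; the last step shows that substr() can also be used on the left of an assignment to overwrite characters in place.

```r
# Illustrative word, assumed for this example
word <- "Manipulation"

# Characters 1 through 5
head_part <- substr(word, 1, 5)

# From character 7 to the end of the string (last is omitted)
tail_part <- substring(word, 7)

# Replacement form: overwrite the first character of a copy
lowered <- word
substr(lowered, 1, 1) <- "m"

print(c(head_part, tail_part, lowered))
```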