R data analysis and visualization for beginners GenEpi-BioTrain – Virtual Training 14 10.02.2025
Outline
  #  Topic                        Time
  1  Introduction                 9:00 – 9:15
  2  R & RStudio overview         9:15 – 10:00
  3  Q&A                          10:00 – 10:05
  4  Basic Syntax and Operations  10:05 – 10:30
  5  Q&A                          10:30 – 10:40
  6  Break (20 min)               10:40 – 11:00
  7  Data Types & Structures      11:00 – 12:15
  8  Q&A and closing remarks      12:15 – 12:30
Active participation is appreciated!
  Ask questions
  Participate in Slido polls
  Share your knowledge and exchange experiences
  After the session, use the discussion forum on the Learning Portal to interact
  FIRST SLIDO POLL! https://app.sli.do/event/1ggsJNFuCDUVt7hhX8r5RT
S1 Icebreaker: Have you ever programmed? If yes, in which programming language?
S1Q1: Did you install R and RStudio?
Who are we?
  Germany
  Molecular and Experimental Mycobacteriology and National Reference Center for Mycobacteria
  WHO – Supranational Reference Laboratory of Tuberculosis
  Prof. Stefan Niemann (Head)
Who are we?
  Ivan Barilar: Master of Science [Mol. Bio.], Uni. Zagreb – Research Associate [Pop. Gen.], Uni. Hohenheim, Stuttgart 2012-2017 – Research Associate [Bioinfo.] @RCB since 2018. Bioinformatics team, WHO technical consultant
  Christian Utpatel: Diploma Biology [Microbiology] – Dr. rer. nat. [Microbiology], University of Hamburg. Research Associate @RCB since 2015. Co-Lead of the NGS unit, bioinformatics team, NGS implementation team, WHO technical consultant
  Viola Dreyer: Master in Mathematics in Medicine and Life Science; Dr. rer. nat. Bioinformatics at RCB/Uni Lübeck. Master Student/PhD Student/Research Associate 2013 @RCB. Bioinformatics team
S1Q2: How would you rate your experience level with R?
R & RStudio overview
Objectives
  This session consists of the following elements:
  What is R?
  RStudio Interface
What is R?
  Open source software
  Programming language and environment well suited for statistical analyses and visualisation
  Extensive (and expanding) set of tools and packages for various fields of study
What is RStudio?
  RStudio is an Integrated Development Environment (IDE) for R
  Makes coding in R easier with a user-friendly interface
  If R is an engine, RStudio is the dashboard
Open RStudio now
S1Q3: Did you manage to open RStudio?
S1Q4: Which of the following panes in RStudio is used to write and edit scripts?
S1Q5: What is the main difference between the Console and the Source (Script Editor) in RStudio?
S1Q6: In which tab can you view plots generated in RStudio?
Basic Syntax and Operations
Objectives: RStudio input and syntax
  Console (Command line)
  Source (Scripts in the editor)
RStudio input
  Console (Command line)
  Source (Scripts in the editor)
  Tab and the up arrow are your best console friends
  Ctrl + L – clear console
RStudio input
  Console (Command line)
  Source (Scripts in the editor)
  Tab again
  Ctrl + Enter – run line
R Syntax
  Variables: X <- 2
  Comments: #This is a comment
  Keywords: if, else, function, …
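The three syntax elements on this slide can be seen together in a short sketch. The helper name `double_it` and the values used are illustrative assumptions, not part of the original material.

```r
# Assignment: store the value 2 in a variable called X
X <- 2

# Lines starting with '#' are comments and are ignored by R

# 'function' is a keyword that defines a function;
# 'double_it' is a hypothetical example name
double_it <- function(value) {
  value * 2
}

# 'if' and 'else' are keywords that choose between two branches
if (X > 1) {
  result <- double_it(X)
} else {
  result <- X
}

print(result)
```

Because keywords are reserved, they cannot be used as variable names.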
R Syntax
  Errors: Error in…
  Warnings: Warning:
  Messages: any other text
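A small sketch of how the three kinds of console output differ in practice. The inputs are made up for illustration; `try()` is used here only so that the script keeps running after the error.

```r
# A message is informational text; the code keeps running
message("Reading data...")

# A warning flags a possible problem, but a result is still returned
number <- as.numeric("abc")   # warns that an NA was introduced
print(number)

# An error stops the computation; try() catches it so the script can continue
outcome <- try(log("abc"), silent = TRUE)
print(class(outcome))
```

A warning is worth reading even though the code ran, because the result (here `NA`) may not be what was intended.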
R Syntax, most common errors
  typoss Tipos TYPOS
  Be careful with capital letters, running just a part of the code, and not closing brackets
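The point about capital letters can be checked directly. This is a minimal sketch; the variable name `sample_count` is a hypothetical example.

```r
sample_count <- 10

# R is case sensitive: these are two different names
found_exact      <- exists("sample_count")
found_wrong_case <- exists("Sample_Count")

print(found_exact)        # TRUE
print(found_wrong_case)   # FALSE

# An unclosed bracket does not raise an error straight away in the console:
# R shows a '+' prompt and waits for the rest of the command.
```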
Working directory
  Set via code
  Set in RStudio interface
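Setting the working directory via code might look like the sketch below. `tempdir()` is used only because it exists on every system; replace it with your own project folder. In the RStudio interface the same thing is available under Session > Set Working Directory.

```r
# Remember the current working directory
old_dir <- getwd()
print(old_dir)

# Change the working directory (tempdir() is a stand-in for your own folder)
setwd(tempdir())
print(getwd())

# List the files in the new working directory
print(list.files())

# Go back to where we started
setwd(old_dir)
```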
S1Q7: What happens when you type a command directly into the RStudio Console and press Enter?
S1Q8: How do you run a selected line of code from the Source pane in RStudio?
Data Types & Structures
Objectives: Objects, Vector, Matrix
Objects, values and classes
  "To understand computations in R, two slogans are helpful: Everything that exists is an object. Everything that happens is a function call." John Chambers
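Both slogans can be tried out in the console. The following is a minimal sketch with made-up values.

```r
# Everything that exists is an object: numbers and functions both have a class
class(2)     # "numeric"
class(sum)   # "function"

# Everything that happens is a function call:
# the operator '+' is itself a function, so these two lines do the same thing
usual_form    <- 1 + 2
function_form <- `+`(1, 2)

print(usual_form)
print(function_form)
```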
Objects, values and classes
  Object      Class
  Vector      Numeric, integer, character, logical
  Factor      Numeric, integer, character
  Array       Numeric, integer, character, logical
  Matrix      Numeric, integer, character, logical
  Data frame  Numeric, integer, character, logical
  List        Numeric, character, logical, function…
Numeric objects
  X <- 2
  X: variable (object); <-: assignment operator; 2: value
Other object classes
  Character: word <- "hello", word2 <- "A?7Fd"
  Logical: Bool <- TRUE
  Integer: x <- 1L
  Function: getwd()
  Missing values: mv <- NA
Other object classes
  To display the class of an object, use class(); mode() reports the more general mode (for example "numeric" for an integer)
  All objects are temporarily saved in the workspace
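The example objects from the previous slide can be inspected as follows. This sketch uses `class()` next to `mode()` to show where the two differ; the name `num` is added here for illustration.

```r
word <- "hello"
Bool <- TRUE
x    <- 1L
num  <- 2
mv   <- NA

class(word)    # "character"
class(Bool)    # "logical"
class(x)       # "integer"
class(num)     # "numeric"
class(mv)      # "logical" (a plain NA is a logical missing value)
class(getwd)   # "function"

# mode() groups integer and double values together
mode(x)        # "numeric"

# ls() lists the objects currently saved in the workspace
print(ls())
```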
Naming objects
  Lowercase
  Underscore
  Avoid non-alphanumeric characters
  Do not use function names
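A few names that follow or break these recommendations, as a sketch. All variable names are hypothetical examples; the invalid ones are commented out so the block runs.

```r
# Recommended: lowercase words joined by underscores
sample_count  <- 12
mean_coverage <- 35.4

# Not valid (each line would raise an error if uncommented):
# 2nd_sample <- 5      # a name cannot start with a digit
# sample-count <- 5    # '-' is read as subtraction

# Allowed by R, but confusing: 'c' is already the concatenate function
# c <- 3

# make.names() shows how R would turn a string into a valid name
fixed_name <- make.names("2nd sample")
print(fixed_name)
```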
Hands On!
Exercises
  Do basic arithmetic calculations for any combination of x, y and z. Use one variable more than once. Save the result in a new variable.
  Sum the values of x, y and z into a variable space. Try doing it in more than one way; there is a hint in the previous sentence.
  Make a new variable n. Assign it a negative value and then add it to one of the existing variables. Now do the same with its absolute value.
S1Q9: Which of the following is a valid variable name in R?
S1Q10: What happens when you assign a new value to an existing variable in R?
Vectors
  A vector is an ordered set of entries
  It is usually defined by the concatenate function c()
  Vectors can be concatenated to each other
  v1 <- c(1, 2, 3, 4)
  v2 <- c("a", 1, 2, 3)
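The two example vectors behave differently because a vector holds entries of a single class. The sketch below shows this, along with concatenation; the name `v3` is added for illustration.

```r
v1 <- c(1, 2, 3, 4)
v2 <- c("a", 1, 2, 3)

class(v1)   # "numeric"
class(v2)   # "character": mixing types turns every entry into text

# Concatenating two vectors gives one longer vector
v3 <- c(v1, v2)
length(v3)  # 8
class(v3)   # "character"

print(v3)
```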
Vectors
  Vectors can be created by functions such as seq() and rep()
  Elements of a vector can be accessed by indexing: v1[1]
  v1 <- c(1, 2, 3, 4)
  v2 <- c("a", 1, 2, 3)
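A short sketch of `seq()`, `rep()` and indexing. The vector names and values are made up for illustration.

```r
# seq() builds a regular sequence
s1 <- seq(from = 10, to = 50, by = 10)   # 10 20 30 40 50

# rep() repeats a value or a whole vector
r1 <- rep(7, times = 3)                  # 7 7 7
r2 <- rep(c("a", "b"), times = 2)        # "a" "b" "a" "b"

# Indexing starts at 1 in R
s1[1]          # 10
s1[c(2, 4)]    # 20 40
s1[-1]         # everything except the first element: 20 30 40 50

# Indexing also works on the left-hand side to replace an element
s1[2] <- 0
print(s1)      # 10 0 30 40 50
```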
Vectors
  There are many useful functions that can be applied to vectors: length(), str(), mean(), sum(), min(), max()…
  Learning by doing
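Applied to the example vector `v1` from the earlier slides, the listed functions give the following results (a minimal sketch).

```r
v1 <- c(1, 2, 3, 4)

length(v1)   # 4   number of elements
mean(v1)     # 2.5
sum(v1)      # 10
min(v1)      # 1
max(v1)      # 4

# str() prints a compact description of the object's structure
str(v1)
```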
Hands On!
Exercises
  Make a vector of all numbers between 3 and 21.
  Multiply the second element of the vector by 3.
  Multiply the whole vector by 4.
  Multiply all but the last element of the vector by 5.
  Construct a vector of length 300 consisting of 100 copies of the numbers 1, 2, 3.
S1Q11: What kind of an object is v1<3?
Logical operators
  ==  equal
  !=  not equal
  >   greater
  >=  greater or equal
  <   smaller
  <=  smaller or equal
  &   and
  |   or
  !   not
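Applied to a vector, each operator is evaluated element by element and returns a logical vector. This sketch reuses `v1` from the earlier slides; the names `in_between` and `large_values` are added for illustration.

```r
v1 <- c(1, 2, 3, 4)

v1 == 2            # FALSE  TRUE FALSE FALSE
v1 != 2            #  TRUE FALSE  TRUE  TRUE
v1 >= 3            # FALSE FALSE  TRUE  TRUE

# Combine conditions with & (and), | (or), ! (not)
in_between <- v1 > 1 & v1 < 4     # FALSE  TRUE  TRUE FALSE
v1 < 2 | v1 > 3                   #  TRUE FALSE FALSE  TRUE
!(v1 > 2)                         #  TRUE  TRUE FALSE FALSE

# A logical vector can be used as an index to keep matching elements
large_values <- v1[v1 > 2]
print(large_values)               # 3 4
```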
Hands On!
Matrices
  A matrix is a rectangular scheme of n * m values, where n is the number of rows and m is the number of columns
  You can imagine it as a two-dimensional vector or a collection of vectors
  Elements of a matrix can be accessed by indexing: m1[1,2]
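A small sketch of creating and indexing a matrix. The contents of `m1` are made up for illustration; the slide does not define them.

```r
# matrix() fills the values column by column by default
m1 <- matrix(1:6, nrow = 2, ncol = 3)
print(m1)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

dim(m1)     # 2 3  (rows, columns)

# Indexing is [row, column]
m1[1, 2]    # 3
m1[2, ]     # whole second row:   2 4 6
m1[, 3]     # whole third column: 5 6
```

Leaving the row or column position empty selects everything along that dimension.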
Hands On!
Exercises
  Make a 3x3 matrix (3 rows, 3 columns), where the first column contains only the number 4.
  From the above matrix, extract the columns where the sum of the numbers in the column is greater than 10. Use the colSums() function.
Acknowledgements
  The creation of this training material was commissioned by ECDC to Research Center Borstel with the direct involvement of Ivan Barilar