Moving from Data Scientist To Data Analyst (1).pptx
Added: Sep 28, 2025
Slides: 12 pages
MOVING FROM DATA SCIENTIST TO DATA ANALYST
Presented by: Anchal Singh
Roll number: BCA2302419

Introduction to R for Data Science
R for Data Science
R is a powerful, open-source programming language widely used in data science for statistical computing, data manipulation, and visualization.
R provides tools for data extraction, cleaning, transformation, and analysis.
It has many specialized packages, such as dplyr for data manipulation and ggplot2 for creating insightful visualizations.
R supports statistical modeling, machine learning, and handling of structured and unstructured data.

USING THE EXERCISE FILES
Download and Save
Open in R or RStudio
Run Code Snippets Step-by-Step
Explore and Modify
Load Data into R
Practice Data Analysis and Visualization
Use Help and Documentation
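A minimal sketch of the first steps with the exercise files, assuming they were saved in a folder named exercise_files (a hypothetical name, not given in the slides). It checks the working directory, lists the files if the folder exists, and shows how to reach the built-in help.

```r
# Where is R currently looking for files?
current_dir <- getwd()
print(current_dir)

# Hypothetical folder name; change it to wherever the files were saved.
exercise_dir <- "exercise_files"
if (dir.exists(exercise_dir)) {
  print(list.files(exercise_dir))
}

# Show the arguments of a function without opening a help page.
print(args(read.csv))

# Help pages open in a viewer, so only request them in an interactive session.
if (interactive()) {
  help("read.csv")   # same as ?read.csv
  example("mean")    # runs the examples from the help page of mean()
}
```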

R in context
"R in context" means understanding how the R programming language fits within the broader scope of data science, analytics, and statistical computing.

A Note About AI in R
AI in R refers to the use of the R programming language for artificial intelligence (AI) and machine learning tasks. R offers a rich ecosystem of packages and tools that enable data scientists and analysts to build, train, and deploy AI models effectively.

Steps to Install R and RStudio
1. Visit the official Comprehensive R Archive Network (CRAN) website: https://cran.r-project.org/
2. Choose Your Operating System
3. Run the Installer
4. Verify Installation
5. Install RStudio (Optional but Recommended)
6. Launch RStudio or R Console
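One way to carry out the "Verify Installation" step is to ask R for its own version from the console. This is a small sketch; the RStudio check is guarded because RStudio.Version() is only available inside an RStudio session.

```r
# Print the installed R version as a readable string.
print(R.version.string)

# The same information as a comparable version object.
r_version <- getRversion()
print(r_version)

# Inside RStudio only: report the IDE version as well.
if (exists("RStudio.Version")) {
  print(RStudio.Version()$version)
}
```

If these lines print a version without errors, R is installed and running.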

ENVIRONMENTS FOR R
R Console (Base R GUI)
RStudio
Jupyter Notebooks
VS Code with R Extension
Statistical Software Integration
Cloud-Based Environments

NAVIGATING THE RSTUDIO ENVIRONMENT
Toolbar: Quick access to running code, creating new files, saving, and project management.
Projects: Organize workspaces and files into projects for better workflow management.
Console Options: Interrupt execution, clear console, and manage graphics.

ENTERING DATA
Manual Data Entry Using Vectors or Data Frames
Using read.csv() to Load Data from CSV Files
Using read.table() for General Text Files
Using Clipboard for Quick Copy-Paste
Interactive Data Entry with edit()
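The entry methods listed above can be sketched as follows. The student names, scores, and temporary files are made up for illustration; the files are written first so the example can be run without any external data.

```r
# Manual entry: a vector, then a data frame built from vectors.
scores <- c(82, 91, 77)
students <- data.frame(
  name  = c("Asha", "Ravi", "Meena"),  # illustrative values only
  score = scores
)

# read.csv(): write a small CSV to a temporary file, then load it back.
csv_path <- tempfile(fileext = ".csv")
write.csv(students, csv_path, row.names = FALSE)
from_csv <- read.csv(csv_path)

# read.table(): the general reader, here for a tab-separated text file.
txt_path <- tempfile(fileext = ".txt")
write.table(students, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)
from_txt <- read.table(txt_path, header = TRUE, sep = "\t")

# Clipboard (Windows only): copy cells from a spreadsheet, then run
#   pasted <- read.table("clipboard", header = TRUE)

# edit() opens a spreadsheet-like editor, so it needs an interactive session.
if (interactive()) {
  students <- edit(students)
}

print(from_csv)
print(from_txt)
```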

Data Types
Numeric: Represents real numbers (e.g., 2.5, -10)
Integer: Whole numbers, written with an L suffix (e.g., 1L, 100L)
Character: Text or string data (e.g., "Hello")
Logical: Boolean values TRUE or FALSE
Factor: Categorical data with fixed levels (used for categorical variables)
Complex: Complex numbers (e.g., 1+2i)

Data Structure
Vector: A sequence of elements of the same data type (e.g., numeric vector)
List: An ordered collection of elements which can be of different types
Matrix: Two-dimensional array of elements of the same type with rows and columns
Data Frame: Table-like structure where columns can be different types; most commonly used for datasets
Array: Multi-dimensional generalization of a matrix with same-type elements
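A short sketch that creates one value of each data type and one object of each data structure listed above. All variable names and values are illustrative.

```r
# Data types
x_num <- 2.5                                   # numeric
x_int <- 100L                                  # integer (note the L suffix)
x_chr <- "Hello"                               # character
x_lgl <- TRUE                                  # logical
x_fct <- factor(c("low", "high", "low"),
                levels = c("low", "high"))     # factor with fixed levels
x_cpx <- 1 + 2i                                # complex

print(class(x_num))
print(class(x_int))
print(levels(x_fct))

# Data structures
v  <- c(1.5, 2, 3.5)                                 # vector: one type only
l  <- list(name = "A", score = 90, passed = TRUE)    # list: mixed types
m  <- matrix(1:6, nrow = 2, ncol = 3)                # matrix: 2 rows, 3 columns
df <- data.frame(id = 1:3, label = c("a", "b", "c")) # data frame: typed columns
a  <- array(1:24, dim = c(2, 3, 4))                  # array: 3 dimensions

print(dim(m))
print(str(df))
print(dim(a))
```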

Comments And Headers
In R, comments are used to explain code, improve readability, and temporarily disable code execution. They begin with the # symbol, and everything after # on the same line is ignored by R during execution.
Headers are comment blocks placed at the beginning of a script or code section that describe its purpose, author, date, and a summary.
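As an illustration, a script with a header block, an inline comment, and a commented-out line might look like the sketch below. The file name and the header fields are placeholders, not a required layout.

```r
# ------------------------------------------------------------
# Script:  summary_stats.R        (hypothetical file name)
# Purpose: Compute the mean of a small set of values
# Author:  <your name>
# Date:    <date>
# ------------------------------------------------------------

values <- c(4, 8, 15)   # inline comment: everything after # is ignored

# The next line is disabled by commenting it out:
# center <- median(values)

center <- mean(values)
print(center)
```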

PACKAGES FOR R
Category              Popular R Packages              Purpose
Data Manipulation     dplyr, tidyr, data.table        Efficient data transformation and cleaning
Data Visualization    ggplot2, plotly, shiny          Static and interactive visualizations
Machine Learning      caret, randomForest, xgboost    Building classification and regression models
Statistical Analysis  stats (base), MASS              Statistical modeling and tests
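Packages such as those in the table are installed once with install.packages() and loaded in each session with library(). The sketch below only checks which of a few packages are already installed; the install line is left commented out because it needs an internet connection, and the choice of packages is just an example.

```r
# Packages to check (an arbitrary selection from the table above).
wanted_pkgs <- c("dplyr", "ggplot2", "MASS")

# requireNamespace() returns TRUE if a package is installed, without attaching it.
is_installed <- vapply(wanted_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- wanted_pkgs[!is_installed]
print(is_installed)

# Install whatever is missing (uncomment to run; downloads from CRAN).
# install.packages(missing_pkgs)

# Attach a package for use in the current session once it is installed.
if (is_installed[["MASS"]]) {
  library(MASS)
}
```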

DATA VISUALIZATION OPTIONS
Creating Bar Graphs
Creating Histogram
Creating Box Plots
Creating Scatterplots
Creating Line Charts
Creating Cluster Charts

DATA ANALYSIS
Computing Frequencies
Computing Descriptives
Computing Correlations
Computing a linear regression
Computing Contingency Tables
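The chart types and analyses listed above can be sketched with base R and the built-in mtcars and pressure datasets. The datasets and variable choices are assumptions made for illustration, and "cluster chart" is interpreted here as a dendrogram from hierarchical clustering, which may not be what the slides intended.

```r
# When run as a script, draw to a null device so no file is written.
if (!interactive()) pdf(NULL)

# --- Visualization ---
cyl_counts <- table(mtcars$cyl)
barplot(cyl_counts, main = "Cars by cylinder count",
        xlab = "Cylinders", ylab = "Count")                  # bar graph
hist(mtcars$mpg, main = "Distribution of mpg",
     xlab = "Miles per gallon")                              # histogram
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Cylinders", ylab = "Miles per gallon")       # box plots
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")  # scatterplot
plot(pressure$temperature, pressure$pressure, type = "l",
     xlab = "Temperature", ylab = "Pressure")                # line chart
plot(hclust(dist(scale(mtcars))),
     main = "Hierarchical clustering of cars")               # cluster dendrogram

if (!interactive()) invisible(dev.off())

# --- Analysis ---
print(cyl_counts)                          # frequencies
mpg_summary <- summary(mtcars$mpg)         # descriptives
print(mpg_summary)
mpg_wt_cor <- cor(mtcars$mpg, mtcars$wt)   # correlation
print(mpg_wt_cor)
fit <- lm(mpg ~ wt, data = mtcars)         # linear regression
print(coef(fit))
cyl_by_gear <- table(cylinders = mtcars$cyl,
                     gears = mtcars$gear)  # contingency table
print(cyl_by_gear)
```

summary(fit) gives standard errors and R-squared for the regression, and chisq.test(cyl_by_gear) is one possible follow-up for the contingency table.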