DATA SCIENCE PPT.pptx

vikashyadav23235277 39 views 19 slides Sep 23, 2024

About This Presentation

data science ppt


Slide Content

DATA SCIENCE USING PYTHON
Under the guidance of: R.J. SHUKLA (N.I.L.E.T)
Name of student: VIKASH YADAV, ECE 4th Year
Roll No.: 2105250310075
Department of Electronics & Communication Engineering
BUDDHA INSTITUTE OF TECHNOLOGY, GIDA, GORAKHPUR

CONTENTS
1. Introduction
2. What is Data Science?
3. Why is Data Science required?
4. Why Python for Data Science?
5. Advantages & Disadvantages of Python
6. Python Libraries for Data Science

What is Data Science?
Data Science = Computer Science + Mathematics/Statistics + Visualization

Why Data Science?
Data is generated from many different sources, such as:
- Financial logs
- Text files
- Multimedia forms
- Audio files
- Video files
- Sensors
- Instruments
[Figure: Total Data Stored]

Why Data Science?

Data Science process

Who is a Data Scientist?

Why Python for Data Science?
- Interpreted
- Intuitive and minimalistic code
- Expressive language
- Dynamically typed
- Automatic memory management
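As a small illustration of the "dynamically typed" and "expressive" points above, the sketch below rebinds one name to values of different types and builds a list in a single line. The variable names are made up for this example and do not come from the slides.

```python
# Dynamic typing: the same name can be rebound to values of different types,
# and the type is checked at run time rather than declared up front.
value = 42
type_before = type(value).__name__

value = "forty-two"
type_after = type(value).__name__

# Expressive, minimal code: a list comprehension replaces an explicit loop.
squares = [n * n for n in range(5)]

print(type_before, type_after, squares)
```

No memory is allocated or freed by hand here; the interpreter reclaims objects that are no longer referenced.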

WHY PYTHON FOR DATA SCIENCE?
Advantages
- Ease of programming
- Minimizes the time to develop and maintain code
- Modular and object-oriented
- Large community of users
- A large standard and user-contributed library
Disadvantages
- Interpreted and therefore slower than compiled languages
- Decentralized with packages

Code Performance vs Development Time

Python Libraries for Data Science
Some popular Python libraries are:
- NumPy
- SciPy
- Pandas
- scikit-learn
Visualization libraries:
- Matplotlib
- Seaborn
All these libraries are installed on the SCC.

Python Libraries for Data Science
NumPy:
- Introduces objects for multidimensional arrays and matrices, as well as functions that make it easy to perform advanced mathematical and statistical operations on those objects.
- Provides vectorization of mathematical operations on arrays and matrices, which significantly improves performance.
- Many other Python libraries are built on NumPy.
Link: http://www.numpy.org/
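A minimal sketch of what vectorization looks like in practice. The temperature readings are hypothetical sample data invented for this example.

```python
import numpy as np

# Hypothetical sample data: five temperature readings in Celsius.
celsius = np.array([12.5, 18.0, 21.3, 9.8, 15.4])

# Vectorized arithmetic: one expression applied to every element,
# with no explicit Python loop.
fahrenheit = celsius * 9 / 5 + 32

# Statistical operations on the whole array.
mean_c = celsius.mean()
std_c = celsius.std()

# A 2-D array (matrix) and a matrix product using the @ operator.
m = np.array([[1, 2], [3, 4]])
product = m @ m

print(fahrenheit)
print(mean_c, std_c)
print(product)
```

The loop over elements still happens, but inside NumPy's compiled code rather than in the Python interpreter, which is where the speed-up comes from.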

Python Libraries for Data Science
SciPy:
- Collection of algorithms for linear algebra, differential equations, numerical integration, optimization, statistics and more
- Part of the SciPy Stack
- Built on NumPy
Link: https://www.scipy.org/scipylib/
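Two of the areas listed above, numerical integration and optimization, can be sketched as follows. The functions being integrated and minimized are arbitrary choices for illustration.

```python
import numpy as np
from scipy import integrate, optimize

# Numerical integration: integrate sin(x) from 0 to pi.
# The exact answer is 2; quad also returns an error estimate.
area, abs_error = integrate.quad(np.sin, 0, np.pi)

# Optimization: minimize a simple quadratic whose minimum is at x = 3.
result = optimize.minimize_scalar(lambda x: (x - 3) ** 2)

print(area, abs_error)
print(result.x)
```

Both routines take ordinary Python callables, so a NumPy function such as `np.sin` can be passed in directly.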

PYTHON LIBRARIES FOR DATA SCIENCE
PANDAS (Panel Data System):
- Pandas is an open source, BSD-licensed library.
- Provides high-performance, easy-to-use data structures.
- Provides data analysis and data manipulation tools (reshaping, merging, sorting, slicing, aggregation, etc.).
- Allows handling of missing data.
Link: http://pandas.pydata.org/
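The sketch below touches three of the features named above: handling missing data, aggregation, and sorting. The city and sales figures are invented sample data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value (NaN).
df = pd.DataFrame({
    "city": ["Delhi", "Delhi", "Mumbai", "Mumbai"],
    "sales": [100.0, np.nan, 80.0, 150.0],
})

# Handling missing data: fill the gap with the column mean.
# mean() skips NaN by default, so this is the mean of 100, 80 and 150.
df["sales"] = df["sales"].fillna(df["sales"].mean())

# Aggregation: total sales per city.
totals = df.groupby("city")["sales"].sum()

# Sorting: highest total first.
ranked = totals.sort_values(ascending=False)

print(ranked)
```

Filling with the mean is only one possible strategy; dropping the row with `dropna()` is another, and which one is appropriate depends on the data.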

Python Libraries for Data Science
scikit-learn:
- Provides machine learning algorithms: classification, regression, clustering, model validation, etc.
- Built on NumPy, SciPy and matplotlib
Link: http://scikit-learn.org/
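A minimal classification sketch using the iris dataset bundled with scikit-learn. The choice of logistic regression, the 25% test split and the fixed random seed are assumptions made for this example, not something prescribed by the slides.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small built-in dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Model validation: hold out a quarter of the data for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Classification: fit on the training part only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Fraction of held-out samples classified correctly.
accuracy = model.score(X_test, y_test)
print(accuracy)
```

Other estimators in the library follow the same `fit` / `predict` / `score` pattern, so swapping in a different algorithm usually changes only the line that constructs the model.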

PYTHON LIBRARIES FOR DATA SCIENCE
MATPLOTLIB:
- Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats
- A set of functionalities similar to those of MATLAB
- Line plots, scatter plots, bar charts, histograms, pie charts, etc.
- Relatively low-level; some effort needed to create advanced visualizations
Link: https://matplotlib.org/
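A short line-plot sketch. The data points are made up, and the figure is written to an in-memory buffer only so that the example runs without a display or a file on disk; in ordinary use one would pass a filename such as "plot.png" to `savefig` or call `plt.show()`.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Hypothetical sample data.
x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

# Each element of the figure is set up explicitly, which is what
# "relatively low-level" means in practice.
fig, ax = plt.subplots()
ax.plot(x, y, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Line plot")
ax.legend()

# Render the figure as PNG into memory.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
plt.close(fig)
```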

PYTHON LIBRARIES FOR DATA SCIENCE
SEABORN:
- Based on matplotlib
- Provides a high-level interface for drawing attractive statistical graphics
- Similar (in style) to the popular ggplot2 library in R
Link: https://seaborn.pydata.org/
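To show what "high-level interface" means, the sketch below draws a grouped scatter plot from a DataFrame in a single call. The study-hours data is invented, and `sns.set_theme()` assumes seaborn 0.11 or newer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical sample data.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5, 6],
    "score": [52, 55, 61, 68, 74, 79],
    "group": ["A", "B", "A", "B", "A", "B"],
})

# Apply seaborn's default styling (assumes seaborn >= 0.11).
sns.set_theme()

# One call maps DataFrame columns to the axes and to colour;
# axis labels and the legend are filled in automatically.
ax = sns.scatterplot(data=df, x="hours", y="score", hue="group")

x_label = ax.get_xlabel()
y_label = ax.get_ylabel()
plt.close(ax.get_figure())
```

The returned object is an ordinary matplotlib `Axes`, so any matplotlib call can still be used to fine-tune the result.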

Loading Python Libraries

    # Import Python libraries
    import numpy as np
    import scipy as sp
    import pandas as pd
    import matplotlib as mpl
    import seaborn as sns

  CONCLUSION