Python is something that everyone has to study
Added: Aug 08, 2024
Slides: 92 pages
Slide Content
Lecture 1: Python – Fundamentals Dr. A. Ramesh DEPARTMENT OF MANAGEMENT IIT ROORKEE
Learning objectives: Installing Python; Fundamentals of Python; Data Visualisation
Python Installation Process
Step 1: Type https://www.anaconda.com in the address bar of a web browser.
Step 2: Click the download button.
Step 3: Download the Python 3.8 version for Windows.
Step 4: Double-click the downloaded file to run the installer.
Step 5: Follow the instructions until the installation is complete.
Why Jupyter Notebook? Edit code in a web browser; easy documentation; easy demonstration; user-friendly interface
Python and Jupyter: Python is a programming language; Jupyter is an application. The installed software package contains both Python and the Jupyter application.
About Jupyter Notebook: Cell -> press the Enter key to access (edit) the selected cell
About Jupyter Notebook: Input field -> a green border indicates edit mode; a blue border indicates command mode
About Jupyter Notebook: Markdown cell -> contains documentation; its text is not executed as code
About Jupyter Notebook: Command mode allows editing the notebook as a whole. To leave edit mode, press the Escape key. A comment line begins with the # symbol. Execution (three ways): Ctrl+Enter (runs the cell and keeps it selected); Shift+Enter (runs the cell and moves to the next cell); the Run button on the Jupyter interface
About Jupyter Notebook: Important shortcut keys (in command mode): A -> insert a cell above; B -> insert a cell below; D, D (press D twice) -> delete the cell; M -> change to a Markdown cell; Y -> change to a code cell
Fundamentals of Python: Loading a simple delimited data file; counting how many rows and columns were loaded; determining which type of data was loaded; looking at different parts of the data by subsetting rows and columns
Importing Different Files in Jupyter Notebook: Importing a text file
Importing Different Files in Jupyter Notebook: Importing a tabular file
Importing Different Files in Jupyter Notebook: Importing an Excel file
Importing Different Files in Jupyter Notebook: Importing a Zip file
Importing Different Files in Jupyter Notebook: Importing a PDF file
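The code on these slides was not captured in the text, so the following is only a minimal sketch of how text, tabular, and Zip files might be read in a notebook. The file names and contents are hypothetical and are created in a temporary folder. Excel files are typically read with pandas' read_excel (which needs an Excel engine such as openpyxl installed), and PDF files need a third-party library; neither is shown here.

```python
import tempfile
import zipfile
from pathlib import Path

import pandas as pd

# Work in a temporary folder so the sketch does not touch real files.
workdir = Path(tempfile.mkdtemp())

# Text file: write a sample, then read it back as one string.
text_path = workdir / "notes.txt"      # hypothetical file name
text_path.write_text("first line\nsecond line\n", encoding="utf-8")
text = text_path.read_text(encoding="utf-8")

# Tabular (delimited) file: pandas parses it into a DataFrame.
table_path = workdir / "table.tsv"     # hypothetical file name
table_path.write_text("name\tscore\nA\t1\nB\t2\n", encoding="utf-8")
table = pd.read_csv(table_path, sep="\t")

# Zip file: list the members and read one without extracting it to disk.
zip_path = workdir / "bundle.zip"      # hypothetical file name
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.write(table_path, arcname="table.tsv")
with zipfile.ZipFile(zip_path) as zf:
    members = zf.namelist()
    with zf.open("table.tsv") as fh:
        zipped_table = pd.read_csv(fh, sep="\t")

print(text)
print(table)
print(members)
```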
Loading a simple delimited data file
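As a rough illustration of loading a delimited file, the sketch below reads tab-separated text with pandas. The data is a made-up in-memory sample (the lecture's actual file is not named in the slides), and the name `df` is a convention assumed here. With a real file, the first argument to read_csv would be the file path.

```python
import io

import pandas as pd

# Made-up tab-separated sample standing in for the lecture's data file.
raw = (
    "country\tcontinent\tyear\tlifeExp\n"
    "CountryA\tAsia\t1952\t40.0\n"
    "CountryA\tAsia\t1957\t41.5\n"
    "CountryB\tEurope\t1952\t60.0\n"
    "CountryB\tEurope\t1957\t61.5\n"
)

# sep="\t" tells pandas that the columns are separated by tabs.
df = pd.read_csv(io.StringIO(raw), sep="\t")
print(df)
```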
The head method shows only the first 5 rows
Get the number of rows and columns
Get the column names
Get the dtype of each column
Pandas Types Versus Python Types
Get more information about the data
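The inspection steps above might look like the following sketch. The DataFrame is a small made-up sample, not the lecture's dataset. On the pandas-versus-Python types point: integer columns are stored as int64 and floating-point columns as float64, while text columns appear as object (or as a dedicated string dtype in newer pandas versions).

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "country": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "continent": ["Asia", "Asia", "Europe", "Europe",
                  "Africa", "Africa", "Americas", "Americas"],
    "year": [1952, 1957] * 4,
    "lifeExp": [40.0, 41.5, 60.0, 61.5, 35.0, 36.5, 55.0, 56.5],
})

print(df.head())     # first 5 rows by default
print(df.shape)      # (rows, columns); shape is an attribute, not a method
print(df.columns)    # column names
print(df.dtypes)     # dtype of each column
df.info()            # summary: non-null counts, dtypes, memory usage
```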
Looking at Columns, Rows, and Cells # get the country column and save it to its own variable
# show the first 5 observations
# show the last 5 observations
# look at country, continent, and year
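A minimal sketch of these column selections, assuming a DataFrame named `df` with the columns mentioned on the slides. The data values and the variable names are made up for illustration.

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "country": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "continent": ["Asia", "Asia", "Europe", "Europe",
                  "Africa", "Africa", "Americas", "Americas"],
    "year": [1952, 1957] * 4,
    "lifeExp": [40.0, 41.5, 60.0, 61.5, 35.0, 36.5, 55.0, 56.5],
})

# A single column name in square brackets returns a Series.
country_df = df["country"]
first_five = country_df.head()   # first 5 observations
last_five = country_df.tail()    # last 5 observations

# A list of column names returns a DataFrame with just those columns.
subset = df[["country", "continent", "year"]]

print(first_five)
print(last_five)
print(subset.head())
```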
Looking at Columns, Rows, and Cells: Subset Rows by Index Label: loc
# get the first row # Python counts from 0
# get the 100th row # Python counts from 0
# get the last row
Subsetting Multiple Rows # select the first, 100th, and 1000th rows
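The loc selections above could be sketched as follows. The DataFrame is synthetic (1,200 generated rows, so that a 1000th row exists), and the `pop` and `gdpPercap` columns are assumptions made for this sketch. Because loc matches index labels, the last row is reached through its actual label rather than -1.

```python
import pandas as pd

# Synthetic stand-in data; values are generated, not real observations.
n = 1200
df = pd.DataFrame({
    "country": [f"Country{i // 12}" for i in range(n)],
    "year": [1952 + 5 * (i % 12) for i in range(n)],
    "lifeExp": [40.0 + (i % 12) for i in range(n)],
    "pop": [1_000_000 + i for i in range(n)],
})

first_row = df.loc[0]            # label 0 is the first row
row_100 = df.loc[99]             # label 99 is the 100th row

# loc needs the real label of the last row; -1 is not a label here.
last_label = df.shape[0] - 1
last_row = df.loc[last_label]

# A list of labels returns several rows as a DataFrame.
several = df.loc[[0, 99, 999]]

print(first_row)
print(several)
```

With the default integer index, `df.loc[-1]` raises a KeyError because no row carries the label -1.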
Subset Rows by Row Number: iloc # get the 2nd row
# get the 100th row
# using -1 to get the last row
With iloc, we can pass in -1 to get the last row, something we couldn't do with loc.
# get the first, 100th, and 1000th rows
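A comparable sketch for iloc, which selects by integer position rather than by label. The data is again synthetic and the variable names are assumptions.

```python
import pandas as pd

# Synthetic stand-in data; values are generated, not real observations.
n = 1200
df = pd.DataFrame({
    "country": [f"Country{i // 12}" for i in range(n)],
    "year": [1952 + 5 * (i % 12) for i in range(n)],
    "gdpPercap": [500.0 + i for i in range(n)],
})

second_row = df.iloc[1]          # position 1 is the 2nd row
row_100 = df.iloc[99]            # position 99 is the 100th row
last_row = df.iloc[-1]           # negative positions count from the end

# A list of positions returns several rows as a DataFrame.
several = df.iloc[[0, 99, 999]]

print(second_row)
print(last_row)
print(several)
```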
Subsetting Columns: The Python slicing syntax uses a colon (:). A colon on its own selects everything along that axis. So, if we just want to get the first column using the loc or iloc syntax, we can write something like df.loc[:, [columns]] to subset the column(s).
# subset columns with loc # note the position of the colon # it is used to select all rows
# subset columns with iloc # iloc will allow us to use integers # -1 will select the last column
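To make the colon syntax concrete, here is a small sketch that selects all rows and a few columns, first by name with loc and then by position with iloc. The data and the `pop` and `gdpPercap` column names are assumptions for illustration.

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "country": ["A", "B", "C"],
    "continent": ["Asia", "Europe", "Africa"],
    "year": [1952, 1952, 1952],
    "lifeExp": [40.0, 60.0, 35.0],
    "pop": [1_000_000, 2_000_000, 3_000_000],
    "gdpPercap": [500.0, 1500.0, 300.0],
})

# loc: the colon before the comma selects all rows; columns are given by name.
subset_loc = df.loc[:, ["year", "pop"]]

# iloc: columns are given by integer position; -1 is the last column.
subset_iloc = df.iloc[:, [2, 4, -1]]

print(subset_loc)
print(subset_iloc)
```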
Subsetting Columns by Range # create a range of integers from 0 to 4 inclusive
# subset the dataframe with the range
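A brief sketch of the range-based selection, under the assumption that the DataFrame has at least five columns. The data and the name `small_range` are made up for illustration.

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "country": ["A", "B", "C"],
    "continent": ["Asia", "Europe", "Africa"],
    "year": [1952, 1952, 1952],
    "lifeExp": [40.0, 60.0, 35.0],
    "pop": [1_000_000, 2_000_000, 3_000_000],
    "gdpPercap": [500.0, 1500.0, 300.0],
})

# range(5) yields 0, 1, 2, 3, 4: the end value 5 is excluded.
small_range = list(range(5))

# Use the integers as column positions: this keeps the first five columns.
subset = df.iloc[:, small_range]
print(subset)
```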
Subsetting Rows and Columns # using loc
# using iloc
Subsetting Multiple Rows and Columns # get the 1st, 100th, and 1000th rows # from the 1st, 4th, and 6th columns
# if we use the column names directly, # it makes the code a bit easier to read # note now we have to use loc, instead of iloc
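The combined row-and-column selections might be sketched like this. The data is synthetic, the row number 42 is an arbitrary choice, and the `pop` and `gdpPercap` columns are assumptions; the column order is chosen so that the 1st, 4th, and 6th columns are country, lifeExp, and gdpPercap.

```python
import pandas as pd

# Synthetic stand-in data; values are generated, not real observations.
n = 1200
df = pd.DataFrame({
    "country": [f"Country{i // 12}" for i in range(n)],
    "continent": [["Africa", "Americas", "Asia", "Europe"][(i // 12) % 4]
                  for i in range(n)],
    "year": [1952 + 5 * (i % 12) for i in range(n)],
    "lifeExp": [40.0 + (i % 12) for i in range(n)],
    "pop": [1_000_000 + i for i in range(n)],
    "gdpPercap": [500.0 + i for i in range(n)],
})

# One cell: row first, then column.
cell_loc = df.loc[42, "country"]    # by row label and column name
cell_iloc = df.iloc[42, 0]          # by row position and column position

# 1st, 100th, and 1000th rows from the 1st, 4th, and 6th columns.
by_position = df.iloc[[0, 99, 999], [0, 3, 5]]

# The same selection with column names, which requires loc.
by_name = df.loc[[0, 99, 999], ["country", "lifeExp", "gdpPercap"]]

print(by_position)
print(by_name)
```

Both forms return the same rows and columns here; the name-based version tends to be easier to read and does not break if the column order changes.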
Grouped Means # For each year in our data, what was the average life expectancy? # To answer this question, # we need to split our data into parts by year; # then we get the 'lifeExp' column and calculate the mean
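The split-then-average steps described above can be sketched with groupby. The four rows are made-up values chosen so the means are easy to check by hand.

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "country": ["A", "B", "A", "B"],
    "year": [1952, 1952, 1957, 1957],
    "lifeExp": [40.0, 50.0, 45.0, 55.0],
})

# Split by year, take the 'lifeExp' column, and compute the mean per group.
mean_by_year = df.groupby("year")["lifeExp"].mean()
print(mean_by_year)
# 1952 -> (40.0 + 50.0) / 2 = 45.0
# 1957 -> (45.0 + 55.0) / 2 = 50.0
```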
If you need to “flatten” the dataframe, you can use the reset_index method.
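As an illustration of flattening, the sketch below groups by two columns, which produces a two-level row index, and then calls reset_index to turn those levels back into ordinary columns. The grouping columns, the `gdpPercap` column, and all values are assumptions made for this example.

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "year": [1952, 1952, 1952, 1957, 1957],
    "continent": ["Asia", "Asia", "Europe", "Asia", "Europe"],
    "lifeExp": [40.0, 50.0, 60.0, 45.0, 65.0],
    "gdpPercap": [500.0, 700.0, 1500.0, 600.0, 1700.0],
})

# Grouping by two columns gives a result indexed by (year, continent).
grouped = df.groupby(["year", "continent"])[["lifeExp", "gdpPercap"]].mean()

# reset_index moves the index levels back into regular columns.
flat = grouped.reset_index()

print(grouped)
print(flat)
```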
Grouped Frequency Counts: Use the nunique method to get counts of unique values on a pandas Series.
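A short sketch of a grouped frequency count: the number of distinct countries in each continent. The grouping choice and the data are assumptions for illustration.

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "continent": ["Asia", "Asia", "Asia", "Europe"],
    "country": ["A", "A", "B", "C"],
})

# nunique counts distinct values, so the repeated "A" is counted once.
countries_per_continent = df.groupby("continent")["country"].nunique()
print(countries_per_continent)
```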
Basic Plot
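The plot on this slide was not captured in the text. One plausible basic plot, sketched here under that assumption, is a line chart of mean life expectancy per year drawn with the pandas plot method (which uses matplotlib). The data is made up, and the Agg backend line is only needed when running outside a notebook without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "year": [1952, 1952, 1957, 1957],
    "lifeExp": [40.0, 50.0, 45.0, 55.0],
})

mean_by_year = df.groupby("year")["lifeExp"].mean()

# Series.plot draws a line chart by default and returns the matplotlib Axes.
ax = mean_by_year.plot()
ax.set_xlabel("Year")
ax.set_ylabel("Mean life expectancy")
ax.set_title("Mean life expectancy by year")

# Render to an in-memory PNG; in a notebook the figure is displayed inline.
buffer = io.BytesIO()
ax.figure.savefig(buffer, format="png")
```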
Visual Representation of the Data
Histogram: vertical bar chart of frequencies
Frequency Polygon: line graph of frequencies
Ogive: line graph of cumulative frequencies
Pie Chart: proportional representation for categories of a whole
Stem and Leaf Plot
Pareto Chart
Scatter Plot
Methods of visual presentation of data: Table
Methods of visual presentation of data: Graphs
Methods of visual presentation of data: Pie chart
Methods of visual presentation of data: Multiple bar chart
Methods of visual presentation of data: Simple pictogram
Principles of Excellent Graphs
The graph should not distort the data.
The graph should not contain unnecessary adornments (sometimes referred to as chart junk).
The scale on the vertical axis should begin at zero.
All axes should be properly labeled.
The graph should contain a title.
The simplest possible graph should be used for a given set of data.
Graphical Errors: Compressing the Vertical Axis [Figure: two "Quarterly Sales" charts ($, Q1 to Q4), one labelled Good Presentation and one labelled Bad Presentation; vertical-axis ticks 25, 50 on one chart and 100, 200 on the other]
Graphical Errors: No Zero Point on the Vertical Axis [Figure: "Monthly Sales" charts ($, months J F M A M J) graphing the first six months of sales; vertical-axis ticks 36, 39, 42, 45; labelled Good Presentations and Bad Presentation]