business analytic meeting 1 tunghai university.pdf
AnggiAndriyadi
62 slides
Aug 21, 2024
About This Presentation
Business analytics
Size: 2.58 MB
Language: en
Slide Content
Business Analysis
and AI Applications
What is Data Analysis?
A process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
Definition from Wikipedia.
Data Analysis Tools
Auto-managed closed tools
●Closed source
●Expensive
●Limited
●Easy to learn
Programming languages
●Open source
●Free (or very cheap)
●Extremely powerful
●Steep learning curve
Why Python for Data Analysis?
Why would we choose Python over R or Julia?
●Very simple and intuitive to learn
●“Correct” language
●Powerful libraries (not just for Data Analysis)
●Free and open source
●Amazing community, docs and conferences
When to choose R?
Python, sadly, is not always the answer:
●When RStudio is needed
●When dealing with advanced statistical methods
●When extreme performance is needed
The Data Analysis Process
Data Extraction
●SQL
●Scraping
●File Formats
○CSV
○JSON
○XML
●Consulting APIs
●Buying Data
●Distributed Databases
Data Cleaning
●Missing values and empty data
●Data imputation
●Incorrect types
●Incorrect or invalid values
●Outliers and non-relevant data
●Statistical sanitization
Data Wrangling
●Hierarchical Data
●Handling categorical data
●Reshaping and transforming structures
●Indexing data for quick access
●Merging, combining and joining data
Analysis
●Exploration
●Building statistical models
●Visualization and representations
●Correlation vs Causation analysis
●Hypothesis testing
●Statistical analysis
●Reporting
Action
●Building Machine Learning Models
●Feature Engineering
●Moving ML into production
●Building ETL pipelines
●Live dashboard and reporting
●Decision making and real-life tests
Python & PyData Ecosystem
PYTHON ECOSYSTEM: The libraries we use...
●pandas: The cornerstone of our Data Analysis job with Python.
●matplotlib: The foundational library for visualizations. Other libraries we’ll use are built on top of matplotlib.
●numpy: The numeric library that serves as the foundation for most numerical computation in Python.
●seaborn: A statistical visualization tool built on top of matplotlib.
●statsmodels: A library with many advanced statistical functions.
●scipy: Advanced scientific computing, including functions for optimization, linear algebra, image processing and much more.
●scikit-learn: The most popular machine learning library for Python (not deep learning).
And finally,
why Python?
AI & Machine Learning for Business
Part 1: What is AI?
✦Artificial Intelligence
✦Models
✦Machine Learning
Part 2: How do we use it?
✦AI in Practice
What is AI?
Artificial: Made by humans
Intelligence: The ability to solve problems and make decisions
Intelligence in Action
Intelligence requires knowing
how the world works
We understand things through Models
Models allow you to make predictions
Information → Model → Predictions
2 Types of Models
Principle-driven (standard programming techniques): If we see dark clouds in the sky, then it's probably going to rain later.
Data-driven (Machine Learning): The sky is similar to other times when it rained.
Machine Learning ( ML )
Machine learning is just a computer's ability to learn by example
①Training data
•Targets are the things that we're trying to predict
•Predictors are all the information that we're going to use in
order to estimate the target
★Machine learning allows the computer to figure out the relationship between predictors and targets, simply
by seeing many different examples.
Machine Learning (ML)
① Training Data → ② ML Algorithm → ③ ML Model
① New Data → ② ML Algorithm → ③ Predictions
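The idea of learning from examples can be sketched in a few lines of plain Python. This is only an illustration, not the method used in the course: the training data is invented, the predictor names (cloud cover and humidity) are assumptions, and the "algorithm" is a simple nearest-neighbour lookup that copies the target of the most similar past example.

```python
# Hypothetical training data: each example pairs predictors with a target.
# Predictors: (cloud cover %, humidity %). Target: did it rain later?
training_data = [
    ((90, 85), True),
    ((80, 90), True),
    ((75, 70), True),
    ((20, 30), False),
    ((10, 45), False),
    ((35, 40), False),
]


def predict_rain(cloud_cover, humidity):
    """Predict the target by copying the most similar training example."""

    def squared_distance(example):
        (past_cloud, past_humidity), _target = example
        return (past_cloud - cloud_cover) ** 2 + (past_humidity - humidity) ** 2

    # The closest past example decides the prediction (1-nearest neighbour).
    _predictors, target = min(training_data, key=squared_distance)
    return target


# New data the program has never seen before.
print(predict_rain(85, 80))
print(predict_rain(15, 35))
```

No rule about clouds is written into the program; the prediction comes entirely from how closely the new data resembles earlier examples.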
Built-in functions
A function is a reusable piece of code that carries out a task. Python's built-in functions include:
print()
input()
int()
…
What is a Variable?
Variables are a key element of programming.
They are used for calculations, for storing values for later use, in decisions and in iteration.
(Figure: a variable shown as a box.)
Assigning Value to Variables
int: numberA
5
Put 5 into the variable called numberA:
numberA ← 5
Programming language:
numberA = 5
When you assign a value to a variable, you use the "=" symbol.
The name of the variable goes on the left, and the value you want to store in the variable goes on the right.
Python interpreter
numberA = 1000
print(numberA)
Memory
•Storage of Data:
When you declare a variable in a program, the computer allocates a specific portion of memory to store the data associated
with that variable.
•Data Retrieval:
To use the stored data, you refer to the variable by its name. For example, if you want to use the value stored in x, you simply
use x in your code, and the program retrieves the data stored in the memory location associated with x.
All variables are made up of three parts:
•Name: Age
•Value: 26
•Data type: integer
Datatypes
A data type in Python is a classification that specifies what kind of value a variable holds.
1. int (integer): Represents whole numbers, e.g., 5, -10, 0.
2. float (floating-point number): Represents decimal numbers, e.g., 3.14, -0.5, 2.0.
3. str (string): Represents text, enclosed in single (' ') or double (" ") quotes, e.g., "Hello, Python!"
4. bool (boolean): Represents either True or False, used for logical operations and comparisons.
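To see the four data types in action, the built-in type() function reports the type of any value. The sketch below uses illustrative variable names that do not come from the slides.

```python
# One example value for each of the four data types listed above.
age = 26                    # int
price = 3.14                # float
greeting = "Hello, Python!" # str
is_student = True           # bool

values = [age, price, greeting, is_student]

# type(value).__name__ gives the short name of the data type.
type_names = [type(value).__name__ for value in values]

for value, name in zip(values, type_names):
    print(value, "is of type", name)
```

The same name can later be assigned a value of a different type; the type belongs to the value currently stored.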
What is the augmented assignment operator?
x = 10
x = x + 5  # long form
x += 5     # augmented assignment: shorthand for x = x + 5
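The same shorthand exists for the other arithmetic operators. A small sketch, with the variable name `total` chosen only for illustration:

```python
total = 10
total += 5   # same as total = total + 5, total is now 15
total -= 3   # same as total = total - 3, total is now 12
total *= 2   # same as total = total * 2, total is now 24
total /= 4   # same as total = total / 4, total is now 6.0 (division gives a float)

print(total)
```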
Output variables
Declare the variable "engGrade" with an initial value of 95.
engGrade = 95
print("English score:", engGrade)
In Python, you can print a variable together with a string by separating them with a comma inside print().
print() converts each argument to text and joins them with a single space; the comma separates arguments and is not a concatenation operator.
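A short sketch contrasting three ways of showing a variable next to some text: passing several arguments to print(), joining strings with +, and using an f-string. The names `message` and `formatted` are illustrative only.

```python
engGrade = 95

# 1) Several arguments: print() converts each one to text and puts a space between them.
print("English score:", engGrade)

# 2) String concatenation with +: both sides must be strings, so the number
#    has to be converted with str() first.
message = "English score: " + str(engGrade)
print(message)

# 3) An f-string places the value directly inside the text.
formatted = f"English score: {engGrade}"
print(formatted)
```

All three lines print the same text; only the second is concatenation in the strict sense.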
Input
grade = 0
grade = int(input())
print("score:", grade)
•input() reads what the user types and returns it as a string.
•int(input()): the int() function converts that string into an integer.
Type Conversion
x = input("x:")
y = x + 1  # fails: if the user types 1, this is y = "1" + 1, which raises a TypeError
Conversion functions:
int(x)
float(x)
bool(x)
str(x)
print(type(x))
x = input("x:")
y = int(x) + 1
print(f"x: {x}, y: {y}")
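The conversion functions can be tried without typing anything by using a fixed string in place of input(). This is a minimal sketch; the value "7" and the variable names are assumptions made for illustration.

```python
x = "7"  # stand-in for input("x:"), which always returns a string

y = int(x) + 1        # "7" -> 7, then 7 + 1 = 8
as_float = float(x)   # "7" -> 7.0
as_text = str(y)      # 8 -> "8"

# bool() on a string checks whether it is empty, not what it says.
non_empty = bool(x)   # True, because "7" is not empty
zero_text = bool("0") # also True: "0" is a non-empty string
empty = bool("")      # False: the empty string

print(type(x), type(y), type(as_float), type(as_text))
print(f"x: {x}, y: {y}")
```

Note that bool("0") is True, so bool() is not a way to read yes/no answers from a user.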
Example: Add Two Numbers With User Input
# Store input numbers
num1 = input('Enter first number: ')
num2 = input('Enter second number: ')
# Add two numbers
total = float(num1) + float(num2)
# Display the sum
print(f"The sum of {num1} and {num2} is {total}")
Output: ?
Since input() returns a string, we need to convert each string into a number by using the float() function.
Exercise 1: Kilometers to Miles
The user is asked to enter kilometers.
Formula: miles = kilometers * 0.621371192
Output: ?
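One possible way to approach the exercise is sketched below. It is not the official solution: the function name `kilometers_to_miles` and the constant name are made up here, and a fixed value stands in for the user's input so the example runs on its own.

```python
KM_TO_MILES = 0.621371192  # conversion factor from the formula above


def kilometers_to_miles(kilometers):
    """Apply the formula: miles = kilometers * 0.621371192."""
    return kilometers * KM_TO_MILES


# Stand-in for: kilometers = float(input("Enter kilometers: "))
kilometers = 5.0
miles = kilometers_to_miles(kilometers)

# :.2f rounds the displayed value to two decimal places.
print(f"{kilometers} kilometers is equal to {miles:.2f} miles")
```

In the real exercise, replace the fixed value with float(input(...)) so the user supplies the distance.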
Conditional Statement
if Statement:
if condition:
    # Code to execute if the condition is True
•The if statement is used to execute a block of code if a specified condition evaluates to True.
•If the condition is False, the block is skipped.
temperature = 35
if temperature > 30:
    print("it's warm")
The Python interpreter uses this indentation to know which statements to execute when the condition is true.
temperature = 35
if temperature > 30:
    print("it's warm")
    print("drink water")
print("done")
The last statement, print("done"), is always executed, whether the condition is true or not.
temperature = 15
if temperature > 30:
    print("it's warm")
    print("drink water")
print("done")
Since the "done" message is not in the if block, it is always printed, whether the condition is true or not.
if-else
if condition:
    Execution area A
else:
    Execution area B
aqi = 200
if aqi < 150:
    print('Good Air Quality!')
else:
    print('Poor Air Quality!')
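The same if-else decision can be wrapped in a function so that several values can be tried, including the boundary value. This is a sketch only; the function name `describe_air_quality` is made up for illustration, and the threshold of 150 is taken from the example above.

```python
def describe_air_quality(aqi):
    """Return one of two messages, exactly one branch runs."""
    if aqi < 150:
        return "Good Air Quality!"
    else:
        return "Poor Air Quality!"


for reading in (80, 150, 200):
    print(reading, "->", describe_air_quality(reading))
```

Because the condition is aqi < 150, a reading of exactly 150 falls into the else branch.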
What if we want multiple conditions?
elif (else if )
if-elif-else
if condition 1:
    Execution area 1
elif condition 2:
    Execution area 2
elif condition 3:
    Execution area 3
else:
    Execution area X
x = 10
if x > 20:
    print("x is greater than 20")
elif x > 10:
    print("x is greater than 10 but not greater than 20")
else:
    print("x is 10 or less")
"elif" statement
temperature = 15
if temperature > 30:
    print("it's warm")
    print("drink water")
elif temperature > 20:
    print("it's nice")
else:
    print("it's cold")
print("done")
If none of the previous conditions is true, the code in the else block is executed.
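Conditions in an if-elif-else chain are checked from top to bottom, and only the first one that is true runs. The sketch below puts the temperature example into a function (the name `describe_temperature` is illustrative) so the boundary values can be checked.

```python
def describe_temperature(temperature):
    """Check conditions in order; the first true one decides the result."""
    if temperature > 30:
        return "it's warm"
    elif temperature > 20:
        return "it's nice"
    else:
        # Reached only when none of the conditions above is true.
        return "it's cold"


for value in (35, 30, 25, 20, 15):
    print(value, "->", describe_temperature(value))
```

A temperature of exactly 30 is not greater than 30, so it is handled by the elif branch; exactly 20 falls through to else.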
In-class assignment 2
Write a program that represents the mapping between semester scores and grade ranges.
The program should categorize grades as follows:
➡ If the score is greater than or equal to 90, it should be classified as 'A.'
➡ If the score is between 80 (inclusive) and 90 (exclusive), it should be classified as 'B.'
➡ If the score is between 70 (inclusive) and 80 (exclusive), it should be classified as 'C.'
➡ If the score is between 60 (inclusive) and 70 (exclusive), it should be classified as 'D.'
➡ If the score is less than 60, it should be classified as 'F.'
For example, when a student's score is 80, the program should display the corresponding grade: 'B.'
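One way to approach the assignment is an if-elif-else chain that checks the highest range first. The sketch below is a possible solution, not the official one; the function name `grade_for` is an assumption, and a fixed score stands in for user input.

```python
def grade_for(score):
    """Map a semester score to a letter grade.

    Checking from the top down means each elif only needs the lower bound:
    reaching 'score >= 80' already implies the score is below 90.
    """
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    elif score >= 60:
        return "D"
    else:
        return "F"


# Stand-in for: score = int(input("Enter score: "))
score = 80
print("Grade:", grade_for(score))
```

With a score of 80 the first condition is false and the second is true, so the program displays B.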