The Art of the Possible In Action Turning Data into Decision.pdf
TechSoupGlobal
About This Presentation
Applications Development Manager, Tech Impact, and Dharneesh Jayaprakash, Data Engineer, Tech Impact share how to create a unified data ecosystem using Microsoft Fabric and Synapse Analytics, empowering your organization to make informed decisions. We’ll also cover essential practices in data protection and governance to ensure that your data is accessible, understandable, and responsibly managed.
This event series is designed to help nonprofits gain data and AI skills, made possible through generous support by Microsoft
Size: 1.01 MB
Language: en
Added: Aug 29, 2025
Slides: 78 pages
Slide Content
The Art of the Possible
In Action:
Turning Data into Decisions
Ryan Harrington
Dharneesh Jayaprakash
Mat St. Cyr
August 27, 2025
OUR MISSION IS TO LEVERAGE TECHNOLOGY TO ADVANCE SOCIAL IMPACT
Full-Spectrum Nonprofit Tech Services
Nonprofit Educational Resources and Webinars
Tech Career Development Programs
Ryan Harrington, Managing Director
Dharneesh Jayaprakash, Lead Data Engineer
Mat St. Cyr, Application Developer
Agenda
1. What is the data lifecycle?
2. How does technology enable it?
3. What is data governance?
4. How can you address it?
Building a Data Strategy
1. Define goals
2. Audit your data
3. Select tools for the job
4. Develop an implementation plan
5. Build skills
6. Evaluate
Data Life Cycle
Generation
Extract, Transform, Load
Storage
Analytics
Step 1 Step 2 Step 3
Data Generation
Goal:
Collect data accurately, consistently, and in a timely fashion
Best Practices:
Move away from manual data entry and toward automated, standardized data collection
Manual Data Entry
Automated Data Collection
Scenario: Predicting Donor Churn
Inputs: Income Data, Past Donation History, Event Attendance, Engagement with Emails
Question: Is this person likely to donate again? (Yes/No)
How do we get all the data in the same place?
Data Life Cycle
Generation
Extract, Transform, Load
Storage
Analytics
Step 1 Step 2 Step 3
Data Storage
Goal:
Centralized systems that make data accessible and secure
Best Practices:
Move away from on-premises (local) file storage and toward cloud-based storage options
On-Premises Storage
Cloud-Based Storage Platform
Cloud-Based Storage Infrastructure
What pain points might you run into?
The Lone System Trap: All the important data lives on one person’s local computer… and they just went on vacation.
Email Attachment Chaos: Is it Report_v1_Updated.xlsx or Report_ReallyFinal_3.xlsx? Nobody knows.
Dashboard Disaster: Your "automated" reports require so many manual steps that by the time you finish, the data is already outdated.
What pain points might you run into?
Common Pain Points
No single source of truth
Data silos
Lack of version control
Lack of automation
Poor data governance
Data inconsistency
Limited data accessibility
No data lineage & audit trails
Poor performance
Lack of scalability
Do these pain points resonate
with you? Select all that apply.
What type of data do you have?
Structured Data
Unstructured Data
Where does the data actually go?
Data Warehouse: Central repository for structured, cleaned, analytics-ready data
Data Lake: Central storage for raw, diverse data
Data Lakehouse: Unified platform including elements of the data warehouse and data lake
Options in Fabric
Data Life Cycle
Generation
Extract, Transform, Load
Storage
Analytics
Step 1 Step 2 Step 3
Scenario: Predicting Donor Churn
Inputs: Income Data, Past Donation History, Event Attendance, Engagement with Emails
Question: Is this person likely to donate again? (Yes/No)
How do we get all the data in the same place?
Data Engineering Pipeline
Source → Extract → Transform → Load → Storage
Extract the Data
Source → Extract → Transform → Load → Storage
Key Questions
• What are the data sources?
• How frequently does the data change?
• What is the structure of the data?
• How do we connect to the data?
• How do we extract the data?
Extract the Data: Predicting Donor Churn
Past Donation History
Income Data
Engagement with Emails
Event Attendance
Update Frequency: Daily, Daily, Daily, Yearly
Structure: JSON, CSV, JSON, JSON
How to Connect: API, Forms, API, API
How to Extract: Data Factory, Python, Data Factory, Data Factory
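The extraction questions above (update frequency, structure, connection, extraction tool) can be written down as a small source inventory. The following is a minimal Python sketch under stated assumptions, not the presenters' implementation: `SourceSpec`, `REFRESH_DAYS`, `INVENTORY`, and `sources_due` are hypothetical names, the day thresholds are assumed, and the "Annual Survey" entry is an invented source used only to show a yearly cadence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceSpec:
    """Answers to the extraction key questions for one data source."""
    name: str
    update_frequency: str  # "daily" or "yearly"
    structure: str         # e.g. "JSON" or "CSV"
    connection: str        # e.g. "API" or "Forms"
    extractor: str         # e.g. "Data Factory" or "Python"


# Assumed refresh thresholds in days (illustrative, not from the slides).
REFRESH_DAYS = {"daily": 1, "yearly": 365}

# Illustrative inventory; "Annual Survey" is a hypothetical source.
INVENTORY = [
    SourceSpec("Past Donation History", "daily", "JSON", "API", "Data Factory"),
    SourceSpec("Engagement with Emails", "daily", "JSON", "API", "Data Factory"),
    SourceSpec("Annual Survey", "yearly", "CSV", "Forms", "Python"),
]


def sources_due(specs, days_since_last_extract):
    """Return names of sources whose last extract is at least one cadence old."""
    return [
        spec.name
        for spec in specs
        if days_since_last_extract[spec.name] >= REFRESH_DAYS[spec.update_frequency]
    ]


if __name__ == "__main__":
    last_run = {
        "Past Donation History": 2,
        "Engagement with Emails": 0,
        "Annual Survey": 30,
    }
    print(sources_due(INVENTORY, last_run))
```

Keeping the inventory as data rather than prose makes it easy to review with non-technical staff and to reuse when scheduling extracts.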
Transform the Data
Source → Extract → Transform → Load → Storage
Key Questions
• Are there any issues with the data (outliers, missing, etc.)?
• Is the data validated?
• Does the data need to be aggregated? How?
• Do we need to augment the data?
• Does it need to be joined or merged? How?
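The transform questions above map onto three small operations: validating and cleaning records, aggregating them, and joining them with another source. The sketch below is one possible illustration in plain Python for the donor churn scenario. The field names (`donor_id`, `amount`), the function names, and the rule that an amount must be a positive number are all assumptions made for this example.

```python
from collections import defaultdict


def clean_donations(rows):
    """Keep only rows with a donor id and a positive, parseable amount."""
    cleaned = []
    for row in rows:
        donor_id = (row.get("donor_id") or "").strip()
        try:
            amount = float(row.get("amount", ""))
        except (TypeError, ValueError):
            continue  # missing or non-numeric amount: skip the row
        if donor_id and amount > 0:
            cleaned.append({"donor_id": donor_id, "amount": amount})
    return cleaned


def total_by_donor(rows):
    """Aggregate cleaned donation rows into a total per donor."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["donor_id"]] += row["amount"]
    return dict(totals)


def join_engagement(totals, email_opens):
    """Join donation totals with email-open counts (0 when no match)."""
    return {
        donor_id: {"total_donated": total, "email_opens": email_opens.get(donor_id, 0)}
        for donor_id, total in totals.items()
    }


if __name__ == "__main__":
    raw = [
        {"donor_id": "d1", "amount": "25"},
        {"donor_id": "d1", "amount": "10.5"},
        {"donor_id": "", "amount": "5"},      # missing donor id
        {"donor_id": "d2", "amount": "n/a"},  # invalid amount
    ]
    print(join_engagement(total_by_donor(clean_donations(raw)), {"d1": 3}))
```

In practice, rows that fail validation are usually logged or routed to a review queue rather than silently dropped; the sketch skips them only to stay short.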
Load the Data
Source → Extract → Transform → Load → Storage
Key Questions
• Where should the data be stored?
• How does it need to be organized for downstream work?
• What is the expected volume of data?
• How should the data be loaded? Reload all of it? Append it?
• How do we know we did this well?
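The "reload or append" and "how do we know we did this well" questions above can be illustrated with a tiny load step. This sketch uses Python's built-in `sqlite3` module as a stand-in for whatever storage the organization actually uses; the `donations` table, its columns, and the `load_donations` function are hypothetical, and the row count is only one simple check among many possible ones.

```python
import sqlite3


def load_donations(conn, rows, mode="append"):
    """Load (donor_id, amount) tuples and return the resulting row count.

    mode="append" adds rows to what is already stored;
    mode="reload" clears the table first, then inserts.
    """
    if mode not in ("append", "reload"):
        raise ValueError("mode must be 'append' or 'reload'")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS donations ("
        "donor_id TEXT NOT NULL, amount REAL NOT NULL)"
    )
    if mode == "reload":
        conn.execute("DELETE FROM donations")
    conn.executemany(
        "INSERT INTO donations (donor_id, amount) VALUES (?, ?)", rows
    )
    conn.commit()
    # A row count after loading is a basic sanity check on the load.
    return conn.execute("SELECT COUNT(*) FROM donations").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(load_donations(conn, [("d1", 25.0), ("d2", 10.0)]))
    print(load_donations(conn, [("d3", 5.0)], mode="append"))
    print(load_donations(conn, [("d1", 1.0)], mode="reload"))
```

Appending is cheaper for large, frequently updated sources, while a full reload is simpler to reason about when the data is small or the source cannot report what changed.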
Data Life Cycle
Generation
Extract, Transform, Load
Storage
Analytics
Step 1 Step 2 Step 3
            | Azure Synapse        | Microsoft Fabric
Built for   | Engineers            | Analysts
Ease of Use | Technical            | Very Easy
Dev Style   | High Code            | Low / No Code
Control     | Fine-grained         | Limited
Scale       | Petabyte-scale       | Auto-scale (moderate)
Use Cases   | Enterprise Analytics | Self-service BI
Processing  | Advanced ETL / ELT   | Built-in pipelines
Productivity
Control
Demos
Azure Synapse vs Microsoft Fabric
Extract-Transform-Load
What is data governance?
Infrastructure tells us “what”
Ethics tells us “why”
Governance tells us “how”
What Is Data Governance?
The strategic framework and oversight for
managing data’s availability, usability,
integrity, and security within an organization by
setting policies, standards, and procedures that
govern data usage.
Sargiotis, Dimitrios. “Overview and Importance of Data Governance.” Data Governance: A Guide.
Cham: Springer Nature Switzerland, 2024. 487-510.
Examples of Data Governance
Training and Awareness Programs
Developing a Data Quality Plan
Monitoring and Reporting
Establishing Data Quality Metrics and Standards
Data Auditing and Assessment
Establishing Data Stewardship Roles
Why Is Data Governance Important?
Easier use of data for analytics
Prevention of data misuse
Removal of data silos between departments
Ensure compliance with data privacy laws
Increased trust in data
Better understanding of where data comes from
Eight Data Ethics Categories
Purpose Limitation & Informed Consent
Privacy, Security & Confidentiality
Ownership & Data Sovereignty
Fairness, Equity & Bias Mitigation
Transparency & Accountability
Data Quality & Integrity
Harm Minimization
Openness vs. Protection
How do we address these issues
with governance?
Key Questions to Consider
1. How does data flow through our organization?
2. How is our data collected and stored?
3. How do we ensure that our data is high quality?
4. Who has access to our data?
5. What security protocols are in place to protect our data?
6. How are we considering policies and regulations that impact our data?
7. How are we empowering and educating our employees and volunteers?
Ensuring High Quality Data
Data Quality
Accuracy
Completeness
Consistency
Timeliness
Reliability
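Two of the dimensions above, completeness and timeliness, are straightforward to measure. The following is a rough Python sketch of how that might look, not a standard definition: the function names, the record fields, and the idea of scoring each dimension as a share of records between 0 and 1 are assumptions for illustration. Accuracy, consistency, and reliability usually need reference data or domain rules and are not covered here.

```python
from datetime import date


def completeness(records, required_fields):
    """Share of records where every required field is present and non-empty."""
    if not records:
        return 0.0
    complete = sum(
        all(record.get(field) not in (None, "") for field in required_fields)
        for record in records
    )
    return complete / len(records)


def timeliness(records, date_field, as_of, max_age_days):
    """Share of records updated within max_age_days of the as_of date."""
    if not records:
        return 0.0
    fresh = sum(
        (as_of - record[date_field]).days <= max_age_days for record in records
    )
    return fresh / len(records)


if __name__ == "__main__":
    donors = [
        {"donor_id": "d1", "email": "a@example.org", "updated": date(2025, 8, 20)},
        {"donor_id": "d2", "email": "", "updated": date(2025, 8, 1)},
        {"donor_id": "d3", "email": "c@example.org", "updated": date(2024, 8, 1)},
        {"donor_id": None, "email": "d@example.org", "updated": date(2025, 8, 25)},
    ]
    print(completeness(donors, ["donor_id", "email"]))
    print(timeliness(donors, "updated", date(2025, 8, 27), max_age_days=30))
```

Tracking numbers like these over time is one way to turn "establishing data quality metrics and standards" into something a team can monitor and report on.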
Data Access & Ownership
Data Owner: Accountable for classification, protection, use, and quality of one or more datasets
Data Steward: SME with a thorough understanding of a particular dataset. Ensures alignment with governance standards set by the Owner
Data Custodian: Responsible for implementing security controls for a given dataset
Key Components of Data Security
Legal Compliance
Protect Intellectual Property
Maintain Trust
Prevent Harm
Policies and Regulations
HIPAA: Health Records
FERPA: Student Records
COPPA: Online Child Data
FCRA: Credit Data
Personal Data vs PII
Personal Data and PII
PII, but not Personal Data
Neither PII nor Personal Data
The name “John Smith” and their SSN
The name “John Smith” from a list of LinkedIn profiles
A person’s age, zip code, and ethnicity
The name “John Smith” on a list of patients for an HIV clinic
The name “John Smith” and their salary as a public school teacher
The name “John” on a list of patients for an HIV clinic
Empowering Employees
Consider data catalogues
Act as a searchable “front door” to your organization’s data for internal stakeholders
• Improves discovery and reduces internal friction
• Details what exists, who owns it, how to use it, and riskiness
What’s included?
Business Metadata
• Plain language name
• Description
• Program
• Use cases
Governance Metadata
• Owner / Steward / Custodian
• Sensitivity / Classification
• Retention Policies
• Sharing Rules
• Legal Considerations (HIPAA, etc.)
• Data Quality Notes
Technical Metadata
• System of Record
• Table / File Path
• Refresh Cadence
• Lineage
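A catalogue entry of this kind can be pictured as a simple record with a keyword search over it. The sketch below is a minimal illustration in Python using a subset of the fields listed above. `CatalogEntry`, `search`, and the sample entries are hypothetical, and a real catalogue tool would add access control, lineage, and much more.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    # Business metadata
    name: str
    description: str
    # Governance metadata
    owner: str
    sensitivity: str
    # Technical metadata
    system_of_record: str
    refresh_cadence: str


def search(catalog, term):
    """Return entries whose name or description contains the term."""
    term = term.lower()
    return [
        entry
        for entry in catalog
        if term in entry.name.lower() or term in entry.description.lower()
    ]


if __name__ == "__main__":
    # Sample entries are invented for illustration.
    catalog = [
        CatalogEntry(
            name="Donor gifts",
            description="One row per donation received",
            owner="Development team",
            sensitivity="Confidential",
            system_of_record="CRM",
            refresh_cadence="Daily",
        ),
        CatalogEntry(
            name="Event sign-ins",
            description="Attendance recorded at events",
            owner="Programs team",
            sensitivity="Internal",
            system_of_record="Forms",
            refresh_cadence="Daily",
        ),
    ]
    for entry in search(catalog, "donation"):
        print(entry.name, "| owner:", entry.owner, "| sensitivity:", entry.sensitivity)
```

Even a spreadsheet with these columns can serve as a first catalogue; the point is that each dataset has a findable description, a named owner, and a stated sensitivity.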
Putting it all together
1. Tools like Microsoft Fabric and Azure Synapse are the connective tissue that enables analytics for organizations.
2. Learning how to use and manage data storage in a cloud-based system will help meet your organization’s data needs.
3. Effective governance is the “how” behind any safe and effective data usage at your organization.
Optimize Resources
Build Trust
Improve Decision Making
Advance Mission
How Microsoft Enables This
Microsoft Fabric
Azure AI Foundry
Azure Synapse Analytics
Microsoft Purview