Northern New England Tableau User Group - September 2024 Meeting
patrickdtherriault
52 slides
Oct 01, 2024
About This Presentation
Northern New England Tableau User Group - September 2024 Meeting
Size: 13.38 MB
Language: en
Added: Oct 01, 2024
Slides: 52 pages
Slide Content
Northern New England TUG
  The Roux Institute at Northeastern University & Virtual
  September 2024
September 2024 Agenda
  Start by 9: Welcome, Introductions
  9:05 – 9:45: Presentation by Nicholas Lisac
  15 min: Quick Break
  10:00 – 10:40: Presentation by Joe Reynolds
  Until 11: Wrap-up & Networking
  Hybrid attendees please be patient during breaks!
Today's Speakers
  Joe Reynolds: Senior Data Analyst, Decision Support, Northeastern University College of Professional Studies
  Nicholas Lisac: Senior Data Analyst | InvoiceCloud; Adjunct Faculty Instructor | SNHU
Putting the Tableau Metadata API to work at InvoiceCloud
What is the Metadata API?
  “…the Metadata API identifies all of the databases, files, and tables used by the content on your Tableau Cloud site or Tableau Server.”
  Discover data that’s associated with the content published to your Tableau Cloud site or your Tableau Server.
  Track lineage, or the relationships between content and external assets, like data sources and workbooks.
  Perform impact analysis. Using upstream and downstream lineage information, you can evaluate the impact of changes to content.
  Offers capabilities similar to Data Management but is available to Cloud users for free!
  Confidential & Proprietary | 2
What is the problem we’re solving?
  We need to update our core data model used for sales pipeline reporting.
  The source is very expensive to run, store, and load, and has many ‘versions’ of itself.
  We’re hoping to gain efficiency by removing redundant columns.
  We’re hoping to increase simplicity by consolidating data sources.
  As these sources are widely used, we don’t want to disrupt our analysts, and we want to know who we may be affecting.
Specific Applied Uses
  Field Deprecation
    A new field needs to be used in place of a deprecated field.
    Can we find all workbook owners, workbooks, sheets, and fields in those sheets that use the old column?
    Tool Input: Search for Deprecated Column Name
    Desired Output: Table of all Workbooks where Column is used
  Data Source Consolidation
    We want to find the most streamlined column set for consolidation.
    Can we find all workbooks and sheets that employ certain tables, and then compare column lists to find the smallest set of useful fields?
    Tool Input: Select Data Table(s)
    Desired Output: Table of all Columns used in Tableau
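Both uses above reduce to simple lookups once the lineage is in tabular form. The following is a minimal Python sketch, not the presenter's tool: it assumes one row per sheet field and upstream column pair, and the row layout, the function names, and every value in `LINEAGE_ROWS` are invented for illustration.

```python
# Hypothetical lineage rows: one row per (sheet field, upstream column) pair.
# Every name and value here is invented for illustration.
LINEAGE_ROWS = [
    {"owner": "avery", "workbook": "Pipeline Overview", "sheet": "By Stage",
     "field": "Stage", "table": "core_sales", "column": "stage_old"},
    {"owner": "avery", "workbook": "Pipeline Overview", "sheet": "By Stage",
     "field": "Amount", "table": "core_sales", "column": "amount"},
    {"owner": "blake", "workbook": "Forecast", "sheet": "Trend",
     "field": "Stage Group", "table": "core_sales", "column": "stage_old"},
    {"owner": "blake", "workbook": "Forecast", "sheet": "Trend",
     "field": "Region", "table": "territories", "column": "region"},
]


def find_column_usage(rows, column_name):
    """Field deprecation: who uses this column, and where?

    Returns sorted, de-duplicated (owner, workbook, sheet, field) tuples.
    The column name is matched case-insensitively.
    """
    wanted = column_name.casefold()
    hits = {
        (r["owner"], r["workbook"], r["sheet"], r["field"])
        for r in rows
        if r["column"].casefold() == wanted
    }
    return sorted(hits)


def columns_in_use(rows, tables):
    """Consolidation: the set of columns actually used from the selected tables."""
    selected = {t.casefold() for t in tables}
    return sorted({r["column"] for r in rows if r["table"].casefold() in selected})


if __name__ == "__main__":
    for hit in find_column_usage(LINEAGE_ROWS, "stage_old"):
        print(hit)
    print(columns_in_use(LINEAGE_ROWS, ["core_sales"]))
```

Any column of a selected table that is missing from the `columns_in_use` result is a candidate for removal, which is the comparison the consolidation use case describes.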
Tableau Preview
  Search for Deprecated Column Name: Table of all Workbooks where Column is used
  Select Data Table(s): Table of all Columns used in Tableau
Using the Metadata API
  We can identify dependencies on a data table by viewing which workbooks and sheets are using columns from that table.
  This method starts by finding all sheets and fields on those sheets, and then looks ‘upstream’ to find the columns all those fields are dependent on.
  Diagram labels: Server Level Information, Data Inside Workbook, Data Table Information (Upstream / Downstream)
Data Extraction & Processing
  My Steps:
  Workshop the query in the GraphiQL web tool and Postman
  Save Query as JSON file
  Build Pipeline in Knime
    Authenticate via REST API
    Create Pagination
    Run Queries
    Parse JSON to table
  Load table to Tableau
  View and analyze data in Tableau
Live Demo: GraphiQL, Knime, and Tableau
  The next few slides are meant to act as a resource for those who may have missed the demo
GraphiQL
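The talk builds the authenticate, paginate, and run steps in Knime. As a rough equivalent, here is a minimal Python sketch using only the standard library. It assumes personal access token sign-in through the REST API and offset-based paging like the `first` / `offset` arguments in the deck's query. The function names, the trimmed `PAGE_QUERY`, and the server, site, and API version values passed in are placeholders, not values from the presentation.

```python
import json
import urllib.request

# Trimmed paging query; a real run would use the full saved query instead.
PAGE_QUERY = "{ sheetsConnection(first: %d, offset: %d) { nodes { id name } } }"


def build_signin_request(server, api_version, site_content_url, token_name, token_secret):
    """Build (but do not send) a REST API sign-in request for a personal access token."""
    body = {
        "credentials": {
            "personalAccessTokenName": token_name,
            "personalAccessTokenSecret": token_secret,
            "site": {"contentUrl": site_content_url},
        }
    }
    return urllib.request.Request(
        url=f"{server}/api/{api_version}/auth/signin",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def build_metadata_request(server, auth_token, first, offset):
    """Build one paged Metadata API request, authenticated with the sign-in token."""
    body = {"query": PAGE_QUERY % (first, offset)}
    return urllib.request.Request(
        url=f"{server}/api/metadata/graphql",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Tableau-Auth": auth_token},
        method="POST",
    )


def send_json(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def paginate(run_page, page_size=100):
    """Yield sheet nodes page by page until a short (or empty) page is returned.

    run_page(first, offset) must return the decoded GraphQL response.
    """
    offset = 0
    while True:
        payload = run_page(page_size, offset)
        nodes = payload["data"]["sheetsConnection"]["nodes"]
        yield from nodes
        if len(nodes) < page_size:
            break
        offset += page_size
```

A live run would send the sign-in request with `send_json`, read the token from `["credentials"]["token"]` in the response, and pass `lambda first, offset: send_json(build_metadata_request(server, token, first, offset))` to `paginate`. Keeping the network call behind `run_page` means the paging logic can be checked offline with a stub.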
GraphQL Query for you to start from
  {
    sheetsConnection(first: 1, offset: 0) {
      nodes {
        name
        id
        workbook {
          luid
          name
          projectName
          createdAt
          updatedAt
          owner {
            luid
            name
          }
        }
        sheetFieldInstances {
          name
          id
          ... on CalculatedField {
            formula
          }
          upstreamFields {
            id
            name
            ... on CalculatedField {
              formula
            }
            datasource {
              name
              id
            }
            upstreamColumns {
              id
              name
              table {
                name
                ... on CustomSQLTable {
                  name
                  id
                }
                ... on DatabaseTable {
                  id
                  name
                }
                ... on VirtualConnectionTable {
                  id
                  name
                }
              }
            }
          }
        }
      }
    }
  }
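The "Parse JSON to table" step for a query of this shape can be sketched in Python as below. This is an illustration, not the Knime workflow from the talk: it assumes the response mirrors the nesting of the query above, and both the `flatten_sheets` helper and the contents of `SAMPLE_RESPONSE` are invented.

```python
def flatten_sheets(payload):
    """Flatten a sheetsConnection response into one row per upstream column.

    Missing or null branches (for example a field with no datasource) are
    tolerated; a field with no upstream columns produces no rows.
    """
    rows = []
    for sheet in payload["data"]["sheetsConnection"]["nodes"]:
        workbook = sheet.get("workbook") or {}
        owner = workbook.get("owner") or {}
        for field in sheet.get("sheetFieldInstances") or []:
            for upstream in field.get("upstreamFields") or []:
                datasource = upstream.get("datasource") or {}
                for column in upstream.get("upstreamColumns") or []:
                    table = column.get("table") or {}
                    rows.append({
                        "owner": owner.get("name"),
                        "workbook": workbook.get("name"),
                        "sheet": sheet.get("name"),
                        "field": field.get("name"),
                        "upstream_field": upstream.get("name"),
                        "datasource": datasource.get("name"),
                        "table": table.get("name"),
                        "column": column.get("name"),
                    })
    return rows


# Invented sample response, trimmed to the keys the function reads.
SAMPLE_RESPONSE = {"data": {"sheetsConnection": {"nodes": [{
    "name": "By Stage",
    "workbook": {"name": "Pipeline Overview", "owner": {"name": "avery"}},
    "sheetFieldInstances": [
        {"name": "Stage", "upstreamFields": [
            {"name": "Stage", "datasource": {"name": "Sales DS"},
             "upstreamColumns": [{"name": "stage", "table": {"name": "core_sales"}}]},
        ]},
        {"name": "Win Rate", "upstreamFields": [
            {"name": "Won", "datasource": {"name": "Sales DS"},
             "upstreamColumns": [{"name": "is_won", "table": {"name": "core_sales"}}]},
            {"name": "Orphan", "datasource": None, "upstreamColumns": []},
        ]},
    ],
}]}}}

if __name__ == "__main__":
    for row in flatten_sheets(SAMPLE_RESPONSE):
        print(row)
```

One row per column keeps the output flat enough to load into Tableau directly and to filter by table or column name.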
Initial Results
  Starting Conditions: Sales Data
    Core model has 34 million rows and 179 columns
    Has 10+ ‘child’ data sources that use it as a base, each with its own business purpose
    Takes 100-120 minutes to create the core extract
  Efficiency Gains
    We found 98 columns in the Core table that are unused in any downstream workbooks
    If each were removed, we could reduce total table size to roughly 2.8 billion cells (34 million rows × 81 remaining columns)
    This, combined with other improvements that reduce our row count from 34 million to 6 million, reduces our total size from 6 billion to 500 million cells, reducing extract time to under 4 minutes
  Consolidation Opportunity
    The remaining 81 columns not slated for deprecation encompass every column used in the workbooks connected to the 10+ downstream data sources
    This gives us the blueprint necessary to create our universal consolidated data model without missing any required fields
Thank you!
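The cell counts on this slide can be checked with a few lines of Python, assuming a cell count is simply rows times columns. The inputs are the figures from the slide; the variable names are illustrative.

```python
# Figures taken from the slide.
ROWS_BEFORE = 34_000_000
COLUMNS_BEFORE = 179
UNUSED_COLUMNS = 98
ROWS_AFTER = 6_000_000

columns_after = COLUMNS_BEFORE - UNUSED_COLUMNS           # 81 columns remain
cells_before = ROWS_BEFORE * COLUMNS_BEFORE               # 6,086,000,000 (~6 billion)
cells_removed_by_columns = ROWS_BEFORE * UNUSED_COLUMNS   # 3,332,000,000
cells_after_column_cut = ROWS_BEFORE * columns_after      # 2,754,000,000 (~2.8 billion)
cells_after = ROWS_AFTER * columns_after                  # 486,000,000 (~500 million)
reduction = 1 - cells_after / cells_before                # about 0.92

if __name__ == "__main__":
    print(f"columns remaining:        {columns_after}")
    print(f"cells before:             {cells_before:,}")
    print(f"cells after column cut:   {cells_after_column_cut:,}")
    print(f"cells after row cut too:  {cells_after:,}")
    print(f"overall reduction:        {reduction:.0%}")
```

At the original row count, dropping the 98 unused columns removes about 3.3 billion cells and leaves about 2.8 billion. Cutting rows to 6 million as well brings the total to about 0.5 billion, roughly a 92% reduction from the starting 6 billion.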
Take a break, you deserve it! Back in 15
If a dashboard falls in the Tableau server…?
  Joe Reynolds, Sr. Data Analyst, Decision Support & Academic Quality Assessment
  Northeastern University, College of Professional Studies
  09.19.2024
Three focus-areas when modernizing a new data process
  Working across the aisle with other units.
  The myth of self-service analytics – aim for the value zone.
  Building data habits, data hygiene through automated processes.
Professional Background
  English Undergrad at University of Southern Maine
  Field study in Brazil based on Freire's Pedagogy of the Oppressed
  10 years in media product development, New York & Maine
  First real-time-bidding system for mobile advertising
  Digital and traditional media in Maine
  Transitioned to data in 2020
  First Graduating class of The Roux Institute
  Northeastern University, Late Fall 2022 (~2 years)
The College of Professional Studies …by the numbers
  Serving ~8,000 Learners: 6,500 Graduate & Doctoral, 1,500 Undergraduate
  Professional Programs: 14 Bachelor, 23 Graduate, 3 Doctoral, 50 Certificates
  This academic year we will run: 800 courses, 3,600 class sections, across 6 campus locations in our global network
  Faculty & Staff: 90 Full-time, 20 Half-time, 800 Part-time, 130 Staff
  This academic year we will confer: 200 Doctorates, 2,500 Graduate, 300 Undergraduate
College SLT structure
  Org chart: Dean, two Sr Assc Deans, five Associate Deans
  Chart labels: P&L Responsibility; Vision & external affairs; Execution, investment, change
Course Billing Credits, Point-in-Time
How does it work?
The “4-box” and tabular views
Scaling up the value of previous versions But, Joe, if this viz was already a known quantity for the organization, what did you improve?
Viz and table parameter actions
Course Billing Credits, Point-in-Time
Touchdown!! But, where did you start?
Working with colleagues across the aisle
Don’t be a demanding jerk! Get curious.
  Remember that they are just like you and are handling competing requests from multiple stakeholders!
  What are the big problems your organization is trying to solve right now?
  How do you prefer to co-create with your collaborators?
  What have you seen work to help people come together around this metric in the past?
Orient your problem to their stakeholder matrix.
Make it intuitive for your development partner to understand why managing your powerful, interested stakeholder closely is also in their best interest.
The secret password to unlock urgent curiosity:
  If we had that stakeholder in the room with us, what do you think are the one or two things they are most likely to ask about or poke holes in?
The myth of self-serve analytics (aim for the value zone.)
The myth of self-service analytics
Email is the value zone.
What’s the best way to get to the value zone?
  Microsoft Power Automate is a SaaS platform by Microsoft for optimizing and automating workflows and business processes. It is part of the Microsoft Power Platform line of products, which include Power Apps
Power Automate > Tableau subscriptions
  Personalize
  Customize
  Speed-to-value
  Scale
Building data habits, data hygiene through automated processes.
Proof that it works: responses within 24 hrs…
  “Thanks - this is very helpful.”
  “Thank you, Joe. Will forward to my program leads.”
  “Thank you for your efforts here, Joe. This is great.”
So, if a dashboard you need falls…
There are three ways to refocus:
  Work across units to align interests and get buy-in on a new process.
  Unwind the myth of self-service analytics and aim for the value zone.
  Help your team build better data habits through automated processes.