04242024_CCC TUG_Joins and Relationships

ccctableauusergroup 236 views 56 slides Apr 25, 2024
Slide 1
Slide 1 of 56
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49
Slide 50
50
Slide 51
51
Slide 52
52
Slide 53
53
Slide 54
54
Slide 55
55
Slide 56
56

About This Presentation

April meeting of the California Community Colleges Tableau User Group (CCC TUG). April showers bring Tableau powers! Andrew Drinkwater discusses joins and relationships using student enrollment data, Emmanuel Sanchez-Tovar showcases DI dashboards from Oregon State University, and Sam Homier walks th...


Slide Content

CCC TUG April showers bring Tableau powers April 24 , 2024

CCC TUG April showers bring Tableau powers April 24 , 2024

This meeting is being recorded

Agenda Announcements Relationship Advice Tableau Showcase Office Hours Next Meeting - June 18, 2024

Tableau Conference 2024 April 29 - May 1 San Diego, CA

The data event of the year is headed to San Diego. Iron Viz Tableau Doctor Tableau True to the Core Data Village 1 visionary keynote 200+ breakout sessions 60 hands-on trainings 80+ theater sessions #DataFam Community Events Data Night Out Networking with thousands of peers 3 epic days to unlock the power of data for everyone!

Know Before You Go HE TUG Recording: https://usergroups.tableau.com/events/details/tableau-higher-education-tableau-user-group-presents-hetug-april-2024-spring-into-action-with-a-tc24-kbyg-dont-miss-a-beat-with-tableau-pulse/ Sarah Bartlett’s Blog: https://sarahlovesdata.co.uk/2024/04/09/tableau-conference-2024-know-before-you-go/

Salesforce+ Register for free: https://www.salesforce.com/plus/

Relationship Advice Andrew Drinkwater

Tips & Tricks on Joins and Relationships

Agenda Who Am I? Combining Data in Tableau: Joins, Relationships, Blends Overview of Key Concepts When and Why to Use Joins and Relationships Demo Gotcha's Rules of Thumb 

CO-FOUNDER & PRESIDENT PLAID ANALYTICS ENROLLMENT FORECASTER PAST BOARD MEMBER, CANADIAN INSTITUTIONAL RESEARCH & PLANNING ASSOCATION HETUG CO-ORGANIZER TABLEAU SINCE 2007 Andrew Drinkwater About Me

About Plaid Analytics Plaid empowers Strategic Enrolment Management professionals to actively plan & better serve their diverse learners while meeting institutional goals.  

Plaid Integrate Plaid Analyze Plaid Forecast Plaid Govern Data Warehouses & Pipelines Strategic & Operational Dashboards Enrollment & Tuition Forecasting Data Governance Tools plaid.is  

In early 2024, the Government of Canada cut international study permits by 30% year over year (from 500k to 365k). With a median tuition rate of about $36,000 that could mean a budget cut of $4 billion to higher education institutions. Enrollment Planning is More Important Than Ever Before Original growth plan Revised growth plan ‹#› This chart shows an example flatlining intakes at Plaid University, which was created for demo purposes. Growth continues for several years as a result of past years growth rates.  

Combining Data from Different Sources in Tableau

Slides: https://at.plaid.is/ccc Workbook: Tableau Public: https://at.plaid.is/tp Materials

Join Relationship Blend

Relationships Joins Response in form of question Only one response acceptable Closest to without going over Many responses acceptable Image source: Seattle Times Image source: Jacksonville.com

Joins vs Relationships Join Relationship Level of Detail Row Level Aggregate Input Physical Tables Logical Tables Output Logical Table Definition for how to query multiple tables when used in a visualization Join Type (Inner, Left, Right, Outer) Fixed when you create the data source Flexible, based on measures and dimensions in the viz Queries Executed as part of every query Only tables in viz are included in the query Extracts Can be pre-computed in an extract By design, cannot be pre-computed (joined at run-time, if needed) Created by Plaid Consulting Inc. Adapted from Bethany Lyons, Tableau Conference 2019

Blends vs Relationships Blends Relationship Scope Worksheet Data source Sharing Cannot be published Can be published Semantics Primary and Secondary data sources All tables are equal (no primary, no secondary) Join Type Only left join Supports all join types Performance Locally Computed Part of the SQL query Related Fields Variable Fixed Data Models Direct Direct and Indirect Created by Plaid Consulting Inc. Adapted from Bethany Lyons, Tableau Conference 2019

Joins Combine 2 input tables, creating a 3rd table A B C Created by Plaid Consulting Inc. Inspired by Bethany Lyons, Tableau Conference 2019

Combine 2 input tables, creating a 3rd table A B C Created by Plaid Consulting Inc. Inspired by Bethany Lyons, Tableau Conference 2019 Joins

Row-Level Joins Only one way to join. Therefore, we need many data sources for different types of questions. Many ways to join. Therefore, we can use aggregate joins in more flexible ways if they’re nimble. Created by Plaid Consulting Inc. Adapted from Bethany Lyons, Tableau Conference 2019 Aggregate Joins B C A B C A FIXED OUTPUT B B C C VARIABLE OUTPUT

Flexible way to combine data from multiple tables. Keeping the original level of detail and domain for each table. Fields in your viz determine what query and joins are run in real time Use a single Tableau data source to answer multiple questions. Relationships

Multiple tables Multiple levels of detail Much more flexibility Joins created on the fly – based on measures, dimensions, and tables used in the viz. B C A E F D Relationship Created by Plaid Consulting Inc. Adapted from Bethany Lyons, Tableau Conference 2019 Relationships: Idea

You work in Strategic Enrolment Management Key Question: Are we on track to meet our admission targets? Let’s compare: Applicants versus Targets Relationships – Example: Applicants vs Targets

Relationship Logical Layer Physical Table A Physical Layer Physical Table B Physical Table X Physical Table C Physical Table D Physical Layer Logical Table A consists of 4 Physical Tables Logical Table X consists of 1 Physical Table Logical Table A Logical Table X Relationship Created by Plaid Consulting Inc. Image adapted from Tableau Online Help ; also inspired by Interworks . Relationships: General Framework

Relationship Logical Layer Applications Physical Layer Program Intake Targets Term National Status Physical Layer Logical Table Applicants consists of 4 Physical Tables Logical Table Targets consists of 1 Physical Table Applicants Targets Relationship Created by Plaid Consulting Inc. Image adapted from Tableau Online Help ; also inspired by Interworks . Relationships: Applicants vs Targets

Today’s example: Students and Courses Relationship Logical Layer Stdnt Car Term Physical Layer Acad plan (+tbl) Stdnt enrl Acad prog tbl Term tbl Physical Layer Logical Table Applicants consists of 4 Physical Tables Logical Table Targets consists of 2 Physical Tables Students in Programs and Terms Students in Courses Relationship Created by Plaid Consulting Inc. Image adapted from Tableau Online Help ; also inspired by Interworks . Class tbl

Examples Demo: Actuals versus Targets Demo: Students and Courses Discussion: Data Densification Row Level Security Surveys

Demo 1: Applicants vs Targets

Most calculations in Tableau allow you to compute values at the data source level or the visualization level Data Source Visualization

Level of Detail Expressions allow you to compute values at three levels: Data Source Visualization Less Granular (Exclude) More Granular (Include) Independent (Fixed)

Demo 2: Which Courses Did a Student Take Before Changing Majors?

Discussion: Relationships for Row Level Security Survey Analysis

Discussion RLS: Seems to perform faster with relationships. Extracts are helpful for RLS, and as such the relationship method may be optimal because it pulls in each table independently rather than combining them, until needed. Surveys: most survey data is in the form of one row per respondent (and many columns). Tableau usually needs one row per respondent per question. With relationships, you can actually have both! This allows the best of both worlds – use the normalized version for visualizing results, and the base (denormalized) version for inter-question analysis and any demographic variables.

Missing values: Tableau often assumes you want an inner join when actually you want a left join. If your data disappears when you add a new field to the canvas, try adding the count field from all tables onto the view (Detail on the marks card works well for this) Illogical totals: Sometimes the detailed data doesn’t roll up nicely to a summary level. In this case, it’s because there are white students in every class section of CRIM 421, but other race/ethnicity are only in the majority of sections. Relationships – Gotcha’s

Performance Considerations Relationships Can get very slow if you have many tables where the database could have done the heavy lifting and eliminated a lot of the records. Consider limiting fields to improve the speed at which full tables are imported. Joins Superior for any case when the secondary table is less granular than the primary table (especially true for lookup tables). Work better when the database can eliminate a ton of records. Consider adding indexes and other efficiency mechanisms in the database rather than Tableau.

Summary Relationships in Tableau are a game-changer. They’re helpful because: More versatile than conventional joins. One data source can now answer many questions. No missing or duplicated data.

Rules of Thumb Use Joins Use Relationships Rule of thumb If the secondary table have granularity less than or the same as the primary table If you can use an inner join and have no duplication or loss of data If the secondary table has granularity greater than or different from the primary table. (or if you don’t know what to use) Example: Combining Student Term data with Student Course data Granularity: one row per: Student Term: (student, term) Student Course: (student, term, course) Student: (student) Term: (term) Student Term should be joined to: Student (less granular than Student Term: one row per student) Term (less granular than Student Term: one row per term) Student Term should be related to * Student Course (more granular than Student Term: one row per student, term, and course ) For this, assume Table A is “primary” and Table B is “secondary”. Use data blends only when joins and relationships do not work (note that you cannot blend a relationship data source)

Rules of Thumb Joins often perform better where cardinality is 1:1 or many:1 Figure out core granularity of your data. Less granular (eg validation tables) -> Join; More or different granularity (eg students vs courses) should be relationship Relationships work better where you have many:many relationships, different levels of aggregation, or for those newer to the concept of combining data with Tableau. If Joins and Relationships don’t work, Use Blends Note that blends do not work if you try to blend to a logical table One advantage of blends is you can define a different blend key on every worksheet.

Slides: https://at.plaid.is/ccc Workbook: Tableau Public: https://at.plaid.is/tp Materials

[email protected] www.linkedin.com/in/andrewdrinkh2o

About the data All data ahead is completely fake. Any names or other identifiers are generated algorithmically. Some patterns are modelled after publicly available data, such as proportion of international students or grade distributions. Plaid University admits ~3,000 students per academic year, most in the fall. Students study in one of 9 faculties, ranging from trades to law, arts to engineering. Students register in ~50% of their courses within their major, and 50% outside of it.

About the data Today’s example contains: 333,002 student-term records spanning Fall 2012-Fall 2021 17,445 students 1,399 internal transfers 1,031,031 student course enrolment records This database looks similar to a PeopleSoft Student Information System

Sources Clearpeaks. Tableau 2020.2 New Data Model. https://www.clearpeaks.com/tableau-2020-2-new-data-model Data Vizardry. Tableau – Understanding Relationships in the New Data Model – Part 1. https://datavizardry.com/2020/06/05/tableau-understanding-relationships-in-the-new-data-model-part-1/ The Flerlage Twins (2020). Tableau’s New Data Model. https://www.flerlagetwins.com/2020/05/tableau-data-model.html Lyons, B. (2019). Data Model Enhancements. https://tc19.tableau.com/learn/sessions/data-model-enhancements Interworks. Comparing Tableau’s New Relationships to Blends and Joins. https://interworks.com/blog/2020/04/21/comparing-tableaus-new-relationships-blends-joins/ Tableau (2020): Relationships Part 1. https://www.tableau.com/about/blog/2020/5/relationships-part-1-meet-new-tableau-data-model Relationships Part 2. https://www.tableau.com/about/blog/2020/5/relationships-part-2-tips-and-tricks Relationships Part 3. https://www.tableau.com/about/blog/2020/6/relationships-part-3-asking-questions-across-multiple-related-tables Relate Your Data. https://help.tableau.com/current/pro/desktop/en-us/relate_tables.htm Don’t Be Scared of Relationships. https://help.tableau.com/current/pro/desktop/en-us/datasource_dont_be_scared.htm Don’t be Scared of Deeper Relationships. https://help.tableau.com/current/pro/desktop/en-us/datasource_dont_be_scared_deeper.htm Tessellation (2020). Tableau’s New Relationships and What They Mean. https://www.tessellationtech.io/understanding-tableau-relationships/

Tableau Showcase Emmanuel Sanchez Tovar

Office Hours Sam Homier

Resources Link to Sample Workbook https://public.tableau.com/views/Maskingvs_SuppressingSmallDataValuesCCCTUG/Maskedvs_SuppressedCount?:language=en-US&:sid=&:display_count=n&:origin=viz_share_link Custom Number Format https://help.tableau.com/current/pro/desktop/en-us/formatting_specific_numbers.htm Data De-Identification Overview (Department of Education) https://studentprivacy.ed.gov/sites/default/files/resource_document/file/data_deidentification_terms_0.pdf Relevant Tableau Community Question https://community.tableau.com/s/question/0D58b0000BTEVedCQH/masking-data-and-retain-value

See you at the next meeting! Tuesday , June 18, 2024 12:00 - 1:30 PM One HE/CCC of a meeting! John Fink & Tatiana Velasco Community College Resource Center And more!

Do you have something you want to share? Let us know! Next meeting topic: Transfers https://docs.google.com/forms/d/e/1FAIpQLScJHLKhTRtrkvvLtpfHQBRBI_anicDqVbze0KJBiVyo9ptdEg/viewform

Samantha Homier Kimberly Coutts Lisa Trescott [email protected] @lisatrescott Contact Information [email protected] @kimstrezz [email protected] CCC TUG [email protected] @CCC_TUG
Tags