Data is beautiful, please don't ruin it

AnneMarieT2 90 views 43 slides Jun 18, 2019

About This Presentation

My presentation at WiMLDS Paris, March 12, 2019


Slide Content

Data is beautiful, please don't ruin it! Anne-Marie Tousch, Senior Research Scientist (@amy8492). March 12th, 2019

2 reasons to visualize data. Photo by Yuri Loginov from Pexels

I keep seeing plain tables. Do they want me to read all this? Did they copy-paste their slides from their paper? Do they care about their audience? Do they care about giving this talk? Are they hiding something? Do they realize a dataviz would be much more powerful? Most respectful interpretation?

Efficient communication. A picture tells a thousand words. Source: Business Insider, August 2016

Summary statistics

Detecting patterns. Never trust summary statistics alone; always visualize your data. Source: http://www.thefunctionalart.com/2016/08/download-datasaurus-never-trust-summary.html
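The Datasaurus linked above makes this point with a picture. A numeric sketch of the same idea follows, using Anscombe's quartet instead (a different, older dataset, swapped in here only because it is small enough to type inline). The helper names `pearson` and `summarize` are illustrative and not taken from the talk.

```python
from math import sqrt
from statistics import mean

# Anscombe's quartet: four (x, y) datasets with near-identical summary statistics.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, computed by hand to stay stdlib-only."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Return (mean of x, mean of y, correlation) for one dataset."""
    return mean(xs), mean(ys), pearson(xs, ys)


for name, (xs, ys) in QUARTET.items():
    mx, my, r = summarize(xs, ys)
    print(f"{name:>3}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.2f}")
```

All four datasets report practically the same means and correlation, yet a scatter plot of each looks completely different: a noisy line, a curve, a line with one outlier, and a vertical stack with one outlier.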

What is dataviz about?

Visualizing is always great, not only for data. Maxwell’s model of Saturn’s rings, 1858

Visualize algorithms http://playground.tensorflow.org

Many ways to apprehend the world: 7,000,000 visitors a year, 2,500,000 rivets, 10,100 tons. Picture credits: Wiki Commons / the_jetboy CC2.0 / Sandor Vamos / Wiki Commons / Acute3D / Walkerssk / Pixabay / Wiki Commons / Wiki Commons

Exploiting the human visual system Ever heard of Gestalt theory?

Count the 3s below. Ready?

Count the 3s below 756395068473 658663037576 860372658602 846589107830 Source: http://www.storytellingwithdata.com/book/downloads  

How many 3s? Source: http://www.storytellingwithdata.com/book/downloads  

Count the 3s below (3s highlighted) 756[3]9506847[3] 65866[3]0[3]7576 860[3]72658602 8465891078[3]0 Source: http://www.storytellingwithdata.com/book/downloads

Much easier now, huh? Source: http://www.storytellingwithdata.com/book/downloads
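The exercise above can be reproduced in a terminal. This is a minimal sketch, assuming a terminal that understands ANSI colour codes; the function names `count_digit` and `mark_digit` are made up for illustration. It counts the 3s in the four rows from the slide, then prints the rows again with each 3 in bold red so the eye finds them without scanning.

```python
# The four rows of digits shown on the slide.
ROWS = ["756395068473", "658663037576", "860372658602", "846589107830"]

# ANSI escape sequences: bold red on, then reset (assumes an ANSI-capable terminal).
BOLD_RED = "\033[1;31m"
RESET = "\033[0m"


def count_digit(rows, digit="3"):
    """Count how many times `digit` appears across all rows."""
    return sum(row.count(digit) for row in rows)


def mark_digit(row, digit="3"):
    """Return `row` with every occurrence of `digit` wrapped in bold red."""
    return "".join(f"{BOLD_RED}{ch}{RESET}" if ch == digit else ch for ch in row)


print("Plain rows:")
for row in ROWS:
    print(row)

print(f"\nNumber of 3s: {count_digit(ROWS)}")

print("\nSame rows, with colour doing the work:")
for row in ROWS:
    print(mark_digit(row))
```

The digits are identical in both printouts; only one preattentive attribute (colour) changes, and that is enough to turn a search task into a glance.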

How? Eugene Kim CC2.0

Follow the process
Define your goal: Explore / Explain. Question?
Choose an effective visual: Simple is better. Function first, form next.
Find the right focus: Use color, size. Remove clutter.
Close the loop: Do you answer your question? Do you have a story?

Follow best practices: actively take control, think accessibility, use rules of thumb, be truthful.

Choose an effective, simple visual
When you want to focus the attention on just a number or two
When you have a mixed audience, for information lookup
To show the relationship between two things
The best for continuous data over time
Makes it very easy to compare categories
To compare totals and also subcomponents
Source: http://www.storytellingwithdata.com/book/downloads

Pie charts are evil

Remember: 756[3]9506847[3] 65866[3]0[3]7576 860[3]72658602 8465891078[3]0 Source: http://www.storytellingwithdata.com/book/downloads

There are many preattentive attributes Source: http://www.storytellingwithdata.com/book/downloads  

But two are special.
Colour is the most powerful tool you have. Use it sparingly and resist the urge to use colour for the sake of being colourful. Leverage colour selectively to highlight the important parts of your visual.
Size matters. If you’re showing multiple things that are of roughly equal importance, size them similarly. If there is one really important thing, leverage size to indicate that: make it BIG!
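One way to apply the colour advice in a bar chart is to draw everything in grey and reserve a single accent colour for the bar that carries the message. The sketch below assumes Matplotlib is installed; the helper names, the category labels and the values are all hypothetical.

```python
def highlight_colors(labels, focus, accent="tab:orange", muted="lightgrey"):
    """Return one colour per label: the accent for `focus`, grey for the rest."""
    return [accent if label == focus else muted for label in labels]


def plot_highlighted_bars(labels, values, focus):
    """Draw a bar chart where only the `focus` bar is coloured."""
    # Imported here so the colour helper above works without Matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(labels, values, color=highlight_colors(labels, focus))
    ax.set_title(f"{focus} stands out; everything else stays in the background")
    return fig


# Made-up example data, for illustration only.
labels = ["Model A", "Model B", "Model C", "Model D"]
values = [0.71, 0.74, 0.83, 0.72]
print(highlight_colors(labels, focus="Model C"))
```

Calling `plot_highlighted_bars(labels, values, "Model C")` returns a figure in which a single bar is coloured, so the audience knows where to look before reading any label.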

Maximise data-ink ratio, within reason. Edward Tufte, The Visual Display of Quantitative Information

Forgo chartjunk, including moiré vibration, the grid, and the duck. Edward Tufte, The Visual Display of Quantitative Information
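In Matplotlib, a fair amount of non-data ink can be removed with a few calls. This is a minimal sketch, not a recipe from the talk: the function name `declutter` and the plotted values are made up, and the `Agg` backend is selected only so the example runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; an assumption so the sketch runs anywhere
import matplotlib.pyplot as plt


def declutter(ax):
    """Remove ink that carries no data: box edges, gridlines, tick marks."""
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)  # drop the frame around the plot
    ax.grid(False)                          # no grid
    ax.tick_params(length=0)                # keep tick labels, hide the tick marks
    return ax


fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [3, 4, 6, 9])  # made-up values, for illustration only
declutter(ax)
```

The "within reason" part still applies: the left and bottom axes stay, because the reader needs them to read values off the chart. Call `fig.savefig(...)` to write the result to a file.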

The moiré effect

Don’t let your design choices be happenstance. They should be the result of explicit decisions.

Select good defaults

One Python trick:
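The snippet shown on this slide is not preserved in the transcript. One possible way to select good defaults in Python, offered here as an assumption and not as the speaker's trick, is to set Matplotlib's `rcParams` once at the top of a script or notebook so that every later figure inherits explicit choices. The values in `HOUSE_STYLE` are hypothetical.

```python
import matplotlib

# Hypothetical house style: every value here is an explicit decision,
# not something inherited from the library defaults.
HOUSE_STYLE = {
    "figure.figsize": (8, 4.5),   # slide-friendly aspect ratio
    "font.size": 12,              # readable from the back of the room
    "axes.spines.top": False,     # no box around the plot
    "axes.spines.right": False,
    "axes.grid": False,           # no grid unless a chart needs one
    "legend.frameon": False,      # no frame around the legend
    "lines.linewidth": 2,
}

# Apply once; all figures created afterwards pick these up.
matplotlib.rcParams.update(HOUSE_STYLE)
```

Keeping the settings in one dictionary makes the design decisions visible and reviewable, which is the point of the preceding slide about explicit choices.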

Take control

Take-aways Wikimedia Commons

Know your data

You should care. It’s not only about nice graphics. There’s a wealth of resources. Well-grounded best practices.

Further tips
1. Highlight the important stuff. Only highlight 10% of the overall visual. Use preattentive attributes to do so, even together for very important stuff.
2. Eliminate distractions. When detail isn’t needed, summarize. Ask yourself if eliminating this would change anything. If not, take it out. Push less important items to the background with light grey.
3. Create a visual hierarchy of information. Organize information to guide the audience. Follow a Z-pattern from top left to bottom right.
4. Make it accessible. You might be an engineer, but it shouldn’t take someone with an engineering degree to understand your graph.
5. Use simple language. Choose simple language over complex, choose fewer words over more words, define any specialized language with which your audience may not be familiar, and spell out acronyms.
6. Be mindful of aesthetics. Be smart with colors. Pay attention to alignment to give a sense of unity and cohesion. Leverage white space, and don’t add stuff just to fill space.
Always prefer simple over complex

Books
The Visual Display of Quantitative Information. Edward Tufte. Graphics Press, 2nd edition, 2001. The classic on beautiful, faithful displays.
Visualization Analysis and Design. Tamara Munzner. AK Peters / CRC Press, Oct 2014. A comprehensive textbook.
Visualize This: The FlowingData Guide to Design, Visualization, and Statistics. Nathan Yau. John Wiley & Sons, 2011. For practical examples and code.
The Wall Street Journal Guide to Information Graphics: The Dos and Don'ts of Presenting Data, Facts, and Figures. Dona M. Wong. W. W. Norton & Company, 2013.
Storytelling with Data: A Data Visualization Guide for Business Professionals. Cole Nussbaumer Knaflic. Wiley, 2015.

Research papers
Tukey, John W. "The future of data analysis." The Annals of Mathematical Statistics 33.1 (1962): 1-67.
Cleveland, William S., and Robert McGill. "Graphical perception: Theory, experimentation, and application to the development of graphical methods." Journal of the American Statistical Association 79.387 (1984): 531-554.
Gelman, Andrew, Cristian Pasarica, and Rahul Dodhia. "Let's practice what we preach: turning tables into graphs." The American Statistician 56.2 (2002): 121-130.
Gelman, Andrew, and Antony Unwin. "Infovis and statistical graphics: different goals, different looks." Journal of Computational and Graphical Statistics 22.1 (2013): 2-28.
Gelman, Andrew, and Thomas Basbøll. "When do stories work? Evidence and illustration in the social sciences." Sociological Methods & Research 43.4 (2014): 547-570.
Maaten, Laurens van der, and Geoffrey Hinton. "Visualizing data using t-SNE." Journal of Machine Learning Research 9.Nov (2008): 2579-2605.
Kim, Been, Rajiv Khanna, and Oluwasanmi O. Koyejo. "Examples are not enough, learn to criticize! Criticism for interpretability." Advances in Neural Information Processing Systems. 2016.
Wongsuphasawat, Kanit, et al. "Visualizing Dataflow Graphs of Deep Learning Models in TensorFlow." IEEE Transactions on Visualization and Computer Graphics 24.1 (2018): 1-12.

Blogs & other resources
Flowing Data, Storytelling With Data, The Functional Art, Google Brain PAIR group. colorbrewer2.org helps select colors.
Learn from good examples: junkcharts, vizwiz, fivethirtyeight, theguardian.com/data
But also from bad ones: viz.wtf
Practice with makeovermonday
Interested? React on paris-wimlds.slack.com

Questions? Thanks to @Paolo Terzi (Criteo), from whom I took a bunch of slides. Image credit: Colocho, CC BY-SA 2.5

Rule of thumb:  function first, form next