Overview of software for effective data analysis and data visualisation_2024.pptx

MRoux 100 views 37 slides Sep 16, 2025

About This Presentation

Part of a series about data visualisation


Slide Content

Overview of software for effective data analysis and data visualisation (Photo by Stefan Els)

PRESENTERS Marié Roux, Manager: Research Impact Services; Kirchner van Deventer, Manager: Research Data Services

CONTENT Introduction; Data cleaning; Statistical analysis; Visualisation applications and services; Code help; GIS/mapping; Temporal data analysis; Text/word clouds; Infographics; Social and other network analysis; Working with colour

INTRODUCTION This workshop gives an overview of tools and does not consist of in-depth training for each tool. The presenters are not experts in the field of data analysis and visualisation, but are able to make a selection of the most important tools.

DATA CLEANING What? Data wrangling is the process of transforming and structuring data from one raw form into a desired format with the intent of improving data quality and making it more consumable and useful for analytics or machine learning. It's also called data cleaning. Benefits? Increased clarity and understanding of your data; data consistency; improved accuracy and precision of data; improved communication and decision-making. Learn more: https://www.alteryx.com/glossary/data-wrangling

https://www.capellasolutions.com/blog/taming-the-data-beast-the-art-of-efficient-data-wrangling

DATA CLEANING Microsoft Excel: The most common tool used for manipulating spreadsheets and building analyses. With decades of development behind it, Excel can support almost any standard analytics workflow and is extendable through its built-in programming language, Visual Basic for Applications (VBA). Excel is suitable for simple analysis, but it is not suited to analysing big data (a worksheet is limited to 1,048,576 rows), and it does not have good support for collaboration or versioning. Consider more modern cloud-based analytics platforms for large and collaborative analyses. Learn more: Data cleaning in Excel / Microsoft Resources
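A quick way to find out whether a data file will hit the Excel row limit is to count its rows before opening it. The sketch below is only an illustration: it assumes the data is a CSV file, the function name fits_in_excel is made up for this example, and a small in-memory sample stands in for a real file.

```python
import csv
import io

# Row limit of a single worksheet in current Excel workbooks
EXCEL_MAX_ROWS = 1_048_576

def fits_in_excel(csv_file) -> bool:
    """Return True if the CSV (header row included) fits on one worksheet."""
    row_count = sum(1 for _ in csv.reader(csv_file))
    return row_count <= EXCEL_MAX_ROWS

# In-memory stand-in for a real file opened with open(path, newline="")
sample = io.StringIO("id,value\n1,10\n2,20\n")
print(fits_in_excel(sample))
```

If the check fails, a tool without this limit (for example R or Power BI, both covered later in this overview) is probably a better fit.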

DATA CLEANING OpenRefine: OpenRefine is a powerful tool for working with messy data: cleaning it, transforming it from one format into another, and extending it with web services and external data. It was born out of a project started by Google (and used to be called Google Refine), but is now an open source project hosted on GitHub. What can it do? It is the best tool to work with if you need to tidy up messy data. 'Wrangle' messy or unstructured data to make it more structured. This is a necessary first step if you want to analyse the data in a spreadsheet or other statistical analysis tool. Typical tasks: finding and removing duplicates; grouping similar data; trimming whitespace from the beginning and end of values; translating street addresses to lat/lng coordinates, etc. Learn more: New user manual
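To make the clean-up steps above concrete (trimming whitespace, removing duplicates, grouping similar values), here is a minimal Python sketch. It is not OpenRefine's own code: the fingerprint function is a simplified version of the key-based grouping idea, and the function names and sample values are invented for illustration.

```python
import string
from collections import defaultdict

def fingerprint(value: str) -> str:
    """Simplified grouping key: trim, lowercase, drop punctuation, sort unique words."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(cleaned.split())))

def clean_values(values):
    """Trim whitespace and drop exact duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for value in values:
        trimmed = value.strip()
        if trimmed not in seen:
            seen.add(trimmed)
            result.append(trimmed)
    return result

def group_similar(values):
    """Return groups of values that share the same fingerprint key."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]

# Made-up messy input
raw = [
    "  Stellenbosch University",
    "Stellenbosch University ",
    "University, Stellenbosch",
    "Cape Town",
]
cleaned = clean_values(raw)
clusters = group_similar(cleaned)
print(cleaned)
print(clusters)
```

Each group returned by group_similar is a candidate for merging into one spelling; a person should still review the groups before merging.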

STATISTICAL ANALYSIS R: R is a language and environment for statistical computing and graphics. What can it do: R started off as a statistical analysis language with built-in support for graphics and for handling certain common data formats such as spreadsheet-like rows and columns. It is now also used for mapping, dashboards, interactive web apps, etc. Disadvantage: Because R runs on the command line, users have to take the time to learn which commands do what, and not all users will be comfortable with a text-only interface. Learn more: Computerworld Beginner's Guide to R / 60+ resources to improve your R skills / R tutorials / Datacamp free course on R. Source: https://data-flair.training/blogs/why-learn-r/

STATISTICAL ANALYSIS RStudio: What can it do: RStudio is a set of integrated tools designed to help you be more productive with R. It includes a console, a syntax-highlighting editor that supports direct code execution, and a variety of tools for plotting, viewing history and managing your workspace. Learn more: RStudio education; RStudio tutorial; Coursera: Open Source tools for Data Science; Introduction to RStudio (Princeton University); RStudio Essentials

STATISTICAL ANALYSIS RStudio: biblioshiny in Bibliometrix. biblioshiny is a Shiny app providing a web interface for bibliometrix. The main features of bibliometrix: data importing and conversion to a data frame collection; data gathering using the Dimensions, PubMed and Scopus APIs; data filtering; analytics and plots for four different level metrics: Sources, Authors, Documents, and Clustering by Coupling. How to run biblioshiny: install the bibliometrix R package in RStudio and then start biblioshiny by typing: bibliometrix::biblioshiny() How to use biblioshiny: follow the biblioshiny tutorial; a video tutorial is also available.

OTHER STATISTICAL ANALYSIS TOOLS
SAS (Analytics Software & Solutions): Leader in analytics. Through innovative analytics, BI and data management software and services, SAS helps turn data into better decisions.
SPSS: The SPSS® software platform offers advanced statistical analysis, a vast library of machine learning algorithms, text analysis, open source extensibility, integration with big data and seamless deployment into applications.
Statistica: An advanced analytics software portfolio that provides enterprise and desktop software for statistics, data analysis, data management, data visualisation, data mining (also called predictive analytics), and quality control.
Campus licenses for the above: IT's Software Hub ( https://stellenbosch.sharepoint.com/sites/SoftwareHUB/Shared%20Documents/Forms/AllItems.aspx ) is where students can download Atlas.ti, Statistica, Mathematica, SAS, SPSS and others directly. Log in with your SU username and password.

QUALITATIVE DATA ANALYSIS SOFTWARE Atlas.ti: What it does: A powerful workbench for the qualitative analysis of large bodies of textual, graphical, audio and video data. Sophisticated tools help to arrange, reassemble, and manage material in creative, yet systematic ways. Advantages: automatic network layouts; word frequencies can be visualised as tables and as word clouds; supports text, PDF, survey, audio, video and graphical files; many built-in functions for coding, retrieving, analysing, visualising and exporting. Learn more: Video tutorials / Quick tour and manuals / Creating and assigning codes / Advice on coding in Atlas.ti / PGSkills workshop at SU. Source: https://atlasti.com/2016/12/23/rethinking-atlasti8/

QUALITATIVE DATA ANALYSIS SOFTWARE Dedoose: What it does: A cross-platform app for analysing qualitative and mixed methods research with text, photos, audio, videos, spreadsheet data and more. Advantages: user-friendly; easy storage in the cloud; affordable pricing (you only pay for the months in which you use it); full qualitative and mixed methods support; interactive visualisations and analytics. Learn more: Dedoose resources; Review of Dedoose

DEDOOSE DASHBOARD

DEDOOSE EXCERPTS AND CODING

VISUALISATION APPLICATIONS AND SERVICES Tableau Desktop: What it does: This tool can turn data into any number of visualisations, from simple to complex. You can drag and drop fields onto the work area and ask the software to suggest a visualisation type, then customise everything from labels and tool tips to size, interactive filters and legend display. Tableau offers a variety of ways to display interactive data. You can combine multiple connected visualisations onto a single dashboard, where one search filter can act on numerous charts, graphs and maps; underlying data tables can also be joined. Learn more: Get the free student version: https://www.tableau.com/academic/students#form ; https://www.tableau.com/learn/webinars/ GT-tableau-desktop-getting-started

VISUALISATION APPLICATIONS AND SERVICES Microsoft Power BI: What it does: This is Microsoft's general Business Intelligence (BI) platform, with data wrangling and visualisation for many different data sources (without Excel's row limits), as well as a web service that allows for streaming data and scheduled data updates. Summary: It is simple to use for basic visualisations and report creation and makes data exploration fairly easy. It will handle files too large for Excel. It runs R scripts within the desktop software and can generate many R visualisations. Learn more: Training resources from Microsoft.

VISUALISATION APPLICATIONS AND SERVICES Google Data Studio: What it does: This service is designed to create dashboards and reports from multiple data sources. The focus is on Google sources such as Google Sheets, Google Analytics and BigQuery, but some other sources are supported as well. You can create meaningful, shareable charts and graphs with a few clicks using drag and drop. Customise everything from colours to logos, add shapes and images, insert dynamic controls, and easily give viewers a way to select the data they want to see in a report from multiple sources, including Analytics, Google Ads, Google Search Console, YouTube, and Campaign Manager. Learn more: Data Studio video tutorials / Gallery with examples / Introduction to Data Studio online course

VISUALISATION APPLICATIONS AND SERVICES RAWGraphs: What it does: The idea behind RAWGraphs is to provide a tool that allows people without coding skills to produce visualisations on their own. Originally conceived for graphic designers to complete a series of tasks that were unavailable in other tools, it evolved into a platform that provides simple ways to map data dimensions onto visual variables. In short, RAWGraphs allows users to easily and quickly create data visualisations that can be exported and edited in graphics software (such as Adobe Illustrator and Sketch). Learn more: Using RAWGraphs

VISUALISATION APPLICATIONS AND SERVICES Flourish: What it does: Flourish is part of the Canva family and was created to enable everyone to tell stories with data. Launched in 2018, the tool is used by a large community of creators. It offers flexible templates, custom themes and a focus on stories. It is an interactive toolkit, not just visualisations: create everything from quizzes to carousels. Learn more: Help and resources / Blog

CODE HELP
D3.js: D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG, and CSS. D3's emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualisation components and a data-driven approach to DOM manipulation.
Exhibit: A publishing framework for data-rich interactive web pages. Exhibit lets you easily create web pages with advanced text search and filtering functionalities, with interactive maps, timelines, and other visualisations.
Google chart tools: Display live data.
JavaScript InfoVis Toolkit: What sets this tool apart from many others is the highly polished graphics it creates from just basic code samples. Since this is not an application but a code library, you must have coding expertise in order to use it.

GIS / MAPPING Geographic Information Systems (GIS): What it does: Computer systems for capturing, storing, checking, and displaying data related to positions on Earth's surface. GIS can show many different kinds of data on one map, such as streets, buildings, and vegetation. Programs that create, edit, visualise, analyse and publish geospatial information on Windows, Mac, Linux, BSD (Android coming soon). They can open digital maps on your computer, create new spatial information to add to a map, create printed maps customised to your needs and perform spatial analysis. Interactive tool for data analysis, integration and visualisation. Conveys information in an intuitive and accessible manner. For example: Google Maps, Waze. Source: https://education.nationalgeographic.org/resource/geographic-information-system-gis/

GIS / MAPPING GIS courses offered by the SU Centre for Geographical Analysis (CGA): The CGA specialises in the application of geographical information systems, satellite remote sensing and other geographical-analytical techniques in carrying out its research, training and service provision functions. Courses offered: Introduction to GIS; Introduction to Earth Observation; Advanced Earth Observation. Example of a map that the CGA can teach you to create.

QUANTUM GIS (QGIS): Major open-source GIS program. Accessible and functional. Free to download, with a small installation size and low system requirements compared to other open-source GIS. Can import, edit and save most spatial file formats. A significant user base and online documentation offer a wide community of support. Integrates with other open-source GIS and extends their capabilities. Multiple plugins and tools allow for greater customisation. User-friendly interface. Learn more: Quantum GIS, Introduction to QGIS

OTHER OPEN SOURCE GIS/MAPPING TOOLS GRASS GIS; OpenJUMP; OpenLayers; OpenStreetMap; CARTO (free 14-day trial). Learn more: GIS Lounge, GIS and Maps

TEMPORAL DATA ANALYSIS Temporal data is data that represents a state in time, such as land-use patterns or total rainfall over a certain period. It can be used to analyse weather patterns and other environmental variables, monitor traffic conditions, study demographic trends, etc. Examples of temporal data. Source: https://desktop.arcgis.com/en/arcmap/10.3/map/time/what-is-temporal-data.htm Learn more: Temporal Analysis, Spatiotemporal
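As a small illustration of working with temporal data such as total rainfall over a certain period, the sketch below adds up daily readings per month using only the Python standard library. The readings are made-up values and the function name monthly_totals is an assumption for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily rainfall readings in millimetres (made-up values)
readings = [
    (date(2024, 1, 5), 12.0),
    (date(2024, 1, 20), 3.5),
    (date(2024, 2, 2), 0.0),
    (date(2024, 2, 14), 8.25),
]

def monthly_totals(observations):
    """Sum the readings per (year, month), returned in chronological order."""
    totals = defaultdict(float)
    for day, millimetres in observations:
        totals[(day.year, day.month)] += millimetres
    return dict(sorted(totals.items()))

totals = monthly_totals(readings)
for (year, month), total in totals.items():
    print(f"{year}-{month:02d}: {total} mm")
```

The same grouping step (by month, week or year) is usually what happens before temporal data is plotted as a time series.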

TEMPORAL DATA VISUALISATION TOOLS D3.js ( https://d3js.org/ ): What it is: A JavaScript library for manipulating documents based on data. It uses HTML, SVG and CSS and allows for animation and interaction in data visualisation. Pros: massive community of support; highly flexible in design choices; free to use. Cons: requires knowledge of coding, and then learning D3 on top of that. Learn more: D3.js Graph Gallery, D3.js A Practical Introduction

TEMPORAL DATA VISUALISATION TOOLS Observable: What it is: A website where you can learn to use D3.js and other data visualisation tools through tutorials and practical training. Examples: https://observablehq.com/d/d280cb30053f69a9 ; https://observablehq.com/@d3/rotating-orthographic

TEXT/WORD CLOUDS
WordClouds: Free to use. Can generate various shapes and colours.
Wooclap: Free and easy to use.
ATLAS.ti: Embedded feature available.
Example of a word cloud using the text Heart of Darkness by Joseph Conrad (1899).
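A word cloud is essentially a picture of word frequencies: the more often a word occurs, the larger it is drawn. The sketch below shows only the counting step, not how any of the tools above work internally. The stop-word list is a tiny illustrative one and the sample sentence is invented.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; real tools use much longer lists
STOP_WORDS = {"the", "of", "and", "a", "to", "in", "is", "it"}

def word_frequencies(text, top_n=5):
    """Return the top_n most frequent words, ignoring case and stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return counts.most_common(top_n)

sample = "Data cleaning improves data quality, and quality data improves analysis."
top = word_frequencies(sample, top_n=3)
print(top)
```

A word cloud generator would then scale the font size of each word in proportion to its count.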

INFOGRAPHICS
Canva: Free-to-use graphic design platform (with optional upgrade plans for more advanced use). Can create social media graphics, presentations, posters and infographics.
Venngage: Free to use (with optional upgrade plans for more advanced use).

SOCIAL AND OTHER NETWORK ANALYSIS
Gephi: What it is: An interactive 3D visualisation tool for network data. Free to use. Useful for visualising statistical information, including relationships within networks.
NodeXL: What it is: An Excel plugin that can display network graphs from a list of connections. Optimised for analysing online social media. Drawback: requires Excel to run. https://youtu.be/TFBkAO1MjnU

SOCIAL AND OTHER NETWORK ANALYSIS VOSviewer for network analysis. Use VOSviewer for citation, bibliographic coupling, co-citation, or co-authorship relationships. VOSviewer also offers text mining functionality that can be used to construct and visualise co-occurrence networks of important terms.
Data ( https://www.vosviewer.com/features/highlights ):
Web of Science, Scopus, Dimensions, Lens, and PubMed: Co-authorship networks, citation-based networks, and co-occurrence networks can be created based on data downloaded from Web of Science, Scopus, Dimensions, and Lens. Co-authorship networks and co-occurrence networks can also be created based on PubMed data.
Crossref, Europe PMC, and OpenAlex: Networks can also be created based on data retrieved through the APIs of Crossref, Europe PMC, and OpenAlex.
Semantic Scholar, OpenCitations, and WikiData: For a given set of DOIs, networks can also be created based on data retrieved through the APIs of Semantic Scholar, OpenCitations, and WikiData.
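To show what a co-authorship network is made of, the sketch below builds a weighted edge list from author lists: two authors are linked once for every publication they share. This illustrates the concept only and is not how VOSviewer, Gephi or NodeXL are implemented. The author names are placeholders and the function name coauthor_edges is made up for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author lists, one per publication (names are placeholders)
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author C", "Author D"],
]

def coauthor_edges(author_lists):
    """Count how many publications each pair of authors has in common."""
    edges = Counter()
    for authors in author_lists:
        # Sort so that each pair is always stored in the same order
        for pair in combinations(sorted(set(authors)), 2):
            edges[pair] += 1
    return edges

edges = coauthor_edges(papers)
for (first, second), weight in edges.items():
    print(first, "-", second, "weight", weight)
```

An edge list like this (source, target, weight) is the kind of "list of connections" that network tools take as input.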

WORKING WITH COLOUR ColorBrewer: An online tool designed to help with selecting appropriate colour schemes for maps and other graphics. The provided map does not depict actual data, but rather serves as a carefully designed diagnostic tool for evaluating individual colour schemes. It provides you with the codes of your chosen colours to apply to your own map.
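Once you have copied colour codes from a tool such as ColorBrewer, applying them usually means assigning each data value to one colour class. The sketch below assumes a five-class sequential scheme and equal-width classes. The hex codes are placeholders to be replaced with the codes you copy from the tool, and the function names are invented for this example.

```python
# Placeholder hex codes for a 5-class light-to-dark scheme;
# paste in the codes copied from your chosen colour scheme.
SCHEME = ["#eff3ff", "#bdd7e7", "#6baed6", "#3182bd", "#08519c"]

def hex_to_rgb(code):
    """Convert a hex code such as '#08519c' to an (r, g, b) tuple."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

def colour_for(value, lowest, highest, scheme=SCHEME):
    """Pick a class colour using equal-width intervals between lowest and highest."""
    if highest <= lowest:
        raise ValueError("highest must be greater than lowest")
    fraction = (value - lowest) / (highest - lowest)
    index = min(int(fraction * len(scheme)), len(scheme) - 1)
    return scheme[max(index, 0)]

print(colour_for(50, 0, 100), hex_to_rgb(colour_for(50, 0, 100)))
```

Equal-width classes are only one option; mapping software typically also offers quantile or natural-break classes.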

USEFUL LINKS
http://www.kwantu.net/blog/2016/12/28/how-to-clean-up-messy-data-using-open-refine
https://atlasti.com/2016/12/23/rethinking-atlasti8/
https://www.visualisingdata.com/resources/
https://www.computerworld.com/article/2507728/enterprise-applications-22-free-tools-for-data-visualization-and-analysis.html?page=10
http://selection.datavisualization.ch/
https://steemit.com/utopian-io/@scipio/how-to-do-data-visualization-using-rawgraphs

QUESTIONS? Network graph of character interactions in the Star Wars franchise

Thank you Enkosi Dankie Photo by Stefan Els