UNIT-1 INTRODUCTION TO DATA VISUALIZATION
Topics: Data Visualization and its Importance, Drivers for Data Visualization, Introduction to Data Visualization Analytics, Data Visualization Analytics Applications.
Data Visualization Technologies: Hadoop’s Parallel World, Data Discovery, Open-Source Technology for Data Visualization Analytics, Cloud and Data Visualization, Predictive Analytics, Mobile Business Intelligence and Data Visualization, Crowd Sourcing Analytics, Inter- and Trans-Firewall Analytics, Information Management.
Data Visualization
Data Visualization refers to extremely large and complex datasets that cannot be easily managed, processed, or analysed using traditional data processing tools.
Importance of Data Visualization:
Informed Decision Making
Improved Customer Experience
Innovation and Product Development
Operational Efficiency
Fraud Detection and Security
Smart Cities and Urban Planning
Scientific Research
Challenges and Opportunities
Technologies and Tools
Ethical Considerations
Healthcare Advancements
Optimizing Marketing Strategies
Drivers for Data Visualization
1. The digitization of society
2. The drop in technology costs
3. Connectivity through cloud computing
4. Increased knowledge about data science
5. Social media applications
6. The rise of the Internet of Things (IoT)
Data Visualization Analytics
Data Visualization analytics is the often complex process of analysing large amounts of data to reveal information (such as hidden patterns, correlations, market trends, and consumer preferences) that can help businesses make informed decisions.
The Lifecycle Phases of Data Visualization Analytics:
Stage 1 - Business case evaluation
Stage 2 - Identification of data
Stage 3 - Data filtering
Stage 4 - Data extraction
Stage 5 - Data aggregation
Stage 6 - Data analysis
Stage 7 - Visualization of data
Stage 8 - Final analysis result
Data Visualization Analytics Tools:
Matplotlib: A widely used 2D plotting library for Python. It provides a variety of chart types for creating static, animated, and interactive visualizations, and is a staple in the toolkit of many data scientists.
Seaborn: Built on Matplotlib, Seaborn specializes in statistical data visualization. It simplifies the creation of informative and attractive visualizations and is well suited to exploratory data analysis.
Plotly: A versatile library supporting interactive visualizations and dashboards. It is compatible with multiple programming languages, including Python, R, and Julia, and is well suited to dynamic and interactive data visualizations.
D3.js: A JavaScript library for producing dynamic, interactive data visualizations in web browsers. It gives full control over the visualization process and is powerful for custom and complex visualizations.
Tableau Public: While not open source, Tableau Public is noteworthy for its accessibility. It is a free version of Tableau’s data visualization platform with a user-friendly interface, and allows the creation and sharing of interactive charts, dashboards, and reports.
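The middle lifecycle stages listed earlier (filtering, extraction, aggregation, analysis, visualization) can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the records, field names, and function names are hypothetical, and a text bar chart stands in for a real plotting library such as Matplotlib.

```python
from collections import defaultdict

# Hypothetical raw records (illustrative data, not from the course material).
RAW_RECORDS = [
    {"region": "North", "amount": "120.0"},
    {"region": "South", "amount": "80.5"},
    {"region": "North", "amount": "30.0"},
    {"region": "East", "amount": None},  # incomplete record
    {"region": "South", "amount": "19.5"},
]


def filter_records(records):
    """Stage 3 (data filtering): drop records with a missing amount."""
    return [r for r in records if r["amount"] is not None]


def extract_fields(records):
    """Stage 4 (data extraction): convert to (region, numeric amount) pairs."""
    return [(r["region"], float(r["amount"])) for r in records]


def aggregate(pairs):
    """Stage 5 (data aggregation): sum the amounts per region."""
    totals = defaultdict(float)
    for region, amount in pairs:
        totals[region] += amount
    return dict(totals)


def analyse(totals):
    """Stage 6 (data analysis): rank regions by total, largest first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def visualize(ranked, width=30):
    """Stage 7 (visualization): text bar chart scaled to the largest total."""
    largest = ranked[0][1]
    lines = []
    for region, total in ranked:
        bar = "#" * round(width * total / largest)
        lines.append(f"{region:<6} {bar} {total:.1f}")
    return "\n".join(lines)


ranked = analyse(aggregate(extract_fields(filter_records(RAW_RECORDS))))
chart = visualize(ranked)
print(chart)
```

Each stage is a separate function so that any one of them could be swapped out, for example replacing `visualize` with a Matplotlib or Plotly call, without touching the others.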
Exploring the top 10 open-source Data Visualization tools to stay ahead of tech in the year 2023
1. D3.js: A powerful JavaScript library for creating interactive and dynamic data visualizations in the web browser.
2. Matplotlib: A widely used 2D plotting library for Python that produces high-quality static charts and graphs.
3. Plotly: A Python graphing library that makes interactive, publication-quality graphs. It can be used both online and offline.
4. Bokeh: A Python interactive visualization library that targets modern web browsers for presentation.
5. Tableau Public: While Tableau itself is a commercial product, Tableau Public allows free sharing of data visualizations.
6. Grafana: An open-source analytics and monitoring platform. It supports various data sources and is commonly used for time-series data.
7. Metabase: An open-source business intelligence tool that lets users ask questions about their data and create interactive dashboards.
8. Chart.js: A simple yet flexible JavaScript charting library for designers and developers to create interactive charts in web pages.
9. Vega-Lite: A high-level grammar for creating visualizations using JSON syntax. Its specifications compile to Vega, which simplifies the process of creating charts.
10. Redash: An open-source data visualization and dashboarding platform that connects to various data sources to create interactive visualizations.
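To make the idea of a JSON grammar for charts concrete, the sketch below builds a small Vega-Lite bar chart specification as a Python dictionary and serializes it with the standard `json` module. The helper name `bar_chart_spec`, the field names, and the data values are assumptions chosen for illustration; only the top-level keys (`$schema`, `data`, `mark`, `encoding`) follow the Vega-Lite format.

```python
import json


def bar_chart_spec(rows, x_field, y_field):
    """Return a Vega-Lite bar chart specification as a plain dict.

    `rows` is a list of dicts embedded inline in the spec; `x_field` is
    treated as a category and `y_field` as a number.
    """
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": rows},
        "mark": "bar",
        "encoding": {
            "x": {"field": x_field, "type": "nominal"},
            "y": {"field": y_field, "type": "quantitative"},
        },
    }


# Hypothetical data, invented for this example.
rows = [
    {"category": "A", "count": 28},
    {"category": "B", "count": 55},
    {"category": "C", "count": 43},
]

spec = bar_chart_spec(rows, "category", "count")
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The printed JSON can be pasted into a Vega-Lite editor or embedded in a web page to render the chart; the Python code itself only describes the chart and does not draw anything.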
Cloud and Data Visualization
Data Visualization and cloud computing are two distinctly different ideas, but the two concepts have become so interwoven that they are almost inseparable. It's important to define the two ideas and see how they relate.
The pros of Data Visualization in the cloud:
Scalability
Agility
Cost
Accessibility
Resilience
The cons of Data Visualization in the cloud:
Network dependence
Storage costs
Security
Lack of standardization
Predictive Analytics
Predictive analytics uses statistics and modelling techniques to forecast future outcomes and performance. Industries and disciplines such as insurance and marketing use predictive techniques to make important decisions. Predictive models help make weather forecasts, develop video games, translate voice to text, support customer service decisions, and develop investment portfolios.
People often confuse predictive analytics with machine learning, even though the two are distinct disciplines. Types of predictive models include decision trees, regression, and neural networks.
How Does Netflix Use Predictive Analytics?
Data collection is very important to a company like Netflix. It collects data on its customers' behaviour and past viewing patterns, and uses that information to make recommendations based on their preferences. This is the basis of the "Because you watched..." lists shown to subscribers.
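Of the model types mentioned above, regression is the simplest to show in code. The sketch below fits a straight line to past values by ordinary least squares and uses it to forecast the next value. The monthly sales figures and the function names are hypothetical, invented for this illustration; real predictive work would typically use a statistics or machine learning library and far more data.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and co-variation of x with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical monthly sales (in thousands) for months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [12.0, 15.0, 15.0, 19.0, 20.0]

model = fit_line(months, sales)
forecast = predict(model, 6)  # forecast for month 6
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}, forecast={forecast:.2f}")
```

The fitted slope is the estimated change in sales per month, so the forecast simply extends the past trend by one step; it says nothing about how reliable that trend is.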
Mobile Business Intelligence and Data Visualization
Mobile Business Intelligence (BI) refers to the ability to access and perform BI-related data analysis on smartphones and tablets.
Why is mobile BI important?
Businesses today hold an abundance of data. In a fast-paced environment, people need real-time access to that data to make data-driven decisions anytime and anywhere. The number of organizations using mobile and SaaS applications for critical business processes is increasing daily. Whether you are a CEO, salesperson, digital marketer, department manager, or employee, mobile BI can help you increase productivity, improve the decision-making process, and boost your business.
Crowd Sourcing Analytics
Crowdsourcing is the process of gathering ideas, opinions, and thoughts from large groups of people on the internet, with the aim of driving innovation, implementing new ideas, and eliminating product issues.
Pros:
Crowdsourcing brings together communities around a common project or cause
Efficient way of solving time-intensive problems
Deeper engagement by communities, who resonate with and build loyalty to the product or solution
Cons:
Results can be easily skewed depending on the crowd being sourced
Lack of confidentiality or ownership of an idea
Potential to miss the best ideas, talent, or direction and fall short of the goal or purpose
Real-Life Examples of Crowdsourcing Use:
Lay’s
Netflix
Waze
INFORMATION MANAGEMENT
Information management refers to a program or system within an organisation that governs how data is structured, processed, distributed, and used. Information management is essential to business intelligence.
Challenges of information management:
Collecting information
Making information available
Ensuring information is used