What is data analytics?
Most companies are collecting loads of data all the time, but in its raw form this data
doesn’t really mean anything. This is where data analytics comes in. Data analytics is the
process of analyzing raw data in order to draw out meaningful, actionable insights,
which are then used to inform and drive smart business decisions.
A data analyst will extract raw data, organize it, and then analyze it, transforming it from
incomprehensible numbers into coherent, intelligible information. Having interpreted the
data, the data analyst will then pass on their findings in the form of suggestions or
recommendations about what the company’s next steps should be.
With these insights, businesses and organizations are able to develop a much deeper understanding of their
audience, their industry, and their company as a whole and, as a result, are much better
equipped to make decisions and plan ahead.
What’s the difference between data analytics and data science?
Data analytics:
•A data analyst will seek to answer specific questions or address particular
challenges that have already been identified and are known to the business.
•To do this, they examine large datasets with the goal of identifying trends
and patterns. They then “visualize” their findings in the form of charts,
graphs, and dashboards. These visualizations are shared with key
stakeholders and used to make informed, data-driven strategic decisions.
Data science:
•A data scientist, on the other hand, considers what questions the business
should or could be asking.
•They design new processes for data modeling, write algorithms, devise
predictive models, and run custom analyses.
•For example: They might build a machine to leverage a dataset and automate
certain actions based on that data and, with continuous monitoring and
testing, and as new patterns and trends emerge, improve and optimize that
machine wherever possible.
What are the different types of data analysis?
Now we have a working definition of data analytics,
let’s explore the four main types of data
analysis: descriptive, diagnostic, predictive,
and prescriptive.
What is the typical process that a data analyst will follow?
Now we’ve set the scene in terms of the overall data analyst role,
let’s drill down to the actual process of data analysis. Here, we’ll
outline the five main steps that a data analyst will follow when
tackling a new project:
Define the question(s) you want to answer
•The first step is to identify why you are conducting
analysis and what question or challenge you hope to solve.
•At this stage, you’ll take a clearly defined problem and come up with
a relevant question or hypothesis you can test. You’ll then need to
identify what kinds of data you’ll need and where it will come from.
•For example: A potential business problem might be that customers
aren’t subscribing to a paid membership after their free trial ends.
Your research question could then be “What strategies can we use to
boost customer retention?”
Collect the data
•With a clear question in mind, you’re ready to start collecting your
data. Data analysts will usually gather structured data from primary or
internal sources, such as CRM software or email marketing tools.
•They may also turn to secondary or external sources, such as open data
sources. These include government portals, tools like Google Trends,
and data published by major organizations such as UNICEF and the
World Health Organization.
Clean the data
•Once you’ve collected your data, you need to get it ready for analysis,
and this means thoroughly cleaning your dataset. Your original dataset
may contain duplicates, anomalies, or missing data which could distort
how the data is interpreted, so these all need to be removed. Data
cleaning can be a time-consuming task, but it’s crucial for obtaining
accurate results.
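The cleaning step above can be sketched in a few lines of plain Python. This is only a minimal illustration: the records, the field layout, and the MAX_PLAUSIBLE_SPEND rule are all made-up assumptions, and a real project would choose its own rules for what counts as an anomaly.

```python
# Hypothetical raw records: (customer_id, monthly_spend).
# None marks a missing value.
raw_rows = [
    ("c1", 20.0),
    ("c2", 35.0),
    ("c2", 35.0),      # exact duplicate of the previous row
    ("c3", None),      # missing value
    ("c4", 25.0),
    ("c5", 9999.0),    # implausible value, treated as an anomaly
    ("c6", 30.0),
]

MAX_PLAUSIBLE_SPEND = 1000.0  # assumed business rule, for illustration only


def clean(rows):
    """Drop duplicate rows, rows with missing values, and out-of-range values."""
    seen = set()
    cleaned = []
    for row in rows:
        _customer_id, spend = row
        if row in seen:
            continue  # duplicate
        seen.add(row)
        if spend is None:
            continue  # missing data
        if not 0 <= spend <= MAX_PLAUSIBLE_SPEND:
            continue  # anomaly under the assumed rule
        cleaned.append(row)
    return cleaned


clean_rows = clean(raw_rows)
print(clean_rows)
```

Dropping rows is the simplest choice and matches the text; in practice missing values are sometimes filled in instead, depending on the question being asked.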
Analyze the data
•Now for the actual analysis! How you analyze the data will depend
on the question you’re asking and the kind of data you’re working
with, but some common techniques include regression
analysis, cluster analysis, and time-series analysis (to name just a
few).
•We’ll go over some of these techniques in the next section. This step
in the process also ties in with the four different types of analysis we
looked at in section three (descriptive, diagnostic, predictive, and
prescriptive).
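As one small illustration of the time-series analysis mentioned above, a simple moving average smooths a series so that the underlying trend is easier to see. The weekly sign-up numbers and the function name moving_average are made up for this sketch.

```python
def moving_average(values, window):
    """Return the simple moving average of `values` over `window` points."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical weekly sign-ups for a free trial.
weekly_signups = [10, 12, 9, 15, 18, 17, 21]

trend = moving_average(weekly_signups, window=3)
print(trend)
```

Each output value averages three consecutive weeks, so the result has two fewer points than the input.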
Visualize and share your findings
•This final step in the process is where data is transformed into valuable
business insights. Depending on the type of analysis conducted, you’ll
present your findings in a way that others can understand, in the form of a
chart or graph, for example.
•At this stage, you’ll demonstrate what the data analysis tells you in regards
to your initial question or business challenge, and collaborate with key
stakeholders on how to move forwards. This is also a good time to
highlight any limitations to your data analysis and to consider what further
analysis might be conducted.
What skills do you need to become a data analyst?
Hard skills
•Mathematical and statistical ability
•Knowledge of programming languages such as SQL, R, or Python
•An analytical mindset
Soft skills
•Keen problem-solving skills
•Excellent communication skills
•Adaptability
What is Data Analytics?
In this new digital world, data is being generated in enormous amounts,
which opens up new paradigms. Because we have high computing power and a
large amount of data, we can use this data to support data-driven
decision making. The main benefit of data-driven decisions is that they
are based on observed past trends that have led to beneficial
results.
Types of Data Analytics
•There are four major types of data analytics:
•Predictive (forecasting)
•Descriptive (business intelligence and data mining)
•Prescriptive (optimization and simulation)
•Diagnostic analytics
What is Business Intelligence?
•Business Intelligence is one of the most powerful tools many
organizations use to know their customer base and market better. It
describes the business methodology in which raw data is
transformed into useful information which helps in decision making.
Benefits of Business Intelligence
•Business intelligence has broad applications. In the retail sector, for
example, business intelligence tools enable organizations to use data not only
to assess current sales but also to estimate future potential, patterns,
and trends, and to understand customer demand at a deeper level.
Use of Pattern
Organizations can find out an employee's arrival time at the office
by when their cellphone shows up in the parking lot. Observing the
record of the swipe of the parking permit card in the company
parking garage can inform the organization whether an employee
is in the office building or out of the office at any moment in time.
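The parking-card pattern described above can be sketched as a lookup over a swipe log. The log format, the timestamps, and the function name is_on_site are hypothetical and chosen only for illustration.

```python
from datetime import datetime

# Hypothetical swipe log for one employee: (timestamp, direction).
swipes = [
    (datetime(2024, 3, 4, 8, 55), "in"),
    (datetime(2024, 3, 4, 12, 30), "out"),
    (datetime(2024, 3, 4, 13, 15), "in"),
    (datetime(2024, 3, 4, 17, 40), "out"),
]


def is_on_site(swipe_log, moment):
    """Return True if the latest swipe at or before `moment` was an 'in'."""
    on_site = False
    for timestamp, direction in sorted(swipe_log):
        if timestamp > moment:
            break
        on_site = direction == "in"
    return on_site


# Arrival time is taken as the earliest 'in' swipe of the day.
arrival_time = min(t for t, direction in swipes if direction == "in")

print(arrival_time)
print(is_on_site(swipes, datetime(2024, 3, 4, 10, 0)))
```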
Data Processing Chain
DATA:
•Anything that is recorded is data. Observations and facts are data. Anecdotes and
opinions are also data, of a different kind.
•Data can be numbers, like the record of daily weather or daily sales. Data can be
alphanumeric, such as the names of employees and customers. Data can come from any
number of sources: from operational records inside an organization, or from records
compiled by industry bodies and government agencies.
•Data can come from individuals telling stories from memory and from people's interaction in
social contexts, or from machines reporting their own status, or from logs of web usage. Data
can come in many forms: it may come as paper reports, or as a file stored on a computer.
•It may be words spoken over the phone. It may be e-mail or chat on the Internet, or it may
come as movies and songs on DVDs, and so on. There is also data about data, which is called
metadata. For example, people regularly upload videos on YouTube.
•The format of the video file (whether high resolution or lower resolution) is metadata. The
information about the time of uploading is metadata. The account from which it was
uploaded is also metadata. The record of downloads of the video is also metadata.
Database
•A database is a modeled collection of data that is accessible in
many ways.
•A data model can be designed to integrate the operational data of the
organization. The data model abstracts the key entities involved in an
action and their relationships.
•Most databases today follow the relational data model and its variants.
Each data modeling technique imposes rigorous rules and constraints to
ensure the integrity and consistency of data over time.
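A minimal sketch of these ideas, using Python's built-in sqlite3 module and an in-memory database. The customer and purchase tables are hypothetical entities chosen for illustration; the foreign key and CHECK constraint stand in for the "rules and constraints" that protect integrity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Two entities and the relationship between them (hypothetical schema).
conn.executescript("""
    CREATE TABLE customer (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE purchase (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount      REAL NOT NULL CHECK (amount >= 0)
    );
""")

conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Asha')")
conn.execute("INSERT INTO purchase (customer_id, amount) VALUES (1, 19.99)")

# A purchase pointing at a customer that does not exist breaks integrity,
# so the database refuses it.
try:
    conn.execute("INSERT INTO purchase (customer_id, amount) VALUES (42, 5.00)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute(
    "SELECT c.name, p.amount "
    "FROM customer c JOIN purchase p ON p.customer_id = c.id"
).fetchall()
print(rejected, rows)
```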
Data Warehouse
•A data warehouse is an organized store of data from all over
the organization, specially designed to help make management decisions.
Data can be extracted from the operational database to answer a particular set of
queries.
•This data, combined with other data, can be rolled up to a consistent
granularity and uploaded to a separate data store called the data warehouse.
Therefore, the data warehouse is a simpler version of the operational
database, with the purpose of addressing reporting and decision-making
needs only.
•The data in the warehouse cumulatively grows as more operational data
becomes available and is extracted and appended to the data warehouse.
Unlike in the operational database, the data values in the warehouse are not
updated.
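The roll-up and append steps can be sketched as follows. The record layout, the store names, and the choice of daily granularity are assumptions for this example, and a plain list stands in for the warehouse.

```python
from collections import defaultdict

# Hypothetical operational records: (timestamp, store, sale amount).
operational_sales = [
    ("2024-03-04T09:15", "north", 20.0),
    ("2024-03-04T14:40", "north", 35.0),
    ("2024-03-04T11:05", "south", 12.5),
    ("2024-03-05T10:30", "north", 50.0),
]


def roll_up_daily(records):
    """Aggregate individual sales into one row per (day, store)."""
    totals = defaultdict(float)
    for timestamp, store, amount in records:
        day = timestamp[:10]  # keep only the YYYY-MM-DD part
        totals[(day, store)] += amount
    return [(day, store, total) for (day, store), total in sorted(totals.items())]


# The warehouse only ever grows: new rows are appended, old rows are not edited.
warehouse = []
warehouse.extend(roll_up_daily(operational_sales))
print(warehouse)
```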
Data Mining
•Data Mining is the art and science of discovering useful and innovative
patterns from data. There is a wide variety of patterns that can be found in
the data. There are many techniques, simple or complex, that help with
finding patterns.
Decision tree
•A decision tree is a flowchart-like structure used to make decisions or
predictions. It consists of nodes representing decisions or tests on
attributes, branches representing the outcome of these decisions, and
leaf nodes representing final outcomes or predictions. Each internal
node corresponds to a test on an attribute, each branch corresponds
to the result of the test, and each leaf node corresponds to a class
label or a continuous value.
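The structure described above can be sketched with nested dictionaries. Note that this tree is written by hand purely to show nodes, branches, and leaves; it is not learned from data, and the attribute names and thresholds are made up.

```python
# Internal node: a dict with the attribute to test, a threshold, and two branches.
# Leaf node: a plain string holding the class label.
tree = {
    "attribute": "trial_logins",
    "threshold": 5,
    "yes": {                      # taken when trial_logins >= 5
        "attribute": "support_tickets",
        "threshold": 3,
        "yes": "will not subscribe",   # support_tickets >= 3
        "no": "will subscribe",        # support_tickets < 3
    },
    "no": "will not subscribe",   # taken when trial_logins < 5
}


def predict(node, record):
    """Walk from the root to a leaf, following one branch per test."""
    while isinstance(node, dict):
        passed = record[node["attribute"]] >= node["threshold"]
        node = node["yes"] if passed else node["no"]
    return node


print(predict(tree, {"trial_logins": 8, "support_tickets": 1}))
```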
Regression:
Regression is a statistical method used in finance, investing, and other
disciplines that attempts to determine the strength and character of the
relationship between a dependent variable and one or more independent
variables.
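For the simplest case, one independent variable, the relationship can be estimated with ordinary least squares. The sketch below computes the slope and intercept (the character of the relationship) and the correlation coefficient r (its strength). The ad_spend and sales figures are made up, and were chosen to lie exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by least squares; also return Pearson's r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Hypothetical data: independent variable (ad spend) and dependent variable (sales).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]   # exactly sales = 2 * ad_spend + 1

slope, intercept, r = fit_line(ad_spend, sales)
print(slope, intercept, r)
```

Real data will not fall exactly on a line, so r will usually be somewhere between -1 and 1 rather than at either extreme.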
Artificial Neural Network
The term "Artificial Neural Network" is derived from the biological neural
networks that make up the structure of the human brain. Just as the
human brain has neurons interconnected with one another, artificial
neural networks also have neurons that are interconnected with one
another in various layers of the network. These neurons are known as
nodes, and they learn from past data to predict future values.
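To make "learning from past data" concrete, here is a sketch of a single artificial neuron (a perceptron), the simplest building block of such a network. It is not a full multi-layer network. The training data (the logical AND of two inputs), the learning rate, and the number of passes are choices made for this example.

```python
# Past data: two binary inputs and the target output (logical AND).
training_data = [
    ((0, 0), 0),
    ((0, 1), 0),
    ((1, 0), 0),
    ((1, 1), 1),
]


def activate(weights, bias, inputs):
    """Step activation: the neuron fires (1) if the weighted sum is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train(data, epochs=20, learning_rate=1):
    """Perceptron learning rule: nudge the weights after every wrong prediction."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, target in data:
            error = target - activate(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


weights, bias = train(training_data)
predictions = [activate(weights, bias, inputs) for inputs, _ in training_data]
print(predictions)
```

A single neuron can only learn patterns that a straight line can separate; networks with several layers of such nodes are used for more complex patterns.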
Cluster analysis
Cluster analysis is a statistical method for processing data. It works by
organising items into groups, or clusters.
Clustering helps to split data into several subsets. Each of these
subsets contains data similar to each other, and these subsets are called
clusters. Once the data from our customer base is divided into
clusters, we can make an informed decision about who we think is best
suited for a given product.
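One common way to form clusters is k-means, named here as an illustrative choice since the text does not prescribe a particular algorithm. The sketch below runs it on one-dimensional data; the customer spend values and the starting centroids are made up.

```python
def k_means_1d(values, centroids, iterations=10):
    """Repeatedly assign each value to its nearest centroid, then re-centre."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(
                range(len(centroids)),
                key=lambda i: abs(value - centroids[i]),
            )
            clusters[nearest].append(value)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer.
customer_spend = [12, 15, 14, 80, 85, 90]

centroids, clusters = k_means_1d(customer_spend, centroids=[10.0, 100.0])
print(clusters)
```

Here the customers fall into a low-spend group and a high-spend group, which could then be targeted differently.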
Association rule mining:
Association rule mining is a technique used to uncover hidden
relationships between variables in large datasets. It is a popular method
in data mining and machine learning and has a wide range of
applications in various fields, such as market basket analysis, customer
segmentation, and fraud detection.
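Two standard measures used in association rule mining are support (how often an itemset appears) and confidence (how often the rule holds when its left-hand side appears). The sketch below computes both for a market-basket example; the baskets are made up.

```python
# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing `antecedent`, the fraction that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Rule under test: customers who buy bread also buy butter.
rule_support = support({"bread", "butter"}, baskets)
rule_confidence = confidence({"bread"}, {"butter"}, baskets)
print(rule_support, rule_confidence)
```

Three of the five baskets contain both bread and butter, and three of the four baskets with bread also contain butter.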
Data Visualization
•As data and insights grow in number, a new
requirement is the ability of the executives and decision makers to absorb
this information in real time. There is a limit to human comprehension and
visualization capacity. That is a good reason to prioritize and manage with
fewer but key variables that relate directly to the Key Result Areas (KRAs) of
a role.
•Here are a few considerations when presenting using data:
•Present the conclusions and not just report the data.
•Choose wisely from a palette of graphs to suit the data.
•Organize the results to make the central point stand out.
•Ensure that the visuals accurately reflect the numbers. Inappropriate
visuals can create misinterpretations and misunderstandings.
•Make the presentation unique, imaginative and memorable.
Terminology and Careers
•Many overlapping terms are used in the marketplace. Here is a brief
look at the most popular terms used.
Business Intelligence is a business oriented term. So is Decision
science. This is the field of using data-driven insights to make superior
business decisions. This is a broad category and in some sense includes
data analytics or data mining as a component.
Data Analytics is a technology oriented term. So is Data Mining. Both
involve the use of techniques and tools to find novel useful patterns from
data. It involves
•Big Data is a special type of data … very large, fast and complex.
•Storing and managing all this data is a slightly more technical task and
it uses more recent and innovative tools.
•This field is more attractive to the more technical people.
•Cloud computing is an attractive solution to store and process Big
Data