Digital Data
Unit -3
Dr. T. NIKIL PRAKASH
ASSISTANT PROFESSOR
DEPARTMENT OF INFORMATION TECHNOLOGY,
ST. JOSEPH’S COLLEGE (AUTONOMOUS),
TIRUCHIRAPPALLI-02
Digital Data
•Data is present in both homogeneous and heterogeneous sources.
•The need of the hour is to understand, manage, process, and analyze this data to draw valuable insights.
•Types of Digital Data
•Digital data can be structured, semi-structured or unstructured
data.
•Structured data
•When data follows a pre-defined schema or structure, we say it is structured data.
•This is data that is in an organized form (e.g., in rows and columns) and can be easily used by a computer program.
•Relationships exist between entities of data, such as classes and their objects.
•About 10% of an organization's data is in this format.
•Data stored in databases is an example of structured data.
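As a minimal sketch of what "follows a pre-defined schema" means in practice, the snippet below uses Python's built-in sqlite3 module. The employee table, its columns, and its rows are hypothetical and invented only for illustration.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " department TEXT NOT NULL,"
    " salary REAL NOT NULL)"
)
rows = [
    (1, "Asha", "IT", 52000.0),
    (2, "Ravi", "HR", 41000.0),
    (3, "Meena", "IT", 61000.0),
]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", rows)

# Every row follows the same schema, so a query can refer to columns by name.
it_staff = conn.execute(
    "SELECT name, salary FROM employee WHERE department = ? ORDER BY salary DESC",
    ("IT",),
).fetchall()
print(it_staff)
conn.close()
```

Because the rows and columns are declared up front, the program does not need to guess where the department or salary lives in each record.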
Sources of Structured Data
•SQL databases such as Oracle DB
•Spreadsheets such as Excel
•OLTP Systems
•Online forms
•Sensors such as GPS or RFID tags
•Network and Web server logs
•Medical devices
Ease of working with structured data
•Structured data is easier to work with than unstructured data because it's already formatted and has a clear structure:
•Easy to analyze and manipulate
•Structured data is easy for both humans and machines to work with because it's already formatted.
•Easy to search and query
•Structured data's organized nature makes it easy to manipulate and query.
•Easy to use
•Structured data can be used by average business users who understand the topic the data relates to.
•Easy to store
•Structured data can be stored in tabular formats like Excel sheets or SQL databases, which require less storage space.
•Easy to scale
•Structured data can be stored in data warehouses, which makes it highly scalable.
•Example
<!DOCTYPE html>
<html>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>
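The tags in the markup above label each piece of content, which is what lets a program pick it apart. Markup of this kind is often classed as semi-structured rather than fully structured. The sketch below uses Python's standard html.parser module; the TagTextCollector class is a hypothetical helper written for this illustration, not part of any library.

```python
from html.parser import HTMLParser

HTML_DOC = """<!DOCTYPE html>
<html>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>"""


class TagTextCollector(HTMLParser):
    """Hypothetical helper: records (tag, text) pairs for non-empty text."""

    def __init__(self):
        super().__init__()
        self._open_tags = []  # stack of tags currently open
        self.pairs = []       # collected (tag, text) results

    def handle_starttag(self, tag, attrs):
        self._open_tags.append(tag)

    def handle_endtag(self, tag):
        if self._open_tags and self._open_tags[-1] == tag:
            self._open_tags.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self._open_tags:
            # Attribute the text to the innermost open tag.
            self.pairs.append((self._open_tags[-1], text))


collector = TagTextCollector()
collector.feed(HTML_DOC)
collector.close()
print(collector.pairs)
```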
•Unstructured data:
•This is data that does not conform to a data model or is not in a form that can be used easily by a computer program.
•About 80% of an organization's data is in this format; for example, memos, chat rooms, PowerPoint presentations, images, videos, letters, research, white papers, the body of an email, etc.
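To illustrate why such data is hard for a program to use directly, here is a small sketch that pulls two facts out of free text with regular expressions. The memo text and the patterns are assumptions made up for this example; real memos would need far more robust handling.

```python
import re

# Hypothetical memo text, invented for illustration.
memo = (
    "Hi team, the vendor meeting moved to 2024-03-18. "
    "Please approve the invoice of Rs. 12,500 before then. Thanks, Priya"
)

# No schema names a "date" or "amount" field, so each must be found by pattern.
date_match = re.search(r"\b(\d{4}-\d{2}-\d{2})\b", memo)
amount_match = re.search(r"Rs\.\s*([\d,]+)", memo)

meeting_date = date_match.group(1) if date_match else None
amount = int(amount_match.group(1).replace(",", "")) if amount_match else None
print(meeting_date, amount)
```

If the author had written the date or the amount in a different style, these patterns would miss it, which is the core difficulty with data that has no data model.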
Issues of Unstructured Data
•Storage and management
•Unstructured data is difficult to store and manage because it comes in many formats, such as text, video, audio, and social media content. It can also be difficult to navigate through the large volume of unstructured data.
•Processing
•Unstructured data can be time-consuming and resource-intensive to process. Traditional data storage options may also be inflexible and unable to adapt to unstructured data.
•Analysis
•Unstructured data is not organized in a predefined manner, making it difficult to process and analyze using traditional methods.
•Cyber-attacks
•Unstructured data can make systems more vulnerable to cyber-attacks.
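One common first step in analyzing text that has no predefined organization is to derive a simple structure from it, such as word counts. The sketch below is only illustrative: the customer comments and the tiny stop-word list are hypothetical, and a real analysis would use a much fuller list and better tokenization.

```python
import re
from collections import Counter

# Hypothetical customer comments, invented for illustration.
comments = [
    "Delivery was late and the package was damaged.",
    "Late delivery again, very disappointed.",
    "Great product, but delivery was late.",
]

# Assumed stop-word list; a real one would be much longer.
STOP_WORDS = {"was", "and", "the", "but", "very", "again"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


# Turn free text into a word -> count table that can be sorted and queried.
word_counts = Counter(w for c in comments for w in tokenize(c))
print(word_counts.most_common(2))
```

Once the text is reduced to counts, it can be handled with the same kinds of tools used for tabular data.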
Dealing with unstructured data
Introduction to Big Data
•The "Internet of Things" and its ultra-connected nature are leading to a burgeoning rise in big data.
•There is no dearth of data for today's enterprise.
•On the contrary, enterprises are mired in data, and quite deep at that.
•Data is widely available.