XML (EXtensible Markup Language)

sivani14565220 36 views 36 slides Apr 28, 2024

XML (EXtensible Markup Language)
4/28/2024 GAGAN THAKRAL (ABESEC)

XML
•XML stands for EXtensible Markup Language.
•XML is a markup language much like HTML.
•XML was designed to describe data.
•XML tags are not predefined. You must define your own tags.
•XML can use a Document Type Definition (DTD) or an XML Schema to describe the data.
•XML with a DTD or XML Schema is designed to be self-descriptive.

XML
•The best description of XML is this: XML is a cross-platform, software- and hardware-independent tool for transmitting information.

XML-Example
XML document (file name: "xml_note.xml"):
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Aman</to>
  <from>Raman</from>
  <header>Reminder</header>
  <body>Don't forget me this weekend!</body>
</note>

Another Example
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
  <book category="COOKING">
    <title lang="en">North Indian Food</title>
    <author>Dr. Ram Parkash</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J. K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <!-- ... more book elements ... -->
</bookstore>

The Main Differences Between XML and HTML
–XML was designed to carry data.
–XML is not a replacement for HTML.
–XML and HTML were designed with different
goals:
•XML was designed to describe data and to focus on
what data is.
•HTML was designed to display data and to focus on
how data looks.
–HTML is about displaying information, while XML
is about describing information.

Advantages of Using XML
•Truly Portable Data
•Easily readable by human users
•Very expressive
•Very flexible and customizable
•Easy to use from programs (libs available)
•Easy to convert into other representations
•Many additional standards and tools
•Widely used and supported
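As a rough illustration of the "easy to use from programs" and "easy to convert" points, the sketch below reads a shortened copy of the bookstore example with Python's standard xml.etree.ElementTree module and converts it to JSON. The variable names and the choice of JSON as the target representation are assumptions made for this example only.

```python
import json
import xml.etree.ElementTree as ET

# Shortened copy of the bookstore example (two books only).
BOOKSTORE_XML = """<bookstore>
  <book category="COOKING">
    <title lang="en">North Indian Food</title>
    <author>Dr. Ram Parkash</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J. K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE_XML)

# Convert each <book> element into a plain dictionary.
books = [
    {
        "category": book.get("category"),        # attribute value
        "title": book.findtext("title"),         # text of a child element
        "author": book.findtext("author"),
        "year": int(book.findtext("year")),
        "price": float(book.findtext("price")),
    }
    for book in root.findall("book")
]

# The same data in another representation (JSON).
print(json.dumps(books, indent=2))
```

XML itself carries only text, so the conversion to int and float is a decision made by the program, not by the document.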

XML Encoding
•XML documents can contain international characters, like Norwegian æøå or French êèé.
•To avoid errors, you should specify the encoding used, or save your XML files as UTF-8.
•Character encoding defines a unique binary code for each different character used in a document.
•In computer terms, character encodings are also called character sets, character maps, code sets, and code pages.

XML Encoding
1. ISO-8859-1
2. UTF-8
3. UTF-16

•The Unicode Standard has become a success
and is implemented in HTML, XML, Java,
JavaScript, E-mail, ASP, PHP, etc.
•The Unicode standard is also supported in
many operating systems and all modern
browsers.
•The Unicode Consortium cooperates with the
leading standards development organizations,
like ISO, W3C, and ECMA.

•UTF-8 uses 1 byte (8 bits) to represent basic Latin characters, and two, three, or four bytes for the rest.
•UTF-8 = The Web Standard
•UTF-8 is the standard character encoding on
the web.
•UTF-8 is the default character encoding for
HTML5, CSS, JavaScript, PHP, SQL, and XML.
•UTF-16 uses 2 bytes (16 bits) for most
characters, and four bytes for the rest.
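The byte counts above can be checked with a small Python sketch. The helper name byte_lengths and the sample characters are assumptions chosen for illustration; the characters are written as escapes so the snippet does not depend on the encoding of the source file.

```python
def byte_lengths(ch):
    """Return (UTF-8, UTF-16, ISO-8859-1) byte counts for one character."""
    utf8 = len(ch.encode("utf-8"))
    # "utf-16-le" is used so the 2-byte byte-order mark is not counted.
    utf16 = len(ch.encode("utf-16-le"))
    try:
        latin1 = len(ch.encode("iso-8859-1"))
    except UnicodeEncodeError:
        latin1 = None  # the character cannot be represented in ISO-8859-1
    return utf8, utf16, latin1


# Basic Latin A, Norwegian ae ligature, French e-acute, euro sign, an emoji.
for ch in ["A", "\u00e6", "\u00e9", "\u20ac", "\U0001F600"]:
    utf8, utf16, latin1 = byte_lengths(ch)
    print(f"U+{ord(ch):04X}: UTF-8={utf8}, UTF-16={utf16}, ISO-8859-1={latin1}")
```

ISO-8859-1 always uses one byte but covers only 256 characters, which is why the last two samples have no ISO-8859-1 length.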

A Simple XML Document
<article>
  <author>Gerhard Weikum</author>
  <title>The Web in Ten Years</title>
  <text>
    <abstract>In order to evolve...</abstract>
    <section number="1" title="Introduction">
      The <index>Web</index> provides the universal...
    </section>
  </text>
</article>
The tags are freely definable.

Elements in XML Documents
•(Freely definable) tags: article, title, author
–with start tag: <article> etc.
–and end tag: </article> etc.
•Elements: <article> ... </article>
•Elements have a name (article) and content (...)
•Elements may be nested.
•Elements may be empty: <this_is_empty/>
•Each XML document has exactly one root element and forms a tree.
•Elements with a common parent are ordered.

Elements vs. Attributes
Elements may have attributes (in the start tag) that have a name and a value, e.g. <section number="1">.
What is the difference between elements and attributes?
•Only one attribute with a given name per element (but an arbitrary number of subelements)
•Attributes have no structure; they are simply strings (while elements can have subelements)
As a rule of thumb:
•Content into elements
•Metadata into attributes
Example:
<person born="1912-06-23" died="1954-06-07">Abc</person> proved that…
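A minimal sketch of how a program sees this difference, using Python's standard xml.etree.ElementTree module. The <sentence> wrapper element is hypothetical; it is added only because the fragment needs a single root element to be parsed.

```python
import xml.etree.ElementTree as ET

# <sentence> is a hypothetical root added so that the fragment can be parsed.
doc = ET.fromstring(
    '<sentence><person born="1912-06-23" died="1954-06-07">Abc</person>'
    " proved that...</sentence>"
)
person = doc.find("person")

# Attributes: a flat mapping from name to string value (metadata).
print(person.attrib)

# Element content: the text between the start tag and the end tag.
print(person.text)

# Text that follows the element inside its parent (mixed content).
print(person.tail)
```

The attribute values arrive as plain strings; interpreting them as dates would be up to the application.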

XML Documents as Ordered Trees
article
  author
  title
    "The Web in 10 years"
  text
    abstract
      "In order …"
    section (title="…", number="1")
      "The"
      index
        "Web"
      "provides …"

Well-Formed XML Documents
A well-formed document must adhere to, among others, the following rules:
•Every start tag has a matching end tag.
•Elements may nest, but must not overlap.
•There must be exactly one root element.
•Attribute values must be quoted.
•An element may not have two attributes with the same name.
•Comments and processing instructions may not appear inside tags.
Only well-formed documents can be processed by XML parsers.
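The rules above can be tried out with a short Python sketch: each sample breaks one rule, and the standard xml.etree.ElementTree parser rejects it with a ParseError. The helper name is_well_formed and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One valid document, then one sample per rule listed above.
samples = {
    "valid document": "<note><to>Aman</to></note>",
    "missing end tag": "<note><to>Aman</note>",
    "overlapping elements": "<a><b></a></b>",
    "two root elements": "<a/><b/>",
    "unquoted attribute value": "<section number=1/>",
    "duplicate attribute": '<section number="1" number="2"/>',
    "comment inside a tag": '<section <!-- note --> number="1"/>',
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'rejected'}")
```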

XML is not…
•A replacement for HTML
(but HTML can be generated from XML)
•A presentation format
(but XML can be converted into one)
•A programming language
(but it can be used with almost any language)
•A network transfer protocol
(but XML may be transferred over a network)
•A database
(but XML may be stored in a database)

Conversion of XML into Tree
<?xml version="1.0"?>
<address>
  <name>
    <first>Shiva</first>
    <last>Singh</last>
  </name>
  <email>[email protected]</email>
  <phone>9999999999</phone>
  <birthday>
    <year>1991</year>
    <month>03</month>
    <day>11</day>
  </birthday>
</address>
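One way to see the tree behind this document is to walk it recursively. The sketch below uses Python's standard xml.etree.ElementTree module; the function name tree_lines is an assumption, and the e-mail address is a placeholder because the original value is not available.

```python
import xml.etree.ElementTree as ET

# Copy of the address example; the e-mail value is a placeholder.
ADDRESS_XML = """<address>
  <name>
    <first>Shiva</first>
    <last>Singh</last>
  </name>
  <email>user@example.com</email>
  <phone>9999999999</phone>
  <birthday>
    <year>1991</year>
    <month>03</month>
    <day>11</day>
  </birthday>
</address>"""


def tree_lines(element, depth=0):
    """Return one indented line per element, in document order."""
    text = (element.text or "").strip()
    label = element.tag + (f": {text}" if text else "")
    lines = ["  " * depth + label]
    for child in element:  # children are visited in document order
        lines.extend(tree_lines(child, depth + 1))
    return lines


root = ET.fromstring(ADDRESS_XML)
lines = tree_lines(root)
print("\n".join(lines))
```

The indentation depth of each printed line corresponds to the depth of the node in the tree, with address as the root.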

•A well-formed XML document has a tree
structure and obeys all the XML rules.
•A particular application may add more rules in
either a DTD (document type definition) or in
a schema.
•Many specialized DTDs and schemas have
been created to describe particular areas.

Document Type Definitions
•A DTD describes the tree structure of a
document and something about its data.
•There are two data types, PCDATA and CDATA.
–PCDATA is parsed character data.
–CDATA is character data, not usually parsed.
•A DTD determines how many times a node
may appear, and how child nodes are ordered.

DTD for address Example
<!ELEMENT address (name, email, phone, birthday)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT birthday (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>
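To illustrate what these declarations constrain, the sketch below copies the content models into a Python dictionary and checks a document against them by hand. This is not a DTD validator: the parser in Python's standard xml.etree.ElementTree module is non-validating, so the dictionary CONTENT_MODEL and the function conforms are assumptions written for this example, and the check is simplified (it looks only at which child elements appear and in what order).

```python
import xml.etree.ElementTree as ET

# Content models copied by hand from the DTD: element -> required children.
# None stands for (#PCDATA), i.e. text only and no child elements.
CONTENT_MODEL = {
    "address": ["name", "email", "phone", "birthday"],
    "name": ["first", "last"],
    "birthday": ["year", "month", "day"],
    "first": None, "last": None, "email": None, "phone": None,
    "year": None, "month": None, "day": None,
}


def conforms(element):
    """Check one element and all its descendants against CONTENT_MODEL."""
    if element.tag not in CONTENT_MODEL:
        return False  # element was never declared
    expected = CONTENT_MODEL[element.tag]
    children = [child.tag for child in element]
    if expected is None:
        return children == []  # (#PCDATA): no child elements allowed
    if children != expected:
        return False  # missing, extra, or wrongly ordered children
    return all(conforms(child) for child in element)


# The e-mail value is a placeholder.
GOOD = ("<address><name><first>Shiva</first><last>Singh</last></name>"
        "<email>user@example.com</email><phone>9999999999</phone>"
        "<birthday><year>1991</year><month>03</month><day>11</day></birthday>"
        "</address>")
# Same data, but <last> comes before <first>, which the DTD does not allow.
BAD = GOOD.replace("<first>Shiva</first><last>Singh</last>",
                   "<last>Singh</last><first>Shiva</first>")

print(conforms(ET.fromstring(GOOD)))
print(conforms(ET.fromstring(BAD)))
```

Both documents are well-formed; only the first one follows the child order that the DTD prescribes.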

Schemas
•Schemas are themselves XML documents.
•They were standardized after DTDs and provide more
information about the document.
•They have a number of data types including string,
decimal, integer, boolean, date, and time.
•They divide elements into simple and complex types.
•They also determine the tree structure and how
many children a node may have.

Schema for address Example
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Simplified: name and birthday are declared as simple types here,
       although the address document nests child elements inside them. -->
  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
        <xs:element name="phone" type="xs:string"/>
        <xs:element name="birthday" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

XML Parsers
•An XML parser is a software library or package that provides interfaces for client applications to work with an XML document.
•The XML parser reads the XML and gives programs a way to use it.
•The XML parser checks that the document is well-formed; a validating parser also checks it against a DTD or schema.

Let's understand how an XML parser works with the help of the figure below:
[Figure: working of an XML parser]

Types of XML Parsers
•These are the two main types of XML Parsers:
1. DOM
2. SAX

DOM (Document Object Model)
•A DOM document is an object that contains all the information of an XML document. It is structured as a tree.
•The DOM parser implements the DOM API. This API is very simple to use.

Features of DOM Parser
•A DOM parser creates an internal structure in memory, the DOM document object. Client applications get information about the original XML document by invoking methods on this document object.
•A DOM parser has a tree-based structure.

Advantages of DOM Parser
1) It supports both read and write operations, and the API is very simple to use.
2) It is preferred when random access to widely separated parts of a document is required.

Disadvantages of DOM Parser
•It is memory inefficient (it consumes more memory because the whole XML document needs to be loaded into memory).
•It is comparatively slower than other parsers.
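A minimal sketch of the DOM style of working, using Python's standard xml.dom.minidom module on the note example. The new recipient name and the <priority> element are hypothetical additions, included only to show write access.

```python
from xml.dom.minidom import parseString

NOTE_XML = (
    "<note><to>Aman</to><from>Raman</from>"
    "<header>Reminder</header>"
    "<body>Don't forget me this weekend!</body></note>"
)

# The whole document is parsed into an in-memory tree (a Document object).
document = parseString(NOTE_XML)

# Random access: any element can be reached directly, in any order.
body = document.getElementsByTagName("body")[0]
to = document.getElementsByTagName("to")[0]
original_recipient = to.firstChild.data
print(original_recipient, "<-", body.firstChild.data)

# Write access: change a text node and append a new element.
to.firstChild.data = "Shiva"
priority = document.createElement("priority")  # hypothetical extra element
priority.appendChild(document.createTextNode("high"))
document.documentElement.appendChild(priority)

# Serialize the modified tree back to XML text.
updated_xml = document.documentElement.toxml()
print(updated_xml)
```

Because the full tree is held in memory, reading the last element before the first costs nothing extra, but the memory use grows with the size of the document.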

SAX (Simple API for XML)
•A SAX parser implements the SAX API. This API is event-based and less intuitive than the DOM API.

Features of SAX Parser
•It does not create any internal structure.
•Clients do not call the parser's methods themselves; they override the callback methods of the API and place their own code inside them.
•It is an event-based parser; it works like an event handler in Java.

Advantages of SAX Parser
•It is simple and memory efficient.
•It is very fast and works for huge documents.

Disadvantages of SAX Parser
•It is event-based, so its API is less intuitive.
•Clients never see the whole document at once because the data is delivered in pieces.
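For comparison with DOM, here is a minimal sketch of the event-based style using Python's standard xml.sax module. The handler class TitleCollector is an assumption written for this example; it overrides the callback methods and collects book titles from a shortened copy of the bookstore example.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collects the text of every <title> element as events arrive."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._inside_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        # Called by the parser for every start tag.
        if name == "title":
            self._inside_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so it is buffered.
        if self._inside_title:
            self._buffer.append(content)

    def endElement(self, name):
        # Called by the parser for every end tag.
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._inside_title = False


BOOKSTORE_XML = b"""<bookstore>
  <book category="COOKING"><title lang="en">North Indian Food</title></book>
  <book category="CHILDREN"><title lang="en">Harry Potter</title></book>
</bookstore>"""

handler = TitleCollector()
xml.sax.parseString(BOOKSTORE_XML, handler)
print(handler.titles)
```

No tree is built: the handler keeps only what it chooses to store, which is why this style suits very large documents but requires the client to track its own state.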