Content
- SGML
- XML
- Difference Between XML and HTML

SGML
SGML (Standard Generalized Markup Language) is a standard for specifying a document markup language or tag set. Such a specification is itself a document type definition (DTD). SGML is not a document language in itself but a description of how to specify one; in that sense it is metadata.

SGML is based on the idea that documents have structural and other semantic elements that can be described without reference to how those elements should be displayed. The actual display of such a document may vary with the output medium and style preferences.

Some advantages of documents based on SGML:
- They can be created by thinking in terms of document structure rather than appearance characteristics (which may change over time).
- They are more portable, because an SGML compiler can interpret any document by reference to its document type definition (DTD).
- Documents originally intended for print can easily be re-adapted for other media, such as the computer display screen.

SGML (contd.)
SGML is based in part on earlier generalized markup languages developed at IBM, including General Markup Language (GML) and ISIL. HTML is an example of an SGML-based language.

An SGML application consists of several parts:
- The SGML declaration: specifies which characters and delimiters may appear in the application.
- The document type definition (DTD): defines the syntax of markup constructs. The DTD may include additional definitions such as numeric and named character entities.
- A specification that describes the semantics to be ascribed to the markup. This specification also imposes syntax restrictions that cannot be expressed within the DTD.
- Document instances containing data (content) and markup. Each instance contains a reference to the DTD to be used to interpret it.

XML
- XML stands for Extensible Markup Language. It is a text-based markup language derived from Standard Generalized Markup Language (SGML).
- XML tags identify the data and are used to store and organize it; HTML tags, by contrast, specify how data is displayed.
- XML is not going to replace HTML in the near future, but it introduces new possibilities by adopting many successful features of HTML.
- XML defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
- It is a software- and hardware-independent tool for storing and transporting data.
- XML became a W3C Recommendation in February 1998.

XML - Characteristics
- XML is extensible: it lets you create your own self-descriptive tags, or a whole language, to suit the application.
- XML carries the data, it does not present it: the data is stored independently of how it will be presented.
- XML is a public standard: it was developed by the World Wide Web Consortium (W3C) and is available as an open standard.

XML - Declaration
Syntax rules for the XML declaration:
- The XML declaration is case sensitive and must begin with "<?xml", where "xml" is written in lower case.
- If the document contains an XML declaration, it must be the first statement of the XML document.
- The HTTP protocol can override the value of encoding given in the XML declaration.
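
The placement and case rules above can be checked with a parser. This is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module; the helper name `parses_ok` and the sample documents are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def parses_ok(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The declaration is the very first thing in the document.
first = '<?xml version="1.0"?><note>hi</note>'

# A newline comes before the declaration, so the declaration is no
# longer the first statement of the document.
late = '\n<?xml version="1.0"?><note>hi</note>'

# "XML" is written in upper case instead of lower case.
upper = '<?XML version="1.0"?><note>hi</note>'

for label, doc in [("first", first), ("late", late), ("upper", upper)]:
    print(label, parses_ok(doc))
```

Only the first document should be accepted; the other two break the placement rule and the lower-case rule respectively.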

XML - Tags & Elements
An XML file is structured as a set of XML elements, also called XML nodes or XML tags. The names of XML elements are enclosed in angle brackets < >.

Syntax rules for tags and elements:
- Element syntax: each XML element must be closed, either with a matching end tag, as in <element>….</element>, or as an empty element, as in <element/>.
- Nesting of elements: an XML element can contain multiple XML elements as its children, but the child elements must not overlap, i.e., an end tag must have the same name as the most recent unmatched start tag.
- Root element: an XML document can have only one root element. For example, a document in which both an x and a y element occur at the top level, without a root element enclosing them, is not a correct XML document.
- Case sensitivity: the names of XML elements are case sensitive. The start tag and the end tag of an element must be written in exactly the same case.
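
The four rules above can be exercised one at a time. The following is a sketch under the assumption that Python's standard `xml.etree.ElementTree` parser is an acceptable well-formedness checker; `is_well_formed` and the sample strings are hypothetical names and data made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


samples = {
    # Start and end tags match and children are nested inside the parent.
    "properly nested": "<a><b>text</b></a>",
    # An empty element closed with the <element/> form.
    "empty element": "<a><b/></a>",
    # </b> appears while <c> is still the most recent unmatched start tag.
    "overlapping children": "<a><b><c></b></c></a>",
    # x and y both sit at the top level, with no single root element.
    "two top-level elements": "<x>1</x><y>2</y>",
    # Note and note differ in case, so the end tag does not match.
    "case mismatch": "<Note>text</Note>".replace("</Note>", "</note>"),
}

for label, doc in samples.items():
    print(f"{label}: {is_well_formed(doc)}")
```

The first two samples follow the rules; the last three each violate exactly one of them (nesting, single root, case sensitivity).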

XML - Attributes
An attribute specifies a single property of an element, using a name/value pair. An XML element can have one or more attributes.

Syntax rules for attributes:
- Attribute names in XML (unlike HTML) are case sensitive. That is, HREF and href are considered two different XML attributes.
- The same attribute cannot appear with two values in one tag.
- Attribute names are written without quotation marks, whereas attribute values must always appear in quotation marks.
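
The attribute rules above can be illustrated with a short sketch, again assuming Python's standard `xml.etree.ElementTree` module. The `link` element, its attribute values, and the helper `is_rejected` are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET


def is_rejected(text: str) -> bool:
    """Return True if the parser refuses the text as not well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return True
    return False


# HREF and href are two different names, so both may appear together.
link = ET.fromstring('<link HREF="a.html" href="b.html"/>')
names = sorted(link.attrib)
print(names)

# The same name twice in one tag is an error.
duplicate = '<link href="a.html" href="b.html"/>'
print(is_rejected(duplicate))

# An attribute value without quotation marks is an error.
unquoted = "<link href=a.html/>"
print(is_rejected(unquoted))
```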

XML - Document
An XML document is a basic unit of XML information, composed of elements and other markup in an orderly package. An XML document can contain a wide variety of data, for example a database of numbers, numbers representing a molecular structure, or a mathematical equation.

- Document prolog: comes at the top of the document, before the root element. This section contains the XML declaration and the document type declaration.
- Document elements: the building blocks of XML. They divide the document into a hierarchy of sections, each serving a specific purpose. You can separate a document into multiple sections so that they can be rendered differently or used by a search engine. Elements can be containers, holding a combination of text and other elements.

<?xml version = "1.0"?>
<contact-info>
   <name>Jaya</name>
   <company>ISM</company>
   <phone>(011) 123-4567</phone>
</contact-info>
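
The container idea (an element holding other elements, which in turn hold text) can also be seen by building the same contact-info document in code. This is a minimal sketch assuming Python's standard `xml.etree.ElementTree` module; the variable names are hypothetical, and the serialized form omits the prolog and the indentation.

```python
import xml.etree.ElementTree as ET

# The root element acts as a container for the three child elements.
root = ET.Element("contact-info")

fields = [
    ("name", "Jaya"),
    ("company", "ISM"),
    ("phone", "(011) 123-4567"),
]
for tag, text in fields:
    child = ET.SubElement(root, tag)  # attach a child element to the root
    child.text = text                 # the child holds plain text

# Serialize the element tree back to XML text.
serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```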

XML - Example
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
   <to>ABC</to>
   <from>XYZ</from>
   <heading>Reminder</heading>
   <body>Don't forget me this weekend!</body>
</note>

- The first line is the XML declaration. It defines the XML version (1.0) and the encoding used (ISO-8859-1 = Latin-1/West European character set).
- The next line describes the root element of the document (like saying: "this document is a note").
- The next 4 lines describe 4 child elements of the root (to, from, heading, and body).
- The last line defines the end of the root element.

XML documents must contain a root element. This element is "the parent" of all other elements. The elements in an XML document form a document tree. The tree starts at the root and branches to the lowest level of the tree. All elements can have sub elements (child elements).
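
The document tree described above can be walked programmatically. The following is a sketch assuming Python's standard `xml.etree.ElementTree` module; the `depth` helper is a hypothetical function written for this illustration, not part of the library.

```python
import xml.etree.ElementTree as ET

# Bytes are used so the parser honours the declared ISO-8859-1 encoding.
NOTE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
   <to>ABC</to>
   <from>XYZ</from>
   <heading>Reminder</heading>
   <body>Don't forget me this weekend!</body>
</note>"""


def depth(element) -> int:
    """Number of levels in the tree rooted at this element."""
    return 1 + max((depth(child) for child in element), default=0)


root = ET.fromstring(NOTE)             # the root element: note
child_tags = [child.tag for child in root]
child_text = {child.tag: child.text for child in root}

print(root.tag)
print(child_tags)
print(child_text["body"])
print(depth(root))
```

Iterating over an element yields its direct children in document order, which is why the root can be treated as "the parent" of the four child elements.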

HTML vs XML
HTML | XML
HTML is used to display data and focuses on how data looks. | XML is a software- and hardware-independent tool used to transport and store data. It focuses on what data is.
HTML is a markup language itself. | XML provides a framework to define markup languages.
HTML is not case sensitive. | XML is case sensitive.
HTML is a presentation language. | XML is neither a presentation language nor a programming language.
HTML has its own predefined tags. | You can define tags according to your need.
In HTML, it is not necessary to use a closing tag. | XML makes it mandatory to use a closing tag.
HTML is static because it is used to display data. | XML is dynamic because it is used to transport data.
HTML does not preserve whitespace. | XML preserves whitespace.
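
Three rows of the comparison (case sensitivity, closing tags, whitespace) can be observed with Python's standard library. This is a sketch under stated assumptions: `html.parser.HTMLParser` is only a tokenizer, not a browser, so it stands in for HTML's lenient handling; the `TagCollector` class and the sample markup are hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Record the name of every start tag the HTML parser sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names already converted to lower case.
        self.tags.append(tag)


# Upper-case tag names and no closing tags at all.
markup = "<P>Hello<BR>world"

collector = TagCollector()
collector.feed(markup)
collector.close()
print(collector.tags)

# The XML parser refuses the same text because nothing is closed.
try:
    ET.fromstring(markup)
    xml_accepts = True
except ET.ParseError:
    xml_accepts = False
print(xml_accepts)

# Whitespace inside an XML element is kept exactly as written.
spaced = ET.fromstring("<a>  two  spaces  </a>")
print(repr(spaced.text))
```

Whitespace collapsing in HTML happens when a browser renders the page, so it is not shown here; only the XML side of that row is demonstrated.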