Standards in Information Management: XML: Arijit Sengupta
Standards in Information
Management: XML
Arijit Sengupta
Learning Objectives
ISOM
• Overview
• Syntax and Structure
• The XML Alphabet Soup
• XML as a meta-language
Overview
What is XML?
[Figure: Application X; Repository Database]
Overview
Documents vs. Data
• Submissions by
Microsoft
IBM
Hewlett-Packard
Fujitsu Laboratories
Sun Microsystems
Netscape (AOL), and others…
• Technologies using XML
SOAP, ebXML, BizTalk, WebSphere, many others…
Agenda
• Overview
• Syntax and Structure
• The XML Alphabet Soup
• XML as a meta-language
Syntax and Structure
Components of an XML Document
• Elements
Each element has a beginning and ending tag
• <TAG_NAME>...</TAG_NAME>
Elements can be empty (<TAG_NAME />)
• Attributes
Describe an element, e.g. data type, data range, etc.
Can appear only on the start tag (or on an empty-element tag)
• Declarations and processing instructions
XML declaration: version and encoding (UTF-8 by default)
Namespace declarations (xmlns attributes)
Schema declaration
Syntax and Structure
Components of an XML Document
• Yes
<?xml version="1.0"?>
<PARENT>
<CHILD1>This is element 1</CHILD1>
<CHILD2/>
<CHILD3></CHILD3>
</PARENT>
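A minimal sketch of checking well-formedness, assuming Python's standard xml.etree.ElementTree parser. The helper is_well_formed and the broken variant NOT_WELL_FORMED are hypothetical, added only for illustration.

```python
import xml.etree.ElementTree as ET

WELL_FORMED = """<?xml version="1.0"?>
<PARENT>
  <CHILD1>This is element 1</CHILD1>
  <CHILD2/>
  <CHILD3></CHILD3>
</PARENT>"""

# Hypothetical broken variant: the CHILD1 end tag is missing.
NOT_WELL_FORMED = "<PARENT><CHILD1>This is element 1</PARENT>"


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(WELL_FORMED)
child_tags = [child.tag for child in root]

# <CHILD2/> and <CHILD3></CHILD3> are equivalent: both are empty elements
# with no text and no children.
empty_children = [
    child.tag for child in root if child.text is None and len(child) == 0
]

print(is_well_formed(WELL_FORMED), is_well_formed(NOT_WELL_FORMED))
print(child_tags, empty_children)
```

Note that a parser like this checks only well-formedness (matching tags, proper nesting, a single root), not validity against a DTD or schema.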
Syntax and Structure
An XML Document
<?xml version='1.0'?>
<bookstore>
<book genre='autobiography' publicationdate='1981'
ISBN='1-861003-11-0'>
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre='novel' publicationdate='1967' ISBN='0-201-63361-2'>
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
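As an illustration of how an application might read elements and attributes from the bookstore document, here is a sketch assuming Python's standard xml.etree.ElementTree module. The function name summarize and the dictionary keys are hypothetical.

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """<?xml version='1.0'?>
<bookstore>
  <book genre='autobiography' publicationdate='1981' ISBN='1-861003-11-0'>
    <title>The Autobiography of Benjamin Franklin</title>
    <author>
      <first-name>Benjamin</first-name>
      <last-name>Franklin</last-name>
    </author>
    <price>8.99</price>
  </book>
  <book genre='novel' publicationdate='1967' ISBN='0-201-63361-2'>
    <title>The Confidence Man</title>
    <author>
      <first-name>Herman</first-name>
      <last-name>Melville</last-name>
    </author>
    <price>11.99</price>
  </book>
</bookstore>"""


def summarize(xml_text):
    """Return one dict per <book>, mixing attribute and element data."""
    root = ET.fromstring(xml_text)
    rows = []
    for book in root.findall("book"):
        first = book.findtext("author/first-name")
        last = book.findtext("author/last-name")
        rows.append({
            "title": book.findtext("title"),             # element content
            "author": f"{first} {last}",                 # nested elements
            "genre": book.get("genre"),                  # attribute
            "year": int(book.get("publicationdate")),    # attribute, as int
            "price": float(book.findtext("price")),      # element, as float
        })
    return rows


books = summarize(BOOKSTORE)
for row in books:
    print(row)
```

Attribute values and element text always arrive as strings; the conversions to int and float are the application's choice, not something XML itself provides.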
Syntax and Structure
Namespaces: Overview
xmlns:bk="urn:mybookstuff.org:bookinfo"
xmlns:bk="http://www.example.com/bookinfo/"
<BOOK xmlns:bk="http://www.bookstuff.org/bookinfo">
<bk:TITLE>All About XML</bk:TITLE>
<bk:AUTHOR>Joe Developer</bk:AUTHOR>
<bk:PRICE currency='US Dollar'>19.99</bk:PRICE>
</BOOK>
<bk:BOOK xmlns:bk="http://www.bookstuff.org/bookinfo"
xmlns:money="urn:finance:money">
<bk:TITLE>All About XML</bk:TITLE>
<bk:AUTHOR>Joe Developer</bk:AUTHOR>
<bk:PRICE money:currency='US Dollar'>19.99</bk:PRICE>
</bk:BOOK>
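A sketch of how a namespace-aware parser sees the second example, assuming Python's standard xml.etree.ElementTree module. The closing </bk:BOOK> tag is added here so the snippet parses, and the short prefixes "b" and "m" in NS are hypothetical: only the namespace URI matters, not the prefix.

```python
import xml.etree.ElementTree as ET

DOC = """<bk:BOOK xmlns:bk="http://www.bookstuff.org/bookinfo"
         xmlns:money="urn:finance:money">
  <bk:TITLE>All About XML</bk:TITLE>
  <bk:AUTHOR>Joe Developer</bk:AUTHOR>
  <bk:PRICE money:currency="US Dollar">19.99</bk:PRICE>
</bk:BOOK>"""

# Prefixes chosen by the reader of the document; they need not match
# the prefixes used inside the document.
NS = {"b": "http://www.bookstuff.org/bookinfo", "m": "urn:finance:money"}

root = ET.fromstring(DOC)

# The parser replaces each prefix with its URI: "{uri}localname".
root_tag = root.tag

title = root.findtext("b:TITLE", namespaces=NS)
price = root.find("b:PRICE", NS)

# A namespaced attribute is looked up by its expanded name.
currency = price.get("{urn:finance:money}currency")
# The unqualified name does not match the namespaced attribute.
plain_currency = price.get("currency")

print(root_tag, title, currency, plain_currency)
```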
Syntax and Structure
Namespaces: Default Namespace
The XML ‘Alphabet Soup’
• XML document:
<BOOK>
<TITLE>All About XML</TITLE>
<AUTHOR>Joe Developer</AUTHOR>
</BOOK>
• DTD schema:
<!DOCTYPE BOOK [
<!ELEMENT BOOK (TITLE+, AUTHOR) >
<!ELEMENT TITLE (#PCDATA) >
<!ELEMENT AUTHOR (#PCDATA) >
]>
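To make the content model (TITLE+, AUTHOR) concrete, here is a hand-rolled sketch in Python. It is not a DTD validator: the standard xml.etree.ElementTree module does not validate against DTDs, so the models are translated by hand into regular expressions, and CONTENT_MODELS, TEXT_ONLY, and conforms are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the DTD, rewritten by hand as regexes over a
# space-terminated sequence of child tag names. TITLE+ = one or more.
CONTENT_MODELS = {
    "BOOK": re.compile(r"(TITLE )+AUTHOR "),
}
# Elements declared as (#PCDATA): text only, no child elements.
TEXT_ONLY = {"TITLE", "AUTHOR"}


def conforms(element):
    """Check an element (and its children) against the models above."""
    if element.tag in TEXT_ONLY:
        return len(element) == 0
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return False  # element not declared in the DTD
    children = "".join(child.tag + " " for child in element)
    return bool(model.fullmatch(children)) and all(
        conforms(child) for child in element
    )


valid = ET.fromstring(
    "<BOOK><TITLE>All About XML</TITLE><AUTHOR>Joe Developer</AUTHOR></BOOK>"
)
two_titles = ET.fromstring(
    "<BOOK><TITLE>A</TITLE><TITLE>B</TITLE><AUTHOR>J</AUTHOR></BOOK>"
)
no_title = ET.fromstring("<BOOK><AUTHOR>Joe Developer</AUTHOR></BOOK>")
wrong_order = ET.fromstring("<BOOK><AUTHOR>J</AUTHOR><TITLE>A</TITLE></BOOK>")

print(conforms(valid), conforms(two_titles),
      conforms(no_title), conforms(wrong_order))
```

The sketch ignores stray text inside BOOK and other DTD features (attributes, entities); a validating parser would check those as well.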
The XML ‘Alphabet Soup’
Schemas: XSD Example
• XML document:
<CATALOG>
<BOOK>
<TITLE>All About XML</TITLE>
<AUTHOR>Joe Developer</AUTHOR>
</BOOK>
…
</CATALOG>
The XML ‘Alphabet Soup’
Transformations: XSLT
/bookstore[@specialty = "textbooks"]
(find all bookstores where the specialty attribute = "textbooks")
//book[@style = /bookstore/@specialty]
(find all books where the style attribute = the specialty attribute of the bookstore element at the root)
More XPath Examples
/bookstore/book[1]
  Selects the first book element that is a child of the bookstore element
/bookstore/book[last()]
  Selects the last book element that is a child of the bookstore element
/bookstore/book[last()-1]
  Selects the last-but-one book element that is a child of the bookstore element
/bookstore/book[position()<3]
  Selects the first two book elements that are children of the bookstore element
//title[@lang]
  Selects all the title elements that have an attribute named lang
//title[@lang='eng']
  Selects all the title elements that have an attribute named lang with a value of 'eng'
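Several of these expressions can be tried with Python's standard xml.etree.ElementTree module, which supports a limited XPath subset. The sketch below rests on assumptions: the sample document and its lang values are hypothetical, paths are written relative to the bookstore root element (ElementTree does not accept absolute paths on an element), and position()<3 is not supported there, so a list slice stands in for it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data for illustration only.
DOC = """<bookstore>
  <book><title lang="eng">Book A</title></book>
  <book><title lang="fra">Book B</title></book>
  <book><title>Book C</title></book>
</bookstore>"""

root = ET.fromstring(DOC)  # root is the <bookstore> element


def titles(path):
    """Titles of the book elements selected by a path relative to root."""
    return [book.findtext("title") for book in root.findall(path)]


first = titles("book[1]")               # /bookstore/book[1]
last = titles("book[last()]")           # /bookstore/book[last()]
second_last = titles("book[last()-1]")  # /bookstore/book[last()-1]

# Stand-in for /bookstore/book[position()<3], which ElementTree lacks.
first_two = [b.findtext("title") for b in root.findall("book")[:2]]

with_lang = [t.text for t in root.findall(".//title[@lang]")]
eng = [t.text for t in root.findall(".//title[@lang='eng']")]

print(first, last, second_last, first_two, with_lang, eng)
```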
• Accessor functions:
node-name, data, base-uri, document-uri
• Numeric value functions:
abs, ceiling, floor, round, …
• String functions:
compare, concat, substring, string-length,
upper-case, lower-case, starts-with, ends-with,
matches, replace, …
• Other functions include functions on
boolean values, dates, nodes, etc.
The XML ‘Alphabet Soup’
Data Islands
<XML id="XMLID">
<BOOK>
<TITLE>All About XML</TITLE>
<AUTHOR>Joe Developer</AUTHOR>
</BOOK>
</XML>
XML as a Meta-Language
[Figure: languages and vocabularies built on XML/DTD: XLL, XSLT, XSchema, XPath, XPointer, XQL, GO, CML, MathML, WML, BeanML]
Gene Ontology (GO)
• Designed to be media-independent
• Initiated by International Press
Telecommunications Council
• Enables tracking of news stories over time
• NewsML website
www.newsml.org
• NewsML DTD
http://www.oasis-open.org/cover/newsML.html
• SportsML DTD – Derived from NewsML DTD
http://xml.coverpages.org/sportsML.html
cXML
Resources
• http://www.xml.com/
• http://www.w3.org/xml/
• http://www.w3schools.com/
• http://msdn.microsoft.com/xml/