9.
Extensible Markup Language (XML): An Introduction
XML (Extensible Markup Language) is a standard developed by Jon Bosak and other members of the World Wide Web Consortium (W3C) (http://www.w3.org/XML). Its roots go back to 1969, when Charles Goldfarb, an IBM researcher, was in charge of the Generalized Markup Language (GML) design team; GML later evolved into the Standard Generalized Markup Language (SGML). Although SGML is an international standard for marking up data, it has the following weaknesses: it is extremely complex, it is expensive to implement, and it is not supported by common browsers such as Netscape and Internet Explorer. HTML, by contrast, is free and simple, but it cannot describe the data it represents. XML was developed to overcome the shortcomings of both SGML and HTML. XML promises to increase both the efficiency and the flexibility of handling computerized information, as needed in e-commerce. For example, you can define a new pair of tags, <PRICE> and </PRICE>, that is specific to your application and is not limited to the tags available in HTML.
XML is:
· A metalanguage for describing data (metadata)
· A standard protocol for exchanging and publishing information in a structured manner
A quick look at an XML document:
<?xml version="1.0"?>
<!-- dataset.xml -->
<dataset>
<row num="1">
<lastname>SMITH</lastname>
<firstname>Dan</firstname>
<salary>80000</salary>
<jobtitle>Web Developer</jobtitle>
</row>
</dataset>
Contents of this XML
page include the
)
Additional
highlights of XML are:
The design goals for XML are (source: http://www.w3.org/TR/REC-xml/):
1. XML shall be straightforwardly usable over the Internet.
2. XML shall support a wide variety of applications.
3. XML shall be compatible with SGML.
4. It shall be easy to write programs which process XML documents.
5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
6. XML documents should be human-legible and reasonably clear.
7. The XML design should be prepared quickly.
8. The design of XML shall be formal and concise.
9. XML documents shall be easy to create.
10. Terseness in XML markup is of minimal importance.
XML in 10 points (source: http://www.w3.org/XML/1999/XML-in-10-points)
Examples of XML Applications
· Separation of data storage and display within HTML pages - XML can be used to carry the data. The data in an XML file can be loaded into HTML pages as “data islands”, with HTML then used for data layout and display.
In addition to the XML file itself, some files and programs are needed to process an XML-compliant application:
o SAX (Simple API for XML), http://www.saxproject.org/ - originally a Java-only API. The current version is SAX 2.0.1, and there are versions for several programming language environments other than Java.
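As a minimal sketch of the event-driven SAX model, the following uses the xml.sax module from the Python standard library, one of the non-Java SAX implementations. The handler class TextCollector and the sample record are illustrative assumptions, not part of any SAX distribution.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler: records the text of each leaf element."""

    def __init__(self):
        super().__init__()
        self.values = {}      # element name -> text content
        self._current = None  # element whose text is being read
        self._buffer = []

    def startElement(self, name, attrs):
        # A new start tag resets the buffer for that element
        self._current = name
        self._buffer = []

    def characters(self, content):
        # characters() may be called several times for one text node
        self._buffer.append(content)

    def endElement(self, name):
        # Only store text when the end tag closes the element just opened
        if self._current == name:
            self.values[name] = "".join(self._buffer).strip()
        self._current = None


SAMPLE = b"""<?xml version="1.0"?>
<dataset>
<row num="1">
<lastname>SMITH</lastname>
<firstname>Dan</firstname>
<salary>80000</salary>
</row>
</dataset>"""

handler = TextCollector()
xml.sax.parseString(SAMPLE, handler)
print(handler.values)
```

The parser never builds the whole document in memory; it only reports start tags, text, and end tags in document order, which is why SAX suits large files.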
XML Syntax and Terminology
Empty elements, which have no text appearing between the two tags, are rewritten in XML in empty-tag notation, which ends with a forward slash.
Start Tag: <ElementName AttributeName="AttributeValue">
End Tag: </ElementName>
The XML declaration appears on the first line of an XML document, before the content:
<?xml version="version_number" encoding="encoding_declaration" standalone="yes_or_no" ?>
This declaration is interpreted as follows:
Version Declaration (required):
Encoding Declaration (optional):
Standalone Document Declaration (optional):
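To see the three parts of the declaration as a parser reports them, here is a small sketch using the xml.parsers.expat module from the Python standard library. The function name read_declaration and the sample documents are assumptions made for illustration.

```python
import xml.parsers.expat


def read_declaration(xml_bytes):
    """Return the version, encoding, and standalone values of the XML declaration."""
    found = {}

    def on_declaration(version, encoding, standalone):
        found["version"] = version
        found["encoding"] = encoding
        # expat reports 1 for "yes", 0 for "no", and -1 when the clause is omitted
        found["standalone"] = {1: "yes", 0: "no"}.get(standalone)

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_declaration
    parser.Parse(xml_bytes, True)
    return found


full = read_declaration(
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><EMAIL/>'
)
minimal = read_declaration(b'<?xml version="1.0"?><EMAIL/>')
print(full)
print(minimal)
```

Only the version is required; when the standalone clause is left out, the sketch stores None for it.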
XML pages can be created with any text editor. We will use Microsoft Notepad to create the following email XML example page, save it as email.xml, and view it with Internet Explorer.
Example 1: A well-formed email is prepared in XML format. The XML declaration appears on the first line and shows the version of the XML standard and the character encoding. It is followed by a comment line, which shows the XML file name with the .xml extension. View email.xml example.
<?xml version="1.0" encoding="UTF-8" ?>
<!-- email.xml -->
<EMAIL>
<TO>lin@ipfw.edu</TO>
<FROM>lin@hotmail.com</FROM>
<CC>XMLL_in_Action@hotmail.com</CC>
<SUBJECT>XML Example: Basic email</SUBJECT>
<BODY>This is an XML-based email.</BODY>
</EMAIL>
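One way to check that a document such as email.xml is well-formed is to hand it to an XML parser. The sketch below uses xml.etree.ElementTree from the Python standard library; the helper is_well_formed is a hypothetical name, and the document is embedded as a byte string rather than read from a file.

```python
import xml.etree.ElementTree as ET

EMAIL_XML = b"""<?xml version="1.0" encoding="UTF-8" ?>
<!-- email.xml -->
<EMAIL>
<TO>lin@ipfw.edu</TO>
<FROM>lin@hotmail.com</FROM>
<CC>XMLL_in_Action@hotmail.com</CC>
<SUBJECT>XML Example: Basic email</SUBJECT>
<BODY>This is an XML-based email.</BODY>
</EMAIL>"""


def is_well_formed(data):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(EMAIL_XML)
fields = {child.tag: child.text for child in root}  # child tag -> text

print(root.tag)
print(fields["SUBJECT"])
# A missing </TO> end tag makes the document ill-formed
print(is_well_formed(b"<EMAIL><TO>lin@ipfw.edu</EMAIL>"))
```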
Example 2: XML elements (tags) created to capture the information of <PRICE>, <PICTURE>, and <ARTICLE>.
<PRICE Currency="Euro"> 26.02 </PRICE>
<PICTURE> </PICTURE>
<ARTICLE>
<TITLE> </TITLE>
<DATE> </DATE>
<AUTHOR>
<FIRSTNAME> </FIRSTNAME>
<LASTNAME> </LASTNAME>
</AUTHOR>
<SUMMARY> </SUMMARY>
<CONTENT> </CONTENT>
</ARTICLE>
Example 3: Empty elements.
<img align="center" src="http://www.etcs.ipfw.edu/~lin/netdiagram.gif" />
<br></br>
<br/>
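The two spellings of an empty element are equivalent to an XML parser, while an HTML-style start tag with no end tag is rejected. A short sketch with Python's xml.etree.ElementTree (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<br></br>")
short_form = ET.fromstring("<br/>")

for element in (long_form, short_form):
    # Neither form carries text or child elements
    print(element.tag, element.text, len(element))

# Both forms serialize to the same bytes
same = ET.tostring(long_form) == ET.tostring(short_form)
print(same)

# An unclosed <br>, as commonly written in HTML, is not well-formed XML
try:
    ET.fromstring("<p>line<br></p>")
    html_style_accepted = True
except ET.ParseError:
    html_style_accepted = False
print(html_style_accepted)
```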
XML Elements, Attributes, and Values
XML elements (tags) are the basic components of a document. All elements in an XML document have parent-child relationships. Elements can have different content types and may carry a set of attribute specifications. Attributes are contained within an element’s opening tag. Each attribute has a value, delimited by quotation marks, that describes the purpose and content of the particular element. Information contained in attributes is called metadata, or “data about data”, which describes the content, description, quality, condition, or other characteristics of data. For Web applications, metadata can be seen as machine-understandable information, and it is addressed by the W3C Metadata Activity (http://www.w3.org/Metadata/Activity.html).
XML elements must follow the naming rules defined in the XML specification (www.w3.org/TR/REC-xml):
Content within an XML document can be encoded as either elements or attributes. For example, a book title can be expressed in one of the following ways:
As an element:
<book>
<title> XML-RPC </title>
*****
</book>
As an attribute:
<book title="XML-RPC">
*****
</book>
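From a program's point of view, the two encodings differ only in how the value is retrieved. The sketch below, using Python's xml.etree.ElementTree, reads the same title from a child element and from an attribute; the two small documents are simplified stand-ins for the listings above.

```python
import xml.etree.ElementTree as ET

as_element = ET.fromstring("<book><title>XML-RPC</title></book>")
as_attribute = ET.fromstring('<book title="XML-RPC"></book>')

# Element content is reached through the child element ...
title_from_element = as_element.findtext("title")
# ... while an attribute value is looked up on the element itself
title_from_attribute = as_attribute.get("title")

print(title_from_element, title_from_attribute)
```

An element can later grow children of its own (for example a subtitle), whereas an attribute can hold only a single string, which is one practical consideration when choosing between the two.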
XML Trees
XML documents are structured as hierarchical trees. The root element encloses all other elements, and its name indicates the meaning of the document. As shown in Example 1, the element <EMAIL> is the root element, which has five child elements: <TO>, <FROM>, <CC>, <SUBJECT>, and <BODY>. In Example 4, <lab-room> is the root element, which contains three child elements: <labtable>, <labchair>, and <pc>. Further elements and attributes describe the lab tables, lab chairs, and personal computers in terms of <quantity>, <quality>, <color>, <manufacturer>, etc.
Example 4: An XML file describing a university computer lab. View et226lab.xml example.
<?xml version="1.0"?>
<!-- et226lab.xml -->
<lab-room>
<labtable type="rectangular" wood="maple">
<quantity>8</quantity>
<quality>good</quality>
<color>brown</color>
<manufacturer>Steel Case</manufacturer>
</labtable>
<labchair wood="oak">
<quantity>20</quantity>
<quality>good</quality>
<cushion included="false">
<color>brown</color>
</cushion>
</labchair>
<pc>
<quantity>14</quantity>
<monitor>14</monitor>
<cpu> Intel Pentium 4 1.2 GHz</cpu>
<harddisk> 40Gbytes</harddisk>
</pc>
</lab-room>
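The parent-child structure of such a document can be walked recursively. The following sketch uses Python's xml.etree.ElementTree on an abridged copy of the lab description; the function outline is a hypothetical helper that returns one indented line per element.

```python
import xml.etree.ElementTree as ET

# Abridged version of et226lab.xml, embedded for illustration
LAB_XML = """<lab-room>
<labtable type="rectangular" wood="maple">
<quantity>8</quantity>
</labtable>
<labchair wood="oak">
<quantity>20</quantity>
<cushion included="false">
<color>brown</color>
</cushion>
</labchair>
<pc>
<quantity>14</quantity>
</pc>
</lab-room>"""


def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(LAB_XML)
children = [child.tag for child in root]  # direct children of the root

print("\n".join(outline(root)))
print(children)
```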
Example 5: XML documents can be viewed in a tree-structured format with Microsoft Internet Explorer 6.0. You can open the et226lab.xml document with Internet Explorer 6.0 to see the left picture, with only the root element, then click the “-/+” sign to expand or contract the viewed list.
XML Document Processing
The function of XML markup is to describe the storage and logical structure of a document, and to associate attribute-value pairs with its logical structures.
There is a need to extract XML data or content and display it using a Web browser. For example, we have an XML page named home_appliance_catalog.xml, which describes the <WASHER> products of the <HOME_APPLIANCES> catalog. To bind this XML data to a Web page, we need to make a reference to the XML data source. This is accomplished by placing the following code in the example HTML page named “ha_catalog_display.html”:
<xml src="home_appliance_catalog.xml"
id="xmlhomeapp"
async="false">
</xml>
We use the DATASRC property of an HTML table, with the same id="xmlhomeapp", to refer to this XML data island:
<table datasrc="#xmlhomeapp" width="100%" border="1">
We then use HTML <SPAN> tags with the DATAFLD attribute to extract the bound data and display the content:
<tr align="center">
<td><span datafld="product_name"></span></td>
<td><span datafld="product_id"></span></td>
<td><span datafld="EnergyStarQualified"></span></td>
<td><span datafld="price"></span></td>
</tr>
When you
open the ha_catalog_display.html page with Microsoft Internet Explorer, the XML
content should display in HTML table format.
Example 6: Binding XML data and displaying it with the Internet Explorer Web browser. We created two files, ha_catalog_display.html and home_appliance_catalog.xml, as shown below. Then open the HTML page to read and display the XML data in HTML table format.
<html>
<!-- ha_catalog_display.html -->
<body>
<xml src="home_appliance_catalog.xml"
id="xmlhomeapp"
async="false">
</xml>
<table datasrc="#xmlhomeapp" width="100%" border="1">
<thead>
<th>Product Name</th>
<th>Product ID</th>
<th>EnergyStar Qualified</th>
<th>Price</th>
</thead>
<tr align="center">
<td><span datafld="product_name"></span></td>
<td><span datafld="product_id"></span></td>
<td><span datafld="EnergyStarQualified"></span></td>
<td><span datafld="price"></span></td>
</tr>
</table>
</body>
</html>
<?xml version="1.0" encoding="utf-8" ?>
<!-- home_appliance_catalog.xml -->
<!-- Edited with Microsoft Notepad -->
<HOME_APPLIANCES>
<WASHER>
<product_name>WH Front Load Washer</product_name>
<product_id>GHW8200</product_id>
<EnergyStarQualified>Yes</EnergyStarQualified>
<price>$350.00</price>
</WASHER>
<WASHER>
<product_name>WH Top Load Washer</product_name>
<product_id>GHW8100</product_id>
<EnergyStarQualified>No</EnergyStarQualified>
<price>$300.00</price>
</WASHER>
<WASHER>
<product_name>Washer/Dryer All In One</product_name>
<product_id>GHW9300</product_id>
<EnergyStarQualified>Yes</EnergyStarQualified>
<price>$950.00</price>
</WASHER>
</HOME_APPLIANCES>
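Data islands rely on Internet Explorer's own binding mechanism. As a browser-independent sketch of the same idea, the Python fragment below extracts the bound fields from the catalog and produces one HTML table row per <WASHER>. The function catalog_rows and the FIELDS list are illustrative assumptions, and the catalog is abridged to two products.

```python
import xml.etree.ElementTree as ET
from html import escape

# Abridged copy of home_appliance_catalog.xml, embedded for illustration
CATALOG_XML = """<HOME_APPLIANCES>
<WASHER>
<product_name>WH Front Load Washer</product_name>
<product_id>GHW8200</product_id>
<EnergyStarQualified>Yes</EnergyStarQualified>
<price>$350.00</price>
</WASHER>
<WASHER>
<product_name>WH Top Load Washer</product_name>
<product_id>GHW8100</product_id>
<EnergyStarQualified>No</EnergyStarQualified>
<price>$300.00</price>
</WASHER>
</HOME_APPLIANCES>"""

# Same field names as the datafld attributes in the HTML page
FIELDS = ["product_name", "product_id", "EnergyStarQualified", "price"]


def catalog_rows(xml_text):
    """Return one HTML <tr> string per WASHER element."""
    rows = []
    for washer in ET.fromstring(xml_text).findall("WASHER"):
        cells = "".join(
            "<td>" + escape((washer.findtext(field) or "").strip()) + "</td>"
            for field in FIELDS
        )
        rows.append('<tr align="center">' + cells + "</tr>")
    return rows


rows = catalog_rows(CATALOG_XML)
print("\n".join(rows))
```

Each row plays the role of the repeated <tr> template in ha_catalog_display.html, with the field lookup standing in for the DATAFLD binding.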
· WML (Wireless Markup Language) - a WAP (Wireless Application Protocol) specification endorsed by Ericsson, Motorola, Nokia, and Unwired Planet.
· VoiceXML Standard - VoiceXML is a Document Type Definition (DTD) created by four companies (AT&T, IBM, Lucent Technologies, and Motorola) for implementing interactive voice response (IVR) applications as Web-based telephony systems.
· MathML - Mathematics Markup Language (http://www.w3.org/1999/07/REC-MathML-19990707/). Example tags: <plus/>, <times/>, <power/>, <mrow>, <msup>, <mi>, <mn>.
· EdaXML - XML-based symbols mapped directly to a database of millions of electronic components. The availability of these XML symbols allows printed circuit board (PCB) designers and design teams to perform cross-platform design using electronic design automation (EDA) tools from a variety of vendors.
· XML for Automation Devices - covers basic elements in automation, including algorithms, programs, controllers, and interfaces (http://www.gca.org/papers/xmleurope2001/papers/html/s07-1.html).
· CIM/XML - a language for representing power system models (http://www.langdale.com.au/CIMXML/).
· The Control System Modeling Language (www.slac.stanford.edu/econf/C011127/talks/THCT004.pdf).
· LandXML (http://www.anvil.eu.com/XML.htm) - a schema that facilitates the exchange of data created during the land planning, civil engineering, and land survey process.