SE735 - Data and Document Representation & Processing
XML I
Recommended textbooks:
S. Holzner, Sams Teach Yourself XML in 21 Days, 3rd edition, 2004.
C. Bates, XML in Theory and Practice, Wiley, 2003.
The following examples are from Holzner:
Sample HTML doc:
Text View:
<HTML>
    <HEAD>
        <TITLE>Hello From HTML</TITLE>
    </HEAD>
    <BODY>
        <CENTER>
            <H1>An HTML Document</H1>
        </CENTER>
        This is an HTML document!
    </BODY>
</HTML>
Sample XML doc:
Text View:
<?xml version="1.0" encoding="UTF-8"?>
<document>
    <heading>
        Hello From XML
    </heading>
    <message>
        This is an XML document!
    </message>
</document>
Browser View:
<?xml version="1.0" encoding="UTF-8" ?>
<document>
    <heading>Hello From XML</heading>
    <message>This text is inside a <message> element.</message>
</document>
Sample XML with Stylesheets:
Text View:
XML file contents:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="ch01_04.css"?>
<document>
    <heading>
        Hello From XML
    </heading>
    <message>
        This is an XML document!
    </message>
</document>
Stylesheet file contents (ch01_04.css):
heading {display: block; font-size: 24pt; color: #ff0000; text-align: center}
message {display: block; font-size: 18pt; color: #0000ff; text-align: center}
Extracting Content: JavaScript:
Text View:
<HTML>
    <HEAD>
        <TITLE>Retrieving data from an XML document</TITLE>
        <XML ID="firstXML" SRC="ch01_02.xml"></XML>
        <SCRIPT LANGUAGE="JavaScript">
            function getData()
            {
                xmldoc = document.all("firstXML").XMLDocument;
                nodeDoc = xmldoc.documentElement;
                nodeHeading = nodeDoc.firstChild;
                outputMessage = "Heading: " + nodeHeading.firstChild.nodeValue;
                message.innerHTML = outputMessage;
            }
        </SCRIPT>
    </HEAD>
    <BODY>
        <CENTER>
            <H1>Retrieving data from an XML document</H1>
            <DIV ID="message"></DIV>
            <P>
            <INPUT TYPE="BUTTON" VALUE="Read the heading" ONCLICK="getData()">
        </CENTER>
    </BODY>
</HTML>
Source Document: ch01_02.xml
<?xml version="1.0" encoding="UTF-8"?>
<document>
    <heading>
        Hello From XML
    </heading>
    <message>
        This is an XML document!
    </message>
</document>
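The same extraction can be sketched outside the browser. The following is a minimal Python sketch, not part of Holzner's example: it assumes the contents of ch01_02.xml are available as a string (inlined here as the hypothetical constant SOURCE) and uses the standard library module xml.etree.ElementTree to read the text of the root element's first child, much as getData() does.

```python
import xml.etree.ElementTree as ET

# Contents of ch01_02.xml, inlined so the sketch is self-contained (an assumption).
SOURCE = """<?xml version="1.0" encoding="UTF-8"?>
<document>
    <heading>Hello From XML</heading>
    <message>This is an XML document!</message>
</document>"""

# Parse from bytes so the encoding declaration is honoured by the parser.
root = ET.fromstring(SOURCE.encode("utf-8"))

# root[0] is the first child element, the counterpart of nodeDoc.firstChild.
heading = root[0]
output_message = "Heading: " + heading.text.strip()
print(output_message)
```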
XML Editors:
Amaya - free
XML Spy – free home edition
XMLWriter – 30 day trial
o Valid tag names begin with A to Z, a to z, or _
o Subsequent characters may also be digits 0 – 9, - , and .
o Tag names are case sensitive
o Tag names cannot include white space
<book> XML in Theory and Practice </book>
<name> Professor F. T. Marchese </name>
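The naming rules can be expressed as a regular expression. This is a minimal Python sketch that assumes ASCII-only names; the XML specification also permits colons and many non-ASCII letters, which this sketch deliberately ignores. The names NAME_RE and is_valid_tag_name are illustrative.

```python
import re

# First character: a letter or underscore.
# Later characters: letters, digits, underscore, hyphen, or period.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def is_valid_tag_name(name: str) -> bool:
    """Return True if the whole string satisfies the simplified naming rules."""
    return NAME_RE.fullmatch(name) is not None

for candidate in ["book", "_name", "line-1", "2ndLine", "first name"]:
    print(candidate, is_valid_tag_name(candidate))
```

Under these rules the last two candidates are rejected: one starts with a digit, the other contains white space.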
Rules:
o An element must have start and end tags unless it is an empty element
o Start and end tags must form a matched pair
Empty elements have only one tag. Syntax: <name />
<heading/>
<heading text = "Hello from XML" />
o Each well-formed document must contain exactly one root element, with any legal name
o This element contains all other elements
e.g.
<document>
    <heading>
    </heading>
    <message>
        This is an XML document!
    </message>
</document>
Nesting elements: tags must pair up so that elements are closed in the reverse of the order in which they were opened:
<document>
    <heading>
    </heading>
    <message>
        This is an XML document!
    </message>
</document>
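These well-formedness rules are what a parser checks first. As a minimal Python sketch (the helper name is_well_formed and the sample strings are illustrative), the standard library parser rejects a document with improperly nested tags or with more than one root element:

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<document><heading></heading><message>This is an XML document!</message></document>"
# <message> is opened inside <heading> but closed after it.
bad_nesting = "<document><heading><message></heading></message></document>"
# Two top-level elements, so there is no single root element.
two_roots = "<heading></heading><message></message>"

for sample in (good, bad_nesting, two_roots):
    print(is_well_formed(sample))
```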
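o ASCII – 7 bits, stored in 1 byte – 128 characters
o Unicode (UCS-2) – 2 bytes – 65,536 characters
o UCS (UCS-4) – Universal Character Set – 4 bytes – room for billions of characters
XML supports:
US-ASCII – US ASCII
UTF-8 – variable-length Unicode encoding: 1 to 4 bytes per character; ASCII characters take a single byte
UTF-16 – variable-length Unicode encoding: 2 bytes for most characters, 4 bytes (a surrogate pair) for the rest
ISO-10646-UCS-2 – fixed 2-byte Unicode
In practice, all XML processors must support UTF-8 and UTF-16, and UTF-8 is assumed when no encoding is declared.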
<?xml version="1.0" encoding="UTF-8"?>
<document>
    <heading>
        Hello From XML
    </heading>
    <message>
        This is an XML document!
    </message>
</document>
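The variable-length nature of UTF-8 and UTF-16 can be checked directly. This is a minimal Python sketch; the four sample characters are chosen for illustration, and "utf-16-be" is used so that no byte-order mark is counted.

```python
# Sample characters: Latin capital A, e with acute accent, euro sign, and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

def encoded_lengths(ch: str) -> dict:
    """Number of bytes needed for one character in UTF-8 and in UTF-16."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),  # big-endian variant, no BOM added
    }

for ch in samples:
    print(repr(ch), encoded_lengths(ch))
```

ASCII characters occupy one byte in UTF-8, which is why a plain ASCII file is already valid UTF-8.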
Character References
Character | Sequence
< | &lt;
> | &gt;
' | &apos;
& | &amp;
" | &quot;
e.g.
<message> This text is inside a &lt;message&gt; element. </message>
Result: This text is inside a <message> element.
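Escaping is usually left to a library. As a minimal Python sketch (variable names are illustrative), xml.sax.saxutils.escape replaces &, < and > with their references, and parsing the resulting element recovers the original text:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "This text is inside a <message> element."

# escape() replaces &, < and > with &amp;, &lt; and &gt;.
escaped = escape(raw)
doc = "<message>" + escaped + "</message>"

# The parser turns the references back into the characters they stand for.
parsed_text = ET.fromstring(doc).text

print(escaped)
print(parsed_text)
```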
<!-- This is a comment -->
Attributes may appear in:
o Elements
o Processing instructions
o XML declarations
Syntax:
attributename = "value"
e.g.
<brush width="10" height="5" color="cyan" />
<point x="10" y="100" />
<book title="Home Alone 2" review="bad" />
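A parser exposes attributes as name/value pairs whose values are always strings. This minimal Python sketch reads the brush element shown above; the area calculation is only an illustration of converting the values.

```python
import xml.etree.ElementTree as ET

brush = ET.fromstring('<brush width="10" height="5" color="cyan" />')

# .attrib is a dictionary of attribute names to string values.
print(brush.attrib)

# Values must be converted explicitly before doing arithmetic with them.
area = int(brush.get("width")) * int(brush.get("height"))
print(area)
```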
CDATA sections are parts of the XML document that the parser does not interpret as markup.
CDATA – Character Data
PCDATA – Parsed Character Data
<?xml version="1.0" standalone="yes" ?>
<document>
    <text>
        Here's how the element starts:
        <![CDATA[
        <employee status="retired">
            <name>
                <lastname>Kelly</lastname>
                <firstname>Grace</firstname>
            </name>
            <hiredate>October 15, 2005</hiredate>
            <projects>
                <project>
                    <product>Printer</product>
                    <id>111</id>
                    <price>$111.00</price>
                </project>
                . . .
        ]]>
    </text>
</document>
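The effect of a CDATA section can be seen by parsing one. In this minimal Python sketch (a shortened version of the example above), the markup inside the CDATA section comes back as plain text and creates no child elements:

```python
import xml.etree.ElementTree as ET

doc = """<text>Here's how the element starts: <![CDATA[<employee status="retired">]]></text>"""

element = ET.fromstring(doc)

# The CDATA content is delivered as character data, not as an <employee> element.
print(element.text)
print(len(element))  # number of child elements
```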
Namespace – a unique identifier for a set of names within an XML document
Declaring a Namespace: assign a unique identifier to an xmlns:prefix attribute, e.g.
xmlns:hr="http://www.superduperbigco.com/human_resources"
The URIs (Uniform Resource Identifiers) or URLs specified serve only as unique identifiers; they may, but need not, point to a document such as a DTD or schema.
Original document:
<document>
    <employee>
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
            <project>
                <product>Printer</product>
                <id>111</id>
                <price>$111.00</price>
            </project>
            <project>
                <product>Laptop</product>
                <id>222</id>
                <price>$989.00</price>
            </project>
        </projects>
    </employee>
</document>
Document using namespaces:
<hr:employee xmlns:hr="http://www.superduperbigco.com/human_resources"
             xmlns:boss="http://www.superduperbigco.com/big_boss">
    <hr:name>
        <hr:lastname>Kelly</hr:lastname>
        <hr:firstname>Grace</hr:firstname>
    </hr:name>
    <hr:hiredate>October 15, 2005</hr:hiredate>
    <boss:comment>Needs much supervision.</boss:comment>
    <hr:projects>
        <hr:project>
            <hr:product>Printer</hr:product>
            <hr:id>111</hr:id>
            <hr:price>$111.00</hr:price>
        </hr:project>
        <hr:project>
            <hr:product>Laptop</hr:product>
            <hr:id>222</hr:id>
            <hr:price>$989.00</hr:price>
        </hr:project>
    </hr:projects>
</hr:employee>
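When such a document is processed, each prefix is resolved to its URI. This minimal Python sketch uses a shortened copy of the namespaced document; the NS mapping passed to find() is an assumption of the sketch, and its prefixes need not match those in the document as long as the URIs do.

```python
import xml.etree.ElementTree as ET

DOC = """<hr:employee
    xmlns:hr="http://www.superduperbigco.com/human_resources"
    xmlns:boss="http://www.superduperbigco.com/big_boss">
  <hr:name>
    <hr:lastname>Kelly</hr:lastname>
    <hr:firstname>Grace</hr:firstname>
  </hr:name>
  <boss:comment>Needs much supervision.</boss:comment>
</hr:employee>"""

# Prefix-to-URI mapping used only for the lookups below.
NS = {
    "hr": "http://www.superduperbigco.com/human_resources",
    "boss": "http://www.superduperbigco.com/big_boss",
}

root = ET.fromstring(DOC)
lastname = root.find("hr:name/hr:lastname", NS).text
comment = root.find("boss:comment", NS).text

# ElementTree stores the expanded name: {namespace URI}local-name.
print(root.tag)
print(lastname, "-", comment)
```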
o A DTD defines the formal rules of a document's structure
o Lists elements, attributes, and entities that may be used in the document
o Defines the relationship among elements, attributes, and entities
o DTDs outline the tree structure of an XML document
o DTDs have their own structure and syntax
o A DTD is a series of declarations of the form <! >
o DTD declarations use 4 keywords:
    o ELEMENT – which defines a tag
    o ATTLIST – which defines the attributes of an ELEMENT
    o ENTITY – which defines an entity
    o NOTATION – which identifies the format of non-XML data
e.g. from Bates:
<!DOCTYPE letter[
<!ELEMENT letter (address)>
<!ELEMENT address (line1, line2?, line3*, city, (county|state)?, country?, code?)>
<!ELEMENT line1 (#PCDATA)>
<!ELEMENT line2 (#PCDATA)>
<!ELEMENT line3 (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT county (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT code (#PCDATA)>
]>
o DTD describes the structure of the XML document starting with the root node – letter
o DTD is declared by using a <!DOCTYPE> declaration
o <!DOCTYPE> declaration syntax:
    o <!DOCTYPE rootname [DTD]>
    o <!DOCTYPE rootname SYSTEM URI>
    o <!DOCTYPE rootname SYSTEM URI [DTD]>
o Each tag is declared as an ELEMENT
o Each element may contain data or more elements, and may have attributes
o By convention the root element is declared first, e.g. <!ELEMENT letter (address)>
o ELEMENT content follows the name and is in parentheses
o Content is a list of items separated by "," or "|", known as the content model
o Root node has another ELEMENT as its content: (address)
o The address element contains all components:
<!ELEMENT address (line1, line2?, line3*, city, (county|state)?, country?, code?)>
o A comma between elements means that they must appear in the XML document in the order listed
o This ordering is enforced by a validating parser, not merely a convention for human readers
o Parentheses are used for grouping, and | is a choice between alternatives
o Symbols after items signify how often they may appear:
Symbol | Example | Meaning
Asterisk | item* | Item appears zero or more times
Comma | (item1, item2, item3) | Separates items in sequence
None | item | Item appears exactly once
Parentheses | (item1, item2) | Encloses group of items
Pipe | (item1 | item2) | Separates a set of alternatives
Plus | item+ | Item appears at least once
Question Mark | item? | Item appears once or not at all
o Parsed character data - <!ELEMENT line1 (#PCDATA)>
o Mixed content model - <!ELEMENT line1 (#PCDATA | house_number | street_name)*>
    o Mixed content must obey this form: #PCDATA first, then the other element names separated by pipes, and the whole group followed by *
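A content model behaves much like a regular expression over the names of an element's children. The following minimal Python sketch translates the address content model from Bates's example into a regular expression by hand. It is an illustration of the occurrence symbols, not how a validating parser is implemented, and the names ADDRESS_MODEL and matches_address_model are hypothetical.

```python
import re

# Hand translation of:
#   (line1, line2?, line3*, city, (county|state)?, country?, code?)
# Each child name is followed by one space so that names are matched whole.
ADDRESS_MODEL = re.compile(
    r"line1 (line2 )?(line3 )*city ((county|state) )?(country )?(code )?"
)

def matches_address_model(child_names) -> bool:
    """Check a sequence of child element names against the address content model."""
    joined = "".join(name + " " for name in child_names)
    return ADDRESS_MODEL.fullmatch(joined) is not None

print(matches_address_model(["line1", "city"]))                     # minimal address
print(matches_address_model(["line1", "line3", "line3", "city", "state", "code"]))
print(matches_address_model(["city", "line1"]))                     # wrong order
print(matches_address_model(["line1", "city", "county", "state"]))  # county and state together
```

The last two calls fail because the comma fixes the order of the children and the pipe allows only one of county or state.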
o Attributes give additional information about an element or its content
o Attributes are declared separately and associated with an element:
<!ATTLIST element attribute type default>
o element – name of the element to which the attribute applies
o attribute – attribute name
o type – XML attribute type
o default – XML attribute default
e.g.
<!ELEMENT country (#PCDATA)>
<!ATTLIST country
    continent (Europe | Asia | Africa | North_America) "Asia"
    language CDATA #IMPLIED>
(Enumerated values must be name tokens, so they cannot contain spaces; hence North_America.)
o element – country
o attribute – continent – followed by an enumerated list of values
o default – Asia
o attribute – language – followed by CDATA
o default – #IMPLIED
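The effect of an attribute default can be observed with the expat parser in the Python standard library, which reads the internal DTD subset and, by default, reports attributes derived from declarations. This is a minimal sketch: the document, the shortened enumeration, and the variable names are assumptions made for illustration.

```python
import xml.parsers.expat

DOC = """<!DOCTYPE country [
<!ELEMENT country (#PCDATA)>
<!ATTLIST country
    continent (Europe | Asia | Africa) "Asia"
    language CDATA #IMPLIED>
]>
<country>Japan</country>"""

seen = {}  # element name -> attributes reported by the parser

def start_element(name, attrs):
    seen[name] = dict(attrs)

parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = start_element
parser.Parse(DOC, True)

# continent was not written in the document, so its declared default is supplied;
# language is #IMPLIED, so nothing is reported for it.
print(seen["country"])
```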
XML Attribute Types
Type | Usage
CDATA | Character data (any text)
ENTITY | Attribute value is a reference to an entity declared elsewhere in the DTD
ENTITIES | Multiple entities referenced
ID | Unique identifier for an element within the document
IDREF | References an ID value given elsewhere in the document – used for linking within the document
IDREFS | Multiple IDs referenced
NMTOKEN | Value is a single word or name token
NMTOKENS | A list of tokens
NOTATION | NOTATION declared elsewhere
Enumeration | List of possible values in parentheses
XML Attribute Defaults
Default | Usage
#REQUIRED | A value must be given on every occurrence of the element
#IMPLIED | Attribute is optional – no value need be given
#FIXED value | Attribute always has the given value; if it is written in the document it must match
"value" | Default value used when the attribute is omitted
o An XML document can be separated into a number of components called entities
o Each entity has a unique name
o Entities are used when:
    o Large documents need to be split
    o Content needs to be used in a number of places within a document without duplication
    o Different systems may render the same content in different ways
o Declaration:
    o <!ENTITY name "definition">
    o <!ENTITY name SYSTEM system_identifier [NDATA notation]>
    o <!ENTITY name PUBLIC public_identifier system_identifier [NDATA notation]>
o Internal entity – the simplest definition, given within the DTD; wherever the entity is referenced in the XML document (as &name;), the content defined in the DTD is substituted for the reference.
    o Internal entity definition - <!ENTITY name "definition">
o External entity – refers to content outside the DTD and XML file, which may be on a remote system
    o <!ENTITY locationmap SYSTEM "./images/home.png" NDATA PNG>
        § URI - "./images/home.png"
        § NDATA – Notation data type follows
        § PNG – type of data
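Internal entity substitution can be seen by parsing a small document. This minimal Python sketch declares a hypothetical entity named company in an internal DTD; the parser replaces each &company; reference with the declared text.

```python
import xml.etree.ElementTree as ET

# The entity name "company" and the memo element are illustrative assumptions.
DOC = """<!DOCTYPE memo [
<!ENTITY company "SuperDuperBigCo">
]>
<memo>Welcome to &company;. &company; builds printers.</memo>"""

memo = ET.fromstring(DOC)

# Both references have been replaced by the entity's definition.
print(memo.text)
```

Because entity expansion happens inside the parser, documents from untrusted sources should be parsed with entity expansion restricted.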
NOTATIONS normally specify applications that can process data:
e.g.
<!NOTATION PNG SYSTEM "/usr/bin/display">
<!NOTATION gif SYSTEM "gifviewer.exe">
Internal DTD –
<!DOCTYPE rootnode [ ]>
External DTD –
<?xml version="1.0"?>
<!DOCTYPE rootnode SYSTEM | PUBLIC [public_identifier] URI>
Example: from Holzner
XML file:
<?xml version = "1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE document SYSTEM "ch04_07.dtd">
<document>
    <employee>
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
            <project>
                <product>Printer</product>
                <id>111</id>
                <price>$111.00</price>
            </project>
            <project>
                <product>Laptop</product>
                <id>222</id>
                <price>$989.00</price>
            </project>
        </projects>
    </employee>
</document>
DTD file:
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product,id,price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
XML Schemas:
o Provide a means for defining the structure, content and semantics of XML documents through XML itself.
o Define a richer set of data types (such as booleans, numbers, dates and times, and currencies) than the more traditional DTD
o XML Schemas make it easier to validate documents based on namespaces
o Defined by the W3C's XML Schema Working Group
Purpose - to define the legal building blocks of an XML document
An XML Schema:
Using Schema:
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <!-- Define the actual document -->
    <xsd:element name="letter">
    </xsd:element>
</xsd:schema>
o Content of schema – mostly element definitions
o Elements may contain data (e.g. strings or numbers), sub-elements, or both
o Simple types – elements that contain only data, with no attributes or sub-elements
o Complex types – all others
Example: Mortgage file (Holzner)
XML file:
<?xml version="1.0" encoding="UTF-8"?>
<document documentDate="2005-03-02">
    <comment>Good risk</comment>
    <mortgagee phone="888.555.1234">
        <name>James Blandings</name>
        <location>1234 299th St</location>
        <city>New York</city>
        <state>NY</state>
    </mortgagee>
    <mortgages>
        <mortgage loanNumber="66 7777 88">
            <property>The Hackett Place</property>
            <date>2005-03-01</date>
            <loanAmount>80000</loanAmount>
            <term>15</term>
        </mortgage>
        <mortgage loanNumber="11 8888 22">
            <property>123 Acorn Drive</property>
            <date>2005-03-01</date>
            <loanAmount>90000</loanAmount>
            <term>15</term>
        </mortgage>
        <mortgage loanNumber="33 4444 11">
            <property>99 West Pocusset St</property>
            <date>2005-03-02</date>
            <loanAmount>100000</loanAmount>
            <term>30</term>
        </mortgage>
        <mortgage loanNumber="55 3333 88">
            <property>19 Johnson Place</property>
            <date>2005-03-02</date>
            <loanAmount>110000</loanAmount>
            <term>30</term>
        </mortgage>
        <mortgage loanNumber="22 6666 99">
            <property>345 Notingham Court</property>
            <date>2005-03-02</date>
            <loanAmount>120000</loanAmount>
            <term>30</term>
        </mortgage>
    </mortgages>
    <bank phone="888.555.8888">
        <name>XML Bank</name>
        <location>12 Schema Place</location>
        <city>New York</city>
        <state>NY</state>
    </bank>
</document>
XSD file:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:annotation>
        <xsd:documentation>
            Mortgage record XML schema.
        </xsd:documentation>
    </xsd:annotation>
    <xsd:element name="document" type="documentType"/>
    <xsd:complexType name="documentType">
        <xsd:sequence>
            <xsd:element ref="comment"/>
            <xsd:element name="mortgagee" type="recordType"/>
            <xsd:element name="mortgages" type="mortgagesType"/>
            <xsd:element name="bank" type="recordType"/>
        </xsd:sequence>
        <xsd:attribute name="documentDate" type="xsd:date"/>
    </xsd:complexType>
    <xsd:complexType name="recordType">
        <xsd:sequence>
            <xsd:element name="name" type="xsd:string"/>
            <xsd:element name="location" type="xsd:string"/>
            <xsd:element name="city" type="xsd:string"/>
            <xsd:element name="state" type="xsd:string"/>
        </xsd:sequence>
        <xsd:attribute name="phone" type="xsd:string" use="optional"/>
    </xsd:complexType>
    <xsd:complexType name="mortgagesType">
        <xsd:sequence>
            <xsd:element name="mortgage" minOccurs="0" maxOccurs="8">
                <xsd:complexType>
                    <xsd:sequence>
                        <xsd:element name="property" type="xsd:string"/>
                        <xsd:element name="date" type="xsd:date" minOccurs="0"/>
                        <xsd:element name="loanAmount" type="xsd:decimal"/>
                        <xsd:element name="term">
                            <xsd:simpleType>
                                <xsd:restriction base="xsd:integer">
                                    <xsd:maxInclusive value="30"/>
                                </xsd:restriction>
                            </xsd:simpleType>
                        </xsd:element>
                    </xsd:sequence>
                    <xsd:attribute name="loanNumber" type="loanNumberType"/>
                </xsd:complexType>
            </xsd:element>
        </xsd:sequence>
    </xsd:complexType>
    <xsd:simpleType name="loanNumberType">
        <xsd:restriction base="xsd:string">
            <xsd:pattern value="\d{2} \d{4} \d{2}"/>
        </xsd:restriction>
    </xsd:simpleType>
    <xsd:element name="comment" type="xsd:string"/>
</xsd:schema>
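The constraints that this schema places on a single mortgage can be sketched by hand. The following minimal Python sketch is not a schema validator; it only mirrors three of the rules above (the loanNumber pattern, the decimal loanAmount, and the integer term with maxInclusive 30). The function name check_mortgage and its messages are illustrative.

```python
import re
from decimal import Decimal, InvalidOperation

LOAN_NUMBER = re.compile(r"\d{2} \d{4} \d{2}")  # pattern facet of loanNumberType
MAX_TERM = 30                                   # maxInclusive facet on term

def check_mortgage(loan_number: str, loan_amount: str, term: str) -> list:
    """Return a list of problems found; an empty list means the values pass."""
    problems = []
    if LOAN_NUMBER.fullmatch(loan_number) is None:
        problems.append("loanNumber does not match the pattern")
    try:
        Decimal(loan_amount)
    except InvalidOperation:
        problems.append("loanAmount is not a decimal number")
    try:
        if int(term) > MAX_TERM:
            problems.append("term exceeds 30")
    except ValueError:
        problems.append("term is not an integer")
    return problems

print(check_mortgage("66 7777 88", "80000", "15"))   # values from the first mortgage
print(check_mortgage("667777 88", "eighty", "45"))   # three violations
```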
XML Schema elements are grouped by their function: top level elements, particles, multiple XML documents and namespaces, identity constraints, attributes, named attributes, complex type definitions, and simple type definitions.
The following are elements that appear at the top level of a schema document.
Element | Description
annotation | Defines an annotation.
attribute | Declares an attribute.
attributeGroup | Groups a set of attribute declarations so that they can be incorporated as a group for complex type definitions.
complexType | Defines a complex type, which determines the set of attributes and the content of an element.
element | Declares an element.
group | Groups a set of element declarations so that they can be incorporated as a group into complex type definitions.
import | Identifies a namespace whose schema components are referenced by the containing schema.
include | Includes the specified schema document in the target namespace of the containing schema.
notation | Contains the definition of a notation to describe the format of non-XML data within an XML document. An XML Schema notation declaration is a reconstruction of XML 1.0 NOTATION declarations.
redefine | Allows simple and complex types, groups, and attribute groups that are obtained from external schema files to be redefined in the current schema.
simpleType | Defines a simple type, which determines the constraints on and information about the values of attributes or elements with text-only content.
The following are elements that can have minOccurs and maxOccurs attributes. Such elements always appear as part of a complex type definition or as part of a named model group.
Element | Description
all | Allows the elements in the group to appear (or not appear) in any order in the containing element.
any | Enables any element from the specified namespace(s) to appear in the containing sequence or choice element.
choice | Allows one and only one of the elements contained in the selected group to be present within the containing element.
element | Declares an element.
group | Groups a set of element declarations so that they can be incorporated as a group into complex type definitions.
sequence | Requires the elements in the group to appear in the specified sequence within the containing element.
The following are elements that bring in schema elements from other namespaces or redefine schema elements in the same namespace.
Element | Description
import | Identifies a namespace whose schema components are referenced by the containing schema.
include | Includes the specified schema document in the target namespace of the containing schema.
redefine | Allows simple and complex types, groups, and attribute groups that are obtained from external schema files to be redefined in the current schema.
The following are elements that are related to identity constraints.
Element | Description
field | Specifies an XML Path Language (XPath) expression that specifies the value (or one of the values) used to define an identity constraint (unique, key, and keyref elements).
key | Specifies that an attribute or element value (or set of values) must be a key within the specified scope. The scope of a key is the containing element in an instance document. A key must be unique, non-nillable, and always present.
keyref | Specifies that an attribute or element value (or set of values) corresponds to those of the specified key or unique element.
selector | Specifies an XPath expression that selects a set of elements for an identity constraint (unique, key, and keyref elements).
unique | Specifies that an attribute or element value (or a combination of attribute or element values) must be unique within the specified scope. The value must be unique or nil.
The following are elements that define attributes in schemas.
Element | Description
anyAttribute | Enables any attribute from the specified namespace(s) to appear in the containing complexType element or in the containing attributeGroup element.
attribute | Declares an attribute.
attributeGroup | Groups a set of attribute declarations so that they can be incorporated as a group for complex type definitions.
The following are elements that define named constructs in schemas. Named constructs are referred to with a QName by other schema elements.
Element | Description
attribute | Declares an attribute.
attributeGroup | Groups a set of attribute declarations so that they can be incorporated as a group for complex type definitions.
complexType | Defines a complex type, which determines the set of attributes and the content of an element.
element | Declares an element.
group | Groups a set of element declarations so that they can be incorporated as a group into complex type definitions.
key | Specifies that an attribute or element value (or set of values) must be a key within the specified scope. The scope of a key is the containing element in an instance document. A key must be unique, non-nillable, and always present.
keyref | Specifies that an attribute or element value (or set of values) corresponds to those of the specified key or unique element.
notation | Contains the definition of a notation to describe the format of non-XML data within an XML document. An XML Schema notation declaration is a reconstruction of XML 1.0 NOTATION declarations.
simpleType | Defines a simple type, which determines the constraints on and information about the values of attributes or elements with text-only content.
unique | Specifies that an attribute or element value (or a combination of attribute or element values) must be unique within the specified scope. The value must be unique or nil.
The following are elements that create complex type definitions.
Element | Description
all | Allows the elements in the group to appear (or not appear) in any order in the containing element.
annotation | Defines an annotation.
any | Enables any element from the specified namespace(s) to appear in the containing sequence or choice element.
anyAttribute | Enables any attribute from the specified namespace(s) to appear in the containing complexType element or in the containing attributeGroup element.
appinfo | Specifies information to be used by applications within an annotation element.
attribute | Declares an attribute.
attributeGroup | Groups a set of attribute declarations so that they can be incorporated as a group for complex type definitions.
choice | Allows one and only one of the elements contained in the selected group to be present within the containing element.
complexContent | Contains extensions or restrictions on a complex type that contains mixed content or elements only.
documentation | Specifies information to be read or used by users within an annotation element.
element | Declares an element.
extension (simpleContent) | Contains extensions on simpleContent. This extends a simple type or a complex type that has simple content by adding specified attribute(s), attribute group(s) or anyAttribute.
extension (complexContent) | Contains extensions on complexContent.
group | Groups a set of element declarations so that they can be incorporated as a group into complex type definitions.
restriction (simpleContent) | Defines constraints on a simpleContent definition.
restriction (complexContent) | Defines constraints on a complexContent definition.
sequence | Requires the elements in the group to appear in the specified sequence within the containing element.
simpleContent | Contains extensions or restrictions on a complexType element with character data or a simpleType element as content and contains no elements.
The following are elements that create simple type definitions.
Element | Description
annotation | Defines an annotation.
appinfo | Specifies information to be used by applications within an annotation element.
documentation | Specifies information to be read or used by users within an annotation element.
element | Declares an element.
list | Defines a collection of a single simpleType definition.
restriction (simpleType) | Defines constraints on a simpleType definition.
union | Defines a collection of multiple simpleType definitions.
The following table lists primitive XML schema data types, facets that can be applied to the data type, and a description of the data type.
Facets can only appear once in a type definition except for enumeration and pattern facets. Enumeration and pattern facets can have multiple entries and are grouped together.
Data Type | Facets | Description
string | length, pattern, maxLength, minLength, enumeration, whiteSpace | Represents character strings.
boolean | pattern, whiteSpace | Represents Boolean values, which are either true or false.
decimal | enumeration, pattern, totalDigits, fractionDigits, minInclusive, minExclusive, maxInclusive, maxExclusive, whiteSpace | Represents arbitrary precision numbers.
float | pattern, enumeration, minInclusive, minExclusive, maxInclusive, maxExclusive, whiteSpace | Represents single-precision 32-bit floating-point numbers.
double | pattern, enumeration, minInclusive, minExclusive, maxInclusive, maxExclusive, whiteSpace | Represents double-precision 64-bit floating-point numbers.
duration | enumeration, pattern, minInclusive, minExclusive, maxInclusive, maxExclusive, whiteSpace | Represents a duration of time. The pattern for duration is PnYnMnDTnHnMnS, where nY, nM, and nD give the number of years, months, and days, T separates the date and time parts, and nH, nM, and nS give the number of hours, minutes, and seconds.
dateTime | enumeration, pattern, minInclusive, minExclusive, maxInclusive, maxExclusive, whiteSpace | Represents a specific instance of time. The pattern for dateTime is CCYY-MM-DDThh:mm:ss. This representation may be immediately followed by a "Z" to indicate Coordinated Universal Time (UTC) or, to indicate the time zone, by the difference between the local time and UTC: a sign, + or -, followed by the difference represented as hh:mm.
time | enumeration, pattern, minInclusive, minExclusive, maxInclusive, maxExclusive, whiteSpace | Represents an instance of time that recurs every day. The pattern for time is hh:mm:ss.
date | enumeration, pattern, minInclusive, minExclusive, maxInclusive, maxExclusive, whiteSpace | Represents a calendar date. The pattern for date is CCYY-MM-DD.
gYearMonth | enumeration, pattern, minInclusive, minExclusive, maxInclusive, maxExclusive, whiteSpace | Represents a specific Gregorian month in a specific Gregorian year. A set of one-month long, nonperiodic instances. The pattern for gYearMonth is CCYY-MM.
gYear | enumeration, pattern, minInclusive, minExclusive, maxInclusive, maxExclusive, whiteSpace | Represents a Gregorian year. A set of one-year long, nonperiodic instances. The pattern for gYear is CCYY.
gMonthDay | enumeration, pattern, minInclusive, minExclusive, maxInclusive, maxExclusive, whiteSpace | Represents a specific Gregorian date that recurs, specifically a day of the year such as the third of May. A gMonthDay is the set of calendar dates. Specifically, it is a set of one-day long, annually periodic instances. The pattern for gMonthDay is --MM-DD.
gDay | enumeration, pattern, minInclusive, minExclusive, maxInclusive, maxExclusive, whiteSpace | Represents a Gregorian day that recurs, specifically a day of the month such as the fifth day of the month. A gDay is the space of a set of calendar dates. Specifically, it is a set of one-day long, monthly periodic instances. The pattern for gDay is ---DD.
gMonth | enumeration, pattern, minInclusive, minExclusive, maxInclusive, maxExclusive, whiteSpace | Represents a Gregorian month that recurs every year. A gMonth is the space of a set of calendar months. Specifically, it is a set of one-month long, yearly periodic instances. The pattern for gMonth is --MM.
hexBinary | length, pattern, maxLength, minLength, enumeration, whiteSpace | Represents arbitrary hex-encoded binary data. A hexBinary is the set of finite-length sequences of binary octets. Each binary octet is encoded as a character tuple, consisting of two hexadecimal digits ([0-9a-fA-F]) representing the octet code.
base64Binary | length, pattern, maxLength, minLength, enumeration, whiteSpace | Represents Base64-encoded arbitrary binary data. A base64Binary is the set of finite-length sequences of binary octets.
anyURI | length, pattern, maxLength, minLength, enumeration, whiteSpace | Represents a URI as defined by RFC 2396. An anyURI value can be absolute or relative, and may have an optional fragment identifier.
QName | length, enumeration, pattern, maxLength, minLength, whiteSpace | Represents a qualified name. A qualified name is composed of a prefix and a local name separated by a colon. Both the prefix and local names must be an NCName. The prefix must be associated with a namespace URI reference, using a namespace declaration.
NOTATION | length, enumeration, pattern, maxLength, minLength, whiteSpace | Represents a NOTATION attribute type. A set of QNames.
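The lexical forms of date (CCYY-MM-DD) and dateTime (CCYY-MM-DDThh:mm:ss with an optional time zone offset) coincide with ISO 8601, so they can be read with the Python standard library. This is a minimal sketch: the sample values are illustrative, and fromisoformat is a convenient approximation rather than a full implementation of the schema types.

```python
from datetime import date, datetime, timedelta

# xsd:date value, as used for documentDate in the mortgage example.
document_date = date.fromisoformat("2005-03-02")

# xsd:dateTime value with a time zone given as a difference from UTC.
stamp = datetime.fromisoformat("2005-03-02T09:30:00-05:00")

def is_schema_date(text: str) -> bool:
    """Rough check that text has the CCYY-MM-DD form and names a real calendar date."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False

print(document_date.year, document_date.month, document_date.day)
print(stamp.utcoffset() == timedelta(hours=-5))
print(is_schema_date("03/02/2005"))
```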
The following table lists derived XML schema data types, facets that can be applied to the derived data type, and a description of the derived data type.
Data Type |
Facets |
Description |
normalizedString |
length, pattern, maxLength, minLength,
enumeration, whiteSpace |
Represents white space normalized strings.
This data type is derived from string. |
token |
enumeration, pattern, length, minLength,
maxLength, whiteSpace |
Represents tokenized strings. This data
type is derived from normalizedString. |
language |
length, pattern, maxLength, minLength,
enumeration, whiteSpace |
Represents natural language identifiers
(defined by RFC 1766). This data type is derived from token. |
IDREFS |
length, maxLength, minLength, enumeration,
whiteSpace |
Represents the IDREFS attribute
type. Contains a set of values of type IDREF. |
ENTITIES |
length, maxLength, minLength, enumeration,
whiteSpace |
Represents the ENTITIES attribute
type. Contains a set of values of type ENTITY. |
NMTOKEN |
length, pattern, maxLength, minLength,
enumeration, whiteSpace |
Represents the NMTOKEN attribute
type. An NMTOKEN is a set of name characters (letters, digits, and other
characters) in any combination. Unlike Name and NCName, NMTOKEN
has no restrictions on the starting character. This data type is derived from
token. |
NMTOKENS |
length, maxLength, minLength, enumeration,
whiteSpace |
Represents the NMTOKENS attribute
type. Contains a set of values of type NMTOKEN. |
Name |
length, pattern, maxLength, minLength,
enumeration, whiteSpace |
Represents names in XML. A Name is a
token that begins with a letter, underscore, or colon and continues with name
characters (letters, digits, and other characters). This data type is derived
from token. |
NCName |
length, pattern, maxLength, minLength,
enumeration, whiteSpace |
Represents noncolonized names. This data
type is the same as Name, except it cannot contain a colon. This
data type is derived from Name. |
ID |
length, enumeration, pattern, maxLength,
minLength, whiteSpace |
Represents the ID attribute type
defined in the XML 1.0 Recommendation. The ID must be a no-colon-name
(NCName) and must be unique within an XML document. This data type is derived
from NCName. |
IDREF |
length, enumeration, pattern, maxLength,
minLength, whiteSpace |
Represents a reference to an element that
has an ID attribute that matches the specified ID. An IDREF
must be an NCName and must be a value of an element or attribute of type ID
within the XML document. This data type is derived from NCName. |
ENTITY |
length, enumeration, pattern, maxLength,
minLength, whiteSpace |
Represents the ENTITY attribute type
in XML 1.0 Recommendation. This is a reference to an unparsed entity with a
name that matches the specified name. An ENTITY must be an NCName and
must be declared in the schema as an unparsed entity name. This data type is
derived from NCName. |
integer |
enumeration, fractionDigits, pattern,
minInclusive, minExclusive, maxInclusive, maxExclusive, totalDigits,
whiteSpace |
Represents a sequence of decimal digits
with an optional leading sign (+ or -). This data type is derived from decimal. |
nonPositiveInteger |
enumeration, fractionDigits, pattern,
minInclusive, minExclusive, maxInclusive, maxExclusive, totalDigits, whiteSpace |
Represents an integer that is less than or
equal to zero. A nonPositiveInteger consists of a negative sign (-)
and sequence of decimal digits. This data type is derived from integer. |
negativeInteger |
enumeration, fractionDigits, pattern,
minInclusive, minExclusive, maxInclusive, maxExclusive, totalDigits,
whiteSpace |
Represents an integer that is less than
zero. Consists of a negative sign (-) and sequence of decimal digits. This
data type is derived from nonPositiveInteger. |
long |
enumeration, fractionDigits, pattern,
minInclusive, minExclusive, maxInclusive, maxExclusive, totalDigits,
whiteSpace |
Represents an integer with a minimum value
of -9223372036854775808 and maximum of 9223372036854775807. This data type is
derived from integer. |
int |
enumeration, fractionDigits, pattern,
minInclusive, minExclusive, maxInclusive, maxExclusive, totalDigits,
whiteSpace |
Represents an integer with a minimum value
of -2147483648 and maximum of 2147483647. This data type is derived from long. |
short |
enumeration, fractionDigits, pattern,
minInclusive, minExclusive, maxInclusive, maxExclusive, totalDigits,
whiteSpace |
Represents an integer with a minimum value
of -32768 and maximum of 32767. This data type is derived from int. |
byte |
enumeration, fractionDigits, pattern,
minInclusive, minExclusive, maxInclusive, maxExclusive, totalDigits,
whiteSpace |
Represents an integer with a minimum value
of -128 and maximum of 127. This data type is derived from short. |
nonNegativeInteger |
enumeration, fractionDigits, pattern,
minInclusive, minExclusive, maxInclusive, maxExclusive, totalDigits,
whiteSpace |
Represents an integer that is greater than
or equal to zero. This data type is derived from integer. |
unsignedLong |
enumeration, fractionDigits, pattern,
minInclusive, minExclusive, maxInclusive, maxExclusive, totalDigits,
whiteSpace |
Represents an integer with a minimum of
zero and maximum of 18446744073709551615. This data type is derived from nonNegativeInteger. |
unsignedInt |
enumeration, fractionDigits, pattern,
minInclusive, minExclusive, maxInclusive, maxExclusive, totalDigits,
whiteSpace |
Represents an integer with a minimum of
zero and maximum of 4294967295. This data type is derived from unsignedLong. |
unsignedShort |
enumeration, fractionDigits, pattern,
minInclusive, minExclusive, maxInclusive, maxExclusive, totalDigits,
whiteSpace |
Represents an integer with a minimum of
zero and maximum of 65535. This data type is derived from unsignedInt. |
unsignedByte |
enumeration, fractionDigits, pattern,
minInclusive, minExclusive, maxInclusive, maxExclusive, totalDigits,
whiteSpace |
Represents an integer with a minimum of
zero and maximum of 255. This data type is derived from unsignedShort. |
positiveInteger |
enumeration, fractionDigits, pattern,
minInclusive, minExclusive, maxInclusive, maxExclusive, totalDigits,
whiteSpace |
Represents an integer that is greater than
zero. This data type is derived from nonNegativeInteger. |
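The bounded integer types in the table above follow the usual two's-complement and unsigned ranges. The sketch below derives those limits from bit widths and checks a lexical value against them. The bit widths and the names `bounds` and `in_range` are assumptions made for illustration; the table itself only lists the minimum and maximum values, and this is not a schema validator.

```python
# Bit widths are an assumption used to derive the limits listed in the table.
SIGNED_BITS = {"byte": 8, "short": 16, "int": 32, "long": 64}
UNSIGNED_BITS = {
    "unsignedByte": 8,
    "unsignedShort": 16,
    "unsignedInt": 32,
    "unsignedLong": 64,
}


def bounds(type_name):
    """Return (minimum, maximum) for one of the bounded integer types."""
    if type_name in SIGNED_BITS:
        bits = SIGNED_BITS[type_name]
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    if type_name in UNSIGNED_BITS:
        return 0, 2 ** UNSIGNED_BITS[type_name] - 1
    raise KeyError(type_name)


def in_range(type_name, text):
    """Check whether the integer written in `text` fits the named type."""
    low, high = bounds(type_name)
    return low <= int(text) <= high
```

The derivation chain in the table (long, int, short, byte) shows up here as progressively narrower ranges: every value accepted for byte is also accepted for short, int and long.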
o Simple Types – used for an element that contains only document content (character data, with no child elements or attributes)
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="today" type="xsd:date" />
  <xsd:element name="user" type="xsd:string" />
</xsd:schema> |
o Defining simple types – take an existing simple type and apply a restriction using a facet
o Facets – rules applied to a base type to constrain its set of valid values
Example 1: Defining myInteger, Range 10000-99999
<xsd:element name="workingInts" type="myInteger" />
<xsd:simpleType name="myInteger">
  <xsd:restriction base="xsd:integer">
    <xsd:minInclusive value="10000"/>
    <xsd:maxInclusive value="99999"/>
  </xsd:restriction>
</xsd:simpleType> |
Example 2: Using the Enumeration Facet
<xsd:element name="USA" type="USState" />
<xsd:simpleType name="USState">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="AK"/>
    <xsd:enumeration value="AL"/>
    <xsd:enumeration value="AR"/>
    <!-- and so on ... -->
  </xsd:restriction>
</xsd:simpleType> |
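To make the effect of the facets in Examples 1 and 2 concrete, here is a minimal sketch of the checks a validator would perform for myInteger and USState. The function names are hypothetical, and the state set holds only the three values written out in Example 2 (the "and so on" part is left out).

```python
# Only the enumeration values that appear in Example 2; the rest are elided there.
US_STATES = {"AK", "AL", "AR"}


def is_my_integer(text):
    """minInclusive=10000 and maxInclusive=99999 applied to an integer base."""
    try:
        value = int(text)
    except ValueError:
        return False  # not an integer at all, so the base type already fails
    return 10000 <= value <= 99999


def is_us_state(text):
    """Enumeration facet: the value must be one of the listed strings."""
    return text in US_STATES
```

Both facets only narrow the base type: a value must first be a valid integer (or string) and then also pass the extra rule.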
o Complex Types – defined using the complexType element
o May include subelements, element content and attributes
Sequence - Requires the elements in the group to appear in the specified sequence within the containing element.
Example 1: Defining the
USAddress Type <xsd:complexType name="USAddress" > <xsd:sequence> <xsd:element name="name" type="xsd:string"/> <xsd:element name="street" type="xsd:string"/> <xsd:element name="city" type="xsd:string"/> <xsd:element name="state" type="xsd:string"/> <xsd:element name="zip" type="xsd:decimal"/> </xsd:sequence> <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/> </xsd:complexType> |
Example 2: Defining
PurchaseOrderType <xsd:complexType name="PurchaseOrderType"> <xsd:sequence> <xsd:element name="shipTo" type="USAddress"/> <xsd:element name="billTo" type="USAddress"/> <xsd:element ref="comment" minOccurs="0"/> <xsd:element name="items" type="Items"/> </xsd:sequence> <xsd:attribute name="orderDate" type="xsd:date"/> </xsd:complexType> |
o XML Schemas can specify the types of attributes
o Declaring: <xsd:attribute name="orderDate" type="xsd:date"/>
o Declared inside PurchaseOrderType as in the example above, this means that every element of type PurchaseOrderType supports the orderDate attribute.
o The ref attribute references an existing element declaration instead of declaring a new one
o e.g. <xsd:element ref="comment" minOccurs="0"/>
Compositors
o Sequence - Requires the elements in the group to appear in the specified sequence within the containing element.
The root element is named
"AAA", from null namespace and contains one "BBB"
element, followed by one "CCC" element. Use the
"sequence" pattern to specify exact order of the elements. The
attributes "minOccurs" and "maxOccurs" are not necessary,
because their default value is 1. <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
|
Valid
Document <AAA xsi:noNamespaceSchemaLocation="correct_0.xsd"
xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> |
o Restriction – limits the range of values
Here the value of the element "root" must be an integer less than 25.
|
Valid document: <root xsi:noNamespaceSchemaLocation="correct_0.xsd"
xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>24</root> |
o All – sets up an unordered set of elements
The root element is named "AAA", from null namespace and contains one "BBB" and one "CCC" element. Their order is not important. <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
|
Valid
document: <AAA xsi:noNamespaceSchemaLocation="correct_0.xsd"
xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> |
o Choice – creates a set of alternative elements, only one of which may appear
The root element is named
"AAA", from null namespace and contains either "BBB" or
"CCC" elements (but not both). Use the "choice" element. <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
|
Valid
Document: <AAA xsi:noNamespaceSchemaLocation="correct_0.xsd"
xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> |
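The three element compositors described above (sequence, all, choice) differ only in which arrangements of "BBB" and "CCC" they accept inside "AAA". The sketch below imitates those rules with the standard library's ElementTree. It is not a schema validator, and the function names and the sample documents in the usage note are hypothetical, since the valid documents in these notes are shown only by their opening tag.

```python
import xml.etree.ElementTree as ET


def children_of_aaa(xml_text):
    """Return the child element names of the root, or None if the root is not AAA."""
    root = ET.fromstring(xml_text)
    if root.tag != "AAA":
        return None
    return [child.tag for child in root]


def matches_sequence(xml_text):
    """sequence: exactly one BBB followed by exactly one CCC."""
    return children_of_aaa(xml_text) == ["BBB", "CCC"]


def matches_all(xml_text):
    """all: one BBB and one CCC, in either order."""
    names = children_of_aaa(xml_text)
    return names is not None and sorted(names) == ["BBB", "CCC"]


def matches_choice(xml_text):
    """choice: either one BBB or one CCC, but not both."""
    return children_of_aaa(xml_text) in (["BBB"], ["CCC"])
```

For instance, `<AAA><CCC/><BBB/></AAA>` passes `matches_all` but fails `matches_sequence`, while `<AAA><BBB/></AAA>` passes only `matches_choice`.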
o List – a whitespace-separated list of values of one simple type
Now, we want the "root" element to contain a list of three integers. We will define a general list (element "list") of integers and then restrict it (element "restriction") to have an exact length (element "length") of three items. <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
|
Valid
Document: <root xsi:noNamespaceSchemaLocation="correct_0.xsd"
xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>0 0 1</root> |
o Union
The value of the element "root" must lie in the range 0-100 or 300-400 (including the border values). We will make a union of the two intervals. <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
|
Valid
Document: <root xsi:noNamespaceSchemaLocation="correct_0.xsd"
xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>50</root> |
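The list and union constraints above can be pictured as plain checks on the text content of "root". This is a rough sketch under the assumptions stated in the descriptions (a list of exactly three integers; a value in 0-100 or 300-400, bounds included). The function names are hypothetical, and nothing here reads an actual schema.

```python
def is_integer(text):
    """True if `text` is written as an integer."""
    try:
        int(text)
    except ValueError:
        return False
    return True


def is_three_integer_list(text):
    """list + length=3: three whitespace-separated integers."""
    items = text.split()
    return len(items) == 3 and all(is_integer(item) for item in items)


def in_union_of_ranges(text):
    """union: the value is valid if it fits either member range."""
    if not is_integer(text):
        return False
    value = int(text)
    return 0 <= value <= 100 or 300 <= value <= 400
```

With these checks, the content `0 0 1` of the list example and the content `50` of the union example are both accepted, as the notes state.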
o Group
A group defines a set of common attributes that can be reused. The root element is named "root"; it must contain the "aaa" and "bbb" elements, and these elements must have the attributes "x" and "y". <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
|
Valid
Document: <root xsi:noNamespaceSchemaLocation="correct_0.xsd"
xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> |
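As a rough illustration of what the shared attributes require of an instance document, the sketch below checks that "root" contains "aaa" and "bbb" and that each carries both "x" and "y". The function name and the sample document in the usage note are hypothetical; the check uses ElementTree only and does not read the schema.

```python
import xml.etree.ElementTree as ET

COMMON_ATTRIBUTES = ("x", "y")   # the attributes shared through the group
REQUIRED_CHILDREN = ("aaa", "bbb")


def has_common_attributes(xml_text):
    """True if root holds aaa and bbb, each with both common attributes."""
    root = ET.fromstring(xml_text)
    if root.tag != "root":
        return False
    for name in REQUIRED_CHILDREN:
        child = root.find(name)
        if child is None:
            return False
        if any(attr not in child.attrib for attr in COMMON_ATTRIBUTES):
            return False
    return True
```

A document such as `<root><aaa x="1" y="2"/><bbb x="3" y="4"/></root>` passes; dropping "y" from either child makes the check fail.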
Tutorial 1: http://www.w3schools.com/css/default.asp
Tutorial 2: http://www.tizag.com/cssT/
CSS – Cascading Style Sheets
· Styling technology from HTML that can also be used to format XML
· Levels: CSS1, CSS2
· Style sheets are collections of style rules for formatting XML content marked up by tags
e.g.
title {display: block; font-size: 36pt; font-weight: bold; text-align: center; text-decoration: underline}
Which XML elements to format { how to format }
selector { property: value; property: value; ... }
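To see the selector { property: value; ... } shape in action, here is a minimal sketch that splits one rule into its selector and its declarations. The function name `parse_rule` is hypothetical, and the code handles only a single simple rule like the title example; it is not a CSS parser (no comments, nested braces or multiple rules).

```python
def parse_rule(rule):
    """Split 'selector { prop: value; ... }' into (selector, declarations)."""
    selector, _, body = rule.partition("{")
    body = body.rsplit("}", 1)[0]          # drop the closing brace
    declarations = {}
    for declaration in body.split(";"):
        if ":" not in declaration:
            continue                        # skip empty pieces
        prop, _, value = declaration.partition(":")
        declarations[prop.strip()] = " ".join(value.split())
    return selector.strip(), declarations


TITLE_RULE = (
    "title {display: block; font-size: 36pt; font-weight: bold; "
    "text-align: center; text-decoration: underline}"
)
selector, declarations = parse_rule(TITLE_RULE)
```

Here `selector` names the XML element to format and `declarations` maps each property to its value, which is exactly the two-part reading of a style rule given above.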
e.g. from Holzner
Style sheet: ch08_02.css
title {display: block; font-size: 36pt; font-weight: bold; text-align: center; text-decoration: underline}
philosopher {display: block; font-size: 16pt; text-align: center}
book {display: block; font-size: 28pt; text-align: center; font-style: italic}
paragraph {display: block; margin-top: 10} |
XML document: <?xml version="1.0" standalone="yes"?> <?xml-stylesheet
type="text/css" href="ch08_02.css"?> <document> <title>The Discourses</title> <philosopher>Epictetus</philosopher> <book>Book Four</book> <paragraph> He is free who
lives as he wishes to live; who is neither subject to
compulsion nor to hindrance, nor to force; whose movements to
action are not impeded, whose desires attain their
purpose, and who does not fall into that which he would avoid. </paragraph> <paragraph> Who, then, chooses
to live in error? No man. Who chooses to live deceived,
liable to mistake, unjust, unrestrained, discontented, mean?
No man. </paragraph> <paragraph> Not one then of the
bad lives as he wishes; nor is he, then, free. And who
chooses to live in sorrow, fear, envy, pity, desiring and
failing in his desires, attempting to avoid something and
falling into it? Not one. </paragraph> <paragraph> Do we then find any
of the bad free from sorrow, free from fear, who does not
fall into that which he would avoid, and does not obtain
that which he wishes? Not one; nor then do we find any bad man
free. </paragraph> </document> |
|
Background Color:
<html> <head> <style type="text/css"> body {background-color: yellow} h1 {background-color: #00ff00} h2 {background-color: transparent} p {background-color: rgb(250,0,255)} </style> </head> <body> <h1>This is header 1</h1> <h2>This is header 2</h2> <p>This is a paragraph</p> </body> </html> |
Text: Color
<html> <head> <style
type="text/css"> h1
{color: #00ff00} h2
{color: #dda0dd} p
{color: rgb(0,0,255)} </style>
</head> <body> <h1>This is header 1</h1> <h2>This is header 2</h2> <p>This is a paragraph</p> </body> </html> |
Text: Alignment
<html> <head> <style
type="text/css"> h1
{text-align: center} h2
{text-align: left} h3
{text-align: right} </style>
</head> <body> <h1>This is header 1</h1> <h2>This is header 2</h2> <h3>This is header 3</h3> </body> </html> |
Font: Family
<html> <head> <style
type="text/css"> h3
{font-family: times} p
{font-family: courier} p.sansserif
{font-family: sans-serif} </style> </head> <body> <h3>This
is header 3</h3> <p> This
is a paragraph</p> <p
class="sansserif"> This
is a paragraph</p> </body> </html>
|
Font: Size
<html> <head> <style
type="text/css"> h1
{font-size: 150%} h2
{font-size: 20px} p
{font-size: x-large} </style> </head> <body> <h1>This
is header 1</h1> <h2>This
is header 2</h2> <p>This
is a paragraph</p> </body> </html> |
CSS classes can give the same HTML element multiple renderings
<html> <head> <style> p.first
{ background-color: gray; color:
blue;} p.second
{ background-color: red; } p.third
{ background: purple; color:
yellow; } </style> </head> <body> <h2>CSS
Classes</h2> <p
class="first">This is the p.first paragraph</p> <p
class="second">This is the p.second paragraph</p> <p
class="third">This is the p.third paragraph</p> </body> </html> |
Borders:
<html> <head> <style> p.solid
{border-style: solid; } p.double
{border-style: double; } p.groove
{border-style: groove; } p.dotted
{border-style: dotted; } p.dashed
{border-style: dashed; } p.inset
{border-style: inset; } p.outset
{border-style: outset; } p.ridge
{border-style: ridge; } p.hidden
{border-style: hidden; } </style> </head> <body> <p
class="solid">This is the solid style</p> <p
class="double">This is the double style</p> <p
class="groove">This is the groove style</p> <p
class="dotted">This is the dotted style</p> <p
class="dashed">This is the dashed style</p> <p
class="inset">This is the inset style</p> <p
class="outset">This is the outset style</p> <p
class="ridge">This is the ridge style</p> <p
class="hidden">This is the hidden style</p> </body> </html>
|
Padding: changes the default padding that appears inside various HTML elements (paragraphs, tables, etc.).
<html> <head> <style
type="text/css"> td
{padding: 1.5cm} td.twovalues
{padding: 0.5cm 2.5cm} </style> </head> <body> <table
border="1"> <tr> <td> This
is a tablecell with padding on each side </td> </tr> </table> <br> <table
border="1"> <tr> <td
class="twovalues"> This
is a tablecell with padding on each side. The top and bottom padding have the
same value (0.5cm), while the left and right padding have another value (2.5cm) </td> </tr> </table> </body> </html>
|
Margins: define the space around
elements.
<html> <head> <style
type="text/css"> p.margin
{margin: 2cm 4cm 3cm 4cm} </style> </head> <body> <p> This
is a paragraph </p> <p
class="margin"> This
is a paragraph with margins </p> <p> This
is a paragraph </p> </body> </html> |
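The padding and margin examples above use the shorthand form with one, two and four values. The sketch below expands such a shorthand into its four sides following the standard CSS ordering (top, right, bottom, left). The function name `expand_box_shorthand` is hypothetical and the code does no unit checking.

```python
def expand_box_shorthand(value):
    """Expand a padding/margin shorthand into top, right, bottom and left."""
    parts = value.split()
    if len(parts) == 1:            # one value: all four sides
        top = right = bottom = left = parts[0]
    elif len(parts) == 2:          # two values: vertical, horizontal
        top = bottom = parts[0]
        right = left = parts[1]
    elif len(parts) == 3:          # three values: top, horizontal, bottom
        top, right, bottom = parts
        left = right
    elif len(parts) == 4:          # four values: clockwise from the top
        top, right, bottom, left = parts
    else:
        raise ValueError("expected one to four values")
    return {"top": top, "right": right, "bottom": bottom, "left": left}
```

Applied to the examples, `padding: 0.5cm 2.5cm` gives 0.5cm on the top and bottom and 2.5cm on the left and right, and `margin: 2cm 4cm 3cm 4cm` gives 2cm on top, 4cm on the right, 3cm on the bottom and 4cm on the left.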