3.3 JAXP: Java API for XML Processing

Dec 2, 2013

SDPL 2002

Notes 3: XML Processor Interfaces


How can applications use XML processors?

A Java-based answer: through JAXP

An overview of the JAXP interface
» What does it specify?
» What can be done with it?
» How do the JAXP components fit together?

[Partly based on the tutorial “An Overview of the APIs” available at
http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/overview/3_apis.html, from which some graphics are also borrowed]

JAXP 1.1


An interface for “plugging in” and using XML processors in Java applications

Includes packages
» org.xml.sax: SAX 2.0 interface
» org.w3c.dom: DOM Level 2 interface
» javax.xml.parsers: initialization and use of parsers
» javax.xml.transform: initialization and use of transformers (XSLT processors)

Included in JDK starting from version 1.4

JAXP: XML processor plugin (1)


Vendor-independent method for selecting the processor implementation at run time
» principally through the system properties javax.xml.parsers.SAXParserFactory, javax.xml.parsers.DocumentBuilderFactory, and javax.xml.transform.TransformerFactory

For example:

    System.setProperty(
        "javax.xml.parsers.DocumentBuilderFactory",
        "com.icl.saxon.om.DocumentBuilderFactoryImpl");

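The property-based selection can be tried without a third-party parser on the class path. The sketch below assumes only the JAXP implementation bundled with the JDK: it looks up the default factory class and then names that same class through the system property, as a stand-in for a vendor class such as the Saxon one above. The class name FactorySelection and its methods are illustrative, not part of JAXP.

```java
import javax.xml.parsers.DocumentBuilderFactory;

public class FactorySelection {
    public static final String PROPERTY = "javax.xml.parsers.DocumentBuilderFactory";

    /** Class name of the factory that JAXP selects with the current settings. */
    public static String selectedFactory() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    /** Names a factory class through the system property, then looks it up again. */
    public static String selectWithProperty(String factoryClassName) {
        String previous = System.getProperty(PROPERTY);
        System.setProperty(PROPERTY, factoryClassName);
        try {
            return selectedFactory();
        } finally {
            // Restore the earlier setting so that other code is not affected.
            if (previous == null) {
                System.clearProperty(PROPERTY);
            } else {
                System.setProperty(PROPERTY, previous);
            }
        }
    }

    public static void main(String[] args) {
        String defaultFactory = selectedFactory();
        System.out.println("Default factory:  " + defaultFactory);
        System.out.println("Via the property: " + selectWithProperty(defaultFactory));
    }
}
```

With a real alternative processor on the class path, its factory class name would be passed instead of the default one.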
JAXP: XML processor plugin (2)


By default, the reference implementations are used
» Apache Crimson/Xerces as the XML parser
» Apache Xalan as the XSLT processor

Currently supported only by a few compliant XML processors:
» Parsers: Apache Crimson and Xerces, Aelfred
» XSLT transformers: Apache Xalan, Saxon
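Which processors a given Java runtime plugs in by default depends on its vendor and version, so it may be worth checking rather than assuming. A minimal sketch (the class name DefaultProcessors is illustrative) that reports the implementation classes behind the three JAXP factories:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

public class DefaultProcessors {
    /** Implementation class names of the SAX, DOM and XSLT factories currently in effect. */
    public static String[] implementations() {
        return new String[] {
            SAXParserFactory.newInstance().getClass().getName(),
            DocumentBuilderFactory.newInstance().getClass().getName(),
            TransformerFactory.newInstance().getClass().getName()
        };
    }

    public static void main(String[] args) {
        // The printed names differ between Java runtimes.
        for (String name : implementations()) {
            System.out.println(name);
        }
    }
}
```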

JAXP: Functionality


Parsing using SAX 2.0 or DOM Level 2

Transformation using XSLT
» (We’ll perform stand-alone transformations later)

Settles features left unspecified in SAX 2.0 and DOM Level 2
» control of parser validation and error handling
» creation and saving of DOM Document objects
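DOM Level 2 itself does not say how an application obtains an empty Document; in JAXP the DocumentBuilder provides it. A minimal sketch, in which the class name NewDocumentSketch and the element names are made up for illustration:

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class NewDocumentSketch {
    /** Builds a small DOM tree from scratch: <notes><note>text</note></notes>. */
    public static Document build(String text) throws ParserConfigurationException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();   // empty Document, created through JAXP
        Element root = doc.createElement("notes");
        Element note = doc.createElement("note");
        note.appendChild(doc.createTextNode(text));
        root.appendChild(note);
        doc.appendChild(root);
        return doc;
    }

    public static void main(String[] args) throws ParserConfigurationException {
        Document doc = build("created through JAXP");
        System.out.println(doc.getDocumentElement().getTagName());   // prints notes
    }
}
```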

JAXP Parsing API


Included in the JAXP package javax.xml.parsers

Used for invoking and using SAX and DOM parser implementations:

    SAXParserFactory spf = SAXParserFactory.newInstance();

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

JAXP: Using an SAX parser (1)

[Figure omitted; surviving labels: XML, getXMLReader]

JAXP: Using an SAX parser (2)


We’ve already used this:

    // needs javax.xml.parsers.SAXParser, javax.xml.parsers.SAXParserFactory
    // and org.xml.sax.XMLReader
    SAXParserFactory spf = SAXParserFactory.newInstance();
    try {
        SAXParser saxParser = spf.newSAXParser();
        XMLReader xmlReader = saxParser.getXMLReader();
    } catch (Exception e) {
        System.err.println(e.getMessage());
        System.exit(1);
    }
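Once the XMLReader is available, the application registers its handlers and starts the parse. The following sketch counts element start tags; the class name SaxElementCounter and the sample input are illustrative only.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxElementCounter {
    /** Counts the start tags in the given XML text with a SAX ContentHandler. */
    public static int countElements(String xml) throws Exception {
        final int[] count = {0};
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;   // called once for every start tag
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><b/></a>"));   // prints 3
    }
}
```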


JAXP: Using a DOM parser (1)

[Figure omitted; surviving labels: f.xml, parse("f.xml"), newDocument()]

JAXP: Using a DOM parser (2)


We’ve used this, too:

    // needs javax.xml.parsers.DocumentBuilder, DocumentBuilderFactory
    // and ParserConfigurationException
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
        // to get a new DocumentBuilder:
        DocumentBuilder builder = dbf.newDocumentBuilder();
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
        System.exit(1);
    }
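With the DocumentBuilder in hand, parsing yields a DOM Document directly. A minimal sketch that parses XML text from a string instead of a file such as f.xml; the class name DomParseSketch and the sample input are illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DomParseSketch {
    /** Parses the given XML text into a DOM Document. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<a><b/><b/></a>");
        System.out.println(doc.getDocumentElement().getTagName());       // prints a
        System.out.println(doc.getElementsByTagName("b").getLength());   // prints 2
    }
}
```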


DOM building in JAXP

[Figure omitted; surviving labels: XML, XML Reader (SAX Parser), Error Handler, DTD Handler, Entity Resolver, Document Builder (Content Handler), DOM Document]

DOM on top of SAX - So what?

JAXP: Controlling parsing


Errors of DOM parsing can be handled
» by creating a SAX ErrorHandler, which implements the error, fatalError and warning methods, and passing it with setErrorHandler to the DocumentBuilder

Validation and namespace processing can be controlled, both for SAXParserFactories and DocumentBuilderFactories, with setValidating(boolean) and setNamespaceAware(boolean)
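These controls can be sketched together as follows. The example assumes a DTD given in the internal subset of the sample documents; the class ValidatingParseSketch and its CountingHandler are illustrative names, and a real application would more likely log or rethrow the errors than merely count them.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class ValidatingParseSketch {
    /** Counts what the parser reports instead of letting it print to System.err. */
    public static class CountingHandler implements ErrorHandler {
        public int warnings, errors, fatalErrors;
        public void warning(SAXParseException e) { warnings++; }
        public void error(SAXParseException e) { errors++; }             // e.g. validity errors
        public void fatalError(SAXParseException e) { fatalErrors++; }   // e.g. ill-formed input
    }

    /** Parses xml with validation and namespace processing on; returns the tallies. */
    public static CountingHandler check(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(true);
        dbf.setNamespaceAware(true);
        DocumentBuilder builder = dbf.newDocumentBuilder();
        CountingHandler handler = new CountingHandler();
        builder.setErrorHandler(handler);
        try {
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // The parser gives up after a fatal error; the handler has been told about it.
        }
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!DOCTYPE a [<!ELEMENT a EMPTY>]>";
        System.out.println(check(dtd + "<a/>").errors);            // valid document
        System.out.println(check(dtd + "<a><b/></a>").errors);     // invalid: a must be empty
        System.out.println(check("<a><b></a>").fatalErrors);       // not well-formed
    }
}
```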

JAXP Transformation API


Also known as TrAX

Allows an application to apply a Transformer to a Source document to get a Result document

A Transformer can be created
» from XSLT transformation instructions (to be discussed later)
» without instructions, which gives an identity transformation (simply copies Source to Result)
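As a preview of the first case, a Transformer can be created from a stylesheet supplied as a Source. The stylesheet below is a made-up minimal one that outputs the value of a "to" attribute as plain text; the class name XsltSketch and the sample input are also illustrative.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltSketch {
    // Hypothetical stylesheet: writes /greeting/@to as text.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'><xsl:value-of select='/greeting/@to'/></xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML text and returns the result as a string. */
    public static String transform(String xml) throws Exception {
        TransformerFactory tFactory = TransformerFactory.newInstance();
        Transformer transformer =
            tFactory.newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform("<greeting to='world'/>"));   // prints world
    }
}
```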


JAXP: Using Transformers (1)

[Figure omitted; surviving label: XSLT]

JAXP Transformation Packages


javax.xml.transform:
» Classes Transformer and TransformerFactory; initialization similar to parsers and parser factories

Transformation Source object can be
» a DOM tree, an SAX XMLReader or an I/O stream

Transformation Result object can be
» a DOM tree, an SAX ContentHandler or an I/O stream
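Because Source and Result kinds can be mixed freely, an identity transformation also works as a converter between representations. The sketch below reads XML text from a stream Source into a DOM tree Result; the class name StreamToDomSketch and the sample input are illustrative.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

public class StreamToDomSketch {
    /** Copies XML text into a DOM tree with an identity transformation. */
    public static Document toDom(String xml) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        DOMResult result = new DOMResult();   // no node given: a Document is created
        identity.transform(new StreamSource(new StringReader(xml)), result);
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        Document doc = toDom("<a><b/></a>");
        System.out.println(doc.getDocumentElement().getTagName());   // prints a
    }
}
```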

JAXP Transformation Packages (2)


Classes to create Source and Result objects from DOM, SAX and I/O streams are defined in the packages javax.xml.transform.dom, javax.xml.transform.sax, and javax.xml.transform.stream

An identity transformation from a DOM Document to an I/O stream is a vendor-neutral way to serialize DOM documents
» (the only option in JAXP)

Serializing a DOM Document as XML text


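Identity transformation to an I/O stream Result:

    // needs javax.xml.transform.Transformer, javax.xml.transform.TransformerFactory,
    // javax.xml.transform.dom.DOMSource and javax.xml.transform.stream.StreamResult;
    // myDOMdoc is an existing DOM Document
    TransformerFactory tFactory = TransformerFactory.newInstance();
    // Create an identity transformer:
    Transformer transformer = tFactory.newTransformer();
    DOMSource source = new DOMSource(myDOMdoc);
    StreamResult result = new StreamResult(System.out);
    transformer.transform(source, result);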
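The same technique can write to any stream or writer. The sketch below serializes into a string and sets the standard omit-xml-declaration output property so that only the element markup is produced; the class name DomToStringSketch and the sample document are illustrative.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomToStringSketch {
    /** Serializes a DOM Document as XML text with an identity transformation. */
    public static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    /** A sample document: <note>hi</note>. */
    public static Document sample() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element note = doc.createElement("note");
        note.appendChild(doc.createTextNode("hi"));
        doc.appendChild(note);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sample()));   // prints <note>hi</note>
    }
}
```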


Other Java APIs for XML


JDOM
» variant of W3C DOM; closer to Java object-orientation (http://www.jdom.org/)

DOM4J (http://www.dom4j.org/)
» roughly similar to JDOM; richer set of features

JAXB (Java Architecture for XML Binding)
» compiles DTDs to DTD-specific classes that allow reading, manipulating and writing valid documents
» http://java.sun.com/xml/jaxb/

JAXP: Summary


An interface for using XML Processors
» SAX/DOM parsers, XSLT transformers

Supports pluggability of different implementations

Defines means to control validation, and handling of parse errors (through SAX ErrorHandlers)

Defines means to write out DOM Documents

Included in JDK 1.4