Web services
Jan Newmarch
Peninsula School of IT
Monash University
Email: jan@newmarch.name
Home page: http://jan.newmarch.name
Workshop content
Assignments
Workshops
-
Background
-
Web services introduction
-
Web services background
-
WSDL
-
SOAP
-
Calling SOAP methods
-
Session management
-
Security
-
UDDI
-
Computer science and web services
-
Internationalisation
-
I18n and web services
Background
Knowledge management
-
Knowledge management is a broad concept that addresses the full
range of processes by which the organisation deploys knowledge
-
These involve the acquisition, distribution and use of
knowledge in the organisation
-
We will only look at distribution and
use technologies in this workshop
The world wide web
-
Began to get popular in 1994
-
Devised as a way of sharing information, with links between
associated documents
-
No central organisation to control the web
-
Anyone can contribute
Knowledge management on the Web
-
Knowledge can come in structured or unstructured forms
-
The Web has minimal structure - hyperlinks with no semantics
-
One direction for the web is to structure information particularly by use
of portals for organisations and forms for business
processes
-
Another direction is the world live web where knowledge, information
(and rubbish) accretes dynamically
World Live Web
-
This represents an evolution of the unstructured formats
-
Information can appear in real time through RSS feeds
-
Opinions can be freely made and annotated through
blogs
-
Dynamic communities of knowledge can grow through
Wikis
-
There are search engines for the WLW such as
Technorati
-
The world live web will not be easily formalised since it is
uncontrolled and ungoverned (no-one can say what should or should not
be on it)
The Physical Web
-
Sensor networks are starting to come onto the computing main stage
-
They can be used for home-care systems for the aged
-
They can be used for emergency networks for disasters such as bushfires
-
They will generate huge amounts of data, sometimes useful but often useless
-
Incorporating vast amounts of data into relevant portals will lead to the
"physical web"
XHTML
-
The original HTML was a hack that worked and was popular
-
It is easier for tools based on XML if HTML is cleaned up to give
XML conformance
-
XHTML does this
-
XHTML tidies up the syntax of the web, so tools are easier to build
-
XHTML does not make changes to the structure or lack of structure of the web
Semantic Web
-
There have been many attempts to add semantic markup to web content
-
RDF (resource description framework) is one of these
-
Domains of knowledge are built up using an ontology
-
The ontology gives meaning and structure to the content
-
Including semantic markup will be a specialist task
-
Proponents of the WLW are often cynical about the semantic web.
See
http://www.shirky.com/writings/semantic_syllogism.html
Value of semantic web
-
The ordinary web satisfies most people
-
We all know that searches contain a lot of "noise", but can navigate
through this
-
Computers cannot navigate as well as humans so need semantic markup
to aid them
-
The semantic web may help human searches, but will be a better aid
to agent-based computer searches
Form filling
-
The original web was designed to deliver text documents from remote servers
-
Adding images and other document types made it popular but didn't change the
structure
-
Adding forms and backend handlers like CGI scripts, servlets, ASP pages radically
changed the nature of the web
-
Stateless document retrieval was replaced by stateful
interactions
-
User-centred browsing was replaced by transaction-oriented processing
-
The human becomes a supplier of data to a remote system rather than
a consumer of information
Web services
-
Web services are about removing the human from the form-filling loop
-
A program supplies data to a remote system
-
The program receives the reply and processes it
-
The program can be faster, more reliable and more
consistent than a human
-
The program can allow B2B transactions without the
inefficiencies of human interaction
-
To many, web services are the "holy grail" of distributed computing
Web services
-
Web services are one of the hot new topics, about to "revolutionise
the web"
-
There are many definitions of Web services
-
A web service is a server application that can be invoked
using the HTTP protocol by SOAP calls
-
A Web service is anything that uses the Web (HTTP) protocol
-
A Web service is any B2B internet application
-
A Web service is anything that uses the internet
-
We mean the first definition
Web services definition
-
From the WWW Consortium:
A Web service is a software system identified by a URI,
whose public interfaces and bindings are defined and described using XML.
Its definition can be discovered by other software systems.
These systems may then interact with the Web service in a manner
prescribed by its definition,
using XML based messages conveyed by internet protocols
-
Web services allow applications and agents to browse the Web
on behalf of users
-
Instead of a user navigating HTML screens,
an agent can connect to services on the Web and interact with
them directly without the distraction of presentation elements
-
SOAP/WSDL/UDDI are the technologies behind Web services
Examples
-
The site www.xmethods.com/
is a public site listing a large number of public web services
-
From the xmethods site:
Emerging web services standards such as SOAP, WSDL and UDDI will enable
system-to-system integration that is easier than ever before.
This site lists publicly available web services.
- Examples are
-
Zip code to geographical coordinates
-
Car rental quotes
-
Nasdaq stock exchange quotes
-
California weather service
-
Access Amazon.com using SOAP
-
Slashdot news feed
-
Note: well over half of the "services" on this
site are not Web services at all, but things like ordinary
HTML Web pages
Web service interaction
Web Service Components
-
WSDL (Web Services Description Language) is the metalanguage
used to describe a Web service. It corresponds to the IDL
(Interface Definition Language) of other RPC systems
-
SOAP (Simple Object Access Protocol) is the wire format for
remote method calls and their replies.
It corresponds to the XDR level of Sun's RPC
-
UDDI (Universal Description, Discovery and Integration)
is the resource registration and lookup mechanism.
It corresponds to the Naming Services of CORBA, RMI
Base technology limitations
-
There is no security
-
There is no session management
-
There is nothing that captures business processes
-
There is nothing that captures workflow
-
...and so on
The web services zoo
Current organisations
-
W3C: the World Wide Web Consortium, in charge of HTTP, HTML and other fundamental
web technologies
-
Oasis: concentrating on e-business standards, looking at the higher layers
of the WS stack
-
WS-I: concerned that WS implementations have failed to interoperate,
it is setting standards for interoperability
-
Liberty Alliance: concerned with single signon mechanisms, overlaps with WS
authentication
Money making?
-
The positive (http://it.asia1.com.sg/newsdaily/news004_20031001.html)
Insure.NET, a Web services platform initiated by Microsoft and the Infocomm Development
Authority, is set to save insurance companies here millions of dollars a year.
-
The negative (http://www.cio.com/archive/100103/standards.html)
IT'S ALREADY A GIVEN: Your company is going to waste money on Web services.
Research company Gartner predicts American business is going to squander $1 billion on
misguided Web services projects by 2007. Exactly how much of that will come out of your
pocket depends in part on how many confusing, overlapping Web services standards
emerge in the next few years.
Gartner hype cycle
Gartner regularly publishes a graph on the state of software.
The latest (at August 2005) is
Web service background
Client-server paradigms
Base | sockets |
Remote Procedure Call | web services, Sun RPC, COM+ |
Remote objects | CORBA |
Downloadable objects | Java RMI |
Mobile objects | Aglets |
Messaging | JMS |
Web Service Environmental Support
Web services are basically RPC across port 80. They require the
following environmental support
-
An HTTP server configured to pass Web service requests
to a CGI script/servlet/JSP/ASP/...
-
A user agent that is not a browser to prepare
Web method requests and understand responses
-
The user agent may make use of a browser for handling HTTP
requests from the user side
-
Firewall access to allow arbitrary content to pass on port 80
(exploit a security hole!)
Web Service Standards Requirements
-
Web services use XML extensively, and use XML DTDs and Schemas
-
XML-defined standard data types are used
-
XML generators and parsers are required (places requirements on
PDAs)
Ordinary Procedure Call
-
The imperative languages use the procedure as a means of
structuring the language.
The language will have conditionals, loops and procedure calls.
-
When a procedure is called, it usually makes use of the stack,
pushing parameters onto the stack and reserving space for local variables
Remote Procedure Call
Web Services Remote Procedure Call
Components of RPC
An RPC system consists of the following pieces
-
A set of data types that can be sent as parameters and
received as results
-
A representation of these data types on the wire
-
A message format for sending procedure name and parameters
and getting procedure reply
-
A means of representing the procedures and their signatures
(parameters and result types) in a form independent of implementation -
this will be used to generate two implementations, one on the client
side, the other on the server. This acts as a specification of
a remote procedure call
-
A mechanism for automatically generating a client stub
-
A mechanism for automatically generating a server stub
-
A way of linking the server-side implementation of the
procedure to the server stub so that the stub can call it
-
Address of the service
-
Locating the service
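The pieces listed above can be sketched in a few lines of Java. This is a minimal illustration only, not how any web service toolkit generates its stubs: the class names, the space-separated text "wire format" and the in-process `Function` standing in for the network are all assumptions made for the example.

```java
import java.util.function.Function;

public class RpcStubSketch {
    /** The implementation-independent description of the remote procedures. */
    public interface Converter {
        float inchToMM(float value);
        float mmToInch(float value);
    }

    /** Server-side implementation of the procedures. */
    static class ConverterImpl implements Converter {
        public float inchToMM(float value) { return value * 25.4f; }
        public float mmToInch(float value) { return value / 25.4f; }
    }

    /** Server stub: decodes a request, calls the implementation, encodes the reply. */
    static class ServerStub {
        private final Converter impl;
        ServerStub(Converter impl) { this.impl = impl; }

        String handle(String request) {
            String[] parts = request.split(" ");   // hypothetical format: "<procedure> <argument>"
            float arg = Float.parseFloat(parts[1]);
            switch (parts[0]) {
                case "inchToMM": return Float.toString(impl.inchToMM(arg));
                case "mmToInch": return Float.toString(impl.mmToInch(arg));
                default: throw new IllegalArgumentException("unknown procedure " + parts[0]);
            }
        }
    }

    /** Client stub: looks like a local object but turns each call into a message. */
    static class ClientStub implements Converter {
        private final Function<String, String> transport;   // stands in for the network
        ClientStub(Function<String, String> transport) { this.transport = transport; }

        public float inchToMM(float value) { return call("inchToMM", value); }
        public float mmToInch(float value) { return call("mmToInch", value); }

        private float call(String procedure, float arg) {
            String reply = transport.apply(procedure + " " + arg);
            return Float.parseFloat(reply);
        }
    }

    /** Links the implementation to the server stub and hands back a client stub. */
    public static Converter connect() {
        ServerStub server = new ServerStub(new ConverterImpl());
        return new ClientStub(server::handle);
    }

    public static void main(String[] args) {
        Converter converter = connect();
        System.out.println(converter.inchToMM(2.0f));
    }
}
```

The caller only ever sees the `Converter` interface; replacing the `Function` with a real transport would not change the calling code.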
Web Services and Standardisation
RPC requirement | SOAP |
defined data types | defined by SOAP |
wire format for data | defined by SOAP using XML |
wire format for messages | defined by SOAP over HTTP |
an IDL | defined by WSDL |
generation of client stubs | vendor specific |
generation of server stubs | vendor specific |
linking implementation | vendor specific |
Address of service | URL |
Locating service | UDDI |
Competitors
-
There are many, many competing systems and web services aren't anything
special: cross-industry support is their only unique feature
-
There are many serialisation techniques to compete with SOAP.
See
www.pault.com/pault/pxml/xmlalternatives.html
-
There are many specification techniques to compete with WSDL, such as
Sun's RPC/ONC, CORBA IDL, Java RMI, Jini, Salutation, ...
-
There are many lookup mechanisms as well as UDDI, with their own spheres
of applications: CORBA naming, CORBA trading, RMI naming and LDAP, UPnP,
Jini, ...
-
There is nothing novel from a computer science viewpoint about web services:
mainly it is another piece of engineering using 30-year-old technology
Measurement converter service in Java
public class Converter implements java.rmi.Remote {
    // 1 inch = 25.4 mm; the f suffix keeps the arithmetic in float
    public float inchToMM(float float_1) throws
            java.rmi.RemoteException {
        float _retVal = float_1 * 25.4f;
        return _retVal;
    }
    public float mmToInch(float float_1) throws
            java.rmi.RemoteException {
        float _retVal = float_1 / 25.4f;
        return _retVal;
    }
}
Measurement converter client in Perl
#!/usr/bin/perl
use SOAP::Lite;
print SOAP::Lite
-> uri('http://localhost/Converter') # Converter service
-> proxy('http://localhost:8088/axis/Converter.jws') # Axis service
-> inchToMM(1.0)
-> result;
Google
-
Google and Amazon both give web service access to their systems
-
A google query can be done by
#!/usr/bin/perl
# Source: http://hacks.oreilly.com/pub/h/170
# googly.pl
# A typical Google Web API Perl script
# Usage: perl googly.pl <query>
# Your Google API developer's key
my $google_key='XXX';
# Location of the GoogleSearch WSDL file
my $google_wdsl = "googleapi/GoogleSearch.wsdl";
use strict;
# Use the SOAP::Lite Perl module
use SOAP::Lite;
# Take the query from the command-line
my $query = shift @ARGV or die "Usage: perl googly.pl <query>\n";
# Create a new SOAP::Lite instance, feeding it GoogleSearch.wsdl
my $google_search = SOAP::Lite->service("file:$google_wdsl");
# Query Google
my $results = $google_search ->
doGoogleSearch(
$google_key, $query, 0, 10, "false", "", "false",
"", "latin1", "latin1"
);
# No results?
@{$results->{resultElements}} or exit;
# Loop through the results
foreach my $result (@{$results->{resultElements}}) {
# Print out the main bits of each result
print
join "\n",
$result->{title} || "no title",
$result->{URL},
$result->{snippet} || 'no snippet',
"\n";
}
WSDL
Role of Specification Languages
-
A specification is an implementation-free description of a service
-
A specification should be understandable by all implementors
-
A specification may be of syntax only, or include some semantics
-
Syntax is easy: Sun RPC, CORBA IDL, Java interfaces
-
Semantics is hard
-
Meaning is virtually impossible (e.g. Z, pre/post)
-
Environment is possible (e.g. where it runs)
Simple Service
CORBA
A CORBA IDL specification would look like
interface Converter {
float inchToMM(in float value);
float mmToInch(in float value);
};
Java RMI
A Java RMI specification would look like
public interface Converter extends Remote {
public float inchToMM(float value)
throws RemoteException;
public float mmToInch(float value)
throws RemoteException;
}
WSDL
The WSDL specification looks like
<?xml version="1.0" encoding="UTF-8"?>
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:tns="urn:Converter" targetNamespace="urn:Converter" name="ConverterService">
<message name="inchToMMRequest">
<part name="param1" type="xsd:float"/>
</message>
<message name="inchToMMResponse">
<part name="return" type="xsd:float"/>
</message>
<message name="mmToInchRequest">
<part name="param1" type="xsd:float"/>
</message>
<message name="mmToInchResponse">
<part name="return" type="xsd:float"/>
</message>
<message name="java_rmi_RemoteException">
<part type="xsd:string" name="java_rmi_RemoteException"/>
</message>
<message name="com_iona_xmlbus_webservices_ejbserver_ConversionException">
<part type="xsd:string" name="com_iona_xmlbus_webservices_ejbserver_ConversionException"/>
</message>
<portType name="ConverterPortType">
<operation name="inchToMM">
<input message="tns:inchToMMRequest" name="inchToMM"/>
<output message="tns:inchToMMResponse" name="inchToMMResponse"/>
<fault message="tns:java_rmi_RemoteException" name="java_rmi_RemoteException"/>
</operation>
<operation name="mmToInch">
<input message="tns:mmToInchRequest" name="mmToInch"/>
<output message="tns:mmToInchResponse" name="mmToInchResponse"/>
<fault message="tns:java_rmi_RemoteException" name="java_rmi_RemoteException"/>
<fault
message="tns:com_iona_xmlbus_webservices_ejbserver_ConversionException" name="com_iona_xmlbus_webservices_ejbserver_ConversionException"/>
</operation>
</portType>
<binding name="ConverterBinding" type="tns:ConverterPortType">
<soap:binding transport="http://schemas.xmlsoap.org/soap/http/" style="rpc"/>
<operation name="inchToMM">
<soap:operation soapAction="" style="rpc"/>
<input name="inchToMM">
<soap:body use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</input>
<output name="inchToMMResponse">
<soap:body use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</output>
<fault name="java_rmi_RemoteException">
<soap:fault name="java_rmi_RemoteException"
use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</fault>
</operation>
<operation name="mmToInch">
<soap:operation soapAction="" style="rpc"/>
<input name="mmToInch">
<soap:body use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</input>
<output name="mmToInchResponse">
<soap:body use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</output>
<fault name="java_rmi_RemoteException">
<soap:fault name="java_rmi_RemoteException"
use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</fault>
<fault name="com_iona_xmlbus_webservices_ejbserver_ConversionException">
<soap:fault
name="com_iona_xmlbus_webservices_ejbserver_ConversionException"
use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</fault>
</operation>
</binding>
<service name="Converter">
<port name="ConverterPort" binding="tns:ConverterBinding">
<soap:address location="http://www.xmlbus.com:9010/ionasoap/servlet/Converter"/>
</port>
</service>
</definitions>
Components of WSDL Specification
-
Data types
-
-
XML data types are okay
-
Add extra data types here
-
Messages
-
These describe the messages that can be sent in any direction, including parameter
information
-
inchToMMRequest
-
inchToMMResponse
-
java_rmi_RemoteException
- ...
-
Ports
-
These link operations and the messages that make up an operation, and
relate input/output abstract messages to actual messages
-
inchToMM(inchToMMRequest, inchToMMResponse,
java_rmi_RemoteException)
- ...
-
Bindings
-
These determine the wire transport mechanism, such as RPC
-
Service
-
name: Converter
address: http://...
...
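To see these components in a concrete document, the sketch below pulls the message, operation and service names and the endpoint address out of a WSDL file using the DOM parser that ships with Java. The class name, the helper methods and the cut-down `SAMPLE` document are assumptions for illustration; it is not part of any WSDL toolkit.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class WsdlComponents {
    static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";
    static final String SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/";

    // A cut-down Converter WSDL: one operation, no faults, binding element omitted.
    // The localhost address is a placeholder.
    static final String SAMPLE =
        "<definitions xmlns='http://schemas.xmlsoap.org/wsdl/'"
      + " xmlns:soap='http://schemas.xmlsoap.org/wsdl/soap/'"
      + " xmlns:tns='urn:Converter' targetNamespace='urn:Converter'>"
      + "<message name='inchToMMRequest'/>"
      + "<message name='inchToMMResponse'/>"
      + "<portType name='ConverterPortType'>"
      + "<operation name='inchToMM'>"
      + "<input message='tns:inchToMMRequest'/>"
      + "<output message='tns:inchToMMResponse'/>"
      + "</operation></portType>"
      + "<service name='Converter'>"
      + "<port name='ConverterPort' binding='tns:ConverterBinding'>"
      + "<soap:address location='http://localhost/Converter'/>"
      + "</port></service></definitions>";

    /** Parses XML text into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    /** Returns the name attribute of every WSDL element with the given local name. */
    static List<String> names(Document doc, String localName) {
        List<String> result = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagNameNS(WSDL_NS, localName);
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(((Element) nodes.item(i)).getAttribute("name"));
        }
        return result;
    }

    /** Returns the endpoint URL from the first soap:address element. */
    static String address(Document doc) {
        Element address = (Element) doc.getElementsByTagNameNS(SOAP_NS, "address").item(0);
        return address.getAttribute("location");
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("messages:   " + names(doc, "message"));
        System.out.println("operations: " + names(doc, "operation"));
        System.out.println("service:    " + names(doc, "service"));
        System.out.println("address:    " + address(doc));
    }
}
```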
WSDL Specification Revisited
<?xml version="1.0" encoding="UTF-8"?>
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:tns="urn:Converter" targetNamespace="urn:Converter" name="ConverterService">
<message name="inchToMMRequest">
<part name="param1" type="xsd:float"/>
</message>
<message name="inchToMMResponse">
<part name="return" type="xsd:float"/>
</message>
<message name="mmToInchRequest">
<part name="param1" type="xsd:float"/>
</message>
<message name="mmToInchResponse">
<part name="return" type="xsd:float"/>
</message>
<message name="java_rmi_RemoteException">
<part type="xsd:string" name="java_rmi_RemoteException"/>
</message>
<message name="com_iona_xmlbus_webservices_ejbserver_ConversionException">
<part type="xsd:string" name="com_iona_xmlbus_webservices_ejbserver_ConversionException"/>
</message>
<portType name="ConverterPortType">
<operation name="inchToMM">
<input message="tns:inchToMMRequest" name="inchToMM"/>
<output message="tns:inchToMMResponse" name="inchToMMResponse"/>
<fault message="tns:java_rmi_RemoteException" name="java_rmi_RemoteException"/>
</operation>
<operation name="mmToInch">
<input message="tns:mmToInchRequest" name="mmToInch"/>
<output message="tns:mmToInchResponse" name="mmToInchResponse"/>
<fault message="tns:java_rmi_RemoteException" name="java_rmi_RemoteException"/>
<fault
message="tns:com_iona_xmlbus_webservices_ejbserver_ConversionException" name="com_iona_xmlbus_webservices_ejbserver_ConversionException"/>
</operation>
</portType>
<binding name="ConverterBinding" type="tns:ConverterPortType">
<soap:binding transport="http://schemas.xmlsoap.org/soap/http/" style="rpc"/>
<operation name="inchToMM">
<soap:operation soapAction="" style="rpc"/>
<input name="inchToMM">
<soap:body use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</input>
<output name="inchToMMResponse">
<soap:body use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</output>
<fault name="java_rmi_RemoteException">
<soap:fault name="java_rmi_RemoteException"
use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</fault>
</operation>
<operation name="mmToInch">
<soap:operation soapAction="" style="rpc"/>
<input name="mmToInch">
<soap:body use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</input>
<output name="mmToInchResponse">
<soap:body use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</output>
<fault name="java_rmi_RemoteException">
<soap:fault name="java_rmi_RemoteException"
use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</fault>
<fault name="com_iona_xmlbus_webservices_ejbserver_ConversionException">
<soap:fault
name="com_iona_xmlbus_webservices_ejbserver_ConversionException"
use="encoded" namespace="urn:Converter" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
</fault>
</operation>
</binding>
<service name="Converter">
<port name="ConverterPort" binding="tns:ConverterBinding">
<soap:address location="http://www.xmlbus.com:9010/ionasoap/servlet/Converter"/>
</port>
</service>
</definitions>
WSDL Versions
-
WSDL version 1.2 is undergoing finalisation by the W3C
-
It cleans up WSDL v1.1, and is incompatible with it
(e.g. more namespace use, elimination of operator overloading)
-
But that doesn't matter, since no-one will ever write the stuff...
Role of WSDL in Web Services
-
Implement a service using your favourite programming language
(Java, C#, VB, ...)
-
Reverse engineer this to a WSDL spec using a tool supplied
by your Web Service vendor
-
Forward engineer this to your next favourite language
-
This is software engineering at its worst: hack an implementation
and then work out the specification afterwards. It leads to broken,
unmaintainable systems.
See e.g.
Rethinking the Java SOAP Stack by Loughran and Smith
where they use the term contract-last development
Apache Axis WSDL Tools
-
Axis is available from www.apache.org, and builds on the IBM
Web Services Toolkit (WSTK)
-
If a web service is available from Axis, then you can download
a WSDL description to a browser by calling the web service URL
with "?WSDL" appended. It may not show up as anything readable:
the browser will fail to interpret most of the tags, but it
can be saved to a file
-
In WSTK 3.2 you can generate a WSDL interface from either a Java interface
or a Java class (earlier versions would not accept an interface)
-
e.g
interface ConverterInterface {
float inchToMM(float value);
float mmToInch(float value);
};
-
Then run a WSDL generator
AXIS_LIB=/usr/local/axis-1_1/lib # change this for your lib dir
for i in $AXIS_LIB/*.jar
do
CP=$CP:$i
done
java -cp "$CP" org.apache.axis.wsdl.Java2WSDL \
-o ConverterInterface.apache.wsdl \
-l http://localhost/wstk/services/ConverterInterface \
ConverterInterface
-
A Java interface can be created from a WSDL spec by
java -cp $WSTK_CP org.apache.axis.wsdl.WSDL2Java \
ConverterInterface.ibm.wsdl
-
Stub and tie generation will be dealt with later
There and Back Again
-
Java -> WSDL -> Java may give a different result to what you started from
- Original
interface ConverterInterface {
float inchToMM(float value);
float mmToInch(float value);
};
-
Round trip
package DefaultNamespace;
public interface Converter extends java.rmi.Remote {
public float inchToMM(float in0) throws java.rmi.RemoteException;
public float mmToInch(float in0) throws java.rmi.RemoteException;
}
Conclusion
-
WSDL is a totally obscure specification language
-
WSDL forces the user to write down many things that are automated and specified in
other description languages
-
It would be extremely difficult to generate a correct WSDL
document by hand
-
The reverse of good s/w engineering principles is followed: write the code
and then reverse-engineer the specification
-
The implementations of WSDL vary in quality
SOAP
What SOAP Claims to be
-
SOAP is "Simple Object Access Protocol"
-
It is "a lightweight
protocol for exchange of information in a decentralized, distributed
environment"
-
"It is an XML based protocol..."
What SOAP is
-
Basically, SOAP/WSDL/UDDI is the latest in RPC technologies
-
It is the first open standard RPC supported by Microsoft, and was originally
developed by Microsoft, DevelopMentor and UserLand, with IBM and Lotus joining for SOAP 1.1
-
It is now part of the W3C standards track
-
Version 1.2 is under finalisation by W3C
-
SOAP consists of
-
A description of allowable data types
-
A message format
-
A binding to HTTP transport, and (non-normative) examples of other bindings
(e.g. email)
What SOAP isn't
-
It isn't object oriented - it is procedural
-
It isn't simple - it requires an understanding of XML,
including XML namespaces
-
It isn't lightweight - it needs XML generators/parsers
on client and server
sides, and just sending a one-byte character of payload can
use 1000 bytes of message
-
It isn't complete as an RPC system
(e.g. no standardisation of language bindings)
XML Descriptions
-
XML currently uses two systems for describing XML documents
-
XML DTDs (Document Type Definitions) are good for describing
document structure
-
XML Schema are good for describing data types
-
Both may be needed
XML DTDs
XML DTDs (Document Type Definitions) are good for describing document
structure e.g.
XML Schema
-
Schema are good for data type descriptions
-
For example, if you want to have the data
<age> 45 </age>
<height> 5.9 </height>
<color> blue </color>
-
then you would describe the data type using the schema
<element name="age" type="positiveInteger"/>
<element name="height" type="float"/>
<element name="color">
  <simpleType>
    <restriction base="xsd:string">
      <enumeration value="green"/>
      <enumeration value="blue"/>
    </restriction>
  </simpleType>
</element>
-
XML Schema also describes how to make lists and unions of other
data types. e.g. for a list
<simpleType name='sizes'>
<list itemType='decimal'/>
</simpleType>
with example data
<cerealSizes xsi:type='sizes'> 8 10.5 12 </cerealSizes>
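A receiver of a list value has to split it up again. Items in an XML Schema list are separated by whitespace, so a minimal Java sketch of reading the `sizes` example might look like this (the class and method names are made up for illustration):

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

public class SchemaListSketch {
    /** Splits the text of a list element on whitespace and parses each item as a decimal. */
    static List<BigDecimal> parseDecimalList(String text) {
        List<BigDecimal> items = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {          // an empty element yields an empty list
                items.add(new BigDecimal(token));
            }
        }
        return items;
    }

    public static void main(String[] args) {
        // The content of the cerealSizes element above
        System.out.println(parseDecimalList(" 8 10.5 12 "));
    }
}
```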
XML Schema Standard Data Types
SOAP Messages
SOAP messages are either
-
a request for a service
-
a response from a service
Java mapping of data types
Sun has proposed a Java standard for mapping between XML and Java types
Most XML standard data types are represented by a Java data type
XML | Java |
boolean | boolean |
byte | byte |
dateTime | java.util.Calendar |
double | double |
float | float |
int | int |
integer | java.math.BigInteger |
string | String |
Some types can't be represented since they don't exist in Java,
such as unsigned int. If a type can be "nillable" (that is, have a nil
value) and it would be a Java primitive type, then it is wrapped in a
Java class such as Integer.
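The effect of the mapping can be illustrated with plain Java conversions. This is a hand-written sketch, not the code a toolkit would generate; the class and method names are hypothetical.

```java
import java.math.BigInteger;

public class XmlToJavaTypes {
    /** An xsd:int that cannot be nil maps to the primitive int. */
    static int parseInt(String lexical) {
        return Integer.parseInt(lexical.trim());
    }

    /** A nillable xsd:int maps to the wrapper class so that nil can become null. */
    static Integer parseNillableInt(String lexical, boolean isNil) {
        return isNil ? null : Integer.valueOf(lexical.trim());
    }

    /** xsd:integer is unbounded, so it maps to BigInteger rather than int or long. */
    static BigInteger parseInteger(String lexical) {
        return new BigInteger(lexical.trim());
    }

    /** xsd:boolean allows the lexical forms true, false, 1 and 0. */
    static boolean parseBoolean(String lexical) {
        String s = lexical.trim();
        if (s.equals("true") || s.equals("1")) return true;
        if (s.equals("false") || s.equals("0")) return false;
        throw new IllegalArgumentException("not an xsd:boolean: " + lexical);
    }

    public static void main(String[] args) {
        System.out.println(parseInt(" 45 "));
        System.out.println(parseNillableInt("", true));
        System.out.println(parseInteger("12345678901234567890"));
        System.out.println(parseBoolean("1"));
    }
}
```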
SOAP Request
A SOAP request consists of
- The envelope which defines the various namespaces
used by the rest of the message
-
The header is optional, and can carry authentication,
payments, etc
-
The body carries the payload of the message. For RPC it
contains
-
the method/procedure name
-
the arguments
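As a rough illustration of this layout, the Java sketch below builds an RPC request by string concatenation and then reads the method name back with the standard DOM parser. The class and method names are assumptions, the header is omitted, and argument values are not escaped, so it only suits values without markup characters. The envelope namespace is the one used in these slides.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class SoapRequestSketch {
    static final String ENV_NS = "http://www.w3.org/2001/12/soap-envelope";

    /** Builds an envelope whose body holds one method element with one argument. */
    static String buildRequest(String serviceNs, String method,
                               String argName, String argValue) {
        return "<env:Envelope xmlns:env='" + ENV_NS + "'>"
             + "<env:Body>"
             + "<m:" + method + " xmlns:m='" + serviceNs + "'>"
             + "<" + argName + ">" + argValue + "</" + argName + ">"
             + "</m:" + method + ">"
             + "</env:Body>"
             + "</env:Envelope>";
    }

    /** Returns the first element inside the body, which names the method being called. */
    static Element methodElement(String envelope) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(envelope.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
        Node body = doc.getElementsByTagNameNS(ENV_NS, "Body").item(0);
        if (body == null) {
            throw new IllegalArgumentException("no SOAP body");
        }
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        throw new IllegalArgumentException("empty SOAP body");
    }

    public static void main(String[] args) {
        String request = buildRequest("urn:Converter", "inchToMM", "value", "1.0");
        Element method = methodElement(request);
        System.out.println(method.getNamespaceURI() + " " + method.getLocalName()
                + "(" + method.getTextContent() + ")");
    }
}
```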
SOAP Response
A SOAP response is just like a request, except the body
contains the result.
SOAP Message Structure
<?xml version="1.0" ?>
<env:Envelope xmlns:env="http://www.w3.org/2001/12/soap-envelope">
<env:Header>
::::
</env:Header>
<env:Body >
::::
</env:Body>
</env:Envelope>
SOAP Header Blocks
-
SOAP headers are optional, and are designed to allow for
expansion
-
A header can specify an actor which is a SOAP processing
node along the message route e.g. next or a URL
-
When a site matches the header it takes corresponding action
-
Actions could be e.g.
-
Log the entry
-
Discard the message if crossing a firewall (?)
-
The "anonymous" role (no actor specified) is the target of the message
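The matching rule can be sketched as a small Java predicate. It assumes the SOAP 1.1 URI for the "next" actor; the class name, the method name and the idea that each node knows a single URI of its own are simplifications for illustration.

```java
public class HeaderTargeting {
    /** SOAP 1.1 URI meaning "the next node to receive this message". */
    static final String NEXT = "http://schemas.xmlsoap.org/soap/actor/next";

    /**
     * Decides whether a node should process a header block.
     *
     * @param actor         value of the block's actor attribute, or null if absent
     * @param nodeUri       URI this node answers to (hypothetical)
     * @param isFinalTarget true if this node is the final target of the message
     */
    static boolean shouldProcess(String actor, String nodeUri, boolean isFinalTarget) {
        if (actor == null || actor.isEmpty()) {
            return isFinalTarget;     // anonymous: meant for the final target only
        }
        if (actor.equals(NEXT)) {
            return true;              // every node is "next" when the message reaches it
        }
        return actor.equals(nodeUri); // otherwise the block names one specific node
    }

    public static void main(String[] args) {
        String me = "http://proxy.example.org/logger";   // made-up node URI
        System.out.println(shouldProcess(NEXT, me, false));
        System.out.println(shouldProcess(null, me, false));
        System.out.println(shouldProcess(me, me, false));
    }
}
```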
SOAP Body for RPC Request
SOAP Body for RPC Response
-
When used for RPC, the body of a response contains a "struct" which is
the return
-
The struct label is
<Request>Response
<?xml version='1.0' ?>
<env:Envelope xmlns:env="http://www.w3.org/2001/12/soap-envelope" >
<env:Body>
<m:reserveAndChargeResponse
env:encodingStyle="http://www.w3.org/2001/12/soap-encoding"
xmlns:m="http://travelcompany.example.org/" >
<m:confirmation>
<reference>FT35ZBQ</reference>
<viewAt>
http://travelcompany.example.org/reservations?code=FT35ZBQ
</viewAt>
</m:confirmation>
</m:reserveAndChargeResponse>
</env:Body>
</env:Envelope>
SOAP HTTP Binding
-
SOAP 1.2 only defines HTTP as transport protocol
-
Other protocols (email, Java Messaging, etc) are left unspecified
-
Notable points
-
POST (not GET) is supported
-
The document type is
application/soap
-
Errors are signalled using HTTP 500 "Internal Server Error"
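These points translate fairly directly into client code. The sketch below uses the `java.net.http` classes (Java 11 or later) to prepare a POST without sending it. The class and method names are assumptions, and the media type is the one given on this slide, so check it against the SOAP version you actually target.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class SoapHttpBinding {
    /** Wraps a SOAP envelope in an HTTP POST aimed at the service endpoint. */
    static HttpRequest buildPost(String endpoint, String envelope) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", "application/soap; charset=\"utf-8\"")
                .POST(HttpRequest.BodyPublishers.ofString(envelope))
                .build();
    }

    /** Under this binding a SOAP error arrives with HTTP status 500. */
    static boolean isFault(int httpStatus) {
        return httpStatus == 500;
    }

    public static void main(String[] args) {
        HttpRequest request = buildPost(
                "http://travelcompany.example.org/Charging",
                "<env:Envelope xmlns:env='http://www.w3.org/2001/12/soap-envelope'>"
              + "<env:Body/></env:Envelope>");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().firstValue("Content-Type").orElse(""));
    }
}
```

Sending the request with `HttpClient.send` and passing the response status to `isFault` would complete the exchange; that step is left out so the sketch runs without a server.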
SOAP HTTP Request
POST /Charging HTTP/1.1
Host: travelcompany.example.org
Content-Type: application/soap; charset="utf-8"
Content-Length: nnnn
<?xml version='1.0' ?>
<env:Envelope xmlns:env="http://www.w3.org/2001/12/soap-envelope" >
<env:Body>
<m:reserveAndCharge>
::::::
</m:reserveAndCharge>
</env:Body>
</env:Envelope>
SOAP HTTP Response
HTTP/1.1 200 OK
Content-Type: application/soap; charset="utf-8"
Content-Length: nnnn
<?xml version='1.0' ?>
<env:Envelope xmlns:env="http://www.w3.org/2001/12/soap-envelope" >
<env:Body>
<m:reserveAndChargeResponse>
::::::
</m:reserveAndChargeResponse>
</env:Body>
</env:Envelope>
Calling SOAP Methods
Roll Your Own
Perl SOAP::Lite
-
The Perl SOAP::Lite module is available from
http://guide.soaplite.com
-
Given the service "spec"
Service host: www.soaplite.com
Service name: Demo
Methods:
String hi()
String bye()
-
A Perl server is
use SOAP::Transport::HTTP;
SOAP::Transport::HTTP::CGI
-> dispatch_to('Demo')
-> handle;
package Demo;
sub hi {
return 'hello world';
}
sub bye {
return 'goodbye, world';
}
Make this available as e.g. CGI script hibye.cgi
-
A Perl client is
use SOAP::Lite;
print SOAP::Lite
-> uri('http://www.soaplite.com/Demo') # Demo service
-> proxy('http://services.soaplite.com/hibye.cgi') # script for service
-> hi()
-> result;
Apache Axis Server
-
Apache Axis 1.1 is the Apache SOAP server
-
It runs as a servlet within e.g. Jakarta Tomcat
-
The Axis servlet is responsible for dispatching calls to the Web service
-
Install Axis in a Tomcat servlet directory such as
.../jakarta-tomcat/webapps/axis
-
Copy the Java implementation source code with
jws
extension to
the toplevel of the Axis directory e.g.
.../jakarta-tomcat/webapps/axis/Converter.jws
-
Access the service URL by e.g.
http://localhost:8088/axis/Converter.jws
-
The Axis servlet is responsible for generating WSDL, ties, and linking
the service implementation
Apache Axis service and Perl client
-
The file
Converter.jws
would contain
public class Converter implements java.rmi.Remote {
    // 1 inch = 25.4 mm; the f suffix keeps the arithmetic in float
    public float inchToMM(float float_1) throws
            java.rmi.RemoteException {
        float _retVal = float_1 * 25.4f;
        return _retVal;
    }
    public float mmToInch(float float_1) throws
            java.rmi.RemoteException {
        float _retVal = float_1 / 25.4f;
        return _retVal;
    }
}
-
This is stored in
$TOMCAT_HOME/webapps/axis/Converter.jws
-
Perl client to Java Converter Web service
#!/usr/bin/perl
use SOAP::Lite;
print SOAP::Lite
-> uri('http://localhost/Converter') # Converter service
-> proxy('http://localhost:8088/axis/Converter.jws') # Axis service
-> inchToMM(1.0)
-> result;
Apache Axis Client - do it yourself
-
The basic classes to interact with a service are
Service
and Call
-
These assume you know
-
the location of the service (the endpoint address)
-
the operation (the method you are calling)
-
the parameters to the call (an array handed to
Call.invoke())
import org.apache.axis.client.Call;
import org.apache.axis.client.Service;
import javax.xml.namespace.QName;
public class TestClient {
public static void main(String [] args) {
try {
String endpoint =
"http://nagoya.apache.org:5049/axis/services/echo";
Service service = new Service();
Call call = (Call) service.createCall();
call.setTargetEndpointAddress( new java.net.URL(endpoint) );
call.setOperationName(new QName("http://soapinterop.org/", "echoString"));
String ret = (String) call.invoke( new Object[] { "Hello!" } );
System.out.println("Sent 'Hello!', got '" + ret + "'");
} catch (Exception e) {
System.err.println(e.toString());
}
}
}
Apache Axis client using WSDL
-
WSDL2Java
generates client-side stubs
-
For the
ConverterInterface
service, it generates
-
ConverterInterface.java
- the Java interface you use
-
ConverterInterfaceServiceLocator.java
- finds the
service and returns a ConverterInterfaceService.java
-
ConverterInterfaceService.java
- has a method
to get the client HOPP
-
ConverterInterfaceSoapBindingStub.java
-
an implementation
of the interface which is the client part of the HOPP
public class Tester
{
public static void main(String [] args) throws Exception {
// Make a service
ConverterInterfaceService service = new ConverterInterfaceServiceLocator();
// Now use the service to get a stub which implements the SDI.
ConverterInterface port = service.getConverterInterface();
// Make the actual call
float mm = port.inchToMM(1.0f);
}
}
Browser access to Web services
-
To access a web service from a browser, another client-server layer is required
-
A Perl script to act as this additional middle server/client is
#!/usr/bin/perl
use CGI;
use SOAP::Lite;
$query = new CGI;
# get CGI query params
$mm = $query->param('mm');
$inch = SOAP::Lite
-> uri('http://localhost/Converter') # Demo service
-> proxy('http://localhost/cgi-bin/converter.pl') # script for service
-> mmToInch($mm)
-> result;
print $query->header();
print $query->start_html(-title=>'Converter result');
print "<p>
$mm millimetres is $inch inches
</p>
</body> </html>";
-
With Web page
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Draft//EN">
<html> <head>
<title>Converter Client</title>
</head>
<body>
<h1>Converter Client</h1>
<form action="http://localhost/cgi-bin/converter-bridge.pl" method="post">
Millimetres
<input type="text" size="10" name="mm">
<input type="submit" value="mm To Inch">
</form>
</body>
</html>
Conclusion
-
There are implementations of Web services for many languages: Java, Perl, Python, Ada, Tcl, etc
-
Some are close to unusable
-
Some are very easy to use
-
Apache Axis and Perl's SOAP::Lite are notably easy. Microsoft's tools are probably easy too
-
Web services are intended for programs, not people. To supply a human-oriented
front end means an extra client-server step
Session management
HTTP and sessions
-
IP is connectionless
-
TCP is built above IP and includes session management i.e.
a connection is established and is used for synchronous
bi-directional communication. The connection identifies the
session
-
HTTP is built above TCP and is sessionless: logically each HTTP GET
establishes a separate TCP session, and subsequent GETs have no logical
link to the previous TCP session
-
Optimisations in HTTP 1.1 allow multiple sessionless HTTP requests
to share a TCP connection
Cookies and other hacks
-
To build a shopping cart, a session is needed
-
Hacks on top of HTML/HTTP are used
-
Cookies
-
URL rewriting
-
Hidden form fields
Sessions and SOAP
-
SOAP uses the HTTP transport layer, and adds no session control:
SOAP is sessionless
-
None of the HTML/HTTP hacks are available
-
If you want sessions, build session support yourself
What is involved
-
The client and server need to maintain a shared identifier
-
SOAP can only support this by an extra parameter to each method call
-
The client and server will need to build session information keyed to this
session identifier
-
Timeouts will be needed to handle clients or servers going away
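The points above can be sketched as a small server-side session table. This is only an illustration under stated assumptions: the class name SessionTable, the open/lookup methods and the timeout value are hypothetical and are not part of SOAP or Axis. The identifier returned by open() is what the client would pass as the extra parameter on each later call.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical session table kept by the service; not part of SOAP or Axis.
class SessionTable {
    private static final class Entry {
        final Map<String, Object> data = new ConcurrentHashMap<>();
        volatile long lastUsedMillis;
        Entry(long now) { lastUsedMillis = now; }
    }

    private final Map<String, Entry> sessions = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    SessionTable(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Creates a session and returns the shared identifier the client must keep.
    String open(long now) {
        String id = UUID.randomUUID().toString();
        sessions.put(id, new Entry(now));
        return id;
    }

    // Every other operation receives the id as an extra parameter and looks up
    // its state here. Returns null for an unknown or timed-out session.
    Map<String, Object> lookup(String id, long now) {
        Entry e = sessions.get(id);
        if (e == null) {
            return null;
        }
        if (now - e.lastUsedMillis > timeoutMillis) {
            sessions.remove(id);   // the client went away: discard its state
            return null;
        }
        e.lastUsedMillis = now;
        return e.data;
    }

    public static void main(String[] args) {
        SessionTable table = new SessionTable(30_000);
        String id = table.open(0);
        table.lookup(id, 1_000).put("cart", "1 widget");
        System.out.println(table.lookup(id, 2_000));     // session still alive
        System.out.println(table.lookup(id, 100_000));   // session timed out
    }
}
```

Time is passed in explicitly so that the timeout behaviour is easy to test; a real service would more likely read the clock itself.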
Conclusion
-
Sessionless HTTP was fine while only static pages were delivered
-
Mechanisms for HTML sessions are hacks that do not translate
into Web services
-
The W3 Consortium mantra that "HTTP is the universal protocol" means
that you have to redo all the work done for TCP that was discarded by
HTTP
Security
General
-
Security is not an afterthought
-
But for Web services, it is
-
Security involves
-
Identification
-
Authentication - proof of identity
-
Authorisation - privileges belonging to identity
-
Integrity - data is not tampered with
-
Confidentiality - data is not exposed to others
-
Delegation and administration
-
Recording
-
Monitoring
-
SOAP ignores all of these
XML Encryption
-
The XML Working Group is looking at things like signatures and encryption
-
Encryption of XML documents can be done at
-
the entire document
-
An entire tag such as <CreditCardInfo>
-
The tag content of its contained tags
-
The tag content of its character data
Breaking the SOAP specifications
-
There is no concept of encryption in SOAP itself
-
Encrypting any SOAP element breaks the SOAP specification.
This rules out encrypting
-
An entire tag such as <CreditCardInfo>
-
The tag content of its contained tags
-
The only binding mandated by the W3C is of SOAP to HTTP
-
Encrypting an entire document breaks this binding
-
All that is left is the encryption of character data (e.g. credit card number):
at which point you lose the data-typing of SOAP - and expose a point of attack
Conclusion
-
Security is not an afterthought
-
But for Web services, it is
Curriculum issues
Distributed Computing using Java
This is the lecture syllabus for a course I teach in distributed programming
Comments
-
The lecture course covers the different distributed paradigms
-
Web services only use a single paradigm
-
Web service protagonists will sometimes add messaging to this,
but this hasn't been standardised yet
-
The different paradigms sit above TCP/IP and are specified
by different protocols
-
Protocol design and parameters are an important part of a
distributed system course
-
Security is a major issue and is growing in importance
-
My course doesn't deal heavily with discovery systems -
these are left to a different course illustrating these
with Jini, UPnP and others
UDDI
UDDI (Universal Description, Discovery and Integration)
-
UDDI is intended to be the ultimate directory for online businesses
-
A business can register itself with a UDDI registry giving info such as
-
business description (name, short description, contacts, categories, urls, etc)
-
business service (description, categories, pointers to info)
-
binding template: information about a service entry point and
implementation specifications
-
tModel: service types (categories, pointers to interface definitions, etc)
-
A UDDI registry can be searched for businesses, services or service types
Registry Entry
UDDI and WSDL
-
UDDI is often called a repository for WSDL documents
-
UDDI is not tied to WSDL
-
UDDI separates specification (tModel) from implementation (binding template)
-
WSDL puts both specification and binding in the same document
-
UDDI "best practice" recommends that WSDL documents should be in two parts, with
binding "including" specification
Public/private Registries
-
IBM, Microsoft, SAP, ... run public registries
(uddi.ibm.com, uddi.microsoft.com, ...)
-
These synchronise, so all registries contain the same info
-
These could hold fantastic amounts of information - all the world's yellow
pages, plus extra service information
-
Private registries can be run within LANs, between cooperating groups of industries,
etc
Private Registries
-
IBM supply a private registry
-
Sun JWSP contains a Registry Service in
$JWSP_HOME/samples/registry-server
Access to Registries
-
A client can search a registry
-
A client can publish/delete information to a registry
-
A variety of APIs exist
-
Unfortunately...
-
You can't search for a web service by name - that is too fine-grained
-
You can search for all web services
-
Once the client is given the complete list, it can do an
internal search for the one it wants
-
The onus of search is placed on the client rather than
the registry
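A minimal sketch of what "the onus of search is placed on the client" means in practice. The list of names stands in for whatever the registry returned; the class ClientSideSearch and its method are hypothetical and do not use any UDDI API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical client-side filter over the complete list returned by a registry.
class ClientSideSearch {
    // Returns the entries whose name matches, ignoring case.
    static List<String> findByName(List<String> allServiceNames, String wanted) {
        List<String> matches = new ArrayList<>();
        for (String name : allServiceNames) {
            if (name.equalsIgnoreCase(wanted)) {
                matches.add(name);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Stand-in for the complete list of web services from the registry
        List<String> fromRegistry = Arrays.asList("Converter", "StockQuote", "Weather");
        System.out.println(findByName(fromRegistry, "converter"));
    }
}
```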
Perl client
-
The
SOAP::Lite
package also includes UDDI::Lite
-
This is not well documented yet
-
Example client
use UDDI::Lite +autodispatch =>
proxy => 'http://uddi.microsoft.com/inquire',
;
my $list = find_business(name => 'xmethods');
my $bis = $list->businessInfos;
for ($bis->businessInfo) {
my $s = $_->serviceInfos->serviceInfo;
print $s->name, ' ', $s->businesskey, "\n";
}
Java Client
-
The JAXR specification gives Java classes to access UDDI registries
-
IBM's UDDI4J also does this
-
These are too complex to give here
Computer science and web services
Paradigms
-
Web services are mainly about RPC
-
(Many documents talk about other models, but they don't seem to exist)
-
RPC is important, but is fairly low on the "paradigm stack"
The REST criticisms
-
HTTP optimisation:
-
The web was originally intended to move static documents from one place to another,
and HTTP 1.0 was designed as the protocol for this
-
When backend scripts and forms were introduced, it became possible for a
document request to change the state of a web server. HTTP 1.1 was designed
for this, with POST and GET, possible caching on GET, etc
-
Web services only use POST, and break most of the HTTP 1.1 optimisations
-
Reference addressing
-
Everything on the web has an address (a URI)
-
New references can be created, and only need to exist at binding time
-
This effectively allows reference parameters as well as
value contents
-
Web services can return XML documents, but these do not have addresses
-
Web services only support value contents and do not
support references
Other criticisms of SOAP
-
Web service interactions are sessionless - fine, but that makes it harder
to handle state transitions
-
Some vendors have non-standardised solutions for session management
-
O/O systems such as CORBA would criticise SOAP since they often allow creation of
objects and return a binding to them
-
Grid computing has another addressing layer above URLs, primarily to allow
for creation and addressing of dynamic services
Over-flexibility in WSDL
-
"All problems in Computer Science can be solved
by another level of indirection" - David Wheeler
-
WSDL allows you to specify things at various levels and link them together
-
In practice, this freedom is never used, and is left to implementations
which can change since there is no standard mapping
Security as an afterthought
-
Systems designed without security as a primary consideration
may need security added later
-
Adding security to an existing system is a matter of luck
-
Web services add security by breaking their own specifications
-
This is a good object lesson :-)
Borrowed technologies - UDDI
-
UDDI was invented independently of web services
-
Web services are just a small subset of the services
that UDDI can deal with
-
UDDI was designed to search on categories such as name,
category of organisation and type of organisation
-
Searching for type of service is coarse-grained
and can just return all web services
-
UDDI doesn't allow for the sort of search that clients will want
-
Most clients don't seem to use UDDI
My research
General
-
Generally, I am looking at middleware for small devices
-
The devices may be computer- and network-enabled household
devices such as the "internet fridge" and sensor devices
-
The middleware technologies are usually Jini and UPnP,
sometimes web services
-
I am also doing research into internationalisation
-
Sometimes these results spill over into education,
intellectual property and copyright issues
Jini
-
This is Java-specific middleware
-
It uses an object-based service registry, where Java objects
are put in the registry and searches are done by Java types
-
This is type-safe and also allows for objects to be downloaded from
services to clients
-
Downloadable objects allow much more flexibility in communication:
-
The service is represented by a proxy that is downloaded
-
The proxy can be "smart" or "dumb"
-
The proxy can communicate with the service using any protocol,
including SOAP (but usually JRMP or Jeri)
-
Jini can use multicast to give "hands-free" announcements and discovery of
services or unicast outside of the local network
UPnP
-
UPnP (Universal Plug 'n Play) was introduced by Microsoft to combat
Jini, Salutation and other emerging technologies
-
FUD for many years, it finally matured and there are many UPnP devices
-
UPnP allows network devices to plug in, join the local network seamlessly
and be discovered and used
-
It uses SOAP and an XML service description language (not WSDL)
Jini/UPnP bridge
-
There are many middleware systems, and no single one will "win"
-
Therefore, you need to "bridge" between them
-
There are always "impedance mismatches" e.g. Jini supports mobile
objects, UPnP doesn't
-
We tried a different design pattern where the registry supplies
a smart Jini proxy that talks UPnP
UPnP and REST
-
UPnP currently uses SOAP
-
This requires XML parsers and generators
-
Even a small XML parser takes 30kb and this may be too large
for tiny devices
-
The REST criticisms of SOAP apply to UPnP as well
-
We built a REST-based version of UPnP which doesn't use SOAP
-
Significant improvements in memory, processing time and
network performance
Web service session management using SIP
-
Web services have no direct support for session management
-
The web service transaction proposals are heavyweight, complex and
are re-inventing the wheel
-
SIP (session initiation protocol) is an IETF standard designed
for general purpose session management
-
We added multi-party session management to web services using
SIP sessions and SIP identifiers in the SOAP header
-
This was successfully used for transaction management without the
overheads and complexity of a new standard
Audio
-
There is intense competition in the home audio market for the
"next generation" networked audio systems
-
PC vendors want a PC-centric solution
-
Set-top vendors want a set-top box to be in control
-
The hifi people want a hifi control centre
-
None of these are a net-centric control-free system
-
Sun: the network is the computer
Me: the network is the audio system
-
We designed and built a network-based audio system
-
This allows any source to join the network e.g. a friend with an iPod
can join the network
-
Any sink can be used e.g. bluetooth headphones, the car stereo system
Dynamic class discovery
-
"An operating system is an application with no top"
-
We have been experimenting with applications with no bottom
-
These are common: the bottom is supplied by e.g. DLLs
-
We use network search to find the bottom: an application
can get its classes from anywhere in the local network
-
Uses
-
A travel program can pick up classes from the local environment
e.g. the King's palace, the Sydney Opera House
-
Version upgrades can be brought into a system on a PDA
-
Special nodes in a network can collect, improve and distribute
software to others
-
More flexible than OSGi
Lightweight grid
-
Grid computing allows many computers to cooperate on a task
-
Grid computing is useful for data-heavy or compute-heavy applications
-
Grids are used at Monash for gene analysis and for analysis of
synchrotron data
-
Grids are very heavyweight - layers above web services, with even clients
requiring 1G RAM
-
Homes have huge amounts of data coming in over the cable TV
-
We are designing a lightweight home grid capable of running on e.g.
the dishwasher to do data mining on TV data
Flexible service discovery
-
Discovery by name (CORBA, RMI, UDDI) is not type-safe because words
are ambiguous
-
Discovery by type (Jini) is too rigid and requires too much knowledge
-
"Ontologies are the answer" - yes, if everyone agrees on the same ontology
-
In a non-homogeneous world, no-one will agree on an ontology
-
We want to borrow lessons from the World Live Web to build dynamic
and flexible discovery systems that exploit adhoc ontologies
Ownership
-
Currently, people demonstrate ownership of something by a receipt
-
A receipt is separate from the thing it describes - it can be lost, etc
-
We are looking at "smart" objects which can hold the information
about their owner
-
When you buy something, there is a protocol of "ownership change"
-
Objects can describe their owner without a separate piece of paper
-
Many issues: privacy, anonymity of purchase, death,...
Intellectual property and courseware
-
There has been an enormous shift in power structures between the
academic and the university caused by web-learning repositories
-
We have looked at some of the issues there...
-
...and counteracted some by use of Open Content licenses
Research issues
Computer science
-
Mapping between objects and XML
-
Mapping between database schema and XML
-
Programming languages tuned to XML
-
New distributed system paradigms
-
Cross-language mappings
-
Semantics of web services
Software engineering
-
Design patterns for web services
-
Web services for mobile and transient systems
-
Integrating web services with other distributed models
-
Fixing up design flaws in SOAP and WSDL
-
More flexible discovery systems tuned to web services
-
Security models that work
-
Web services for sensor networks
Information systems
-
Web services in non-homogenous ontologies
Internationalisation
Diversity
-
There are about 6,000 living languages in the world (more than 850 of
them in Papua New Guinea!)
-
Many languages use the Latin alphabet but most do not
-
French, German, etc use similar sets with individual character
differences
-
Some have their own characters, such as the Thai alphabet
-
Others such as Chinese are ideographic,
where one character stands for a whole word or
concept, as opposed to the Latin alphabet,
where each character represents one sound
-
There are at least 192 countries recognised by the UN
and 240 by the ISO, and many of
these have regional groups (such as the Welsh in Britain) and
unrecognised groups (such as the Basque separatists in Spain)
-
There are many different currencies in the world:
pound, dollar (many different ones!), franc, euro, baht,
yuan, ringgit, ...
-
There are many different calendars: Islamic, Thai, Chinese, ...
-
People write addresses in different ways: Chinese start with the
country on the first line, with name at the bottom
-
Names are represented in different ways: some cultures do not even
have two names, and only Christians have a "christian name"
as well as a "surname"
-
About 70% of the Web is in English, but only 40% of web users have English
as first language
Internationalisation
Localisation
-
"l10n" reverses this by customising an application to a local
culture
-
Text strings, date formats, currency formats, ... are determined
-
Icons and images are chosen to match the culture
-
Input methods allow data input for different languages
Culture specific properties
-
Dates - format and calendars
-
Number formats (1.2 vs 1,2 in Europe)
-
Time formats
-
Colours (white is good in western countries, bad in many Asian)
-
Characters - alphabets and ideographs
-
Images (mailbox with content indicator is US specific)
-
Direction of writing
-
Sorting of words (different character orderings across Europe)
-
Character coding (ASCII, ISO8859-1, Unicode)
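As a small illustration of the number-format item above, the standard java.text.NumberFormat class formats the same value differently per locale. The class name NumberFormats and its helper method are just for this sketch.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Formats one value under two locales to show the decimal separator changing.
class NumberFormats {
    static String format(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(1.2, Locale.US));       // decimal point
        System.out.println(format(1.2, Locale.GERMANY));  // decimal comma
    }
}
```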
Locales
-
Applications need to be written in ways that do not require hard-coded
strings and images
-
At runtime an application should be able to find strings, images,
fonts, sounds etc that allow the application to display and collect information
-
A way is needed to express
-
Country of location
-
Language of choice
-
Culture
-
Locales offer a simple way of doing this - not perfect, but better than nothing
-
The idea is that by specifying a locale, you specify everything:
date formats, addresses, fonts, images, cultural terms, ...
Unrealistic?
Locale representation - ASCII
Locales can be written in textual form
-
Language only: "en", "fr"
-
Language and country: "en_GB", "fr_FR", "fr_CA", "fr_CH"
-
Extensions which add extra information to the language and country, such as the
character encoding, use a "dot" notation e.g. "zh_TW.big5" (Chinese Taiwan - Big5) or
"zh_TW.eucTW" (Chinese Taiwan - EUC)
-
Variants are added using an "@" notation e.g. "zh_TW.eucTW@radical" or
"zh_TW.eucTW@stroke"
Issues with locales
-
Neither country codes nor language codes are stable in time.
There is no guarantee that a locale used by a program now
will be the same in 20 years time - maintenance, evolution issues
-
A locale is meant to identify a "culture" - language/country is only
an approximation to this
-
The gypsies (Romany) form an identifiable culture, as do the Jews
and (to a lesser extent) religious groups such as Sunni Muslims.
They don't belong to any single country/language combination
-
Is Los Angeles "hip-hop" a culture? Locales don't go down to that level
of detail
-
Europe is now switching to the euro for currency. Should "fr_FR"
use French francs or euros?
-
If an American is in Australia, should they use their American locale
(zip codes vs postcodes) or Australian (dd-mm-yy vs mm-dd-yy)?
-
Different software (e.g. ANSI C, Unix, Java) has different scopes and
capabilities for locales. An ANSI C application will not be as portable
as one using Unix C, which in turn may not be as flexible as a Java one.
Should the user need to know the source language of an application?
-
There are about 160 language codes and 240 country codes. That makes nearly
40,000 combinations. While most make no sense, which ones do? Different
vendors implement different subsets
-
How do you perform matches in a partially complete environment? e.g.
if the locale is "fr_CH" (French spoken in Switzerland)
and this doesn't exist, should it match
"fr_CA" (French spoken in Canada) or "it_CH" (Italian spoken in Switzerland).
Different software matches in different ways - inconsistent user experience
-
Use of 2-letter ASCII language codes only allows 26*26 combinations (676) -
too few for the 6000+ languages currently in existence
-
See Tex Texin,
What's wrong with locales?
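The matching question above ("fr_CH" is missing: what should it match?) has no single right answer. The sketch below shows one possible policy only, truncation: drop the most specific part and retry. The class LocaleFallback and the list of available locales are hypothetical; other software may equally well choose a sibling such as "fr_CA".

```java
import java.util.Arrays;
import java.util.Collection;

// One possible matching policy: try the exact name, then progressively shorter names.
class LocaleFallback {
    // Returns the first available name found by truncating at "_", or null if none.
    static String match(String wanted, Collection<String> available) {
        String candidate = wanted;
        while (true) {
            if (available.contains(candidate)) {
                return candidate;
            }
            int cut = candidate.lastIndexOf('_');
            if (cut < 0) {
                return null;   // nothing left to strip
            }
            candidate = candidate.substring(0, cut);
        }
    }

    public static void main(String[] args) {
        // "fr_CH" is not installed; truncation falls back to plain "fr"
        System.out.println(match("fr_CH", Arrays.asList("fr_CA", "it_CH", "fr")));
        // with no "fr" at all, this policy gives up rather than guess a sibling
        System.out.println(match("fr_CH", Arrays.asList("fr_CA", "it_CH")));
    }
}
```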
Java PropertyResourceBundle
-
This is the simplest way in Java of handling locale-specific resources
which can be handled by strings
-
A property resource bundle is a text file of "key=value" pairs such as
OKButtonLabel=Ok
CancelButtonLabel=Cancel
LogoFile=company_logo.gif
-
These are stored in a file
baseName_locale.properties
-
The English version would be in file
MyApp_en.properties
OKButtonLabel=Ok
CancelButtonLabel=Cancel
LogoFile=company_logo.gif
-
The Australian English version would be in file
MyApp_en_AU.properties
OKButtonLabel=She'll be right mate
CancelButtonLabel=Get lost
There is no need to repeat the LogoFile
since it can be found
by the klunky version of inheritance used by resource bundles
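The inheritance between bundles can be sketched without any files by building the two bundles from in-memory text and linking them by hand. In a real application ResourceBundle.getBundle("MyApp", locale) would locate the files and build this parent chain itself; the class BundleChain and its methods are only for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Builds the en -> en_AU chain by hand to show where a missing key is found.
class BundleChain {
    // Parses "key=value" text and links the result to a parent bundle (may be null).
    static ResourceBundle load(String text, final ResourceBundle parent) {
        try {
            return new PropertyResourceBundle(new StringReader(text)) {
                { setParent(parent); }   // keys missing here are looked up in parent
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in for MyApp_en.properties and MyApp_en_AU.properties
    static ResourceBundle australian() {
        ResourceBundle en = load(
            "OKButtonLabel=Ok\nCancelButtonLabel=Cancel\nLogoFile=company_logo.gif\n", null);
        return load(
            "OKButtonLabel=She'll be right mate\nCancelButtonLabel=Get lost\n", en);
    }

    public static void main(String[] args) {
        ResourceBundle bundle = australian();
        System.out.println(bundle.getString("OKButtonLabel")); // from the en_AU text
        System.out.println(bundle.getString("LogoFile"));      // inherited from the en text
    }
}
```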
Common Locale Data Repository
Distributed versus local applications
-
A local application can assume known data-types and message formats
-
e.g. a Java program can ask for the set of locales and will get an
array of
Locale
objects
-
The definition of
Locale
objects gives methods to get
strings for country and language
-
A distributed application will send messages from a process on one
computer to a process on another
-
The format of messages must be either
-
Text messages have three aspects
-
character set e.g. Unicode
-
coded character set e.g. Unicode 16-bit code points
-
character encoding e.g. UTF-8 or UTF-16
There may also be transport encoding issues e.g. big-endian or little-endian
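To make the difference between character encodings concrete, the sketch below encodes the same text with standard Java charsets and compares the byte counts and byte order. The class name Encodings and the sample string are only for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// The same characters produce different bytes under different encodings.
class Encodings {
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";   // "cafe" with an acute accent on the e
        System.out.println("UTF-8 bytes:  " + utf8Length(text));
        System.out.println("UTF-16 bytes: " + utf16Length(text));
        // Byte order matters for UTF-16: the two bytes of "A" are swapped
        System.out.println(Arrays.toString("A".getBytes(StandardCharsets.UTF_16BE)));
        System.out.println(Arrays.toString("A".getBytes(StandardCharsets.UTF_16LE)));
    }
}
```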
RFC 1958
-
RFC 1958 "Architectural Principles of the Internet" gives general
guidelines for applications, protocols etc designed for the internet
http://www.faqs.org/rfcs/rfc1958.html
-
With regard to names, it states:
"4.3 Public (i.e. widely visible) names should be in case-independent
ASCII. Specifically, this refers to DNS names, and to protocol
elements that are transmitted in text format."
-
"5.4 Designs should be fully international, with support for
localisation (adaptation to local character sets). In particular,
there should be a uniform approach to character set tagging for
information content."
HTTP charset negotiation
HTTP language negotiation
Server-driven negotiation
From the HTTP 1.1 specification:
-
If the selection of the best representation for a response is made by an algorithm located at the server,
it is called server-driven negotiation.
-
Selection is based on the available representations of the response (the dimensions over
which it can vary; e.g. language, content-coding, etc.) and the contents of particular header fields in the request
message or on other information pertaining to the request (such as the network address of the client).
-
Server-driven negotiation is advantageous when the algorithm for selecting from among the available representations
is difficult to describe to the user agent, or when the server desires to send its "best guess" to the client along with the
first response (hoping to avoid the round-trip delay of a subsequent request if the "best guess" is good enough for the
user).
-
In order to improve the server's guess, the user agent MAY include request header fields (Accept, Accept-
Language, Accept-Encoding, etc.) which describe its preferences for such a response.
-
Server-driven negotiation has disadvantages:
-
It is impossible for the server to accurately determine what might be "best" for any given user, since that
would require complete knowledge of both the capabilities of the user agent and the intended use for the
response (e.g., does the user want to view it on screen or print it on paper?).
-
Having the user agent describe its capabilities in every request can be both very inefficient (given that only
a small percentage of responses have multiple representations) and a potential violation of the user's
privacy.
-
It complicates the implementation of an origin server and the algorithms for generating responses to a
request.
-
It may limit a public cache's ability to use the same response for multiple user's requests.
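A minimal sketch of the server side of this negotiation, using the standard java.util.Locale support for parsing an Accept-Language value. The set of available languages and the English default are assumptions made for the example, and the class name LanguageNegotiation is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Picks the server's "best guess" language from an Accept-Language header value.
class LanguageNegotiation {
    // Hypothetical set of representations the server can supply
    static final List<Locale> AVAILABLE = Arrays.asList(Locale.ENGLISH, Locale.FRENCH);

    static Locale choose(String acceptLanguage) {
        // Parses ranges and q-values, ordered by the client's preference
        List<Locale.LanguageRange> wanted = Locale.LanguageRange.parse(acceptLanguage);
        Locale best = Locale.lookup(wanted, AVAILABLE);
        return best != null ? best : Locale.ENGLISH;   // assumed default
    }

    public static void main(String[] args) {
        System.out.println(choose("fr-CH, fr;q=0.9, en;q=0.8"));
        System.out.println(choose("de"));
    }
}
```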
Agent-driven Negotiation
From the HTTP 1.1 specification:
-
With agent-driven negotiation, selection of the best representation for a response is performed
by the user agent after
receiving an initial response from the origin server.
-
Selection is based on a list of the available representations of the
response included within the header fields or entity-body of the initial response, with each representation identified
by its own URI.
-
Selection from among the representations may be performed automatically (if the user agent is
capable of doing so) or manually by the user selecting from a generated (possibly hypertext) menu.
-
Agent-driven negotiation is advantageous when the response would vary over commonly-used dimensions (such as
type, language, or encoding), when the origin server is unable to determine a user agent's capabilities from examining
the request, and generally when public caches are used to distribute server load and reduce network usage.
-
Agent-driven negotiation suffers from the disadvantage of needing a second request to obtain the best alternate
representation. This second request is only efficient when caching is used. In addition, this specification does not
define any mechanism for supporting automatic selection
-
HTTP/1.1 defines the 300 (Multiple Choices) and 406 (Not Acceptable) status codes for enabling agent-driven
negotiation when the server is unwilling or unable to provide a varying response using server-driven negotiation.
Apache type-map negotiation
Common user negotiation
HTML and i18n
There is limited support for i18n in HTML
-
The charset can be specified
-
Unicode characters can be used
-
Language tags can be used at any place
-
Right-to-left markup is allowed
-
Support for special language quotes is included
-
Lists can be numbered using different numbering systems
(western, hebrew, etc)
JavaScript
-
JavaScript has virtually no support for i18n
-
It will support Unicode characters in strings, and you can use the "\uXXXX" notation
-
The Date object has a toLocaleString() method
Web services and i18n
i18n
Lang information in header block
-
One way of passing locale information is to use the header block
-
This is not standardised, and is not represented in the WSDL, so it
would have to be an "out-of-band" agreement
-
Here is a client passing info in the header
import org.apache.axis.client.Call;
import org.apache.axis.client.Service;
import org.apache.axis.message.SOAPHeaderElement;
import javax.xml.namespace.QName;
public class TestHelloI18n {
public static void main(String [] args) {
try {
String language = "en";
String endpoint =
"http://localhost:8090/axis/HelloI18n.jws";
Service service = new Service();
Call call = (Call) service.createCall();
call.setTargetEndpointAddress( new java.net.URL(endpoint) );
call.setOperationName(new QName("http://soapinterop.org/", "sayHello"));
// set the xml:lang header
call.addHeader(new SOAPHeaderElement("xml", "lang",
language));
String ret = (String) call.invoke( new Object[] { "Hello!" } );
System.out.println("Sent 'Hello!', got '" + ret + "'");
} catch (Exception e) {
System.err.println(e.toString());
}
}
}
-
Here is a service getting it back
import org.apache.axis.MessageContext;
import org.apache.axis.Message;
import org.apache.axis.message.SOAPEnvelope;
import org.apache.axis.message.MessageElement;
import org.apache.axis.message.SOAPHeaderElement;
import java.util.Vector;
public class HelloI18n {
public String sayHello(String s) throws Exception {
MessageContext context = MessageContext.getCurrentContext();
Message request = context.getRequestMessage();
SOAPEnvelope env = request.getSOAPEnvelope();
// get the xml:lang header
SOAPHeaderElement header = env.getHeaderByName("xml", "lang");
// the client may not have sent the header; "en" is an assumed default
String language = (header != null) ? header.getValue() : "en";
return "Language " + language + ": " + s;
}
}
i18n Measures and Axis
-
Some of this works fine, some is messy, some is broken...
-
The measure service is okay
import measure.Inch;
import measure.Mm;
public class MeasureConverter {
public Mm inchToMM(Inch in) {
Mm mm = new Mm();
mm.setValue(in.getValue() * 25.4);
return mm;
}
public Inch mmToInch(Mm mm) {
Inch in = new Inch();
in.setValue(mm.getValue() / 25.4);
return in;
}
}
This can be copied to AXIS/MeasureConverter.jws
-
The service uses non-standard data types. In order for Axis to handle
these, it must know how to serialize and deserialize them
-
Serialization and deserialization can be done if the datatype is a
Java Bean and a special entry is made in the WSDD file
-
Alternatively, serialization and deserialization can be done if the datatype is a
Java Bean and
-
code to implement
getSerializer
and
getDeserializer
is included in the datatype
-
a type descriptor is included in the file
-
This changes the class
Measure
to
package measure;
public class Measure implements java.io.Serializable {
private double value;
private String type;
public Measure() {
}
public Measure(double v, String t) {
value = v;
type = t;
}
public void setValue(double v) {
value = v;
}
public double getValue() {
return value;
}
public void setType(String t) {
type = t;
}
public String getType() {
return type;
}
private static org.apache.axis.description.TypeDesc typeDesc =
new org.apache.axis.description.TypeDesc(Measure.class, true);
static {
typeDesc.setXmlType(new javax.xml.namespace.QName("http://measure", "Measure"));
org.apache.axis.description.ElementDesc elemField = new org.apache.axis.description.ElementDesc();
elemField.setFieldName("value");
elemField.setXmlName(new javax.xml.namespace.QName("", "value"));
elemField.setXmlType(new javax.xml.namespace.QName("http://www.w3.org/2001/XMLSchema", "double"));
elemField.setNillable(false);
typeDesc.addFieldDesc(elemField);
elemField = new org.apache.axis.description.ElementDesc();
elemField.setFieldName("type");
elemField.setXmlName(new javax.xml.namespace.QName("", "type"));
elemField.setXmlType(new javax.xml.namespace.QName("http://schemas.xmlsoap.org/soap/encoding/", "string"));
elemField.setNillable(true);
typeDesc.addFieldDesc(elemField);
}
/**
* Return type metadata object
*/
public static org.apache.axis.description.TypeDesc getTypeDesc() {
return typeDesc;
}
/**
* Get Custom Serializer
*/
public static org.apache.axis.encoding.Serializer getSerializer(
java.lang.String mechType,
java.lang.Class _javaType,
javax.xml.namespace.QName _xmlType) {
return
new org.apache.axis.encoding.ser.BeanSerializer(
_javaType, _xmlType, typeDesc);
}
/**
* Get Custom Deserializer
*/
public static org.apache.axis.encoding.Deserializer getDeserializer(
java.lang.String mechType,
java.lang.Class _javaType,
javax.xml.namespace.QName _xmlType) {
return
new org.apache.axis.encoding.ser.BeanDeserializer(
_javaType, _xmlType, typeDesc);
}
}
-
It changes the class
Inch
to
package measure;
public class Inch extends Measure {
public Inch() {
setType("inch");
}
// Type metadata
private static org.apache.axis.description.TypeDesc typeDesc =
new org.apache.axis.description.TypeDesc(Inch.class, true);
static {
typeDesc.setXmlType(new javax.xml.namespace.QName("http://measure", "Inch"));
}
public static org.apache.axis.encoding.Serializer getSerializer(
java.lang.String mechType,
java.lang.Class _javaType,
javax.xml.namespace.QName _xmlType) {
return
new org.apache.axis.encoding.ser.BeanSerializer(
_javaType, _xmlType, typeDesc);
}
/**
* Get Custom Deserializer
*/
public static org.apache.axis.encoding.Deserializer getDeserializer(
java.lang.String mechType,
java.lang.Class _javaType,
javax.xml.namespace.QName _xmlType) {
return
new org.apache.axis.encoding.ser.BeanDeserializer(
_javaType, _xmlType, typeDesc);
}
}
-
The extra code can be obtained by generating the WSDL for the service
and then generating the Java code back from that WSDL
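To make the role of the type descriptor more concrete, the following is a minimal sketch in plain Java, with no Axis classes, of a serializer that is driven by a table of field descriptions. The names MeasureBean, TypeDescSketch and TypeDescDemo are hypothetical and exist only for this illustration. It is not how Axis implements BeanSerializer, and it ignores namespaces, nillable fields and XML escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the Measure bean, without any Axis metadata.
class MeasureBean {
    private final double value;
    private final String type;
    MeasureBean(double v, String t) { value = v; type = t; }
    public double getValue() { return value; }
    public String getType() { return type; }
}

// Hypothetical, much-simplified analogue of a type descriptor:
// it records an XML type name and, for each field, the XML element
// name together with a getter that reads the field from the bean.
class TypeDescSketch<T> {
    private final String xmlTypeName;
    // XML element name -> getter, kept in insertion order
    private final Map<String, Function<T, Object>> fields = new LinkedHashMap<>();

    TypeDescSketch(String xmlTypeName) { this.xmlTypeName = xmlTypeName; }

    TypeDescSketch<T> addField(String xmlName, Function<T, Object> getter) {
        fields.put(xmlName, getter);
        return this;
    }

    // Write one element per described field (no escaping, no namespaces).
    String toXml(T bean) {
        StringBuilder sb = new StringBuilder();
        sb.append('<').append(xmlTypeName).append('>');
        for (Map.Entry<String, Function<T, Object>> e : fields.entrySet()) {
            sb.append('<').append(e.getKey()).append('>')
              .append(e.getValue().apply(bean))
              .append("</").append(e.getKey()).append('>');
        }
        sb.append("</").append(xmlTypeName).append('>');
        return sb.toString();
    }
}

class TypeDescDemo {
    static String demo() {
        TypeDescSketch<MeasureBean> desc = new TypeDescSketch<MeasureBean>("Measure")
            .addField("value", MeasureBean::getValue)
            .addField("type", MeasureBean::getType);
        return desc.toXml(new MeasureBean(2.0, "inch"));
    }

    public static void main(String[] args) {
        // prints <Measure><value>2.0</value><type>inch</type></Measure>
        System.out.println(demo());
    }
}
```

The point of the sketch is only that the serializer needs no knowledge of the bean beyond what the descriptor tells it, which is why the generated classes carry their descriptor with them.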
-
Inheritance breaks the return data type in Axis 1.2
-
The method
inchToMM
works when it takes Inch
as a
parameter datatype that inherits from Measure
-
However, the method
inchToMM
then returns Measure
as its
datatype rather than Mm
-
For this method to work,
Mm
must not inherit from
Measure
:
package measure;
public class Mm {
private double value;
private String type = "mm";
public Mm() {
}
public Mm(double v, String t) {
value = v;
type = t;
}
public void setValue(double v) {
value = v;
}
public double getValue() {
return value;
}
public void setType(String t) {
type = t;
}
public String getType() {
return type;
}
// Type metadata
private static org.apache.axis.description.TypeDesc typeDesc =
new org.apache.axis.description.TypeDesc(Mm.class, true);
static {
typeDesc.setXmlType(new javax.xml.namespace.QName("http://measure", "Mm"));
org.apache.axis.description.ElementDesc elemField = new org.apache.axis.description.ElementDesc();
elemField.setFieldName("value");
elemField.setXmlName(new javax.xml.namespace.QName("", "value"));
elemField.setXmlType(new javax.xml.namespace.QName("http://www.w3.org/2001/XMLSchema", "double"));
elemField.setNillable(false);
typeDesc.addFieldDesc(elemField);
elemField = new org.apache.axis.description.ElementDesc();
elemField.setFieldName("type");
elemField.setXmlName(new javax.xml.namespace.QName("", "type"));
elemField.setXmlType(new javax.xml.namespace.QName("http://schemas.xmlsoap.org/soap/encoding/", "string"));
elemField.setNillable(true);
typeDesc.addFieldDesc(elemField);
}
public static org.apache.axis.encoding.Serializer getSerializer(
java.lang.String mechType,
java.lang.Class _javaType,
javax.xml.namespace.QName _xmlType) {
return
new org.apache.axis.encoding.ser.BeanSerializer(
_javaType, _xmlType, typeDesc);
}
/**
* Get Custom Deserializer
*/
public static org.apache.axis.encoding.Deserializer getDeserializer(
java.lang.String mechType,
java.lang.Class _javaType,
javax.xml.namespace.QName _xmlType) {
return
new org.apache.axis.encoding.ser.BeanDeserializer(
_javaType, _xmlType, typeDesc);
}
}
-
Similar comments apply to the method
mmToInch
-
These classes should be collected into a jar file and
copied to AXIS/WEB-INF/lib/
-
A client that uses the WSDL-generated files is
import DefaultNamespace.*;
import measure.*;
public class TestWSDLConverter {
public static void main(String [] args) throws Exception {
// Make a service
MeasureConverterService service =
new MeasureConverterServiceLocator();
// Now use the service to get a stub which implements the SDI (service definition interface).
MeasureConverter port = service.getMeasureConverter();
// Make the actual call
Inch in = new Inch();
in.setValue(2.0);
Mm mm = (Mm) port.inchToMM(in);
System.out.println("Answer is " + mm.getValue());
}
}
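The cast to Mm in the client above only succeeds if the object that comes back really is an Mm. The following plain-Java sketch is one way to picture the inheritance problem described earlier. It assumes, purely for illustration, that a returned value is rebuilt on the client from the declared base type. The classes BaseMeasure, MmMeasure and ReturnTypeSketch are hypothetical and are not the Axis-generated ones.

```java
// Hypothetical simplified base class; not the Axis-generated Measure.
class BaseMeasure {
    double value;
    String type;
}

// Hypothetical subclass playing the role of an Mm that inherits from Measure.
class MmMeasure extends BaseMeasure {
    MmMeasure() { type = "mm"; }
}

class ReturnTypeSketch {
    // Model of a client-side deserializer that only knows the declared
    // return type and so always builds a BaseMeasure.
    // This behaviour is an assumption of the sketch.
    static BaseMeasure rebuildAsDeclaredType(double value, String type) {
        BaseMeasure m = new BaseMeasure();
        m.value = value;
        m.type = type;
        return m;
    }

    // Returns true if the downcast succeeds, false if it throws.
    static boolean castToMmSucceeds(BaseMeasure m) {
        try {
            MmMeasure mm = (MmMeasure) m;
            return mm != null;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // An object that never left the JVM keeps its runtime class.
        BaseMeasure local = new MmMeasure();
        local.value = 50.8;

        // An object rebuilt from the declared type has lost it.
        BaseMeasure remote = rebuildAsDeclaredType(50.8, "mm");

        System.out.println("local cast ok:  " + castToMmSucceeds(local));   // true
        System.out.println("remote cast ok: " + castToMmSucceeds(remote));  // false
    }
}
```

Making Mm a standalone class, as in the listing above, sidesteps the problem because there is no base type for the value to collapse into.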
Summary
-
Web services are a new version of RPC, heavily reliant on XML
-
They have all the standard disadvantages of RPC
-
Location dependence
-
Slow
-
Prone to network errors
-
Prone to security attacks
-
In addition, they have their own disadvantages
-
Verbose transport protocol
-
Heavy processing on client and server
-
Ludicrously centralised services in UDDI
-
Web services have learnt none of the lessons from earlier systems, e.g. Service Location Protocol,
Jini, mobile agents
-
Some very talented engineers have hidden the complexity, e.g. in Microsoft .NET, SOAP::Lite (Perl) and Axis
-
Web services can teach many interesting lessons about computer science,
computer engineering and the social aspects of software adoption
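As a rough illustration of the "verbose transport protocol" point, the sketch below compares the size of a hand-written SOAP 1.1 request for inchToMM with the size of the data it actually carries. The envelope is illustrative only: it was written for this example and is not captured from a real Axis exchange, and the class VerbositySketch and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class VerbositySketch {
    // Hand-written, illustrative SOAP 1.1 request carrying one Inch value.
    static String envelope(double value, String type) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<soapenv:Envelope"
            + " xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\""
            + " xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\""
            + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">"
            + "<soapenv:Body>"
            + "<inchToMM>"
            + "<in xmlns:ns1=\"http://measure\" xsi:type=\"ns1:Inch\">"
            + "<value xsi:type=\"xsd:double\">" + value + "</value>"
            + "<type xsi:type=\"xsd:string\">" + type + "</type>"
            + "</in>"
            + "</inchToMM>"
            + "</soapenv:Body>"
            + "</soapenv:Envelope>";
    }

    // Size of the XML message in bytes when encoded as UTF-8.
    static int envelopeBytes(double value, String type) {
        return envelope(value, type).getBytes(StandardCharsets.UTF_8).length;
    }

    // Size of the payload alone: an 8-byte double plus the type string.
    static int rawBytes(String type) {
        return Double.BYTES + type.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        int xml = envelopeBytes(2.0, "inch");
        int raw = rawBytes("inch");
        System.out.println("SOAP envelope: " + xml + " bytes");
        System.out.println("Raw payload:   " + raw + " bytes");
        System.out.println("Ratio:         " + (xml / raw) + "x");
    }
}
```

The HTTP headers that would accompany such a request are not counted here, so the overhead on the wire would be larger still.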
References
Jan Newmarch (http://jan.newmarch.name)
jan@newmarch.name
Last modified: Fri Dec 9 15:58:04 EST 2005
Copyright © Jan Newmarch