Over the years we have become accustomed to listening to music and general audio from an increasing variety of sources. From only hearing music in concert halls, we can now hear it from TVs and radios, piped throughout shopping malls and elevators, and blaring out from spruikers at individual shops. In addition to that, more and more people are carrying their own portable audio sources, culminating in the current generation of iPods which can store 10,000 music files. The variety of audio sources and possible audio sinks can only be expected to increase with more and more devices being able to generate and consume audio. In addition, we can expect that sources and sinks will become more volatile, with consumers moving within range and out of range of a multiplicity of sources.
Most architectures for home audio-visual systems, such as the Java Media Framework (JMF) or Microsoft DirectShow, are based on a local model, where all generators (e.g. a TV tuner card) and consumers (e.g. a soundcard) are on the same machine. Even though JMF supports remote audio by means of HTTP and RTP, it hides these under a local programming model.
Network architectures are either based on existing middleware such as CORBA, often extending it in some way, or build their own middleware structure oriented towards a particular view of A/V. In the first class are systems such as Multimedia System Services and ... In the second category are systems such as Network-Integrated Multimedia Middleware (NMM). There is work on distributed A/V systems using Java, such as HAVi, but this is quite specific to the FireWire (IEEE 1394) networking protocol.
This paper is oriented towards providing a large-scale service-based architecture where the emphasis is on service advertisement and discovery, simplified as much as possible, with recovery under failure as services disappear. The framework acts at an abstract level of service description, but the implementation levels remain capable of accommodating many transport protocols, handling multiple presentation formats, managing issues such as quality of service, and even using multiple middleware systems.
The system uses Jini for service management. This is a middleware system built on Java that is able to fully exploit Java networking capabilities and object mobility.
The structure of this paper is as follows: the next section discusses Jini as a service management middleware. The following section discusses and defines the service interfaces for our system. After this, additional interfaces that give lower level information are discussed. Some implementation techniques follow this. The succeeding section looks at scalability issues, and finally the paper concludes with a summary and discussion of future work.
Java is a platform-independent language in which programs are compiled to portable byte code. It has become widely accepted from the enterprise level down to embedded systems in small devices. While the scale of hardware variation has led to different levels of virtual machine and core libraries (CLDC, CDC, J2SE, J2EE), there is still a much higher degree of conformity across platforms than in languages compiled to native object code.
In addition, Java has well-defined introspection mechanisms, which lead to standard serialisation techniques. These can be used to separate object data from class code, so that instance data can be moved across a network and combined with class definitions from a separate source. This can be used as the basis for mobile systems of various kinds, from RMI to Jini to mobile agent systems.
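The separation of instance data from class code can be illustrated with standard Java serialisation. The following is a minimal sketch, not code from the system described here: the class `TrackInfo` and the helper `SerialisationSketch` are hypothetical names, and the "network" is just a byte array. Only the field values travel in the byte array; the class definition is resolved separately by the class loader on the reading side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical value class: only its field values are written to the stream.
class TrackInfo implements Serializable {
    private static final long serialVersionUID = 1L;
    final String title;
    final int sampleRateHz;

    TrackInfo(String title, int sampleRateHz) {
        this.title = title;
        this.sampleRateHz = sampleRateHz;
    }
}

class SerialisationSketch {
    // Write the instance data (not the class byte code) to a byte array.
    static byte[] toBytes(Serializable obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuild the object; the class definition comes from the reader's class loader.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class definition not available", e);
        }
    }
}
```

In RMI and Jini the same idea is extended so that the reader may fetch a missing class definition from a network location rather than from its local classpath.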
Jini exploits the mobility of Java code with a service management system tuned towards network realities. It gives service advertisement and discovery, with resilient recovery mechanisms in case of failure. It is interface based, with total flexibility in implementation.
The advantages of this are
There are many variables that affect how A/V is sourced, moved around a network and delivered
Interfaces should contain all the information about how to access services. With audio, all the information about a service can be quite complex: for example, a service might offer a CD track encoded in 16-bit stereo, big-endian, 44.1 kHz sampling in WAV format from an HTTP server. This information may be needed by a consumer that wants to play the file.
But at the most abstract layer an A/V system consists of three players:
For simplicity we define two interfaces: Source and Sink.
To avoid making implementation decisions about pull versus push, we have methods to tell a source about a sink and a sink about a source, and to tell the source to play and the sink to record. How they carry this out is up to the source and sink. Sometimes this won't work: an HTTP source may not be able to deliver to an RTP sink, or an MP3 player may not be able to handle a WAV file. If they do not succeed in negotiating transport and content, then an exception should be thrown. This violates the principle that a service should be usable based on its interface alone, but considerably simplifies matters for controller clients.
A controller that wants to play a sequence of audio tracks to a sink will need to know when one track is finished in order to start the next. The play() and record() methods could block until finished, or return immediately and post an event on completion. The second approach allows more flexibility, and so needs add/remove listener methods for the events.
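The event-based option can be sketched with purely local classes. Everything in the block below is an illustrative assumption: `SimpleSource`, `StopListener`, `SimulatedTrack` and `PlaylistController` are hypothetical names, there is no RMI or Jini event machinery, and "playing" is simulated by a background thread that finishes at once. The point is only that `play()` returns immediately and the controller starts the next track when it receives the stop event.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Simplified local stand-ins for the remote types: no RMI, no Jini events.
interface StopListener {
    void stopped(SimpleSource source);
}

interface SimpleSource {
    void play();                              // returns immediately
    void addSourceListener(StopListener l);
}

// Simulated track: "plays" on a background thread, then posts a stop event.
class SimulatedTrack implements SimpleSource {
    private final String name;
    private final List<StopListener> listeners = new ArrayList<>();

    SimulatedTrack(String name) { this.name = name; }

    @Override
    public String toString() { return name; }

    @Override
    public synchronized void addSourceListener(StopListener l) { listeners.add(l); }

    @Override
    public void play() {
        new Thread(() -> {
            List<StopListener> snapshot;
            synchronized (this) { snapshot = new ArrayList<>(listeners); }
            for (StopListener l : snapshot) l.stopped(this);
        }).start();
    }
}

// Controller that starts the next track when the current one reports a stop.
class PlaylistController implements StopListener {
    private final Iterator<SimpleSource> remaining;
    private final CountDownLatch done = new CountDownLatch(1);
    final List<String> finished = new ArrayList<>();   // names in completion order

    PlaylistController(List<SimpleSource> tracks) {
        for (SimpleSource s : tracks) s.addSourceListener(this);
        remaining = tracks.iterator();
    }

    synchronized void start() { playNext(); }

    @Override
    public synchronized void stopped(SimpleSource source) {
        finished.add(source.toString());
        playNext();
    }

    private void playNext() {
        if (remaining.hasNext()) remaining.next().play();
        else done.countDown();
    }

    // Waits until the whole playlist has finished; false on timeout or interruption.
    boolean awaitFinished(long timeoutMillis) {
        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With blocking `play()` calls the controller would instead be a simple loop, at the cost of tying up a thread for the duration of each track.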
Finally, there are the exceptions that can be thrown by the methods. Attempting to add a source that a sink cannot handle should throw an exception such as IncompatableSourceException. A sink that can handle only a small number of sources (for example, only one) could throw an exception if too many sources are added. A source that is already playing may not be able to satisfy a new request to play.
These considerations lead to a pair of high-level interfaces which seem to be suitable for controllers to manage sources and sinks (other event constants may be added later):
import java.rmi.MarshalledObject;
import java.rmi.RemoteException;
import net.jini.core.event.EventRegistration;
import net.jini.core.event.RemoteEventListener;

public interface Source extends java.rmi.Remote {

    // Event type posted to listeners when the source stops
    int STOP = 1;

    void play() throws
        RemoteException,
        AlreadyPlayingException;

    void stop() throws
        RemoteException,
        NotPlayingException;

    void addSink(Sink sink) throws
        RemoteException,
        TooManySinksException,
        IncompatableSinkException;

    void removeSink(Sink sink) throws
        RemoteException,
        NoSuchSinkException;

    EventRegistration addSourceListener(RemoteEventListener listener,
                                        MarshalledObject handback) throws
        RemoteException;
}// Source
and
import java.rmi.MarshalledObject;
import java.rmi.RemoteException;
import net.jini.core.event.EventRegistration;
import net.jini.core.event.RemoteEventListener;

public interface Sink extends java.rmi.Remote {

    // Event type posted to listeners when the sink stops
    int STOP = 1;

    void record() throws
        RemoteException,
        AlreadyRecordingException;

    void stop() throws
        RemoteException,
        NotRecordingException;

    void addSource(Source src) throws
        RemoteException,
        TooManySourcesException,
        IncompatableSourceException;

    void removeSource(Source src) throws
        RemoteException,
        NoSuchSourceException;

    EventRegistration addSinkListener(RemoteEventListener listener,
                                      MarshalledObject handback) throws
        RemoteException;

    void removeSinkListener(RemoteEventListener listener) throws
        RemoteException,
        NoSuchListenerException;
}// Sink
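A controller that knows only these two interfaces might link a source to a sink roughly as follows. This is a sketch under stated assumptions rather than the system's actual client: the interfaces are cut-down local copies (listener methods and most exceptions are omitted so that the block compiles without the Jini libraries), `LinkController` is a hypothetical name, and starting the sink before the source is an assumed ordering, not something the interfaces prescribe.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Cut-down local copies of the interfaces: listener methods and most
// exceptions are omitted so that no Jini classes are needed.
class IncompatableSourceException extends Exception {}
class IncompatableSinkException extends Exception {}

interface Source extends Remote {
    void play() throws RemoteException;
    void addSink(Sink sink) throws RemoteException, IncompatableSinkException;
}

interface Sink extends Remote {
    void record() throws RemoteException;
    void addSource(Source src) throws RemoteException, IncompatableSourceException;
}

class LinkController {
    /** Returns true if the pair was linked and started, false otherwise. */
    static boolean link(Source source, Sink sink) {
        try {
            // Tell each side about the other; either may refuse.
            sink.addSource(source);
            source.addSink(sink);
            // Assumed ordering: get the sink ready before the source starts.
            sink.record();
            source.play();
            return true;
        } catch (IncompatableSourceException | IncompatableSinkException e) {
            return false;   // transport or content negotiation failed
        } catch (RemoteException e) {
            return false;   // network or remote failure
        }
    }
}
```

A fuller client would also undo a half-completed link (for example, by calling `removeSource` if `addSink` fails) and would distinguish the failure causes for the user.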
The two interfaces given above are enough to identify sources and sinks to a third party client (or to each other). Negotiating whether they can talk to each other may require more information, which can be supplied by further interfaces.
The Java Media Framework (JMF) has methods such as getSupportedContentTypes()
which returns an array of strings. Other media toolkits have similar mechanisms.
This isn't type-safe: it relies on all parties having the same strings and attaching
the same meaning to each. In addition to this, if a new type comes along, there isn't
a reliable means of specifying this information to others. A type-safe system can at
least specify this by class files.
Interfaces are more type-safe than strings: a WAV interface, an Ogg interface, etc. This doesn't easily allow extension to the multiplicity of content type variations (bit size, sampling rate, etc), but the current content handlers seem to be able to handle most of these variations anyway, so it seems feasible to ignore them at an application level.
The content interfaces are just place-holders:
package presentation;
public interface Ogg extends java.rmi.Remote {
}
A source that could make an audio stream available in Ogg Vorbis format would signal this by implementing the Ogg interface. A sink that can manage Ogg Vorbis streams would also implement this interface.
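With place-holder interfaces, a compatibility check reduces to ordinary type tests. The sketch below is illustrative only: `Wav` and `Mp3` are hypothetical additional content interfaces, `ContentMatcher` and the two example classes are hypothetical names, and the list of known content types is an assumption about what a particular client has class files for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Place-holder content interfaces in the style of Ogg (Wav and Mp3 are hypothetical).
interface Ogg extends java.rmi.Remote {}
interface Wav extends java.rmi.Remote {}
interface Mp3 extends java.rmi.Remote {}

// Example services: the content types are declared by the interfaces implemented.
class OggWavSource implements Ogg, Wav {}
class OggSink implements Ogg {}

class ContentMatcher {
    // Content types this client knows about; a new type arrives as a new class file.
    static final List<Class<?>> KNOWN = Arrays.asList(Ogg.class, Wav.class, Mp3.class);

    /** Returns the content interfaces implemented by both the source and the sink. */
    static List<Class<?>> common(Object source, Object sink) {
        List<Class<?>> shared = new ArrayList<>();
        for (Class<?> type : KNOWN) {
            if (type.isInstance(source) && type.isInstance(sink)) {
                shared.add(type);
            }
        }
        return shared;
    }
}
```

Here `ContentMatcher.common(new OggWavSource(), new OggSink())` yields a list containing only `Ogg.class`, since that is the one content interface both objects implement. The same test works on a service proxy, provided the proxy implements the same interfaces as the service.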
In a similar way, the transport mechanisms may be represented by interfaces. A transport sink will get the information from a source using some unspecified network transport mechanism. The audio stream can be made available to any other object by exposing an InputStream. This is a standard Java stream, not the special one used by JMF. Similarly, a transport source would make an output stream available for source-side objects to write data into.
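import java.io.InputStream;

public interface TransportSink {
    // Stream from which sink-side objects read the incoming audio data
    public InputStream getInputStream();
}// TransportSink
and
import java.io.OutputStream;

public interface TransportSource {
    // Stream into which source-side objects write the outgoing audio data
    public OutputStream getOutputStream();
}// TransportSource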
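As an illustration of how small a transport implementation can be, the following sketch shows a "pull" transport sink that fetches its data from a URL. It is an assumption-laden example, not one of the implementations described in this paper: `UrlTransportSink` and `StreamDrain` are hypothetical names, the interface is repeated locally so that the block is self-contained, and I/O failures are wrapped in an unchecked exception because the interface declares none.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;

// Local copy of the interface so that the sketch is self-contained.
interface TransportSink {
    InputStream getInputStream();
}

// Hypothetical "pull" transport: fetches the audio from a URL (http:, file:, ...).
class UrlTransportSink implements TransportSink {
    private final URL url;

    UrlTransportSink(URL url) { this.url = url; }

    @Override
    public InputStream getInputStream() {
        try {
            return url.openStream();
        } catch (IOException e) {
            // The interface declares no checked exception, so wrap it.
            throw new UncheckedIOException(e);
        }
    }
}

// A content-layer object sees only the stream, never the transport behind it.
class StreamDrain {
    static byte[] readAll(TransportSink transport) {
        try (InputStream in = transport.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real sink would hand the stream to a content handler (an Ogg decoder, say) rather than buffer it all in memory; `readAll` is only there to show that the consumer is independent of the transport.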
By separating the transport and content layers, we have a model that follows a part of the ISO 7-layer model: transport and presentation layers. The communication paths for a "pull" sink are
The classes involved in a "pull" sink could look like
A variety of implementations have been built using these interfaces. The separation of transport and content (presentation) and the networking support built into Java mean that the implementations are very small - typically just a few dozen lines.
A number of clients to link sources to sinks have also been built. The simplest just links any source to any sink. More complex graphical user interfaces have also been built, and here the bulk of the code lies in the Swing objects.
In a normal service architecture, creating 10,000 services will create at least 10,000 objects. In Jini 2.0 using Jeri, this number will be substantially larger: the programmer will need to create an exporter for each service, and generate a proxy for each service. Behind the scenes, many more objects may be created.
We tested the memory requirements for such a large number of objects by writing a server which just created an object 10,000 times, created an exporter and proxy and exported the proxy. The results are shown in table XXX. Using a "larger" object, we got table XXX.
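The shape of such a test can be sketched with plain RMI standing in for Jeri. This is not the test program used for the tables: `Ping`, `PingImpl` and `ExportLoad` are hypothetical names, `UnicastRemoteObject.exportObject` replaces the Jini exporter, and the heap figure obtained from `Runtime` is only a rough indication because garbage collection timing is not controlled.

```java
import java.rmi.NoSuchObjectException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal service; it stands in for an audio source or sink.
interface Ping extends Remote {
    void ping() throws RemoteException;
}

class PingImpl implements Ping {
    @Override
    public void ping() {}
}

class ExportLoad {
    final List<Remote> services = new ArrayList<>();  // strong references keep the objects alive
    final List<Remote> proxies = new ArrayList<>();

    /** Creates and exports n services; returns the approximate change in used heap, in bytes. */
    long exportMany(int n) throws RemoteException {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        long before = rt.totalMemory() - rt.freeMemory();
        for (int i = 0; i < n; i++) {
            Ping service = new PingImpl();
            // Port 0 lets RMI choose an anonymous port for the exported object.
            proxies.add(UnicastRemoteObject.exportObject(service, 0));
            services.add(service);
        }
        System.gc();
        return (rt.totalMemory() - rt.freeMemory()) - before;
    }

    /** Unexports everything so that the RMI runtime can shut down. */
    void shutdown() throws NoSuchObjectException {
        for (Remote service : services) {
            UnicastRemoteObject.unexportObject(service, true);
        }
        services.clear();
        proxies.clear();
    }
}
```

Calling `exportMany(10000)` gives a first estimate of the per-service overhead on a given virtual machine; a more careful measurement would repeat the run and use a heap profiler.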
We did these tests with normal services, and then with Activatable services. Using activatable services requires an activation server such as rmid. Using activation means that the memory load is moved into the activation server, which caches services on disk and reactivates them as needed. The figures for service memory use, activation server memory use and activation server disk use are given in table XXX.
Managing 10,000 services will put memory and processing load on the lookup service. These are given in table XXX.
Each service will have an associated lease. Jini normally grants about 5 minutes per lease before it needs to be renewed, so 10,000 leases will need renewal at an average rate of about 33 leases per second (10,000 leases / 300 s). The network traffic will be ... and the processor usage will be ... (There should be figures for lease renewal for DHCP for comparison - someone must have studied DHCP overheads for large networks).
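The renewal rate is a simple ratio, shown here as a small helper so that other lease durations can be tried. `LeaseLoad` is a hypothetical name, and the calculation assumes that every lease is renewed exactly once per lease period and that renewals are spread evenly over time.

```java
class LeaseLoad {
    /** Average renewals per second if each lease is renewed once per lease period. */
    static double renewalsPerSecond(int leases, int leaseSeconds) {
        if (leaseSeconds <= 0) {
            throw new IllegalArgumentException("lease duration must be positive");
        }
        return (double) leases / leaseSeconds;
    }
}
```

For 10,000 leases of 5 minutes each, `LeaseLoad.renewalsPerSecond(10_000, 5 * 60)` gives about 33.3 renewals per second. Clients that renew well before expiry for safety would push the real rate higher.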
Sources and sinks can attempt to link to each other directly or via a third party agent.
The Source and Sink interfaces form a first step in this. They may need to negotiate based on further interfaces that each implements.
A sink service that records to a file on disk presents an interesting case that can be handled within this framework, but which requires additional information. A service is defined by its contract. A sink must be able to record, or throw a known exception. A file sink will need to have a file selected. If none is selected, it could throw a NoFileSelectedException, but this would break the contract since a client may not know about this exception. So a file sink will need to be able to handle this case without complaint (say by discarding the data or saving it in a default file).
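The "default file" option can be sketched as follows. This is a local illustration under stated assumptions, not the file sink used in the system: `DefaultingFileSink` is a hypothetical class, RMI and the `Sink` interface are left out for brevity, and the audio arrives as a plain `InputStream`.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical local sketch: RMI and the Sink interface are left out for brevity.
class DefaultingFileSink {
    private final File defaultFile;
    private File selected;          // null until a client chooses a file

    DefaultingFileSink(File defaultFile) { this.defaultFile = defaultFile; }

    /** Returns false (rather than throwing) if the file cannot be used. */
    boolean setFile(File sinkFile) {
        if (sinkFile == null || sinkFile.isDirectory()) {
            return false;
        }
        selected = sinkFile;
        return true;
    }

    /** Records the stream, falling back to the default file if none was selected. */
    File record(InputStream audio) {
        File target = (selected != null) ? selected : defaultFile;
        try {
            Files.copy(audio, target.toPath(), StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }
}
```

A client that knows nothing about files can still call `record` and get sensible behaviour, while a client that does know about file selection can call `setFile` first.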
A file sink will expose an interface
import java.io.File;
import java.rmi.RemoteException;

public interface FileSink extends common.Sink {

    // Select the file to record into; returns false if it cannot be used
    public boolean setFile(File sinkFile) throws RemoteException;

    /**
     * Methods to browse the file system.
     * Based on FileSystemView from JFileChooser.
     */
    public File[] getFiles(File dir, boolean useFileHiding) throws RemoteException;

    public File getHomeDirectory() throws RemoteException;

    public File getDefaultDirectory() throws RemoteException;

    public File createNewFolder(File dir) throws RemoteException, java.io.IOException;
}// FileSink
which will allow any third party to browse and choose a sink file. A GUI client will not be expected to know this interface, though (or any interface apart from Source and Sink). So it will not be able to choose a file unless the sink itself can provide a UI.
The Jini community has standardised a UI mechanism. This allows a service to specify one or more user interface objects, for example based on an AWT Frame or Swing JDialog. A client may choose to use such a UI based on its own preferences.
However, the standard Jini UI will not quite handle the "file sink" situation. The Jini UI assumes that a client knows all the interfaces of a service, and is just replacing its own UI with that supplied by the service. Roles such as "main UI" allow the service to specify non-modal UI objects such as Frame or non-modal JDialog.
The requirement to choose a file before recording means that the standard Jini UI roles are not adequate. We have therefore added "Setup" and "Supplementary" roles to cover the cases where a service has extra interfaces that the client does not know about, but which may be needed in a modal or non-modal manner (a non-modal additional interface may be a volume control, for example).
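One way the two extra roles might be expressed is as role interfaces that each carry a role string, in the style of the standard role interfaces. The sketch below is an assumption throughout: the package name `audio.ui`, the `UiEntry` stand-in for a UI descriptor and the `UiSelector` helper are hypothetical, and it assumes that "Setup" UIs are the ones a client shows before recording starts.

```java
import java.util.ArrayList;
import java.util.List;

// Role interfaces, each carrying a role string. The package name "audio.ui"
// is hypothetical.
interface SetupUI {
    String ROLE = "audio.ui.SetupUI";
}

interface SupplementaryUI {
    String ROLE = "audio.ui.SupplementaryUI";
}

// Minimal stand-in for a UI descriptor: just the role and a label.
class UiEntry {
    final String role;
    final String label;

    UiEntry(String role, String label) {
        this.role = role;
        this.label = label;
    }
}

class UiSelector {
    /** Labels of the UIs a client should show before calling record(). */
    static List<String> setupUis(List<UiEntry> entries) {
        List<String> labels = new ArrayList<>();
        for (UiEntry e : entries) {
            if (SetupUI.ROLE.equals(e.role)) {
                labels.add(e.label);
            }
        }
        return labels;
    }
}
```

A file sink would advertise its file chooser under the "Setup" role, and a volume control under the "Supplementary" role, so that a client knowing only Source and Sink can still present both at the right time.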
We have presented an architecture for A/V systems that will scale to large numbers of services. The system is targeted towards simplicity while still retaining the ability for detailed service negotiation using multiple transport and middleware systems.
There is much work to be done in exploiting this architecture by filling in the details of various content types. More important is determining the limits of service architecture scalability and how to deal with highly dynamic situations.