Saturday, March 24, 2007

What kind of reliability do I get with Web Services Reliable Exchange (WS-RX)?


A unified standard for reliable exchange of web service messages has been missing from the WS-* family for some time, but it should be done soon (the last I heard, in mid-2006, was that it was expected in December 2006). Thinking about when and where to use it brought me back to some old thoughts about what it's all about. That is: do I need it, and when do I need it? And the opposite question: when can I discard it? Here are my (non-expert) thoughts on that.

On the one hand it seems obvious that it should always be reliable, because why should I make something that doesn't always work? On the other hand, so much has been running nicely for so long without these painful thoughts (based on a risk assessment, or simply on being optimistic and pragmatic). That assumes I even know what I mean when I write reliable, and even more, what others mean by it. The term has meaning all the way through the architecture layers, from the business level with a reliable business partner, to the process level with reliable processes (sub- and super-processes), down to the reliable messaging and protocols that ensure it. The decision on whether reliability is needed has to come from the business needs, but I guess there are aspects or facets that the different layers could realize in different ways, given proper decoupling between the layers.

It seems natural that the baseline does not involve reliability, and that an effort has to be put in to get it. How big an effort, and what does it buy? That is, what potential problems and error cases does it resist?

Taking a top-down approach, Wiktionary has the following definition of reliability:

  1. The quality of being reliable, dependable, or trustworthy.
  2. A quality of a measurement indicating the degree to which the measure is consistent, that is, over repeated measurements would give the same result.

and the Wikipedia definition of reliability contains:

In general, reliability (systemic def.) is the ability of a system to perform and maintain its functions in routine circumstances, as well as hostile or unexpected circumstances.

The IEEE defines it as ". . . the ability of a system or component to perform its required functions under stated conditions for a specified period of time."

and defines a reliable protocol as:

In computer networking, a reliable protocol is one that ensures reliability properties with respect to the delivery of data to the intended recipient(s), as opposed to an unreliable protocol, which does not guarantee that data will be delivered intact, or that it will be delivered at all.

Java Message Service (JMS)

The first thing I think of in connection with reliable messaging is JMS. A quick googling led to the Novell JMS FAQ and its question Is messaging reliable?
While the quality-of-service guarantees offered in Message Oriented Middleware solutions can vary greatly, and while real-world reliability often depends on administrative issues (such as cluster size and available resources), all JMS-based messaging services are required to offer assured, once-only delivery of messages as an option for applications where reliability is paramount. JMS also allows for configurations that provide a less robust quality of service, so that in cases where speed of delivery might be more important than assured, once-only delivery, a tailored solution can be built. The reliability of JMS-based messaging solutions is thus configurable.

So reliable for JMS means assured, once-only delivery of messages. The JMS standard (Version 1.1, April 12, 2002) can be picked up from Sun's JMS documentation. This is an API and not a wire protocol. The JMS API is an API for accessing enterprise messaging systems from Java programs:

Enterprise messaging products (or as they are sometimes called, Message Oriented Middleware products) are becoming an essential component for integrating intra-company operations. They allow separate business components to be combined into a reliable, yet flexible, system.

Yep, it's reliable, but the specification doesn't say much more about it, except in FAQ 10.1.6, Should JMS Provide End-to-end Synchronous Message Delivery and Notification of Delivery?:

Some messaging systems provide synchronous delivery to destinations as a mechanism for implementing reliable applications. Some systems provide clients with various forms of delivery notification so that the clients can detect dropped or ignored messages. This is not the model defined by JMS.

JMS messaging provides guaranteed delivery via the once-and-only-once delivery semantics of PERSISTENT messages. In addition, message consumers can insure reliable processing of messages by using either CLIENT_ACKNOWLEDGE mode or transacted sessions.

This achieves reliable delivery with minimum synchronization and is the enterprise messaging model most vendors and developers prefer.

JMS does not define a schema of systems messages (such as delivery notifications). If an application requires acknowledgment of message receipt, it can define an application-level acknowledgment message.

These issues are more clearly understood when they are examined in the context of Pub/Sub applications. In this context, synchronous delivery and/or system acknowledgment of receipt are not an effective mechanism for implementing reliable applications (because producers by definition are not, and don’t want to be, responsible for end-to-end message delivery).

So it's up to the application level to add further receipts if they are needed.
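To make the idea of an application-level acknowledgment concrete, here is a minimal sketch in Python. It is not JMS code: two standard-library queues stand in for a request destination and a reply destination, and the dictionary keys `correlation_id` and `reply_to` are my own illustrative names, loosely modelled on the JMSCorrelationID and JMSReplyTo message headers. The function names are hypothetical as well.

```python
import queue
import uuid

# Assumption: plain in-memory queues stand in for messaging destinations.
orders = queue.Queue()   # the destination the producer sends to
replies = queue.Queue()  # the destination the producer expects receipts on


def produce(body):
    """Send a message and return the correlation id to wait on."""
    correlation_id = str(uuid.uuid4())
    orders.put({"correlation_id": correlation_id, "reply_to": replies, "body": body})
    return correlation_id


def consume_one():
    """Process one message, then send the application-level receipt."""
    message = orders.get()
    # ... business processing of message["body"] would happen here ...
    receipt = {"correlation_id": message["correlation_id"], "status": "processed"}
    message["reply_to"].put(receipt)


def await_receipt(correlation_id, timeout=1.0):
    """Return True if the matching receipt arrives within the timeout."""
    try:
        receipt = replies.get(timeout=timeout)
    except queue.Empty:
        return False
    return receipt["correlation_id"] == correlation_id


cid = produce("order 42")
consume_one()
confirmed = await_receipt(cid)
```

The point of the sketch is that the receipt is just another message, defined and interpreted by the application, which is how I read the FAQ answer above.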

OASIS Web Services Reliable Exchange (WS-RX)

On the homepage for the TC there is a link to the overview RDDL page, which in turn links to the current WS-RX specification.

A look in there gives a definition of what WS-RX is all about (it could be an easier read, and a FAQ would also be nice). At the start it says:

Many errors can interrupt a conversation. Messages can be lost, duplicated or reordered. Further the host systems can experience failures and lose volatile state.

So it's the same thing as with JMS: we want to avoid duplicated or wrongly ordered messages and, of course, losing messages.

The WS-ReliableMessaging specification defines an interoperable protocol that enables a Reliable Messaging (RM) Source to accurately determine the disposition of each message it Transmits as perceived by the RM Destination, so as to allow it to resolve any in-doubt status regarding receipt of the message Transmitted. The protocol also enables an RM Destination to efficiently determine which of those messages it Receives have been previously Received, enabling it to filter out duplicate message transmissions caused by the retransmission, by the RM Source, of unacknowledged message. It also enables an RM Destination to Deliver the messages it Receives to the Application Destination in the order in which they were sent by an Application Source, in the event that they are Received out of order. Note that this specification places no restriction on the scope of the RM Source or RM Destination entities. For example, either can span multiple WSDL Ports or Endpoints.

The same story with more words and some abstract source and destination semantics.
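The three mechanisms in that paragraph (the source resolving in-doubt messages through acknowledgments, the destination filtering duplicates, and delivery in the original order) can be sketched in a few lines of Python. This is only a toy model under my own assumptions, not the WS-ReliableMessaging wire protocol: the class and method names are hypothetical, there is no SOAP or network involved, and the only details borrowed from the specification are that messages in a sequence are numbered from 1 and that acknowledgments are reported as ranges of received message numbers.

```python
from dataclasses import dataclass, field


@dataclass
class RMDestination:
    """Toy destination: drops duplicates and delivers in sending order."""
    received: set = field(default_factory=set)    # message numbers seen so far
    pending: dict = field(default_factory=dict)   # received, not yet delivered
    next_to_deliver: int = 1                      # numbering starts at 1
    delivered: list = field(default_factory=list)

    def receive(self, number, body):
        if number not in self.received:           # duplicate elimination
            self.received.add(number)
            self.pending[number] = body
            # Ordered delivery: release the contiguous run starting at
            # the next expected message number.
            while self.next_to_deliver in self.pending:
                self.delivered.append(self.pending.pop(self.next_to_deliver))
                self.next_to_deliver += 1
        return self.ack_ranges()

    def ack_ranges(self):
        """Report received numbers as inclusive (lower, upper) ranges."""
        ranges = []
        for n in sorted(self.received):
            if ranges and ranges[-1][1] == n - 1:
                ranges[-1] = (ranges[-1][0], n)
            else:
                ranges.append((n, n))
        return ranges


@dataclass
class RMSource:
    """Toy source: keeps every message until it has been acknowledged."""
    unacked: dict = field(default_factory=dict)
    next_number: int = 1

    def send(self, body):
        number = self.next_number
        self.next_number += 1
        self.unacked[number] = body
        return number, body

    def on_ack(self, ranges):
        for lower, upper in ranges:
            for n in range(lower, upper + 1):
                self.unacked.pop(n, None)

    def to_retransmit(self):
        return sorted(self.unacked.items())


source, destination = RMSource(), RMDestination()
m1, m2, m3 = (source.send(body) for body in ("a", "b", "c"))

# Simulated trouble: m2 is lost, m3 overtakes m1, and m3 arrives twice.
source.on_ack(destination.receive(*m3))
source.on_ack(destination.receive(*m1))
source.on_ack(destination.receive(*m3))
delivered_before_retry = list(destination.delivered)
in_doubt = source.to_retransmit()

# The source retransmits whatever is still unacknowledged.
for message in in_doubt:
    source.on_ack(destination.receive(*message))
```

After the first three arrivals only "a" has been delivered, because message 2 is still missing and message 3 is held back. Once the source retransmits message 2, the destination releases "b" and "c" in order and the source has nothing left in doubt.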

The protocol enables the implementation of a broad range of reliability features which include ordered Delivery, duplicate elimination, and guaranteed receipt. The protocol can also be implemented with a range of robustness characteristics ranging from in-memory persistence that is scoped to a single process lifetime, to replicated durable storage that is recoverable in all but the most extreme circumstances. It is expected that the Endpoints will implement as many or as few of these reliability characteristics as necessary for the correct operation of the application using the protocol. Regardless of which of the reliability features is enabled, the wire protocol does not change.

I'm not really sure whether guaranteed receipt is an addition compared with JMS, since I guess it's implicit in the specific JMS implementations.

In an Interview with the OASIS WSRX Co-chair - Paul Fremantle he answers the question How does WSRM compare to existing reliable messaging systems such as JMS? with:

The Java Messaging Service API (JMS) is widely adopted, but this doesn't offer an interoperable wire protocol. So you can't talk from one vendor's JMS implementation to another without some kind of bridging in the middle. The other key difference is that WSRM doesn't require you to change your programming model. Because SOAP is already a message based model, you can add reliability in without having to think about concepts such as queues and topics. The result is that WSRM can be very easy to implement if you already have a SOAP or XML based architecture.

This is more of an overall architecture answer, but I guess that in terms of reliability the two are very much alike.