Sunday, October 21, 2007

Searching UDDI version 3 with the browse pattern - wildcards and findQualifiers


It's been some time since I last looked at UDDI, but the availability of the new OIO e-Business registry has reawakened my interest. The old OIO UDDI supported only version 2 (see for example my post Using the ISB UDDI #1), whereas the new one supports UDDI version 3. In this post I'll walk through how to do a simple search for all tModels in a version 3 registry by using the web service API that comes with UDDI.

The WSDL for the Inquiry API

The new registry has a correct Inquiry WSDL that covers the web service APIs for no fewer than all three UDDI versions:

    <?xml version="1.0" encoding="UTF-8"?>
    <definitions name="UDDI_API_V3"
                 targetNamespace="urn:uddi-org:api_v3generated/"
                 xmlns:tns="urn:uddi-org:api_v3generated/"
                 xmlns="http://schemas.xmlsoap.org/wsdl/">

      <import namespace="urn:uddi-org:api_v2"
              location="http://publish.uddi.ehandel.gov.dk:80/registry/uddi/inquiry/2.0/wsdl" />

      <import namespace="urn:uddi-org:api_v3"
              location="http://publish.uddi.ehandel.gov.dk:80/registry/uddi/inquiry/3.0/wsdl" />

      <import namespace="urn:uddi-org:inquiry"
              location="http://publish.uddi.ehandel.gov.dk:80/registry/uddi/inquiry/1.0/wsdl" />

    </definitions>

I have no idea about the challenges with backwards compatibility, but it is covered by the specification. The only confusing thing here is that we're doing inquiry against an endpoint whose host name is publish.uddi.ehandel.gov.dk, but I guess the meaning is that it's the production site, where the content is publicly available and as such published. The WSDL for the inquiry API in version 3:

    <?xml version="1.0" encoding="UTF-8"?>
    <definitions name="UDDI_API_V3"
                 targetNamespace="urn:uddi-org:api_v3"
                 xmlns:tns="urn:uddi-org:api_v3"
                 xmlns:api_v3_binding="urn:uddi-org:api_v3_binding"
                 xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
                 xmlns="http://schemas.xmlsoap.org/wsdl/">

      <documentation xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
                     xmlns:api_v3_binding="urn:uddi-org:api_v3_binding"
                     xmlns:tns="urn:uddi-org:api_v3">
        Copyright 2001-2005 Systinet Corp. All rights reserved. Use is subject to license terms.

        WSDL SOAP/HTTP binding for UDDI V3 Security, Publication and Inquiry APIs.
      </documentation>

      <import namespace="urn:uddi-org:api_v3_binding" location="uddi_api_v3_binding.wsdl" />

      <service name="UDDI_Inquiry_SoapService">
        <port name="UDDI_Inquiry_PortType" binding="api_v3_binding:UDDI_Inquiry_SoapBinding">
          <soap:address location="http://publish.uddi.ehandel.gov.dk:80/registry/uddi/inquiry" />
        </port>
      </service>

      <service name="UDDI_Publication_SoapService">
        <port name="UDDI_Publication_PortType" binding="api_v3_binding:UDDI_Publication_SoapBinding">
          <soap:address location="urn:unknown-location-uri" />
        </port>
      </service>

      <service name="UDDI_Security_SoapService">
        <port name="UDDI_Security_PortType" binding="api_v3_binding:UDDI_Security_SoapBinding">
          <soap:address location="urn:unknown-location-uri" />
        </port>
      </service>

    </definitions>

The endpoint is given, and I'm ready for the next step.

Not that it's significant, but it's a little confusing that the Security and Publication services are also mentioned here (under 'Inquiry'), though without specific endpoints.
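
As a side note, here is a minimal sketch of reading the endpoints out of such a WSDL with Python's standard library. It is not the Axis2 tooling used in this post; the function name service_endpoints and the trimmed SAMPLE_WSDL (two of the three services from the listing above) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/"


def service_endpoints(wsdl_text):
    """Map each WSDL service name to the soap:address location of its port."""
    root = ET.fromstring(wsdl_text)
    endpoints = {}
    for service in root.findall(f"{{{WSDL_NS}}}service"):
        address = service.find(f"{{{WSDL_NS}}}port/{{{SOAP_NS}}}address")
        if address is not None:
            endpoints[service.get("name")] = address.get("location")
    return endpoints


# Trimmed copy of the version 3 WSDL shown above (illustrative only).
SAMPLE_WSDL = """
<definitions name="UDDI_API_V3"
             targetNamespace="urn:uddi-org:api_v3"
             xmlns:api_v3_binding="urn:uddi-org:api_v3_binding"
             xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
             xmlns="http://schemas.xmlsoap.org/wsdl/">
  <service name="UDDI_Inquiry_SoapService">
    <port name="UDDI_Inquiry_PortType" binding="api_v3_binding:UDDI_Inquiry_SoapBinding">
      <soap:address location="http://publish.uddi.ehandel.gov.dk:80/registry/uddi/inquiry" />
    </port>
  </service>
  <service name="UDDI_Publication_SoapService">
    <port name="UDDI_Publication_PortType" binding="api_v3_binding:UDDI_Publication_SoapBinding">
      <soap:address location="urn:unknown-location-uri" />
    </port>
  </service>
</definitions>
"""

if __name__ == "__main__":
    for name, location in service_endpoints(SAMPLE_WSDL).items():
        print(name, "->", location)
```

Only the inquiry service resolves to a real HTTP address; the placeholder urn:unknown-location-uri shows up for the others, which matches the observation above.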

Coding the ws client with Axis2

Want to search the UDDI programmatically? No problem: take the WSDL, run it through your favorite WS tool, create your search class, and you're ready to do the search. That was what I did with the Axis2 nightly and my preferred binding, XmlBeans. Well done, Axis2 committers!

I could probably have used the Systinet Oracle Java API for the registry, but I don't see much use for it - yes there could be nice utility methods etc. but I don't need them.

The first quick search - uhmm, is it really empty?

Based on my earlier experience with the version 2 Inquiry programmer's API, I knew that '%' is used as the wildcard, so I decided to do a search on all tModels. In the UDDI version 3 specification this is called the browse pattern, which is spot on:

Software that allows people to explore and examine large quantities of data requires browse capabilities. The browse pattern characteristically involves starting with some broad information, performing a search, finding general result sets and then selecting more specific information for drill-down.

The request went like:

    <?xml version='1.0' encoding='UTF-8'?>
    <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
      <soapenv:Body>
        <urn:find_tModel xmlns:urn="urn:uddi-org:api_v3">
          <urn:name>%</urn:name>
        </urn:find_tModel>
      </soapenv:Body>
    </soapenv:Envelope>

And the response:

    <?xml version="1.0" encoding="UTF-8"?>
    <Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
      <Body>
        <tModelList xmlns="urn:uddi-org:api_v3">
          <listDescription>
            <includeCount>0</includeCount>
            <actualCount>0</actualCount>
            <listHead>1</listHead>
          </listDescription>
        </tModelList>
      </Body>
    </Envelope>

Uhmmm, this couldn't be true, so either I was doing something wrong or there would have to be some policy hiding all results from me. Naturally, it was the former. As described in the next section, in version 3 I have to add a specific findQualifier to the request:

    <?xml version='1.0' encoding='UTF-8'?>
    <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
      <soapenv:Body>
        <urn:find_tModel xmlns:urn="urn:uddi-org:api_v3">
          <urn:findQualifiers>
            <urn:findQualifier>uddi:uddi.org:findqualifier:approximatematch</urn:findQualifier>
          </urn:findQualifiers>
          <urn:name>%</urn:name>
        </urn:find_tModel>
      </soapenv:Body>
    </soapenv:Envelope>
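
For readers without an Axis2 setup, the same request can be put together by hand. The following is only a sketch using Python's standard library, not the generated XmlBeans client from this post; build_find_tmodel is a hypothetical helper name, and actually POSTing the bytes to the inquiry endpoint is left out.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
UDDI_V3 = "urn:uddi-org:api_v3"
APPROXIMATE_MATCH = "uddi:uddi.org:findqualifier:approximatematch"

# Use the same prefixes as the request shown above when serializing.
ET.register_namespace("soapenv", SOAP_ENV)
ET.register_namespace("urn", UDDI_V3)


def build_find_tmodel(name, find_qualifiers=()):
    """Return a SOAP envelope (bytes) holding a UDDI v3 find_tModel request."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    find = ET.SubElement(body, f"{{{UDDI_V3}}}find_tModel")
    if find_qualifiers:
        # findQualifiers goes before name, as in the request above.
        qualifiers = ET.SubElement(find, f"{{{UDDI_V3}}}findQualifiers")
        for qualifier in find_qualifiers:
            ET.SubElement(qualifiers, f"{{{UDDI_V3}}}findQualifier").text = qualifier
    ET.SubElement(find, f"{{{UDDI_V3}}}name").text = name
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    print(build_find_tmodel("%", [APPROXIMATE_MATCH]).decode("utf-8"))
```

Calling build_find_tmodel("%") without qualifiers produces the first request, the one that came back empty.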

And now I get results, a lot in fact (1.4 MB / 27,729 lines of pretty-printed XML), as this snippet shows:

    <?xml version="1.0" encoding="utf-8"?>
    <tModelList xmlns="urn:uddi-org:api_v3">
      <listDescription>
        <includeCount>6929</includeCount>
        <actualCount>6929</actualCount>
        <listHead>1</listHead>
      </listDescription>
      <tModelInfos>
        <tModelInfo tModelKey="uddi:systinet.com:uddi:service:porttype:account">
          <name>AccountApi</name>
          <description xml:lang="en">This portType defines all operations with the accounts.</description>
        </tModelInfo>
        <tModelInfo tModelKey="uddi:systinet.com:uddi:service:binding:account">
          <name>Account_SoapBinding</name>
          <description xml:lang="en">This is the SOAP binding for the Account portType.</description>
        </tModelInfo>
        <tModelInfo tModelKey="uddi:systinet.com:uddi:service:porttype:administrationutils">
          <name>administrationUtilsApi</name>
          <description xml:lang="en">This portType defines all operations with the administrationUtils.</description>
        </tModelInfo>
        <tModelInfo tModelKey="uddi:systinet.com:uddi:service:binding:administrationutils">
          <name>administrationUtils_SoapBinding</name>
          <description xml:lang="en">This is the SOAP binding for the administrationUtils portType.</description>
        </tModelInfo>
        <tModelInfo tModelKey="uddi:42f92342-c3ed-46ff-8a8a-6518f55d5cd5">
          <name>Application response service</name>
          <description xml:lang="en">NIST definition of the service interface for handling UBL application response messages.</description>
        </tModelInfo>

The header information in listDescription shows that there are currently 6929 tModels in the registry.
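
Reading those counts programmatically is straightforward. Below is a small sketch with Python's standard library; summarize_tmodel_list is a hypothetical helper, EMPTY_RESPONSE is the first response from this post, and RESULT_SNIPPET is a trimmed version of the snippet above with closing tags added so that it parses.

```python
import xml.etree.ElementTree as ET

UDDI_V3 = "urn:uddi-org:api_v3"
NS = {"uddi": UDDI_V3}


def summarize_tmodel_list(xml_text):
    """Return (counts, names) from a tModelList, with or without a SOAP envelope."""
    root = ET.fromstring(xml_text)
    # iter() also yields the root itself, so a bare tModelList works too.
    tmodel_list = next(root.iter(f"{{{UDDI_V3}}}tModelList"))
    counts = {
        field: int(tmodel_list.findtext(f"uddi:listDescription/uddi:{field}", namespaces=NS))
        for field in ("includeCount", "actualCount", "listHead")
    }
    names = {
        info.get("tModelKey"): info.findtext("uddi:name", namespaces=NS)
        for info in tmodel_list.iterfind("uddi:tModelInfos/uddi:tModelInfo", NS)
    }
    return counts, names


EMPTY_RESPONSE = """<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
  <Body>
    <tModelList xmlns="urn:uddi-org:api_v3">
      <listDescription>
        <includeCount>0</includeCount>
        <actualCount>0</actualCount>
        <listHead>1</listHead>
      </listDescription>
    </tModelList>
  </Body>
</Envelope>"""

# Trimmed to one entry and closed for well-formedness (illustrative only).
RESULT_SNIPPET = """<tModelList xmlns="urn:uddi-org:api_v3">
  <listDescription>
    <includeCount>6929</includeCount>
    <actualCount>6929</actualCount>
    <listHead>1</listHead>
  </listDescription>
  <tModelInfos>
    <tModelInfo tModelKey="uddi:systinet.com:uddi:service:porttype:account">
      <name>AccountApi</name>
    </tModelInfo>
  </tModelInfos>
</tModelList>"""

if __name__ == "__main__":
    print(summarize_tmodel_list(EMPTY_RESPONSE))
    print(summarize_tmodel_list(RESULT_SNIPPET))
```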

A deeper look at Find Qualifiers

In section 5.1.4 Find Qualifiers, the second row in the table Find Qualifiers by API defines "approximateMatch" with the tModel name uddi-org:approximateMatch:SQL99. The introduction says:

Each of the find_xx APIs accepts an optional findQualifiers argument, which may contain multiple findQualifier values. Find qualifiers may be either tModelKeys or may be referenced by a string containing a "short name". Each of the pre-defined findQualifiers in UDDI can be referenced using either the appropriate tModelKey, or by its short name. Registries MUST support both forms, and MUST accept the find qualifiers case-insensitively. The use of tModelKeys for findQualifiers allows extension to create additional new qualifiers, but registries are not obligated to support them. Find qualifiers not recognized by a node will return the error E_unsupported.

Matching behavior for the find_xx APIs when multiple criteria are specified is logical "AND" by default. Find qualifiers allow the default search behaviors to be overridden. Not all find_xx APIs support all findQualifier element values.

Later, in 5.1.4.3 Find Qualifier Descriptions:

approximateMatch
signifies that wildcard search behavior is desired. This behavior is defined by the uddi-org:approximatematch:SQL99 tModel and means "approximate matching as defined for the character like predicate in ISO/IEC9075-2:1999(E) Section 8.5 like predicate, where the percent sign (%) indicates any number of characters and an underscore (_) indicates any single character. The backslash character (\) is used as an escape character for the percent sign, underscore and backslash characters. This find qualifier adjusts the matching behavior for name, keyValue and keyName (where applicable).

Let's just take the special characters from the last part again:

percent sign '%'
indicates any number of characters
an underscore '_'
indicates any single character.
The backslash character '\'
is used as an escape character for the percent sign, underscore and backslash characters.

This find qualifier adjusts the matching behavior for name, keyValue and keyName (where applicable).
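
To make these rules concrete, here is a rough client-side sketch of the matching behavior in Python. It only mirrors the three rules listed above; the function names are made up for illustration, the comparison is case-sensitive (case handling is not modeled here), and treating a lone trailing backslash as a literal backslash is an assumption. How a given registry evaluates the pattern internally may differ.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE-style pattern (%, _, backslash escape) to a regex."""
    parts = []
    chars = iter(pattern)
    for ch in chars:
        if ch == "\\":
            # The next character is taken literally; a trailing lone
            # backslash is kept as a literal backslash (assumption).
            parts.append(re.escape(next(chars, "\\")))
        elif ch == "%":
            parts.append(".*")  # any number of characters
        elif ch == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def approximate_match(pattern, value):
    """True if the whole value matches the pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


if __name__ == "__main__":
    for pattern in ("%", "Account%", "Account\\_%", "A_countApi"):
        for value in ("AccountApi", "Account_SoapBinding"):
            print(repr(pattern), repr(value), approximate_match(pattern, value))
```

With this sketch, "Account%" matches both AccountApi and Account_SoapBinding, while "Account\_%" matches only the latter, because the escaped underscore must appear literally.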

There's more detail in section 5.1.6 About wildcards:

The default behavior of UDDI with respect to matching is "exact match". No wildcard behavior is assumed. Many UDDI inquiry APIs take the arguments "name," "keyName," and "keyValue" whose values are of type string. All such arguments may be specified using a wildcard character to obtain an "approximate match". In order to obtain wildcard searching behavior, the findQualifier tModel uddi-org:approximateMatch:SQL99 (whose tModelKey is uddi:uddi.org:findqualifier:approximatematch), or its short name "approximateMatch" MUST be specified.

Wildcards, when they are allowed, may occur at any position in the string of characters that constitutes the argument value and may occur more than once. Wildcards are denoted with a percent sign (%) to indicate any value for any number of characters and an underscore (_) to indicate any value for a single character. The backslash character (\) is used as an escape character for the percent sign, underscore and backslash characters. Use of the "exactMatch" findQualifier will cause wildcard characters to be interpreted literally, and as such should not also be combined with the escape character. Detailed rules for interpretation are defined by the above tModel for approximate matching. Examples of the use of wildcards may be found in Appendix G Wildcards.
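
In practice this means that a literal search term has to be escaped before it is combined with wildcards under approximateMatch. A minimal sketch in Python, where both helper names are hypothetical:

```python
def escape_for_approximate_match(literal):
    """Backslash-escape %, _ and \\ so they match literally under approximateMatch."""
    return "".join("\\" + ch if ch in "\\%_" else ch for ch in literal)


def prefix_pattern(prefix):
    """Build a 'starts with' pattern: escaped literal prefix plus a trailing %."""
    return escape_for_approximate_match(prefix) + "%"


if __name__ == "__main__":
    # The underscore in the prefix must not act as a single-character wildcard.
    print(prefix_pattern("Account_"))
```

Without the escaping step, a prefix such as Account_ would also match names like AccountApi, since the unescaped underscore stands for any single character.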

I haven't figured out if this is really smart or just really technical, since this isn't SQL and it sort of breaks with how it worked in version 2. Also, why the need to interpret the wildcards literally when there's a simple way to escape them? It works and now I know, so it's just my personal opinion.
