Pilot Engineering Repository Xsearch

IESR Comments on PerX Implementation Report (Oct 2006)

Author: Ann Apps (ann.apps@manchester.ac.uk)

The issues raised by the PerX IESR Implementation Report are very interesting, and will influence the future development of the IESR service. The following comments relate to the appropriate sections of the PerX report.

2. Identification/Addition of Suitable Targets

This use case (of IESR) begins with a human administrator searching the IESR web interface by entering a term ("engineering") in the 'subject' box.

So far, the design of IESR subject searching has concentrated on dynamic, machine-to-machine use. To support consistent searching across collections, IESR requires a Dewey classification for each collection. Contributors may also supply subject terms from a classification scheme of their own choice, but IESR does not enforce consistency over which vocabularies are used, and practice varies to some extent with discipline. Thus only a Dewey term is guaranteed to match: if the Dewey term for 'engineering' were entered in the subject search box, the appropriate collections would be discovered.

Another quick solution to this problem would be to enter the term "engineering" in the 'all fields' search box.

This use case indicates some suggested changes to the IESR contribution guidelines and web interface:

  • This PerX use, and other apparent use scenarios that begin with a person searching the IESR web interface, indicate that IESR should encourage contributors to include subject terms from a vocabulary that uses 'words', as well as Dewey and other coded schemes. It is unlikely that a person would enter a Dewey term. If a terminology service were available to determine the Dewey term corresponding to a subject 'word', the searcher could look the term up first, but this longer process would probably cause irritation and deter use of IESR. The same consideration applies to translating user-requested narrower terms into suitable broader terms for collection discovery. It is as yet unclear how a terminology service will be included within the Information Environment, and working out how to make practical use of one is outside the scope of IESR.
  • The IESR web interface should be modified to include a range of subject search boxes covering various vocabularies, so that it is clear to discoverers what type of data they are searching.
  • The web interface needs some checking for usability. In particular, in relation to this issue, the simple examples at the side of the search boxes need refinement.
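The word-to-Dewey lookup and the narrower-to-broader translation discussed in the first point above can be pictured with a minimal sketch. The lookup table and both function names are hypothetical stand-ins for a terminology service; they are not part of IESR. The broadening step assumes only the standard three-digit Dewey notation, in which the first two digits identify the division (624, civil engineering, falls within 620, engineering).

```python
# Hypothetical, hand-made table standing in for a terminology service.
# The class numbers are standard Dewey Decimal classes.
WORD_TO_DEWEY = {
    "engineering": "620",
    "civil engineering": "624",
    "mathematics": "510",
}


def dewey_for(word):
    """Return the Dewey class for a subject word, or None if it is unknown."""
    return WORD_TO_DEWEY.get(word.strip().lower())


def broader_division(dewey_class):
    """Broaden a three-digit Dewey class to its division, e.g. 624 -> 620."""
    return dewey_class[:2] + "0"


narrow = dewey_for("Civil Engineering")
broad = broader_division(narrow)
```

A person would type the word, and the application would search the subject field with the resulting code, so the extra lookup step stays hidden from the searcher.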

The suggestion that trusted consumer services should be allowed to change IESR data is interesting and may merit some consideration. The current registration and contribution model would not support this. However, IESR will take note of any requests and notifications of inaccuracy from consumer services such as PerX and act upon them. [The current problem of out-of-date data is a consequence of the prototype and uncertain nature of the project until now, and should be addressed in the near future.]

3.1B Retrieval of JORUM Records via Z39.50

The quoted access rights are presumably completely out-of-date. This just further highlights the currency problems with IESR data.

3.2 Retrieval of IESR Records via OAI-PMH

It was envisaged that the main purpose of the IESR OAI-PMH interface would be for gathering all IESR records to build or augment a local registry, and then to keep this registry up-to-date by incremental collection. Possibly an application would do this in preference to dynamically querying IESR to process every user query.
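The gather-then-update pattern described above can be sketched as a pair of OAI-PMH ListRecords requests. This is an illustration under stated assumptions: the base URL is a placeholder (the IESR baseURL is not given here), `oai_dc` is used only because every OAI-PMH repository must support it, and `list_records_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Placeholder endpoint, not the real IESR OAI-PMH baseURL.
BASE_URL = "https://registry.example.org/oai"


def list_records_url(base_url, metadata_prefix, from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument: it is sent with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            # 'from' limits the harvest to records changed on or after this date.
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


# First run: gather every record to build the local registry.
full_harvest = list_records_url(BASE_URL, "oai_dc")

# Later runs: collect only what has changed since the previous harvest.
incremental = list_records_url(BASE_URL, "oai_dc", from_date="2006-10-01")
```

A harvester would record the date of each successful run and pass it as `from_date` next time, following any resumption tokens the repository returns.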

PerX, however, appears to use an OAI-PMH request to find the details of one particular service, which effectively turns the OAI-PMH service into an 'obtain' interface. The IESR OpenURL Link-To resolver already returns the description corresponding to an IESR identifier, with a simpler query string, as does clicking on the identifier.
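For comparison, a single-record OAI-PMH lookup of the kind described here is a GetRecord request, which needs three arguments. The sketch below only builds the URL; the base URL and the identifier are invented placeholders, not real IESR values, and `get_record_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Placeholders only: neither value is a real IESR endpoint or identifier.
BASE_URL = "https://registry.example.org/oai"
RECORD_ID = "oai:registry.example.org:service-1"


def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord URL for one known identifier."""
    params = {
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    }
    # urlencode percent-encodes reserved characters such as ':' in the identifier.
    return base_url + "?" + urlencode(params)


single_record = get_record_url(BASE_URL, RECORD_ID)
```

The request cannot be formed without the identifier, which is why identifier discovery matters for this style of use.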

The point that it is necessary to determine the IESR identifier before using this method is well taken. IESR identifiers are for machine use, and are not very easy to use by people. IESR will add to its development plan an identifier discovery service, maybe returning the identifier corresponding to a title and entity type (and maybe service protocol for a Service). Probably this could be an extension of the OpenURL Link-To Resolver.

OAI-PMH returns the details of a single entity (a Collection, a Service or an Agent), whereas Z39.50 returns a composite Collection description with all its Services and Agents bundled into the record. Apart from that, both interfaces return the same data according to the same XML schema.

3.3 Analysis: JORUM

In the early design stages of IESR it was decided that the only technical detail we needed to capture for an OAI-PMH service was its baseURL. An application can query the service itself with an 'Identify' request to find further details. It is therefore interesting to read this PerX use case, effectively a stakeholder request, which may suggest revisiting that decision. Possibly IESR could capture these details in an 'interface' file similar to those for Z39.50 and webcgi.
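To illustrate what an application can learn from the baseURL alone, the sketch below parses an OAI-PMH Identify response. The sample response is invented for illustration (a real one would be fetched from `<baseURL>?verb=Identify`), and `parse_identify` is a hypothetical helper; the element names and namespace are those defined by OAI-PMH 2.0.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Invented sample response; the values do not describe any real repository.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2006-10-20T12:00:00Z</responseDate>
  <request verb="Identify">https://repository.example.org/oai</request>
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>https://repository.example.org/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <adminEmail>admin@example.org</adminEmail>
    <earliestDatestamp>2005-01-01</earliestDatestamp>
    <deletedRecord>no</deletedRecord>
    <granularity>YYYY-MM-DD</granularity>
  </Identify>
</OAI-PMH>"""


def parse_identify(xml_text):
    """Extract a few service details from an OAI-PMH Identify response."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    if identify is None:
        raise ValueError("not an Identify response")
    wanted = ["repositoryName", "baseURL", "protocolVersion",
              "earliestDatestamp", "granularity"]
    return {name: identify.findtext("oai:" + name, namespaces=OAI_NS)
            for name in wanted}


details = parse_identify(SAMPLE_IDENTIFY)
```

Details that Identify does not report, such as the available sets or metadata formats, need the separate ListSets and ListMetadataFormats requests.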

IESR will add this request to its list of issues to consider at the next metadata review. The information about the Grainger registry is useful. OpenDOAR, the UK listing of OAI-PMH repositories, also appears to capture only the baseURL (and doesn't include JORUM). Possibly either WSDL or ZeeRex (following work from the NISO Metasearch Initiative) could be employed.

However, a significant issue may be the additional burden on data contributors of providing further details of their OAI-PMH services, especially when this information is already easily available from the services themselves. IESR does not have the resource to augment data records itself.

IESR does plan to develop an SRU interface.

4. Automation of PerX-IESR Interaction

IESR does plan to develop an Alert service to notify consumers of new and changed records. This is currently envisaged as an RSS feed, although a Web Service implementation could also be considered.
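How a consumer such as PerX might use an alert feed of this kind can be sketched as follows. No such feed exists yet, so everything here is an assumption: the sample is an invented RSS 2.0 document, the item titles and links are placeholders, and `items_since` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Invented RSS 2.0 feed; the planned Alert service may look quite different.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Registry alerts (illustrative)</title>
    <link>https://registry.example.org/</link>
    <description>New and changed records</description>
    <item>
      <title>Changed: Example Collection</title>
      <link>https://registry.example.org/record/1</link>
      <pubDate>Mon, 02 Oct 2006 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>New: Example Service</title>
      <link>https://registry.example.org/record/2</link>
      <pubDate>Fri, 20 Oct 2006 15:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def items_since(feed_xml, last_checked):
    """Return (title, link) pairs for items published after last_checked."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # RSS 2.0 dates use the RFC 822 format, which email.utils can parse.
        published = parsedate_to_datetime(item.findtext("pubDate"))
        if published > last_checked:
            fresh.append((item.findtext("title"), item.findtext("link")))
    return fresh


last_checked = datetime(2006, 10, 10, tzinfo=timezone.utc)
changed = items_since(SAMPLE_FEED, last_checked)
```

The consumer would then re-fetch only the records named in `changed` and store the time of the check for the next poll.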

IESR still plans to develop a Web Services interface and support SRW (although enthusiasm for SRW appears to be waning in favour of SRU). Using Web Services to automate updates to a service such as PerX will be an interesting avenue to explore in the future.

IESR plans to develop guideline documentation for users of IESR. We are currently developing and documenting use cases to illustrate possible ways in which IESR could be used.

It is intended to continue IESR development and documentation over the next 3 years of funding, in parallel with attracting new contributions and keeping them up-to-date.