The "repository ecology" approach to describing cross-search aggregation service management

A position paper for the ECDL 2007 workshop on "Towards a European repository ecology", Budapest 21 Sept 2007


Phil Barker & Malcolm Moffat,
ICBL, School of Mathematical and Computer Sciences, Mountbatten Building, Heriot-Watt University, Edinburgh EH14 4AS. {phil.barker|m.moffat}@hw.ac.uk

We propose to use the ecology metaphor to describe and explore some issues raised by the PerX project[1]. This project developed a pilot service providing resource discovery across a series of repositories of interest to the engineering learning and research communities. The fundamental use case behind PerX is that an engineer who requires information should be able to perform a search across a selected range of data providers, and the results should direct him or her to a relevant information resource. The distributed architecture involved in building such a service can easily be described using the JISC Information Environment Architecture[2]: the PerX service is an aggregator in the fusion layer, it has a user interface in the presentation layer, and it cross-searches information about resources held by several data providers in the provision layer. In the Information Environment Architecture view of PerX, the nature of the content provider services and how to search them is known from data supplied by a service registry, which is part of the shared infrastructure. This is shown schematically in the central part of figure 1.
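To make the layered view concrete, the following is a minimal sketch of the four roles named above (provision, fusion, presentation, and the shared service registry). It is an illustration under stated assumptions, not the PerX implementation: all class and function names, the in-memory providers, and the example.org URLs are hypothetical, and real data providers would be reached over network protocols rather than Python callables.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Record:
    """One search hit: a pointer to a resource held by a data provider."""
    title: str
    url: str
    provider: str


# Hypothetical provider interface: a query string in, matching records out.
SearchFn = Callable[[str], List[Record]]


class ServiceRegistry:
    """Shared infrastructure: records which providers exist and how to search them."""

    def __init__(self) -> None:
        self._providers: Dict[str, SearchFn] = {}

    def register(self, name: str, search: SearchFn) -> None:
        self._providers[name] = search

    def lookup(self, name: str) -> SearchFn:
        return self._providers[name]


def make_provider(name: str, holdings: Dict[str, str]) -> SearchFn:
    """Provision layer: a toy in-memory provider mapping title -> URL."""
    def search(query: str) -> List[Record]:
        q = query.lower()
        return [Record(t, u, name) for t, u in holdings.items() if q in t.lower()]
    return search


def cross_search(registry: ServiceRegistry, selected: List[str], query: str) -> List[Record]:
    """Fusion layer: send one query to each selected provider and merge the hits."""
    results: List[Record] = []
    for name in selected:
        results.extend(registry.lookup(name)(query))
    return results


def present(results: List[Record]) -> List[str]:
    """Presentation layer: format merged hits for display to the engineer."""
    return [f"{r.title} ({r.provider}): {r.url}" for r in results]


# Assumed example data, invented for illustration only.
registry = ServiceRegistry()
registry.register("repo_a", make_provider("repo_a", {
    "Bridge design notes": "http://example.org/a/1",
    "Thermodynamics primer": "http://example.org/a/2",
}))
registry.register("repo_b", make_provider("repo_b", {
    "Suspension bridge case study": "http://example.org/b/1",
}))

for line in present(cross_search(registry, ["repo_a", "repo_b"], "bridge")):
    print(line)
```

In this sketch the registry is the only source of knowledge about how to reach a provider, which mirrors the architectural view rather than the experience of setting up the service in practice.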

The actual experience of setting up and maintaining links between PerX and data providers has been described[3], and was critically dependent on many more resources and factors than are shown in the architectural view. Typically, adding a new data provider involved the PerX service manager gathering information from the wider community of information specialists, from the data provider's website and, crucially, from a data provider service manager: someone with the authority and/or expertise to commit to providing a data feed under suitable conditions.

We believe that the interactions involved in setting up and maintaining a cross search aggregation service may be modelled as a "habitat" in the repository ecology, and have attempted to sketch some of these interactions in figure 1. We make the following observations relating to this approach:

We hope to be able to expand on these and other observations during the workshop. If the ecology approach is to meet its potential as a communication tool, then it is important that we all speak the same language when we employ it: we hope that attendance at this workshop will be important in establishing this language.


Figure 1. Entities and interactions in the PerX cross-search habitat. An engineer uses the PerX user interface to perform a cross search of selected data providers in order to find information. The schematics in the centre of the diagram show the distributed architecture of data provision, fusion (the aggregator), and presentation (the user interface), with a shared service registry which (in theory) can be used to obtain information about the available data providers and how to interface with them. Around this we show the other elements at play in the habitat which were required in order to set up the cross-search: a community supporting Information Professionals that provided information about which data providers were relevant, and a service manager at each data provider who provided information specific to that service.


[1] PerX: Pilot Engineering Repository Xsearch http://www.icbl.hw.ac.uk/perx/

[2] JISC Information Environment Architecture http://www.ukoln.ac.uk/distributed-systems/jisc-ie/arch/

[3] PerX Set-up and Maintenance Report: http://www.icbl.hw.ac.uk/perx/setupmaintenance.htm
and Supplementary Case Study http://www.icbl.hw.ac.uk/perx/casestudyoxford.htm