Pilot Engineering Repository Xsearch |
Investigating resource discovery issues in engineering digital repositories
Case Study: PerX Experience of Harvesting & Utilising Metadata from Oxford Journals
M. Moffat (M.Moffat@hw.ac.uk) - Ver 1.0 (28/03/07)
|
Stage |
|
1 |
November 2006 - Details of the new Oxford Journals OAI-PMH interface are made available at http://www.oxfordjournals.org/help/techinfo/oaipmh.html |
2 |
Multiple OAI Sets are available via the Oxford Journals OAI-PMH interface. For example, there are sets allowing service providers to harvest metadata records for entire journal titles, particular journal volumes or individual journal issues. |
3 |
Twenty-three Engineering and Mathematics related Oxford Journals titles are identified as relevant by the PerX team. |
4 |
OAI-PMH set parameters for the relevant journals are extracted from details provided at the Oxford Journals site. |
5 |
The initial harvest attempt via the PerX Administrative Interface (PAIN) reveals that the interface is unable to deal adequately with the large number of sets provided by Oxford Journals [this is a limitation of the PerX software]. Technical intervention is required. |
6 |
The initial test harvest reveals that three of the twenty-three OAI-PMH sets are empty. |
7 |
Oxford Journals is contacted via email regarding the empty sets, and the problem is rectified after three weeks. |
8 |
The twenty-three relevant sets are successfully harvested by PerX, resulting in approximately 36,000 records. |
9 |
General safe transforms ('normalisations') are performed on the harvested metadata. These convert UTF-8 encodings, remove spaces between < tags >, remove unnecessary HTML markup, and remove empty metadata elements and double XML encodings. |
10 |
The Oxford Journals metadata is indexed to enable searching via the PerX cross search interface. |
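Some of the normalisations listed in stage 9 can be illustrated with a minimal Python sketch. It is not the PerX implementation. It assumes each harvested record has already been parsed from XML into a mapping of field names to lists of string values (a hypothetical in-memory form), so that any entity reference still present in a value indicates double encoding. Only three of the transforms are covered: undoing double XML encoding, removing HTML markup and dropping empty elements. The function names are illustrative.

```python
import html
import re

# Matches simple HTML/XML tags such as <i>, </sup> or <br/>.
# Assumption: any such tag inside a metadata value is unwanted markup.
TAG_RE = re.compile(r"</?[A-Za-z][^<>]*>")


def normalise_value(value: str) -> str:
    """Clean one metadata value that has already been parsed from XML."""
    # A leftover entity reference (e.g. "&amp;") means the source was
    # encoded twice, so decode it one more time.
    value = html.unescape(value)
    # Remove inline markup such as <i>...</i>.
    value = TAG_RE.sub("", value)
    # Collapse runs of whitespace left behind by the removals.
    return " ".join(value.split())


def normalise_record(record: dict[str, list[str]]) -> dict[str, list[str]]:
    """Normalise every value and drop elements that end up empty."""
    cleaned: dict[str, list[str]] = {}
    for field, values in record.items():
        kept = [v for v in (normalise_value(x) for x in values) if v]
        if kept:
            cleaned[field] = kept
    return cleaned
```

For example, a title value of `Stress &amp; strain in <i>thin</i> plates` would come out as `Stress & strain in thin plates`, and a description element containing only whitespace would be dropped. The tag pattern is deliberately simple and could also remove legitimate text that happens to look like a tag, which is one reason such transforms need checking against each collection.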
Stage |
|
11 |
Analysis of the Oxford Journals metadata via the PerX cross search interface reveals the following issues:
|
12 |
Basic collection-specific transforms are manually conducted on the Oxford Journals collection to address some of these issues. These transforms included:
|
13 |
The Oxford Journals target is added to the live PerX Pilot service. |
14 |
Feedback is passed to Oxford Journals via email regarding the issues encountered. At the time of writing, Oxford are looking into the issues and aim to communicate with their OAI online hosts [HighWire Press] in order to resolve them. |
Stage |
|
15 |
Work is ongoing to enable effective automatic harvesting of the Oxford Journals sets required by PerX. |
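As an illustration of what automated set-by-set harvesting involves, the following Python sketch loops over a list of sets, follows OAI-PMH resumption tokens and reports empty sets (the situation found in stage 6). It is a sketch under stated assumptions, not the PerX software. The base URL and set names used with it are placeholders, and `fetch_url`, `harvest_set` and `harvest_sets` are hypothetical helper names. The `ListRecords` verb, the `set` and `resumptionToken` arguments and the `noRecordsMatch` error code come from the OAI-PMH 2.0 protocol.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def fetch_url(url: str) -> bytes:
    """Default fetcher: plain HTTP GET with no retries or throttling."""
    with urlopen(url) as response:
        return response.read()


def harvest_set(base_url, set_spec, fetch=fetch_url, metadata_prefix="oai_dc"):
    """Return all <record> elements in one set, or [] if the set is empty."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix,
              "set": set_spec}
    records = []
    while True:
        root = ET.fromstring(fetch(base_url + "?" + urlencode(params)))
        error = root.find(OAI_NS + "error")
        if error is not None:
            # The repository reports an empty set with this error code.
            if error.get("code") == "noRecordsMatch":
                return []
            raise RuntimeError(f"OAI-PMH error {error.get('code')}: {error.text}")
        records.extend(root.iter(OAI_NS + "record"))
        token = root.find(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
        # A missing or empty token marks the end of the list.
        if token is None or not (token.text or "").strip():
            return records
        # resumptionToken is an exclusive argument: send it with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}


def harvest_sets(base_url, set_specs, fetch=fetch_url):
    """Harvest each set in turn and report which sets returned nothing."""
    harvested, empty = {}, []
    for spec in set_specs:
        records = harvest_set(base_url, spec, fetch)
        if records:
            harvested[spec] = records
        else:
            empty.append(spec)
    return harvested, empty
```

Passing the fetcher in as a parameter keeps the loop testable without network access. A production harvester would also need retry handling, request throttling and incremental (date-based) harvesting, none of which is shown here.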