Stockchecking LOCKSS Archival Units

I have been looking at how to check that the material we claim to archive in our LOCKSS Digital Preservation system is actually present. Since we hope to serve content from the archive at some point in the future, having accurate records of stored content seems a good idea. Several clues suggested that our record of holdings was less accurate than we would like:

  1. The Volume Manifests page listed a number of titles where content was ‘not fully collected’.
  2. Inside the LOCKSS Administration Interface we could list the Archival Units and sort the list by Content Size. Quite a number of Archival Units came to the top with zero content.
  3. The LOCKSS Admin Interface can also produce a ‘Title List’ as a spreadsheet. When the list is limited to just the ‘configured’ titles it should match the holdings we store, but it also includes every volume with zero content, which overstates the range of material in the archive. This is the data we hope to use to populate our OpenURL Resolver with information about our LOCKSS holdings.

We do have an authoritative list of the electronic journals to which we have subscriptions or an expectation of free access. Our E-Journals A-Z list tells us not just which titles we might expect to access but also, importantly, the date ranges for which we have access rights.

How to stockcheck your LOCKSS Archive

At least, this is how we did it.

  1. Report a list of titles held in your LOCKSS Archive. Use the Title List page within the LOCKSS Admin interface. Limit the report to ‘Configured titles’ and export this to a CSV file.
  2. Open the CSV file in a spreadsheet. It has columns for the publication title, the print and online ISSNs, the first archived volume by year and number, and any closing year and volume number.
  3. Create an extra column in the spreadsheet for ‘Action’ and note down what you are going to do with each title. I used values like: OK, Remove, Remove vols 1-3, Investigate.
  4. Compare each line in the spreadsheet with the list of Archival Units in the LOCKSS Admin interface. Are there uncollected volumes?
    1. Check these against the Journals A-Z list. If we do not have any access rights (because we have no subscription) mark these titles for removal.
    2. If we do have an expectation of being able to archive this title, mark it for further investigation.
  5. Remove (or de-activate) unwanted Archival Units in the Journal Configuration section of the LOCKSS Admin interface.
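
We did all of this by hand in a spreadsheet, but the decision rules in steps 3 and 4 could be sketched in a few lines of Python. This is only an illustration under some assumptions: it supposes you have already combined the Title List and the Archival Unit content sizes into one CSV with one row per Archival Unit, and the column names (`title`, `issn`, `year`, `content_size`), the sample rows and the `SUBSCRIPTIONS` lookup are all made up for the example rather than taken from any real LOCKSS export or A-Z list.

```python
import csv
import io


def suggest_action(row, subscriptions):
    """Suggest a value for the 'Action' column for one Archival Unit.

    subscriptions maps an ISSN to (first_year, last_year) as recorded in
    the A-Z list; last_year is None for a subscription that is still open.
    """
    issn = row["issn"].strip()
    year = int(row["year"])
    size = int(row["content_size"])

    # No subscription or free access at all: mark for removal.
    if issn not in subscriptions:
        return "Remove"

    # Volume falls outside the years we have access rights for.
    first, last = subscriptions[issn]
    in_range = year >= first and (last is None or year <= last)
    if not in_range:
        return "Remove"

    # We expect to be able to archive this, but nothing was collected.
    if size == 0:
        return "Investigate"
    return "OK"


def stockcheck(csv_text, subscriptions):
    """Return each CSV row as a dict with an added 'action' key."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row, action=suggest_action(row, subscriptions)) for row in reader]


# Made-up sample data, one row per Archival Unit.
SAMPLE = """title,issn,year,content_size
Journal A,1111-1111,2005,0
Journal A,1111-1111,2010,52428800
Journal B,2222-2222,2010,0
Journal C,3333-3333,2012,0
"""

# Made-up access ranges, standing in for the A-Z list.
SUBSCRIPTIONS = {"1111-1111": (2008, None), "3333-3333": (2010, None)}

results = stockcheck(SAMPLE, SUBSCRIPTIONS)
for r in results:
    print(r["title"], r["year"], r["action"])
```

A script like this only suggests an action. An Archival Unit with a non-zero content size still needs a look by eye, because the collected content may not be journal articles at all.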

The first time we ran through this operation we did come across items where we had been over-enthusiastic in ticking the ‘add to collection’ button. The initial list of 265 titles is now down to 154. The archival units we removed were mostly either earlier volumes outside our subscription range or titles where we had access from an aggregator service (like EBSCO or ProQuest) rather than directly from the publisher.
Some of these archival units had been blocked with ‘no permission from publisher’ error messages, but not all. Others had managed to collect some content, but on inspection this turned out not to be useful: it was not journal articles but website furniture like images and JavaScript files.

Results

After completing this stockcheck we now have a more accurate list of the material we have in the LOCKSS Archive. We are sending fewer requests to publishers for content we are not allowed to access, and can therefore concentrate on the material we are allowed to preserve. We also have a better idea of what we are and are not allowed to hold in our LOCKSS Archive, so we can make better collection management decisions in the future.

About Philip Adams

Senior Assistant Librarian at De Montfort University. I am interested in digital preservation and the use of data to measure a library's impact. All comments are my own.