In this issue:
- What's On - forthcoming events from October 2011 onwards
- What's New - new reports and initiatives since the last issue
- What's What - Repositories and CRISs and research data management - oh, my! - Joy Davidson and Andrew McHugh
- Who's Who 1 - Sixty second interview with Alison Heatherington, Digital Preservation Project Manager, Parliamentary Archives
- Who's Who 2 - Sixty second interview with Ginny Browne, Digital Assets Librarian, OCLC
- Your View? - comments and views from readers
What's New is a joint publication of the DPC and DCC.
The DCC have a number of events coming up that may be of interest to you. For further details on any of these, please see our DCC events listings at http://www.dcc.ac.uk/events/. You can also browse through our DCC events calendar to see a more extensive list of both DCC and external events.
Introduction to Preservation Training Day
25 October 2011
Do you look after books and documents? Do you want to learn more about ways to protect and preserve them? This training day is aimed at those who are new to the preservation of library and archive collections and would like to learn more. The day starts with an overview of preservation, risk management and prioritization. Subsequent sessions focus on storage of collections; how to handle items; emergency response and digital preservation.
KCL Centre for e-Research Seminar: The BBC Genome Project
25 October 2011
This seminar explores the BBC Genome Project and describes in detail the background work that has gone into making it happen, including a pilot project with two years of the Radio Times. The lessons learned from this pilot will also be explained, including the eventual workflow from the planning and preparation stage through to the bulk scanning process. When this significant amount of data is placed online, it will be a useful reference asset for academic and educational use. This online resource also has the potential to act as a central point for linking related programme information such as photographs, scripts and audio-visual material.
The KEEP approach to Digital Preservation
26-27 October 2011
KEEP (Keeping Emulation Environments Portable) is developing emulation services to enable accurate rendering of both static and dynamic digital objects: text, sound, and image files; multimedia documents, websites, databases, videogames etc. This workshop focuses on the technical and legal aspects of emulation in the context of digital preservation. It shows the results of two years of KEEP, as well as the main issues linked to transfer tools and emulation.
eSciDoc Days 2011
26-27 October 2011
This two-day conference will provide extensive information about collaborative eResearch environments and related challenges, e.g. the sustainable management of a growing amount of data throughout the research process and the provision of a publishing environment for research results together with research data. Experts with different backgrounds in scholarly information management will discuss the latest developments in building digital information infrastructures: researchers and information professionals as well as chief information officers and software developers.
DCC Research Data Management Forum: Incentivising Data Management & Sharing
2-3 November 2011
The seventh RDMF workshop will take place at the University of Warwick on 2nd and 3rd November 2011. The theme for the workshop is Incentivising Data Management & Sharing. Topics to be explored include enablers and barriers for the sharing and reuse of research data; funders’ role in motivating data sharing; support infrastructures and benefits.
KCL Centre for e-Research Seminar: Watching the Detectives: Using digital forensic techniques to investigate the digital persona
8 November 2011
This talk will describe how digital forensic techniques may be used as a research tool for analysing and understanding a person’s digital persona. It will examine the disparate data fragments commonly held on a user’s computer and explore how these may be used to provide insight into the interests and work activities of scholars and other notable individuals. Techniques to be discussed will include data carving, timeline analysis, and emerging techniques to measure emotional state.
8-9 November 2011
This year’s APA conference addresses the theme “Putting the infrastructure in place for digital preservation” and brings together leaders in the field from Europe and around the world, from academic, large scientific research, industrial and commercial stakeholders. The keynote speaker will be EU Commissioner Neelie Kroes, Vice-President for the Digital Agenda; we also have speakers from the Centre of Excellence for Digital Preservation in India, the US National Archives (NARA) and Library of Congress. A senior representative of the British Government will speak on the second day. Looking at the role of the commercial sector in digital preservation and re-use, Microsoft, Google, Oracle and IBM will be taking part in a discussion panel on the High Level Expert Group’s report ‘Riding the Wave’.
DCC Roadshow – Cambridge
9-11 November 2011
The 6th DCC roadshow will be run over three days but each workshop can be booked individually. Attendees are encouraged to select the workshops which address their own particular data management requirements. The workshops will provide advice and guidance tailored to a range of staff, including PVCs Research, University Librarians, Directors of IT/Computing Services, Repository Managers, Research Support Services and practising researchers.
Building a Culture of Research Data Citation Workshop at eResearch Australasia 2011
10 November 2011
The Australian National Data Service (ANDS) is hosting a workshop at the eResearch Australasia 2011 conference based around data citation and the Cite My Data service. The workshop is designed for data publishers and users in the research sector who need to gain a deeper understanding of the issues and technologies around data citation and the ways data citation can be supported at their organizations.
Keeping Legal: what you can and can't do under copyright law
10 November 2011
Presented by Charles Oppenheim for the Scottish Law Librarians Group, this course will cover a wide range of copyright issues to help information professionals work effectively within the legal limits of copyright law. The course will include a reminder of the principles and practice of copyright; investigate how to legally copy, covering licences and exceptions to copyright; look at related rights such as database rights, moral rights, and performers' rights; review recent developments, including the Digital Economy Act, the Hargreaves Review, and orphan works; discuss risk management; investigate scenarios for discussion and reporting back; and incorporate a quiz and a Q & A session.
Digital Preservation Training Programme (DPTP)
14-16 November 2011
The DPTP is a modular training programme, built around themed sessions that have been developed to assist you in designing and implementing an approach to preservation that will work for your institution. Through a wide range of modules, the DPTP examines the need for policies, planning, strategies, standards and procedures in digital preservation, and teaches some of the most up-to-date methods, tools and concepts in the area. It covers these topics via a mixture of lectures, discussions, practical tasks and exercises, and a class project.
Surviving the Recession: maximising your value
15 November 2011
The one-day ASLIB Engineering and Technology Group and the Aerospace and Defence Librarians Group seminar will feature presentations on electronic collection management; professional skills in an age of austerity; and shared services.
How do we make the case for research data centres?
17 November 2011
The recent report Data Centres: their use, value and impact was co-sponsored by RIN and by JISC through the Managing Research Data Programme. To mark the launch of this important report, RIN are hosting a discussion event at the Wellcome Collection.
Intellectual Property Rights and Digital Preservation
21 November 2011
This briefing day, co-sponsored by the DPC and JISC Digital Media, is intended to examine and discuss key concepts of intellectual property rights as they impact on digital preservation. It will provide a forum to review and debate the latest developments in the law as it applies to preservation and it will initiate a discussion on how simple legal processes can be deployed. Based on commentary and case studies from leaders in the field, participants will be presented with emerging tools and technologies and will be encouraged to propose and debate the future for these developments.
3rd IEEE International Conference on Cloud Computing Technology and Science
29 November – 1 December 2011
The “Cloud” is a natural evolution of distributed computing and of the widespread adoption of virtualization and SOA. In Cloud Computing, IT-related capabilities and resources are provided as services, via the Internet and on-demand, accessible without requiring detailed knowledge of the underlying technology. The IEEE International Conference and Workshops on Cloud Computing Technology and Science, steered by the Cloud Computing Association, aim to bring together researchers who work on cloud computing and related technologies.
Annual General Meeting of the Digital Preservation Coalition
1 December 2011, Belfast
The Annual General Meeting of the Digital Preservation Coalition will take place at the Public Record Office of Northern Ireland, Belfast, on the afternoon of 1 December. Details to follow.
For more information on any of the items below, please visit the DCC website at http://www.dcc.ac.uk.
STFC's scientific data policy
STFC has updated its Scientific Data policy. STFC's Executive Board recently approved the introduction of an over-riding data policy to provide guidance to its staff and communities. The policy consists of a set of general principles that cover the wide variety of scientific communities and existing practices that fall within STFC's remit. The key principle of the policy is that all funded activities are required to have a data management plan, which must be in line with recommended good practice. These individual plans will then have the added check of being subject to approval by the relevant STFC boards and panels.
Benefits from the Infrastructure Projects in the JISC Managing Research Data Programme report
JISC’s Managing Research Data programme has, with an investment of nearly £2M, funded a strand of eight Research Data Management Infrastructure (RDMI) projects to provide the UK Higher Education sector with examples of good research data management. This report provides an analysis and synthesis of the benefits from this work identified by the eight RDMI projects in their benefits case studies, the benefits and enhancements that accrued to existing tools and methodologies from them, and the emerging business cases for sustainability being built by the RDMI projects.
International Journal of Digital Curation, Volume 6, Issue 2 now available
The International Journal of Digital Curation Volume 6, Issue 2 is now available. Its eighteen papers cover a wide variety of topics, from educating the next generation of digital curators to improving current practice with new technologies and methods. The papers that take a disciplinary perspective in this issue explore the challenges of curating telehealth monitoring data, video games and virtual worlds. Meanwhile, representing the institutional perspective are papers dealing with various aspects of research data management: policy, training, researcher support and guidance, and service design and delivery.
Library of Congress To Launch New Corps of Digital Preservation Trainers
The Digital Preservation Outreach and Education program at the Library of Congress held its first national train-the-trainer workshop on September 20-23, 2011, in Washington, DC. The DPOE Baseline Workshop aims to produce a corps of trainers who are equipped to teach others, in their home regions across the U.S., the basic principles and practices of preserving digital materials. Examples of such materials include websites; emails; digital photos, music, and videos; and official records.
Data centres at heart of UK data sharing culture
This new study by JISC and the Research Information Network has found that data centres have been instrumental in developing a culture of data sharing among researchers. As part of a wider body of work, this evidence will help to build a case for improving data sharing practice in the UK. Although deposit levels are promising, the study concluded that researchers need more encouragement and support to deposit data in these centres. Making data available for reuse helps maximize the value of publicly funded research in the UK by providing researchers with essential references, avoiding duplication, and allowing repurposing of information for new enquiries. The report concludes that research data centres perform an important role by making high quality and reliable research results available in a way which makes it quick, easy and cheap for researchers to access.
Fedora 3.5: The latest release of Fedora is ready for integration into large repository systems
This release of Fedora, the robust framework for building digital repositories, focuses on several "under the hood" changes that improve Fedora's ability to be integrated and tested as part of larger repository systems. Version 3.5, led by Aaron Birkland, Developer, Cornell University, also sets the stage for a full move to Spring-based pluggability (http://www.springsource.com/) and configuration.
Leeds University Joins the DPC
The DPC is delighted to announce that Leeds University is the latest member to join the Coalition.
EPrints plugin for autocompletion via Names API now available
JISC provided funds for the Names Project to create a plugin for EPrints which would make use of the Names API to autocomplete creator fields with information from the Names system.
Carolina researchers tapped to develop national data
The University of North Carolina at Chapel Hill is leading a new effort to address key data challenges facing scientific researchers in the digital age. The National Science Foundation has awarded nearly $8 million over five years to the DataNet Federation Consortium, a group that spans seven universities, to build and deploy a prototype national data management infrastructure. About half the award will support research and development at UNC. The consortium will address the data management needs of six science and engineering disciplines: oceanography, hydrology, engineering design, plant biology, cognitive science and social science. The infrastructure project will support collaborative multidisciplinary research through shared collections and archives and data publication within digital libraries.
JISC Podcast: Two years of economic uncertainty: sustainable business models
JISC-led Strategic Content Alliance and Ithaka S+R release final report on their Case Studies in Sustainability, revealing how different business models for online resources fared during the economic downturn. Some of the projects profiled include the UK’s National Archives’ Licensed Internet Associates programme, which has shown major revenue growth in recent years despite budget cuts felt by the entire institution; Cornell University’s eBird, which has experimented with partnerships to develop new revenue generating offerings for users; and the University of Southampton’s Library Digitisation Unit, which has made strategic choices to better align its mission with that of the university. Nearly all of the projects profiled live under the umbrella of larger institutions. One of the key findings to emerge is that many of these projects are relying on their host institutions for support to an even greater extent than two years ago. Whether this is a good arrangement and what this means for their future remains to be seen.
Editorial: Repositories and CRISs and research data management – oh, my!
Joy Davidson and Andrew McHugh
In the lead-up to the 2008 Research Assessment Exercise (yesterday’s equivalent to the forthcoming Research Excellence Framework1) we saw a number of UK HEIs looking to exploit their institutional repositories (IRs) to expedite their submission process. Michael Day’s 2008 review of institutional repositories and the RAE2 indicated that IRs may be adapted to become useful tools for automating some aspects of bibliometrics and citation gathering. However, in addition to the number of high quality research publications produced by an institution, the 2014 REF aims to assess far less quantifiable aspects of institutional research activity including impact, benefits and international reputation. What’s an institution to do? For many of us, current research information systems (CRIS) are starting to look quite attractive. Indeed, a large number of UK HEIs are members of euroCRIS3 which provides forums and workshops on CRIS-related topics.
But what exactly is a CRIS? According to Wikipedia, a CRIS is a ‘database or other information system storing data on current research by organizations and people, usually through some kind of project activity, financed by a funding programme’4. CRISs are not new. They’ve been around since the 1990s. However, commercial CRISs including PURE5 and Converis6 have emerged on the market in recent years specifically designed to support UK HEIs in their next REF submission process. The recent JISC Research Information Management (RIM) Projects Event showcased a number of ongoing initiatives that are piloting CRISs and research information management approaches, and are working to refine the CRIS-related Common European Research Information Format (CERIF)7 standard to better describe many aspects of research activity.
It is important to remember that most institutions will not have a single system for collecting and managing their research information. Rather, many will have a number of disparate institutional sources containing bits of the overall research picture. The Research Management and Administration System (RMAS)8 project led by the University of Sunderland is building a CRIS by pursuing a highly modular and cloud-based design philosophy. Several aspects of research information system functionality are represented as modules including areas of academic excellence, funding sources, proposal management, costing and pricing. This project could highlight potential cost savings to be made by using shared services and will test the potential to integrate disparate research information sources.
However, in many ways it is the ongoing work on the CERIF standard that offers the greatest potential for longer term benefit rather than specific systems or system components. The standard enables relationships between research entities to be made explicit. CERIF can be utilised to define relationships between researchers, institutions, projects, funding bodies and research outputs. A principal goal is the establishment of interoperability between disparate sources of research information. As noted above, this is likely to be of interest for most research-led organisations that rely on multiple sources of research information. The Measuring Impact under CERIF (MICE)9 project in particular looked at how CERIF might be expanded to cover research impact more fully. A data model10 has been developed which expresses links between CERIF objects (organizations, projects, people) and impact indicators and measures. Evaluations of the model were undertaken at St. Andrews and a number of other institutions, yielding a range of feedback. The University of Sunderland is leading the newly funded C4D project which will pilot an implementation of CERIF to cover data sets. This extension would enable a fuller view of research activity for the institution and funding bodies but could also provide an improved mechanism for researchers to search for data to reuse in new research activity. Reuse is often hindered by a lack of contextual information about data creation and early management. There may also be great potential for linking data outputs and roles to data management plans using CERIF.
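To make the idea of explicit relationships more concrete, here is a minimal Python sketch. It is not CERIF itself: the class names, role labels and sample records are hypothetical and far simpler than the real standard. It only illustrates how typed links between researchers, projects, funders and data sets allow one question to be answered across records that might otherwise live in separate systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str  # illustrative label such as "person", "project" or "dataset"
    name: str


@dataclass(frozen=True)
class Link:
    source: Entity
    role: str  # hypothetical role label describing the relationship
    target: Entity


def related(links, entity, role=None):
    """Return the entities linked from `entity`, optionally filtered by role."""
    return [
        link.target
        for link in links
        if link.source == entity and (role is None or link.role == role)
    ]


# Made-up sample records, for illustration only.
researcher = Entity("person", "Example Researcher")
project = Entity("project", "Example Project")
dataset = Entity("dataset", "Example Survey Data")
funder = Entity("funder", "Example Funding Body")

links = [
    Link(researcher, "member_of", project),
    Link(project, "funded_by", funder),
    Link(project, "produced", dataset),
]

# Which outputs did the project produce?
outputs = related(links, project, role="produced")
print([entity.name for entity in outputs])
```

Because each relationship is stored as its own record with a named role, new kinds of link (for example, between a data set and a data management plan) could be added without changing the entities themselves.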
The Research Outcomes project11, a collaboration across the research councils, is developing a ‘common approach for gathering quantitative and qualitative evidence of the outcomes and impact of their investments’12. The new system will use Je-S authentication to make the system easier to use for researchers and support staff. MRC, STFC and NERC will continue to use their current systems for the foreseeable future, but in 3-5 years it is hoped that the standardised system will cover the majority of the research landscape. Collated outcomes will include publications, other research outputs, collaborations/partnerships, further funding, staff development, dissemination/communication, IP and exploitation, awards/recognition and other forms of impact. Research councils are working with HEFCE to ensure that outcomes and outputs collected are in line with the classifications to be used for the forthcoming Research Excellence Framework. The system will offer grant holders functionality to upload via web form or spreadsheet (for bulk upload), and for universities or institutional repositories to automatically generate bulk-upload spreadsheets, deposit via SWORD, and harvest collated data using OAI-PMH. The project hopes to introduce CERIF import and export functionality following its launch later this year.
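For readers unfamiliar with OAI-PMH, harvesting works by sending simple HTTP requests to a repository's base URL. The Python sketch below builds a ListRecords request using arguments defined by the protocol (verb, metadataPrefix, from, set). The endpoint address is a placeholder and the function name is our own; nothing here describes the actual interface of the Research Outcomes system.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    `base_url` is the repository's OAI-PMH endpoint (a placeholder here).
    `from_date` and `set_spec` are optional selective-harvesting arguments.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # e.g. "2011-10-01"
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, for illustration only.
url = list_records_url("https://repository.example.org/oai", from_date="2011-10-01")
print(url)
```

A harvester would fetch this URL, parse the XML response, and repeat the request with any resumptionToken the repository returns until the full result set has been retrieved.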
As the 2014 REF edges ever closer, the work of these projects and initiatives is likely to be of great interest to the UK HEI community. However, these projects will likely yield results that will be of interest and value to any organisation that carries out research activity and wants to better manage and exploit its research information.
Who's Who: Sixty second interview with Alison Heatherington, Digital Preservation Project Manager, Parliamentary Archives
Where do you work and what’s your job title?
I work for the Parliamentary Archives, and am based in the Palace of Westminster. My job title is ‘Digital Preservation Project Manager’.
Tell us a bit about your organisation
Parliament has three parts: the House of Lords, the House of Commons, and the Monarch. The main work of Parliament is to make laws, debate topical issues and look at how our taxes are spent to help run the country. The monarch’s role is to sign the laws that are voted for by both houses. The Parliamentary Archives provides a records management and archives service to both houses, and supports Parliament through information management and the preservation of Parliamentary heritage. We preserve the records of both houses, and other records relating to Parliament. The Archives provides access to the archival collection, which consists of over 3 million records of political, local history and genealogical interest. Members of the public can search the online catalogue and visit the searchroom to view records. To date the archives has mainly consisted of paper-based records, but we are now starting to archive Parliament’s digital records.
How did you end up in digital preservation?
Originally I have a background in ancient history, classics and archaeology, so it does occasionally seem strange to find myself dealing with brand new historical records. However, I like to think that I’m helping to ensure that there will be some historical sources from our time available for future generations!
After my first degree I did a master's degree in computing. This then led me, in 2000, to start work for NDAD (National Digital Archive of Datasets) at the University of London Computer Centre, which really got me started in this field. We worked with old datasets from various government departments, tried to make sense of them and migrate the contents into usable formats – in some ways it was digital archaeology, which I really enjoyed. Since that time the profession has moved on a great deal, but some things have remained the same, and the skills that I learned in that job have certainly been very useful over the years!
What projects are you working on at the moment?
I am helping to set up a digital repository system for Parliament. This will be based in the Archives, and will perform a similar function to the current archival repository (the Victoria Tower), enabling us to preserve and provide access to our collection of digital records. As well as setting up the technical system, I am also working on embedding digital preservation processes, standards and understanding in the organisation.
What are the challenges of digital preservation for an organisation like yours?
Digital Preservation is still rather a new concept, and there are challenges in understanding how a very complex and traditional organisation like Parliament can best engage with it. We have already made progress in this area, and lots of different areas of the organisation are now very actively engaged with us, which is very reassuring!
Another challenge that we face is simply in identifying and understanding the sources of our data. We will be preserving data from all parts of the organisation, so it is important for us to make sure we identify what is out there, and prioritise our work so that we can safeguard the most vulnerable objects first.
In general, however, I would say that we face many of the same challenges as other organisations who are trying to establish digital preservation solutions – challenges of securing budgets (especially in these tough times); challenges of managing change and also of managing people’s expectations; challenges of making sure we understand our technical requirements and set up a system which meets those requirements. Also challenges of knowledge transfer and staffing – this is still rather a niche field, and recruiting and retaining skilled staff can be an issue.
What projects would you like to work on in the future?
I’ve worked, for the last few years at least, on the ‘setting up’ phase, at the beginning stages of digital preservation projects. I think it would be very satisfying, therefore, to spend some time actually using a working repository, ensuring that data gets ingested, and then exploring the preservation side of things in more depth. I’d like to get more involved in researching and developing new kinds of active preservation processes, which will be of direct benefit to the data in the repository. Actually carrying out data migrations would be quite fun, I think!
What sort of partnerships would you like to develop?
The Parliamentary Archives is a business archive – we preserve records created by our own organisation. I would like to develop more relationships with other similar archives who are also beginning to tackle the challenges of preserving their own digital data – I think that sharing our experiences and offering each other support can be very beneficial. We already have good working relationships with several other archives and libraries (such as the Wellcome Library, the LSE Library and also TNA), but developing new relationships with organisations that are tackling the same challenges is always welcome.
If just one tool or standard could be brought into existence that would make your job easier, what would it be?
I can think of lots of things that are needed, but if I had to choose one thing... although we won’t be in a position to do migrations for at least a year or so, ourselves, I would really welcome a good migration tool, developed specifically for the purpose of digital preservation. A tool that carried out migrations with the purpose of retaining significant properties, but which also facilitated the process of comparing these significant properties, would make our lives easier when we start migrating.
If you could save for perpetuity just one digital file, what would it be?
That’s a tough question! I would probably choose a digital video file of a nature documentary, as a record of what our planet looked like in the early stages of the 21st century. I would like future generations to have a glimpse of what our world looked like. If tigers, red squirrels, glaciers and coral reefs are not around for our grandchildren to see, then this footage will be all the more important. Analogue video footage won’t survive in perpetuity, so that single digital video file may be the only thing we have!
Finally, where can we contact you or find out about your work?
Who's Who: Sixty second interview with Ginny Browne, Digital Assets Librarian, OCLC
Where do you work and what's your job title?
I work in the OCLC Library at OCLC's headquarters in Dublin, Ohio as the Digital Assets Librarian.
Tell us a bit about your organization.
OCLC is a worldwide library cooperative with the stated public purpose of "furthering access to the world's information and reducing the rate of rise of per-unit costs." We are owned and governed by librarians who work all around the world. We care deeply about libraries, the people who work in them, and the people they serve. OCLC was founded by academic librarians, and the corporate culture still reflects a passion for what we are building and for the organizations we serve.
What projects are you working on at the moment?
The official Archives is a relatively new thing for the OCLC Library. We had been keeping things for the archives from the beginning, without a plan for organizing and really preserving them. We recently developed a finding aid and a plan, and are trying to archive new material as it is created along with catching up on older things. We preserve physical materials as well as digitizing whatever we can. We use the OCLC Digital Archive and CONTENTdm to preserve our digital materials and make them available online.
How did you end up in digital preservation?
By accident! I started my work life as an administrative assistant in a long string of different places. My previous employer gave me an opportunity to do technical support over the phone for their products, and eventually asked me to build a database we could put on the web for customers to solve their own problems. I agreed on condition that they send me to Library School because I knew librarians knew how to do that sort of thing. Just as I was graduating the company was purchased, and in the ensuing chaos I was laid off. Very soon after that I was hired at OCLC as "Knowledge Management Librarian." The focus of my job has gradually shifted to preservation and digitization, and this year we made it official with the title change to Digital Assets Librarian.
What are the challenges of digital preservation for an organization such as yours?
Our biggest challenge is the volume of material to be preserved with limited resources. There is so much to preserve and only a limited number of hours in a day. Because we are fighting a two-front war (keeping up with new items and finding and preserving old ones) priorities are constantly shifting. I'll spend a week on really old things, and then have to catch up on new items that were created that week. I love the variety, but it also gets hectic some of the time.
What sort of partnerships would you like to develop?
I would like to partner more closely with other archival groups in Ohio to try to coordinate our efforts in public awareness about and advocacy for local archives.
If we could invent one tool or service that would help you, what would it be?
I would like a "button" named Archive in every program that OCLC staff use so when they are finished with a document or a program or any other work product, they could click that button and the item would be put in the queue to be archived.
And if you could give people one piece of advice about digital preservation . . . ?
Just do it! Even if you're not sure it's an important item, even if you suspect the creators don't want it preserved, even if you don't know what to call it or where to put it when it's done, just do it! Preserve it. Save it. Often you don't know what will turn out to be seminal for a new product or service, or an idea about how the world should work, so just preserve it. In a hundred years someone else may want to throw it out as unimportant, but that is better than wishing they had something that you allowed to be destroyed.
If you could save for perpetuity just one digital file, what would it be?
It's a file that I have in the queue right now: The original proposal founding OCLC as an organization. It lays out the reason we are here and have done all this work, the same reason that we need to continue the work started in the 60s to make the world's information more easily available to all of the world's people.
Finally, where can we contact you to find out about your work?