Archiving

Archiving is a critical part of data management. It is often forgotten, so ideally you should start thinking about archiving during data collection.

What is data archiving?

It is common to think of archiving as the last link in the data management chain: the point when you are done with your research and want to "box up" your data and move on to other things. Unfortunately, many projects do not set aside time and resources for archiving. As a result, a lot of data is left in storage and never archived, which means it cannot be reused. Storing is not the same as archiving!

What to archive?

The goal should be to collect data of such high quality that it is publishable and shareable. In many cases, however, we collect data that should be discarded for one or more reasons. It is therefore up to the researcher to evaluate which data to archive.

As a general rule of thumb, all articles should be published with supporting data. This is also becoming the standard in many journals. Just be careful about the license that you select (or that is forced upon you).

It is also generally smart to release your data as a dataset, or as a database containing several datasets. If you do this properly (see below), the dataset may also be citable.

Where to archive?

This is the big question. So far, many researchers have simply left their data on our internal servers. That is not an archive. Journals increasingly require that the data used in a paper be deposited in a specific archive. This is a good idea, since it creates a link between the article and the data. However, it is not a satisfactory solution for long-term archiving. For example, in many cases only a subset of the dataset is deposited. In addition to such deposits, you should archive your data properly and in accordance with the FAIR principles (Findable, Accessible, Interoperable, Reusable).

DataverseNO

UiO is a partner in DataverseNO and has its own collection. Datasets uploaded by UiO researchers are curated by the UiO Library before they are made available. This is recommended by the DSC and UiO. The benefit of using DataverseNO is the curation process, which helps make your dataset useful to others. The downside is that publication takes a bit longer and that the archive is not as widely used as some others. RITMO recommends that you use DataverseNO. It is perfectly fine to deposit your data in multiple places.

The SIKT archive

Another option is to use SIKT's archive. The benefit of depositing data with SIKT is that a professional data curator checks the data before it is deposited. This ensures that the data files are readable and that the metadata are understandable. The service is free for UiO researchers. Data deposited with SIKT will be FAIR, but not necessarily open. If a dataset is closed, researchers need to contact SIKT to find out whether they can get access.

Field-specific archives

Instead of, or in addition to, using the DataverseNO archive, it may be relevant to use a field-specific archive. One relevant archive for mocap data is PhysioNet. It also has professional data curators who go through the data. An added benefit of this archive is that its curators know mocap data, so the data curation is more detailed.

"Data journals" are now also appearing, such as the Research Data Journal for the Humanities and Social Sciences. Here, data is published after a peer-review process.

General-purpose archives

Several archives are open to all sorts of data. The two below are examples that are often used at RITMO. There is no data curation with these. 

  • Open Science Framework. This is part of a larger suite of tools for Open Science practice. It is backed by a US-based organization, which has servers in Germany that comply with GDPR.
  • Zenodo. This is a pure data archive run by CERN and funded by the EU. Many researchers use it, and it is very easy to use.

If you go for a general-purpose archive, remember that there is no quality control of the data. As a result, the data may not fully comply with the FAIR principles.
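Since nobody curates a deposit in a general-purpose archive, a small self-check before uploading can catch the most obvious problems. The following is a minimal sketch in Python, not an official RITMO or FAIR tool. The function name `check_deposit`, the list of open formats, and the expected README file names are all illustrative assumptions, and passing the check does not mean that a dataset is FAIR.

```python
from pathlib import Path

# Assumption: an illustrative, non-exhaustive set of open file formats.
OPEN_FORMATS = {".csv", ".tsv", ".txt", ".json", ".xml", ".png", ".flac"}

# Assumption: file names we treat as dataset documentation.
EXPECTED_DOCS = {"README.txt", "README.md"}


def check_deposit(folder):
    """Return a list of human-readable warnings for a dataset folder."""
    root = Path(folder)
    files = [p for p in root.rglob("*") if p.is_file()]
    warnings = []

    # A deposit without any description is hard for others to reuse.
    if not any(p.name in EXPECTED_DOCS for p in files):
        warnings.append("No README found: add a description of the dataset.")

    # Flag data files whose extension is not in the open-format list.
    for p in sorted(files):
        if p.name in EXPECTED_DOCS:
            continue
        if p.suffix.lower() not in OPEN_FORMATS:
            warnings.append(f"{p.relative_to(root)}: consider an open format.")

    return warnings
```

Calling `check_deposit` on a dataset folder returns a list of warnings, which is empty if nothing was flagged. The check only looks at file names and extensions, so it complements, rather than replaces, a careful review of the metadata.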

Please be aware that other general-purpose archives may not comply with GDPR and/or may be commercial. You may be required to archive your data in such an archive as part of a publication process. If so, be careful about which type of license is used. You should also still archive the data in one or more free and open repositories.

Web page

For some time, it has been common to "archive" data on your own web pages. Remember that putting your data on a web page does not comply with the FAIR principles. However, it may be a good idea to make a landing page for your data. The Oslo Standstill Database, for example, describes the different datasets it contains, with pointers to the data stored with SIKT and PhysioNet.

Archiving Checklist

  • Start to think about archiving as part of the data collection
  • Think about the FAIR principles when storing the data. This includes adding proper metadata and the use of open formats.
  • If you are asked to archive data with an article: do it, but do not give away your copyright
  • If you don't know where to archive: deposit in DataverseNO
  • If you have a good field-specific archive, go for it
    • If not, choose a general archive
  • It is ok to archive in both DataverseNO and an open archive like Zenodo
  • Create a UiO landing page that points to your archive(s) if applicable
  • Cite your archive(s) like publications in Cristin
Published Jan. 26, 2021 10:46 AM - Last modified Aug. 27, 2025 10:55 AM