Research data management

Managing backup copies

It is advisable to draw up a data management plan at the outset of a research project that specifies which data and related documents are to be retained and protected with backup copies.

A good rule of thumb is 3-2-1: 3 copies of files, on 2 different media, including 1 off-site copy.
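
The 3-2-1 rule can be checked mechanically against a list of planned copies. The following is a minimal sketch only; the `BackupCopy` class, the `satisfies_3_2_1` function and the location and medium labels are hypothetical names chosen for illustration, not part of any recommended tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    location: str   # hypothetical label, e.g. "laptop"
    medium: str     # hypothetical label, e.g. "internal disk", "cloud"
    off_site: bool  # True if the copy is kept away from the main workplace


def satisfies_3_2_1(copies):
    """Return True if the copies meet the 3-2-1 rule of thumb."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_off_site = any(c.off_site for c in copies)
    return enough_copies and enough_media and has_off_site


# Example plan (illustrative values only).
plan = [
    BackupCopy("laptop", "internal disk", off_site=False),
    BackupCopy("institutional server", "network share", off_site=False),
    BackupCopy("cloud service", "cloud", off_site=True),
]
print(satisfies_3_2_1(plan))  # prints True
```

Removing any one of the three copies from `plan` makes the check fail, which is the point of the rule: every copy counts.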

Ideally, all research data judged to be of high quality should be retained, together with the related documents, i.e. metadata, documents describing the methodology of data collection and database design, and documents describing the ways in which the database can be used or transformed.

  • There is no general rule on how often backups should be updated. As the research project progresses and new data becomes available, the stored files should be updated regularly.
  • It is therefore advisable to follow an update plan that automatically prompts a review to determine whether any changes need to be made to the database or any of the related documents.
  • As a general rule, consider the impact on the continuation of your work if files created or modified since the last backup were to disappear.
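
One way to judge what would be lost is to list the files that have changed since the last backup. The sketch below assumes that the time of the last backup is known as a POSIX timestamp and that file modification times are reliable; the function name `files_changed_since` is hypothetical.

```python
from pathlib import Path


def files_changed_since(root, last_backup_timestamp):
    """Return the files under root modified after the given POSIX timestamp."""
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")
        # Keep regular files whose modification time is later than the backup.
        if path.is_file() and path.stat().st_mtime > last_backup_timestamp
    )
```

Calling `files_changed_since("my_project", timestamp)` on a project folder (the folder name here is a placeholder) returns the paths that are not yet protected. A long list suggests that backups should be made more often.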

No technology is perfect, which is why it is advisable to use different technologies in your backup strategy to copy the same database. The most common technologies include:

  • Network directories: placed on institutional servers which are protected and automatically backed up on a regular basis.
  • Computer hard disk: flexible while the database is being developed, but must be used in conjunction with another technology to reduce the risk of breakage or loss.
  • Portable media (USB sticks, external hard drives, CDs, DVDs): affordable and useful on the move, but high risk of data loss or corruption.
  • Cloud directories: generally inexpensive commercial services. Variable levels of data protection and recovery capacity, depending on the type of service contract.
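
Whichever technology is used, a copy is only useful if it is identical to the original. As a hedged illustration, the sketch below copies a file and compares SHA-256 checksums of the source and the copy; the function names `sha256_of` and `copy_and_verify` are hypothetical, and the approach assumes that both locations are reachable as ordinary file paths.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy source to destination and raise OSError if the checksums differ."""
    destination = Path(destination)
    # Create the target folder if it does not exist yet.
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copy2 also tries to keep timestamps
    if sha256_of(source) != sha256_of(destination):
        raise OSError(f"Checksum mismatch after copying to {destination}")
    return destination
```

Storing the checksum alongside the backup also makes it possible to detect later corruption, for example on portable media.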

See the Storage recommendations section for what is recommended for the HEC community.

It is important to use a file format that will allow long-term use of the data. We recommend the use of open formats (txt, csv, tab, flac, xml), which facilitate access, and the use of Unicode encoding (e.g. UTF-8).
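
As an illustration, tabular data can be saved in an open format with an explicit encoding. The sketch below writes a CSV file in UTF-8 using Python's standard `csv` module; the function name `write_utf8_csv` and the sample values are hypothetical.

```python
import csv


def write_utf8_csv(path, header, rows):
    """Write a header and rows to a CSV file encoded as UTF-8."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


# Example call with accented characters (illustrative values only):
# write_utf8_csv("participants.csv", ["name", "city"], [["Zoé", "Montréal"]])
```

Stating the encoding explicitly avoids depending on the default of the operating system, which may differ from one computer to another.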

If you use proprietary software, it's important to record the name and version of the software, as well as any other details (operating system, software dependencies, etc.) that could have an impact on data access.
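
These details can be kept in a small text file stored next to the data. The following is a minimal sketch, assuming that the software name and version are entered by hand (they cannot be detected reliably) and that the dependencies of interest are installed Python packages; the function name `describe_environment` and the record layout are hypothetical.

```python
import json
import platform
from importlib import metadata


def describe_environment(software_name, software_version, dependencies=()):
    """Build a record of the software and system used to produce the data."""
    record = {
        "software": {"name": software_name, "version": software_version},
        "operating_system": platform.platform(),
        "python_version": platform.python_version(),
        "dependencies": {},
    }
    for name in dependencies:
        try:
            record["dependencies"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed; record that explicitly.
            record["dependencies"][name] = None
    return record


# "ExampleStats" and "1.0" are placeholder values, not a real product.
print(json.dumps(describe_environment("ExampleStats", "1.0"), indent=2))
```

Saving the resulting JSON text with each backup keeps the information available even if the original computer is no longer accessible.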