Data Replication and Backups

Learn about critical data backup options and planning for Cybersecurity Certification for Research (CCR).

You must have a critical data backup plan in place to complete your Cybersecurity Certification for Research (CCR). Because data backup planning can become complicated, the CCR team welcomes and encourages consultations. Contact us at ccr-support@ucsd.edu.

What Should Be Backed Up?

Laptops and workstations

All laptops and workstations should be backed up. Commercial software such as CrashPlan or Druva can be used for this.

Small to medium scale data storage

Researchers with 1-20 TB of data are encouraged to use the San Diego Supercomputer Center’s (SDSC) Qumulo storage service, which is available in small increments through Research IT Services. Email ccr-support@ucsd.edu for more information.

Large scale data storage

Researchers with more than 20 TB of data should explore the full set of options available on the UC San Diego Research Data Storage Explorer.

The Library's Research Data Curation service can help support the long term storage and availability of data sets, and offers general data curation strategy consultation.

Types of Backups

It's important to maintain trusted copies of your data according to best practices. Consider the following criteria when implementing a data recovery strategy to minimize the risk of data loss, particularly the loss of research data.

We'll be referring to the first copy of data as a replica and separate copies as backups.

Replicas

Some data at UC San Diego (e.g., email, iCloud, Google Drive, OneDrive) is already replicated to the cloud. These replicas are insurance against risks like theft or hardware failure, and can provide access to important data when your primary system (laptop or desktop) isn’t available.

These replicas, however, are not necessarily protected from severe threats such as ransomware.

Backups

Backups are separate copies of your data that ensure you are able to recover the data if needed.

Backups are typically updated at regular intervals and are designed to protect against accidental and malicious data loss events such as:

  • Deletion
  • Corruption
  • Hardware failures
  • Malware
  • Ransomware attacks

Software and services that provide the right level of protection for backups will have features that make it harder for older copies to be deleted or modified. These features prevent an attacker from removing the old copies and encrypting the current data.

Replica and backup services often create multiple copies of the data on their systems to reduce the odds of a customer’s data being lost from a single failure.

Requirements and Best Practices

Determine what you need to back up

Choosing what to back up is often a question of how much data you have and whether there is a cost.

Survey the scope of your data, including laptops, desktops, and servers.
For modest amounts of data, e.g., on a laptop, a departmental or other provided solution may already be available.

If you have several servers in a rack or a local storage device, keeping multiple copies of even a few backups will likely require a larger-scale solution.

Prioritize

Here are some questions to help determine what data to prioritize:

  • What is the highest priority data?
  • Do you need to back up raw data, the results of analysis, or both?
  • Can any of the data be retrieved elsewhere or recreated?
  • What would be the cost and time involved to recreate the data?
  • What is the cost of backing up the data?
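
One way to work through these questions is to record rough answers for each data set and sort on them. The Python sketch below is only an illustration under stated assumptions: the `Dataset` fields, the `prioritize` function, and the ordering rule (data that cannot be recreated first, then the largest gap between recreation cost and backup cost) are hypothetical, not a CCR requirement.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical record of the answers to the questions above."""
    name: str
    recreatable: bool      # can it be retrieved elsewhere or recreated?
    recreate_cost: float   # rough estimate of the cost and time to recreate it
    backup_cost: float     # rough estimate of the cost of backing it up


def prioritize(datasets: list[Dataset]) -> list[Dataset]:
    """Put data that cannot be recreated first, then sort by
    (recreate_cost - backup_cost), largest gap first.

    This ordering rule is an assumption chosen for illustration.
    """
    return sorted(
        datasets,
        # False sorts before True, so non-recreatable data comes first.
        key=lambda d: (d.recreatable, -(d.recreate_cost - d.backup_cost)),
    )
```

The cost figures only need to be consistent with each other (for example, all in dollars or all in staff hours) for the ordering to be meaningful.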

Backup Solution Criteria

After you've prioritized the data to be backed up, use these recommended criteria for both the solution you choose and the policies you implement.

Backups should:

  • Occur at least daily
  • Include data you cannot replace (e.g., recently collected instrument or survey data)
  • Be stored in a physically secure and off-site location
  • Be protected from being overwritten or destroyed (immutable)
  • Include several points in time or versions of your data
  • Not automatically expire (and delete) any data
  • Support all the file types and sizes that comprise your data
  • Be encrypted. Encryption is highly recommended to ensure integrity and privacy while in transit over a network and while stored (at rest).
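
The criteria above can also be written down as a simple checklist for comparing candidate solutions. The following Python sketch is one possible way to do that and is not a CCR tool: the `BackupPolicy` fields, the `unmet_criteria` function, and the reading of "several" versions as "more than one" are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BackupPolicy:
    """Hypothetical description of a backup setup; field names are illustrative."""
    interval_hours: float        # time between backup runs
    covers_irreplaceable: bool   # data you cannot replace is included
    offsite: bool                # stored in a physically secure, off-site location
    immutable: bool              # copies cannot be overwritten or destroyed
    versions_kept: int           # number of points in time retained
    auto_expires: bool           # old data is automatically expired and deleted
    supports_all_files: bool     # no file types or sizes are excluded
    encrypted_in_transit: bool
    encrypted_at_rest: bool


def unmet_criteria(p: BackupPolicy) -> list[str]:
    """Return the criteria from the list above that the policy does not meet."""
    checks = [
        (p.interval_hours <= 24, "occur at least daily"),
        (p.covers_irreplaceable, "include irreplaceable data"),
        (p.offsite, "stored off-site"),
        (p.immutable, "immutable"),
        # Assumption: "several" is read here as more than one version.
        (p.versions_kept > 1, "several versions retained"),
        (not p.auto_expires, "no automatic expiry"),
        (p.supports_all_files, "supports all file types and sizes"),
        (p.encrypted_in_transit and p.encrypted_at_rest,
         "encrypted in transit and at rest"),
    ]
    return [label for ok, label in checks if not ok]
```

An empty list means every criterion is met. Encryption is described above as highly recommended rather than mandatory, so a policy that fails only that check may still be acceptable in some cases.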

Where to Back Up Your Data

On-Premise Backups

Establishing a local backup solution requires considerable management, whether it is a dedicated backup hardware/software solution or the simple, straightforward, and inexpensive strategy of copying your data to an external hard drive or memory stick.

On-premise methods have a number of risks:

  • Someone must remember to swap out the storage device to retain offline copies of your data.
  • Availability and access to local storage devices are subject to catastrophic events, such as flood or fire, in the building.
  • It is difficult to protect local (on-premise) backups from being overwritten by malware or ransomware.
  • Storage devices must be maintained and upgraded regularly, and can break or be stolen.

Third-Party Backup Services

Backup services typically include important features such as replication of your data across multiple locations, encryption, versioning, and ease of management.

A full-service backup solution may include the client software and provide the storage for your data. 

Consider the following:

  • How frequently are backups updated?
  • How many versions of file changes are retained, and where are they stored?
  • How long are previous versions retained?
  • Are certain types of files or sizes excluded from backups?
  • If the source data is deleted, how long will it be retained in the backups?
  • What is the total cost for a managed backup solution?

Cloud Backup

Writing your own backup scripts or using backup software with cloud storage (e.g., S3 in Amazon Web Services) can be a complex process that is better suited to projects that plan to use, or already use, cloud computing or storage.

This solution is also suited to projects with extremely large archival storage needs that are not addressed by other systems.
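
To give a sense of what even a very small backup script involves, here is a minimal Python sketch that uses only the standard library. It makes several assumptions: a local folder (`backup_root`) stands in for the cloud storage target, the function names `snapshot` and `verify` and the `MANIFEST.sha256` file name are hypothetical, and it does not provide encryption, immutability, or off-site storage, all of which a real solution would still need.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(source: Path, backup_root: Path) -> Path:
    """Copy `source` into a new timestamped folder and write a checksum manifest.

    Each run creates a separate folder, so earlier points in time are kept.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    target = backup_root / stamp
    data_dir = target / "data"
    shutil.copytree(source, data_dir)  # raises if this snapshot folder exists
    lines = []
    for path in sorted(data_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(data_dir).as_posix()
            lines.append(f"{sha256_of(path)}  {rel}\n")
    (target / "MANIFEST.sha256").write_text("".join(lines))
    return target


def verify(target: Path) -> list[str]:
    """Return relative paths that are missing or whose hash no longer matches."""
    mismatched = []
    for line in (target / "MANIFEST.sha256").read_text().splitlines():
        expected, rel = line.split("  ", 1)
        stored = target / "data" / rel
        if not stored.is_file() or sha256_of(stored) != expected:
            mismatched.append(rel)
    return mismatched
```

Calling `snapshot(Path("project"), Path("backups"))` on a schedule would produce one folder per run, and `verify` can be run later to detect corrupted or altered copies. Scheduling, retention, and protection of the snapshot folders themselves are left out, which is part of why this approach can become complex.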

Other Campus-Recommended Backup Solutions

Some departments and divisions offer backup services through divisional IT. These will be listed here as we become aware of them:

Scripps Institution of Oceanography (SIO) offers Code42 CrashPlan to members of SIO

Institute of Geophysics and Planetary Physics (IGPP) offers Code42 CrashPlan to members of IGPP

Resources and Definitions

Backup Information Resource

For help with any research computing and data matters, contact Research IT Services at research-it@ucsd.edu.

Definitions

Disaster Recovery Plan (DR)

Broader than backups, a DR plan includes policies and procedures to enable a complete return to productivity after a catastrophic event.

Digital preservation

The effort to ensure access to data over a long period, preserve its integrity and protect against obsolescence of technology such as changes in format, hardware or software.