The HFS Storage Protect backup service is a valuable University resource with a large but ultimately finite capacity. Certain guidelines are therefore required to ensure the fair use of resources so that the facilities can be used as widely as possible.
Currently we recommend that a single backup account store no more than 10TB of data. We are aware that some systems manage much greater amounts, but unfortunately our resources do not extend to covering these. Nor do we support 'partitioning' such systems into multiple backup accounts to circumvent the 10TB limit; such systems must therefore implement their own backup solution.
Independently of the amount of data stored, where a large number of files is held locally we strongly recommend, and may insist, that these are either bundled into a local archive file before being sent to an HFS Storage Protect server, or that the local disk is repartitioned into smaller filesystem/volume/drive partitions. There is no fixed limit here: whether this is required depends partly on the client machine's own resources and its ability to sort large numbers of files, but typically the figure is several million file objects in any one partition. The archive approach is sketched below.
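For illustration, the following Python sketch shows one way to bundle a directory of many small files into a single archive in a staging area before backup. The paths and staging layout are hypothetical examples, not part of the service.

```python
# A minimal sketch of pre-bundling: pack many small files into one archive
# so the backup client handles a single large object instead of millions
# of small ones. All paths here are hypothetical.
import tarfile
from pathlib import Path

source = Path("/data/many_small_files")    # directory holding millions of files
staging = Path("/backup_staging")          # area that is backed up instead
staging.mkdir(parents=True, exist_ok=True)

with tarfile.open(staging / f"{source.name}.tar.gz", "w:gz") as tar:
    tar.add(source, arcname=source.name)   # recursively add the whole tree
```

The original directory would then be excluded from backup, with only the staging area sent to the server.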
Additionally we insist that our clients:
- Connect to our service and upload data at a reasonable speed: see Connection speed to the HFS Storage Protect servers.
- Try to avoid backing up certain types of data: see What to exclude from backups.
- Try to ensure that only University work-related data is included in the backup: see What to exclude from backups.
Connection speed to the HFS Storage Protect servers
Long slow backups can cause considerable problems for the service: they tie up server resources that need to be shared across many clients. Consequently, we require that clients upload data at a rate high enough that their backups complete in a reasonable time.
A client session will be cancelled after twelve hours. Note that this is an extreme measure enacted to manage server resources: it should not be seen as a threshold below which all backups will be tolerated. The guiding principle is: if you have a large amount of data to back up, then you must have the machine resources and network bandwidth with which to do it.
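As a rough illustration of what the twelve-hour cut-off implies, the following Python sketch computes the minimum sustained upload rate needed to send a given amount of data within one session. The example sizes are arbitrary, not service limits.

```python
# Illustrative arithmetic only: the minimum sustained upload rate needed
# to move a given amount of data within the twelve-hour session limit.
def min_upload_rate_mbit(data_gb: float, hours: float = 12.0) -> float:
    """Return the Mbit/s needed to transfer data_gb (decimal GB) in `hours`."""
    return data_gb * 8_000 / (hours * 3_600)  # GB -> megabits, then per second

for gb in (100, 500, 2_000):                  # example backup sizes
    print(f"{gb:>5} GB in 12h needs ~{min_upload_rate_mbit(gb):.0f} Mbit/s sustained")
```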
Please ensure that, as well as sufficient network bandwidth, a client machine has the CPU and memory resources to sort and process the required number of files. As a guide, the IBM backup client requires around 300 bytes of memory per file, so a partition of 3 million files will require just short of 1GB of memory, as the sketch below illustrates.
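This guideline figure can be checked with a quick calculation; the 300 bytes per file is the estimate quoted above.

```python
# Rough client-memory estimate for scanning a partition, using the
# ~300 bytes per file object quoted above for the IBM backup client.
BYTES_PER_FILE = 300

def scan_memory_gib(n_files: int) -> float:
    """Approximate memory (GiB) needed to sort n_files file objects."""
    return n_files * BYTES_PER_FILE / 1024**3

print(f"{scan_memory_gib(3_000_000):.2f} GiB")  # ~0.84 GiB: just short of 1GB
```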
Should a client consistently fail to complete its backups, and no transient problem (for example a network misconfiguration) is the cause, then we reserve the right to exclude that client from the service.
Current limits are documented in our FAQ item What limits are there on use of the HFS Storage Protect Service?. Note that there is currently no upper limit on connection speed.
What to exclude from backups
The HFS Storage Protect backup service is intended for active data that is in use by current members of the University. Certain file types which should not be backed up are excluded from backup by a set of default exclusions. However, these exclusions cannot catch all examples. Therefore we ask that users ensure that the following data are excluded from Storage Protect backup on all client machines which they own or manage.
- Data that is not related to your work with the University.
- Data on drives shared from other machines (this is excluded by default on Windows machines).
- Block-level (or other format) files that constitute an 'image' of a machine or part of a machine.
- Backups of the local machine or of other machines (for example native Windows backups; macOS Time Machine is excluded by default).
- Virtual machine images which are not already excluded by default (the default exclusions cover *.vmdk(.*) and *.vmem files).
- Copyrighted data for which you are not the owner, licensee, or holder of the copyright.
- Other bootable images on a multi-platform-boot system (for example the /WINDOWS partition on a Linux system).
- Files which are continually re-stamped with a new date and/or time, as these are resent on every backup.
- Duplicate data: it is a waste of resources to back up the same data more than once, either from one or from multiple machines.
- Outlook archive files, called archive.pst: these are date-stamped by Outlook every time it is run, and therefore get needlessly resent by the backup software on every backup. (As an alternative, backup of these files is permitted if you detach them from Outlook via File > Data File Management. Please contact hfs@ox.ac.uk if you wish to do this.)
- Large database files: you should not use the standard Backup-Archive Client to back up either large running databases or flat file 'dumps' of such databases.
Examples of how to exclude files and folders from Storage Protect backups can be found in our page on how to exclude files and folders from backup. Further help can be obtained by contacting hfs@ox.ac.uk.
Additionally
We may also insist on local measures being implemented so that the backup of demanding data does not consume a large amount of our system resources. These local measures may include, but are not limited to, excluding files, repartitioning disks into smaller partitions, or preprocessing files before backup (for example by bundling them into archives, as described above).