School of Physics and Astronomy Wiki

This is an old revision of the document!


Data storage on Unix

Home directory storage and backups

Your home directory on the Unix cluster has a usage quota, so that no individual user takes up too much space. The standard quota is normally around 10 GB. You can see your current usage and quota by visiting MyPhys.

If you reach your quota and, after reviewing your file usage, find that you still need more space, you can reply to the warning email to request an increase. Please understand, however, that the amount of space available is limited. Your home directory is not intended for large research data sets, for which separate project-specific storage should be used.
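When reviewing your file usage, it helps to see which directories take up the most space. The following is a minimal sketch using the standard du, sort and head utilities; the function name largest_entries is hypothetical and not something provided on the cluster.

```shell
# List the N largest entries directly inside a directory, biggest first.
# Usage: largest_entries [DIR] [N]   (DIR defaults to $HOME, N to 10)
largest_entries() {
    dir=${1:-$HOME}
    count=${2:-10}
    # du -sk prints one total per entry, in kilobytes.
    # The .[!.]* glob also picks up hidden entries, where caches often live;
    # errors from unmatched globs or unreadable files are discarded.
    du -sk "$dir"/* "$dir"/.[!.]* 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, largest_entries ~ 5 prints the five largest files or directories at the top level of your home directory.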

<note>Please avoid running write-intensive jobs against your home directory, as this causes slowdowns for all other users on our system. We may be forced to kill any processes which are causing such problems.</note>

The home directories are backed up nightly, and in addition filesystem “snapshots” are created every few hours (currently at 09:00, 12:00, 15:00, 18:00).

Project areas

If you need a shared area for your project where multiple people can develop or store code, we can create such an area and back it up for you. This is intended for small-scale usage; if you need to store large data sets, see “Project data storage” below.

Project data storage

Other file systems are provided for research or project-specific data, under the /data hierarchy. This storage space is purchased by the research group. It can take the form of simple single drives in Linux workstations, part of the shared research RAID pool, or dedicated RAID systems for large-scale storage needs.

These file systems are usually named either after the research group (for a fileserver volume) or after the workstation which hosts them, and contain further directories organized by user or by project. These areas should be used for large data sets and storage for local processes. Note that these areas are provided by the automounter: they are not mounted until they are first accessed, so they won't necessarily appear in the output of commands like df.
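The automounter behaviour can be illustrated with a short sketch: access the area first, then ask df about it. The function name show_data_area is hypothetical, and the path in the usage note is a placeholder, not a real area name.

```shell
# Access a path so the automounter mounts it, then report its usage.
# Usage: show_data_area PATH
show_data_area() {
    area=$1
    # Listing the directory is the access that triggers the automount.
    if ! ls "$area/" >/dev/null 2>&1; then
        echo "cannot access $area" >&2
        return 1
    fi
    # Once the area has been accessed, df can report on it.
    df -h "$area"
}
```

For example, show_data_area /data/mygroup would mount and report on an area named mygroup, assuming such an area exists for your group.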

<note warning> Research data areas are not backed up

  • For our ZFS data storage, we create nightly snapshots which are kept for 2 days; these can help you recover accidentally deleted files. Linux RAID storage does not have snapshots.
  • If you have critical research data which requires backup, you should talk to us about the available options.

</note>

Local "scratch" storage

“Scratch” space is not backed up, and is generally used only for temporary storage.

Your scratch directory on the local system is always named /scratch/local/username, and can be accessed using the environment variable $SCRATCHDIR. Since it's directly connected to the computer you're using, access to data on it is generally faster than to network storage (such as your home directory). This can make it a good choice for processing bulky data. However, don't place any files you may want to keep long-term here: files which have not been accessed for 30 days or more may be purged from this area, and files may also be removed when a workstation is updated.
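To see which of your scratch files fall under the 30-day rule, you could use find with its access-time test. This is a sketch only: the function name stale_scratch_files is hypothetical, the actual purge mechanism may use different criteria, and access times are only meaningful if the filesystem records them.

```shell
# Print files under a scratch directory that were last accessed
# DAYS or more days ago.
# Usage: stale_scratch_files [DIR] [DAYS]   (defaults: $SCRATCHDIR, 30)
stale_scratch_files() {
    dir=${1:-${SCRATCHDIR:-}}
    days=${2:-30}
    if [ -z "$dir" ] || [ ! -d "$dir" ]; then
        echo "no usable scratch directory: pass one or set SCRATCHDIR" >&2
        return 1
    fi
    # find counts age in whole 24-hour periods, and "+N" means strictly
    # more than N, so "DAYS or more" is written as -atime +(DAYS - 1).
    find "$dir" -type f -atime +"$((days - 1))"
}
```

Run without arguments, it checks $SCRATCHDIR against the 30-day threshold described above.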

<note>When running jobs under Condor, the environment variable CONDOR_SCRATCH_DIR gives the name of a directory where the job may place temporary data files.</note>

Temporary directories

Please don't use /tmp for creating temporary files, if you can help it. This area is very limited in size, and filling it may cause problems for the system.

We define an environment variable TMPDIR, which points to a suitable (and larger) area for temporary files. Also consider whether SCRATCHDIR (described above) might be more suitable.
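In a script, one way to respect TMPDIR is to create a private working directory beneath it with mktemp. The sketch below assumes only the standard mktemp utility; the function name make_workdir and the work. prefix are hypothetical choices.

```shell
# Create a private working directory under $TMPDIR and print its path.
# Falls back to /tmp only if TMPDIR is unset or empty.
make_workdir() {
    base=${TMPDIR:-/tmp}
    # mktemp -d creates the directory itself and replaces the X's
    # with a unique suffix.
    mktemp -d "$base/work.XXXXXX"
}
```

A script might call workdir=$(make_workdir) and then register trap 'rm -rf "$workdir"' EXIT so that the directory is removed when the script finishes.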

Remote access to file systems using sshfs

You can use sshfs, which is built on FUSE (the userspace filesystem driver), to mount any remote filesystem to which you have ssh access. This lets you access files on systems outside of Tate Lab. For example:

mkdir -p ~/mnt  # create a mount point; the name is arbitrary
sshfs username@remotesystem:/path/to/mount ~/mnt

Later, to unmount it:

fusermount -u ~/mnt
computing/department/unix/file_storage.1450284687.txt.gz · Last modified: 2015/12/16 10:51 by allan