LONI Storage Policy
Last revised: 11 December 2014
1. Storage Systems
Available storage on LONI high performance computing machines is divided into four file systems (Table 1).
File System | Description |
---|---|
/var/scratch | Local storage on each node, where the existence of files is guaranteed only during job execution (i.e. files could be deleted as soon as the job finishes). |
/work | Storage provided for the input, intermediate, and output files of a job. Files are not backed up and may be subject to purge! |
/home | Persistent storage provided to each active user account. Files are backed up to tape. |
/project | Persistent storage provided by request for a limited time to hold large amounts of project-specific data. Files are not backed up. |
All file systems share a common characteristic: when they are too full, system performance begins to suffer. Similarly, placing too many individual files in a directory degrades performance. System management activities are aimed at keeping the different file systems below 80% of capacity. It is strongly recommended that users hold file counts below 1,000, and never exceed 10,000, per subdirectory. If atypical usage begins to impact performance, individual users may be contacted and asked to help resolve the issue. When performance or capacity issues become significant, system management may intervene by requiring users to offload files, stopping jobs, or taking other actions to ensure the stability and continued operation of the system.
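The per-subdirectory file count guidance can be checked before it becomes a problem. The following is a minimal sketch, not an official LONI tool; the function names `crowded_directories` and `report` are hypothetical, and the two limits are simply the figures quoted above.

```python
import os

WARN_LIMIT = 1_000    # recommended ceiling per subdirectory (figure from the policy)
HARD_LIMIT = 10_000   # ceiling that should never be exceeded (figure from the policy)


def crowded_directories(root, warn_limit=WARN_LIMIT):
    """Return (path, file_count) pairs for directories that are not below warn_limit.

    Only non-directory entries directly inside each directory are counted.
    Results are sorted with the most crowded directory first.
    """
    crowded = []
    for dirpath, _dirnames, filenames in os.walk(root):
        count = len(filenames)
        if count >= warn_limit:
            crowded.append((dirpath, count))
    return sorted(crowded, key=lambda item: item[1], reverse=True)


def report(root):
    """Print one line per crowded directory under root."""
    for path, count in crowded_directories(root):
        level = "over hard limit" if count > HARD_LIMIT else "above recommended"
        print(f"{count:>8}  {level:<18} {path}")
```

Calling `report` on a personal /work directory lists the subdirectories worth splitting up or archiving into a single tar file.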
Management makes every effort to avoid data loss. In the event of a failure, we will make every effort to recover data. However, the volume of data housed makes it impractical to provide system-wide data backup. Users are expected to safeguard their own data by making sure that all important code, scripts, documents and data are transferred to another location in a timely manner. The use of inexpensive, high capacity external hard drives attached to a local workstation is highly recommended for individual user backup.
2. File System Details
2.1 /var/scratch
/var/scratch space is provided on all compute nodes, and is local to each node (i.e. files stored in /var/scratch cannot be accessed by other nodes). The size of this file system will vary from system to system, and possibly across nodes within a system. This is the preferred place to put any intermediate files required while a job is executing. Once the job ends, the files it stores in /var/scratch are subject to deletion. Users should not expect files to exist after a job terminates, and are expected to move the data from /var/scratch to their /work or /home directory as part of the cleanup step in their job script.
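A cleanup step of this kind could look like the sketch below. It assumes the job kept its intermediate files in a single directory under /var/scratch; the function name `save_scratch_results` and both paths in the usage note are hypothetical, and `dirs_exist_ok` requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def save_scratch_results(scratch_dir, dest_dir):
    """Copy everything under scratch_dir into dest_dir, then delete scratch_dir.

    Returns the sorted relative paths of the files that were copied.
    """
    scratch = Path(scratch_dir)
    dest = Path(dest_dir)
    if not scratch.is_dir():
        return []  # nothing was written to scratch, so nothing to save

    # Record what is about to be copied, relative to the scratch directory.
    copied = sorted(
        p.relative_to(scratch).as_posix()
        for p in scratch.rglob("*")
        if p.is_file()
    )

    dest.mkdir(parents=True, exist_ok=True)
    # Copy first and remove only afterwards, so a failed copy leaves scratch intact.
    shutil.copytree(scratch, dest, dirs_exist_ok=True)
    shutil.rmtree(scratch)
    return copied
```

At the end of a job this might be called as `save_scratch_results("/var/scratch/my_job", "/work/my_user/my_job")`, where both directory names are placeholders.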
2.2 /work
/work is the primary file storage area to utilize when running jobs. /work is a common file system that all system nodes have access to. It is the recommended location for input files, checkpoint files, other job output, as well as related data files. Files on /work are not backed up, making the safekeeping of data the responsibility of the user.
Users may consume as much space as needed to run jobs, but must be aware that, since the /work file systems are fully shared, they are subject to purge if they become overfull. If capacity approaches 80%, an automatic purge process is started by management. This process selects files by age and size, starting with the oldest, and removes them in turn until capacity drops to an acceptable level. The purge process will normally occur once per month, as necessary. In short, use the space required, but clean up afterwards. See the description of /project space below for longer term storage options.
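To clean up ahead of a purge, it helps to see which files are most likely to be selected. The sketch below lists the oldest files under a directory, with larger files first among files of equal age. The exact criteria used by the purge process are not stated here, so ranking by modification time is an assumption, and `purge_candidates` is a hypothetical name.

```python
import os


def purge_candidates(root, limit=20):
    """Return up to `limit` (path, size_bytes, mtime) tuples, oldest first.

    Assumption: file age is taken from the modification time. Files with the
    same modification time are ordered largest first.
    """
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Negating the size makes an ascending sort put larger files first.
            entries.append((info.st_mtime, -info.st_size, path))
    entries.sort()
    return [(path, -neg_size, mtime) for mtime, neg_size, path in entries[:limit]]
```

Files near the top of the list are good candidates to archive elsewhere or delete once a job's results have been saved.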
2.3 /home
All user home directories are located in the /home file system. /home is intended for the user to store source code, executables, and scripts. /home may be considered “persistent” storage. The data here should remain as long as the user has a valid account on the system. /home always has a storage quota, which is enforced as a hard limit. While /home is not subject to management activities controlling capacity and performance, it should not be considered permanent storage, as system failure may result in the loss of information. Users should arrange for backing up their own data.
2.4 /project
The /project file system may be available on some systems, and provides space specific to a project. /project space allocations must be requested, and they are available for a limited time period. To qualify as the PI of a storage allocation, the user must satisfy the PI qualifications of a computational allocation and must have an active computational allocation when the request is submitted. Allocations are typically 12 months or less. Two months before an allocation expires, the user will be notified by email. Users may request to have the allocation extended. Renewal requests should be submitted at least 1 month prior to expiration to allow decision and planning time. If the storage allocation is not extended, the user will have 1 month after the expiration date to off-load their data. Users should have no expectation that data will persist after a project’s expiration, so alternate safekeeping and data protection actions must be taken in advance.
3. System Specific Information
System | File System | Storage (TB) | Quota (GB) | Purge File Limit (Million) |
---|---|---|---|---|
QB2 | /work | 400 (Lustre) | (none) | 4 |
QB2 | /home | (NFS) | 5 | N/A |
QB2 | /project | 800 (LPFS) | By request | N/A |
QB3 | /work | 1500 (Lustre) | (none) | 4 |
QB3 | /home | (NFS) | 10 | N/A |
QB3 | /project | 1500 (LPFS) | By request | N/A |
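Because the /home quota is a hard limit, it can be useful to know how close a home directory is to it. The sketch below sums file sizes under a directory and expresses the total as a fraction of a quota given in GB. It is only an approximation: the function names are hypothetical, 1 GB is assumed to mean 10^9 bytes, and the file system's own accounting (block sizes, other owners' files) may differ from a simple sum of file sizes.

```python
import os


def directory_size_bytes(root):
    """Return the total apparent size, in bytes, of all files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # skip files that disappear or cannot be read
    return total


def quota_fraction(root, quota_gb):
    """Return usage under root as a fraction of quota_gb (assumes 1 GB = 10**9 bytes)."""
    if quota_gb <= 0:
        raise ValueError("quota_gb must be positive")
    return directory_size_bytes(root) / (quota_gb * 10**9)
```

For example, `quota_fraction(os.path.expanduser("~"), 5)` estimates usage against the 5 GB figure listed for QB2; a value approaching 1.0 suggests moving data elsewhere.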
4. Job Use
On all systems, jobs must be run from the /work file systems. Running jobs from the /home or /project file systems is strongly discouraged. Files should be copied from /home or /project space to /work before a job is executed, and back when a job terminates, to avoid excessive I/O during execution that degrades system performance.
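The copy-in, run, copy-back pattern described above can be sketched as follows. This is an illustration only: `run_staged` and the `job` callable are hypothetical, the `input` and `output` subdirectory names are arbitrary, and a real job would normally be launched through the batch scheduler rather than called as a Python function.

```python
import shutil
import tempfile
from pathlib import Path


def run_staged(input_dir, work_root, results_dir, job):
    """Stage inputs into a fresh run directory, run the job there, copy results back.

    job is a callable taking (inputs_path, outputs_path). work_root must already
    exist. Returns the run directory so the caller can delete it after checking
    the results.
    """
    # A unique run directory avoids collisions between concurrent jobs.
    run_dir = Path(tempfile.mkdtemp(prefix="run_", dir=work_root))
    inputs = run_dir / "input"
    outputs = run_dir / "output"

    shutil.copytree(input_dir, inputs)   # stage in: e.g. /home or /project to /work
    outputs.mkdir()

    job(inputs, outputs)                 # all job I/O happens inside the run directory

    # Stage out: copy results back to persistent storage.
    shutil.copytree(outputs, results_dir, dirs_exist_ok=True)
    return run_dir
```

Keeping every read and write inside the run directory is what keeps heavy I/O off /home and /project while the job executes.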
5. /project Allocation Requests
Limitations: Space on the /project file system is available by request only, for periods of up to 12 months. A request for an initial or renewal allocation is made via a web form as described below. In the interest of fairness, renewals are subject to availability and competing requests.
How to Apply: A storage allocation may be requested by completing a web form. The provided information should fully justify the need for the storage, and indicate how the data will be handled in the event the allocation cannot be renewed. A user group can be set up for sharing access to the space if the requestor includes a list of user names.
Allocation Class: Allocations are divided into 3 classes, each with a separate approval authority (Table 3). All storage allocation requests are limited by the available space, and justifications must be commensurate with the amount of space required.
Class | Size | Approval Authority |
---|---|---|
Small | Up to 100GB | LONI Operations Manager |
Medium | Between 100GB and 1TB | LONI Director |
Large | Over 1TB | LONI Resource Allocation Committee |
Requests that must be approved by the LONI Resource Allocation Committee will be forwarded to allocations@loni.org.
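The class boundaries above can be expressed as a small lookup, which may help when estimating who will review a request. This is a sketch under stated assumptions: `allocation_class` is a hypothetical helper, 1TB is taken to be 1,000GB, and a request of exactly 100GB or exactly 1TB is assigned to the lower class, which the table does not spell out.

```python
def allocation_class(size_gb):
    """Return (class_name, approval_authority) for a requested size in GB.

    Assumptions: 1 TB = 1000 GB; boundary values fall into the lower class.
    """
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    if size_gb <= 100:
        return ("Small", "LONI Operations Manager")
    if size_gb <= 1000:
        return ("Medium", "LONI Director")
    return ("Large", "LONI Resource Allocation Committee")
```

For instance, a 250GB request would map to the Medium class and be reviewed by the LONI Director.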