LSU HPC Policy
LSU HPC Usage Policy
1. Introduction
The HPC@LSU computing facilities, encompassing its hardware, software, network connections, and data, are a vital but limited resource. Therefore HPC@LSU is obligated to protect its facilities and ensure they are used properly. Allocations are used to assure equitable consumption of system resources, while personal access to the systems is controlled via user accounts. You, personally, are constrained by legal and other obligations to protect resources and the intellectual property of others on the systems.
Given the extensive nature of these resources, responsibilities are attached to the right of access. Responsible user conduct is required to assure fair access by all researchers. Failure to use these resources properly may result in various penalties, including, but not limited to, loss of access, administrative, academic, civil and/or criminal action.
2. Requirements
2.1. Individual Account Management
You have the responsibility to protect your account from unauthorized access, and for the proper expenditure of allocated resources. Tools are provided to help you manage your resources, but you are responsible for using these tools properly.
2.1.1. Valid Contact Email
Every account has a contact email address associated with it. Email is used as the primary communication channel between HPC@LSU and the user. The user is responsible for keeping this address up to date. In the event that an address is discovered to be invalid, the associated user account will be locked.
2.1.2. No Account Sharing
Your account is for your use only. It is not to be shared with anyone, including students and other collaborators. Others who need access must request their own account. User authentication certificates, if they are used, are considered personal accounts and are not to be shared.
2.1.3. Protecting Passwords
Passwords and certificates are the keys to account access. You are responsible for their protection and proper use. Protective behavior includes:
- never sharing passwords
- never writing passwords down where they can be easily found
- never using applications which expose passwords on the network (e.g. telnet).
See Guidelines, below, for more information.
The private key portion of an authentication certificate is the equivalent of a password. If you use certificates, you are responsible for ensuring that file and directory permissions prevent others from reading or copying any private keys.
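As a minimal sketch of the permission check described above, assuming a POSIX file system: the function names and the idea of passing the key's path are illustrative and not part of any HPC@LSU tool. The first function reports whether group or other users have any access to a private key file, and the second restricts the file to owner read/write only.

```python
import os
import stat


def private_key_is_protected(path: str) -> bool:
    """Return True if no group/other permission bits are set on the file."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def protect_private_key(path: str) -> None:
    """Restrict the file to owner read/write only (mode 0600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

The same check would need to be applied to the directory holding the key, since the policy covers both file and directory permissions.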
2.1.4. Mailing List Membership
While an account is active, the user is required to be a member of the users@lists.hpc.lsu.edu mailing list. This list is used to notify everyone of important changes and events. List membership is automatic upon creation of the account, and persists until the account is deactivated. Anyone wishing to be removed from the list should send an email to sys-help@loni.org and request that their account be terminated. This will result in removal from the email list the next time an active account scan is performed.
2.2. Fair Use
Access to a system does not grant you blanket authorization for any and all activity. You must recognize that reasonable use may vary from machine to machine within a system, depending on the function of the machine. Machines in a cluster serve one of two basic functions.
2.2.1. Computational Nodes
These are the nodes which perform the intended heavy computational work of a cluster. They are accessed via the job scheduler (e.g. LoadLeveler or PBS) through the submission of job control scripts. Computational tasks are never to be run on the cluster head nodes, as this has the potential of impairing operation of the entire system.
2.2.2. Head Nodes
Head nodes are used for the interactive work required to prepare a computational task. Appropriate work on the head nodes includes: light editing, compiling, post analysis previewing, and job script creation. Extensive post processing should be done on the compute nodes. Any processing on the head nodes which adversely impacts system operation is subject to preemptory termination. Repeated abuse may result in the locking of user accounts or other administrative action as deemed appropriate.
2.2.3. Authorized Use
Authorized use of the system is defined by the purpose outlined in the research proposal on which the usage allocation is based. Any resulting use is limited to activities reasonably required to accomplish that purpose.
2.2.4. Unacceptable Behaviors
The following activities are explicitly deemed unacceptable and are subject to the penalties outlined below:
- using, or attempting to use, computing resources without a valid allocation and user account
- usage for purposes other than those stated in the allocation proposal
- tampering with or obstructing the operation of the facilities
- reading, changing, distributing, or copying others' data or software without their explicit permission
- using HPC@LSU resources to attempt to gain unauthorized access to other (non-LSU) sites
- activities in violation of local or federal law
2.3. Reporting Suspicious Activity
You are responsible for reporting, as soon as possible, any suspicious activity you notice on your account, and any exposure or compromise of passwords, passphrases, or certificates. See Section 5 for reporting procedures.
2.4. Data Confidentiality
You are responsible for ensuring the confidentiality of any data you use on HPC@LSU resources that is restricted from general public access. Technologies are provided to preserve the confidentiality of data, but it is your responsibility to use that technology appropriately. Such data may include:
- Intellectual property, such as report drafts or research in progress.
- Proprietary data, such as data owned by a specific company, or licensed applications.
- Regulated data, such as medical, personal identifying information, or student records.
It is your responsibility to be aware of any requirements on a particular data set.
2.5. Acknowledgment
Papers, publications, and web pages of any material, whether copyrighted or not, based on or developed under LSU-supported projects must acknowledge this support by including the following statement:
- Portions of this research were conducted with high performance computing resources provided by Louisiana State University (http://www.hpc.lsu.edu).
2.6. Software Development
Software developed with allocations approved by HPC@LSU is subject to the guidelines published by the LSU Office of Intellectual Property, Commercialization, and Development.
2.7. Publication
Work performed under a peer-reviewed allocation must be published in the open literature.
2.8. Non-Academic User Requirements
Non-academic (corporate/industrial, government, etc.) users frequently have more stringent usage requirements than those that might be provided by HPC@LSU. It is the user's responsibility to assure the resources used satisfy the requirements of their organization.
2.9. Software Licenses
All software used on HPC@LSU systems must be appropriately acquired and used according to the specified licensing. Possession or use of illegally copied software is prohibited. Likewise, users shall not copy copyrighted software or materials, except as permitted by the owner of the copyright. Some installed software may require special authorization in order to be used. Users must abide by any licensing requirements and protect the software from misuse.
2.10. Final Reports
Requests for subsequent allocation awards will not be allowed until an end of project report has been received for all prior awards. It is recommended that continuing projects also include a copy of prior award final reports as an attachment to the submitted proposal.
2.11. Additional Requirements
Individual sites may be subject to organizational policies with additional requirements beyond this policy. Those organizations will make those policies available. It is your responsibility to be aware of and abide by those policies.
3. Penalties
Failure to abide by these policies may result in a variety of penalties.
3.1. Account Suspension/Revocation
Accounts may be temporarily suspended or permanently revoked if compromised or abused. Your account may be suspended without advance notice if there is suspicion of account compromise, system compromise, or malicious or illegal activity.
3.2. Loss of Allocation
Unauthorized behavior can result in loss of your current allocation, and may lead to the inability to obtain future allocations.
3.3. Administrative Action
Unauthorized activity may be reported to your PI, your advisor, or LSU authorities for administrative review and action.
3.4. Civil Penalties
Civil remedies may be pursued to recoup costs incurred from unauthorized use of resources or incident response due to compromise or malicious activity.
3.5. Criminal Penalties
Activities in violation of university, federal, state, or local law may be reported to the appropriate authorities for investigation and prosecution.
4. Disclaimers
4.1. Support/Diagnostic Access
HPC@LSU site personnel may review files for the purposes of aiding an individual or providing diagnostic investigation for HPC@LSU systems. User activity may be monitored as allowed under policy and law for the protection of data and resources. Any or all files on HPC@LSU systems may be intercepted, monitored, recorded, copied, audited, inspected, and disclosed to authorized site or law enforcement personnel, as well as authorized officials of other agencies, both foreign and domestic. By using HPC@LSU systems, users acknowledge and consent to this activity at the discretion of authorized site personnel.
4.2. Access Notification
Access to user data and communications will not normally be performed without explicit authorization and/or advance notice unless exigent circumstances exist. Post-incident notification will be provided in such cases.
5. Guidelines
The following are suggestions for helping maintain the security of your account.
5.1. Password Management
- Do not write down your password where it can be easily found and/or associated with your account.
- Do not tell anyone your password, not even HPC@LSU support staff. Support staff will never need your password, will never ask for it, and will never send a password by email, set a password to a requested string, or perform any other activity which could reveal a password.
- If someone insists they need your password to do something, report it to the HPC@LSU helpdesk: sys-help@loni.org.
- Do not store your password(s) in unencrypted files and, if possible, avoid storing them even in encrypted files.
- Pick passwords that are difficult to guess. Birthdays, family names, and single dictionary words are examples of easily guessed passwords.
- Change your password periodically, even if you have no reason to believe that anyone else has it.
5.2. Password Exposure
If you think your password may have been compromised or exposed, but have no reason to believe that your account has been used, change your password immediately.
5.3. Account Compromise or Suspicious Activity
If you believe your account has been compromised or find signs of suspicious activity, take the following actions:
- notify the HPC@LSU helpdesk immediately (sys-help@loni.org)
- do not modify files found in your account
- do not execute unknown programs you might find
- if possible, do not use your account until the issue is resolved
Some indications of account compromise include:
- files in your home directory or project areas which you did not create
- alteration or deletion of your files not done by you
- discrepancies between your allocation balance and what you think you have used
6. Contacts
The central point of contact for any problems and concerns with regard to this policy is the HPC@LSU help desk, which can be reached at sys-help@loni.org. Other modes of contact are listed on the HPC@LSU web site contact us page.
Last revised: 11 December 2009
LSU HPC Allocations Policy
1. System Allocations
LSU (Louisiana State University) maintains several high performance computing (HPC) systems. The machines currently include several Intel Xeon-based x86 clusters. Detailed information on the various systems can be found on the HPC web site. LSU controls access to these resources via formal user account and resource allocation processes. All decisions in these matters are made by the LSU HPC Resource Allocation Committee (HPCRAC), which is made up of faculty representatives from different disciplines across LSU.
A 2-step process is required to gain access to LSU HPC resources, be they computational time or data storage. The first step requires applying for a system user, or login, account. The second step requires associating the user account with an allocation. An allocation provides the means for assigning, tracking, and controlling resource consumption. While the process is similar for both computational time and storage space, there are separate policy documents for each. This particular document will focus on allocations for computational time.
An allocation of computational time is analogous to a bank account. It provides processor time that a user may expend as they see fit on one or more systems. When that time is consumed, no more work can be done until a new allocation is identified. Individuals who are awarded an allocation (called the principal investigator, or PI) have the ability to add users to their allocation account, thus allowing multiple people to use one allocation. The PI consequently retains ultimate responsibility for who uses their allocation and how it is expended.
Computational allocations are awarded in units called service units (SU), where 1 SU corresponds to running a program for 1 wall clock hour on 1 processing core. How cores are associated with a program varies from system to system. All allocations are awarded for 1 year periods, and are considered active until they expire, even if the SU balance has gone to zero. As the policy will outline, there are limits on both the total allocation amounts, and number of active allocations a PI may hold at any given point in time.
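The SU definition above can be illustrated with a short calculation. This is only a sketch: the function name is hypothetical, and since the way cores are associated with a program varies from system to system, the core count passed in is assumed to be whatever the system actually charges for.

```python
def service_units(cores: int, wall_clock_hours: float) -> float:
    """Return the SUs consumed: 1 SU = 1 processing core for 1 wall-clock hour."""
    if cores < 0 or wall_clock_hours < 0:
        raise ValueError("cores and wall_clock_hours must be non-negative")
    return cores * wall_clock_hours


# Hypothetical job: 64 cores held for 12.5 wall-clock hours.
print(service_units(64, 12.5))  # 800.0
```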
2. PI Qualification
For the purpose of computational allocations, only active LSU Baton Rouge campus faculty members, permanent research staff, and postdoctoral researchers (subject to HPCRAC Chair review and approval) are qualified to serve as a PI. Adjunct and Visiting professors do not qualify. Faculty members who have recently retired or relocated to another institution can qualify to serve as a PI ONLY if they are involved in advising actively registered students/researchers at LSU.
Requesting an allocation involves submission of a project proposal describing the intended usage and providing justification for the resource amounts requested. Actual submission is controlled by web pages on the HPC web site, and varies in complexity with the size of the request.
Once an allocation is awarded, the PI is granted access to tools which allow tracking of SU consumption, as well as the management of additional users who may use it. Associated with each allocation is an account code which is used by the system to determine how to charge for resources consumed. The PI is ultimately responsible for all users they authorize on an allocation, but has the option of formally delegating some other user on the allocation to manage this process in their stead. It becomes the joint responsibility of the PI and the authorized users to make sure the allocation is expended properly, and for the purposes intended.
LSU welcomes members of other institutions who are collaborating with LSU researchers to use LSU resources, but they cannot be the PI requesting standard allocations (Startup or Research) or granting access. They must ask a qualified PI at LSU to sponsor them for a user account and add them to an appropriate existing allocation.
3. Allocation Categories and Process
At the present time, allocations are machine-specific awards, allowing use only on the platform for which they are awarded. HPC resources are partitioned by category, and requests go through the HPCRAC (HPC Resource Allocation Committee). The HPCRAC represents several distinct approval authorities: the Vice President for Research and Economic Development (VPRED), the Center for Computation & Technology (CCT) Director, the HPCRAC Chair, and the HPCRAC as a panel. This reflects the sponsorship and many intended uses for LSU HPC resources. The resources intended for each use are collected into 5 categories as shown in Table 1.
Allocation Category | Available Resources | Allocation Authority |
---|---|---|
Economic Development | 10% | Vice President for Research and Economic Development |
Discretionary | 10% | Center for Computation and Technology Director |
Startup Allocation (0-150,000 SU's) | 15% | HPCRAC Chair |
Research Allocation (> 150,000 SU's) | 60% | HPCRAC |
To facilitate management of proposals, they will be assigned to one of the following classes (Table 2).
Class | Description |
---|---|
Economic | A proposal to assist commercial interests with the adoption of high performance computing as part of their business process. Awarded by the VCRED from the Economic Development resources. Economic allocations may be renewed by the VCRED. |
Discretionary | A proposal determined to be of value, but lying outside the normal allocation process. Awarded by the CCT Director from the Discretionary resources. Discretionary allocations may be renewed by the director. |
Startup | A proposal requesting an allocation for the purpose of exploring the value of high performance computing for a new project. A PI is allowed to have a maximum of 2 startup allocations active at any given time. Startup allocations of up to 150,000 SU's are awarded by the HPCRAC Chair. |
Research | A request for a large amount of time for a significant research project. May be awarded by simple majority agreement of the HPCRAC. Research requests are limited to 5 million SU's, and a PI may have a total of 9 million SU's active at any given time. With a proper compelling justification (e.g. the project requires a very large number of GPU hours) and upon the approval of the HPCRAC, the total SUs can potentially be increased to 12 million subject to the availability of computing resources. |
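The Startup and Research limits in Table 2 can be expressed as a simple pre-submission check. The sketch below is illustrative only: the function and constant names are hypothetical, it assumes the 9 million SU ceiling is measured as currently active SUs plus the new request, and it ignores the exceptional increase to 12 million SUs that the HPCRAC may approve.

```python
from typing import List

STARTUP_MAX_SU = 150_000            # upper bound of a Startup request
MAX_ACTIVE_STARTUPS = 2             # active Startup allocations per PI
RESEARCH_REQUEST_MAX_SU = 5_000_000  # upper bound of one Research request
PI_ACTIVE_TOTAL_MAX_SU = 9_000_000   # total active SUs per PI (assumed to include the request)


def check_request(requested_su: int, active_startups: int, active_total_su: int) -> List[str]:
    """Return a list of limit problems for a request; an empty list means none found."""
    problems: List[str] = []
    if requested_su <= 0:
        problems.append("request must be a positive number of SUs")
    elif requested_su <= STARTUP_MAX_SU:
        # Startup class: only the count of active Startup allocations is limited.
        if active_startups >= MAX_ACTIVE_STARTUPS:
            problems.append("PI already holds 2 active Startup allocations")
    else:
        # Research class: per-request and per-PI totals both apply.
        if requested_su > RESEARCH_REQUEST_MAX_SU:
            problems.append("Research requests are limited to 5 million SUs")
        if active_total_su + requested_su > PI_ACTIVE_TOTAL_MAX_SU:
            problems.append("request would put the PI above 9 million active SUs")
    return problems
```

For example, `check_request(100_000, 2, 0)` reports the Startup count problem, while `check_request(5_000_000, 0, 4_000_000)` returns an empty list under the assumptions stated above.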
All allocations have a duration of one year. Startup allocations may be awarded at any time during the year, while Research allocations are granted at the beginning of every calendar quarter.
Application deadlines for Research allocations are 5 p.m. local time on the 1st day of the month prior to the start of a calendar quarter, namely: December 1, March 1, June 1, and September 1. A PI may request, or be requested, to make a formal presentation to the HPCRAC in conjunction with the proposal submission.
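The deadline rule can be worked through with a short date calculation. This is a sketch with hypothetical function names; it treats naive `datetime` values as local time and assumes a submission made exactly at 5 p.m. still meets the deadline.

```python
from datetime import datetime

# Deadlines fall on the 1st of the month before each calendar quarter.
DEADLINE_MONTHS = (3, 6, 9, 12)


def next_research_deadline(now: datetime) -> datetime:
    """Return the first deadline (5 p.m. on Mar/Jun/Sep/Dec 1) at or after `now`."""
    for year in (now.year, now.year + 1):
        for month in DEADLINE_MONTHS:
            deadline = datetime(year, month, 1, 17, 0)
            if deadline >= now:
                return deadline
    raise AssertionError("unreachable: a deadline always exists within two years")


def quarter_start_for(deadline: datetime) -> datetime:
    """Return the start of the calendar quarter that follows a deadline."""
    if deadline.month == 12:
        return datetime(deadline.year + 1, 1, 1)
    return datetime(deadline.year, deadline.month + 1, 1)
```

A proposal submitted on 15 January would therefore target the March 1 deadline, for an allocation beginning April 1.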
At the end of any allocation, a short summary report is required by the HPCRAC. Missing reports may delay proposal processing. Any request for a continued allocation must include a summary report. See below for the desired report content; this template can be used as a reference. PIs should review their allocations, identify those missing reports, and attach reports by visiting their My Allocations page.
Allocations cannot be extended. They are based on estimated resources (core-hours) available during the award period, and those resources effectively dissipate, whether used or not, as time passes. The PI is ultimately responsible for assuring allocations are used at a rate needed to support their project over the time period of the allocation.
3.1. Standard Allocation Limits, Requests
3.1.1. Startup Allocations (≤150,000 SU's)
Who may apply: Any qualified PI may request a Startup allocation for up to 150,000 SU's. Consideration must be given to the limitation on the number of simultaneous Startup allocations a PI may have active at any one time, as specified in Table 2. The intent is to support low intensity projects, such as small analysis efforts or course work, or program benchmarking and characterization work in preparation for applying for a Research allocation.
How to apply: An application may be submitted using the HPC web interface. Completing the web form alone is sufficient. The text provided should briefly explain the computation time requested on the selected HPC platform, the intended methodology, and the current state of the codes or application that will be used.
Review and Awards: Startup allocations may be made at any time throughout the year, but their start time is set to the beginning of the allocation quarter they are made in.
3.1.2. Research Allocations (>150,000 SU's)
Types of research allocation and requirements: Depending on the amount of SUs being requested, research allocations are categorized into three types. As shown below, each type must include the required information and may include the optional information.
- Type A (>150,000 and ≤300,000 SUs): the proposal should include a technical merit section describing the background, problem statement, methodology, research schedule, and software characteristics (estimation of computational time) within a 4-page limit.
- Type B (>300,000 and ≤1,000,000 SUs): the proposal should include an additional section on impacts and outcomes of any previous allocations, in addition to the technical merit and software characteristics, within a 5-page limit. If the PI has no previous record of HPC allocation usage, the PI will be advised to submit a Type A proposal, or the committee will treat the proposal as Type A. A PI who is new to LSU but has previous usage records at other HPC facilities should describe them in the proposal.
- Type C (>1,000,000 SUs): the proposal should include an additional section on related external research funding and/or LSU internal demand (list research groups or a support letter), in addition to the sections on technical merit, software characteristics, and previous impacts and outcomes, within a 6-page limit.
Note: if a PI has multiple submitted proposals and active allocations at the review time and the total amount of SUs is more than 1,000,000, the review committee will consider the case similar to a type C proposal.
Type | Size | Technical merit | Software characteristics | Previous impact and outcome | External funding or LSU demand | Number of pages | Sample proposal |
---|---|---|---|---|---|---|---|
A | >150,000 and ≤300,000 SUs | Required | Required | Optional | Optional | 4 | |
B | >300,000 and ≤1,000,000 SUs | Required | Required | Required | Optional | 5 | |
C | >1,000,000 SUs | Required | Required | Required | Required | 6 | |
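The size thresholds in the table above map directly onto a small lookup. The sketch below uses hypothetical function names; the second function encodes the note about multiple proposals under the assumption that the 1,000,000 SU total includes the new request together with the PI's other submitted proposals and active allocations.

```python
from typing import Tuple


def research_proposal_type(requested_su: int) -> Tuple[str, int]:
    """Return (proposal type, page limit) for a Research request of the given size."""
    if requested_su <= 150_000:
        raise ValueError("requests of 150,000 SUs or less fall under Startup allocations")
    if requested_su <= 300_000:
        return ("A", 4)
    if requested_su <= 1_000_000:
        return ("B", 5)
    return ("C", 6)


def reviewed_like_type_c(requested_su: int, other_submitted_and_active_su: int) -> bool:
    """True if the combined total exceeds 1,000,000 SUs (assumed to include the request)."""
    return requested_su + other_submitted_and_active_su > 1_000_000
```

Under these assumptions, a 200,000 SU request from a PI who already holds 900,000 active SUs is a Type A proposal by size but would be reviewed similarly to Type C.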
Who may apply: Any qualified PI may request a Research allocation for more than 150,000 SU's, up to the maximum defined in Table 2. Research allocations can be for their own use, or on behalf of a research or operational group in support of a well-defined compute-intensive project. Requests must take into account any limitation given in Table 2 on size and the total SU's a PI may have active at any one time.
How to apply: A request for an allocation requires that a formal proposal be submitted via the HPC web interface to the HPCRAC. In addition to the web form information, the formal proposal must be in PDF format, and should comply with the requirements stated above. Proposals that fail to do so will NOT be reviewed by the committee.
The following proposal guideline should be followed:
- Background – Section describing how the resources will be used to address the problem (i.e. student access for course work, specific models, etc).
- Problem Statement – Section describing the desired outcomes of the project.
- Methodology – Section describing the computational methodology that will be used. This should include the applications required.
- Software characteristics – Section describing the performance, scaling, and other characteristics of the software application that will be used, ideally based on small scale test-runs on the resources of interest.
- Research Schedule – Section describing the research schedule, including the anticipated expenditure of granted resources (estimation of computational time). Allocations are assumed to be uniformly consumed over their lifetime. If this will not be the case, an estimate of expenditure by quarters is required.
- Impacts and outcomes of previous allocations – Section limited to 1 page or less, describing impacts and outcomes of any previous allocations. List reports, presentations, and publications that were enabled by previous allocations. Mandatory for Type B and C proposals.
- External funding and/or LSU internal demand – Section limited to 1 page or less, describing related external research funding and/or LSU internal demand (list research groups or a support letter). Mandatory for Type C proposals.
- All research allocation requests submitted by postdoctoral researchers must include 1) "Postdoctoral advisor: [fill in the name of advisor]" in the header (below the proposal title); and 2) a letter from the postdoctoral advisor stating that he or she fully supports the request. Support letters should be appended to the proposal.
Please bear in mind that the High-Performance Computing Resource Allocation Committee (HPCRAC) is comprised of faculty members with a very wide range of scientific backgrounds, and any of them may be assigned to review your proposal. When describing the goals of your research, as well as the tools and approaches you plan to use, write your proposal in such a way that it is informative to people working in the same or related fields, and understandable to a scientifically literate lay reader.
You can draft your allocation proposal using the following MS Word template:
Sample proposals for each type can be found here:
Please note that the PDF file is limited to a maximum of 2 MB, and the upload will fail if the file is over this limit.
Review and Awards: Applications for a Research allocation will undergo competitive peer review and will be allocated quarterly by the HPCRAC. Requests will be granted – in whole or in part – based on the availability of HPC resources and based upon the description of the proposed activities and the appropriate use of technology, with priority given to funded projects. Allocation decisions made by the HPCRAC are deemed final. Unresolved appeals will be directed to the Vice President for Research & Economic Development (VCRED).
Review Process: Each allocation proposal will have two reviewers, who will review the proposal based on the required and optional information it contains. After each reviewer recommends an amount of SU's, the two reviewers will decide the award for Type A or Type B proposals. For Type C proposals, the amount of SU's is decided by a majority of the allocation committee after the two reviewers make their recommendations.
3.1.3. Instructional Allocations
Instructional allocations are for academic courses or training activities. For each course or training event, the Instructor(s) may request an instructional allocation of no more than 200,000 SU's.
Who may apply: Any qualified PI may request instructional allocations.
How to apply: A request for an instructional allocation requires that a formal proposal be submitted via the HPC web interface to the HPCRAC. In addition to the web form information, the formal proposal must be in PDF format, and should include a course introduction, a statement of purpose, the number of students, and a justification for the requested amount of resources. The proposal should be limited to 2 pages. Proposals that fail to meet these requirements will NOT be reviewed by the committee.
Review and Awards: same as research allocations.
Review Process: same as research allocations.
3.2. Director's Discretionary Allocation Request (submit to CCT Director)
Each calendar year, up to 10% of the available HPC resources will be allocated at the discretion of the CCT Director and the Chief Information Officer (CIO). Allocations made from this pool will be for a period not to exceed one year. Awardees will be expected to work in conjunction with CCT or HPC@LSU technical staff to ensure that project implementations are in line with and strengthen the CCT's broad interdisciplinary mission. At the termination of a project (or annually, if an award is extended beyond one year) awardees shall provide a summary report (see below for desired report content) of project activities.
3.3. Economic Development or 'On Demand' Allocation Request (submit to VCRED)
The VCRED, in consultation with the chair of the HPCRAC and the CCT Director, may grant Economic Development or On Demand access to HPC resources for high priority projects such as a render farm for economic development purposes, hurricane storm surge modeling, or other possible operational responsibilities that are deemed sufficiently different from standard research proposals. Each calendar year, up to 10% of the available HPC resources may be allocated by the VCRED. At the termination of a project (or annually, if an award is extended beyond one year) awardees shall provide a summary report (see below for desired report content) of project activities.
3.4. Allocation Management
An allocation is considered valid so long as a positive resource balance remains, and the expiration date has not been exceeded. Once an allocation expires, or has been fully consumed, user accounts will be blocked from submitting work against the allocation. There is no mechanism for extending an allocation beyond one year, nor for adding resources once an allocation has been expended.
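The validity rule above combines two conditions, which can be sketched as follows. The `Allocation` record and its field names are hypothetical and do not describe the actual management system; the sketch also assumes an allocation is still usable on its expiration date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Allocation:
    balance_su: float  # remaining service units
    expires: date      # last day on which the allocation may be used (assumed inclusive)


def is_valid(allocation: Allocation, today: date) -> bool:
    """Valid only while a positive balance remains AND the expiration date has not passed."""
    return allocation.balance_su > 0 and today <= allocation.expires
```

Either condition failing is enough to block job submission, which is why a zero balance cannot be topped up and an unused balance cannot be carried past the expiration date.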
User accounts must be associated with a valid allocation, and if not, will be retained for a maximum of 1 year pending authorization against a valid allocation. With these restrictions in mind, the PI is required to use the tools provided to monitor system usage and control authorization of project member accounts. PIs are strongly advised to budget their usage carefully throughout the year. Automatic reminder emails will be sent by the management system as an allocation nears expiration. PIs are ultimately responsible for assuring that a current and actively monitored management email address has been assigned to each allocation.
At the end of any allocation, a short summary report must be submitted to the allocation committee (see below for desired report content). Failure to submit this report may be used in the consideration of future allocations.
3.5. Early Allocation Access
If a PI who has already been awarded a Research allocation by HPCRAC puts in a new request, and there is a good reason to start the project before the next cycle, then the HPCRAC Chair can instruct the HPC@LSU staff to award as much as 25% of the project request immediately. Justification must be provided in the proposal. The HPCRAC will be made aware of this action and its reasons by the HPCRAC Chair using the "LSU HPC Allocations" email list.
4. Machine Access Policy
4.1. Job Queueing
Various workload balancing algorithms are used to determine how jobs are assigned resources on a given machine. The way a job is handled is determined by the job queue it is submitted to. Efficient use of the queuing system requires that users request runtimes consistent with estimated runtimes of their jobs. In particular, requesting more time than is necessary for a particular job can lead to inefficient and unfair queuing. Therefore, users that routinely request more time than is needed for their jobs are subject to a priority penalty that will lower the priority of their jobs. Each system sets a maximum number of jobs that a single user may have running without special permission (see below). There is no limit to the number of jobs that a particular user may have queued. Users that wish to obtain a higher priority for their jobs may use special priority queues (see below).
The processors in each group are divided into preemptory and dedicated pools. Certain mission critical applications, such as storm surge prediction during a hurricane threat, are granted immediate access to processors in the preemptory pool. Processors in the dedicated pool are used to run all other job types. The processors are accessed through different job queues. There are 5 job queues which use different combinations of the processor pools, and allow for different job characteristics.
- Preempt Queue: The preempt queue controls access to the preemptory pool. Authorized applications submitted to this queue will cause the termination of all other user applications running on preemptory nodes.
- Checkpt Queue: The checkpt queue controls access to nodes in both the dedicated and preemptory pools. Jobs running in this queue may be terminated by the preempt queue, and are therefore assumed to support restarts based on periodically saved information. No refunds of lost SU's are offered if jobs in the checkpt queue are terminated by preemption. Users running jobs without restart capability assume this risk. However, the benefits of using this queue include access to larger numbers of nodes and/or faster throughput, depending on how busy the queue is.
- Workq: The workq queue controls access to nodes in the dedicated pool. Jobs in the work queue will run until they terminate as planned, their requested run time has expired, or they stop due to an abnormal system failure. Jobs which are terminated due to system errors beyond the user's control may be subject to refund of expended SU's. Poor planning is not considered grounds for a refund.
- Interactive Queue: The interactive queue gives real-time access to jobs for on-line analysis or debugging, but only allows very short run times. It supports development work, but not production.
- Priority Queue: The priority queue controls nodes in the dedicated pool, but allows applications to be given higher priority with prior approval. Approval may be granted during training sessions, for demonstration purposes, or other special needs. The SU's charged will be adjusted by a factor of 1.3, and no more than 20% of an allocation may be expended in the queue. Requests for priority access should be directed to sys-help@loni.org. This queue does not impact other running user jobs, but will delay the start of lower priority jobs already in the queue.
Note: The named queues do not necessarily exist on all machines, and the maximum time allowed in the queues will vary from machine to machine.
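The priority queue charging rule can be illustrated with a short sketch. The factor of 1.3 and the 20% cap come from the policy text above; the function names are hypothetical. The sketch assumes that the cap is measured in charged (adjusted) SU's, which the policy does not state explicitly.

```python
PRIORITY_FACTOR = 1.3  # SU multiplier for the priority queue (from the policy)
PRIORITY_CAP = 0.20    # share of an allocation usable in the priority queue (from the policy)


def priority_charge(base_sus):
    """Return the SU's charged for a priority-queue job that would cost base_sus elsewhere."""
    return base_sus * PRIORITY_FACTOR


def priority_budget_left(allocation_sus, priority_sus_charged):
    """Return how many charged SU's may still be spent in the priority queue.

    Assumption: the 20% cap applies to SU's after the 1.3 adjustment.
    """
    return max(0.0, allocation_sus * PRIORITY_CAP - priority_sus_charged)
```

Under these assumptions, a job that would cost 1,000 SU's in workq is charged about 1,300 SU's in the priority queue, and a 50,000 SU allocation can spend at most about 10,000 charged SU's there.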
4.2. Storage
Currently disk space usage is controlled via user quotas rather than on a per-project basis. Storage may become an allocated resource in the future. At such time, a request for storage space will be required in the allocation request. At the current time, an estimate of required space is requested.
4.3. Special Requests
A request for special access to LSU HPC machines (such as usage of all nodes on a machine or exceptionally long runs) must be explicitly stated in the proposal for HPC resources. Appeals to the decision of the HPCRAC may be made to the VCRED.
5. HPC Resources Allocation Committee Members
The HPCRAC membership is shown in Table 4. The CCT Director, in consultation with the VCRED, appoints the Chair of HPCRAC. The HPCRAC is the chartered approval authority for allocating HPC resources, and is charged with adopting the HPC Resources Allocations Policy and reporting its activities and recommendations to the VCRED.
Name | Department | Contact Email |
---|---|---|
Michal Brylinski (Chair) | Biological Sciences | mbrylinski@lsu.edu |
Juana Moreno | Physics | moreno@phys.lsu.edu |
Zuo "George" Xue | College of the Coast & Environment | zxue@lsu.edu |
Shawn W. Walker | Mathematics | walker@math.lsu.edu |
Mayank Tyagi | Petroleum Engineering | mtyagi@lsu.edu |
Jeremy Brown | Biological Sciences | jembrown@lsu.edu |
Revati Kumar | Chemistry | revatik@lsu.edu |
6. Terminology
Summary report: This report (a template is provided as a reference) should be a maximum of 5 pages and include the following information, commensurate with the level of effort:
- Principal user information [PI] (name, status, department, phone, email, institution [if different from LSU])
- Summarize the nature of the LSU-sponsored research in a non-technical fashion, suitable for public consumption. (Approximately one page; no more than two pages)
- Describe any potential applications of the research to industry or government.
- Describe any use of this allocation to encourage education in computational science, and particularly the level of student involvement and any HPC elements incorporated into formal courses.
- List any publications/presentations/reports that resulted from the work. All such items should include acknowledgement of the HPC resources used as required by the Usage Policy.
Definition of Service Unit (SU): Currently, one SU corresponds to one hour of wall-clock time on one processing core. A single machine will have multiple nodes (individual servers) available, and multiple cores within each node. This allows many cores to be used for a single parallel processing job, but can lead to not-so-obvious charges. For example, running for 1 hour using 8 cores on an 8-core node consumes 8 SU's. On the other hand, running for 1 hour on 1 core of the same 8-core node, which allows the single core to access all of the node memory, also consumes 8 SU's. In simple terms, the number of cores that are reserved for a job, and hence are unavailable to others, is the number used to calculate SU usage.
HPC resources at LSU will be allocated/charged according to the number of “service units (SU's)” required/used, where:
# SU's = m * #Nodes * Wall_Time,
and,
m = number of processing cores per Node;
Wall_Time = Total Wall Clock Hours.
For example, if a machine has 4 processing cores per Node (m = 4), a program that ran for 24 hours on 32 nodes required 4*32*24 = 3072 SU's.
Note: Starting on July 1, 2023, jobs running in the gpu queue on SuperMike-3 are charged 64 SUs for each GPU-hour they consume. For instance, a job that runs for two hours with two GPU devices in the gpu queue is charged 2 hours * 2 GPUs * 64 = 256 SUs.
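The two charging rules above can be checked with a few lines of Python. This is a minimal sketch that reproduces the worked examples in the text; the function names are hypothetical and are not part of any HPC@LSU tool.

```python
GPU_SU_RATE = 64  # SUs per GPU-hour in the SuperMike-3 gpu queue (from the note above)


def cpu_sus(cores_per_node, nodes, wall_hours):
    """SU's = m * #Nodes * Wall_Time, where m is the number of cores per node.

    Whole nodes are charged, so a job using one core of a node still
    pays for every core on that node.
    """
    return cores_per_node * nodes * wall_hours


def gpu_sus(gpus, wall_hours):
    """SUs charged for a gpu-queue job: GPU-hours times the per-GPU-hour rate."""
    return gpus * wall_hours * GPU_SU_RATE


print(cpu_sus(4, 32, 24))  # 3072, the 32-node example above
print(cpu_sus(8, 1, 1))    # 8, one hour on one 8-core node
print(gpu_sus(2, 2))       # 256, the gpu queue example above
```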
Last revised: 5 June 2015
LSU HPC Storage Policy
See also: ITS Faculty Data Storage.
1. Storage Systems
Available storage on LSU@HPC high performance computing machines is divided into four file systems (Table 1). More complete descriptions follow below.
File System | Description |
---|---|
/var/scratch | Local storage on each node where the existence of files is guaranteed only during job execution (i.e. files could be deleted as soon as the job finishes). |
/work | Shared storage provided to all users for job input, output, and related data files. Files are not backed up and may be subject to purge! |
/home | Persistent storage provided to each active user account. Files are backed up to tape. |
/project | Persistent storage provided by request for a limited time to hold large amounts of project-specific data. Files are not backed up. |
All file systems share a common characteristic. When they are too full, system performance begins to suffer and all users suffer from the effects. Similarly, placing too many individual files in a directory degrades performance. System management activities are aimed at keeping the different file systems below 80% of capacity. It is strongly recommended that users hold file counts below 1,000, and never exceed 10,000, per subdirectory. If atypical usage begins to impact performance, individual users may be contacted and asked to help resolve the issue. When performance or capacity issues become significant, system management may intervene by requiring users to offload files, stop jobs, or take other actions to ensure the stability and continued operation of the system.
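The per-directory file count guidance can be checked with a short script. The following is a minimal sketch; the 1,000 and 10,000 thresholds come from the recommendation above, while the function name and status labels are hypothetical.

```python
import os

SOFT_LIMIT = 1_000    # recommended: stay below this many files per subdirectory
HARD_LIMIT = 10_000   # never exceed this many files per subdirectory


def crowded_dirs(root, soft=SOFT_LIMIT, hard=HARD_LIMIT):
    """Walk root and return (path, file_count, status) for crowded directories.

    Only non-directory entries directly inside each directory are counted;
    files in nested subdirectories count toward those subdirectories.
    """
    flagged = []
    for dirpath, _dirnames, filenames in os.walk(root):
        count = len(filenames)
        if count > hard:
            flagged.append((dirpath, count, "over limit"))
        elif count >= soft:
            flagged.append((dirpath, count, "over recommended"))
    return flagged
```

Calling `crowded_dirs` on a directory tree returns an empty list when every subdirectory is within the recommended range. Walking a very large tree generates metadata load of its own, so a check like this is best run sparingly.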
Management makes every effort to avoid data loss. In the event of a failure, every effort will be made to recover data. However, the volume of data housed makes it impractical to provide system-wide data backup. Users are expected to safeguard their own data by making sure that all important code, scripts, documents and data are transferred to another location in a timely manner. The use of inexpensive, high capacity external hard drives attached to a local workstation is highly recommended for individual user backup.
2. File System Details
2.1. /var/scratch
/var/scratch space is provided on all compute nodes, and is local to each node (i.e. files stored in /var/scratch cannot be accessed by other nodes). The size of this file system will vary from system to system, and possibly across nodes within a system. This is the preferred place to put any intermediate files required while a job is executing. Once the job ends, the files it stores in /var/scratch are subject to deletion. Users should not have any expectation that files will exist after a job terminates, and are expected to move the data from /var/scratch to their /work or /home directory as part of the clean up process in their job script.
2.2. /work
/work is the primary file storage area to utilize when running jobs. /work is a common file system that all system nodes have access to. It is the recommended location for input files, checkpoint files, other job output, as well as related data files. Files on /work are not backed up, making the safekeeping of data the responsibility of the user.
Users may consume as much space as needed to run jobs, but must be aware that since the /work file systems are fully shared, they are subject to purge if they become overfull. If capacity approaches 80%, an automatic purge process is started by management. This process targets the oldest and largest files, removing them in turn until the capacity drops to an acceptable level. The purge process will normally occur once per month, as necessary. In short, use the space required, but clean up afterwards. See the description of /project space below for longer term storage options.
2.3. /home
All user home directories are located in the /home file system. /home is intended for the user to store source code, executables, and scripts. /home may be considered persistent storage. The data here should remain as long as the user has a valid account on the system. /home will always have a storage quota, which is a hard limit. While /home is not subject to management activities controlling capacity and performance, it should not be considered permanent storage, as system failure could result in the loss of information. Users should arrange for backing up their own data, even though /home is periodically backed up to tape.
2.4. /project
The /project file system may be available on some systems, and provides storage space for a specific project. /project space is allocated and must be requested. To qualify as the PI of a storage allocation, the user must satisfy the PI qualifications of a computational allocation and must have an active computational allocation when the request is submitted. Space is made available for 12 months at a time. Shortly before an allocation expires, the user will be notified of the upcoming expiration. Users may request to have the allocation extended. Renewal requests should be submitted at least 1 month prior to expiration to allow decision and planning time. Users should have no expectation that data will persist; it may be erased any time after 1 month from a project’s expiration. Thus the user is encouraged to employ alternate safekeeping and protection solutions of their own.
2.5. System Specific Information
System | File System | Storage (TB) | Quota (GB) | Purge File Limit (Million) |
---|---|---|---|---|
SuperMike-III | /work | 840 (LPFS)[2] | N/A | 4 |
SuperMike-III | /home | (NFS)[3] | 10 | N/A |
SuperMike-III | /project | 840 (LPFS) | By Request | |
SuperMIC | /work | 840 (LPFS)[2] | N/A | 4 |
SuperMIC | /home | (NFS)[3] | 10 | N/A |
SuperMIC | /project | 840 (LPFS)[4] | By Request | |
Deep Bayou | /work | 840 (LPFS)[5] | N/A | 4 |
Deep Bayou | /home | (NFS)[3] | 10 | N/A |
Deep Bayou | /project | 840 (LPFS)[5] | By Request | |
3. Job Use
On all systems, jobs must be run from the /work file systems, and not from the /home or /project file systems. The individual nodes assigned to a job will have access to their local /var/scratch space. Files should be copied from /home or /project space to /work before a job is executed, and back when a job terminates, to avoid excessive I/O during execution that degrades system performance.
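The copy-in, run, copy-back pattern described above might look like the following minimal sketch. The helper names and directory layout are hypothetical, and `dirs_exist_ok` requires Python 3.8 or later. On a real system the same steps would normally be part of the job script.

```python
import shutil
from pathlib import Path


def stage_in(source_dir, work_dir):
    """Copy job inputs from /home or /project space into /work before the job runs."""
    work_dir = Path(work_dir)
    # Creates work_dir (and any missing parents) and merges into it if it already exists.
    shutil.copytree(source_dir, work_dir, dirs_exist_ok=True)
    return work_dir


def stage_out(work_dir, results_subdir, dest_dir):
    """Copy one results subdirectory from /work back to /home or /project after the job."""
    src = Path(work_dir) / results_subdir
    dest = Path(dest_dir) / results_subdir
    shutil.copytree(src, dest, dirs_exist_ok=True)
    return dest
```

Copying back only a results subdirectory, rather than the whole working directory, keeps large intermediate files out of quota-limited /home space.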
4. /project Allocation Requests
Limitations: Space is allocated on the /project file systems by request for periods of up to 12 months at a time. Renewal requests are allowed but, in the interest of fairness, are subject to availability and competing requests. Please be advised that the data stored on the /project file systems is NOT backed up and it is the users' responsibility to ensure that any data of importance is backed up somewhere else.
How to Apply: A storage allocation may be requested by completing a web form (click here). The provided information should fully justify the need for the storage, and indicate how the data will be handled in the event that the allocation cannot be renewed. A user group can be set up for sharing access to the space if the Principal Investigator (PI) includes a list of HPC user names. The PI can provide this list either in the "Description of the need for this storage" section of the web form or by sending an email to sys-help@loni.org with the subject "[Creating|Adding] members to a storage allocation".
Allocation Class: Allocations are divided into 3 classes (small, medium, and large), each with a separate approval authority (Table 3). All storage allocation requests are limited by the available space, and justifications must be commensurate with the amount of space required. Each PI is allowed only one storage allocation per /project storage volume.
All requests, regardless of class, need to describe how the allocation will be used and why the /work volume is insufficient. In addition, large storage allocation requests need to briefly describe a data plan that justifies a large allocation, including storage size calculations based on cluster model runs, data sizes, software package size, etc. An estimate of how long the storage will be needed is also requested. Many researchers include this information in their CPU allocation request, where they have described their research in detail, and often just attach their CPU allocation request PDF to the storage allocation request or reference it. Please also note that, since all storage allocations are cluster specific, the PI must clearly indicate for which cluster the storage allocation is being requested.
Storage Allocation Policy:
- Each storage allocation can only have one Principal Investigator (PI), who is responsible for the administrative tasks of the allocation, such as renewal and membership management;
- The PI of a storage allocation is the steward of all data stored under it (see the LSU Security of Data policy);
- In the event that a member needs to be removed from a storage allocation, it is the PI's responsibility to notify the HPC staff and clean up the data owned by that person;
- The maximum large storage allocation request is 20 TB per PI;
- Storage allocations are for ongoing HPC use and are not meant for archival storage. Any storage allocation request (including renewal) without an active CPU allocation will be rejected;
- When a storage allocation expires, the PI has a grace period of up to 4 weeks to copy the data off. The data will be removed when the grace period expires.
Class | Size | Approval Authority |
---|---|---|
Small | Up to 100GB | HPC@LSU Staff |
Medium | Between 100GB and 1TB | HPC@LSU Operations Manager |
Large | Over 1TB | HPC@LSU Director |
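The class boundaries in the table, together with the 20 TB maximum from the policy list, can be expressed as a small lookup. This is only a sketch: the function name is hypothetical, it assumes 1 TB = 1000 GB, and it assumes requests of exactly 100 GB and exactly 1 TB fall into the smaller class, since the table does not say how boundary values are treated.

```python
GB_PER_TB = 1000                   # assumption: decimal units
MAX_REQUEST_GB = 20 * GB_PER_TB    # 20 TB maximum per PI (from the policy list)


def storage_class(size_gb):
    """Return (class, approval authority) for a /project request of size_gb gigabytes."""
    if size_gb <= 0:
        raise ValueError("requested size must be positive")
    if size_gb > MAX_REQUEST_GB:
        raise ValueError("request exceeds the 20 TB maximum per PI")
    if size_gb <= 100:
        return ("Small", "HPC@LSU Staff")
    if size_gb <= 1 * GB_PER_TB:
        return ("Medium", "HPC@LSU Operations Manager")
    return ("Large", "HPC@LSU Director")
```

For example, `storage_class(500)` returns `("Medium", "HPC@LSU Operations Manager")` under these assumptions.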
[1] Global Parallel File System
[2] Lustre Parallel File System
[3] Network File System
[4] Physically shares /work
[5] Shared with the SuperMIC cluster
Last revised: 10 Dec 2014
Attribution
Users are asked to acknowledge their use of LSU HPC resources in resulting publications and reports with the following statement:
Portions of this research were conducted with high performance computational resources provided by Louisiana State University (http://www.hpc.lsu.edu).