Mandatory Procedures
Basic policies: one set for HPRC as a whole, and cluster-specific ones for Grace and Terra.
On login nodes: CPU time is limited to 1 hour per login session and up to 8 cores can be used at once. For computationally-intensive tasks, submit jobs to SLURM.
Keep track of your current file count and disk usage. Both are shown each time you log in and can be checked with the command 'showquota'.
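For example, from a Grace or Terra login node (the output format may differ slightly between clusters, but it lists disk usage and file counts for each of your directories):

>$ showquota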
Note also the Account Management System: by default, you get 5000 SUs (CPU-hours) of batch time per year. If you need more SUs, you can email Mitchcomp Help (mitchcomp_help@physics.tamu.edu) and ask to be given hours from Prof. Toback's allocation.
Running and Submitting Jobs
Small scripts and jobs can be run on the Grace and Terra login nodes, but these are tightly limited, as noted above. Computationally intensive work should be submitted through the job scheduler, which sends your code to the HPRC compute nodes. The compute nodes are where the real muscle is; the login nodes you interact with directly are meant for just that: interacting with you, not doing big computations themselves.
- On both Terra and Grace, batch jobs are managed by SLURM. See the Terra SLURM and Grace SLURM HPRC wiki pages.
HPRC also occasionally has a SLURM short course; presentation slides are linked on this page.
If you are more used to another scheduler (such as LSF), HPRC also has a translation guide.
If you want to submit many concurrent jobs, consider using tamulauncher instead of a job array. Using tamulauncher makes more efficient use of HPRC resources, especially for many small jobs.
More complete documentation can be found on the tamulauncher wiki page, but the basic usage is simply to call 'tamulauncher commands.txt' in an LSF or SLURM job script, where commands.txt is a list of commands to run, one per line (see the sketch below).
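As a rough sketch (the job name, output file, time, and task count here are illustrative placeholders, not recommended values), a SLURM job script driving tamulauncher could look like this, with commands.txt containing one command per line:

#!/bin/bash
## Total wall time requested for working through the whole command list
#SBATCH --time=02:00:00
## Number of commands tamulauncher keeps running at the same time
#SBATCH --ntasks=28
#SBATCH --job-name=tamulauncher-demo
#SBATCH --output=tamulauncher.%j.log

# tamulauncher works through commands.txt, keeping --ntasks commands
# running at once until every line in the file has been executed.
tamulauncher commands.txt

Submit the script with sbatch as usual; see the wiki page for details on how tamulauncher tracks progress through the command list.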
Important Notes on Job Times
- Use the --time=HH:MM:SS option to specify how long your job is expected to run (see the example after this list). When the job finishes, only the actual time used will be charged to your SU allocation.
- The maximum time you can specify depends on the batch queue. The right queue “should” be chosen automatically based on your time request.
- Batch jobs cannot have their time extended once they are running. Choose the time needed when you submit.
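For example (a minimal sketch; the script name is a placeholder):

>$ sbatch --time=06:30:00 my_job.slurm

or, equivalently, put the request inside the job script itself:

#SBATCH --time=06:30:00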
Account Management System Notes
Each user has limited disk space, a maximum number of files, and a limited number of SUs. On HPRC you will be given one allocation (two for research associates) containing a set amount of SUs (service units, related to CPU hours). Faculty members are given additional allocations with more SUs per allocation for their research projects, and they can give their allocated SUs to their group members to use. But even if you are in the group of a faculty member with SUs to spare, you still need to create your own "basic" account (see the 'Project accounts' section in the Resource Allocations Policy). Below are some important notes about disk space, the maximum number of files, and SUs: how to manage them and how to request more.
- For more information about disk space and the maximum number of files, and how to request more, please visit the User Directories section in our HPRC New Users and Getting Started guide.
- SU stands for Service Unit and is the unit of compute time used by HPRC; it is basically one CPU hour on one core. For more information see the Overview of AMS.
- You are automatically allocated 5000 SUs of batch time per year. Your 5000 SU user allocation is renewed every year on 1 September.
- For how to use your SUs on the HPRC clusters, check the Manage your accounts HPRC wiki pages for more information.
- To get a larger allocation, you can email Mitchcomp Help (mitchcomp_help@physics.tamu.edu) and ask to be given hours from Prof. Toback's allocation; this is one of the reasons you are part of the MitchComp group.
- You can monitor your SU allocations on the Account Management System webpage. You may see more than one account listed: SUs allocated from our sources (for example, from Prof. Toback's group) go into separate account numbers from your annual user allocation.
- Once you have multiple accounts, you will need to specify which account to charge when you submit a job, using the "--account=#######" SLURM option (see the example below).
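For instance (the account number and script name below are made-up placeholders; use one of the account numbers shown on the AMS page):

>$ sbatch --account=123456789 my_job.slurm

or add the equivalent line to the job script:

#SBATCH --account=123456789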
Keeping Track of Your Jobs
Overview
You can submit and monitor jobs both directly on the command line and through the HPRC portal.
Tracking your jobs at the Grace/Terra Command Line
See in particular the tables here for Grace/SLURM and here for Terra/SLURM.
In short: use ‘squeue -u [user_name]’.
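For example (scancel, the standard SLURM command for cancelling a job, is included here for convenience; the job ID is a placeholder):

>$ squeue -u $USER     # list your pending and running jobs
>$ scancel 1234567     # cancel one of your jobs by its job ID, if needed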
Tracking your jobs using the OnDemand Portal
The HPRC OnDemand Portal can be used both to compose jobs and to monitor them. See the wiki page for more details.
Transferring Files
You have several options for transferring your files; the HPRC Wiki has a File Transfer page comparing them.
It is important to know the size of your transfer. While you can transfer small files through the login nodes, it is STRONGLY RECOMMENDED to transfer your big files through the Fast Transfer Nodes (FTNs) on Grace and Terra. Grace has two fast transfer nodes while Terra has one. Their addresses are:
Terra FTN:
- terra-ftn.hprc.tamu.edu
Grace FTN:
- grace-dtn1.hprc.tamu.edu
- grace-dtn2.hprc.tamu.edu
You should not transfer anything other than your login environment and personal scripts or documents into your /home directory. Other things should be transferred to $SCRATCH (that is, /scratch/user/YOUR-USERNAME).
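For example, from your own machine you could push a large file straight to your scratch directory through one of the Grace fast transfer nodes using standard scp or rsync (a sketch; replace YOUR-USERNAME and the file names with your own):

>$ scp big_dataset.tar.gz YOUR-USERNAME@grace-dtn1.hprc.tamu.edu:/scratch/user/YOUR-USERNAME/
>$ rsync -av results/ YOUR-USERNAME@grace-dtn1.hprc.tamu.edu:/scratch/user/YOUR-USERNAME/results/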
Transferring via Globus Connect
Globus Connect is our recommended way to transfer your files. It is easy to use, reliable, and fast. It has a web-based user interface, and you do not need the TAMU VPN to transfer files into/from HPRC when you are off campus.
Globus Connect transfers are not encrypted by default! To encrypt a transfer, go to the 'File Manager' tab, click on 'Transfer and Sync Options', and check the 'encrypted transfer' option.
To transfer files with Globus Connect follow these steps:
- Log in to Globus Connect. If you are a member of an organization such as TAMU or another university, a research center, or a laboratory, you can sign in using your existing organizational login (e.g. a TAMU NetID, which will give you access to the HPRC machines listed below).
- Now you need to choose your origin and destination (endpoints). Organizational endpoints (like HPRC endpoints) are predefined. Note that you’ll have to provide credentials for the relevant organization to access (or ‘activate’) its endpoints if you did not log in with those credentials. See this section of the Globus FAQs for more.
Search for your endpoint among the Globus collections. The following are HPRC endpoints:
Terra:
- TAMU terra-ftn
Grace:
- TAMU grace-dtn
- If you want to transfer files from your personal computer, you can create a private endpoint by installing Globus Connect Personal. Visit this webpage (it only works if you are already logged in).
- In the 'File Manager' tab, choose your origin, click on "Transfer or Sync to…" on the left side of your screen, and then choose your destination.
- Choose the files/folders you want to transfer and click start.
- You can log out now; once the transfer is done, Globus Connect will send you a notification by email.
Grid Resources
Unfortunately, unlike the previous Brazos cluster, HPRC is not a Tier3 site and is not on the Grid. Contact us if you need help and think a suitable replacement or stopgap could be implemented for your work.
Git
For information on how to access the Mitchcomp Github Team, see the Github guide.
In our group we use Git for many purposes, especially for CDMS computing. There are many resources online for learning Git, but we recommend these Atlassian tutorial pages. To access the TAMU GitHub pages, you can go to https://github.tamu.edu/.
Also, for CDMS users, you can find some help with Git in the Confluence documentation.
If you are first setting up on a new machine, you will probably want to go ahead and set up some global configurations:
>$ git config --global user.name "Batman"
>$ git config --global user.email batman@batcave.com
This will create a .gitconfig file in your home area (assuming the system already has git set up). Now all your repositories on this system will know your name and e-mail. Note, though, that 'local' (per-repository) git configs take precedence over your 'global' one.
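To see where each setting comes from, or to override one in a single repository (the repository name and e-mail below are just examples), you can use standard git commands:

>$ git config --list --show-origin        # show every setting and the file it came from
>$ cd my_repo
>$ git config user.email batman@tamu.edu  # a 'local' setting, stored in my_repo/.git/config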