Notur

A guide for using the Notur facilities

This page provides information about the resources and services provided by the Notur project. The information is primarily for potential and new users of the national compute facilities.


Contents:

   1. General information
   2. Use of the resources
   3. Projects: proposals, allocations
   4. Access: user account, passwords, secure shell, data transfer
   5. Storage
   6. Execution of applications
   7. Programming
   8. Scientific publications and acknowledgments
   9. User support


1. General information

The Notur project operates a number of high-end supercomputer facilities. The facilities are available to individuals and groups at the Norwegian universities and colleges, the Meteorological Institute, and any other projects that are funded by the Research Council of Norway (Norges forskningsråd) or the Ministry of Education and Research (Kunnskapsdepartementet).

Detailed information about the Notur project can be found elsewhere on this website (www.notur.no). We mention the following:

  • A brief description of the Notur project and its organization
  • A description of the available hardware facilities
  • A list of available software on each of the facilities
  • How to get access to the facilities, including allocations and user accounts
  • Technical support for the facilities, including help-desk assistance, end-user documentation and application support

2. Use of the resources

The facilities are operated by the university partners of the Notur project. To ensure a smooth service for all users, certain rules and regulations are required. The local regulations for use of the general IT-infrastructure at each university also apply to the Notur facilities operated by that university. The support staff at NTNU, UiB, UiO, and UiT can provide further information on this if needed.

Each of the facilities is set up as a shared resource. At any time, each facility is used by multiple users and several applications from different users are executed simultaneously. Even if the facilities are large in terms of the number of processors, memory, and storage, there are limitations to their capacities. If one or more of these limitations is exceeded, the performance of the overall system may be seriously degraded. Due to the special nature of the Notur facilities, some additional rules, guidelines, and recommendations for use of these facilities are necessary. They are meant to ensure that all users are satisfied with the provided services, to ensure that the facilities provide reliable and robust services to all the users, and to ensure efficient utilization of the facilities.

It is important that the user complies with the rules and regulations of the Notur project and those of the universities. This applies to all types of resources provided by the Notur project, including compute facilities (processors, memory, and processor/node interconnect), disk storage and secondary storage (tape), and the network. If you are in doubt whether your (intended) usage of a facility conforms to the rules and regulations, please contact the local support staff.

3. Projects: proposals, allocations

To be able to execute applications on a facility, one needs to apply for a CPU-hour allocation (or quota) for that facility. The proposals are evaluated by a Resource Allocation Committee (RFK) that has been appointed by the Research Council of Norway. The procedures and criteria for obtaining an allocation can be found here. Allocations are assigned to research projects with a sound scientific objective. The applicant for an allocation must make sure that the information in the proposal is accurate and up-to-date.

Resources on the machines are allocated per allocation period. There are two allocation periods per year, each lasting six months: one starts on the first working day in April and the other on the first working day in October.
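
The start dates of the allocation periods can be worked out with a short script. The following is a minimal sketch, not an official tool: the function names are made up for this illustration, and it treats Monday to Friday as working days while ignoring public holidays, so the real start date may be later when the first weekday of the month is a holiday.

```python
from datetime import date, timedelta
from typing import Tuple


def first_working_day(year: int, month: int) -> date:
    """Return the first Monday-to-Friday date of the given month.

    Assumption: public holidays are ignored.
    """
    day = date(year, month, 1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day += timedelta(days=1)
    return day


def allocation_period_starts(year: int) -> Tuple[date, date]:
    """Start dates of the two allocation periods that begin in `year`."""
    return first_working_day(year, 4), first_working_day(year, 10)


if __name__ == "__main__":
    for start in allocation_period_starts(2023):
        print(start.isoformat())
```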

Allocations on the facilities take the form of a project. The applicant is the project responsible. Project responsibles must inform UNINETT Sigma about any changes in the information that was provided in the proposal and any deviations that may lead to usage of the resources that differs from what was originally planned.

Project names are of the form nn****k, where * is a digit [0-9].
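
A script that handles project names can check this form with a regular expression. The sketch below assumes only what is stated above (the letters "nn", four digits, the letter "k"). The function name and the example name nn1234k are hypothetical.

```python
import re

# Pattern taken from the text: "nn", four digits, then "k".
PROJECT_NAME = re.compile(r"nn[0-9]{4}k")


def is_project_name(name: str) -> bool:
    """Return True if `name` has the form nn****k with * a digit."""
    return PROJECT_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    # "nn1234k" is a made-up name used only for illustration.
    print(is_project_name("nn1234k"))
```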

Several users can be attached to one project. These users share the allocation that is given to the project. Each user in a project shall ensure that the allocation is used only for the activities specified in the project proposal. Failure to comply with these requirements may lead to reduction or withdrawal of the allocation and to closure of user accounts.

Each (active) project that received an allocation during a year is required to submit a Usage Report to Notur by the end of that year.

Other regulations with respect to allocations are:

  • Once the allocation for a project on a facility is exhausted during the on-going allocation period, it is no longer possible for that project to execute computational tasks on that facility. In case more resources are needed, the project must apply for extra allocations by using the normal procedures.
  • Moving allocations between Notur facilities (for the same project) is only possible after a request is sent to and approved by the RFK.
  • Moving allocations between different projects (from project A to project B) is in principle not possible. This requires an application for extra allocations for project B.
  • The Notur project monitors whether allocations are actually being used during the on-going allocation period. Once it is noticed that a project is not using its allocation, a request will be sent to the project (responsible) about the expected usage for the remainder of the period. If the project expects to use considerably less than the assigned allocation or the project does not respond to the request within reasonable time, the allocation may be reduced in size.
  • Unused allocations at the end of the period cannot be moved to a future period.

Each facility has a command-line function (cost) that gives a brief status of the accounting statistics for a user or project. This includes the total allocation (quota) and the usage (CPU-hours consumed) so far.
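
Before submitting a large job, it can be useful to compare its expected cost with the remaining quota. The sketch below does not call the cost command or read its output; the quota and usage figures are entered by hand. It assumes that a job is charged the number of processors times the wall-clock hours, which may differ from the accounting rules on a given facility. All function names are hypothetical.

```python
def job_cpu_hours(processors: int, wall_hours: float) -> float:
    """Estimated charge for one job, assuming processors x wall-clock hours."""
    if processors < 1 or wall_hours < 0:
        raise ValueError("processors must be >= 1 and wall_hours >= 0")
    return processors * wall_hours


def remaining_quota(quota: float, used: float) -> float:
    """CPU-hours left in the current allocation period (never negative)."""
    return max(quota - used, 0.0)


def job_fits(quota: float, used: float, processors: int, wall_hours: float) -> bool:
    """True if the estimated charge does not exceed the remaining quota."""
    return job_cpu_hours(processors, wall_hours) <= remaining_quota(quota, used)
```

For example, under this assumption a job on 64 processors that runs for 12 hours is charged 768 CPU-hours.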

4. Access: user account, passwords, secure shell, data transfer

Once a CPU-hour project has been established, users who will make use of the granted resources can be added to the project. New users are added by sending an application for a user account to UNINETT Sigma.

Users must inform UNINETT Sigma in case the information that was provided in the application for a user account has changed.

Once a user has obtained a user account on any of the Notur facilities, the user will be given a user name and password for this account. User names are strictly personal. Never distribute your password in any way. It is not permitted to share user name and password with someone else. It is the user's responsibility to secure the confidentiality of user name and password.

The user is encouraged to

  • change the password immediately the first time the user logs on to the resource
  • change the password regularly

Passwords are enforced to be non-trivial.

The system administration is allowed to close a user account in case there are clear indications that the account is used by several people.

A user may have access to more than one Notur facility. In such case, the user will have the same user name on each of these facilities. It is not required to have the same password on the facilities.

A Secure Shell client (SSH) is the required tool to connect to the Notur facilities. SSH is a network protocol that allows data to be exchanged over a secure channel between two computers. Encryption provides confidentiality and integrity of data. SSH uses public-key cryptography to authenticate the remote computer and allow the remote computer to authenticate the user, if necessary. SSH is typically used to log into a remote machine and execute commands, but it also supports tunneling, forwarding arbitrary TCP ports and X11 connections. See further information on how to log in with SSH.

The facilities are stand-alone systems and do not mount remote file systems. Files can be transferred securely to and from the facilities with the SCP or SFTP utilities. In a Linux environment, the on-line manual pages provide more information (type 'man scp' or 'man sftp').
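
When transfers are scripted, the scp command line can be assembled in a small helper. The sketch below only builds the argument list in the standard form scp source target, where a remote file is written as user@host:path. The user name, host name, and file names are placeholders, not real Notur addresses. It assumes that an scp client is installed on the local machine.

```python
import shlex
from typing import List


def scp_upload_command(local_path: str, user: str, host: str, remote_path: str) -> List[str]:
    """Build the argument list for copying a local file to a remote host."""
    return ["scp", local_path, f"{user}@{host}:{remote_path}"]


def scp_download_command(user: str, host: str, remote_path: str, local_path: str) -> List[str]:
    """Build the argument list for copying a remote file to the local machine."""
    return ["scp", f"{user}@{host}:{remote_path}", local_path]


if __name__ == "__main__":
    # Hypothetical user, host and paths; replace them with your own.
    cmd = scp_upload_command("input.dat", "myuser", "facility.example.org", "input.dat")
    print(shlex.join(cmd))
    # To run the transfer: subprocess.run(cmd, check=True)
```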

5. Storage

Directories. Each user has one private storage area (home directory) on each of the compute facilities where the user has an account. In addition, on most of the facilities, the user also has access to one or more shared storage areas (work directories) that can be used by several users simultaneously.

  • The user's home directory (/home) is intended for permanent data only. This includes source code, binary code, scripts, fixed input data and final computed results. There may be a default size limit (quota) on the home directories which may differ between facilities. Large-volume data shall not be stored for a long time on the home directory, but must be moved to storage at the user's local machine, or to a Notur archival service (tape storage) if available. Data in the home directory that is not accessed on a regular basis must be compressed. The home directory will often be on a slow filesystem (or may be NFS-mounted). Due to such performance limitations, the home directory must never be used for demanding I/O or large temporary storage during computation.
  • Work directories (/work or /scratch) are intended for intensive I/O and temporary storage during computation. Work directories typically reside on faster disks than home directories, and are considerably larger than a single user's home directory. Work directories are shared by several users. To prevent work directories from filling up, users are required to remove their data from these directories once the data is no longer needed. Work directories are purged at regular intervals, see Purging policies.

Disk quota is not regulated uniformly across the facilities. In practice, you will have a quota on your home directory (some Gigabytes), but no limit on the work directories. In case you need more space in your home directory than the current limit allows, contact the local support staff with a request to increase your quota.

Backup policies. Data in user home directories is backed up. Earlier versions of files are kept at least 90 days and deleted files at least half a year (182 days). The backup policy may differ per site and the user is encouraged to contact the local support staff for more details. Data in work directories is not backed up.

Purging policies. In case one of the work file systems is getting full, the system administrator may remove files without prior notice. To make sure there is always sufficient temporary storage available for running (on-going) applications, there may exist special routines for automatic cleanup of work directories. Automatic cleanup normally removes the older files first. In situations with high demand for temporary storage, files may be deleted after just a few days.

It is important not to keep important data in work directories for an extended period of time.
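
To find out which of your files in a work directory have been left untouched the longest, the files can be listed by age. The sketch below is only an aid for the user's own cleanup; it is not the cleanup routine used on the facilities. It assumes that the modification time is the relevant age (a facility may use the access time instead). The function name is hypothetical.

```python
import time
from pathlib import Path
from typing import List, Optional


def files_older_than(directory: str, days: float, now: Optional[float] = None) -> List[Path]:
    """Return regular files under `directory` not modified for `days` days, oldest first."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    old = []
    for path in Path(directory).rglob("*"):
        # Skip directories and symbolic links; only real files take up space.
        if path.is_file() and not path.is_symlink():
            mtime = path.stat().st_mtime
            if mtime < cutoff:
                old.append((mtime, path))
    old.sort(key=lambda item: item[0])
    return [path for _, path in old]
```

The function only reports files. Moving them to permanent storage or deleting them is left to the user.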

Users with special needs for temporary storage on a facility should contact the system administration of the facility before starting the application(s).

Attempts by the user to circumvent the purpose of the work directories and the cleanup routines by using creative techniques will be recorded and the user in question may be denied further access to the resource.

Compression of data. A user is strongly encouraged to compress data as much as possible. The following commands can be used to compress data without loss of information:

  • gzip file creates the compressed file file.gz
  • gunzip file.gz recreates the original uncompressed file
  • In Linux, type 'man gzip' for on-line information
  • bzip2 file creates the compressed file file.bz2
  • bunzip2 file.bz2 recreates the original uncompressed file
  • In Linux, type 'man bzip2' for on-line information

As a rule, large files in a user's home directory should always be compressed. Large uncompressed files can be kept in the work directories.
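
The same lossless compression is available from within a program. The sketch below uses the gzip module of the Python standard library; the two function names are made up for this illustration. Unlike the command-line tools, which normally replace the input file, these functions leave the input file in place.

```python
import gzip
import shutil


def gzip_file(path: str) -> str:
    """Write a gzip-compressed copy of `path` to `path + '.gz'` and return its name."""
    compressed = path + ".gz"
    with open(path, "rb") as src, gzip.open(compressed, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return compressed


def gunzip_file(path: str) -> str:
    """Write the uncompressed contents of `path` (ending in .gz) and return its name."""
    if not path.endswith(".gz"):
        raise ValueError("expected a file name ending in .gz")
    restored = path[:-3]
    with gzip.open(path, "rb") as src, open(restored, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return restored
```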

Long-term storage. Projects that have a need for long-term storage of large data sets that must survive shifts in hardware and software technologies or data sets that must be shared by several groups or communities, should consider applying for allocations from the NorStore facilities.

6. Execution of applications

The operating systems on the Notur facilities are variants of Unix and Linux.

Batch usage. Each Notur compute facility uses a batch system. A batch system is software that performs job scheduling. Its primary task is to allocate computational tasks, i.e., (batch) jobs, among the available resources on the facility. A user must submit all his/her jobs to the batch system. The batch system uses a scheduler that starts jobs on the facility based on available resources and the job specifications that are supplied by the users. A job specification is a file (script) given to the batch system that contains user-supplied parameters like job priority, maximum allowed run-time, number of processors and memory requested, as well as the locations of the application binaries, input data and output data. Submission procedures and choice of parameters vary between the facilities and the user must acquaint himself/herself with the local set up before submitting jobs.

In case there are insufficient resources available to execute a job (e.g., due to usage of the system by other jobs), the execution of the job is postponed and the batch system places the job in a queue. The job will be queued until the resources become available that were requested in the job's batch script. The batch system uses sophisticated algorithms to optimize the use of the resources and to ensure a fair sharing of the overall resource between all users.

Interactive usage. Interactive execution of applications (i.e., execution of applications directly from the command-line) circumvents the batch system. Interactive usage is permitted for administrative tasks, like text editing, data handling, compilation and short test runs for program development.

Interactive usage of the facilities with resource-demanding applications is in principle not allowed. In particular, applications that occupy processors for a longer period must not be used interactively. Some of the facilities have a small part that is reserved for interactive purposes, but may still impose certain restrictions on the jobs that can be run there. Limitations may apply to run-time, number of processors, etc.

The user must know the local policies for interactive usage before starting resource-demanding interactive applications. In case the user violates these policies, the corresponding jobs may be terminated without prior notice.

In case you are not sure whether your applications can be executed on the facilities in batch mode or (the allowed) interactive modes, please contact the local support staff and UNINETT Sigma.

Software and job requirements. Users are expected to know their software applications as well as the requirements for each job that they submit on the facilities. Important properties of application software are scalability and run-time efficiency. Important properties of jobs are expected run-time, memory demands and (temporary) storage demands. The Notur compute facilities are parallel systems, capable of running applications that use many processors simultaneously. However, application software may scale poorly, in which case using more processors does not lead to faster run-times. The number of processors that is requested for each job must therefore be judged with care.

It is good practice that users verify that submitted applications behave as expected during run-time, especially applications that consume large resources (run-time, etc.). It is not the responsibility of the system support staff to detect errors made by users. However, the support staff may terminate jobs that make wrong or bad use of resources and that interfere with other jobs that are running on the facilities.

Certain applications require large amounts (Gigabytes) of input data or produce large amounts of intermediate data and/or output data. Large amounts of data that need to be read from or written to disk may increase the overall run-time of an application considerably. Always use the work directories of the facilities when large amounts of data are involved.

Software licensing. The user is not allowed to install or use software on any of the Notur installations if that will violate the licensing conditions that are attached to this software.

Job priorities. All users and jobs are normally treated equally on each of the facilities. In case you believe that (some of) your jobs should be executed with higher priority, you must request this from the support staff of the facility in question. Once a job is submitted, changing its priority may no longer be possible. Make sure that you make such a request as soon as possible. In case the request concerns many jobs or jobs that will occupy a significant fraction of the resources, you must supply sufficient detail and justification.

7. Programming

For researchers who are not familiar with multi-processor (parallel) computers, it is important to learn how parallel computers can be used most efficiently. A sequential application (using one processor) will not necessarily execute much faster on a parallel computer than on a local desktop. It is crucial that a software application can make use of several processors simultaneously such that the required arithmetic calculations are distributed equally over the available processors, without changing the result of the overall computation. Ideally, a computation using N processors simultaneously is carried out approximately N times faster than the same computation using only a single processor; in practice the gain is often smaller.
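
How close an application comes to running N times faster can be checked from two measured run-times. The sketch below uses the usual definitions: the speedup is S = T(1) / T(N) and the parallel efficiency is E = S / N, where T(1) is the run-time on one processor and T(N) the run-time on N processors. The function names are chosen for this illustration, and the run-times have to be measured by the user.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T(1) / T(N)."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run-times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, processors: int) -> float:
    """Parallel efficiency E = S / N; 1.0 means ideal scaling."""
    if processors < 1:
        raise ValueError("processors must be >= 1")
    return speedup(t_serial, t_parallel) / processors
```

An efficiency well below 1 indicates that adding processors gives little benefit for that application and problem size.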

Programming a parallel computer is in many ways similar to programming your local desktop in a Linux environment. You need software that is written in a standard programming language (e.g., C/C++, Fortran) that can be compiled using one of the compilers that are available on the system. Once you have created a binary that can be executed on the computer and the necessary input data, you can submit a job to the batch system.

This section does not attempt to provide full information on parallelization techniques of computer programs.

Links to more detailed educational material will be included shortly.

8. Scientific publications and acknowledgments

Users are strongly encouraged to acknowledge the use of the Notur facilities in all their publications (journals, magazines, newspapers, etc.). It is also highly appreciated if you can send us copies of publications that acknowledge the Notur project, or send us pointers to any sources in the media where references to the Notur project can be found.

9. User support

Contact the support staff in case you are in doubt about any of the rules and regulations for a specific facility or to get assistance in resolving problems or executing your applications.

Information concerning the help-desk and advanced user support services provided by the Notur project can be found here.

In case you believe that you have not been given professional or adequate assistance, contact UNINETT Sigma.

info@notur.no