Table of Contents

High Performance Computing and Best Practices


HPC vs. Single Processors

Serial Computing - Instructions run one at a time, one after another, on a single processor

Parallel Computing - A problem is split into multiple smaller problems, each of which can be run concurrently (at the same time) on a separate processor
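The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration, not a performance recipe: the helper names `sum_of_squares` and `split` are hypothetical, and a thread pool is used only because it is simple to run anywhere. In standard CPython, threads do not speed up CPU-bound work like this; on a cluster the chunks would typically go to separate processes or nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """Solve one sub-problem: sum the squares of the numbers in a chunk."""
    return sum(x * x for x in chunk)


def split(data, parts):
    """Split data into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


data = list(range(1_000))

# Serial: a single worker processes every element in order.
serial_total = sum_of_squares(data)

# Parallel: four workers each handle one chunk, then the partial
# results are combined into the final answer.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(sum_of_squares, split(data, 4)))
parallel_total = sum(partial_sums)

print(serial_total == parallel_total)  # both approaches give the same answer
```

The combining step at the end (`sum(partial_sums)`) is work the serial version does not need, which is a small example of the coordination cost that parallel programs carry.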

Need for HPC

Time - Parallel computing can solve problems faster

Cost - Parallel computing can be done on inexpensive commodity hardware, and the time saved can translate into cost savings

Larger Problems - Complex problems such as climate modeling, traffic simulation, high-volume website transactions, and plasma physics can be infeasible to solve serially and are better suited to parallel computing

Concepts and Terminology

HPC - High Performance Computing, solving big problems with big computing power

Cluster - A group of computers working together

Node - A single computer; many nodes joined together form a cluster

Core/CPU - A processing unit

Job - A problem or program to be run

Parallel Overhead - Extra time needed to set up and coordinate a parallel job: synchronization, data exchange, start-up/termination, etc.

Scalability - The ability of a system to handle more work, or to be enlarged to accommodate more work

Limits and Cost

$Speedup = \frac{1}{(P/N)+S}$

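P = parallel fraction of the program, N = number of processors, S = serial fraction (S = 1 - P). This is Amdahl's law: as N grows, the speedup approaches 1/S, so the serial fraction sets the upper limit.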
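To get a feel for the formula above, it can be evaluated for a few processor counts. The sketch below assumes a program that is 95% parallel (an illustrative value, not a measurement); the function name `amdahl_speedup` is hypothetical.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup = 1 / (P/N + S), with S = 1 - P."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (parallel_fraction / processors + serial_fraction)


# Assumed example: 95% of the runtime can be parallelized (P = 0.95).
for n in (1, 2, 8, 64, 1024):
    print(f"N = {n:5d}  speedup = {amdahl_speedup(0.95, n):.2f}")
```

With P = 0.95 the serial fraction is 0.05, so the speedup can never exceed 1/0.05 = 20, no matter how many processors are added.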

Development cost is higher for parallel programs than for serial ones.

Memory Architectures

Batch Processing

Batch Processing and Shared Resources

The need for shared resources

Our Cluster


Sunway TaihuLight

Check the TOP500 list for an up-to-date ranking of supercomputers

Batch Processing Management

With multiple users wishing to access computing resources, we need a way to manage who gets to use what resources and when they can do so.
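As a rough illustration of what such management involves, here is a toy first-come, first-served scheduler. It is only a sketch under simplifying assumptions: the function `schedule_fifo`, the user names, and the workload are all hypothetical, and real resource managers also handle priorities, time limits, backfilling, and fair sharing between users.

```python
from collections import deque


def schedule_fifo(jobs, total_nodes):
    """Start jobs strictly in submission order as nodes become free.

    jobs: list of (user, nodes_needed, runtime) tuples in submission order.
    Returns a list of (user, start_time, end_time) tuples.
    """
    queue = deque(jobs)
    running = []  # (end_time, nodes_held) for jobs currently running
    free_nodes = total_nodes
    clock = 0
    timeline = []

    while queue:
        user, nodes_needed, runtime = queue[0]
        if nodes_needed > total_nodes:
            raise ValueError(f"{user}'s job can never fit on this cluster")
        if nodes_needed <= free_nodes:
            # Enough nodes are free: start the job at the head of the queue.
            queue.popleft()
            free_nodes -= nodes_needed
            running.append((clock + runtime, nodes_needed))
            timeline.append((user, clock, clock + runtime))
        else:
            # Not enough nodes: jump to the next job completion and
            # reclaim the nodes that job was holding.
            running.sort()
            end_time, nodes_held = running.pop(0)
            clock = end_time
            free_nodes += nodes_held
    return timeline


# Hypothetical workload on a 4-node cluster: (user, nodes, runtime in hours).
jobs = [("alice", 2, 3), ("bob", 2, 1), ("carol", 4, 2), ("dave", 1, 1)]
for user, start, end in schedule_fifo(jobs, total_nodes=4):
    print(f"{user}: starts at t={start}, ends at t={end}")
```

In this run dave's one-node job waits behind carol's four-node job even though nodes sit idle in the meantime, which is the kind of inefficiency that production schedulers try to reduce.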

Some Resource Managers:

Using HPC Resources at WSI

Accessing Resources via SSH

In the office:

$ ssh user@control

Launching and Managing Your Jobs

Using Torque (NERSC)

Using Slurm

Version Control

Often multiple programmers work on a codebase simultaneously. Alice may want to work on the code, but so does Bob. If they both change the same file without telling each other, changes might get lost. A version control system lets Alice and Bob work on the code together without undoing each other's work or breaking the code.

Commonly used Version Control Systems

GitHub, SVN

HPC Best Practices