An introduction to High Performance Computing

Course/Event Essentials

Event/Course Start
Event/Course End
Event/Course Format
Online
Live (synchronous)

Venue Information

Country: Spain

Training Content and Scope

Scientific Domain
Level of Instruction
Beginner
Sector of the Target Audience
Research and Academia
Industry
HPC Profile of Target Audience
Application Users
Data Scientists
Language of Instruction

Other Information

Supporting Project(s)
MultiXscale
Event/Course Description

Outline

This practical introduction to high performance computing (HPC) will cover all basic concepts needed to access an HPC cluster and to run applications and workflows there. 

The course is built upon the Carpentries and HPC Carpentry online learning materials, with HPC Carpentry being a lesson program in incubation with the Carpentries. The Carpentries is a nonprofit organisation that teaches software engineering and data science skills to researchers, enabling them to conduct efficient, open, and reproducible research. Its 4,287 volunteer instructors have run more than 4,000 workshops in 65 countries since 2012, over 450 of them in 2022 alone [3]. All lesson materials are freely reusable under the Creative Commons Attribution licence [4].

Here, we are offering a three-day workshop, composed of one half-day and two full-day sessions, with the primary goal of introducing High-Performance Computing (HPC) to individuals who have limited or no prior experience with such computing resources. The first day is dedicated to guiding participants through the basics of file systems and the Unix shell. This foundation is a prerequisite for the following days, which introduce HPC resources, the Slurm job scheduler, and how to run applications and workflows on such resources.


Join this workshop if you are:

  • Interested in learning what HPC is
  • Interested in learning how to access and use HPC resources
  • Interested in running applications and workflows on HPC resources 

Learning Outcomes:

By the end of this workshop, you will be able to:

  • Use the UNIX shell (also known as terminal or command line) to connect to a cluster (see the command sketch after this list)
  • Identify problems a cluster can help solve
  • Transfer files onto a cluster
  • Comfortably manipulate files and directories, locally and on a remote resource
  • Submit and manage jobs on a cluster using a scheduler
  • Observe the benefits and limitations of parallel execution
  • Understand how to construct a reliable, scalable and reproducible scientific workflow as a series of chained rules
  • Use a subset of features of the Snakemake workflow tool
  • Run a Snakemake workflow on an HPC resource
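
To give a flavour of these outcomes in practice, the sketch below strings together the kind of commands the workshop covers. The host, user, and file names are hypothetical placeholders, not the actual workshop systems:

    ssh yourUsername@cluster.example.org                  # connect to the cluster over SSH
    scp results.dat yourUsername@cluster.example.org:~/   # transfer a file onto the cluster
    sbatch job.sh                                         # submit a batch job to the scheduler
    squeue -u $USER                                       # check the state of your jobs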


Requirements:

  • No programming or informatics skills are needed, but a basic understanding of what files and directories are is required
  • PC/Laptop with an up-to-date browser. Chrome, Safari and Firefox browsers are all supported (some older browsers, including Internet Explorer version 9 and below, may not be)
  • The lessons require you to have access to a terminal application with ssh capabilities. If it is unclear what this requirement means, please click here for guidance on how to make this available for your operating system

Recommendations for your setup and interacting during the workshop:

  • To follow the workshop more efficiently, we recommend a two-screen setup: for example, one screen to display the instructor’s shared screen and the collaborative pad, and another for your own coding.
  • To actively communicate during the workshop, please familiarise yourself with Markdown formatting by reviewing the HedgeDoc features document.



Interaction between participants, trainers and helpers

The workshop will be delivered in a Zoom webinar format, with participants’ visibility disabled to preserve their privacy. As a participant, you will be able to see and learn from the trainers, but direct interaction (e.g. chat or voice) will not be possible during the sessions. Instead, a collaborative document, set up in advance by the trainers, will be shared with you before the session. You will be expected to engage and interact anonymously with other participants, as well as with the workshop helpers and trainers, directly in this document.

Training Hubs

All BioNT workshops are offered at no cost, but the number of seats is limited. To make workshops more accessible to members of the same company, we highly recommend organising what we refer to as "Training Hubs". In this arrangement, one person is formally registered for the workshop, but the knowledge can be shared with numerous colleagues within their company or SME by live-streaming the session.


Topics

Day 1

The Unix Shell

1. Introducing the Shell

2. Navigating Files and Directories

3. Working With Files and Directories

4. Pipes and Filters

5. Shell Scripts
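
As a taste of topics 4 and 5, the lines below show the pipes-and-filters idea in action; the file names are hypothetical:

    ls *.txt | wc -l                  # count the .txt files in the current directory
    sort -n lengths.txt | head -n 1   # chain two filters: show the smallest value in lengths.txt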

Day 2

Introduction to High-Performance Computing

1. Why use a Cluster?

2. Connecting to a remote HPC system

3. Exploring Remote Resources

4. Scheduler Fundamentals

5. Accessing software via Modules

6. Transferring files with remote computers

7. Running a parallel job

Day 3

8. Using resources effectively

9. Using shared resources responsibly
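
To give a flavour of what topics 4 to 9 build towards, here is a minimal sketch of a Slurm batch script; the job parameters and module name are hypothetical placeholders that differ between clusters:

    #!/bin/bash
    #SBATCH --job-name=hello-hpc   # name shown in the job queue
    #SBATCH --ntasks=4             # request four parallel tasks
    #SBATCH --time=00:05:00        # wall-clock limit of five minutes

    module load python             # load software via the module system (names vary per site)
    srun hostname                  # srun launches one copy of the command per task

Such a script is submitted with sbatch and monitored with squeue; requesting only the resources a job actually needs is part of using shared systems responsibly.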

HPC Workflow Management with Snakemake

1. Running commands with Snakemake

2. Placeholders and wildcards

3. Chaining rules

4. How Snakemake plans what jobs to run

5. Snakemake and the Cluster

6. Snakemake Profiles

7. Processing lists of inputs

8. Finishing the basic workflow
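
As a preview of what these topics combine into, below is a minimal two-rule Snakefile sketch; the rule, wildcard, and file names are hypothetical and far simpler than the workshop's worked example:

    # Default target: the file the workflow should ultimately produce
    rule all:
        input: "counts/dracula.dat"

    # Count the words in any book; {book} is a wildcard, {input}/{output} are placeholders
    rule count_words:
        input: "books/{book}.txt"
        output: "counts/{book}.dat"
        shell: "wc -w {input} > {output}"

Running snakemake --cores 1 asks Snakemake to work out which rules must run, and in what order, to produce the target.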


About BioNT

BioNT - BIO Network for Training - is an international consortium of academic entities and small and medium-sized enterprises (SMEs). BioNT is dedicated to providing a comprehensive training programme and fostering a community for digital skills relevant to the biotechnology industry and the biomedical sector. With a curriculum tailored to both beginners and advanced professionals, BioNT aims to equip individuals with the necessary expertise in handling, processing, and visualising biological data, as well as in using computational biology tools. Leveraging the consortium's strong background in digital literacy training and its extensive network of collaborations, BioNT is poised to professionalise data management, processing, and analysis skills in the life sciences.