MareNostrum5

Connecting to MareNostrum5

I have defined the following aliases in my ~/.bashrc file and then sourced it with source ~/.bashrc (restarting the terminal also works).

alias mn5l1='ssh MyUserName@glogin1.bsc.es'
alias mn5l2='ssh MyUserName@glogin2.bsc.es'

Therefore, to connect to MareNostrum5, I can simply run mn5l1 or mn5l2. This logs me in to the General Purpose Partition (GPP), which is CPU-based. We don't need the other partition, the Accelerated Partition (ACC). The default loaded modules are:

[MyUserName@glogin1 ~]$ module list

Currently Loaded Modules:
  1) intel/2023.2.0   3) mkl/2023.2.0   5) oneapi/2023.2.0
  2) impi/2021.10.0   4) ucx/1.15.0     6) bsc/1.0
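As a possible variation on the two aliases above, the login step could be folded into one shell function that takes the login node number as an argument. This is only a sketch: the names mn5_host and mn5 are hypothetical and are not in my current ~/.bashrc.

```shell
# Hypothetical helpers (not part of my current ~/.bashrc).
# mn5_host builds the login address; the node number defaults to 1.
mn5_host() {
    printf 'MyUserName@glogin%s.bsc.es\n' "${1:-1}"
}

# mn5 opens the SSH session: "mn5" for glogin1, "mn5 2" for glogin2.
mn5() {
    ssh "$(mn5_host "$1")"
}
```

With these in ~/.bashrc, mn5 would behave like mn5l1 and mn5 2 like mn5l2.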

File systems

First, we need to understand the file systems. After logging in, our current working directory is /home/cbil/MyUserName/, but this seems to be just a link to /gpfs/home/cbil/MyUserName/. In /gpfs/home/cbil/, there are home directories for 4 different users, and they are not shared between users. So, one cannot access shared files there. Each user seems to have only 80 GB in this home directory.
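One way to check where a directory really lives is to resolve its symbolic links. The sketch below uses a hypothetical helper name, physical_path; it relies only on cd and pwd -P, which are standard shell built-ins.

```shell
# Hypothetical helper: print the physical path of a directory,
# with any symbolic links resolved.
physical_path() {
    ( cd "$1" && pwd -P )
}
```

Running physical_path ~ on a login node should show whether the home directory is indeed a link into /gpfs.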

Warning

Running jobs directly from this /home/cbil/MyUserName/ filesystem is strongly discouraged.

We mistakenly ran a job from this filesystem on the first day. Jobs should be executed from either /gpfs/projects or /gpfs/scratch.

Our project has shared storage, located at /gpfs/projects/OurProjectNumber, which can be accessed by any project member. We have a 1 TB storage limit there. The quota is group-based, unlike the quota of the home filesystem, which is individual-based. We can run jobs here, but it's better to run them in /gpfs/scratch.

/gpfs/scratch/OurProjectNumber/ is designed to store temporary files during the execution of a job. We can run our job here and, after it finishes, move the necessary output files to /gpfs/projects/OurProjectNumber/. Both the projects and scratch directories are shared among group members.
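The run-in-scratch, keep-in-projects workflow could be sketched roughly as below. The function name save_outputs, the example paths in the comments, and the "*.out" pattern are all assumptions for illustration; adjust the pattern to whatever outputs are actually worth keeping.

```shell
# Hypothetical helper: copy the outputs worth keeping from a scratch run
# directory to the project storage. Paths and the "*.out" pattern are
# placeholders, not anything prescribed by BSC.
save_outputs() {
    run_dir=$1    # e.g. /gpfs/scratch/OurProjectNumber/benchmark/run1
    keep_dir=$2   # e.g. /gpfs/projects/OurProjectNumber/benchmark/run1
    mkdir -p "$keep_dir" || return 1
    # Copy only the small text outputs; large temporary files stay in scratch.
    for f in "$run_dir"/*.out; do
        [ -e "$f" ] || continue   # skip if the pattern matched nothing
        cp -p "$f" "$keep_dir"/ || return 1
    done
}
```

This copies rather than moves, so the scratch copy can be deleted by hand once the files in projects have been checked.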

Transferring files

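Since we don't have an internet connection on MareNostrum5, we cannot use git to synchronize our files. That's why I prefer rsync here; scp can also be used. The first command below downloads files from MareNostrum5 to my local machine, and the second uploads them from my local machine to MareNostrum5.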

rsync -avz --progress --exclude-from="rsync-exclude.txt" MyUserName@transfer1.bsc.es:/gpfs/scratch/OurProjectNumber/benchmark/* /home/abdul/mn5/benchmark/

rsync -avzP /home/abdul/mn5/benchmark/* MyUserName@transfer1.bsc.es:/gpfs/scratch/OurProjectNumber/benchmark/

Example of my rsync-exclude.txt file (it contains exclude patterns for Quantum ESPRESSO wavefunctions, GWW temporary files, WEST temporary files, etc.):

*quasi*
*storage*
*expansion*
wfcup*.hdf5
wfcdw*.hdf5
charge*den*5
wfc*.hdf5
*.wfc*
wfc*.dat
*body*.dat
*dmat*dat
*zmat*dat
*over_g*
Q0*E0*dat
hf.dat
*diago*.dat
*.pp
atomic_proj.xml
*.mix*
*.bfgs
*.update
*restart*
*WFSX*
*.fcw*
*.fmat*
*.fsr*
*.oap*
*.real_whole*
*.vw_lanczos*
*.wiwjwfc_red*
*.phsave*
*_ph0*
*.basis2simple*
*.dft_xc*
*.exchange*
*.nfcws*
*green*
*.p_eig_lan*
*.s_eig_lan*
*.p_mat_lan*
*.a_mat_lan*
*.s_mat_lan*
*p_iter_lanczos*
*s_iter_lanczos*
*-polaw*
*.pt_mat_lan*
*.st_mat_lan*
*-q_lanczos*
*.vgq*
*.v_mat*
*.vpot*
*.s_vector*
*.uterms*
*.wannier*
*.wing*
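Before a large transfer, it can help to check that the exclude list does what is intended. rsync's --dry-run option (short form -n) lists what would be transferred without copying anything. The wrapper below is a sketch only: the name mn5_pull and the MN5_EXCLUDE variable are hypothetical, while the rsync options are the ones used in the download command above plus --dry-run.

```shell
# Hypothetical wrapper around the download command shown earlier.
# Pass -n as the first argument for a dry run: rsync then only lists
# what it would transfer.
mn5_pull() {
    dry=""
    if [ "$1" = "-n" ]; then
        dry="--dry-run"
        shift
    fi
    src=$1   # e.g. MyUserName@transfer1.bsc.es:/gpfs/scratch/OurProjectNumber/benchmark/
    dst=$2   # e.g. /home/abdul/mn5/benchmark/
    # MN5_EXCLUDE is an assumed override; the default is the file used above.
    exclude=${MN5_EXCLUDE:-rsync-exclude.txt}
    # $dry is left unquoted on purpose so that it vanishes when empty.
    rsync -avz --progress $dry --exclude-from="$exclude" "$src" "$dst"
}
```

A typical use would be to run mn5_pull -n with the source and destination first, inspect the file list, and then repeat the call without -n.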