Linux: Getting Started
I will try to give an introduction to essential Linux commands, HPC environment tips, and job submission basics. The details are below, but here is a list of the commands that I use daily: pwd, cd, ls, tree, ll -tr, ll -a, mkdir, rmdir, rm, rm -rf, cp, mv, cat, head, tail, less, more, nano or vi or vim, grep, ssh, rsync, etc., plus some SLURM-specific commands. Learning how to use special characters such as the pipe |, asterisk *, tilde ~, and dot . is also helpful. Also, if you are ever stuck with a command, try man command. Most commands have a manual page, and it is super helpful.
Note
I started writing this hoping to cover everything, but it's getting long, so I plan to write about HPC and SLURM in a different note. This note contains only the basic Linux commands and file management. Check the next note for HPC and SLURM.
Basic Linux Commands
Here, anything inside square brackets [ ] is optional. The asterisk * is a wildcard that matches everything. These are metacharacters, and more details can be found in the Wikipedia articles on regular expressions and glob patterns.
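To make the special characters concrete, here is a minimal sketch of how the shell expands *, ~ and . and how a pipe | connects two commands. The file names (scf.in, nscf.in, scf.out, notes.txt) are made up for illustration, and everything happens inside a temporary directory.

```shell
# Work in a throwaway directory so no real files are involved.
work=$(mktemp -d)
cd "$work"
touch scf.in nscf.in scf.out notes.txt

echo *.in          # * matches any run of characters: nscf.in scf.in
echo scf.*         # scf.in scf.out
echo ~             # ~ expands to your home directory
ls .               # . is the current working directory

# A pipe sends the output of one command into the next one.
ls | grep out      # only the names containing "out"

matches=$(echo *.in)
```

The shell performs the expansion before the command runs, so echo is a cheap way to preview what a wildcard will match.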
pwd
- Description: Displays the current working directory.
- Usage:
pwd
- Note: On a personal machine, right after you open a terminal, you may see a directory like /home/username/ (often abbreviated with a tilde ~). On HPC systems, the home directory path might differ (e.g., /arf/home/username/ on ARF, /truba/home/username on TRUBA, or /home/groupname/username on MareNostrum5).
cd
- Description: Changes the current directory to the given directory. If no path is given, it changes the current working directory to /home/username, i.e. it is equivalent to cd ~
- Usage:
cd path
- Example: cd ~/directory_name moves you to the "directory_name" directory within your home folder. The path can be either an absolute path such as /home/username/project/qe/GaAs/ or a relative path such as qe/GaAs/ when you are already inside /home/username/project/.
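The difference between absolute and relative paths is easier to see by trying it. Below is a small sketch that recreates a project/qe/GaAs layout inside a temporary directory (an assumption, so that it runs anywhere) and moves around in it with cd and pwd.

```shell
# Build a small directory layout in a temporary location.
base=$(mktemp -d)
mkdir -p "$base/project/qe/GaAs"

cd "$base/project"     # absolute path: starts from the root /
cd qe/GaAs             # relative path: resolved from the current directory
pwd                    # ends with /project/qe/GaAs
here=$(pwd)

cd ..                  # .. is the parent directory
pwd                    # ends with /project/qe

cd ~ && pwd            # ~ is the home directory
```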
ls
- Description: Lists directory contents (files and folders). If no path is given, the current working directory is assumed. Multiple options can be combined: for example, ll -tr lists the files sorted by modification time with the most recently updated files at the bottom, which helps to see which files were updated last while a program is running.
- Usage:
ls [options] [path]
- Common Options:
-l: Long listing format (ll is commonly an alias of ls -l).
-a: Includes hidden files.
-h: Shows the file sizes in a human-readable format.
-t: Sorts by modification time.
-r: Reverses the order.
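As a quick illustration of combining these options, the sketch below creates a few files with fixed timestamps and lists them in different ways. The file names are hypothetical, and ls -ltr is written out in full because the ll alias is not defined on every system.

```shell
demo=$(mktemp -d)
cd "$demo"

# touch -t sets an explicit timestamp (format: YYYYMMDDhhmm).
touch -t 202401010900 input.in      # oldest
touch -t 202401011000 run.log
touch -t 202401011100 output.out    # newest
touch .hidden_config                # hidden file (name starts with a dot)

ls              # plain listing, hidden files are omitted
ls -a           # also shows .hidden_config, . and ..
ls -lh          # long format with human-readable sizes
ls -ltr         # long format, oldest first, newest at the bottom

# The last line of "ls -tr" is the most recently modified visible file.
newest=$(ls -tr | tail -n 1)
echo "Most recently modified: $newest"
```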
tree
- Description: It's like the ls command, but it shows the file and directory structure in a tree-like format. Not every Linux system has it installed, though.
- Usage:
tree
- Example:
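Since the output depends on what is on disk, this sketch first builds a small, made-up project layout in a temporary directory and then prints it. Because tree may not be installed, it falls back to find, which prints one path per line instead of a drawing.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/project/qe/GaAs" "$demo/project/qe/Si"
touch "$demo/project/qe/GaAs/scf.in" "$demo/project/README"

if command -v tree >/dev/null 2>&1; then
    tree "$demo/project"
else
    # Fallback when tree is not available on this machine.
    find "$demo/project" | sort
fi
```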
File Management
mkdir
- Description: Creates a new directory.
- Usage:
mkdir new_directory
- Note: It can create multiple directories at the same time, such as mkdir my_folder1 my_folder2 another_folder, but it will throw an error if a directory with the same name already exists.
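Here is a short sketch of that behavior, run inside a temporary directory. It also uses the -p option, which is not covered above: it suppresses the "already exists" error and creates missing parent directories. The calc/GaAs/bands path is only an illustration.

```shell
cd "$(mktemp -d)"

mkdir my_folder1 my_folder2 another_folder     # several directories at once

# A second mkdir on the same name fails, so report it instead of stopping.
mkdir my_folder1 2>/dev/null || echo "my_folder1 already exists"

mkdir -p my_folder1            # -p: no error if the directory exists
mkdir -p calc/GaAs/bands       # -p also creates the missing parents

rmdir another_folder           # rmdir removes a directory only if it is empty
ls
```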
rm
- Description: Removes files or directories.
- Usage:
rm filename
For directories, use: rm -r directory_name (here the -r option stands for "recursive").
- Caution: Use with care to avoid accidental data loss, because there is no way to recover data deleted this way. Deleted files do not go to the Trash folder, so they cannot be restored. If you want to empty a directory along with everything inside it, try rm -rf * (very dangerous, but often helpful for managing storage; note that * does not match hidden files). Consider having some kind of backup before doing this unless you are really sure what you are doing.
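Because rm is unforgiving, I find it worth previewing what a wildcard will match before deleting. The following is a cautious sketch, not the only way to do it: it works in a temporary directory with made-up file names, and it uses the ${scratch:?} form so the command aborts if the variable is ever empty.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/old_run/tmp"
touch "$scratch/old_run/tmp/a.dat" "$scratch/keep.txt" \
      "$scratch/notes.txt" "$scratch/.hidden"

rm "$scratch/notes.txt"         # remove a single file
rm -r "$scratch/old_run"        # remove a directory and everything in it

# Preview: echo shows exactly what the wildcard expands to.
echo "About to delete:" "${scratch:?}"/*

# ${scratch:?} stops with an error if scratch is empty or unset,
# so this line cannot turn into "rm -rf /*" by accident.
rm -rf -- "${scratch:?}"/*

ls -A "$scratch"                # .hidden is still there: * skips dotfiles
```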
cp
- Description: Copies files or directories.
- Usage:
cp source destination
- Note: Use the -r option to copy directories. To copy all the files inside a directory, use *: for example, cp calculation/GaAs/* . copies every file inside the GaAs directory to the current working directory. The . always refers to the current working directory. If you want the folders to be copied too, use cp -r calculation/GaAs/* .
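The difference between copying with and without -r can be checked with a small sketch like the one below. The directory layout and file names (scf.in, scf.out, outdir/data.xml) are assumptions made only for the demonstration.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p calculation/GaAs/outdir results
touch calculation/GaAs/scf.in calculation/GaAs/scf.out \
      calculation/GaAs/outdir/data.xml

cd results

# Without -r, cp copies the files but refuses the directory and
# returns a non-zero status, which is reported here.
cp ../calculation/GaAs/* . || echo "cp skipped the outdir/ directory"
ls

# With -r, the outdir/ folder and its contents are copied too.
cp -r ../calculation/GaAs/* .
ls

cp scf.in scf_backup.in        # copy a file under a new name
```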
mv
- Description: Moves or renames files or directories.
- Usage:
mv old_name new_name or mv source destination
- Note: We don't need any -r option here, since mv moves or renames directories together with everything inside them.
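A brief sketch of both uses, renaming and moving, with hypothetical names inside a temporary directory:

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p run1/outdir archive
touch run1/outdir/data.xml draft.txt

mv draft.txt notes.txt               # rename a file
mv run1 run_GaAs                     # rename a directory, contents included
mv run_GaAs archive/                 # move it into an existing directory
mv notes.txt archive/readme.txt      # move and rename in one step

ls -R archive
```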
touch
- Description: Updates the timestamps of a file or, if the file doesn't exist, creates it.
- Usage:
touch filename
- Note: It helps when we need to save data in a file and an empty file has to be created first. If the file already exists, touch updates its timestamps, so the file is now shown at the end of the ll -tr list.
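Both effects of touch can be seen in the sketch below. The file names are invented, and the old timestamps are set explicitly with touch -t so that the ordering is predictable.

```shell
work=$(mktemp -d)
cd "$work"

touch -t 202401010900 a.log     # older file
touch -t 202401011000 b.log     # newer file
ls -tr | tail -n 1              # b.log is at the bottom

touch a.log                     # refresh a.log's timestamp to "now"
latest=$(ls -tr | tail -n 1)
echo "Now at the bottom: $latest"

touch results.dat               # the file does not exist, so it is created empty
[ -s results.dat ] || echo "results.dat exists and is empty"
```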
nano
- Description: Opens the nano text editor.
- Usage:
nano or nano filename
- Note: If filename exists, it opens that file. Otherwise it starts a new file called filename. Not every HPC system has nano, but most of them do. On local machines with a GUI, gedit is a more user-friendly text editor than nano. Some HPC systems have vim, or at least the very primitive (and very powerful) vi.
Viewing File Contents
cat
- Description: Displays the entire content of one or more files.
- Usage:
cat filename
- Note: If the file is small enough (less than a hundred lines), this is useful. For large files, the output often floods the entire terminal and it's hard to find the specific part to look at. You can also enter multiple filenames to see all of their content.
head
- Description: Displays the first part of one or more files.
- Usage:
head filename
- Note: By default it shows only the first 10 lines, but you can use head -n 50 filename to show the first 50 lines, for example. It can also handle multiple files: head -n 15 *in will show the first 15 lines of all the files in the current directory whose names end with in.
tail
- Description: Displays the last part of one or more files.
- Usage:
tail filename
- Note: It's the same as head except that it shows the final part of the file. It is helpful when a calculation is running and we need to check only the last few lines to see the progress.
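To compare cat, head and tail side by side, here is a small sketch on generated files. run.out, a.in and b.in are hypothetical names. For a file that is still being written, tail -f filename keeps printing new lines as they arrive; it is shown only as a comment because it runs until interrupted with Ctrl+C.

```shell
work=$(mktemp -d)
cd "$work"
seq 1 100 > run.out                 # a file with 100 numbered lines
printf 'alpha\nbeta\n' > a.in
printf 'gamma\n' > b.in

cat a.in b.in                       # contents of both files, back to back
head -n 3 run.out                   # first three lines
tail -n 2 run.out                   # last two lines
head -n 1 *in                       # several files: each gets a "==> name <==" header

last=$(tail -n 1 run.out)
echo "Last line of run.out: $last"

# tail -f run.out                   # follow a growing file (stop with Ctrl+C)
```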
less
- Description: Displays the content of a file.
- Usage:
less filename
- Note: This is better than cat for large files. It loads the file a screen at a time rather than all at once, and you can move through it with the mouse wheel, SPACE, ENTER, or PAGE UP/DOWN. To quit, press q.
Remote Access & File Transfer
ssh
- Description: Securely logs in to a remote machine such as an HPC system (server/cluster/supercomputer). Sometimes the hostname is only accessible when you are connected through a VPN, as with the ARF cluster, whereas MareNostrum5 and the TRUBA cluster can be accessed without a VPN. The first time you connect to a particular host, ssh may ask you to confirm that you trust the host; for a host you know, this can safely be answered with yes.
- Usage:
ssh username@hostname
- Example:
ssh username@levrek1.ulakbim.gov.tr
scp
- Description: Securely copy files between hosts.
- Usage:
scp local_file username@hostname:/path/to/destination or scp username@hostname:/path/to/cluster_file /path/to/local_machine
- Note: While small clusters like TRUBA/ARF use the same host for login and file transfer, it seems MareNostrum5 provides separate I/O-specialized nodes for data transfer. It's also possible to transfer data between two clusters if they are connected, such as the TRUBA and ARF clusters. Like cp, scp accepts the -r option for copying directories.
rsync
- Description: Synchronizes files or directories between systems, efficiently copying only the differences.
- Usage:
rsync -avz source/ destination/
- Note: This is the best choice for syncing a local machine with a remote machine. I also prefer using an ignore.txt file that lists all the patterns to ignore, and then use rsync as:
rsync -avz --progress --exclude-from="ignore.txt" username@172.16.6.11:/arf/home/username/my_remote_files/* /home/local_username/my_local_files/
It can also be used to sync two local folders. In that case, the -z (compression) option is not needed. Check man rsync to learn about other options.
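The exclude-file workflow can be tried safely between two local folders before pointing it at a cluster. The sketch below is only an illustration under assumptions: the folder names, file names and ignore patterns are made up, and it adds the -n (dry-run) option, which is not covered above, to preview the transfer first. It does nothing if rsync is not installed.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/outdir" "$work/dst"
touch "$work/src/scf.in" "$work/src/scf.out" "$work/src/outdir/big.wfc"

# One pattern per line; anything matching is skipped.
printf 'outdir/\n*.wfc\n' > "$work/ignore.txt"

if command -v rsync >/dev/null 2>&1; then
    # -n lists what would be copied without copying anything.
    rsync -avn --exclude-from="$work/ignore.txt" "$work/src/" "$work/dst/"

    # The real run. No -z here, since both folders are local.
    rsync -av --exclude-from="$work/ignore.txt" "$work/src/" "$work/dst/"
    ls "$work/dst"
fi
```

The trailing slash on the source matters: src/ copies the contents of src, while src would copy the folder itself into dst.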