Linux: Getting Started

This note introduces essential Linux commands, along with HPC environment tips and job submission basics. Details follow below, but here is a list of the commands I use daily: pwd, cd, ls, tree, ll -tr, ll -a, mkdir, rmdir, rm, rm -rf, cp, mv, cat, head, tail, less, more, nano or vi or vim, grep, ssh, rsync, and some SLURM-specific commands. Learning the special characters such as the pipe |, asterisk *, tilde ~, and dot . is also helpful. If you are ever stuck with a command, try man command (for example, man ls). Most commands have a manual page, and it is very helpful.

Note

I started writing this hoping to cover everything, but it is getting long, so I plan to cover HPC and SLURM in a separate note. This note contains only the basic Linux commands and file management. Check the next note for HPC and SLURM.


Basic Linux Commands

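In the usage lines below, anything inside square brackets [ ] is optional. The asterisk * is a wildcard that matches everything. Characters like these are called metacharacters. The shell's wildcard patterns (globbing) are related to, but not the same as, regular expressions; the Wikipedia articles on glob patterns and on regular expressions cover the details.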
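
To see what the asterisk and the pipe actually do, here is a small sketch that runs in a throwaway directory. The file names are made up for illustration; only standard commands are used.

```shell
# Work in a temporary directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo"
touch scf.in nscf.in scf.out notes.txt

# The shell replaces *.in with every matching file name before ls runs.
ls *.in

# A pipe | feeds the output of one command into the next one.
# Here wc -l counts how many names ended with .in
in_count=$(ls *.in | wc -l)
echo "$in_count"
```

The same wildcard works with most commands in this note, for example head *in or cp *.in some_directory/.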

pwd

  • Description: Displays the current working directory.
  • Usage: pwd
  • Note: On a personal machine, right after you open a terminal, you may see a directory like /home/username/ (often abbreviated with a tilde ~). On HPC systems, the home directory paths might differ (e.g., /arf/home/username/ on ARF, /truba/home/username on TRUBA, or /home/groupname/username on MareNostrum5).

cd

  • Description: Changes the current directory to the given path. If no path is given, it changes to your home directory (e.g., /home/username), which is equivalent to cd ~.
  • Usage: cd path
  • Example: cd ~/directory_name moves you to the "directory_name" directory within your home folder. The path can either be an absolute path such as /home/username/project/qe/GaAs/ or a relative path such as qe/GaAs/ when you are already inside /home/username/project/.
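
The difference between absolute and relative paths can be tried out with a short sketch like the one below. The project/qe/GaAs layout is recreated under a temporary directory purely for illustration, so the paths are assumptions rather than anything on a real cluster.

```shell
# Hypothetical project layout under a temporary directory.
base=$(mktemp -d)
mkdir -p "$base/project/qe/GaAs"

# Absolute path: works no matter where you currently are.
cd "$base/project/qe/GaAs"
pwd

# Relative path: resolved against the current working directory.
cd "$base/project"
cd qe/GaAs
here=$(pwd)

# .. refers to the parent directory.
cd ..
parent=$(pwd)
echo "$here"
echo "$parent"
```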

ls

  • Description: Lists directory contents (files and folders; hidden ones are shown only with -a). If no path is given, the current working directory is assumed. Options can be combined: for example, ll -tr sorts by modification time with the most recently updated files at the bottom, which helps to see which files were updated last while a program is running.
  • Usage: ls [options] [path]
  • Common Options:
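  • -l: Long listing format. (ll is commonly an alias for ls -l, though the exact definition varies between systems.)
  • -a: Includes hidden files (names starting with a dot).
  • -h: Shows file sizes in human-readable format (used together with -l).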
  • -t: Sorts by modification time.
  • -r: Reverses the order.
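
As a rough illustration of combining these options, the sketch below creates a few files with different modification times and lists them. The file names and timestamps are invented for the example; touch -t is used only to backdate the files.

```shell
demo=$(mktemp -d)
cd "$demo"

# Backdate the files (touch -t takes [[CC]YY]MMDDhhmm).
touch -t 202401010800 .hidden
touch -t 202401010900 old.out
touch -t 202401011000 new.out

# Long format, human-readable sizes, sorted by time, newest at the bottom.
ls -lhtr

# Names only: the last line is the most recently modified file.
newest=$(ls -tr | tail -n 1)
echo "$newest"

# -a also lists hidden files such as .hidden
hidden_count=$(ls -a | grep -c '^\.hidden$')
```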

tree

  • Description: Like ls, but it shows the file and directory structure in a tree-like format. Not every Linux system has it installed.
  • Usage: tree
  • Example:
    $ tree
    .
    ├── dir1
    │   ├── file1.txt
    │   └── file2.txt
    └── dir2
        └── file3.txt
    

File Management

mkdir

  • Description: Creates a new directory.
  • Usage: mkdir new_directory
  • Note: It can create multiple directories at the same time, such as mkdir my_folder1 my_folder2 another_folder, but it reports an error if a directory with the same name already exists.
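
A minimal sketch of this behaviour, run in a temporary directory. The -p option shown at the end is not covered above; it is a standard mkdir option that creates missing parent directories and stays silent if the path already exists.

```shell
demo=$(mktemp -d)
cd "$demo"

# Several directories in one call.
mkdir my_folder1 my_folder2 another_folder

# Running mkdir again on an existing name fails with a non-zero exit status.
if mkdir my_folder1 2>/dev/null; then
    status="created"
else
    status="already exists"
fi
echo "$status"

# -p creates the whole chain of directories and does not complain on a rerun.
mkdir -p project/qe/GaAs
mkdir -p project/qe/GaAs
```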

rm

  • Description: Removes files or directories.
  • Usage: rm filename
  • For directories: rm -r directory_name (the -r option stands for recursive).
  • Caution: Use with care to avoid accidental data loss. Files deleted this way do not go to the Trash folder, so there is no simple way to restore them. To empty a directory, including its subdirectories, you can run rm -rf * inside it (hidden files are not matched by *). This is very dangerous but often helpful for managing storage: check where you are with pwd first, and keep a backup unless you are really sure of what you are doing.
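
One way to make deletions a little safer is to preview what a wildcard matches before handing it to rm. The sketch below does this inside a temporary directory with invented file names; it is only an illustration of the habit, not a guarantee against mistakes.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p scratch/tmp_run empty_dir
touch scratch/tmp_run/wfc1.dat scratch/keep.in

# Preview what the wildcard matches before deleting anything.
ls scratch/tmp_run/*

# rm -i would ask for confirmation per file; left commented out because it is interactive.
# rm -i scratch/tmp_run/*

# Remove a directory together with everything inside it.
rm -r scratch/tmp_run

# rmdir only removes directories that are already empty.
rmdir empty_dir
```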

cp

  • Description: Copies files or directories.
  • Usage: cp source destination
  • Note: Use the -r option to copy directories. To copy all the files inside a directory, use *: for example, cp calculation/GaAs/* . copies every file inside the GaAs directory to the current working directory. The . always refers to the current working directory. To copy the subfolders too, use cp -r calculation/GaAs/* .
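
The difference between copying with and without -r can be checked with a sketch like this one. The calculation/GaAs layout and file names are recreated in a temporary directory as an assumption for the example.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p calculation/GaAs/pseudo
touch calculation/GaAs/scf.in calculation/GaAs/pseudo/Ga.UPF

# Without -r: regular files are copied, but the pseudo directory is skipped.
# cp prints a warning and returns a non-zero status, hence the || true.
cp calculation/GaAs/* . 2>/dev/null || true
if [ -d pseudo ]; then dir_copied="yes"; else dir_copied="no"; fi
echo "directory copied without -r: $dir_copied"

# With -r: subdirectories and their contents are copied as well.
cp -r calculation/GaAs/* .
ls
```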

mv

  • Description: Moves or renames files or directories.
  • Usage: mv old_name new_name or mv source destination
  • Note: No -r option is needed here, since mv moves or renames directories together with their contents.
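
Whether mv renames or moves depends on whether the destination already exists as a directory. A small sketch with made-up names:

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir results archive
touch results/bands.dat

# Rename: results_run1 does not exist yet, so results simply gets a new name.
mv results results_run1

# Move: archive/ is an existing directory, so results_run1 is placed inside it.
mv results_run1 archive/

ls archive/results_run1
```

Note that mv overwrites an existing destination file without asking; mv -i prompts before overwriting.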

touch

  • Description: Updates the timestamps of a file, or creates an empty file if it doesn't exist.
  • Usage: touch filename
  • Note: Useful when a program needs an empty file to exist before data is written to it. If the file already exists, touch updates its timestamps, so it now appears at the end of the ll -tr list.
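
The effect on the ll -tr ordering can be seen in a short sketch. The two log files and their backdated timestamps are invented for the example.

```shell
demo=$(mktemp -d)
cd "$demo"

# Create two empty files with backdated timestamps.
touch -t 202401010900 a.log
touch -t 202401011000 b.log
last_before=$(ls -tr | tail -n 1)

# Touching an existing file sets its modification time to now.
touch a.log
last_after=$(ls -tr | tail -n 1)

echo "before: $last_before, after: $last_after"
```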

nano

  • Description: Opens the nano text editor
  • Usage: nano or nano filename
  • Note: If filename exists, nano opens it; otherwise it starts a new file with that name. Most HPC systems have nano, but not all. On local machines with a GUI, gedit is a more user-friendly text editor than nano. Some HPC systems have vim, or at least the very primitive (and very powerful) vi.

Viewing File Contents

cat

  • Description: Displays the entire content of a file(s).
  • Usage: cat filename
  • Note: This is useful if the file is small enough (less than a hundred lines). A large file scrolls past and fills the entire terminal, so it is hard to find the specific part you want to look at. You can also give multiple filenames to see all of their contents.

head

  • Description: Displays the first part of a file(s).
  • Usage: head filename
  • Note: By default it shows the first 10 lines, but you can use head -n 50 filename to show the first 50 lines, for example. It also handles multiple files: head -n 15 *in shows the first 15 lines of every file in the current directory whose name ends with in.

tail

  • Description: Displays the last part of a file(s).
  • Usage: tail filename
  • Note: It works the same as head, except that it shows the end of the file. It is helpful when a calculation is running and we only need to check the last few lines to see the progress.
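
A quick way to try head and tail is on a generated file. In the sketch below, seq writes the numbers 1 to 100, one per line, as a stand-in for a real output file; run.out is a hypothetical name.

```shell
demo=$(mktemp -d)
cd "$demo"

# A 100-line stand-in for a calculation output file.
seq 1 100 > run.out

head -n 3 run.out    # first three lines
tail -n 2 run.out    # last two lines

first=$(head -n 1 run.out)
last=$(tail -n 1 run.out)
echo "$first ... $last"

# tail -f keeps the file open and prints new lines as they are appended
# (stop with Ctrl+C); left commented out because it does not return on its own.
# tail -f run.out
```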

less

  • Description: Displays the content of a file.
  • Usage: less filename
  • Note: This is better than cat for large files. It shows the file one screen at a time instead of dumping it all at once, and you can move through it with the mouse wheel, SPACE, ENTER, or PAGE UP/DOWN. To quit, press q.

Remote Access & File Transfer

ssh

  • Description: Securely logs in to a remote machine such as an HPC system (server/cluster/supercomputer). Some hosts are only reachable through a VPN, such as the ARF cluster, while MareNostrum5 and the TRUBA cluster can be accessed without one. The first time you connect to a particular host, ssh asks you to confirm the host's key fingerprint; answer yes if you trust that you are connecting to the right machine.
  • Usage: ssh username@hostname
  • Example: ssh username@levrek1.ulakbim.gov.tr

scp

  • Description: Securely copy files between hosts.
  • Usage: scp local_file username@hostname:/path/to/destination or scp username@hostname:/path/to/cluster_file /path/to/local_machine
  • Note: While small clusters like TRUBA/ARF use the same host for login and file transfer, MareNostrum5 seems to provide separate I/O-specialized nodes for data transfer. It is also possible to transfer data between two clusters if they are connected, such as TRUBA and ARF. Like cp, scp accepts the -r option for directories.

rsync

  • Description: Synchronizes files or directories between systems, efficiently copying only the differences.
  • Usage: rsync -avz source/ destination/
  • Note: This is the best choice for syncing a local machine with a remote machine. I also prefer keeping an ignore.txt file that lists all the patterns to ignore, and then running rsync as
    rsync -avz --progress --exclude-from="ignore.txt" username@172.16.6.11:/arf/home/username/my_remote_files/* /home/local_username/my_local_files/
    
    It can also be used to sync two local folders; in that case, the -z (compression) option is not needed. Check man rsync to learn about other options.
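
To make the ignore.txt idea concrete, here is a sketch that syncs two local folders in a temporary directory. The folder names, file names, and the two patterns in ignore.txt are assumptions chosen for the example, and the whole thing is skipped if rsync is not installed.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p my_remote_files/tmp my_local_files
touch my_remote_files/scf.in my_remote_files/scf.out
touch my_remote_files/wfc1.dat my_remote_files/tmp/scratch.bin

# One pattern per line; these two patterns are examples only.
cat > ignore.txt <<'EOF'
*.dat
tmp/
EOF

if command -v rsync >/dev/null 2>&1; then
    # -n (dry run) lists what would be transferred without copying anything.
    rsync -avn --exclude-from="ignore.txt" my_remote_files/ my_local_files/

    # Real run between two local folders, so -z is left out.
    rsync -av --exclude-from="ignore.txt" my_remote_files/ my_local_files/
    ls my_local_files
else
    echo "rsync is not installed on this machine"
fi
```

The trailing slash on the source (my_remote_files/) means "the contents of this directory"; without it, rsync creates a my_remote_files folder inside the destination. A dry run with -n is a cheap way to check the exclude patterns before a large transfer.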