SSH Tutorial

In this post, I break down the basics of SSH and OpenSSH, offering an introduction to a general audience, whether you are a developer, a researcher, a student or just someone interested in using SSH. From setting up asymmetric key pairs to configuring OpenSSH, this tutorial covers all you need to know to get started and navigate the world of servers.

1. Security over the internet

When you connect to a website over the internet, you are always connecting to a physical (web) server. This server stores the website data and sends it to your web browser on request, and the browser then displays it on your computer screen. Nowadays, besides typing the URL of a website into your browser, you have nothing else to worry about. Your web browser and the web server automatically handle every other aspect of this process.

However, web servers are just one type of server among many others, each serving different purposes. For instance, there are servers dedicated to file sharing, database storage, running programs, managing virtual machines or even other servers, and much more. These servers often require a terminal for access. (Note: While ‘terminal’, ‘console’, ‘shell’, and ‘command line’ are frequently used interchangeably, they actually have distinct meanings. From now on, I will only use the term ‘terminal’.)

Connecting to these servers securely is crucial. When you browse a website, your connection is typically secured through Hypertext Transfer Protocol Secure (HTTPS), a “communication scheme”, which ensures that the data exchange between your browser and the web server is encrypted. Similarly, when connecting to other types of servers via a terminal, secure protocols like the Secure Shell Protocol (SSH) are used. This guarantees that all data transferred between you (the client) and the server is encrypted, adhering to modern standards of internet security.

2. SSH and asymmetric key pairs

Like every “protocol” that comes up in the context of computer science and technology (just like HTTPS above), SSH is just a “theoretical model” or “design”. Somebody first has to implement this design as software that we can run on a computer. In the case of SSH, the most widely used implementation is OpenSSH. Nowadays, OpenSSH is not a single program but rather a collection of them, bundled into what we call a software suite.

The goal of the SSH protocol is to give a client a secure way to authenticate itself when connecting to a server. OpenSSH offers several authentication methods (6 in total), such as a password or an asymmetric key pair. However, in most real-life use cases, key pairs are the preferred authentication method. They do not just save us from typing our password every time we connect to a server; asymmetric key pairs are also generally considered more secure than regular passwords.

The key pair authentication method implemented in OpenSSH is based on asymmetric cryptography, also called public-key cryptography. This means that during the authentication process between two actors, we use a pair of keys: a public key and a private key. Consider the scenario where these two actors are a client (a user) and a server. In this case, the public key is stored on the server, while the private key stays on the client’s (your) machine. The private key (held by you) is used to authenticate yourself to the server (it is basically like showing your ID card). The public key (on the server), on the other hand, is used to check that you really hold the matching private key (your digital ID card), without the private key ever leaving your machine. Of course, the story is much more complex than that. Asymmetric cryptography itself refers to the practice of encrypting and decrypting messages using a unique private-public key pair, but that is an entirely different topic for a completely different lecture.
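To make the “ID card” idea a bit more tangible, here is a small sketch that signs a message with a private key and checks it with the public key alone. It is only an illustration of the principle, not the actual SSH login handshake. It assumes a reasonably recent OpenSSH (the `-Y sign` and `-Y verify` options of ssh-keygen), and the directory, the file names and the identity `demo@example.com` are all made up for the demo.

```shell
# Illustration only: sign with the private key, verify with the public key.
# Everything lives in a throwaway directory; all names here are made up.
work_dir="$(mktemp -d)"

# Throwaway key pair without a passphrase (fine for a demo, not for real keys).
ssh-keygen -q -t ed25519 -N "" -C "demo@example.com" -f "$work_dir/id_demo"

# The verifying side only needs the public key, listed next to an identity.
printf 'demo@example.com %s\n' "$(cut -d ' ' -f 1,2 "$work_dir/id_demo.pub")" \
    > "$work_dir/allowed_signers"

# The key owner signs a message with the private key (creates message.txt.sig).
printf 'hello server\n' > "$work_dir/message.txt"
ssh-keygen -Y sign -f "$work_dir/id_demo" -n demo "$work_dir/message.txt"

# Verifying the untouched message succeeds.
if ssh-keygen -Y verify -f "$work_dir/allowed_signers" -I demo@example.com \
        -n demo -s "$work_dir/message.txt.sig" < "$work_dir/message.txt"; then
    verify_ok=yes
else
    verify_ok=no
fi

# Verifying a different message against the same signature fails.
printf 'hello attacker\n' > "$work_dir/tampered.txt"
if ssh-keygen -Y verify -f "$work_dir/allowed_signers" -I demo@example.com \
        -n demo -s "$work_dir/message.txt.sig" < "$work_dir/tampered.txt"; then
    tampered_ok=yes
else
    tampered_ok=no
fi

echo "original message verified: $verify_ok"
echo "tampered message verified: $tampered_ok"
```

Only someone holding the private key can produce a signature that the public key accepts, which is the property key pair authentication relies on.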

Asymmetric key pair files (both public and private key) in the `~/.ssh` directory on my computer.

Asymmetric cryptography is designed so that as long as you keep your private key a secret (readable by you and only you), no potential impersonator can pretend to be you and trick the server into believing it. Since both the public key and the private key are simple text files stored on a computer, it is possible for someone to steal your private key (e.g. copy the file after breaking into your account or computer). To mitigate this problem, OpenSSH offers a second layer of protection: an optional passphrase (essentially a password) for your private key that you need to type whenever you connect to a server and authenticate yourself using that key. Even if someone gets hold of your private key file, it is still useless without the (hopefully secure) passphrase.

OpenSSH also provides some quality-of-life tools that help you connect to a server even more easily, without any security trade-off. I will talk about these a bit later too.

Notes to clear up any remaining confusion

SSH: A computer protocol. It is a conceptual model of connecting computers securely over an otherwise unsecured network.
OpenSSH: The most popular software implementation of the SSH protocol; that is why I’m speaking about only this in this tutorial. Nowadays, this is a collection of software named “OpenSSH” that provides many other features besides SSH’s actual implementation.
Asymmetric cryptography/public-key cryptography: A cryptographic method/system that is widely used in computer authentication and is based on using “asymmetric key pairs” (also called “public and private key pairs”). It is the cryptographic method that OpenSSH uses as its authentication method.

3. Creating an SSH key pair

All steps detailed below are nearly identical on Windows, Linux and macOS. However, minor differences between various systems should always be expected. While Linux and macOS users can use their regular command line interface (aka “terminal”), Windows users are recommended to use Git Bash or the Windows Subsystem for Linux (WSL). The latter has been available since Windows 10, and it is a great way to get started with Linux or just use Linux commands if you do not want to install a virtual machine or dual-boot your computer just to use Linux.

Step 1. Set up the `.ssh` directory and its permissions

Create a .ssh directory in your ~ (home) directory, give it the necessary permissions and cd into it. If you are on Windows, note that the ~ directory is not necessarily the C:\Users\your-username directory! WSL keeps its own Linux file system at a well-hidden location on your computer, while Git Bash typically maps ~ to your Windows user folder. In either case, the ~ directory is usually the default location when you start a Git Bash or WSL terminal. Even if that is not the case, you can just cd there using cd ~. This is unnecessary for the commands here, though, if you use them as-is.

$ mkdir -p ~/.ssh && chmod 700 ~/.ssh && cd ~/.ssh

Just like the .ssh directory, both the public and private keys, the SSH config file and other related files and folders need the appropriate permissions to work as intended. A comprehensive list can be found in this forum post. Here, 700 stands for “can be read, written and executed by only you”.

Step 2. Generate your key pair

The next step is to use the ssh-keygen program from OpenSSH to generate a new key pair. Several cryptographic algorithms (like RSA or EdDSA) can be used for key pair generation, and most existing ones are implemented in OpenSSH. It may not be obvious, but some of these algorithms are more secure than others. Generally, using the safest available option for key generation is advised. Side note: the currently known best algorithm might not be supported by the server you want to connect to, forcing you to fall back to a generally less secure option. It happens; you have to deal with it. However, as of 2024, the best practice is to use the so-called Ed25519 digital signature scheme.

To generate a key pair using ssh-keygen, the only thing necessary is to pass a command line argument that specifies the name of the algorithm you want to use. Various servers or sites could ask you for other info; e.g., GitHub's instructions ask you to put the email address associated with your GitHub account in the comment field of the key pair, using the -C (comment) flag. However, this is optional for the key generation itself in most cases. Using the Ed25519 algorithm, the command to generate a key pair (along with the optional -C flag for GitHub) is the following:

$ ssh-keygen -t ed25519 -C "your@email.com"

After hitting enter, ssh-keygen will ask you 3 questions:

1. Enter a filename to save the key in

This question asks where you want to save the key pair and what you want to name the created key files (the public and the private key files). Here, you can give the desired file name or an absolute or relative path for the newly generated keys. Since we want to create them in the .ssh directory (and because we already cd-ed into it in the previous step), ~/.ssh/file-name and file-name lead to the same result here. In the first case, the key pair is generated with the file name file-name under the ~/.ssh/ directory. In the second case, it is generated inside the current working directory (which is the ~/.ssh/ directory in this case).

It is good practice to name your key pairs according to what you want to use them for. E.g. generate your key pair with the name id_github if you are going to use it for GitHub, or name it id_volta if you want to use it with some server named “Volta” (I just made up some name). Of course, this is just a cosmetic tip to make your life easier; it is not required.

2. Enter a passphrase!
3. Enter this passphrase again!

As I already mentioned, OpenSSH offers an additional layer of protection for your key pair in the form of an (optional) passphrase; this is a password for your key pair specifically. Whether you want to set up a passphrase for your key pair is up to you. If you don’t, press Enter twice to skip questions 2 and 3. If you do specify a passphrase, remember that you must type it in every time you want to use your key pair, for example, when you want to clone a repository from GitHub or when accessing a server in any way. This could be annoying, but using a passphrase for your key pair is good practice since it makes it harder for an attacker to use your key pair to access your account on a server or a website. Additionally, SSH can be configured to remember your passphrase as long as the current shell session is open.
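If you prefer not to answer the questions interactively, the same answers can be passed on the command line. The sketch below uses the -f flag (file name, question 1) and the -N flag (new passphrase, questions 2 and 3) of ssh-keygen. To keep it harmless, it writes into a throwaway directory and uses an empty passphrase; both of these are assumptions for the demo, and for a real key you would point -f at ~/.ssh and pick a proper passphrase.

```shell
# Sketch: answer all three ssh-keygen questions on the command line.
# key_dir is a throwaway directory used only for this demonstration.
key_dir="$(mktemp -d)"

# -f: where to save the key (question 1)
# -N: the passphrase (questions 2 and 3); "" means "no passphrase"
ssh-keygen -t ed25519 -C "your@email.com" -f "$key_dir/id_example" -N ""

# Two files appear: the private key and the matching .pub public key.
ls -l "$key_dir"
```

The public key file is a single line of text that starts with the algorithm name and ends with the comment given via -C.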

Step 3. Aftercare of your new key pairs

After completing the previous steps, your terminal will display two messages about the successful creation of the private and public key files. Below that, the public key’s SHA256 fingerprint and its randomart will also be printed. The fingerprint is a short character string computed with the SHA256 hashing algorithm, using the public key as an input. It is not random: the same key always gives the same fingerprint, which makes it a compact way to identify and compare keys. Similarly, the randomart is an ASCII art image derived from the key that serves the same purpose but for humans! A human can more easily recognize a randomart image than a long character string and can check whether two keys match by comparing two randomarts.
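You can print the fingerprint and the randomart of an existing key again at any time with the -l flag of ssh-keygen. Here is a minimal sketch, again using a throwaway key in a temporary directory (the directory and the key name id_demo are placeholders for this demo only):

```shell
# Throwaway key pair just for this demonstration.
key_dir="$(mktemp -d)"
ssh-keygen -q -t ed25519 -N "" -C "demo" -f "$key_dir/id_demo"

# -l prints the fingerprint of a key file, -E selects the hash algorithm.
fp_first="$(ssh-keygen -l -E sha256 -f "$key_dir/id_demo.pub")"
fp_second="$(ssh-keygen -l -E sha256 -f "$key_dir/id_demo.pub")"
echo "$fp_first"

# Asking twice gives the same answer: the fingerprint depends only on the key.
[ "$fp_first" = "$fp_second" ] && echo "fingerprints match"

# Adding -v also prints the randomart image for the same key.
ssh-keygen -l -v -f "$key_dir/id_demo.pub"
```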

After you have finished the setup, your terminal should look something like this:

Now the only thing that remains is to register your key pair in OpenSSH so your computer is aware of its existence. This can be done by starting another program called ssh-agent and letting it run in the background until we reboot our computer. This program is responsible for registering and handling your key pairs. To start it and then add your key to it, simply type the following commands:

$ eval "$(ssh-agent -s)"
> Agent pid <some number>
$ ssh-add ~/.ssh/id_example
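To check what the agent currently holds, ssh-add has a -l flag that lists the fingerprints of all loaded keys. The following sketch runs the whole cycle on a throwaway key, so it does not touch your real keys: start an agent, add the key, list it, and stop the agent again. The temporary directory, the key name id_demo and the comment agent-demo are assumptions made up for this demo.

```shell
# Throwaway key pair without a passphrase, only for this demonstration.
key_dir="$(mktemp -d)"
ssh-keygen -q -t ed25519 -N "" -C "agent-demo" -f "$key_dir/id_demo"

# Start an agent for this shell session and register the key with it.
eval "$(ssh-agent -s)" > /dev/null
ssh-add "$key_dir/id_demo"

# -l lists the fingerprints (and comments) of the keys the agent holds.
agent_keys="$(ssh-add -l)"
echo "$agent_keys"

# Stop the demo agent again; -k kills the agent named by SSH_AGENT_PID.
eval "$(ssh-agent -k)" > /dev/null
```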

To ensure that the generated public and private key have the correct permissions, set them manually now (note the placeholder names id_example and id_example.pub!):

$ chmod 600 ~/.ssh/id_example && chmod 644 ~/.ssh/id_example.pub

(Here 600 stands for “can be read and written only by you”, while 644 stands for “can be read and written by only you and can be only read by anyone else”. Refer to the same forum post from above.)
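If the numbers feel abstract, it helps to look at how ls displays them. The sketch below applies the three modes used in this tutorial to empty placeholder files in a temporary directory (not your real ~/.ssh) and prints the permission string that ls reports for each.

```shell
# Placeholder directory and empty files, only to look at permission strings.
perm_dir="$(mktemp -d)"
mkdir -p "$perm_dir/.ssh"
touch "$perm_dir/.ssh/id_example" "$perm_dir/.ssh/id_example.pub"

chmod 700 "$perm_dir/.ssh"
chmod 600 "$perm_dir/.ssh/id_example"
chmod 644 "$perm_dir/.ssh/id_example.pub"

# The first ten characters of `ls -l` are: file type, then read/write/execute
# flags for the owner, the group and everyone else.
dir_mode="$(ls -ld "$perm_dir/.ssh" | cut -c 1-10)"
priv_mode="$(ls -l "$perm_dir/.ssh/id_example" | cut -c 1-10)"
pub_mode="$(ls -l "$perm_dir/.ssh/id_example.pub" | cut -c 1-10)"

echo "$dir_mode  .ssh"             # drwx------  (700)
echo "$priv_mode  id_example"      # -rw-------  (600)
echo "$pub_mode  id_example.pub"   # -rw-r--r--  (644)
```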

Step 4. Registering your public key on the server

In the SSH protocol, the server uses your public key to check that you hold the matching private key when you try to connect to it. In the case of OpenSSH, all accepted public keys are stored in a file named ~/.ssh/authorized_keys in your home directory on the server. In this last step, you have to copy your public key into this file.

There are several ways to do this, but the easiest is to simply append the content of your public key file to this file on the server. This can be done with a single, built-in command from OpenSSH, which even creates the ~/.ssh/authorized_keys file if it does not exist yet:

$ ssh-copy-id -i ~/.ssh/id_example.pub username@server-address
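In case ssh-copy-id is not available on your system, the same result can be achieved by hand. The sketch below simulates the server-side part locally so that it is safe to run: `server_home` stands in for your home directory on the server, and `pubkey_line` is a made-up placeholder, not a real key. The comment at the top shows roughly what the manual equivalent would look like against a real server.

```shell
# Against a real server, the manual equivalent would be roughly:
#   cat ~/.ssh/id_example.pub | ssh username@server-address \
#     'mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'

# Local simulation: a fake "server home" and a made-up public key line.
server_home="$(mktemp -d)"
pubkey_line="ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPlaceholderNotARealKey your@email.com"

# Create the directory with the right permissions if it is missing.
mkdir -p "$server_home/.ssh"
chmod 700 "$server_home/.ssh"

# Append (>>) rather than overwrite (>), so existing keys are kept.
printf '%s\n' "$pubkey_line" >> "$server_home/.ssh/authorized_keys"
chmod 600 "$server_home/.ssh/authorized_keys"

# One line per authorized public key.
cat "$server_home/.ssh/authorized_keys"
```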

After restarting your terminal, you can check whether the key pair is working properly by simply trying to log in to the server via SSH:

$ ssh username@server-address

If everything is set up correctly, you should either be prompted to type your passphrase (if you set one up) or be logged in to the server without any further questions.

4. Setting your SSH config file

Something I would have loved to be aware of when I started using SSH is that it can be made more user-friendly with a simple configuration file. Although it can offer you several quality-of-life improvements for your SSH experience, I would like to highlight two of them here:

If there are multiple servers that you need to connect to via SSH, it can be tiresome to type full-length SSH commands every time you connect to one of them. These commands can be shortened to single aliases using a correctly set up SSH config file. This can be extremely useful even if you only have to connect to a single remote machine.

The ssh-agent program that registers and handles your key pairs forgets your keys after a reboot. This means that it needs some encouragement every time you turn on your computer for it to work properly. To reduce the inconvenience of adding keys to the ssh-agent every time you restart your device, you can define keys in the config file and configure them to be added to the agent automatically the first time they are used.

The SSH config file on my computer as an example for a real setup.

The config file can be created (touch) under the ~/.ssh/ directory and then given the necessary permissions (chmod 600), simply by

$ touch ~/.ssh/config && chmod 600 ~/.ssh/config

This file can then be edited with any text editor. Its most important feature is that you can collect in it all the connection details of a server that you would otherwise have to type explicitly into the command line every time you connect to that specific server. Instead of typing all this info, you can define an alias for any host (“server” in other words) inside the config file. When you want to connect to a server now, you only have to specify the alias. Everything else is handled by OpenSSH.

The config file has a straightforward syntax:

Host alias_1
    Option_1 value_1
    Option_2 value_2
    ...
  
Host alias_2
    Option_1 value_1
    Option_2 value_2
    ...
.
.
.

The Options you will usually want to specify for a host are the HostName (IP or domain name of the server), User (your username on the server) and IdentityFile (path to the corresponding private key file). If you do not provide the latter, OpenSSH tries its default key files and any keys held by the ssh-agent, and if none of them is accepted, you will typically be prompted to type your password. A comprehensive list with detailed descriptions of all possible SSH config options can be found on the official docs page.

It is also possible to specify an entry that is applied to all hosts. This is done by using the * wildcard as the host name. One useful example is to specify the AddKeysToAgent option for all hosts, so you do not have to add your keys to the ssh-agent manually every time you restart your computer. Such an entry looks like this:

Host *
    AddKeysToAgent yes

Host alias_1
.
.
.

An example setup for a real-world configuration

Consider the somewhat common scenario where you first have to connect to a so-called “head node” server of some server network at a research institute, and from there, you need to SSH your way over to the actual server you want to work on. Imagine the server-side SSH software is listening on port 2222, so besides your credentials and the IP, you have to specify that too. In a setup like this, the home directory of the head node is usually shared with the actual server you want to work on. This means you do not need to generate a separate key pair for each machine; you can use the same key pair for both and only need to copy its public key to the single, shared ~/.ssh/ directory.

The full commands to access the server described above would look like this:

$ ssh -p 2222 username@very.long.name.of.the.server.edu
on-head-node$ ssh username@ip-of-the-server-you-want-to-work-on

That is not really convenient to type 54 times per day. Let us create entries for these in the config file and even attach a private key to them that was previously set up for this server and my device. The entries would look like this:

Host head-node-whatever
    HostName very.long.name.of.the.server.edu
    Port 2222
    User username-goes-here
    IdentityFile ~/.ssh/id_example

Host server
    HostName ip-of-the-server-you-want-to-work-on
    User username-goes-here
    IdentityFile ~/.ssh/id_example
    ProxyJump head-node-whatever

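The first line specifies the arbitrary alias (head-node-whatever) assigned to a specific host machine. The other entries below that specify the credentials and configurations regarding how to connect to the host machine. The ProxyJump option in the second entry tells OpenSSH to first connect to head-node-whatever and then hop on to the actual server from there, so both steps happen in one go. Now if I want to connect to the server, all I need to type on my machine is the following: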

$ ssh server

Much easier, right?
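If you want to see how OpenSSH interprets an alias without actually connecting anywhere, the -G flag of ssh prints the resolved options for a host and then exits. The sketch below writes the two entries from above into a temporary config file (selected with -F, so your real ~/.ssh/config stays untouched) and prints a few of the resolved values. The temporary directory is an assumption for the demo; with your real config you would simply run `ssh -G server`.

```shell
# Write the example entries into a throwaway config file.
cfg_dir="$(mktemp -d)"
cat > "$cfg_dir/config" <<'EOF'
Host head-node-whatever
    HostName very.long.name.of.the.server.edu
    Port 2222
    User username-goes-here
    IdentityFile ~/.ssh/id_example

Host server
    HostName ip-of-the-server-you-want-to-work-on
    User username-goes-here
    IdentityFile ~/.ssh/id_example
    ProxyJump head-node-whatever
EOF
chmod 600 "$cfg_dir/config"

# -G: print the effective configuration for a host and exit (no connection).
# -F: read this config file instead of the default one.
head_opts="$(ssh -G -F "$cfg_dir/config" head-node-whatever)"
server_opts="$(ssh -G -F "$cfg_dir/config" server)"

# Show only the options set in the entries above (keys are printed lowercase).
echo "$head_opts" | grep -E '^(hostname|port|user) '
echo "$server_opts" | grep -E '^(hostname|user|proxyjump) '
```

The full output of -G is much longer, since it also lists every default value OpenSSH falls back to.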

An example setup for GitHub

The same can be set up for e.g. GitHub to ensure that your SSH key pair works in any use case. Since GitHub expects every SSH connection to use the fixed user name git, the whole entry should look exactly like this:

Host github.com
    HostName github.com
    User git
    RequestTTY no
    PreferredAuthentications publickey
    IdentityFile ~/.ssh/whatever-you-call-your-github-private-key

While both the alias and the HostName should be specified as github.com, you can still name your private key as you would like.