Prasanth Janardhanan

Creating an isolated cluster - provisioning a cluster and a Bastion host using Ansible

This is the second part of the series on setting up a cluster using Terraform and Ansible. In the first part, we set up the bare cluster virtual machines. The setup included a virtual private network, a bastion server, and a separately configurable cluster of nodes.

The nodes are ready, but no software has been installed or configured yet. In this next step, we will provision all the nodes in our cloud.

Provisioning Bastion Server using Ansible

Check out the Ansible playbook in the 2-provision/bastion folder. The first step is to configure the system:

- name: Set timezone to Etc/UTC
  timezone:
    name: Etc/UTC

# Install Packages
- name: Update apt
  apt: update_cache=yes

- name: Install required system packages
  apt: name={{ item }} state=latest update_cache=yes force_apt_get=yes
  loop: [ 'curl', 'vim', 'ufw']

- name: Should be able to add local hosts in /etc/hosts
  lineinfile:
    path: /etc/cloud/cloud.cfg
    regexp: '^manage_etc_hosts:'
    line: 'manage_etc_hosts: false'

- name: Update system
  apt:
    name: "*"
    update_cache: yes
    state: latest
  register: system_updated

First, we set the timezone to UTC. Then we install or upgrade some system packages. We update the file /etc/cloud/cloud.cfg with a setting (manage_etc_hosts: false) so that cloud-init no longer overwrites /etc/hosts and we can edit it ourselves. We will be updating the hosts file so that the nodes in the cluster can be reached using names like node1, node2, etc.

User setup

The next step is to configure user login and SSH. See the users role in the folder 2-provision/roles/users. We create a 'wheel' group with sudo permissions and then disable root login. Note that this is just one way of configuring user access and SSH; depending on your requirements, you can customize the user login setup with stricter SSH login restrictions.

# Sudo Group Setup
- name: Make sure we have a 'wheel' group
  group:
    name: wheel
    state: present

- name: Allow 'wheel' group to have passwordless sudo
  lineinfile:
    path: /etc/sudoers
    state: present
    regexp: '^%wheel'
    line: '%wheel ALL=(ALL) NOPASSWD: ALL'
    validate: '/usr/sbin/visudo -cf %s'

- name: Create a new user
  user:
    name: "{{ deploy_user_name }}"
    state: present
    groups: wheel
    append: true
    create_home: true
    shell: /bin/bash

- name: Set authorized key for remote user
  authorized_key:
    user: "{{ deploy_user_name }}"
    state: present
    key: "{{ lookup('file', deploy_user_key_path) }}"

- name: Disable password login
  lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^(#\s*)?PasswordAuthentication '
    line: 'PasswordAuthentication no'
  notify: Restart sshd

- name: Disable root login
  lineinfile:
    path: /etc/ssh/sshd_config
    state: present
    regexp: '^#?PermitRootLogin'
    line: 'PermitRootLogin no'
  notify: Restart sshd
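
The two sshd tasks notify a handler named Restart sshd, which is not shown above. A minimal sketch of that handler, placed in the role's handlers file, could look like this (the service name is an assumption):

- name: Restart sshd
  # Restart the SSH daemon so the config changes take effect
  # (on some Ubuntu releases the service unit is named 'ssh' instead of 'sshd')
  service:
    name: sshd
    state: restarted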

Configuring node communications

The next few steps make SSH communication with the nodes in the cluster a bit easier. We assign names to the nodes and use their private IP addresses for communication between the nodes.

    - name: Add IP address of all hosts to hosts file
      lineinfile:
        dest: /etc/hosts
        regexp: '.*{{ item.name }}$'
        line: "{{ item.ip }}  {{ item.name }}"
        state: present
      loop:
        - name: node1
          ip: ""
        - name: node2
          ip: ""
        - name: node3
          ip: ""
        - name: node4
          ip: ""
        - name: node5
          ip: ""
        - name: node6
          ip: ""

    - name: Copy keys used to connect to nodes
      copy:
        src: "keys/{{ item }}"
        dest: "/home/{{ deploy_user_name }}/.ssh/{{ item }}"
        mode: "600"
        owner: "{{ deploy_user_name }}"
        group: "{{ deploy_user_name }}"
      loop:
        - nodes

    - name: Make sure ssh/config file exists
      file:
        path: "/home/{{ deploy_user_name }}/.ssh/config"
        state: touch

    - name: Setup SSH config file for nodes
      blockinfile:
        path: "/home/{{ deploy_user_name }}/.ssh/config"
        marker: "# {mark} Added through ansible scripts {{ item }}"
        block: |
          Host {{ item }}
              IdentityFile ~/.ssh/nodes
              User {{ deploy_user_name }}
      loop:
        - node1
        - node2
        - node3
        - node4
        - node5
        - node6

Installing tools

Next, we install the tools required for running the cluster. For example, it is handy to have kubectl and helm on the bastion host, so we install these tools.
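
The exact tasks are in the repository; as a rough sketch, downloading the two binaries could look like this (the version number and install paths here are illustrative assumptions, not values taken from the repo):

# Sketch only: the kubectl version and install paths are assumptions
- name: Install kubectl
  get_url:
    url: "https://dl.k8s.io/release/v1.23.0/bin/linux/amd64/kubectl"
    dest: /usr/local/bin/kubectl
    mode: "0755"

- name: Download the helm install script
  get_url:
    url: "https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3"
    dest: /tmp/get-helm-3.sh
    mode: "0755"

- name: Run the helm install script
  command: /tmp/get-helm-3.sh
  args:
    creates: /usr/local/bin/helm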

Next, we install and configure an NFS share on the bastion host. This is optional, but a network share accessible from all nodes of the cluster comes in handy.
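
The NFS tasks are part of the playbook; a minimal sketch of an NFS server setup, assuming /mnt/share as the exported directory (the real path and the private-network CIDR are not shown in the article), might look like:

# Sketch only: the export path and network range are assumptions
- name: Install NFS server
  apt:
    name: nfs-kernel-server
    state: present

- name: Create the shared directory
  file:
    path: /mnt/share
    state: directory
    mode: "0777"

- name: Export the share to the private network only
  lineinfile:
    path: /etc/exports
    line: "/mnt/share <private-network-cidr>(rw,sync,no_subtree_check)"

- name: Re-export all shares
  # runs on every play in this simple sketch; a handler would be more idiomatic
  command: exportfs -ra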

Firewall setup

We will use ufw for the firewall. First, install ufw, then allow only incoming SSH. We also open the NFS port, but only to the private network.

    - name: Install ufw
      apt:
        name: ufw
        state: latest

    - name: ufw allow ssh
      ufw:
        rule: allow
        name: OpenSSH

    - name: ufw allow NFS
      ufw:
        rule: allow
        port: "2049"
        src: ""

    - name: Enable UFW
      ufw:
        state: enabled

Provisioning the Bastion server

Now that we have the scripts ready, let us run Ansible to configure the servers.

First, generate an SSH key for logging in to the bastion host; let's name it mycluster-bastion. Then create SSH keys for the nodes. Keep the node keys (private and public) in the folder 2-provision/bastion/keys.
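
For example, the keys can be generated with ssh-keygen (the ed25519 key type here is an assumption; any type supported by your hosts works):

# Key for logging in to the bastion host
ssh-keygen -t ed25519 -f ~/.ssh/mycluster-bastion -N ""

# Key pair used to reach the nodes; keep it in 2-provision/bastion/keys
ssh-keygen -t ed25519 -f 2-provision/bastion/keys/nodes -N ""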

Write the IP address of the bastion host to node.txt, then run the Ansible playbook:

ansible-playbook bastion.yaml -i node.txt --user='root' --key-file="~/.ssh/tcloud" --ssh-extra-args='-p 22 -o ConnectTimeout=10 -o ConnectionAttempts=10 -o StrictHostKeyChecking=no' --extra-vars="deploy_user_name=nodeuser deploy_user_key_path=~/.ssh/"

All these steps can be automated in a bash script like this:

function ProvisionBase(){
   cd ./1-infra/mycluster/base
   # extract the bastion's public IP from the terraform state
   IP=$(terraform show | egrep bastion_ip | cut -d'"' -f 2 )
   echo "IP is $IP"
   cd ../../../2-provision/bastion
   printf "$IP\n" > node.txt
   ansible-playbook bastion.yaml -i node.txt --user='root' --key-file="~/.ssh/tcloud" --ssh-extra-args='-p 22 -o ConnectTimeout=10 -o ConnectionAttempts=10 -o StrictHostKeyChecking=no' --extra-vars="deploy_user_name=nodeuser deploy_user_key_path=~/.ssh/"
}

The script can be run like this:

./ provision base
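
The dispatch logic that maps the provision base arguments to the ProvisionBase function is not shown above; a minimal sketch, assuming the functions live in the same script, could be:

# Sketch of the argument dispatch; the real script may differ
case "$1" in
  provision)
    case "$2" in
      base)    ProvisionBase ;;
      cluster) provisionCluster ;;   # defined later in this article
    esac
    ;;
  *)
    echo "usage: $0 provision base|cluster" ;;
esac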

Provisioning the cluster nodes

We run the basic system setup on the cluster nodes as well, using the same roles used for provisioning the bastion server. Then we set up SSH and user rights using the users role.

Installing essential software on the nodes

- name: post build setup - nodes
  hosts: all
  become: true
  tasks:
    - name: Initial System Setup
      include_role:
        name: ../roles/system

    - name: Users setup
      include_role:
        name: ../roles/users

    - name: Install docker
      include_role:
        name: geerlingguy.docker
      vars:
        # add the deploy user to the docker group (geerlingguy.docker variable)
        docker_users:
          - "{{ deploy_user_name }}"

    - name: Set authorized key taken from file
      authorized_key:
        user: myadmin
        state: present
        key: "{{ lookup('file', '../bastion/keys/') }}"

    - name: Enable UFW firewall
      include_role:
        name: ../roles/ufw

The nodes need only a little software installed. We intend to build a Kubernetes cluster, with Docker as the container runtime, so we install Docker. Then we copy the SSH keys. Finally, we install and configure the UFW firewall.

The firewall is configured to disallow communications from the public network. All communications to the cluster nodes are through the private network. See the UFW rules configured using Ansible:

- name: make sure ufw is installed
  apt:
    name: ufw
    state: latest

- name: disable all incoming on eth0
  ufw:
    rule: reject
    direction: in
    interface: eth0

- name: allow all from internal network
  ufw:
    rule: allow
    from_ip: ""
    to_ip: any

- name: Enable UFW
  ufw:
    state: enabled

Now that the script is ready, we can run the Ansible playbook.

First, create an SSH key for the node user; name it mycluster-nodes. Then collect the public IP addresses of the nodes into a cluster.txt inventory file and run the command:

cd ./2-provision/cluster

ansible-playbook nodes.yaml -i cluster.txt --user='root' --key-file="~/.ssh/tcloud" --ssh-extra-args='-p 22 -o ConnectTimeout=10 -o ConnectionAttempts=10 -o StrictHostKeyChecking=no' --extra-vars="deploy_user_name=nodeuser deploy_user_key_path=~/.ssh/"

Here is the bash script function that does the same:

function provisionCluster(){
   cd "./1-infra/mycluster/$NAME"
   # collect the unique public IPs of the nodes from the terraform state
   NODES=$(terraform show | egrep ipv4_address | cut -d'"' -f 2 | sort -u)
   cd ../../../2-provision/cluster
   printf "$NODES\n" > cluster.txt
   ansible-playbook nodes.yaml -i cluster.txt --user='root' --key-file="~/.ssh/tcloud" --ssh-extra-args='-p 22 -o ConnectTimeout=10 -o ConnectionAttempts=10 -o StrictHostKeyChecking=no' --extra-vars="deploy_user_name=nodeuser deploy_user_key_path=~/.ssh/"
}

Once the command finishes, the cluster nodes are provisioned.

To test whether the setup works, add the IP address of the bastion host to your local /etc/hosts file, SSH to the bastion host, and then SSH from there to the nodes in the cluster:

ssh -i ~/.ssh/mycluster-bastion nodeuser@bastion

ssh nodeuser@node1

If you could get to node1 through SSH, the setup is working as expected. We can proceed to install Kubernetes on this cluster.

Similarly, you can try logging into node2 and node3.

SSH to the nodes through the bastion host

You can configure SSH on your local laptop to connect to the node through the bastion host. Here is a sample ssh config file (place the file in ~/.ssh/config )

Host bastion
  User nodeuser
  IdentityFile ~/.ssh/mycluster-bastion
Host node1
  IdentityFile ~/.ssh/mycluster-nodes
  ProxyCommand ssh nodeuser@bastion -W %h:%p

Now you can access node1 directly from your local laptop:

ssh nodeuser@node1
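
On OpenSSH 7.3 and newer, the ProxyCommand line can be replaced with the simpler ProxyJump directive. An equivalent node1 entry would be:

Host node1
  User nodeuser
  IdentityFile ~/.ssh/mycluster-nodes
  ProxyJump bastion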