Erwan Gallen

Apr 26, 2021

OpenShift Bare Metal provisioning with NVIDIA GPU


TL;DR

The bare metal installation of OCP is only this installer command:

$ openshift-baremetal-install --dir ~/clusterconfigs create cluster

but I’ll take time in this post to explain how to prepare your platform and how to follow the installation.

Here is the platform we will deploy in this post: an OCP platform on Dell PowerEdge R740xd servers.

Each node in my lab is a Dell PowerEdge R740xd server.

We have a total of 7 x Dell PowerEdge R740xd servers.

Each node has two Intel Xeon Gold 6230 CPUs:

$ cat /proc/cpuinfo | grep Xeon | tail -1
model name  : Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz

Each node has 192GB of RAM (12 x memory modules: 16GiB DIMM DDR4 Synchronous Registered Buffered 2933 MHz):

$ free -h
              total        used        free      shared  buff/cache   available
Mem:          187Gi        11Gi       159Gi        90Mi        16Gi       174Gi
Swap:            0B          0B          0B

Each worker node has one PCIe NVIDIA T4 GPU accelerator:

$ lspci -nn | grep -i nvidia
3b:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)

I’ll follow the official OCP installer-provisioned installation documentation: https://docs.openshift.com/container-platform/4.7/installing/installing_bare_metal_ipi/ipi-install-overview.html

Installation steps:

Prepare the environment

We are using one node as the “Provisioning node”. This node will host the temporary “Bootstrap VM” created by the OpenShift installer CLI. In addition to the OpenShift installer, I’ll use the “Provisioning node” to host the baremetal network DHCP server (dhcp-server), a DNS server (bind9), and the baremetal network gateway (iptables).

We have one RHEL 8.2 system ready:

[kni@e26-h37-740xd ~]$ cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.2 (Ootpa)

Enable the sudo NOPASSWD:

[root@e26-h37-740xd ~]# sed --in-place 's/^\s*\(%wheel\s\+ALL=(ALL)\s\+ALL\)/# \1/' /etc/sudoers
[root@e26-h37-740xd ~]# sed --in-place 's/^#\s*\(%wheel\s\+ALL=(ALL)\s\+NOPASSWD:\s\+ALL\)/\1/' /etc/sudoers
[root@e26-h37-740xd ~]# diff /tmp/sudoers /etc/sudoers
107c107
< %wheel	ALL=(ALL)	ALL
---
> # %wheel	ALL=(ALL)	ALL
110c110
< # %wheel	ALL=(ALL)	NOPASSWD: ALL
---
> %wheel	ALL=(ALL)	NOPASSWD: ALL
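
A syntax error in /etc/sudoers can lock you out of sudo, so it is worth validating the file after these in-place edits; visudo has a check-only mode:

[root@e26-h37-740xd ~]# visudo -c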

Create the kni user:

[root@e26-h37-740xd ~]# adduser kni
[root@e26-h37-740xd ~]# passwd kni
[root@e26-h37-740xd ~]# usermod -aG wheel kni

Copy your ssh key:

egallen@laptop ~ % ssh-copy-id kni@e26-h37-740xd

Check network:

[kni@e26-h37-740xd ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:de brd ff:ff:ff:ff:ff:ff
    inet 10.19.96.186/23 brd 10.19.97.255 scope global dynamic noprefixroute eno3
       valid_lft 258859sec preferred_lft 258859sec
3: eno4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e4:43:4b:b2:77:df brd ff:ff:ff:ff:ff:ff
4: ens7f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 40:a6:b7:00:df:00 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::f8d5:7679:162b:40b2/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
5: ens7f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 40:a6:b7:00:df:01 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::be3:be06:2146:7e38/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
6: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:dc brd ff:ff:ff:ff:ff:ff
    inet 172.16.96.186/16 brd 172.16.255.255 scope global noprefixroute eno1
       valid_lft forever preferred_lft forever
7: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:dd brd ff:ff:ff:ff:ff:ff
    inet 172.17.96.186/16 brd 172.17.255.255 scope global noprefixroute eno2
       valid_lft forever preferred_lft forever

Check routes:

[kni@e26-h37-740xd ~]$ ip r
default via 10.19.97.254 dev eno3 proto dhcp metric 102 
10.19.96.0/23 dev eno3 proto kernel scope link src 10.19.96.186 metric 102 
172.16.0.0/16 dev eno1 proto kernel scope link src 172.16.96.186 metric 103 
172.17.0.0/16 dev eno2 proto kernel scope link src 172.17.96.186 metric 104 
172.18.0.0/16 dev ens7f0 proto kernel scope link src 172.18.96.186 metric 100 
172.19.0.0/16 dev ens7f1 proto kernel scope link src 172.19.96.186 metric 101 
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown 

Firewall

The iptables service is started (firewalld was not automatically enabled on this node, but firewalld works too):

[kni@e26-h37-740xd ~]$ sudo systemctl status iptables
● iptables.service - IPv4 firewall with iptables
   Loaded: loaded (/usr/lib/systemd/system/iptables.service; enabled; vendor preset: disabled)
   Active: active (exited) since Wed 2021-04-21 08:00:48 UTC; 11s ago
 Main PID: 8506 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 1227228)
   Memory: 0B
   CGroup: /system.slice/iptables.service

Apr 21 08:00:48 e26-h37-740xd.alias.bos.scalelab.redhat.com systemd[1]: Starting IPv4 firewall with iptables...
Apr 21 08:00:48 e26-h37-740xd.alias.bos.scalelab.redhat.com iptables.init[8506]: iptables: Applying firewall rules: [  OK  ]
Apr 21 08:00:48 e26-h37-740xd.alias.bos.scalelab.redhat.com systemd[1]: Started IPv4 firewall with iptables.

Configure iptables:
[kni@e26-h37-740xd ~]$ sudo cat /etc/sysconfig/iptables
*nat
:PREROUTING ACCEPT [2:132]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
-A POSTROUTING -j MASQUERADE
COMMIT
*filter
:INPUT ACCEPT [1:328]
:FORWARD ACCEPT [4:564]
:OUTPUT ACCEPT [61:6744]
-A INPUT -j ACCEPT
-A FORWARD -j ACCEPT
COMMIT

Restart and enable iptables:

[kni@e26-h37-740xd ~]$ sudo systemctl restart iptables

[kni@e26-h37-740xd ~]$ sudo systemctl enable iptables
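
You can confirm that the MASQUERADE rule is loaded in the NAT table:

[kni@e26-h37-740xd ~]$ sudo iptables -t nat -L POSTROUTING -n -v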

Forwarding

Check before:

[kni@e26-h37-740xd ~]$ cat /proc/sys/net/ipv4/ip_forward
0

Enable the forwarding:

[kni@e26-h37-740xd ~]$ echo "net.ipv4.ip_forward = 1" | sudo tee -a /etc/sysctl.conf
net.ipv4.ip_forward = 1

[kni@e26-h37-740xd ~]$ sudo sysctl -p
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv4.ip_forward = 1

Check after:

[kni@e26-h37-740xd ~]$ cat /proc/sys/net/ipv4/ip_forward
1

Libvirt preparation for the bootstrap virtual machine

We need to install libvirt that will be used by the OpenShift installer.

Install the following packages:

[kni@e26-h37-740xd ~]$ sudo dnf install -y libvirt qemu-kvm mkisofs python3-devel jq ipmitool tmux

Add the kni user to the libvirt group:

[kni@e26-h37-740xd ~]$ sudo usermod --append --groups libvirt kni

Start libvirt:

[kni@e26-h37-740xd ~]$ sudo systemctl start libvirtd
[kni@e26-h37-740xd ~]$ sudo systemctl enable libvirtd --now

Create the default storage pool and start it:

[kni@e26-h37-740xd ~]$ sudo virsh pool-define-as --name default --type dir --target /var/lib/libvirt/images
Pool default defined

[kni@e26-h37-740xd ~]$ sudo virsh pool-start default
Pool default started

[kni@e26-h37-740xd ~]$ sudo virsh pool-autostart default
Pool default marked as autostarted
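
You can verify that the pool is active and set to autostart:

[kni@e26-h37-740xd ~]$ sudo virsh pool-list --all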

Networking configuration

We will create two bridges, “baremetal” and “provisioning”, connected to the respective networks.

First, check the status before changing the network configuration:

[kni@e26-h37-740xd ~]$ ip r
default via 10.19.97.254 dev eno3 proto dhcp metric 102 
10.19.96.0/23 dev eno3 proto kernel scope link src 10.19.96.186 metric 102 
172.22.0.0/24 dev eno2 proto kernel scope link src 172.22.0.1 metric 106 
192.168.0.0/24 dev eno1 proto kernel scope link src 192.168.0.1 metric 107 
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown 

[kni@e26-h37-740xd ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:de brd ff:ff:ff:ff:ff:ff
    inet 10.19.96.186/23 brd 10.19.97.255 scope global dynamic noprefixroute eno3
       valid_lft 257768sec preferred_lft 257768sec
3: eno4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e4:43:4b:b2:77:df brd ff:ff:ff:ff:ff:ff
4: ens7f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 40:a6:b7:00:df:00 brd ff:ff:ff:ff:ff:ff
5: ens7f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 40:a6:b7:00:df:01 brd ff:ff:ff:ff:ff:ff
6: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:dc brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.1/24 brd 192.168.0.255 scope global noprefixroute eno1
       valid_lft forever preferred_lft forever
7: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:dd brd ff:ff:ff:ff:ff:ff
    inet 172.22.0.1/24 brd 172.22.0.255 scope global noprefixroute eno2
       valid_lft forever preferred_lft forever
16: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:08:31:83 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
17: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:08:31:83 brd ff:ff:ff:ff:ff:ff

Show the active connections with the Network Manager CLI:

[kni@e26-h37-740xd ~]$ sudo nmcli con show
NAME         UUID                                  TYPE      DEVICE 
System eno3  24871ea9-4411-efbd-924f-49cd9fbda6e2  ethernet  eno3   
eno1         abf4c85b-57cc-4484-4fa9-b4a71689c359  ethernet  eno1   
eno2         b186f945-cc80-911d-668c-b51be8596980  ethernet  eno2   
virbr0       626ca2a3-a4a1-4ded-b7a6-37f841b3f931  bridge    virbr0 

Configure networking:

[kni@e26-h37-740xd ~]$ export PUB_CONN=eno1

[kni@e26-h37-740xd ~]$ export PROV_CONN=eno2

[kni@e26-h37-740xd ~]$ sudo nmcli con down $PROV_CONN
[kni@e26-h37-740xd ~]$ sudo nmcli con delete $PROV_CONN

[kni@e26-h37-740xd ~]$ sudo nmcli con down $PUB_CONN
[kni@e26-h37-740xd ~]$ sudo nmcli con delete $PUB_CONN

[kni@e26-h37-740xd ~]$ sudo nmcli connection add ifname provisioning type bridge con-name provisioning
[kni@e26-h37-740xd ~]$ sudo nmcli con add type bridge-slave ifname $PROV_CONN master provisioning

[kni@e26-h37-740xd ~]$ sudo nmcli connection add ifname baremetal type bridge con-name baremetal
[kni@e26-h37-740xd ~]$ sudo nmcli con add type bridge-slave ifname $PUB_CONN master baremetal

[kni@e26-h37-740xd ~]$ sudo pkill dhclient

[kni@e26-h37-740xd ~]$ sudo nmcli con mod baremetal connection.autoconnect yes ipv4.method manual ipv4.addresses 192.168.0.1/24 ipv4.dns 192.168.0.1
[kni@e26-h37-740xd ~]$ sudo nmcli con up baremetal

[kni@e26-h37-740xd ~]$ sudo nmcli con mod provisioning connection.autoconnect yes ipv4.method manual ipv4.addresses 172.22.0.1/24
[kni@e26-h37-740xd ~]$ sudo nmcli con up provisioning

Show the active connections with the Network Manager CLI:

[kni@e26-h37-740xd ~]$ sudo nmcli con show
NAME               UUID                                  TYPE      DEVICE       
System eno3        24871ea9-4411-efbd-924f-49cd9fbda6e2  ethernet  eno3         
baremetal          70c6b297-abcc-45cb-b7bb-ecb15bcf532f  bridge    baremetal    
provisioning       0dac0189-1498-4de5-9135-19b2db4503ea  bridge    provisioning 
virbr0             626ca2a3-a4a1-4ded-b7a6-37f841b3f931  bridge    virbr0       
bridge-slave-eno1  982056a8-9015-4ef7-ae02-35f7e8de61e1  ethernet  eno1         
bridge-slave-eno2  f145eb95-115b-484b-9bec-34bccffa306c  ethernet  eno2

[kni@e26-h37-740xd ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:de brd ff:ff:ff:ff:ff:ff
    inet 10.19.96.186/23 brd 10.19.97.255 scope global dynamic noprefixroute eno3
       valid_lft 257249sec preferred_lft 257249sec
3: eno4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e4:43:4b:b2:77:df brd ff:ff:ff:ff:ff:ff
4: ens7f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 40:a6:b7:00:df:00 brd ff:ff:ff:ff:ff:ff
5: ens7f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 40:a6:b7:00:df:01 brd ff:ff:ff:ff:ff:ff
6: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master baremetal state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:dc brd ff:ff:ff:ff:ff:ff
7: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master provisioning state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:dd brd ff:ff:ff:ff:ff:ff
16: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:08:31:83 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
17: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:08:31:83 brd ff:ff:ff:ff:ff:ff
22: provisioning: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:dd brd ff:ff:ff:ff:ff:ff
    inet 172.22.0.1/24 brd 172.22.0.255 scope global noprefixroute provisioning
       valid_lft forever preferred_lft forever
    inet6 fe80::f46e:d32f:c08e:a291/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
23: baremetal: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:dc brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.1/24 brd 192.168.0.255 scope global noprefixroute baremetal
       valid_lft forever preferred_lft forever
    inet6 fe80::c467:dc53:acad:e902/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

DHCP installation

[kni@e26-h37-740xd ~]$ sudo dnf -y install dhcp-server 
[kni@e26-h37-740xd ~]$ sudo cp /usr/lib/systemd/system/dhcpd.service /etc/systemd/system/

Add your interface (here “baremetal”) in the ExecStart line:

[kni@e26-h37-740xd ~]$ sudo cat /etc/systemd/system/dhcpd.service
[Unit]
Description=DHCPv4 Server Daemon
Documentation=man:dhcpd(8) man:dhcpd.conf(5)
Wants=network-online.target
After=network-online.target
After=time-sync.target

[Service]
Type=notify
EnvironmentFile=-/etc/sysconfig/dhcpd
ExecStart=/usr/sbin/dhcpd -f -cf /etc/dhcp/dhcpd.conf -user dhcpd -group dhcpd --no-pid $DHCPDARGS baremetal
StandardError=null

[Install]
WantedBy=multi-user.target
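
Since the unit file was copied and edited under /etc/systemd/system, reload systemd so the modified unit is taken into account:

[kni@e26-h37-740xd ~]$ sudo systemctl daemon-reload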

Prepare the DHCP configuration

[kni@e26-h37-740xd ~]$ sudo cat /etc/dhcp/dhcpd.conf
option domain-name     "alias.bos.scalelab.redhat.com";

# Declare DHCP Server
authoritative;
allow booting;
allow bootp;
allow unknown-clients;
default-lease-time -1;
max-lease-time -1;

# The default DHCP lease time
default-lease-time 600;

# Set the maximum lease time
max-lease-time 7200;

# Set Network address, subnet mask and gateway

subnet 192.168.0.0 netmask 255.255.0.0 {
  range 192.168.0.100 192.168.0.200;
  option domain-name-servers 192.168.0.1;
  option routers 192.168.0.1;
  max-lease-time 172800;
}

group {
  # master e25-h31
  host e25-h31-740xd.alias.bos.scalelab.redhat.com {
      hardware ethernet e4:43:4b:b2:77:d4;
      fixed-address 192.168.0.60;
      ddns-hostname "e25-h31-740xd.alias.bos.scalelab.redhat.com";
  }
  # master e26-h33
  host e26-h33-740xd.alias.bos.scalelab.redhat.com {
      hardware ethernet e4:43:4b:b2:75:c0;
      fixed-address 192.168.0.61;
      ddns-hostname "e26-h33-740xd.alias.bos.scalelab.redhat.com";
  }
  # master e26-h35
  host e26-h35-740xd.alias.bos.scalelab.redhat.com {
      hardware ethernet e4:43:4b:b2:74:88;
      fixed-address 192.168.0.62;
      ddns-hostname "e26-h35-740xd.alias.bos.scalelab.redhat.com";
  }
  # worker e25-h11
  host e25-h11-740xd.alias.bos.scalelab.redhat.com {
      hardware ethernet e4:43:4b:27:ee:0c;
      fixed-address 192.168.0.63;
      ddns-hostname "e25-h11-740xd.alias.bos.scalelab.redhat.com";
  }
  # worker e26-h09 
  host e26-h09-740xd.alias.bos.scalelab.redhat.com {
      hardware ethernet e4:43:4b:b2:78:04;
      fixed-address 192.168.0.64;
      ddns-hostname "e26-h09-740xd.alias.bos.scalelab.redhat.com";
  }
  # worker e26-h11
  host e26-h11-740xd.alias.bos.scalelab.redhat.com {
      hardware ethernet e4:43:4b:b2:77:7c;
      fixed-address 192.168.0.65;
      ddns-hostname "e26-h11-740xd.alias.bos.scalelab.redhat.com";
  }
}
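
Optionally, you can ask dhcpd to validate the configuration before starting the service, which catches syntax errors early:

[kni@e26-h37-740xd ~]$ sudo dhcpd -t -cf /etc/dhcp/dhcpd.conf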

dhcpd is not started for now:

[kni@e26-h37-740xd ~]$ sudo systemctl status dhcpd
● dhcpd.service - DHCPv4 Server Daemon
   Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:dhcpd(8)
           man:dhcpd.conf(5)

[kni@e26-h37-740xd ~]$ sudo systemctl restart dhcpd

Start the DHCP daemon:

[kni@e26-h37-740xd ~]$ sudo systemctl enable dhcpd
Created symlink /etc/systemd/system/multi-user.target.wants/dhcpd.service → /etc/systemd/system/dhcpd.service.
[kni@e26-h37-740xd ~]$ sudo systemctl start dhcpd

DNS

Install the bind packages:

[kni@e26-h37-740xd ~]$ sudo dnf install bind-utils bind -y
[root@e26-h37-740xd named]# cat /etc/named.conf 
//
// named.conf
//
// Provided by Red Hat bind package to configure the ISC BIND named(8) DNS
// server as a caching only nameserver (as a localhost DNS resolver only).
//
// See /usr/share/doc/bind*/sample/ for example named configuration files.
//
// See the BIND Administrator's Reference Manual (ARM) for details about the
// configuration located in /usr/share/doc/bind-{version}/Bv9ARM.html

options {
        forwarders { 10.19.96.1; };
        listen-on port 53 { any; };
        listen-on-v6 { none; };
        #listen-on-v6 port 53 { ::1; };
        directory       "/var/named";
        dump-file       "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
        recursing-file  "/var/named/data/named.recursing";
        secroots-file   "/var/named/data/named.secroots";
        allow-query     { any; };
        allow-query-cache { any; };

        /*
         - If you are building an AUTHORITATIVE DNS server, do NOT enable recursion.
         - If you are building a RECURSIVE (caching) DNS server, you need to enable
           recursion.
         - If your recursive DNS server has a public IP address, you MUST enable access
           control to limit queries to your legitimate users. Failing to do so will
           cause your server to become part of large scale DNS amplification
           attacks. Implementing BCP38 within your network would greatly
           reduce such attack surface
        */
        recursion yes;

        dnssec-enable no;
        dnssec-validation no;

        /* Path to ISC DLV key */
        bindkeys-file "/etc/named.iscdlv.key";

        managed-keys-directory "/var/named/dynamic";

        pid-file "/run/named/named.pid";
        session-keyfile "/run/named/session.key";
};

logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
};

zone "." IN {
        type hint;
        file "named.ca";
};


zone "lan.redhat.com" in {
      type master;
      file "lan.redhat.com.zone";
      notify no;
};

zone "0.16.172.in-addr.arpa" IN {
    type master;
    file "0.16.172.in-addr.arpa";
};

include "/etc/named.rfc1912.zones";
include "/etc/named.root.key";
[kni@e26-h37-740xd ~]$ sudo cat /var/named/lan.redhat.com.zone
$TTL 10
lan.redhat.com. IN SOA      gateway  root.lan.redhat.com. (
            2021041509  ; serial
            1D          ; refresh
            2H          ; retry
            1W          ; expiry
            2D )        ; minimum
            IN NS       gateway

*                            IN A        192.168.0.12
*.alias                      IN A        192.168.0.12
api.alias                    IN A        192.168.0.10
api-int.alias                IN A        192.168.0.10
dns.alias                    IN A        192.168.0.1
provisioner.alias            IN A        192.168.0.6
openshift-master-0.alias     IN A        192.168.0.60
openshift-master-1.alias     IN A        192.168.0.61
openshift-master-2.alias     IN A        192.168.0.62
openshift-worker-0.alias     IN A        192.168.0.63
openshift-worker-1.alias     IN A        192.168.0.64
openshift-worker-2.alias     IN A        192.168.0.65
[root@e26-h37-740xd named]# sudo vi /var/named/0.16.172.in-addr.arpa 
$ORIGIN 0.168.192.in-addr.arpa.
$TTL 10
@  IN  SOA  ns1.alias.lan.redhat.com.  root.alias.lan.redhat.com. (
       2001062502  ; serial
       21600       ; refresh after 6 hours
       3600        ; retry after 1 hour
       604800      ; expire after 1 week
       86400 )     ; minimum TTL of 1 day
;
        IN      NS       ns1.alias.lan.redhat.com.
6       IN      PTR      provisioner.alias.lan.redhat.com.
10      IN      PTR      api.alias.lan.redhat.com.
1       IN      PTR      ns1.alias.lan.redhat.com.
60      IN      PTR      openshift-master-0.alias.lan.redhat.com.
61      IN      PTR      openshift-master-1.alias.lan.redhat.com.
62      IN      PTR      openshift-master-2.openshift.example.com.
63      IN      PTR      openshift-worker-0.openshift.example.com.
64      IN      PTR      openshift-worker-1.openshift.example.com.
65      IN      PTR      openshift-worker-2.openshift.example.com.
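
Before starting named, you can check the main configuration and the forward zone file for syntax errors:

[kni@e26-h37-740xd ~]$ sudo named-checkconf /etc/named.conf
[kni@e26-h37-740xd ~]$ sudo named-checkzone lan.redhat.com /var/named/lan.redhat.com.zone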

Start and enable named:

[kni@e26-h37-740xd ~]$ sudo systemctl restart named

[kni@e26-h37-740xd ~]$ sudo systemctl enable named
Created symlink /etc/systemd/system/multi-user.target.wants/named.service → /usr/lib/systemd/system/named.service.

Test named:

[kni@e26-h37-740xd ~]$ dig +short @192.168.0.1 api.alias.bos.scalelab.redhat.com
192.168.0.10

[kni@e26-h37-740xd ~]$ dig +short @192.168.0.1 ingress.alias.bos.scalelab.redhat.com
192.168.0.12

[kni@e26-h37-740xd ~]$ dig @192.168.0.1 -x 192.168.0.60 +noall +answer
60.0.168.192.in-addr.arpa. 10 IN  PTR openshift-master-0.alias.lan.redhat.com.

Test the reverse IPs for the nodes:

[kni@e26-h37-740xd ~]$ dig @192.168.0.1 -x 192.168.0.60 +noall +answer
60.0.168.192.in-addr.arpa. 10 IN  PTR e25-h31-740xd.alias.bos.scalelab.redhat.com.

[kni@e26-h37-740xd ~]$ dig @192.168.0.1 -x 192.168.0.61 +noall +answer
61.0.168.192.in-addr.arpa. 10 IN  PTR e26-h33-740xd.alias.bos.scalelab.redhat.com.

[kni@e26-h37-740xd ~]$ dig @192.168.0.1 -x 192.168.0.62 +noall +answer
62.0.168.192.in-addr.arpa. 10 IN  PTR e26-h35-740xd.alias.bos.scalelab.redhat.com.

[kni@e26-h37-740xd ~]$ dig @192.168.0.1 -x 192.168.0.63 +noall +answer
63.0.168.192.in-addr.arpa. 10 IN  PTR e25-h11-740xd.alias.bos.scalelab.redhat.com.

[kni@e26-h37-740xd ~]$ dig @192.168.0.1 -x 192.168.0.64 +noall +answer
64.0.168.192.in-addr.arpa. 10 IN  PTR e26-h09-740xd.alias.bos.scalelab.redhat.com.

[kni@e26-h37-740xd ~]$ dig @192.168.0.1 -x 192.168.0.65 +noall +answer
65.0.168.192.in-addr.arpa. 10 IN  PTR e26-h11-740xd.alias.bos.scalelab.redhat.com.

Test the DNS entries for the nodes:

[kni@e26-h37-740xd ~]$ dig +short @192.168.0.1 e26-h31-740xd.alias.bos.scalelab.redhat.com
192.168.0.60

[kni@e26-h37-740xd ~]$ dig +short @192.168.0.1 e26-h33-740xd.alias.bos.scalelab.redhat.com
192.168.0.61

[kni@e26-h37-740xd ~]$ dig +short @192.168.0.1 e26-h35-740xd.alias.bos.scalelab.redhat.com
192.168.0.62

[kni@e26-h37-740xd ~]$ dig +short @192.168.0.1 e25-h11-740xd.alias.bos.scalelab.redhat.com
192.168.0.63

[kni@e26-h37-740xd ~]$ dig +short @192.168.0.1 e26-h09-740xd.alias.bos.scalelab.redhat.com
192.168.0.64

[kni@e26-h37-740xd ~]$ dig +short @192.168.0.1 e26-h11-740xd.alias.bos.scalelab.redhat.com
192.168.0.65

OS preparation

Generate an ssh key for the kni user:

[kni@e26-h37-740xd ~]$ ssh-keygen -t ed25519 -f /home/kni/.ssh/id_rsa -N ''
Generating public/private ed25519 key pair.
Your identification has been saved in /home/kni/.ssh/id_rsa.
Your public key has been saved in /home/kni/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:8+uGuom2hWVYF1doC99QE5RzV4CQgXGRwCyvAcSk+rU kni@e26-h37-740xd.alias.bos.scalelab.redhat.com
The key's randomart image is:
+--[ED25519 256]--+
|   +o  o++*@*+..o|
|   .o . =+* +.. .|
|  .  ..o.+ + o . |
| .   o... o .    |
|.   o ooS        |
| . . =.  o       |
|  . E .  ..      |
|    .o .. ..     |
|   .o.+o oo      |
+----[SHA256]-----+

Create a pull-secret.txt file from https://cloud.redhat.com/openshift/install/metal/user-provisioned

[kni@e26-h37-740xd ~]$ cat pull-secret.txt 
{"auths":{"cloud.openshift.com":{"auth":"b3Blbn
...
[kni@e26-h37-740xd ~]$ export VERSION=4.7.7

[kni@e26-h37-740xd ~]$ export RELEASE_IMAGE=$(curl -s https://mirror.openshift.com/pub/openshift-v4/clients/ocp/$VERSION/release.txt | grep 'Pull From: quay.io' | awk -F ' ' '{print $3}')
[kni@e26-h37-740xd ~]$ echo $RELEASE_IMAGE
quay.io/openshift-release-dev/ocp-release@sha256:aee8055875707962203197c4306e69b024bea1a44fa09ea2c2c621e8c5000794

Set the environment variables:

[kni@e26-h37-740xd ~]$ export cmd=openshift-baremetal-install
[kni@e26-h37-740xd ~]$ export pullsecret_file=~/pull-secret.txt
[kni@e26-h37-740xd ~]$ export extract_dir=$(pwd)
[kni@e26-h37-740xd ~]$ curl -s https://mirror.openshift.com/pub/openshift-v4/clients/ocp/$VERSION/openshift-client-linux.tar.gz | tar zxvf - oc
[kni@e26-h37-740xd ~]$ sudo cp oc /usr/local/bin
[kni@e26-h37-740xd ~]$ rm oc

Extract the installer:

[kni@e26-h37-740xd ~]$ oc adm release extract --registry-config "${pullsecret_file}" --command=$cmd --to "${extract_dir}" ${RELEASE_IMAGE}
[kni@e26-h37-740xd ~]$ sudo cp openshift-baremetal-install /usr/local/bin
[kni@e26-h37-740xd ~]$ rm openshift-baremetal-install

Create the clusterconfigs folder:

[kni@e26-h37-740xd ~]$ mkdir ~/clusterconfigs

Prepare some simple bash scripts that will help check the servers’ power status:

[kni@e26-h37-740xd ~]$ cat ipmi_test.sh 
#!/bin/bash
echo -n "master: "; ipmitool -I lanplus -U scalelab -P XXXXXX -H management-e26-h31-740xd.alias.bos.scalelab.redhat.com power status
echo -n "master: "; ipmitool -I lanplus -U scalelab -P XXXXXX -H management-e26-h33-740xd.alias.bos.scalelab.redhat.com power status 
echo -n "master: "; ipmitool -I lanplus -U scalelab -P XXXXXX -H management-e26-h35-740xd.alias.bos.scalelab.redhat.com power status
echo -n "worker: "; ipmitool -I lanplus -U scalelab -P XXXXXX -H management-e25-h11-740xd.alias.bos.scalelab.redhat.com power status
echo -n "worker: "; ipmitool -I lanplus -U scalelab -P XXXXXX -H management-e26-h09-740xd.alias.bos.scalelab.redhat.com power status

[kni@e26-h37-740xd ~]$ cat ipmi_off.sh 
#!/bin/bash
ipmitool -I lanplus -U scalelab -P XXXXXX -H management-e26-h31-740xd.alias.bos.scalelab.redhat.com power off
ipmitool -I lanplus -U scalelab -P XXXXXX -H management-e26-h33-740xd.alias.bos.scalelab.redhat.com power off
ipmitool -I lanplus -U scalelab -P XXXXXX -H management-e26-h35-740xd.alias.bos.scalelab.redhat.com power off
ipmitool -I lanplus -U scalelab -P XXXXXX -H management-e25-h11-740xd.alias.bos.scalelab.redhat.com power off
ipmitool -I lanplus -U scalelab -P XXXXXX -H management-e26-h09-740xd.alias.bos.scalelab.redhat.com power off
ipmitool -I lanplus -U scalelab -P XXXXXX -H management-e26-h11-740xd.alias.bos.scalelab.redhat.com power off

[kni@e26-h37-740xd ~]$ cat clean_bootstrap.sh 
#!/bin/bash
for i in $(sudo virsh list | tail -n +3 | grep bootstrap | awk {'print $2'});
do
  sudo virsh destroy $i;
  sudo virsh undefine $i;
  sudo virsh vol-delete $i --pool default;
  sudo virsh vol-delete $i.ign --pool default;
done

Install the OCP 4.7.7 cluster

Check that all the nodes are powered off:

[kni@e26-h37-740xd ~]$ ./ipmi_test.sh 
Chassis Power is off
Chassis Power is off
Chassis Power is off
Chassis Power is off
Chassis Power is off
Chassis Power is off

Check version:

[kni@e26-h37-740xd ~]$ oc version
Client Version: 4.7.7

[kni@e26-h37-740xd ~]$ openshift-baremetal-install version
openshift-baremetal-install 4.7.7
built from commit fae650e24e7036b333b2b2d9dfb5a08a29cd07b1
release image quay.io/openshift-release-dev/ocp-release@sha256:aee8055875707962203197c4306e69b024bea1a44fa09ea2c2c621e8c5000794

Additional requirements with no provisioning network: All installer-provisioned installations require a baremetal network. The baremetal network is a routable network used for external network access to the outside world. In addition to the IP address supplied to the OpenShift Container Platform cluster node, installations without a provisioning network require the following:

  • Setting an available IP address from the baremetal network to the bootstrapProvisioningIP configuration setting within the install-config.yaml configuration file.
  • Setting an available IP address from the baremetal network to the provisioningHostIP configuration setting within the install-config.yaml configuration file.
  • Deploying the OpenShift Container Platform cluster using RedFish Virtual Media/iDRAC Virtual Media.

Prepare the install-config.yaml:

[kni@e26-h37-740xd ~]$ cat install-config.yaml
apiVersion: v1
baseDomain: bos.scalelab.redhat.com
metadata:
  name: alias
networking:
  machineCIDR: 192.168.0.0/24
  networkType: OVNKubernetes
compute:
- name: worker
  replicas: 3
controlPlane:
  name: master
  replicas: 3
  platform:
    baremetal: {}
platform:
  baremetal:
    apiVIP: 192.168.0.10
    ingressVIP: 192.168.0.12
    provisioningNetworkInterface: eno2
    provisioningDHCPRange: 172.22.0.20,172.22.0.80
    provisioningNetworkCIDR: 172.22.0.0/24
    hosts:
      - name: e26-h31-740xd
        role: master
        bmc:
          address: ipmi://management-e26-h31-740xd.alias.bos.scalelab.redhat.com
          username: "scalelab"
          password: "XXXXXX"
        bootMACAddress: "E4:43:4B:B2:77:D5"
        rootDeviceHints:
          deviceName: "/dev/sda"
      - name: e26-h33-740xd
        role: master
        bmc:
          address: ipmi://management-e26-h33-740xd.alias.bos.scalelab.redhat.com
          username: "scalelab"
          password: "XXXXXX"
        bootMACAddress: "E4:43:4B:B2:75:C1"
        rootDeviceHints:
          deviceName: "/dev/sda"
      - name: e26-h35-740xd
        role: master
        bmc:
          address: ipmi://management-e26-h35-740xd.alias.bos.scalelab.redhat.com
          username: "scalelab"
          password: "XXXXXX"
        bootMACAddress: "E4:43:4B:B2:74:89"
        rootDeviceHints:
          deviceName: "/dev/sda"
      - name: e25-h11-740xd
        role: worker
        bmc:
          address: ipmi://management-e25-h11-740xd.alias.bos.scalelab.redhat.com
          username: "scalelab"
          password: "XXXXXX"
        bootMACAddress: "E4:43:4B:27:EE:0D"
        rootDeviceHints:
          deviceName: "/dev/sda"
        hardwareProfile: unknown
      - name: e26-h09-740xd
        role: worker
        bmc:
          address: ipmi://management-e26-h09-740xd.alias.bos.scalelab.redhat.com
          username: "scalelab"
          password: "XXXXXX"
        bootMACAddress: "E4:43:4B:B2:78:05"
        rootDeviceHints:
          deviceName: "/dev/sda"
        hardwareProfile: unknown
      - name: e26-h11-740xd
        role: worker
        bmc:
          address: ipmi://management-e26-h11-740xd.alias.bos.scalelab.redhat.com
          username: "scalelab"
          password: "XXXXXX"
        bootMACAddress: "E4:43:4B:B2:77:7D"
        rootDeviceHints:
          deviceName: "/dev/sda"
        hardwareProfile: unknown
pullSecret: '{"auths":{"cloud.openshift.com":{"auth":"b3BlbnNoaWZ0LXXXXXXEpQTw==","email":"mymail@redhat.com"},"quay.io":{"auth":"b3BlbnNXXXXXXXXXXXXXXHYzVE1FeFpVMlNGYjFQNUtMSXZ4QktqNGpFVV9lYnE5aEpoMmNHNmFDanIxbklUcUJkdUJwSzNEYUhJQk5ILWNQNm5WM25sV1dUWlloN2ZDLVJwOV96RXdtWnBsSHJieUlBQTh5eVlYYjBSdWJjSmlJdHRJd19oaW1GSVRDZktIem96TUdmWnh1Uk5V","email":"mymail@redhat.com"}}}'
sshKey: 'ssh-ed25519 AAAAC3NzaC1lXXXXXXXXXXXXXXXXI2R7TGN1cCGM mymail@redhat.com'

Check before Installation

Your installer node should have its DNS resolver pointing to your DNS server:

[kni@e26-h37-740xd ~]$ cat /etc/resolv.conf 
# Generated by NetworkManager
search alias.bos.scalelab.redhat.com
nameserver 192.168.0.1
[kni@e26-h37-740xd ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:de brd ff:ff:ff:ff:ff:ff
    inet 10.19.96.186/23 brd 10.19.97.255 scope global dynamic noprefixroute eno3
       valid_lft 245072sec preferred_lft 245072sec
3: eno4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e4:43:4b:b2:77:df brd ff:ff:ff:ff:ff:ff
4: ens7f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 40:a6:b7:00:df:00 brd ff:ff:ff:ff:ff:ff
5: ens7f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 40:a6:b7:00:df:01 brd ff:ff:ff:ff:ff:ff
6: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master baremetal state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:dc brd ff:ff:ff:ff:ff:ff
7: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master provisioning state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:dd brd ff:ff:ff:ff:ff:ff
8: provisioning: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:dd brd ff:ff:ff:ff:ff:ff
    inet 172.22.0.1/24 brd 172.22.0.255 scope global noprefixroute provisioning
       valid_lft forever preferred_lft forever
    inet6 fe80::f46e:d32f:c08e:a291/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
9: baremetal: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether e4:43:4b:b2:77:dc brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.1/24 brd 192.168.0.255 scope global noprefixroute baremetal
       valid_lft forever preferred_lft forever
    inet6 fe80::c467:dc53:acad:e902/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
10: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:08:31:83 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
11: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:08:31:83 brd ff:ff:ff:ff:ff:ff

[kni@e26-h37-740xd ~]$ ip r
default via 10.19.97.254 dev eno3 proto dhcp metric 100 
10.19.96.0/23 dev eno3 proto kernel scope link src 10.19.96.186 metric 100 
172.22.0.0/24 dev provisioning proto kernel scope link src 172.22.0.1 metric 425 
192.168.0.0/24 dev baremetal proto kernel scope link src 192.168.0.1 metric 426 
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown 

[kni@e26-h37-740xd ~]$ sudo nmcli con show
NAME                UUID                                  TYPE      DEVICE       
System eno3         24871ea9-4411-efbd-924f-49cd9fbda6e2  ethernet  eno3         
baremetal           70c6b297-abcc-45cb-b7bb-ecb15bcf532f  bridge    baremetal    
provisioning        0dac0189-1498-4de5-9135-19b2db4503ea  bridge    provisioning 
virbr0              2281c177-1221-45c0-9596-c14b77db6270  bridge    virbr0       
bridge-slave-eno1   982056a8-9015-4ef7-ae02-35f7e8de61e1  ethernet  eno1         
bridge-slave-eno2   f145eb95-115b-484b-9bec-34bccffa306c  ethernet  eno2         
Wired connection 1  22bfc603-b540-3e00-ad0d-b1eaf2d80baf  ethernet  --    

You should have no provisioning VMs running:

[kni@e26-h37-740xd ~]$ sudo virsh list --all

Copy the install-config.yaml:

[kni@e26-h37-740xd ~]$ cp install-config.yaml clusterconfigs/

Prepare manifests:

[kni@e26-h37-740xd ~]$ openshift-baremetal-install --dir ~/clusterconfigs create manifests
INFO Consuming Install Config from target directory 
WARNING Discarding the Openshift Manifests that was provided in the target directory because its dependencies are dirty and it needs to be regenerated 
INFO Manifests created in: /home/kni/clusterconfigs/manifests and /home/kni/clusterconfigs/openshift 

[kni@e26-h37-740xd ~]$ ls clusterconfigs
manifests  openshift

Installation with openshift-baremetal-install

Launch the installation in a tmux session:

[kni@e26-h37-740xd ~]$ time openshift-baremetal-install --dir ~/clusterconfigs --log-level debug create cluster
...
DEBUG Cluster is initialized                       
INFO Waiting up to 10m0s for the openshift-console route to be created... 
DEBUG Route found in openshift-console namespace: console 
DEBUG OpenShift console route is admitted          
INFO Install complete!                            
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/kni/clusterconfigs/auth/kubeconfig' 
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.alias.bos.scalelab.redhat.com 
INFO Login to the console with user: "kubeadmin", and password: "P5qsq-XXXX-XXXX-k6MmS" 
DEBUG Time elapsed per stage:                      
DEBUG     Infrastructure: 22m52s                   
DEBUG Bootstrap Complete: 24m22s                   
DEBUG  Bootstrap Destroy: 11s                      
DEBUG  Cluster Operators: 43m46s                   
INFO Time elapsed: 1h31m16s                       

real    91m16.234s
user    1m14.445s
sys     0m12.741s
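
If the installation fails during the bootstrap phase, the installer can collect logs from the bootstrap VM and the control plane hosts; not needed here, but useful to know:

[kni@e26-h37-740xd ~]$ openshift-baremetal-install --dir ~/clusterconfigs gather bootstrap
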
To reach the cluster from my laptop, I route the traffic with sshuttle:

egallen@notebook ~ % sshuttle -r e26-h37-740xd 192.168.0.0/24 --dns
[local sudo] Password: 
c : Connected to server.

OCP Console (screenshots)

Checks during the installation

oc command checks during the OpenShift installation

You can find the credentials here:

[kni@e26-h37-740xd ~]$ ls clusterconfigs/auth/
kubeadmin-password  kubeconfig

[kni@e26-h37-740xd ~]$ export KUBECONFIG=/home/kni/clusterconfigs/auth/kubeconfig

No nodes are registered at first (the API is not yet reachable); a few minutes later the master nodes appear:

[kni@e26-h37-740xd ~]$ oc get nodes
The connection to the server api.alias.lan.redhat.com:6443 was refused - did you specify the right host or port?
[kni@e26-h37-740xd clusterconfigs]$ oc get nodes
NAME                                          STATUS   ROLES    AGE   VERSION
e26-h31-740xd.alias.bos.scalelab.redhat.com   Ready    master   13m   v1.20.0+c8905da
e26-h33-740xd.alias.bos.scalelab.redhat.com   Ready    master   13m   v1.20.0+c8905da
e26-h35-740xd.alias.bos.scalelab.redhat.com   Ready    master   13m   v1.20.0+c8905da

[kni@e26-h37-740xd clusterconfigs]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          46m     Working towards 4.7.7: 643 of 668 done (96% complete)
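
You can also watch the cluster operators converge while the installation finishes:

[kni@e26-h37-740xd clusterconfigs]$ oc get clusteroperators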

ssh to the OCP nodes

During the installation you can ssh into the nodes as the core user, using the ssh key referenced in your install-config.yaml:

You can connect to the master nodes:
[kni@e26-h37-740xd ~]$ ssh core@192.168.0.60
The authenticity of host '192.168.0.60 (192.168.0.60)' can't be established.
ECDSA key fingerprint is SHA256:bzuDchHotqfgTJ87jbz3iGZZwRuo9FW+AQudeTERMQU.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.0.60' (ECDSA) to the list of known hosts.
Red Hat Enterprise Linux CoreOS 47.83.202104090345-0
  Part of OpenShift 4.7, RHCOS is a Kubernetes native operating system
  managed by the Machine Config Operator (`clusteroperator/machine-config`).

WARNING: Direct SSH access to machines is not recommended; instead,
make configuration changes via `machineconfig` objects:
  https://docs.openshift.com/container-platform/4.7/architecture/architecture-rhcos.html

---
Last login: Wed Apr 21 13:31:21 2021 from 192.168.0.1
[systemd]
Failed Units: 1
  NetworkManager-wait-online.service

And the bare metal nodes are in a powered-off state:

[kni@e26-h37-740xd ~]$ ./ipmi_test.sh 
master: Chassis Power is off
master: Chassis Power is off
master: Chassis Power is off
worker: Chassis Power is off
worker: Chassis Power is off
worker: Chassis Power is off

Creating infrastructure resources

Bootstrap VM created after one minute (0h41):

[kni@e26-h37-740xd ~]$ sudo virsh list
 Id    Name                           State
----------------------------------------------------
 16    alias-xkrrv-bootstrap          running

You can get the IP of the bootstrap VM from the DHCP server logs:

[kni@e26-h37-740xd ~]$ sudo cat /var/log/messages  | grep 192.168.0 | tail -1
Apr 21 23:40:38 e26-h37-740xd dhcpd[22700]: DHCPACK on 192.168.0.183 to 52:54:00:e6:c1:6a via baremetal

We can connect into the bootstrap VM:

[kni@e26-h37-740xd ~]$ ssh core@192.168.0.183
The authenticity of host '192.168.0.183 (192.168.0.183)' can't be established.
ECDSA key fingerprint is SHA256:fnpxW6x0JrTn/wQjsueQRMDtW8nH5dl0y1m3HKEZaa0.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.0.183' (ECDSA) to the list of known hosts.
Red Hat Enterprise Linux CoreOS 47.83.202103251640-0
  Part of OpenShift 4.7, RHCOS is a Kubernetes native operating system
  managed by the Machine Config Operator (`clusteroperator/machine-config`).

WARNING: Direct SSH access to machines is not recommended; instead,
make configuration changes via `machineconfig` objects:
  https://docs.openshift.com/container-platform/4.7/architecture/architecture-rhcos.html

---
This is the bootstrap node; it will be destroyed when the master is fully up.

The primary services are release-image.service followed by bootkube.service. To watch their status, run e.g.

  journalctl -b -f -u release-image.service -u bootkube.service
[core@localhost ~]$ 

You can check the bootstrap node activities:

[core@localhost ~]$ journalctl -b -f -u release-image.service -u bootkube.service
-- Logs begin at Wed 2021-04-21 23:39:52 UTC. --
Apr 21 23:41:07 localhost systemd[1]: Starting Download the OpenShift Release Image...
Apr 21 23:41:07 localhost release-image-download.sh[1806]: Pulling quay.io/openshift-release-dev/ocp-release@sha256:aee8055875707962203197c4306e69b024bea1a44fa09ea2c2c621e8c5000794...
Apr 21 23:41:13 localhost release-image-download.sh[1806]: 78ffdba41a1edb82011287b1d27edaadf3405ae24dde7fef4be47459d419f679
Apr 21 23:41:13 localhost systemd[1]: Started Download the OpenShift Release Image.
Apr 21 23:41:26 localhost systemd[1]: Started Bootstrap a Kubernetes cluster.

We can check the network configuration; the bootstrap VM has interfaces on both the baremetal and provisioning networks:

[core@localhost ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:e6:c1:6a brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.183/16 brd 192.168.255.255 scope global dynamic noprefixroute ens3
       valid_lft 520sec preferred_lft 520sec
    inet6 fe80::4f39:fc6b:65b2:f457/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:96:7a:c0 brd ff:ff:ff:ff:ff:ff
    inet 172.22.0.2/24 brd 172.22.0.255 scope global noprefixroute ens4
       valid_lft forever preferred_lft forever
    inet6 fe80::4077:e7ad:76c7:7ca9/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
4: cni-podman0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether c2:7e:1e:8e:5b:f5 brd ff:ff:ff:ff:ff:ff
    inet 10.88.0.1/16 brd 10.88.255.255 scope global cni-podman0
       valid_lft forever preferred_lft forever
    inet6 fe80::c07e:1eff:fe8e:5bf5/64 scope link 
       valid_lft forever preferred_lft forever
 
[core@localhost ~]$ ip r
default via 192.168.0.1 dev ens3 proto dhcp metric 100 
10.88.0.0/16 dev cni-podman0 proto kernel scope link src 10.88.0.1 linkdown 
172.22.0.0/24 dev ens4 proto kernel scope link src 172.22.0.2 metric 101 
192.168.0.0/16 dev ens3 proto kernel scope link src 192.168.0.183 metric 100 

Get the metal3 baremetalhost list:

[kni@e26-h37-740xd ~]$ oc get bmh -A
NAMESPACE               NAME            STATUS   PROVISIONING STATUS      CONSUMER                     BMC                                                       HARDWARE PROFILE   ONLINE   ERROR
openshift-machine-api   e25-h11-740xd   OK       provisioned              alias-xkrrv-worker-0-gwg44   ipmi://management-e25-h11-740xd.alias.bos.scalelab.redhat.com   unknown            true     
openshift-machine-api   e26-h09-740xd   OK       provisioned              alias-xkrrv-worker-0-mmbvr   ipmi://management-e26-h09-740xd.alias.bos.scalelab.redhat.com   unknown            true     
openshift-machine-api   e26-h11-740xd   OK       provisioned              alias-xkrrv-worker-0-pffpj   ipmi://management-e26-h11-740xd.alias.bos.scalelab.redhat.com   unknown            true     
openshift-machine-api   e26-h31-740xd   OK       externally provisioned   alias-xkrrv-master-0         ipmi://management-e26-h31-740xd.alias.bos.scalelab.redhat.com                      true     
openshift-machine-api   e26-h35-740xd   OK       externally provisioned   alias-xkrrv-master-2         ipmi://management-e26-h35-740xd.alias.bos.scalelab.redhat.com                      true     
openshift-machine-api   e26-h33-740xd   OK       externally provisioned   alias-xkrrv-master-1         ipmi://management-e26-h33-740xd.alias.bos.scalelab.redhat.com                      true   
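
If a node stays stuck in a provisioning state, describing its BareMetalHost or listing the machines gives more detail (here for the first worker, as an example):

[kni@e26-h37-740xd ~]$ oc describe bmh e25-h11-740xd -n openshift-machine-api
[kni@e26-h37-740xd ~]$ oc get machines -n openshift-machine-api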

Accessing the OpenShift Web Console after the installation

We can get the OpenShift Web Console URL with this CLI command:

[kni@e26-h37-740xd ~]$ oc whoami --show-console
https://console-openshift-console.apps.alias.bos.scalelab.redhat.com

The kubeadmin password is printed at the end of the installation; if you have lost it, you will find it here:

[kni@e26-h37-740xd ~]$ cat /home/kni/clusterconfigs/auth/kubeadmin-password 
P5qsq-XXXX-XXXX-k6MmS
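
With this password you can also log in from the CLI instead of using the kubeconfig file:

[kni@e26-h37-740xd ~]$ oc login -u kubeadmin -p "$(cat /home/kni/clusterconfigs/auth/kubeadmin-password)" https://api.alias.bos.scalelab.redhat.com:6443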

On your laptop, add this to your hosts file if you are not using the same DNS server as the platform (the --dns option of the sshuttle command below can do part of the job):

echo "192.168.0.12     console-openshift-console.apps.alias.bos.scalelab.redhat.com oauth-openshift.apps.alias.bos.scalelab.redhat.com integrated-oauth-server-openshift-authentication.apps.alias.bos.scalelab.redhat.com prometheus-k8s-openshift-monitoring.apps.alias.bos.scalelab.redhat.com grafana-openshift-monitoring.apps.alias.bos.scalelab.redhat.com " >> /etc/hosts

I’m routing the traffic from my laptop:

laptop ~ %  sshuttle -r e26-h37-740xd 192.168.0.0/24 --dns 
[local sudo] Password: 
c : Connected to server.

NTP

I’ve got this NTP warning on the console:

Clock on e26-h09-740xd.alias.bos.scalelab.redhat.com is not synchronising. Ensure NTP is configured on this host.
Clock on e26-h11-740xd.alias.bos.scalelab.redhat.com is not synchronising. Ensure NTP is configured on this host.

From the doc: https://docs.openshift.com/container-platform/4.7/installing/install_config/installing-customizing.html#installation-special-config-chrony_installing-customizing

[kni@e26-h37-740xd ~]$ cat << EOF | base64
pool clock.redhat.com iburst 
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF

cG9vbCBjbG9jay5yZWRoYXQuY29tIGlidXJzdCAKZHJpZnRmaWxlIC92YXIvbGliL2Nocm9ueS9k
cmlmdAptYWtlc3RlcCAxLjAgMwpydGNzeW5jCmxvZ2RpciAvdmFyL2xvZy9jaHJvbnkK
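
If you save the chrony settings in a file, the -w0 option of GNU base64 produces the encoding on a single line, which avoids having to remove the newline by hand (assuming the settings are stored in a file such as ~/chrony.conf):

[kni@e26-h37-740xd ~]$ base64 -w0 ~/chrony.conf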

Take the previous string and add it to the “source” field (remove the newline):

cat << EOF > ./99-masters-chrony-configuration.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-masters-chrony-configuration
spec:
  config:
    ignition:
      config: {}
      security:
        tls: {}
      timeouts: {}
      version: 3.2.0
    networkd: {}
    passwd: {}
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,cG9vbCBjbG9jay5yZWRoYXQuY29tIGlidXJzdCAKZHJpZnRmaWxlIC92YXIvbGliL2Nocm9ueS9kcmlmdAptYWtlc3RlcCAxLjAgMwpydGNzeW5jCmxvZ2RpciAvdmFyL2xvZy9jaHJvbnkK
        mode: 420
        overwrite: true
        path: /etc/chrony.conf
  osImageURL: ""
EOF

Check the nodes status:

[kni@e26-h37-740xd ~]$ oc get nodes
NAME                                          STATUS   ROLES    AGE   VERSION
e25-h11-740xd.alias.bos.scalelab.redhat.com   Ready    worker   15h   v1.20.0+c8905da
e26-h09-740xd.alias.bos.scalelab.redhat.com   Ready    worker   15h   v1.20.0+c8905da
e26-h11-740xd.alias.bos.scalelab.redhat.com   Ready    worker   15h   v1.20.0+c8905da
e26-h31-740xd.alias.bos.scalelab.redhat.com   Ready    master   16h   v1.20.0+c8905da
e26-h33-740xd.alias.bos.scalelab.redhat.com   Ready    master   16h   v1.20.0+c8905da
e26-h35-740xd.alias.bos.scalelab.redhat.com   Ready    master   16h   v1.20.0+c8905da

Check the configuration before:

[kni@e26-h37-740xd ~]$ oc debug node/e26-h31-740xd.alias.bos.scalelab.redhat.com
Starting pod/e26-h31-740xdaliasbosscalelabredhatcom-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.0.60
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host

sh-4.4# find /etc | grep chrony
/etc/chrony.conf
/etc/chrony.keys
/etc/NetworkManager/dispatcher.d/20-chrony
/etc/NetworkManager/dispatcher.d/20-coreos-chrony-dhcp
/etc/dhcp/dhclient.d/chrony.sh
/etc/logrotate.d/chrony
/etc/selinux/targeted/active/modules/100/chronyd
/etc/selinux/targeted/active/modules/100/chronyd/cil
/etc/selinux/targeted/active/modules/100/chronyd/hll
/etc/selinux/targeted/active/modules/100/chronyd/lang_ext
/etc/sysconfig/chronyd
/etc/systemd/system/multi-user.target.wants/chronyd.service

sh-4.4# grep pool /etc/chrony.conf
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
pool 2.rhel.pool.ntp.org iburst

sh-4.4# grep -ri clock.redhat.com /etc 
sh-4.4# 

Apply the file:

[kni@e26-h37-740xd ~]$ oc apply -f ./99-masters-chrony-configuration.yaml
machineconfig.machineconfiguration.openshift.io/99-masters-chrony-configuration created
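
You can follow the rollout through the machine config pools; the master pool reports UPDATING until all the masters have rebooted with the new configuration:

[kni@e26-h37-740xd ~]$ oc get machineconfigpools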

Scheduling is disabled on each node in turn; the first master to reboot is e26-h31-740xd:

[kni@e26-h37-740xd ~]$ oc get nodes
NAME                                          STATUS                     ROLES    AGE   VERSION
e25-h11-740xd.alias.bos.scalelab.redhat.com   Ready                      worker   15h   v1.20.0+c8905da
e26-h09-740xd.alias.bos.scalelab.redhat.com   Ready                      worker   15h   v1.20.0+c8905da
e26-h11-740xd.alias.bos.scalelab.redhat.com   Ready                      worker   15h   v1.20.0+c8905da
e26-h31-740xd.alias.bos.scalelab.redhat.com   Ready,SchedulingDisabled   master   16h   v1.20.0+c8905da
e26-h33-740xd.alias.bos.scalelab.redhat.com   Ready                      master   16h   v1.20.0+c8905da
e26-h35-740xd.alias.bos.scalelab.redhat.com   Ready                      master   16h   v1.20.0+c8905da

[kni@e26-h37-740xd ~]$ oc get nodes
NAME                                          STATUS                        ROLES    AGE   VERSION
e25-h11-740xd.alias.bos.scalelab.redhat.com   Ready                         worker   15h   v1.20.0+c8905da
e26-h09-740xd.alias.bos.scalelab.redhat.com   Ready                         worker   15h   v1.20.0+c8905da
e26-h11-740xd.alias.bos.scalelab.redhat.com   Ready                         worker   15h   v1.20.0+c8905da
e26-h31-740xd.alias.bos.scalelab.redhat.com   NotReady,SchedulingDisabled   master   16h   v1.20.0+c8905da
e26-h33-740xd.alias.bos.scalelab.redhat.com   Ready                         master   16h   v1.20.0+c8905da
e26-h35-740xd.alias.bos.scalelab.redhat.com   Ready                         master   16h   v1.20.0+c8905da

During each node update: NTP (console screenshot)

and the OCP bare metal server is rebooted:

[kni@e26-h37-740xd ~]$ ping e26-h31-740xd.alias.bos.scalelab.redhat.com
PING e26-h31-740xd.alias.bos.scalelab.redhat.com (192.168.0.60) 56(84) bytes of data.
64 bytes from e25-h31-740xd.alias.bos.scalelab.redhat.com (192.168.0.60): icmp_seq=1 ttl=64 time=0.161 ms
64 bytes from e25-h31-740xd.alias.bos.scalelab.redhat.com (192.168.0.60): icmp_seq=2 ttl=64 time=0.192 ms
64 bytes from e25-h31-740xd.alias.bos.scalelab.redhat.com (192.168.0.60): icmp_seq=3 ttl=64 time=0.178 ms
64 bytes from e25-h31-740xd.alias.bos.scalelab.redhat.com (192.168.0.60): icmp_seq=4 ttl=64 time=0.116 ms
64 bytes from e25-h31-740xd.alias.bos.scalelab.redhat.com (192.168.0.60): icmp_seq=5 ttl=64 time=0.175 ms
From ns1.alias.bos.scalelab.redhat.com (192.168.0.1) icmp_seq=30 Destination Host Unreachable
From ns1.alias.bos.scalelab.redhat.com (192.168.0.1) icmp_seq=31 Destination Host Unreachable
From ns1.alias.bos.scalelab.redhat.com (192.168.0.1) icmp_seq=32 Destination Host Unreachable
...
From ns1.alias.bos.scalelab.redhat.com (192.168.0.1) icmp_seq=286 Destination Host Unreachable
From ns1.alias.bos.scalelab.redhat.com (192.168.0.1) icmp_seq=287 Destination Host Unreachable
64 bytes from e25-h31-740xd.alias.bos.scalelab.redhat.com (192.168.0.60): icmp_seq=288 ttl=64 time=1024 ms
64 bytes from e25-h31-740xd.alias.bos.scalelab.redhat.com (192.168.0.60): icmp_seq=289 ttl=64 time=0.302 ms
64 bytes from e25-h31-740xd.alias.bos.scalelab.redhat.com (192.168.0.60): icmp_seq=290 ttl=64 time=0.221 ms
64 bytes from e25-h31-740xd.alias.bos.scalelab.redhat.com (192.168.0.60): icmp_seq=291 ttl=64 time=0.166 ms
64 bytes from e25-h31-740xd.alias.bos.scalelab.redhat.com (192.168.0.60): icmp_seq=292 ttl=64 time=0.165 ms
64 bytes from e25-h31-740xd.alias.bos.scalelab.redhat.com (192.168.0.60): icmp_seq=293 ttl=64 time=0.166 ms
64 bytes from e25-h31-740xd.alias.bos.scalelab.redhat.com (192.168.0.60): icmp_seq=294 ttl=64 time=0.167 ms

The node e26-h31-740xd is moving back to a Ready status:

[kni@e26-h37-740xd ~]$ oc get nodes
NAME                                          STATUS   ROLES    AGE   VERSION
e25-h11-740xd.alias.bos.scalelab.redhat.com   Ready    worker   15h   v1.20.0+c8905da
e26-h09-740xd.alias.bos.scalelab.redhat.com   Ready    worker   15h   v1.20.0+c8905da
e26-h11-740xd.alias.bos.scalelab.redhat.com   Ready    worker   15h   v1.20.0+c8905da
e26-h31-740xd.alias.bos.scalelab.redhat.com   Ready    master   16h   v1.20.0+c8905da
e26-h33-740xd.alias.bos.scalelab.redhat.com   Ready    master   16h   v1.20.0+c8905da
e26-h35-740xd.alias.bos.scalelab.redhat.com   Ready    master   16h   v1.20.0+c8905da

Check the configuration applied on the first master configured, e26-h31-740xd:

[kni@e26-h37-740xd ~]$ oc debug node/e26-h31-740xd.alias.bos.scalelab.redhat.com
Starting pod/e26-h31-740xdaliasbosscalelabredhatcom-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.0.60
If you don't see a command prompt, try pressing enter.

sh-4.4#  chroot /host

sh-4.4#  find /etc | grep chrony
/etc/chrony.conf
/etc/chrony.keys
/etc/NetworkManager/dispatcher.d/20-chrony
/etc/NetworkManager/dispatcher.d/20-coreos-chrony-dhcp
/etc/dhcp/dhclient.d/chrony.sh
/etc/logrotate.d/chrony
/etc/selinux/targeted/active/modules/100/chronyd
/etc/selinux/targeted/active/modules/100/chronyd/cil
/etc/selinux/targeted/active/modules/100/chronyd/hll
/etc/selinux/targeted/active/modules/100/chronyd/lang_ext
/etc/sysconfig/chronyd
/etc/systemd/system/multi-user.target.wants/chronyd.service
/etc/machine-config-daemon/orig/etc/chrony.conf.mcdorig

sh-4.4# grep pool /etc/chrony.conf
pool clock.redhat.com iburst 

sh-4.4# grep -ri clock.redhat.com /etc 
/etc/chrony.conf:pool clock.redhat.com iburst 

sh-4.4# cat /etc/chrony.conf 
pool clock.redhat.com iburst 
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony

A second master, e26-h33-740xd, is rebooting:

[kni@e26-h37-740xd ~]$ oc get nodes
NAME                                          STATUS                     ROLES    AGE   VERSION
e25-h11-740xd.alias.bos.scalelab.redhat.com   Ready                      worker   15h   v1.20.0+c8905da
e26-h09-740xd.alias.bos.scalelab.redhat.com   Ready                      worker   15h   v1.20.0+c8905da
e26-h11-740xd.alias.bos.scalelab.redhat.com   Ready                      worker   15h   v1.20.0+c8905da
e26-h31-740xd.alias.bos.scalelab.redhat.com   Ready                      master   16h   v1.20.0+c8905da
e26-h33-740xd.alias.bos.scalelab.redhat.com   Ready,SchedulingDisabled   master   16h   v1.20.0+c8905da
e26-h35-740xd.alias.bos.scalelab.redhat.com   Ready                      master   16h   v1.20.0+c8905da

[kni@e26-h37-740xd ~]$ oc get nodes
NAME                                          STATUS                        ROLES    AGE   VERSION
e25-h11-740xd.alias.bos.scalelab.redhat.com   Ready                         worker   15h   v1.20.0+c8905da
e26-h09-740xd.alias.bos.scalelab.redhat.com   Ready                         worker   15h   v1.20.0+c8905da
e26-h11-740xd.alias.bos.scalelab.redhat.com   Ready                         worker   15h   v1.20.0+c8905da
e26-h31-740xd.alias.bos.scalelab.redhat.com   Ready                         master   16h   v1.20.0+c8905da
e26-h33-740xd.alias.bos.scalelab.redhat.com   NotReady,SchedulingDisabled   master   16h   v1.20.0+c8905da
e26-h35-740xd.alias.bos.scalelab.redhat.com   Ready                         master   16h   v1.20.0+c8905da

We can now apply the NTP change to workers.

The master configuration is enabled:

[kni@e26-h37-740xd ~]$ oc get mc | grep chrony
99-masters-chrony-configuration

Copy the configuration:

[kni@e26-h37-740xd ~]$ cp 99-masters-chrony-configuration.yaml 99-workers-chrony-configuration.yaml 

Replace “master” with “worker” in the configuration:
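
A one-line sed can make the substitution (a minimal sketch, assuming GNU sed and the file copied above):

sed -i 's/master/worker/g' 99-workers-chrony-configuration.yaml

The resulting file: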

[kni@e26-h37-740xd ~]$ cat 99-workers-chrony-configuration.yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-workers-chrony-configuration
spec:
  config:
    ignition:
      config: {}
      security:
        tls: {}
      timeouts: {}
      version: 3.2.0
    networkd: {}
    passwd: {}
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,cG9vbCBjbG9jay5yZWRoYXQuY29tIGlidXJzdCAKZHJpZnRmaWxlIC92YXIvbGliL2Nocm9ueS9kcmlmdAptYWtlc3RlcCAxLjAgMwpydGNzeW5jCmxvZ2RpciAvdmFyL2xvZy9jaHJvbnkK
        mode: 420
        overwrite: true
        path: /etc/chrony.conf
  osImageURL: ""
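
To double-check what will end up in /etc/chrony.conf, you can decode the base64 payload embedded in the MachineConfig (the same source string as above):

# prints the same chrony.conf as on the masters
echo "cG9vbCBjbG9jay5yZWRoYXQuY29tIGlidXJzdCAKZHJpZnRmaWxlIC92YXIvbGliL2Nocm9ueS9kcmlmdAptYWtlc3RlcCAxLjAgMwpydGNzeW5jCmxvZ2RpciAvdmFyL2xvZy9jaHJvbnkK" | base64 -d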

Apply the configuration:

[kni@e26-h37-740xd ~]$ oc apply -f ./99-workers-chrony-configuration.yaml
machineconfig.machineconfiguration.openshift.io/99-workers-chrony-configuration created

After all the worker nodes have rebooted with the new configuration, NTP is configured across the whole cluster.

Badfish

This step is not mandatory; it depends on your hardware.

We will use Badfish to set the correct boot order on the Dell PowerEdge R740xd servers. We want PXE boot on eno1 first, followed by the disk boot. This operation can be done manually in the BIOS, but it is better to automate all of the server preparation.

The tool's source code is available here: https://github.com/redhat-performance/badfish

First we install Podman:

[kni@e26-h37-740xd badfish]$ sudo dnf install podman -y

Prepare environment variables:

export DNS_IP=192.168.0.1
export USER=quads
export PASSWORD=XXXXXX
export HOST=management-e26-h31-740xd.alias.bos.scalelab.redhat.com

We can check the current boot order of each master/worker node:

[kni@e26-h37-740xd ~]$ sudo podman run -it --rm --dns $DNS_IP quay.io/quads/badfish -H $HOST -u $USER -p $PASSWORD --check-boot
- INFO     - Current boot order:
- INFO     - 1: HardDisk.List.1-1
- INFO     - 2: NIC.Integrated.1-3-1
- INFO     - 3: NIC.Slot.7-2-1
- INFO     - 4: NIC.Slot.7-1-1
- INFO     - 5: NIC.Integrated.1-1-1
- INFO     - 6: NIC.Integrated.1-2-1

We apply the ‘director’ profile, which permanently changes the boot order to put the eno1 PXE interface first:

[kni@e26-h37-740xd ~]$ sudo podman run -it --rm --dns $DNS_IP quay.io/quads/badfish -H $HOST -u $USER -p $PASSWORD -i config/idrac_interfaces.yml -t director
- INFO     - Job queue for iDRAC management-e26-h31-740xd.alias.bos.scalelab.redhat.com successfully cleared.
- INFO     - PATCH command passed to update boot order.
- INFO     - POST command passed to create target config job.
- INFO     - Command passed to On server, code return is 204.

We can apply this configuration to all nodes:

export HOST=management-e26-h31-740xd.alias.bos.scalelab.redhat.com
sudo podman run -it --rm --dns $DNS_IP quay.io/quads/badfish -H $HOST -u $USER -p $PASSWORD -i config/idrac_interfaces.yml -t director

export HOST=management-e26-h33-740xd.alias.bos.scalelab.redhat.com
sudo podman run -it --rm --dns $DNS_IP quay.io/quads/badfish -H $HOST -u $USER -p $PASSWORD -i config/idrac_interfaces.yml -t director

export HOST=management-e26-h35-740xd.alias.bos.scalelab.redhat.com
sudo podman run -it --rm --dns $DNS_IP quay.io/quads/badfish -H $HOST -u $USER -p $PASSWORD -i config/idrac_interfaces.yml -t director

export HOST=management-e25-h11-740xd.alias.bos.scalelab.redhat.com
sudo podman run -it --rm --dns $DNS_IP quay.io/quads/badfish -H $HOST -u $USER -p $PASSWORD -i config/idrac_interfaces.yml -t director

export HOST=management-e26-h09-740xd.alias.bos.scalelab.redhat.com
sudo podman run -it --rm --dns $DNS_IP quay.io/quads/badfish -H $HOST -u $USER -p $PASSWORD -i config/idrac_interfaces.yml -t director

export HOST=management-e26-h11-740xd.alias.bos.scalelab.redhat.com
sudo podman run -it --rm --dns $DNS_IP quay.io/quads/badfish -H $HOST -u $USER -p $PASSWORD -i config/idrac_interfaces.yml -t director
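
The same operation as a loop, if you prefer not to repeat the export for each host (a sketch using the hostnames above):

for HOST in management-e26-h31-740xd management-e26-h33-740xd management-e26-h35-740xd \
            management-e25-h11-740xd management-e26-h09-740xd management-e26-h11-740xd; do
  sudo podman run -it --rm --dns $DNS_IP quay.io/quads/badfish \
    -H ${HOST}.alias.bos.scalelab.redhat.com -u $USER -p $PASSWORD \
    -i config/idrac_interfaces.yml -t director
done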

Each node reboots after the badfish command: Badfish

A Lifecycle Controller screen shows the Automated Task Application running the BIOS update: Badfish

At the end, we can check each node and validate that the PXE boot order is correct:

[kni@e26-h37-740xd ~]$ sudo podman run -it --rm --dns $DNS_IP quay.io/quads/badfish -H $HOST -u $USER -p $PASSWORD --check-boot
- INFO     - Current boot order:
- INFO     - 1: NIC.Integrated.1-1-1
- INFO     - 2: NIC.Integrated.1-2-1
- INFO     - 3: NIC.Slot.7-2-1
- INFO     - 4: NIC.Slot.7-1-1
- INFO     - 5: NIC.Integrated.1-3-1
- INFO     - 6: HardDisk.List.1-1

Entitlement of the OCP nodes

Test before entitlement with oc debug

First, check that there are no subscription files on the RHEL CoreOS nodes:

[kni@e26-h37-740xd ~]$ oc debug node/e25-h11-740xd.alias.bos.scalelab.redhat.com
Starting pod/e25-h11-740xdaliasbosscalelabredhatcom-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.0.63
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# ls -la /etc/rhsm/rhsm.conf /etc/pki/entitlement/entitlement.pem /etc/pki/entitlement/entitlement-key.pem
ls: cannot access '/etc/rhsm/rhsm.conf': No such file or directory
ls: cannot access '/etc/pki/entitlement/entitlement.pem': No such file or directory
ls: cannot access '/etc/pki/entitlement/entitlement-key.pem': No such file or directory
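
To run the same check on every worker node in one shot, oc debug can take a command directly (a sketch, assuming the default node-role.kubernetes.io/worker label):

for node in $(oc get nodes -l node-role.kubernetes.io/worker -o name); do
  echo "== $node =="
  oc debug $node -- chroot /host ls /etc/pki/entitlement 2>&1 | tail -1
done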

Test before the entitlement with the testing pod

Download the testing yaml:

[kni@e26-h37-740xd ~]$ curl -O https://raw.githubusercontent.com/openshift-psap/blog-artifacts/master/how-to-use-entitled-builds-with-ubi/0004-cluster-wide-entitled-pod.yaml

Run the subscription test:

[kni@e26-h37-740xd ~]$ oc create -f 0004-cluster-wide-entitled-pod.yaml
pod/cluster-entitled-build-pod created

[kni@e26-h37-740xd ~]$ oc get pods
NAME                         READY   STATUS              RESTARTS   AGE
cluster-entitled-build-pod   0/1     ContainerCreating   0          6s

[kni@e26-h37-740xd ~]$ oc get pods
NAME                         READY   STATUS    RESTARTS   AGE
cluster-entitled-build-pod   1/1     Running   0          16s


[kni@e26-h37-740xd ~]$ oc get pods
NAME                         READY   STATUS      RESTARTS   AGE
cluster-entitled-build-pod   0/1     Completed   0          21s

Check the result; as expected, the system is not registered:

[kni@e26-h37-740xd ~]$ oc logs cluster-entitled-build-pod 
Updating Subscription Management repositories.
Unable to read consumer identity
Subscription Manager is operating in container mode.

This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.

Red Hat Universal Base Image 8 (RPMs) - BaseOS  791 kB/s | 778 kB     00:00    
Red Hat Universal Base Image 8 (RPMs) - AppStre 4.0 MB/s | 5.2 MB     00:01    
Red Hat Universal Base Image 8 (RPMs) - CodeRea  25 kB/s |  15 kB     00:00    
No matches found.

Delete the testing pod:
[kni@e26-h37-740xd ~]$ oc delete -f 0004-cluster-wide-entitled-pod.yaml 
pod "cluster-entitled-build-pod" deleted

Download the pem certificate for node entitlement

To install the GPU Operator, we have to entitle the nodes.

Red Hat uses the subscription model to allow customers to download Red Hat tested and certified enterprise software. This way customers are supplied with the latest patches, bug fixes, updates and upgrades.

UBI is a subset of the packages in a RHEL distribution. To have access to all the packages needed to build a sophisticated container image, the build needs access to all repositories, and this is where entitled builds help.

Go to https://access.redhat.com/management/systems and download the subscription certificates attached to your RHEL system:

Entitlement

Entitlement

Unzip the subscription file:

[kni@e26-h37-740xd ~]$ unzip ababab1212-1212-ab12-12ab-121212ab12_certificates.zip 
Archive:  ababab1212-1212-ab12-12ab-121212ab12_certificates.zip
signed Candlepin export for ababab1212-1212-ab12-12ab-121212ab12
  inflating: consumer_export.zip     
  inflating: signature               
[kni@e26-h37-740xd ~]$ unzip consumer_export.zip 
Archive:  consumer_export.zip
Candlepin export for ababab1212-1212-ab12-12ab-121212ab12
  inflating: export/meta.json        
  inflating: export/entitlement_certificates/12121212121212112.pem

[kni@e26-h37-740xd ~]$ cp export/entitlement_certificates/12121212121212112.pem ~/entitlement.pem

Check the content of the entitlement file (certificate, entitlement data and private key):

[kni@e26-h37-740xd ~]$ cat entitlement.pem
-----BEGIN CERTIFICATE-----
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
-----END CERTIFICATE-----
-----BEGIN ENTITLEMENT DATA----
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
-----END ENTITLEMENT DATA-----
-----BEGIN RSA SIGNATURE-----
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
-----END RSA SIGNATURE-----
-----BEGIN RSA PRIVATE KEY-----
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXx
-----END RSA PRIVATE KEY-----
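
Optionally, you can verify the certificate part of the file with openssl (a quick sanity check):

openssl x509 -in entitlement.pem -noout -subject -enddate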

Download the cluster-wide entitlement MachineConfig template and the rhsm.conf file:

[kni@e26-h37-740xd ~]$ curl -O https://gitlab.com/kpouget_psap/deploy-cluster/-/raw/master/utils/entitlement/cluster-wide.entitlement.machineconfigs.yaml.template

[kni@e26-h37-740xd ~]$ curl -O https://gitlab.com/kpouget_psap/deploy-cluster/-/raw/master/utils/entitlement/rhsm.conf

Prepare a small Python tool (Thanks Kevin):

[kni@e26-h37-740xd ~]$ cat prepare-entitlement.sh
#!/usr/bin/python3
# Read the MachineConfig template on stdin (or from the file given as argument)
# and substitute the base64-encoded entitlement.pem and rhsm.conf into it.
import fileinput, base64
PEM = "entitlement.pem"
RHSM = "rhsm.conf"
b64_pem = base64.b64encode(open(PEM, "rb").read()).decode()
b64_rhsm = base64.b64encode(open(RHSM, "rb").read()).decode()
for line in fileinput.input():
    print(line.replace("BASE64_ENCODED_PEM_FILE", b64_pem)
              .replace("BASE64_ENCODED_RHSM_FILE", b64_rhsm), end="")

Set execute permission on the Python script:

[kni@e26-h37-740xd ~]$ chmod 755 prepare-entitlement.sh 

Generate your template:

python3 prepare-entitlement.sh cluster-wide.entitlement.machineconfigs.yaml.template > cluster-wide.entitlement.machineconfigs.yaml
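
To confirm that the substitution happened, the two placeholder strings used by the script should no longer appear in the generated file:

grep -c "BASE64_ENCODED" cluster-wide.entitlement.machineconfigs.yaml   # expect 0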

Entitle your nodes

Check the MachineConfig status before the enablement:

[kni@e26-h37-740xd ~]$ oc get machineconfig | grep entitlement
[kni@e26-h37-740xd ~]$ 

Enable the subscriptions:

[kni@e26-h37-740xd ~]$ oc create -f cluster-wide.entitlement.machineconfigs.yaml 
machineconfig.machineconfiguration.openshift.io/50-rhsm-conf created
machineconfig.machineconfiguration.openshift.io/50-entitlement-pem created
machineconfig.machineconfiguration.openshift.io/50-entitlement-key-pem created

!!! This enablement can take some time because the bare metal worker nodes are rebooted one by one.
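
You can follow the rolling reboot from the provisioning node (a sketch, polling every 10 seconds):

watch -n 10 'oc get mcp; oc get nodes'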

Test after the entitlement with oc debug

We can see the machine config:

[kni@e26-h37-740xd ~]$ oc get machineconfig | grep entitlement
50-entitlement-key-pem                                                                        2.2.0             3m22s
50-entitlement-pem                                                                            2.2.0             3m22s

We can find the entitlement files rhsm.conf, entitlement.pem and entitlement-key.pem created on the node:

[kni@e26-h37-740xd ~]$  oc debug node/e25-h11-740xd.alias.bos.scalelab.redhat.com
Starting pod/e25-h11-740xdaliasbosscalelabredhatcom-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.0.63
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# ls -la /etc/rhsm/rhsm.conf /etc/pki/entitlement/entitlement.pem /etc/pki/entitlement/entitlement-key.pem
-rw-r--r--. 1 root root 154960 Apr 22 09:59 /etc/pki/entitlement/entitlement-key.pem
-rw-r--r--. 1 root root 154960 Apr 22 09:59 /etc/pki/entitlement/entitlement.pem
-rw-r--r--. 1 root root   2975 Apr 22 09:59 /etc/rhsm/rhsm.conf

Test the entitlement:

[kni@e26-h37-740xd ~]$ oc create -f 0004-cluster-wide-entitled-pod.yaml
pod/cluster-entitled-build-pod created

[kni@e26-h37-740xd ~]$ oc logs cluster-entitled-build-pod 
Updating Subscription Management repositories.
Unable to read consumer identity
Subscription Manager is operating in container mode.
Red Hat Enterprise Linux 8 for x86_64 - AppStre 9.0 MB/s |  26 MB     00:02    
Red Hat Enterprise Linux 8 for x86_64 - BaseOS   20 MB/s |  29 MB     00:01    
Red Hat Universal Base Image 8 (RPMs) - BaseOS  760 kB/s | 778 kB     00:01    
Red Hat Universal Base Image 8 (RPMs) - AppStre 4.2 MB/s | 5.2 MB     00:01    
Red Hat Universal Base Image 8 (RPMs) - CodeRea  24 kB/s |  15 kB     00:00    
====================== Name Exactly Matched: kernel-devel ======================
kernel-devel-4.18.0-80.1.2.el8_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-80.el8.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-80.4.2.el8_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-80.7.1.el8_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-80.11.1.el8_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-147.el8.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-80.11.2.el8_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-80.7.2.el8_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-147.0.3.el8_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-147.8.1.el8_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-147.0.2.el8_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-147.3.1.el8_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-147.5.1.el8_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-193.el8.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-193.14.3.el8_2.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-193.13.2.el8_2.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-193.1.2.el8_2.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-193.19.1.el8_2.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-193.6.3.el8_2.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-240.el8.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-193.28.1.el8_2.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-240.1.1.el8_3.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-240.8.1.el8_3.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-240.10.1.el8_3.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-240.15.1.el8_3.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-240.22.1.el8_3.x86_64 : Development package for building kernel modules to match the kernel

If you want to see how the machine config applied the configuration, you can select one machine-config-daemon pod running on a worker node:

[kni@e26-h37-740xd ~]$ oc get pod -n openshift-machine-config-operator -o wide
NAME                                         READY   STATUS    RESTARTS   AGE   IP             NODE                                          NOMINATED NODE   READINESS GATES
machine-config-controller-6979c74884-6nvrz   1/1     Running   1          15h   10.130.0.17    e26-h33-740xd.alias.bos.scalelab.redhat.com   <none>           <none>
machine-config-daemon-7mszr                  2/2     Running   0          15h   192.168.0.65   e26-h11-740xd.alias.bos.scalelab.redhat.com   <none>           <none>
machine-config-daemon-94w9z                  2/2     Running   0          15h   192.168.0.64   e26-h09-740xd.alias.bos.scalelab.redhat.com   <none>           <none>
machine-config-daemon-gwrqv                  2/2     Running   0          15h   192.168.0.61   e26-h33-740xd.alias.bos.scalelab.redhat.com   <none>           <none>
machine-config-daemon-lrswf                  2/2     Running   0          15h   192.168.0.63   e25-h11-740xd.alias.bos.scalelab.redhat.com   <none>           <none>
machine-config-daemon-nf6dp                  2/2     Running   0          15h   192.168.0.60   e26-h31-740xd.alias.bos.scalelab.redhat.com   <none>           <none>
machine-config-daemon-wqfzn                  2/2     Running   0          15h   192.168.0.62   e26-h35-740xd.alias.bos.scalelab.redhat.com   <none>           <none>
machine-config-operator-688698d6db-qpg8w     1/1     Running   1          16h   10.129.0.10    e26-h35-740xd.alias.bos.scalelab.redhat.com   <none>           <none>
machine-config-server-5flv2                  1/1     Running   0          15h   192.168.0.61   e26-h33-740xd.alias.bos.scalelab.redhat.com   <none>           <none>
machine-config-server-ns88m                  1/1     Running   0          15h   192.168.0.60   e26-h31-740xd.alias.bos.scalelab.redhat.com   <none>           <none>
machine-config-server-p4jv2                  1/1     Running   0          15h   192.168.0.62   e26-h35-740xd.alias.bos.scalelab.redhat.com   <none>           <none>

and you can check its logs showing the entitlement files being written:

[kni@e26-h37-740xd ~]$ oc logs -n openshift-machine-config-operator machine-config-daemon-lrswf machine-config-daemon | grep update.go
I0422 09:58:00.781195    6045 update.go:1904] Update prepared; beginning drain
I0422 09:59:19.466364    6045 update.go:1904] drain complete
I0422 09:59:19.469188    6045 update.go:234] Successful drain took 78.685270222 seconds
I0422 09:59:19.469203    6045 update.go:1219] Updating files
I0422 09:59:19.494994    6045 update.go:1616] Writing file "/etc/NetworkManager/conf.d/99-keyfiles.conf"
I0422 09:59:19.496997    6045 update.go:1616] Writing file "/etc/NetworkManager/dispatcher.d/40-mdns-hostname"
...
I0422 09:59:19.572551    6045 update.go:1616] Writing file "/etc/kubernetes/cloud.conf"
I0422 09:59:19.573543    6045 update.go:1616] Writing file "/etc/kubernetes/kubelet.conf"
I0422 09:59:19.574632    6045 update.go:1616] Writing file "/etc/pki/entitlement/entitlement-key.pem"
I0422 09:59:19.593331    6045 update.go:1616] Writing file "/etc/pki/entitlement/entitlement.pem"
...

We can get the console URL here:

[kni@e26-h37-740xd ~]$ oc whoami --show-console
https://console-openshift-console.apps.alias.bos.scalelab.redhat.com

The kubeadmin password is printed at the end of the installation; if you have lost it, you will find it here:

[kni@e26-h37-740xd ~]$ cat /home/kni/clusterconfigs/auth/kubeadmin-password 
P5qsq-XXXX-XXXX-k6MmS
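
With that password you can log in again at any time (a sketch, using this cluster's API endpoint):

oc login -u kubeadmin -p "$(cat ~/clusterconfigs/auth/kubeadmin-password)" \
  https://api.alias.bos.scalelab.redhat.com:6443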

Install the Node Feature Discovery Operator

First, we create a project “gpu-operator-resources”: Node Feature Discovery Operator with OCP 4.7 installation

We can now go to the catalog: Node Feature Discovery Operator with OCP 4.7 installation


When the Node Feature Discovery Operator is installed, we can find additional labels on each node. The NVIDIA vendor ID “10de” is present on all the worker nodes:

[kni@e26-h37-740xd ~]$ oc describe node/e25-h11-740xd.alias.bos.scalelab.redhat.com | grep pci-10de
                    feature.node.kubernetes.io/pci-10de.present=true
                    feature.node.kubernetes.io/pci-10de.sriov.capable=true
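
Since NFD labels the nodes, the label itself can be used as a selector to list every node exposing an NVIDIA PCI device (vendor ID 10de):

oc get nodes -l feature.node.kubernetes.io/pci-10de.present=true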

Install the NVIDIA GPU Operator

The NVIDIA GPU Operator documentation is here: https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/getting-started.html

To install the GPU Operator, we use the catalog and search for “nvidia”: GPU Operator installation with OCP 4.7

We just need to click on “Install”: GPU Operator installation with OCP 4.7

We can follow the GPU Operator installation:

[kni@e26-h37-740xd ~]$ oc project gpu-operator-resources
Now using project "gpu-operator-resources" on server "https://api.alias.bos.scalelab.redhat.com:6443".
[kni@e26-h37-740xd ~]$ oc get pods
NAME                                       READY   STATUS     RESTARTS   AGE
nvidia-container-toolkit-daemonset-j6wkm   0/1     Init:0/1   0          7s
nvidia-container-toolkit-daemonset-sddhx   0/1     Init:0/1   0          7s
nvidia-container-toolkit-daemonset-x6k7m   0/1     Init:0/1   0          7s
nvidia-driver-daemonset-5f7rx              1/1     Running    0          33s
nvidia-driver-daemonset-dck69              1/1     Running    0          33s
nvidia-driver-daemonset-vnq5n              1/1     Running    0          33s

At the end all pods are Running:

[kni@e26-h37-740xd ~]$ oc get pods
NAME                                       READY   STATUS      RESTARTS   AGE
gpu-feature-discovery-bnrgv                1/1     Running     0          21s
gpu-feature-discovery-j9mqr                1/1     Running     0          21s
gpu-feature-discovery-kqgfn                1/1     Running     0          21s
nvidia-container-toolkit-daemonset-j6wkm   1/1     Running     0          4m19s
nvidia-container-toolkit-daemonset-sddhx   1/1     Running     0          4m19s
nvidia-container-toolkit-daemonset-x6k7m   1/1     Running     0          4m19s
nvidia-dcgm-exporter-6lkf9                 1/1     Running     0          48s
nvidia-dcgm-exporter-9m9tk                 1/1     Running     0          48s
nvidia-dcgm-exporter-l9z8s                 1/1     Running     0          48s
nvidia-device-plugin-daemonset-m2jt6       1/1     Running     0          81s
nvidia-device-plugin-daemonset-p5lr8       1/1     Running     0          81s
nvidia-device-plugin-daemonset-x8k78       1/1     Running     0          81s
nvidia-device-plugin-validation            0/1     Completed   0          55s
nvidia-driver-daemonset-5f7rx              1/1     Running     0          4m45s
nvidia-driver-daemonset-dck69              1/1     Running     0          4m45s
nvidia-driver-daemonset-vnq5n              1/1     Running     0          4m45s

We can double-check by validating the nvidia-device-plugin-validation logs:

[kni@e26-h37-740xd ~]$  oc logs nvidia-device-plugin-validation -n gpu-operator-resources
device-plugin validation is successful

Test the GPU in one nvidia-device-plugin-daemonset

[kni@e26-h37-740xd ~]$ oc exec -it nvidia-device-plugin-daemonset-m2jt6  -- nvidia-smi
Thu Apr 22 19:11:38 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:3B:00.0 Off |                    0 |
| N/A   45C    P8    17W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
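
Before scheduling GPU workloads, you can also confirm that the nvidia.com/gpu resource is advertised on a worker node (a quick check):

oc describe node e25-h11-740xd.alias.bos.scalelab.redhat.com | grep -i "nvidia.com/gpu"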

Test TensorFlow benchmarks with GPU

[kni@e26-h37-740xd ~]$ cat << EOF > tensorflow-benchmarks-gpu.yaml
apiVersion: v1
kind: Pod 
metadata:
 name: tensorflow-benchmarks-gpu
spec:
 containers:
 - image: nvcr.io/nvidia/tensorflow:19.09-py3
   name: cudnn
   command: ["/bin/sh","-c"]
   args: ["git clone https://github.com/tensorflow/benchmarks.git;cd benchmarks/scripts/tf_cnn_benchmarks;python3 tf_cnn_benchmarks.py --num_gpus=1 --data_format=NHWC --batch_size=32 --model=resnet50 --variable_update=parameter_server"]
   resources:
    limits:
      nvidia.com/gpu: 1
    requests:
      nvidia.com/gpu: 1
 restartPolicy: Never
EOF
[kni@e26-h37-740xd ~]$ oc create -f tensorflow-benchmarks-gpu.yaml
pod/tensorflow-benchmarks-gpu created
[kni@e26-h37-740xd ~]$ oc get pods tensorflow-benchmarks-gpu 
NAME                        READY   STATUS              RESTARTS   AGE
tensorflow-benchmarks-gpu   0/1     ContainerCreating   0          58s
[kni@e26-h37-740xd ~]$ oc get pods tensorflow-benchmarks-gpu 
NAME                        READY   STATUS    RESTARTS   AGE
tensorflow-benchmarks-gpu   1/1     Running   0          4m35s
[kni@e26-h37-740xd ~]$ oc get pods tensorflow-benchmarks-gpu 
NAME                        READY   STATUS      RESTARTS   AGE
tensorflow-benchmarks-gpu   0/1     Completed   0          24m

[kni@e26-h37-740xd ~]$ oc logs tensorflow-benchmarks-gpu 
Cloning into 'benchmarks'...
2021-04-22 19:44:17.688822: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.1
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/compat/v2_compat.py:61: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
2021-04-22 19:44:19.236570: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2100000000 Hz
2021-04-22 19:44:19.242517: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x31b5c40 executing computations on platform Host. Devices:
2021-04-22 19:44:19.242541: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2021-04-22 19:44:19.244359: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2021-04-22 19:44:19.329053: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4a71940 executing computations on platform CUDA. Devices:
2021-04-22 19:44:19.329085: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Tesla T4, Compute Capability 7.5
2021-04-22 19:44:19.329722: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:3b:00.0
2021-04-22 19:44:19.329750: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.1
2021-04-22 19:44:19.331294: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10
2021-04-22 19:44:19.332662: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10
2021-04-22 19:44:19.333023: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10
2021-04-22 19:44:19.334385: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10
2021-04-22 19:44:19.335171: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10
2021-04-22 19:44:19.338312: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2021-04-22 19:44:19.339312: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2021-04-22 19:44:19.339340: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.1
2021-04-22 19:44:19.590001: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-04-22 19:44:19.590037: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2021-04-22 19:44:19.590043: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2021-04-22 19:44:19.591145: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14168 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:3b:00.0, compute capability: 7.5)
WARNING:tensorflow:From /workspace/benchmarks/scripts/tf_cnn_benchmarks/convnet_builder.py:134: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.keras.layers.Conv2D` instead.
W0422 19:44:19.613225 139962502174528 deprecation.py:323] From /workspace/benchmarks/scripts/tf_cnn_benchmarks/convnet_builder.py:134: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.keras.layers.Conv2D` instead.
WARNING:tensorflow:From /workspace/benchmarks/scripts/tf_cnn_benchmarks/convnet_builder.py:266: max_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.MaxPooling2D instead.
W0422 19:44:19.863591 139962502174528 deprecation.py:323] From /workspace/benchmarks/scripts/tf_cnn_benchmarks/convnet_builder.py:266: max_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.MaxPooling2D instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/losses/losses_impl.py:121: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0422 19:44:21.467282 139962502174528 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/losses/losses_impl.py:121: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /workspace/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py:2268: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
W0422 19:44:22.320955 139962502174528 deprecation.py:323] From /workspace/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py:2268: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
2021-04-22 19:44:22.615891: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:3b:00.0
2021-04-22 19:44:22.615952: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.1
2021-04-22 19:44:22.615988: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10
2021-04-22 19:44:22.615996: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10
2021-04-22 19:44:22.616003: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10
2021-04-22 19:44:22.616011: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10
2021-04-22 19:44:22.616018: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10
2021-04-22 19:44:22.616027: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2021-04-22 19:44:22.616869: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2021-04-22 19:44:22.616913: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-04-22 19:44:22.616919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2021-04-22 19:44:22.616924: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2021-04-22 19:44:22.617756: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14168 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:3b:00.0, compute capability: 7.5)
2021-04-22 19:44:23.072212: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
INFO:tensorflow:Running local_init_op.
I0422 19:44:23.240972 139962502174528 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0422 19:44:23.283987 139962502174528 session_manager.py:502] Done running local_init_op.
2021-04-22 19:44:24.477198: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10
2021-04-22 19:44:24.689721: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
TensorFlow:  1.14
Model:       resnet50
Dataset:     imagenet (synthetic)
Mode:        training
SingleSess:  False
Batch size:  32 global
             32 per device
Num batches: 100
Num epochs:  0.00
Devices:     ['/gpu:0']
NUMA bind:   False
Data format: NHWC
Optimizer:   sgd
Variables:   parameter_server
==========
Generating training model
Initializing graph
Running warm up
Done warm up
Step  Img/sec total_loss
1 images/sec: 118.3 +/- 0.0 (jitter = 0.0)  8.108
10  images/sec: 118.6 +/- 0.2 (jitter = 0.9)  8.122
20  images/sec: 118.4 +/- 0.1 (jitter = 0.6)  7.983
30  images/sec: 118.2 +/- 0.1 (jitter = 0.4)  7.780
40  images/sec: 118.1 +/- 0.1 (jitter = 0.5)  7.848
50  images/sec: 118.0 +/- 0.1 (jitter = 0.6)  7.779
60  images/sec: 117.9 +/- 0.1 (jitter = 0.6)  7.825
70  images/sec: 117.7 +/- 0.1 (jitter = 0.7)  7.839
80  images/sec: 117.6 +/- 0.1 (jitter = 0.8)  7.818
90  images/sec: 117.4 +/- 0.1 (jitter = 0.9)  7.646
100 images/sec: 117.3 +/- 0.1 (jitter = 1.0)  7.916
----------------------------------------------------------------
total images/sec: 117.23
----------------------------------------------------------------
[kni@e26-h37-740xd ~]$ oc delete -f tensorflow-benchmarks-gpu.yaml 
pod "tensorflow-benchmarks-gpu" deleted

Test scheduling one NGC UBI8 pod with CUDA running the nvidia-smi command

Prepare a manifest requesting one GPU (nvidia.com/gpu: 1):

[kni@e26-h37-740xd ~]$ cat << EOF > nvidia-smi.yaml
apiVersion: v1
kind: Pod
metadata:
 name: nvidia-smi
spec:
 containers:
 - image: nvcr.io/nvidia/cuda:10.2-runtime-ubi8
   name: nvidia-smi
   command: [ nvidia-smi ]
   resources:
    limits:
      nvidia.com/gpu: 1
    requests:
      nvidia.com/gpu: 1
 restartPolicy: Never
EOF

Create the pod:

[kni@e26-h37-740xd ~]$ oc create -f nvidia-smi.yaml
pod/nvidia-smi created

Watch the pod creation:

[kni@e26-h37-740xd ~]$ oc get pods
NAME                                  READY   STATUS      RESTARTS   AGE
nvidia-smi                            1/1     Running     0          4s

Watch the pod completion:

[kni@e26-h37-740xd ~]$ oc get pods
NAME                                  READY   STATUS      RESTARTS   AGE
nvidia-smi                            0/1     Completed   0          8s

Check the logs; success \o/, the nvidia-smi command shows the NVIDIA T4:

[kni@e26-h37-740xd ~]$ oc logs nvidia-smi 
Mon May 10 20:26:07 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:3B:00.0 Off |                    0 |
| N/A   42C    P8    15W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Delete the pod:

[kni@e26-h37-740xd ~]$ oc delete pod nvidia-smi