NVIDIA Tesla GPU PCI passthrough with Red Hat OpenStack Platform 13
Red Hat OpenStack Platform provides two ways to use NVIDIA Tesla GPU accelerators with virtual instances:
- GPU PCI passthrough (each physical GPU is dedicated to a single instance)
- vGPU GRID (one physical GPU can be shared by multiple instances; Tech Preview in OSP 14)
This blog post shows how to set up GPU PCI passthrough.
Direct hardware access bypasses the host Linux kernel and provides secure direct memory access to the physical GPU card over PCI. SR-IOV Physical Function passthrough has been available since OpenStack Havana, but fully automating the deployment requires some knowledge of TripleO director templating and of NVIDIA driver configuration. To show how to use all your Deep Learning frameworks with OpenStack, we will describe how to set up RHOSP 13 with NVIDIA GPU PCI passthrough (SR-IOV PF).
- Description of the platform
- Director installation
- Launch the deployment
- Attach one physical GPU M10 to an instance
- Attach four physical GPUs M10 and two GPUs M60 to an instance
- GPU burn testing in an instance
- TensorFlow benchmarking in an instance with NVIDIA CUDA Deep Neural Network library (cuDNN)
Prerequisites: this documentation covers using GPU cards via PCI passthrough with Red Hat OpenStack Platform. We will not explain the full RHOSP deployment in detail; we will focus on the specific NVIDIA Tesla PCI passthrough configuration (SR-IOV PF).
Description of the platform
For this test we use two servers; the compute node has two GPU boards (M10 and M60), so we can boot up to six VMs with a GPU per compute node. This lab environment mixes heterogeneous GPU board types; for a production environment, I advise using a single card type, such as the Tesla V100 for the data center or the Tesla T4 for the edge. This configuration can also be used with the NVIDIA DGX-1 server.
- One Dell PowerEdge R730: compute node with two GPU cards:
  - NVIDIA Tesla M10 with four physical GPUs per PCI board
  - NVIDIA Tesla M60 with two physical GPUs per PCI board
- One Dell PowerEdge R440: qemu/kvm host for the director and controller VMs. This server hosts two control VMs:
  - lab-director: director to manage the RHOSP deployment
  - lab-controller: RHOSP controller
The platform uses network isolation with VLANs on a single NIC.
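Before deploying, it is worth confirming that the compute node exposes the GPUs on the PCI bus, since the nova PCI passthrough whitelist will need the vendor:device IDs later. A minimal sketch; the lspci sample and the 13f2 device ID below are illustrative, not taken from this lab:

```shell
# On the real compute node you would run:  lspci -nn | grep -i nvidia
# The sample below stands in for that output (10de is NVIDIA's PCI
# vendor ID); extract the unique vendor:device IDs.
sample='3d:00.0 3D controller [0302]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2]
3e:00.0 3D controller [0302]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2]'
echo "$sample" | grep -oE '\[10de:[0-9a-f]{4}\]' | sort -u
# -> [10de:13f2]
```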
VM preparation of lab-director
Create the qcow file for lab-director:
[egallen@lab607 ~]$ sudo qemu-img create -f qcow2 -o preallocation=metadata /var/lib/libvirt/images/lab-director.qcow2 120G;
Formatting '/var/lib/libvirt/images/lab-director.qcow2', fmt=qcow2 size=128849018880 cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16
[egallen@lab607 ~]$ sudo virt-resize --expand /dev/sda1 /data/inetsoft/rhel-server-7.6-x86_64-kvm.qcow2 /var/lib/libvirt/images/lab-director.qcow2
[egallen@lab607 ~]$ sudo virt-customize -a /var/lib/libvirt/images/lab-director.qcow2 --root-password password:XXXXXXXX --uninstall cloud-init
[egallen@lab607 ~]$ sudo virt-install --ram 32768 --vcpus 8 --os-variant rhel7 \
--disk path=/var/lib/libvirt/images/lab-director.qcow2,device=disk,bus=virtio,format=qcow2 \
--graphics vnc,listen=0.0.0.0 --noautoconsole \
--network network:default \
--network network:br0 \
--name lab-director --dry-run \
--print-xml > /tmp/lab-director.xml;
[egallen@lab607 ~]$ sudo virsh define --file /tmp/lab-director.xml
Domain lab-director defined from /tmp/lab-director.xml
[egallen@lab607 ~]$ sudo virsh start lab-director
Domain lab-director started
[egallen@lab607 ~]$ sudo grep dnsmasq-dhcp /var/log/messages
Dec 14 11:27:51 lab607 dnsmasq-dhcp[2352]: DHCPDISCOVER(virbr0) 52:54:00:22:61:12
Dec 14 11:27:51 lab607 dnsmasq-dhcp[2352]: DHCPOFFER(virbr0) 192.168.122.245 52:54:00:22:61:12
Dec 14 11:27:51 lab607 dnsmasq-dhcp[2352]: DHCPREQUEST(virbr0) 192.168.122.245 52:54:00:22:61:12
Dec 14 11:27:51 lab607 dnsmasq-dhcp[2352]: DHCPACK(virbr0) 192.168.122.245 52:54:00:22:61:12
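The leased address can also be pulled straight out of the log; a small sketch using the DHCPACK line above:

```shell
# Extract the IP dnsmasq handed out (field 7 of the DHCPACK line).
log='Dec 14 11:27:51 lab607 dnsmasq-dhcp[2352]: DHCPACK(virbr0) 192.168.122.245 52:54:00:22:61:12'
echo "$log" | awk '/DHCPACK/ {print $7}'   # 192.168.122.245
```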
Log in:
[egallen@lab607 ~]$ ssh root@192.168.122.245
The authenticity of host '192.168.122.245 (192.168.122.245)' can't be established.
ECDSA key fingerprint is SHA256:qoySkC3Q/mirktkqQ5UoZjeE/lmbA/cXcuND/B+bkIo.
ECDSA key fingerprint is MD5:33:7c:a9:fd:bd:a7:f9:f3:86:af:6d:6e:ea:36:f1:82.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.122.245' (ECDSA) to the list of known hosts.
root@192.168.122.245's password:
[root@localhost ~]#
Set hostname:
[root@localhost ~]# hostnamectl set-hostname lab-director.lan.redhat.com
[root@lab-director ~]# cat /etc/hosts
127.0.0.1 lab-director localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
Director network configuration
[root@lab-director network-scripts]# cat ifcfg-eth0
DEVICE="eth0"
BOOTPROTO="none"
#BOOTPROTOv6="dhcp"
ONBOOT="yes"
TYPE="Ethernet"
USERCTL="yes"
PEERDNS="yes"
IPV6INIT="yes"
PERSISTENT_DHCLIENT="1"
IPADDR=192.168.122.10
PREFIX=24
GATEWAY=192.168.122.1
DNS1=192.168.122.1
Add stack user:
[root@lab-director ~]# adduser stack
[root@lab-director ~]# passwd stack
Changing password for user stack.
New password:
BAD PASSWORD: The password fails the dictionary check - it is based on a dictionary word
Retype new password:
passwd: all authentication tokens updated successfully.
[root@lab-director ~]# echo "stack ALL=(root) NOPASSWD:ALL" | tee -a /etc/sudoers.d/stack
stack ALL=(root) NOPASSWD:ALL
[root@lab-director ~]# chmod 0440 /etc/sudoers.d/stack
Log in to the director VM with SSH:
[egallen@lab607 ~]$ ssh stack@lab-director
The authenticity of host 'lab-director (192.168.122.10)' can't be established.
ECDSA key fingerprint is SHA256:qoySkC3Q/mirktkqQ5UoZjeE/lmbA/cXcuND/B+bkIo.
ECDSA key fingerprint is MD5:33:7c:a9:fd:bd:a7:f9:f3:86:af:6d:6e:ea:36:f1:82.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'lab-director' (ECDSA) to the list of known hosts.
Last login: Fri Dec 14 11:53:17 2018 from gateway
[stack@lab-director ~]$
Register lab-director with Red Hat Subscription Management:
[stack@lab-director ~]$ sudo subscription-manager register --username mycdnaccount
Registering to: subscription.rhsm.redhat.com:443/subscription
Password:
The system has been registered with ID: XXXXX-XXXX-XXXX-XXXXXXXX
The registered system name is: lab-director.lan.redhat.com
WARNING
The yum plugins: /etc/yum/pluginconf.d/subscription-manager.conf, /etc/yum/pluginconf.d/product-id.conf were automatically enabled for the benefit of Red Hat Subscription Management. If not desired, use "subscription-manager config --rhsm.auto_enable_yum_plugins=0" to block this behavior.
[stack@lab-director ~]$ sudo subscription-manager attach --pool=XXXXXXXXXXXXXXXXXXX
Successfully attached a subscription for: SKU
[stack@lab-director ~]$ sudo subscription-manager repos --disable='*'
[stack@lab-director ~]$ sudo subscription-manager repos --enable=rhel-7-server-rpms --enable=rhel-7-server-openstack-13-rpms --enable=rhel-7-server-extras-rpms --enable=rhel-7-server-rh-common-rpms --enable=rhel-ha-for-rhel-7-server-rpms
[stack@lab-director ~]$ sudo yum upgrade -y
Update /etc/hosts:
[stack@lab-director ~]$ echo "172.16.16.10 lab-director lab-director.lan.redhat.com" | sudo tee -a /etc/hosts
[stack@lab-director ~]$ hostname -f
lab-director.lan.redhat.com
Install tripleoclient:
[stack@lab-director ~]$ sudo yum install -y python-tripleoclient
Loaded plugins: product-id, search-disabled-repos, subscription-manager
Director configuration
Prepare undercloud.conf:
[stack@lab-director ~]$ cat << EOF > ~/undercloud.conf
[DEFAULT]
undercloud_hostname = lab-director.lan.redhat.com
local_ip = 172.16.16.10/24
undercloud_public_host = 172.16.16.11
undercloud_admin_host = 172.16.16.12
local_interface = eth1
undercloud_nameservers = 10.16.36.29
undercloud_ntp_servers = clock.redhat.com
overcloud_domain_name = lan.redhat.com
[ctlplane-subnet]
cidr = 172.16.16.0/24
dhcp_start = 172.16.16.201
dhcp_end = 172.16.16.220
inspection_iprange = 172.16.16.221,172.16.16.230
gateway = 172.16.16.1
masquerade_network = True
subnets = ctlplane-subnet
local_subnet = ctlplane-subnet
EOF
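Before launching the install, a quick sanity check that the keys this deployment relies on made it into the file can save a failed run. This check is my own addition, not part of the official workflow; a temp copy stands in for ~/undercloud.conf so the sketch is self-contained:

```shell
# Sketch: verify the undercloud.conf keys this deployment relies on.
conf=$(mktemp)
cat << 'EOF' > "$conf"
[DEFAULT]
local_ip = 172.16.16.10/24
local_interface = eth1
[ctlplane-subnet]
cidr = 172.16.16.0/24
dhcp_start = 172.16.16.201
dhcp_end = 172.16.16.220
EOF
missing=0
for key in local_ip local_interface cidr dhcp_start dhcp_end; do
  grep -q "^${key} =" "$conf" || { echo "missing: $key"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "undercloud.conf keys OK"
rm -f "$conf"
```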
Install the director; the install took about 27 minutes on this VM (8 vCPUs, 32 GB of RAM):
[stack@lab-director ~]$ time openstack undercloud install
…
2018-12-14 15:56:46,423 INFO: Mistral workbooks configured successfully
2018-12-14 15:57:53,117 INFO: Configuring an hourly cron trigger for tripleo-ui logging
2018-12-14 15:57:55,189 INFO:
#############################################################################
Undercloud install complete.
The file containing this installation's passwords is at
/home/stack/undercloud-passwords.conf.
There is also a stackrc file at /home/stack/stackrc.
These files are needed to interact with the OpenStack services, and should be
secured.
#############################################################################
real 26m47.960s
user 14m57.680s
sys 2m15.859s
Check services launched:
[stack@lab-director ~]$ sudo systemctl list-units 'openstack-*'
UNIT LOAD ACTIVE SUB DESCRIPTION
openstack-glance-api.service loaded active running OpenStack Image Service (code-named Glance) API server
openstack-heat-engine.service loaded active running Openstack Heat Engine Service
openstack-ironic-conductor.service loaded active running OpenStack Ironic Conductor service
openstack-ironic-inspector-dnsmasq.service loaded active running PXE boot dnsmasq service for Ironic Inspector
openstack-ironic-inspector.service loaded active running Hardware introspection service for OpenStack Ironic
openstack-mistral-api.service loaded active running Mistral API Server
openstack-mistral-engine.service loaded active running Mistral Engine Server
openstack-mistral-executor.service loaded active running Mistral Executor Server
openstack-nova-api.service loaded active running OpenStack Nova API Server
openstack-nova-compute.service loaded active running OpenStack Nova Compute Server
openstack-nova-conductor.service loaded active running OpenStack Nova Conductor Server
openstack-nova-scheduler.service loaded active running OpenStack Nova Scheduler Server
openstack-swift-account-reaper.service loaded active running OpenStack Object Storage (swift) - Account Reaper
openstack-swift-account.service loaded active running OpenStack Object Storage (swift) - Account Server
openstack-swift-container-sync.service loaded active running OpenStack Object Storage (swift) - Container Sync
openstack-swift-container-updater.service loaded active running OpenStack Object Storage (swift) - Container Updater
openstack-swift-container.service loaded active running OpenStack Object Storage (swift) - Container Server
openstack-swift-object-expirer.service loaded active running OpenStack Object Storage (swift) - Object Expirer
openstack-swift-object-reconstructor.service loaded active running OpenStack Object Storage (swift) - Object Reconstructor
openstack-swift-object-updater.service loaded active running OpenStack Object Storage (swift) - Object Updater
openstack-swift-object.service loaded active running OpenStack Object Storage (swift) - Object Server
openstack-swift-proxy.service loaded active running OpenStack Object Storage (swift) - Proxy Server
openstack-zaqar@1.service loaded active running OpenStack Message Queuing Service (code-named Zaqar) Server Instance 1
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
23 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.
Load the undercloud environment credentials and issue a token to verify they work:
[stack@lab-director ~]$ source ~/stackrc
(undercloud) [stack@lab-director ~]$ openstack token issue
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| expires | 2018-12-15T00:59:18+0000 |
| id | gAAAAABcFBmm68v2iqazTzDMkJpseAVsTUwUcqEEf94pZNgoPdO8y8stcsoh3MbziyR3vcRKxIfYX0Vyaftn9z-0ezN_qh3nHi1C6vt3EmBh2rAxZSnZBXXXXXXXXEoS3u641O18eUSwYxz4xvmQsnBBDdrQqNLDpjIUuEDhQ |
| project_id | 3b7e4a4d7207465cbb1b354594974169 |
| user_id | 62e1f08d72fa49de80d739e2e79c4064 |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Install the overcloud image packages and extract the images:
[stack@lab-director ~]$ sudo yum install -y rhosp-director-images rhosp-director-images-ipa libguestfs-tools
[stack@lab-director ~]$ sudo mkdir -p /var/images/x86_64
[stack@lab-director ~]$ sudo chown stack:stack /var/images/x86_64
[stack@lab-director ~]$ cd /var/images/x86_64
(undercloud) [stack@lab-director x86_64]$ for i in /usr/share/rhosp-director-images/overcloud-full-latest-13.0.tar /usr/share/rhosp-director-images/ironic-python-agent-latest-13.0.tar; do tar -xvf $i; done
Set a default root password (not mandatory, but it allows logging in on the console if needed):
[stack@lab-director x86_64]$ virt-customize -a overcloud-full.qcow2 --root-password password:XXXXXX
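Separately from the image customization, GPU PCI passthrough requires the IOMMU to be enabled in the compute node's kernel. In OSP 13 this is typically done with the `KernelArgs` role parameter (deployed together with the `host-config-and-reboot.yaml` environment file); a minimal sketch, assuming Intel CPUs, with the full template work coming later in the post:

```yaml
# Sketch only: enable the IOMMU on compute nodes for PCI passthrough.
# The exact environment file layout depends on your deployment.
parameter_defaults:
  ComputeParameters:
    KernelArgs: "intel_iommu=on iommu=pt"
```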
Import overcloud images into glance:
[stack@lab-director ~]$ source ~/stackrc
(undercloud) [stack@lab-director x86_64]$ openstack overcloud image upload --image-path /var/images/x86_64
Image "overcloud-full-vmlinuz" was uploaded.
+--------------------------------------+------------------------+-------------+---------+--------+
| ID | Name | Disk Format | Size | Status |
+--------------------------------------+------------------------+-------------+---------+--------+
| e7d82dfa-b373-4b71-a14c-aa3dbe359bfc | overcloud-full-vmlinuz | aki | 6635920 | active |
+--------------------------------------+------------------------+-------------+---------+--------+
Image "overcloud-full-initrd" was uploaded.
+--------------------------------------+-----------------------+-------------+----------+--------+
| ID | Name | Disk Format | Size | Status |
+--------------------------------------+-----------------------+-------------+----------+--------+
| 6939cf47-6ceb-4f27-af8c-f311383599aa | overcloud-full-initrd | ari | 62450365 | active |
+--------------------------------------+-----------------------+-------------+----------+--------+
Image "overcloud-full" was uploaded.
+--------------------------------------+----------------+-------------+------------+--------+
| ID | Name | Disk Format | Size | Status |
+--------------------------------------+----------------+-------------+------------+--------+
| ef7d74d8-2b5b-4a83-a6b3-ec19b89924ed | overcloud-full | qcow2 | 1343225856 | active |
+--------------------------------------+----------------+-------------+------------+--------+
Image "bm-deploy-kernel" was uploaded.
+--------------------------------------+------------------+-------------+---------+--------+
| ID | Name | Disk Format | Size | Status |
+--------------------------------------+------------------+-------------+---------+--------+
| 4a2c49e5-3213-4f73-8ef7-9e29b7c684e5 | bm-deploy-kernel | aki | 6635920 | active |
+--------------------------------------+------------------+-------------+---------+--------+
Image "bm-deploy-ramdisk" was uploaded.
+--------------------------------------+-------------------+-------------+-----------+--------+
| ID | Name | Disk Format | Size | Status |
+--------------------------------------+-------------------+-------------+-----------+--------+
| 5a76380a-c2f4-46be-929d-b8d72df40849 | bm-deploy-ramdisk | ari | 420507190 | active |
+--------------------------------------+-------------------+-------------+-----------+--------+
List images:
(undercloud) [stack@lab-director x86_64]$ openstack image list
+--------------------------------------+------------------------+--------+
| ID | Name | Status |
+--------------------------------------+------------------------+--------+
| 4a2c49e5-3213-4f73-8ef7-9e29b7c684e5 | bm-deploy-kernel | active |
| 5a76380a-c2f4-46be-929d-b8d72df40849 | bm-deploy-ramdisk | active |
| ef7d74d8-2b5b-4a83-a6b3-ec19b89924ed | overcloud-full | active |
| 6939cf47-6ceb-4f27-af8c-f311383599aa | overcloud-full-initrd | active |
| e7d82dfa-b373-4b71-a14c-aa3dbe359bfc | overcloud-full-vmlinuz | active |
+--------------------------------------+------------------------+--------+
List subnets:
(undercloud) [stack@lab-director x86_64]$ openstack subnet list
+--------------------------------------+-----------------+--------------------------------------+----------------+
| ID | Name | Network | Subnet |
+--------------------------------------+-----------------+--------------------------------------+----------------+
| 109b1f5f-cf20-4e7e-b1f9-a81d26878444 | ctlplane-subnet | 3feb9d76-5fff-4cee-8ab6-92470649ce64 | 172.16.16.0/24 |
+--------------------------------------+-----------------+--------------------------------------+----------------+
Set the DNS name servers on the ctlplane subnet:
(undercloud) [stack@lab-director x86_64]$ openstack subnet set --dns-nameserver 10.16.36.29 --dns-nameserver 10.11.5.19 ctlplane-subnet
(undercloud) [stack@lab-director x86_64]$ openstack subnet show ctlplane-subnet
+-------------------+----------------------------------------------------------+
| Field | Value |
+-------------------+----------------------------------------------------------+
| allocation_pools | 172.16.16.201-172.16.16.220 |
| cidr | 172.16.16.0/24 |
| created_at | 2018-12-14T20:56:08Z |
| description | |
| dns_nameservers | 10.11.5.19, 10.16.36.29 |
| enable_dhcp | True |
| gateway_ip | 172.16.16.1 |
| host_routes | destination='169.254.169.254/32', gateway='172.16.16.10' |
| id | 109b1f5f-cf20-4e7e-b1f9-a81d26878444 |
| ip_version | 4 |
| ipv6_address_mode | None |
| ipv6_ra_mode | None |
| name | ctlplane-subnet |
| network_id | 3feb9d76-5fff-4cee-8ab6-92470649ce64 |
| project_id | 3b7e4a4d7207465cbb1b354594974169 |
| revision_number | 1 |
| segment_id | None |
| service_types | |
| subnetpool_id | None |
| tags | |
| updated_at | 2018-12-14T21:33:50Z |
+-------------------+----------------------------------------------------------+
Template preparation
Create a template to pull the images to the local Docker registry:
(undercloud) [stack@lab-director ~]$ mkdir ~/templates
(undercloud) [stack@lab-director ~]$ openstack overcloud container image prepare --namespace=registry.access.redhat.com/rhosp13 --push-destination=172.16.16.10:8787 --prefix=openstack- --tag-from-label {version}-{release} --output-env-file=/home/stack/templates/overcloud_images.yaml --output-images-file /home/stack/local_registry_images.yaml
container_images:
- imagename: registry.access.redhat.com/rhosp13/openstack-aodh-api:13.0-62.1543534121
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-aodh-evaluator:13.0-62.1543534127
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-aodh-listener:13.0-61.1543534105
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-aodh-notifier:13.0-62.1543534128
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-ceilometer-central:13.0-59.1543534112
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-ceilometer-compute:13.0-61.1543534134
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-ceilometer-notification:13.0-61.1543534109
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-cinder-api:13.0-63.1543534127
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-cinder-scheduler:13.0-65.1543534135
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-cinder-volume:13.0-63.1543534127
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-cron:13.0-66.1543534082
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-glance-api:13.0-64.1543534114
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-gnocchi-api:13.0-61.1543534135
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-gnocchi-metricd:13.0-62.1543534116
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-gnocchi-statsd:13.0-62.1543534137
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-haproxy:13.0-62.1543534080
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-heat-api-cfn:13.0-60.1543534138
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-heat-api:13.0-61.1543534111
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-heat-engine:13.0-59.1543534131
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-horizon:13.0-60.1543534103
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-iscsid:13.0-60.1543534061
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-keystone:13.0-60.1543534107
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-mariadb:13.0-62.1543534066
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-memcached:13.0-61.1543534084
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-neutron-dhcp-agent:13.0-66.1543534114
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-neutron-l3-agent:13.0-64.1543534114
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-neutron-metadata-agent:13.0-65.1543534133
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-neutron-openvswitch-agent:13.0-65.1543534110
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-neutron-server:13.0-64.1543534105
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-nova-api:13.0-67.1543534106
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-nova-compute:13.0-72
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-nova-conductor:13.0-66.1543534132
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-nova-consoleauth:13.0-66.1543534131
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-nova-libvirt:13.0-73
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-nova-novncproxy:13.0-69.1543534127
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-nova-placement-api:13.0-67.1543534124
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-nova-scheduler:13.0-67.1543534138
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-panko-api:13.0-62.1543534139
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-rabbitmq:13.0-63.1543534083
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-redis:13.0-64.1543534104
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-swift-account:13.0-61.1543534113
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-swift-container:13.0-64.1543534110
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-swift-object:13.0-61.1543534106
push_destination: 172.16.16.10:8787
- imagename: registry.access.redhat.com/rhosp13/openstack-swift-proxy-server:13.0-61.1543534108
push_destination: 172.16.16.10:8787
Pull the images using the local_registry_images.yaml file:
(undercloud) [stack@lab-director ~]$ sudo openstack overcloud container image upload --config-file /home/stack/local_registry_images.yaml --verbose
START with options: [u'overcloud', u'container', u'image', u'upload', u'--config-file', u'/home/stack/local_registry_images.yaml', u'--verbose']
command: overcloud container image upload -> tripleoclient.v1.container_image.UploadImage (auth=False)
Using config files: [u'/home/stack/local_registry_images.yaml']
imagename: registry.access.redhat.com/rhosp13/openstack-swift-proxy-server:13.0-61.1543534108
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-swift-proxy-server:13.0-61.1543534108
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-swift-proxy-server:13.0-61.1543534108
Completed upload for image registry.access.redhat.com/rhosp13/openstack-swift-proxy-server:13.0-61.1543534108
imagename: registry.access.redhat.com/rhosp13/openstack-aodh-api:13.0-62.1543534121
imagename: registry.access.redhat.com/rhosp13/openstack-aodh-notifier:13.0-62.1543534128
imagename: registry.access.redhat.com/rhosp13/openstack-ceilometer-notification:13.0-61.1543534109
imagename: registry.access.redhat.com/rhosp13/openstack-cinder-volume:13.0-63.1543534127
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-aodh-api:13.0-62.1543534121
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-ceilometer-notification:13.0-61.1543534109
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-aodh-notifier:13.0-62.1543534128
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-cinder-volume:13.0-63.1543534127
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-cinder-volume:13.0-63.1543534127
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-aodh-api:13.0-62.1543534121
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-aodh-notifier:13.0-62.1543534128
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-ceilometer-notification:13.0-61.1543534109
Completed upload for image registry.access.redhat.com/rhosp13/openstack-aodh-notifier:13.0-62.1543534128
imagename: registry.access.redhat.com/rhosp13/openstack-ceilometer-central:13.0-59.1543534112
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-ceilometer-central:13.0-59.1543534112
Completed upload for image registry.access.redhat.com/rhosp13/openstack-aodh-api:13.0-62.1543534121
imagename: registry.access.redhat.com/rhosp13/openstack-aodh-evaluator:13.0-62.1543534127
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-aodh-evaluator:13.0-62.1543534127
Completed upload for image registry.access.redhat.com/rhosp13/openstack-ceilometer-notification:13.0-61.1543534109
imagename: registry.access.redhat.com/rhosp13/openstack-cinder-api:13.0-63.1543534127
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-cinder-api:13.0-63.1543534127
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-ceilometer-central:13.0-59.1543534112
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-cinder-api:13.0-63.1543534127
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-aodh-evaluator:13.0-62.1543534127
Completed upload for image registry.access.redhat.com/rhosp13/openstack-aodh-evaluator:13.0-62.1543534127
imagename: registry.access.redhat.com/rhosp13/openstack-aodh-listener:13.0-61.1543534105
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-aodh-listener:13.0-61.1543534105
Completed upload for image registry.access.redhat.com/rhosp13/openstack-ceilometer-central:13.0-59.1543534112
imagename: registry.access.redhat.com/rhosp13/openstack-ceilometer-compute:13.0-61.1543534134
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-ceilometer-compute:13.0-61.1543534134
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-aodh-listener:13.0-61.1543534105
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-ceilometer-compute:13.0-61.1543534134
Completed upload for image registry.access.redhat.com/rhosp13/openstack-aodh-listener:13.0-61.1543534105
imagename: registry.access.redhat.com/rhosp13/openstack-gnocchi-api:13.0-61.1543534135
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-gnocchi-api:13.0-61.1543534135
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-gnocchi-api:13.0-61.1543534135
Completed upload for image registry.access.redhat.com/rhosp13/openstack-ceilometer-compute:13.0-61.1543534134
imagename: registry.access.redhat.com/rhosp13/openstack-haproxy:13.0-62.1543534080
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-haproxy:13.0-62.1543534080
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-haproxy:13.0-62.1543534080
Completed upload for image registry.access.redhat.com/rhosp13/openstack-cinder-api:13.0-63.1543534127
imagename: registry.access.redhat.com/rhosp13/openstack-cinder-scheduler:13.0-65.1543534135
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-cinder-scheduler:13.0-65.1543534135
Completed upload for image registry.access.redhat.com/rhosp13/openstack-haproxy:13.0-62.1543534080
imagename: registry.access.redhat.com/rhosp13/openstack-heat-api-cfn:13.0-60.1543534138
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-heat-api-cfn:13.0-60.1543534138
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-heat-api-cfn:13.0-60.1543534138
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-cinder-scheduler:13.0-65.1543534135
Completed upload for image registry.access.redhat.com/rhosp13/openstack-cinder-scheduler:13.0-65.1543534135
imagename: registry.access.redhat.com/rhosp13/openstack-heat-engine:13.0-59.1543534131
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-heat-engine:13.0-59.1543534131
Completed upload for image registry.access.redhat.com/rhosp13/openstack-gnocchi-api:13.0-61.1543534135
imagename: registry.access.redhat.com/rhosp13/openstack-gnocchi-metricd:13.0-62.1543534116
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-gnocchi-metricd:13.0-62.1543534116
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-heat-engine:13.0-59.1543534131
Completed upload for image registry.access.redhat.com/rhosp13/openstack-heat-api-cfn:13.0-60.1543534138
imagename: registry.access.redhat.com/rhosp13/openstack-heat-api:13.0-61.1543534111
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-heat-api:13.0-61.1543534111
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-gnocchi-metricd:13.0-62.1543534116
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-heat-api:13.0-61.1543534111
Completed upload for image registry.access.redhat.com/rhosp13/openstack-heat-engine:13.0-59.1543534131
imagename: registry.access.redhat.com/rhosp13/openstack-horizon:13.0-60.1543534103
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-horizon:13.0-60.1543534103
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-horizon:13.0-60.1543534103
Completed upload for image registry.access.redhat.com/rhosp13/openstack-gnocchi-metricd:13.0-62.1543534116
imagename: registry.access.redhat.com/rhosp13/openstack-gnocchi-statsd:13.0-62.1543534137
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-gnocchi-statsd:13.0-62.1543534137
Completed upload for image registry.access.redhat.com/rhosp13/openstack-heat-api:13.0-61.1543534111
imagename: registry.access.redhat.com/rhosp13/openstack-keystone:13.0-60.1543534107
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-keystone:13.0-60.1543534107
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-gnocchi-statsd:13.0-62.1543534137
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-keystone:13.0-60.1543534107
Completed upload for image registry.access.redhat.com/rhosp13/openstack-gnocchi-statsd:13.0-62.1543534137
imagename: registry.access.redhat.com/rhosp13/openstack-neutron-dhcp-agent:13.0-66.1543534114
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-neutron-dhcp-agent:13.0-66.1543534114
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-neutron-dhcp-agent:13.0-66.1543534114
Completed upload for image registry.access.redhat.com/rhosp13/openstack-keystone:13.0-60.1543534107
imagename: registry.access.redhat.com/rhosp13/openstack-mariadb:13.0-62.1543534066
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-mariadb:13.0-62.1543534066
Completed upload for image registry.access.redhat.com/rhosp13/openstack-horizon:13.0-60.1543534103
imagename: registry.access.redhat.com/rhosp13/openstack-iscsid:13.0-60.1543534061
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-iscsid:13.0-60.1543534061
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-mariadb:13.0-62.1543534066
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-iscsid:13.0-60.1543534061
Completed upload for image registry.access.redhat.com/rhosp13/openstack-cinder-volume:13.0-63.1543534127
imagename: registry.access.redhat.com/rhosp13/openstack-cron:13.0-66.1543534082
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-cron:13.0-66.1543534082
Completed upload for image registry.access.redhat.com/rhosp13/openstack-iscsid:13.0-60.1543534061
imagename: registry.access.redhat.com/rhosp13/openstack-neutron-openvswitch-agent:13.0-65.1543534110
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-neutron-openvswitch-agent:13.0-65.1543534110
Completed upload for image registry.access.redhat.com/rhosp13/openstack-neutron-dhcp-agent:13.0-66.1543534114
imagename: registry.access.redhat.com/rhosp13/openstack-neutron-l3-agent:13.0-64.1543534114
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-neutron-l3-agent:13.0-64.1543534114
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-cron:13.0-66.1543534082
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-neutron-openvswitch-agent:13.0-65.1543534110
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-neutron-l3-agent:13.0-64.1543534114
Completed upload for image registry.access.redhat.com/rhosp13/openstack-cron:13.0-66.1543534082
imagename: registry.access.redhat.com/rhosp13/openstack-glance-api:13.0-64.1543534114
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-glance-api:13.0-64.1543534114
Completed upload for image registry.access.redhat.com/rhosp13/openstack-neutron-openvswitch-agent:13.0-65.1543534110
imagename: registry.access.redhat.com/rhosp13/openstack-neutron-server:13.0-64.1543534105
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-neutron-server:13.0-64.1543534105
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-glance-api:13.0-64.1543534114
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-neutron-server:13.0-64.1543534105
Completed upload for image registry.access.redhat.com/rhosp13/openstack-neutron-l3-agent:13.0-64.1543534114
imagename: registry.access.redhat.com/rhosp13/openstack-neutron-metadata-agent:13.0-65.1543534133
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-neutron-metadata-agent:13.0-65.1543534133
Completed upload for image registry.access.redhat.com/rhosp13/openstack-mariadb:13.0-62.1543534066
imagename: registry.access.redhat.com/rhosp13/openstack-memcached:13.0-61.1543534084
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-memcached:13.0-61.1543534084
Completed upload for image registry.access.redhat.com/rhosp13/openstack-neutron-server:13.0-64.1543534105
imagename: registry.access.redhat.com/rhosp13/openstack-nova-api:13.0-67.1543534106
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-nova-api:13.0-67.1543534106
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-neutron-metadata-agent:13.0-65.1543534133
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-nova-api:13.0-67.1543534106
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-memcached:13.0-61.1543534084
Completed upload for image registry.access.redhat.com/rhosp13/openstack-glance-api:13.0-64.1543534114
imagename: registry.access.redhat.com/rhosp13/openstack-nova-compute:13.0-72
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-nova-compute:13.0-72
Completed upload for image registry.access.redhat.com/rhosp13/openstack-neutron-metadata-agent:13.0-65.1543534133
imagename: registry.access.redhat.com/rhosp13/openstack-nova-libvirt:13.0-73
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-nova-libvirt:13.0-73
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-nova-compute:13.0-72
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-nova-libvirt:13.0-73
Completed upload for image registry.access.redhat.com/rhosp13/openstack-memcached:13.0-61.1543534084
imagename: registry.access.redhat.com/rhosp13/openstack-nova-scheduler:13.0-67.1543534138
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-nova-scheduler:13.0-67.1543534138
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-nova-scheduler:13.0-67.1543534138
Completed upload for image registry.access.redhat.com/rhosp13/openstack-nova-api:13.0-67.1543534106
imagename: registry.access.redhat.com/rhosp13/openstack-redis:13.0-64.1543534104
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-redis:13.0-64.1543534104
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-redis:13.0-64.1543534104
Completed upload for image registry.access.redhat.com/rhosp13/openstack-nova-scheduler:13.0-67.1543534138
imagename: registry.access.redhat.com/rhosp13/openstack-panko-api:13.0-62.1543534139
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-panko-api:13.0-62.1543534139
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-panko-api:13.0-62.1543534139
Completed upload for image registry.access.redhat.com/rhosp13/openstack-redis:13.0-64.1543534104
imagename: registry.access.redhat.com/rhosp13/openstack-swift-account:13.0-61.1543534113
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-swift-account:13.0-61.1543534113
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-swift-account:13.0-61.1543534113
Completed upload for image registry.access.redhat.com/rhosp13/openstack-swift-account:13.0-61.1543534113
imagename: registry.access.redhat.com/rhosp13/openstack-swift-container:13.0-64.1543534110
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-swift-container:13.0-64.1543534110
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-swift-container:13.0-64.1543534110
Completed upload for image registry.access.redhat.com/rhosp13/openstack-panko-api:13.0-62.1543534139
imagename: registry.access.redhat.com/rhosp13/openstack-rabbitmq:13.0-63.1543534083
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-rabbitmq:13.0-63.1543534083
Completed upload for image registry.access.redhat.com/rhosp13/openstack-nova-compute:13.0-72
imagename: registry.access.redhat.com/rhosp13/openstack-nova-conductor:13.0-66.1543534132
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-nova-conductor:13.0-66.1543534132
Completed upload for image registry.access.redhat.com/rhosp13/openstack-swift-container:13.0-64.1543534110
imagename: registry.access.redhat.com/rhosp13/openstack-swift-object:13.0-61.1543534106
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-swift-object:13.0-61.1543534106
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-nova-conductor:13.0-66.1543534132
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-rabbitmq:13.0-63.1543534083
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-swift-object:13.0-61.1543534106
Completed upload for image registry.access.redhat.com/rhosp13/openstack-nova-conductor:13.0-66.1543534132
imagename: registry.access.redhat.com/rhosp13/openstack-nova-consoleauth:13.0-66.1543534131
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-nova-consoleauth:13.0-66.1543534131
Completed upload for image registry.access.redhat.com/rhosp13/openstack-swift-object:13.0-61.1543534106
Completed upload for image registry.access.redhat.com/rhosp13/openstack-nova-libvirt:13.0-73
imagename: registry.access.redhat.com/rhosp13/openstack-nova-novncproxy:13.0-69.1543534127
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-nova-novncproxy:13.0-69.1543534127
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-nova-consoleauth:13.0-66.1543534131
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-nova-novncproxy:13.0-69.1543534127
Completed upload for image registry.access.redhat.com/rhosp13/openstack-nova-consoleauth:13.0-66.1543534131
Completed upload for image registry.access.redhat.com/rhosp13/openstack-nova-novncproxy:13.0-69.1543534127
imagename: registry.access.redhat.com/rhosp13/openstack-nova-placement-api:13.0-67.1543534124
Running skopeo inspect docker://registry.access.redhat.com/rhosp13/openstack-nova-placement-api:13.0-67.1543534124
Running skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-nova-placement-api:13.0-67.1543534124
Completed upload for image registry.access.redhat.com/rhosp13/openstack-rabbitmq:13.0-63.1543534083
Completed upload for image registry.access.redhat.com/rhosp13/openstack-nova-placement-api:13.0-67.1543534124
result ['registry.access.redhat.com/rhosp13/openstack-swift-proxy-server:13.0-61.1543534108', '172.16.16.10:8787/rhosp13/openstack-swift-proxy-server:13.0-61.1543534108', 'registry.access.redhat.com/rhosp13/openstack-aodh-api:13.0-62.1543534121', '172.16.16.10:8787/rhosp13/openstack-aodh-api:13.0-62.1543534121', 'registry.access.redhat.com/rhosp13/openstack-aodh-evaluator:13.0-62.1543534127', '172.16.16.10:8787/rhosp13/openstack-aodh-evaluator:13.0-62.1543534127', 'registry.access.redhat.com/rhosp13/openstack-aodh-listener:13.0-61.1543534105', '172.16.16.10:8787/rhosp13/openstack-aodh-listener:13.0-61.1543534105', 'registry.access.redhat.com/rhosp13/openstack-aodh-notifier:13.0-62.1543534128', '172.16.16.10:8787/rhosp13/openstack-aodh-notifier:13.0-62.1543534128', 'registry.access.redhat.com/rhosp13/openstack-ceilometer-central:13.0-59.1543534112', '172.16.16.10:8787/rhosp13/openstack-ceilometer-central:13.0-59.1543534112', 'registry.access.redhat.com/rhosp13/openstack-ceilometer-compute:13.0-61.1543534134', '172.16.16.10:8787/rhosp13/openstack-ceilometer-compute:13.0-61.1543534134', 'registry.access.redhat.com/rhosp13/openstack-ceilometer-notification:13.0-61.1543534109', '172.16.16.10:8787/rhosp13/openstack-ceilometer-notification:13.0-61.1543534109', 'registry.access.redhat.com/rhosp13/openstack-cinder-api:13.0-63.1543534127', '172.16.16.10:8787/rhosp13/openstack-cinder-api:13.0-63.1543534127', 'registry.access.redhat.com/rhosp13/openstack-cinder-scheduler:13.0-65.1543534135', '172.16.16.10:8787/rhosp13/openstack-cinder-scheduler:13.0-65.1543534135', 'registry.access.redhat.com/rhosp13/openstack-cinder-volume:13.0-63.1543534127', '172.16.16.10:8787/rhosp13/openstack-cinder-volume:13.0-63.1543534127', 'registry.access.redhat.com/rhosp13/openstack-cron:13.0-66.1543534082', '172.16.16.10:8787/rhosp13/openstack-cron:13.0-66.1543534082', 'registry.access.redhat.com/rhosp13/openstack-glance-api:13.0-64.1543534114', 
'172.16.16.10:8787/rhosp13/openstack-glance-api:13.0-64.1543534114', 'registry.access.redhat.com/rhosp13/openstack-gnocchi-api:13.0-61.1543534135', '172.16.16.10:8787/rhosp13/openstack-gnocchi-api:13.0-61.1543534135', 'registry.access.redhat.com/rhosp13/openstack-gnocchi-metricd:13.0-62.1543534116', '172.16.16.10:8787/rhosp13/openstack-gnocchi-metricd:13.0-62.1543534116', 'registry.access.redhat.com/rhosp13/openstack-gnocchi-statsd:13.0-62.1543534137', '172.16.16.10:8787/rhosp13/openstack-gnocchi-statsd:13.0-62.1543534137', 'registry.access.redhat.com/rhosp13/openstack-haproxy:13.0-62.1543534080', '172.16.16.10:8787/rhosp13/openstack-haproxy:13.0-62.1543534080', 'registry.access.redhat.com/rhosp13/openstack-heat-api-cfn:13.0-60.1543534138', '172.16.16.10:8787/rhosp13/openstack-heat-api-cfn:13.0-60.1543534138', 'registry.access.redhat.com/rhosp13/openstack-heat-api:13.0-61.1543534111', '172.16.16.10:8787/rhosp13/openstack-heat-api:13.0-61.1543534111', 'registry.access.redhat.com/rhosp13/openstack-heat-engine:13.0-59.1543534131', '172.16.16.10:8787/rhosp13/openstack-heat-engine:13.0-59.1543534131', 'registry.access.redhat.com/rhosp13/openstack-horizon:13.0-60.1543534103', '172.16.16.10:8787/rhosp13/openstack-horizon:13.0-60.1543534103', 'registry.access.redhat.com/rhosp13/openstack-iscsid:13.0-60.1543534061', '172.16.16.10:8787/rhosp13/openstack-iscsid:13.0-60.1543534061', 'registry.access.redhat.com/rhosp13/openstack-keystone:13.0-60.1543534107', '172.16.16.10:8787/rhosp13/openstack-keystone:13.0-60.1543534107', 'registry.access.redhat.com/rhosp13/openstack-mariadb:13.0-62.1543534066', '172.16.16.10:8787/rhosp13/openstack-mariadb:13.0-62.1543534066', 'registry.access.redhat.com/rhosp13/openstack-memcached:13.0-61.1543534084', '172.16.16.10:8787/rhosp13/openstack-memcached:13.0-61.1543534084', 'registry.access.redhat.com/rhosp13/openstack-neutron-dhcp-agent:13.0-66.1543534114', '172.16.16.10:8787/rhosp13/openstack-neutron-dhcp-agent:13.0-66.1543534114', 
'registry.access.redhat.com/rhosp13/openstack-neutron-l3-agent:13.0-64.1543534114', '172.16.16.10:8787/rhosp13/openstack-neutron-l3-agent:13.0-64.1543534114', 'registry.access.redhat.com/rhosp13/openstack-neutron-metadata-agent:13.0-65.1543534133', '172.16.16.10:8787/rhosp13/openstack-neutron-metadata-agent:13.0-65.1543534133', 'registry.access.redhat.com/rhosp13/openstack-neutron-openvswitch-agent:13.0-65.1543534110', '172.16.16.10:8787/rhosp13/openstack-neutron-openvswitch-agent:13.0-65.1543534110', 'registry.access.redhat.com/rhosp13/openstack-neutron-server:13.0-64.1543534105', '172.16.16.10:8787/rhosp13/openstack-neutron-server:13.0-64.1543534105', 'registry.access.redhat.com/rhosp13/openstack-nova-api:13.0-67.1543534106', '172.16.16.10:8787/rhosp13/openstack-nova-api:13.0-67.1543534106', 'registry.access.redhat.com/rhosp13/openstack-nova-compute:13.0-72', '172.16.16.10:8787/rhosp13/openstack-nova-compute:13.0-72', 'registry.access.redhat.com/rhosp13/openstack-nova-conductor:13.0-66.1543534132', '172.16.16.10:8787/rhosp13/openstack-nova-conductor:13.0-66.1543534132', 'registry.access.redhat.com/rhosp13/openstack-nova-consoleauth:13.0-66.1543534131', '172.16.16.10:8787/rhosp13/openstack-nova-consoleauth:13.0-66.1543534131', 'registry.access.redhat.com/rhosp13/openstack-nova-libvirt:13.0-73', '172.16.16.10:8787/rhosp13/openstack-nova-libvirt:13.0-73', 'registry.access.redhat.com/rhosp13/openstack-nova-novncproxy:13.0-69.1543534127', '172.16.16.10:8787/rhosp13/openstack-nova-novncproxy:13.0-69.1543534127', 'registry.access.redhat.com/rhosp13/openstack-nova-placement-api:13.0-67.1543534124', '172.16.16.10:8787/rhosp13/openstack-nova-placement-api:13.0-67.1543534124', 'registry.access.redhat.com/rhosp13/openstack-nova-scheduler:13.0-67.1543534138', '172.16.16.10:8787/rhosp13/openstack-nova-scheduler:13.0-67.1543534138', 'registry.access.redhat.com/rhosp13/openstack-panko-api:13.0-62.1543534139', '172.16.16.10:8787/rhosp13/openstack-panko-api:13.0-62.1543534139', 
'registry.access.redhat.com/rhosp13/openstack-rabbitmq:13.0-63.1543534083', '172.16.16.10:8787/rhosp13/openstack-rabbitmq:13.0-63.1543534083', 'registry.access.redhat.com/rhosp13/openstack-redis:13.0-64.1543534104', '172.16.16.10:8787/rhosp13/openstack-redis:13.0-64.1543534104', 'registry.access.redhat.com/rhosp13/openstack-swift-account:13.0-61.1543534113', '172.16.16.10:8787/rhosp13/openstack-swift-account:13.0-61.1543534113', 'registry.access.redhat.com/rhosp13/openstack-swift-container:13.0-64.1543534110', '172.16.16.10:8787/rhosp13/openstack-swift-container:13.0-64.1543534110', 'registry.access.redhat.com/rhosp13/openstack-swift-object:13.0-61.1543534106', '172.16.16.10:8787/rhosp13/openstack-swift-object:13.0-61.1543534106']
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-aodh-api:13.0-62.1543534121
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-aodh-evaluator:13.0-62.1543534127
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-aodh-listener:13.0-61.1543534105
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-aodh-notifier:13.0-62.1543534128
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-ceilometer-central:13.0-59.1543534112
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-ceilometer-compute:13.0-61.1543534134
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-ceilometer-notification:13.0-61.1543534109
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-cinder-api:13.0-63.1543534127
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-cinder-scheduler:13.0-65.1543534135
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-cinder-volume:13.0-63.1543534127
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-cron:13.0-66.1543534082
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-glance-api:13.0-64.1543534114
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-gnocchi-api:13.0-61.1543534135
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-gnocchi-metricd:13.0-62.1543534116
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-gnocchi-statsd:13.0-62.1543534137
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-haproxy:13.0-62.1543534080
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-heat-api-cfn:13.0-60.1543534138
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-heat-api:13.0-61.1543534111
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-heat-engine:13.0-59.1543534131
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-horizon:13.0-60.1543534103
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-iscsid:13.0-60.1543534061
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-keystone:13.0-60.1543534107
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-mariadb:13.0-62.1543534066
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-memcached:13.0-61.1543534084
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-neutron-dhcp-agent:13.0-66.1543534114
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-neutron-l3-agent:13.0-64.1543534114
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-neutron-metadata-agent:13.0-65.1543534133
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-neutron-openvswitch-agent:13.0-65.1543534110
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-neutron-server:13.0-64.1543534105
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-nova-api:13.0-67.1543534106
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-nova-compute:13.0-72
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-nova-conductor:13.0-66.1543534132
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-nova-consoleauth:13.0-66.1543534131
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-nova-libvirt:13.0-73
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-nova-novncproxy:13.0-69.1543534127
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-nova-placement-api:13.0-67.1543534124
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-nova-scheduler:13.0-67.1543534138
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-panko-api:13.0-62.1543534139
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-rabbitmq:13.0-63.1543534083
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-redis:13.0-64.1543534104
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-swift-account:13.0-61.1543534113
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-swift-container:13.0-64.1543534110
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-swift-object:13.0-61.1543534106
Removing local copy of 172.16.16.10:8787/rhosp13/openstack-swift-proxy-server:13.0-61.1543534108
Removing local copy of registry.access.redhat.com/rhosp13/openstack-aodh-api:13.0-62.1543534121
Removing local copy of registry.access.redhat.com/rhosp13/openstack-aodh-evaluator:13.0-62.1543534127
Removing local copy of registry.access.redhat.com/rhosp13/openstack-aodh-listener:13.0-61.1543534105
Removing local copy of registry.access.redhat.com/rhosp13/openstack-aodh-notifier:13.0-62.1543534128
Removing local copy of registry.access.redhat.com/rhosp13/openstack-ceilometer-central:13.0-59.1543534112
Removing local copy of registry.access.redhat.com/rhosp13/openstack-ceilometer-compute:13.0-61.1543534134
Removing local copy of registry.access.redhat.com/rhosp13/openstack-ceilometer-notification:13.0-61.1543534109
Removing local copy of registry.access.redhat.com/rhosp13/openstack-cinder-api:13.0-63.1543534127
Removing local copy of registry.access.redhat.com/rhosp13/openstack-cinder-scheduler:13.0-65.1543534135
Removing local copy of registry.access.redhat.com/rhosp13/openstack-cinder-volume:13.0-63.1543534127
Removing local copy of registry.access.redhat.com/rhosp13/openstack-cron:13.0-66.1543534082
Removing local copy of registry.access.redhat.com/rhosp13/openstack-glance-api:13.0-64.1543534114
Removing local copy of registry.access.redhat.com/rhosp13/openstack-gnocchi-api:13.0-61.1543534135
Removing local copy of registry.access.redhat.com/rhosp13/openstack-gnocchi-metricd:13.0-62.1543534116
Removing local copy of registry.access.redhat.com/rhosp13/openstack-gnocchi-statsd:13.0-62.1543534137
Removing local copy of registry.access.redhat.com/rhosp13/openstack-haproxy:13.0-62.1543534080
Removing local copy of registry.access.redhat.com/rhosp13/openstack-heat-api-cfn:13.0-60.1543534138
Removing local copy of registry.access.redhat.com/rhosp13/openstack-heat-api:13.0-61.1543534111
Removing local copy of registry.access.redhat.com/rhosp13/openstack-heat-engine:13.0-59.1543534131
Removing local copy of registry.access.redhat.com/rhosp13/openstack-horizon:13.0-60.1543534103
Removing local copy of registry.access.redhat.com/rhosp13/openstack-iscsid:13.0-60.1543534061
Removing local copy of registry.access.redhat.com/rhosp13/openstack-keystone:13.0-60.1543534107
Removing local copy of registry.access.redhat.com/rhosp13/openstack-mariadb:13.0-62.1543534066
Removing local copy of registry.access.redhat.com/rhosp13/openstack-memcached:13.0-61.1543534084
Removing local copy of registry.access.redhat.com/rhosp13/openstack-neutron-dhcp-agent:13.0-66.1543534114
Removing local copy of registry.access.redhat.com/rhosp13/openstack-neutron-l3-agent:13.0-64.1543534114
Removing local copy of registry.access.redhat.com/rhosp13/openstack-neutron-metadata-agent:13.0-65.1543534133
Removing local copy of registry.access.redhat.com/rhosp13/openstack-neutron-openvswitch-agent:13.0-65.1543534110
Removing local copy of registry.access.redhat.com/rhosp13/openstack-neutron-server:13.0-64.1543534105
Removing local copy of registry.access.redhat.com/rhosp13/openstack-nova-api:13.0-67.1543534106
Removing local copy of registry.access.redhat.com/rhosp13/openstack-nova-compute:13.0-72
Removing local copy of registry.access.redhat.com/rhosp13/openstack-nova-conductor:13.0-66.1543534132
Removing local copy of registry.access.redhat.com/rhosp13/openstack-nova-consoleauth:13.0-66.1543534131
Removing local copy of registry.access.redhat.com/rhosp13/openstack-nova-libvirt:13.0-73
Removing local copy of registry.access.redhat.com/rhosp13/openstack-nova-novncproxy:13.0-69.1543534127
Removing local copy of registry.access.redhat.com/rhosp13/openstack-nova-placement-api:13.0-67.1543534124
Removing local copy of registry.access.redhat.com/rhosp13/openstack-nova-scheduler:13.0-67.1543534138
Removing local copy of registry.access.redhat.com/rhosp13/openstack-panko-api:13.0-62.1543534139
Removing local copy of registry.access.redhat.com/rhosp13/openstack-rabbitmq:13.0-63.1543534083
Removing local copy of registry.access.redhat.com/rhosp13/openstack-redis:13.0-64.1543534104
Removing local copy of registry.access.redhat.com/rhosp13/openstack-swift-account:13.0-61.1543534113
Removing local copy of registry.access.redhat.com/rhosp13/openstack-swift-container:13.0-64.1543534110
Removing local copy of registry.access.redhat.com/rhosp13/openstack-swift-object:13.0-61.1543534106
Removing local copy of registry.access.redhat.com/rhosp13/openstack-swift-proxy-server:13.0-61.1543534108
END return value: None
Check the local registry's repository folder:
[stack@lab-director ~]$ ls /var/lib/registry/docker/registry/v2/repositories/rhosp13/
openstack-aodh-api openstack-glance-api openstack-mariadb openstack-nova-libvirt
openstack-aodh-evaluator openstack-gnocchi-api openstack-memcached openstack-nova-novncproxy
openstack-aodh-listener openstack-gnocchi-metricd openstack-neutron-dhcp-agent openstack-nova-placement-api
openstack-aodh-notifier openstack-gnocchi-statsd openstack-neutron-l3-agent openstack-nova-scheduler
openstack-ceilometer-central openstack-haproxy openstack-neutron-metadata-agent openstack-panko-api
openstack-ceilometer-compute openstack-heat-api openstack-neutron-openvswitch-agent openstack-rabbitmq
openstack-ceilometer-notification openstack-heat-api-cfn openstack-neutron-server openstack-redis
openstack-cinder-api openstack-heat-engine openstack-nova-api openstack-swift-account
openstack-cinder-scheduler openstack-horizon openstack-nova-compute openstack-swift-container
openstack-cinder-volume openstack-iscsid openstack-nova-conductor openstack-swift-object
openstack-cron openstack-keystone openstack-nova-consoleauth openstack-swift-proxy-server
[stack@lab-director ~]$ du -d1 -h /var/lib/registry/docker/registry/v2/
1.9M /var/lib/registry/docker/registry/v2/repositories
2.0G /var/lib/registry/docker/registry/v2/blobs
2.0G /var/lib/registry/docker/registry/v2/
[stack@lab-director ~]$ skopeo inspect --tls-verify=false docker://172.16.16.10:8787/rhosp13/openstack-nova-api:13.0-67.1543534106
{
"Name": "172.16.16.10:8787/rhosp13/openstack-nova-api",
"Digest": "sha256:4755d77d47b279af38d64e64d0461a74b2e8d363ee73231a319e5b0d0068a928",
"RepoTags": [
"13.0-67.1543534106"
],
"Created": "2018-11-30T02:37:21.055976Z",
"DockerVersion": "1.13.1",
"Labels": {
"architecture": "x86_64",
"authoritative-source-url": "registry.access.redhat.com",
"batch": "20181102.1",
"build-date": "2018-11-30T02:34:08.660768",
"com.redhat.build-host": "cpt-0002.osbs.prod.upshift.rdu2.redhat.com",
"com.redhat.component": "openstack-nova-api-container",
"description": "Red Hat OpenStack Platform 13.0 nova-api",
"distribution-scope": "public",
"io.k8s.description": "Red Hat OpenStack Platform 13.0 nova-api",
"io.k8s.display-name": "Red Hat OpenStack Platform 13.0 nova-api",
"io.openshift.tags": "rhosp osp openstack osp-13.0",
"name": "rhosp13/openstack-nova-api",
"release": "67.1543534106",
"summary": "Red Hat OpenStack Platform 13.0 nova-api",
"url": "https://access.redhat.com/containers/#/registry.access.redhat.com/rhosp13/openstack-nova-api/images/13.0-67.1543534106",
"vcs-ref": "a47b6e4a2095d2a35b9374efbd654bb1b10af6db",
"vcs-type": "git",
"vendor": "Red Hat, Inc.",
"version": "13.0"
},
"Architecture": "amd64",
"Os": "linux",
"Layers": [
"sha256:9a1bea865f798d0e4f2359bd39ec69110369e3a1131aba6eb3cbf48707fdf92d",
"sha256:602125c154e3e132db63d8e6479c5c93a64cbfd3a5ced509de73891ff7102643",
"sha256:67a69bb7b40625fdd599449d78e504b44f8127b442e0e825f84a6e1067c332c3",
"sha256:ce39df8a199d647b2ef2d1730e989fb4d2431793436b5482e746f44bb38436b6",
"sha256:55d173d79e115a436595edc2745780d8148367caddcf2de135c222decc9a8b38",
"sha256:e64d60b009d2ed36dafe05b47c4a774c0748880aff90e0405089c6b406c95638"
]
}
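After mirroring, it can be handy to compare the digest reported by the local registry against the upstream one. As a minimal sketch with no jq dependency assumed, the `Digest` field can be pulled out of saved `skopeo inspect` output with grep and sed; the JSON excerpt below is taken from the inspect output above:

```shell
# Extract the "Digest" value from saved `skopeo inspect` output using
# only grep/sed; the excerpt is from the nova-api inspect shown above.
inspect_json='{
  "Name": "172.16.16.10:8787/rhosp13/openstack-nova-api",
  "Digest": "sha256:4755d77d47b279af38d64e64d0461a74b2e8d363ee73231a319e5b0d0068a928"
}'
digest=$(printf '%s\n' "$inspect_json" \
  | grep '"Digest"' \
  | sed 's/.*"\(sha256:[0-9a-f]*\)".*/\1/')
echo "$digest"
```

Running the same extraction against both the upstream and the 172.16.16.10:8787 copy should yield identical digests if the mirror is intact.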
Validate the listening port of the local registry:
[stack@lab-director ~]$ sudo netstat -tepln |grep 8787
tcp 0 0 172.16.16.10:8787 0.0.0.0:* LISTEN 0 39020 4734/registry
Check IPMI connectivity:
[stack@lab-director ~]$ ipmitool -I lanplus -H lab604-oob.lan.redhat.com -U root -P XXXXXXX chassis power status
Chassis Power is on
[stack@lab-director ~]$ ping lab604-oob.lan.redhat.com
PING lab604-oob.lan.redhat.com (10.19.152.193) 56(84) bytes of data.
64 bytes from lab604-oob.lan.redhat.com (10.19.152.193): icmp_seq=1 ttl=63 time=0.399 ms
IPMI test scripts:
[stack@lab-director ~]$ cat ipmi-test.sh
#!/bin/bash
ipmitool -I lanplus -H lab604-oob.lan.redhat.com -U root -P XXXXXX chassis power status
ipmitool -I lanplus -U admin -P XXXXXXXX -H 172.16.16.1 -p 6230 power status
[stack@lab-director ~]$ cat ipmi-off.sh
#!/bin/bash
ipmitool -I lanplus -H lab604-oob.lan.redhat.com -U root -P XXXXXX chassis power off
ipmitool -I lanplus -U admin -P XXXXXXXX -H 172.16.16.1 -p 6230 power off
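The two one-off scripts above can be folded into a single loop. The sketch below is a dry run: it prints the ipmitool invocation for each BMC instead of executing it (drop the leading `echo` to run for real). Hosts, ports, and users are the lab values from the scripts above; port 623 is the IPMI default used when `-p` is omitted.

```shell
#!/bin/bash
# Dry-run sketch: iterate over both BMCs (host:port:user triples taken
# from the scripts above) and print the ipmitool command that would be
# run; remove the leading `echo` to actually query power state.
for node in "lab604-oob.lan.redhat.com:623:root" "172.16.16.1:6230:admin"; do
  host=${node%%:*}; rest=${node#*:}
  port=${rest%%:*}; user=${rest#*:}
  echo ipmitool -I lanplus -H "$host" -p "$port" -U "$user" -P XXXXXX chassis power status
done
```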
Register nodes
Prepare the inventory file instackenv.json:
[stack@lab-director ~]$ source ~/stackrc
(undercloud) [stack@lab-director ~]$ cat instackenv.json
{
"nodes": [
{
"capabilities": "profile:control",
"name": "lab-controller",
"pm_user": "admin",
"pm_password": "XXXXXXXX",
"pm_port": "6230",
"pm_addr": "192.168.122.1",
"pm_type": "pxe_ipmitool",
"mac": [
"52:54:00:f9:d1:7a"
]
},
{
"capabilities": "profile:compute",
"name": "lab-compute",
"pm_user": "root",
"pm_password": "XXXXXX",
"pm_addr": "10.19.152.193",
"pm_type": "pxe_ipmitool",
"mac": [
"18:66:da:fc:9c:2d"
]
}
]
}
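Before running the import, a quick sanity check that every node entry carries the fields the pxe_ipmitool driver needs can save a failed registration. Below is a minimal grep-based sketch, not a full JSON validator; it generates a one-node sample file in a temp directory so it is self-contained, but in practice it would be pointed at ~/instackenv.json:

```shell
# Minimal sanity check: confirm the keys required by the pxe_ipmitool
# driver appear in the inventory. A real run would target
# ~/instackenv.json; a small sample is written here so the sketch is
# self-contained.
workdir=$(mktemp -d)
cat > "$workdir/instackenv.json" <<'EOF'
{"nodes": [{"name": "lab-compute", "pm_user": "root", "pm_password": "XXXXXX",
            "pm_addr": "10.19.152.193", "pm_type": "pxe_ipmitool",
            "mac": ["18:66:da:fc:9c:2d"]}]}
EOF
missing=0
for key in pm_user pm_password pm_addr pm_type mac; do
  grep -q "\"$key\"" "$workdir/instackenv.json" || { echo "missing: $key" >&2; missing=1; }
done
[ "$missing" -eq 0 ] && echo "inventory looks complete"
```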
(undercloud) [stack@lab-director ~]$ openstack baremetal node list
(undercloud) [stack@lab-director ~]$ openstack overcloud node import ~/instackenv.json
Started Mistral Workflow tripleo.baremetal.v1.register_or_update. Execution ID: e05ebfc9-0ea8-4977-ac5a-0da038f90adf
Waiting for messages on queue 'tripleo' with no timeout.
2 node(s) successfully moved to the "manageable" state.
Successfully registered node UUID d5da646a-92e7-48b3-bb98-e6074bcd910c
Successfully registered node UUID 18dc4573-c632-4617-92d9-4745b9825234
Check the baremetal node list:
(undercloud) [stack@lab-director ~]$ openstack baremetal node list
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
| UUID | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
| d5da646a-92e7-48b3-bb98-e6074bcd910c | lab-controller | None | power off | manageable | False |
| 18dc4573-c632-4617-92d9-4745b9825234 | lab-compute | None | power off | manageable | False |
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
Introspection
Launch the introspection:
(undercloud) [stack@lab-director ~]$ openstack overcloud node introspect --all-manageable --provide
Waiting for introspection to finish...
Started Mistral Workflow tripleo.baremetal.v1.introspect_manageable_nodes. Execution ID: 817ce746-cd44-4c1c-af3e-d026b7ec8ad0
Waiting for messages on queue 'tripleo' with no timeout.
Introspection of node 9f91cb39-b6f7-421f-9abb-54201229fd46 completed. Status:SUCCESS. Errors:None
Introspection of node fbd2c0a3-8605-4aac-9d8e-36b52b2c77da completed. Status:SUCCESS. Errors:None
Successfully introspected 2 node(s).
Introspection completed.
Started Mistral Workflow tripleo.baremetal.v1.provide_manageable_nodes. Execution ID: cc6f3e2a-0819-42ab-8c38-40baf0a52d64
Waiting for messages on queue 'tripleo' with no timeout.
2 node(s) successfully moved to the "available" state.
Prepare the controller
Introspection done:
(undercloud) [stack@lab-director ~]$ openstack baremetal node list
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
| UUID | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
| d5da646a-92e7-48b3-bb98-e6074bcd910c | lab-controller | None | power off | available | False |
| 18dc4573-c632-4617-92d9-4745b9825234 | lab-compute | None | power off | available | False |
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
(undercloud) [stack@lab-director ~]$ openstack baremetal introspection list
+--------------------------------------+---------------------+---------------------+-------+
| UUID | Started at | Finished at | Error |
+--------------------------------------+---------------------+---------------------+-------+
| 18dc4573-c632-4617-92d9-4745b9825234 | 2018-12-14T22:13:39 | 2018-12-14T22:20:02 | None |
| d5da646a-92e7-48b3-bb98-e6074bcd910c | 2018-12-14T22:13:38 | 2018-12-14T22:15:30 | None |
+--------------------------------------+---------------------+---------------------+-------+
Check the profiles imported from instackenv.json:
(undercloud) [stack@lab-director ~]$ openstack overcloud profiles list
+--------------------------------------+--------------------+-----------------+-----------------+-------------------+
| Node UUID | Node Name | Provision State | Current Profile | Possible Profiles |
+--------------------------------------+--------------------+-----------------+-----------------+-------------------+
| d5da646a-92e7-48b3-bb98-e6074bcd910c | lab-controller | available | control | |
| 18dc4573-c632-4617-92d9-4745b9825234 | lab-compute | available | compute | |
+--------------------------------------+--------------------+-----------------+-----------------+-------------------+
List interfaces discovered with the introspection:
(undercloud) [stack@lab-director templates]$ openstack baremetal introspection interface list lab-controller
+-----------+-------------------+----------------------+-------------------+----------------+
| Interface | MAC Address | Switch Port VLAN IDs | Switch Chassis ID | Switch Port ID |
+-----------+-------------------+----------------------+-------------------+----------------+
| eth0 | 52:54:00:f9:d1:7a | [] | None | None |
+-----------+-------------------+----------------------+-------------------+----------------+
(undercloud) [stack@lab-director templates]$ openstack baremetal introspection interface list lab-compute
+-----------+-------------------+----------------------+-------------------+-------------------+
| Interface | MAC Address | Switch Port VLAN IDs | Switch Chassis ID | Switch Port ID |
+-----------+-------------------+----------------------+-------------------+-------------------+
| p2p1 | 3c:fd:fe:1f:bc:80 | [] | None | None |
| em4 | 18:66:da:fc:9c:2f | [] | None | None |
| p2p2 | 3c:fd:fe:1f:bc:82 | [] | None | None |
| em3 | 18:66:da:fc:9c:2e | [] | None | None |
| em2 | 18:66:da:fc:9c:2d | [] | d0:94:66:59:c7:aa | d0:94:66:59:c7:aa |
| em1 | 18:66:da:fc:9c:2c | [179] | 08:81:f4:a6:9e:80 | ge-1/0/35 |
+-----------+-------------------+----------------------+-------------------+-------------------+
List the GPUs on the compute node (manually install RHEL from a DVD to get the PCI IDs):
[root@rhel-compute ~]# lspci -nn | grep -i nvidia
06:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
07:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
84:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
85:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
86:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
87:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
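The vendor:product pairs in square brackets (`10de:13f2` for the M60, `10de:13bd` for the M10) are what the Nova templates below need. As a hypothetical convenience, the IDs can be pulled out of `lspci -nn` output programmatically rather than by eye:

```python
import re
from collections import Counter

def gpu_ids(lspci_output, vendor="10de"):
    """Count PCI (vendor_id, product_id) pairs for one vendor in
    `lspci -nn` output. 10de is NVIDIA's PCI vendor ID."""
    counts = Counter()
    for line in lspci_output.splitlines():
        # The class code ([0300]) and marketing name ([Tesla M60]) do not
        # match this pattern; only the hex vendor:product pair does.
        ids = re.findall(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]", line)
        if ids and ids[-1][0] == vendor:
            counts[ids[-1]] += 1
    return counts
```

On the compute node above this would report two `10de:13f2` devices (M60) and four `10de:13bd` devices (M10).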
Prepare templates
[stack@lab-director ~]$ cat << EOF > ~/overcloud-deploy.sh
#!/bin/bash
time openstack overcloud deploy \
--templates /usr/share/openstack-tripleo-heat-templates \
-e /home/stack/templates/node-info.yaml \
-e /home/stack/templates/custom-domain.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/templates/network-environment.yaml \
-e /home/stack/templates/overcloud_images.yaml \
-e /home/stack/templates/first-boot-env.yaml \
-e /home/stack/templates/controller-params.yaml \
-e /home/stack/templates/compute-params.yaml \
--timeout 60
EOF
PCI passthrough Compute IOMMU template
[stack@lab-director ~]$ cat << EOF > first-boot-env.yaml
resource_registry:
OS::TripleO::NodeUserData: /home/stack/templates/first-boot.yaml
EOF
PCI passthrough IOMMU configuration, applied at first boot on the compute nodes:
[stack@lab-director ~]$ cat << EOF > first-boot.yaml
heat_template_version: 2014-10-16
resources:
userdata:
type: OS::Heat::MultipartMime
properties:
parts:
- config: {get_resource: compute_kernel_args}
# Logs can be checked on /var/log/cloud-init.log on the overcloud node
compute_kernel_args:
type: OS::Heat::SoftwareConfig
properties:
config: |
#!/bin/bash
set -x
echo "First boot started" > /tmp/first-boot.log
# Set grub parameters
if hostname | grep compute >/dev/null
then
sed -i.orig 's/quiet"$/quiet intel_iommu=on iommu=pt"/' /etc/default/grub
grub2-mkconfig -o /etc/grub2.cfg
systemctl stop os-collect-config.service
/sbin/reboot
fi
outputs:
OS::stack_id:
value: {get_resource: userdata}
EOF
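The only line in the first-boot script that actually changes the system is the `sed` on `/etc/default/grub`: it appends `intel_iommu=on iommu=pt` to any kernel command line ending in `quiet"`, then regenerates the GRUB config and reboots. The same substitution can be expressed in Python to make its behavior explicit:

```python
import re

def enable_iommu(grub_default):
    """Mirror the first-boot sed: append IOMMU flags to every
    GRUB_CMDLINE_* line in /etc/default/grub that ends in quiet"."""
    return re.sub(r'quiet"$', 'quiet intel_iommu=on iommu=pt"',
                  grub_default, flags=re.MULTILINE)
```

Note the pattern anchors on `quiet"$`, so running it a second time on an already-edited file is a no-op (the line now ends in `pt"`), which keeps the first-boot script idempotent.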
PCI passthrough Controller template
PCI passthrough configuration for the controller nodes:
[stack@lab-director ~]$ cat << EOF > ~/templates/controller-params.yaml
parameter_defaults:
ControllerExtraConfig:
nova::api::pci_alias:
- name: gpu_m10
vendor_id: '10de'
product_id: '13bd'
- name: gpu_m60
vendor_id: '10de'
product_id: '13f2'
EOF
PCI passthrough Compute template
PCI passthrough configuration for the compute nodes:
[stack@lab-director ~]$ cat << EOF > ~/templates/compute-params.yaml
parameter_defaults:
NovaPCIPassthrough:
- vendor_id: '10de'
product_id: '13bd'
- vendor_id: '10de'
product_id: '13f2'
EOF
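Together, the two templates split responsibilities: the controller's `pci_alias` entries give each GPU model a name that flavors can reference, while the compute node's `NovaPCIPassthrough` whitelist decides which host devices nova-compute is allowed to hand out. A rough sketch of that whitelist matching (not Nova's actual implementation):

```python
def passthrough_eligible(devices, whitelist):
    """Return the host PCI devices whose (vendor_id, product_id) matches
    a whitelist entry, roughly how the NovaPCIPassthrough list filters
    what nova-compute reports as assignable."""
    allowed = {(w["vendor_id"], w["product_id"]) for w in whitelist}
    return [d for d in devices
            if (d["vendor_id"], d["product_id"]) in allowed]
```

With the whitelist above, all six GPUs (two M60, four M10) are assignable, while any other PCI device on the host stays invisible to Nova.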
[stack@lab-director ~]$ cat << EOF > custom-domain.yaml
parameter_defaults:
CloudDomain: lan.redhat.com
CloudName: overcloud.lan.redhat.com
CloudNameInternal: overcloud.internalapi.lan.redhat.com
CloudNameStorage: overcloud.storage.lan.redhat.com
CloudNameStorageManagement: overcloud.storagemgmt.lan.redhat.com
CloudNameCtlplane: overcloud.ctlplane.lan.redhat.com
EOF
[stack@lab-director ~]$ cat << EOF > network-environment.yaml
resource_registry:
OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/templates/nic-configs/compute.yaml
OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/templates/nic-configs/controller.yaml
#OS::TripleO::AllNodes::Validation: OS::Heat::None
parameter_defaults:
# This sets 'external_network_bridge' in l3_agent.ini to an empty string
# so that external networks act like provider bridge networks (they
# will plug into br-int instead of br-ex)
NeutronExternalNetworkBridge: "''"
# Internal API used for private OpenStack Traffic
InternalApiNetCidr: 172.16.2.0/24
InternalApiAllocationPools: [{'start': '172.16.2.50', 'end': '172.16.2.100'}]
InternalApiNetworkVlanID: 102
# Tenant Network Traffic - will be used for VXLAN over VLAN
TenantNetCidr: 172.16.3.0/24
TenantAllocationPools: [{'start': '172.16.3.50', 'end': '172.16.3.100'}]
TenantNetworkVlanID: 103
# Public Storage Access - e.g. Nova/Glance <--> Ceph
StorageNetCidr: 172.16.4.0/24
StorageAllocationPools: [{'start': '172.16.4.50', 'end': '172.16.4.100'}]
StorageNetworkVlanID: 104
# Private Storage Access - i.e. Ceph background cluster/replication
StorageMgmtNetCidr: 172.16.5.0/24
StorageMgmtAllocationPools: [{'start': '172.16.5.50', 'end': '172.16.5.100'}]
StorageMgmtNetworkVlanID: 105
# External Networking Access - Public API Access
ExternalNetCidr: 172.16.0.0/24
# Leave room for floating IPs in the External allocation pool (if required)
ExternalAllocationPools: [{'start': '172.16.0.50', 'end': '172.16.0.250'}]
# Set to the router gateway on the external network
ExternalInterfaceDefaultRoute: 172.16.0.1
ExternalNetworkVlanID: 101
# Add in configuration for the Control Plane
ControlPlaneSubnetCidr: "24"
ControlPlaneDefaultRoute: 172.16.16.1
EC2MetadataIp: 172.16.16.10
DnsServers: ['10.16.36.29']
EOF
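A frequent deployment failure is an allocation pool that falls outside its network's CIDR (easy to do when copy-pasting pool ranges between networks). A small sanity check with the standard `ipaddress` module, offered here as a hypothetical pre-flight helper, catches this before the Heat stack does:

```python
import ipaddress

def pools_in_cidr(cidr, pools):
    """True when every allocation-pool start/end address in the
    template lies inside the given network CIDR."""
    net = ipaddress.ip_network(cidr)
    return all(ipaddress.ip_address(pool[key]) in net
               for pool in pools
               for key in ("start", "end"))
```

For example, `pools_in_cidr("172.16.2.0/24", [{'start': '172.16.2.50', 'end': '172.16.2.100'}])` confirms the InternalApi pool above is consistent with its CIDR.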
[stack@lab-director ~]$ cat << EOF > node-info.yaml
parameter_defaults:
OvercloudControllerFlavor: control
OvercloudComputeFlavor: compute
ControllerCount: 1
ComputeCount: 1
NtpServer: 'clock.redhat.com'
NeutronNetworkType: 'vxlan,vlan'
NeutronTunnelTypes: 'vxlan'
EOF
Launch the deployment
Deploy RHOSP 13:
(undercloud) [stack@lab-director ~]$ ./overcloud-deploy.sh
Started Mistral Workflow tripleo.validations.v1.check_pre_deployment_validations. Execution ID: 32edb257-034a-4a21-9f46-0042cdfeeba4
Waiting for messages on queue 'tripleo' with no timeout.
Removing the current plan files
Uploading new plan files
Started Mistral Workflow tripleo.plan_management.v1.update_deployment_plan. Execution ID: ddf8a8f0-6fef-423b-9352-c86010b1cbbd
Plan updated.
Processing templates in the directory /tmp/tripleoclient-kYstOU/tripleo-heat-templates
Started Mistral Workflow tripleo.plan_management.v1.get_deprecated_parameters. Execution ID: f0282d2b-6f18-417b-9951-684b6530856b
WARNING: Following parameters are defined but not used in plan. Could be possible that parameter is valid but currently not used.
…
2018-12-15 14:37:34Z [overcloud.AllNodesDeploySteps]: CREATE_COMPLETE state changed
2018-12-15 14:37:34Z [overcloud]: CREATE_COMPLETE Stack CREATE completed successfully
Stack overcloud CREATE_COMPLETE
Started Mistral Workflow tripleo.deployment.v1.get_horizon_url. Execution ID: cc3de18a-d40f-49bd-b53e-a5d86e3e012b
Overcloud Endpoint: http://172.16.0.54:5000/
Overcloud Horizon Dashboard URL: http://172.16.0.54:80/dashboard
Overcloud rc file: /home/stack/overcloudrc
Overcloud Deployed
real 55m4.133s
user 0m8.471s
sys 0m0.802s
Check if Keystone can deliver tokens:
(overcloud) [stack@lab-director ~]$ openstack token issue
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| expires | 2018-10-12T18:16:18+0000 |
| id | gAAAAABbwNbiFtcgz9w-klGmbHdzIdEffqlEbduDmWURwnO_Knu96rLd7Vsl9RJ8ymqMp8ZMl3UvCJTrJVfL7yRQkl1WxPHX_TbNoDPSUFknBtPSKZ4wF1Lv6vgiv1ZmaMJy0LCJhMgBeqqRggujYqf2UxVFsk81_Ml6TjTatqGdVnaYDLDOI5s |
| project_id | 8356795b0b7d43d7a36768a6fb0fc0cb |
| user_id | 163a3ca9bc3848d3932dce876a8b3354 |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
(overcloud) [stack@lab-director ~]$ openstack hypervisor list
+----+------------------------------------+-----------------+-------------+-------+
| ID | Hypervisor Hostname | Hypervisor Type | Host IP | State |
+----+------------------------------------+-----------------+-------------+-------+
| 1 | overcloud-compute-0.lan.redhat.com | QEMU | 172.16.2.72 | up |
+----+------------------------------------+-----------------+-------------+-------+
Check PCI
Check configuration prepared by director:
[root@overcloud-controller-0 docker]# grep -r -A1 gpu_m10 /var/lib/config-data/puppet-generated/nova/etc/*
/var/lib/config-data/puppet-generated/nova/etc/nova/nova.conf:alias={"name":"gpu_m10","product_id":"13bd","vendor_id":"10de"}
/var/lib/config-data/puppet-generated/nova/etc/nova/nova.conf:alias={"name":"gpu_m60","product_id":"13f2","vendor_id":"10de"}
Check if IOMMU is enabled:
[root@overcloud-compute-0 ~]# cat /var/log/messages | grep -e DMAR -e IOMMU | tail -10
Dec 15 09:03:24 overcloud-compute-0 kernel: DMAR: Setting RMRR:
Dec 15 09:03:24 overcloud-compute-0 kernel: DMAR: Ignoring identity map for HW passthrough device 0000:00:1a.0 [0x7ae07000 - 0x7af06fff]
Dec 15 09:03:24 overcloud-compute-0 kernel: DMAR: Ignoring identity map for HW passthrough device 0000:00:1d.0 [0x7ae07000 - 0x7af06fff]
Dec 15 09:03:24 overcloud-compute-0 kernel: DMAR: Prepare 0-16MiB unity mapping for LPC
Dec 15 09:03:24 overcloud-compute-0 kernel: DMAR: Ignoring identity map for HW passthrough device 0000:00:1f.0 [0x0 - 0xffffff]
Dec 15 09:03:24 overcloud-compute-0 kernel: DMAR: Intel(R) Virtualization Technology for Directed I/O
Dec 15 09:03:24 overcloud-compute-0 kernel: DMAR: 32bit 0000:00:1a.0 uses non-identity mapping
Dec 15 09:03:24 overcloud-compute-0 kernel: DMAR: Setting identity map for device 0000:00:1a.0 [0x7ae07000 - 0x7af06fff]
Dec 15 09:03:24 overcloud-compute-0 kernel: DMAR: 32bit 0000:00:1d.0 uses non-identity mapping
Dec 15 09:03:24 overcloud-compute-0 kernel: DMAR: Setting identity map for device 0000:00:1d.0 [0x7ae07000 - 0x7af06fff]
Prepare environment
Set up a basic environment with an external network, an SSH key, a default image, one flavor, and floating IPs:
[stack@lab-director ~]$ cat << EOF > overcloud-prepare.sh
#!/bin/bash
source /home/stack/overcloudrc
openstack network create --external --share --provider-physical-network datacentre --provider-network-type vlan --provider-segment 101 external
openstack subnet create external --gateway 172.16.0.1 --allocation-pool start=172.16.0.20,end=172.16.0.49 --no-dhcp --network external --subnet-range 172.16.0.0/24
openstack router create router0
neutron router-gateway-set router0 external
openstack network create internal0
openstack subnet create subnet0 --network internal0 --subnet-range 172.31.0.0/24 --dns-nameserver 10.16.36.29
openstack router add subnet router0 subnet0
curl "https://launchpad.net/~egallen/+sshkeys" -o ~/.ssh/id_rsa_lambda.pub
openstack keypair create --public-key ~/.ssh/id_rsa_lambda.pub lambda
openstack flavor create --ram 1024 --disk 40 --vcpus 2 m1.small
curl http://download.cirros-cloud.net/0.3.5/cirros-0.3.5-x86_64-disk.img -o /var/images/x86_64/cirros-0.3.5-x86_64-disk.img
openstack image create cirros035 --file /var/images/x86_64/cirros-0.3.5-x86_64-disk.img --disk-format qcow2 --container-format bare --public
openstack floating ip create external
openstack floating ip create external
openstack floating ip create external
openstack floating ip create external
openstack security group create web --description "Web servers"
openstack security group rule create --protocol icmp web
openstack security group rule create --protocol tcp --dst-port 22:22 --src-ip 0.0.0.0/0 web
openstack server create --flavor m1.small --image cirros035 --security-group web --nic net-id=internal0 --key-name lambda instance0
FLOATING_IP_ID=\$( openstack floating ip list -f value -c ID --status 'DOWN' | head -n 1 )
openstack server add floating ip instance0 \$FLOATING_IP_ID
openstack server list
echo "To connect to the test instance use: openstack server ssh instance0 --login cirros"
EOF
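The `FLOATING_IP_ID` line in the script picks the first floating IP whose status is `DOWN` (i.e. not yet bound to a port) before attaching it to `instance0`. The same selection logic, sketched in Python for clarity:

```python
def first_down_floating_ip(fips):
    """Pick the first unattached floating IP, mirroring the script's
    `openstack floating ip list --status DOWN | head -n 1` pipeline.
    Returns the floating IP's ID, or None if all are in use."""
    for fip in fips:
        if fip["status"] == "DOWN":
            return fip["id"]
    return None
```

Selecting by status rather than by a hard-coded address lets the script keep working no matter which addresses Neutron happened to allocate from the external pool.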
Launch the setup script:
[stack@lab-director ~]$ ./overcloud-prepare.sh
+---------------------------+--------------------------------------+
| Field | Value |
+---------------------------+--------------------------------------+
| admin_state_up | UP |
| availability_zone_hints | |
| availability_zones | |
| created_at | 2018-12-15T22:26:33Z |
| description | |
| dns_domain | None |
| id | 90ba298d-238f-4c77-bd5c-c0a6d4e7e8d9 |
| ipv4_address_scope | None |
| ipv6_address_scope | None |
| is_default | False |
| is_vlan_transparent | None |
| mtu | 1500 |
| name | external |
| port_security_enabled | True |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| provider:network_type | vlan |
| provider:physical_network | datacentre |
| provider:segmentation_id | 101 |
| qos_policy_id | None |
| revision_number | 6 |
| router:external | External |
| segments | None |
| shared | True |
| status | ACTIVE |
| subnets | |
| tags | |
| updated_at | 2018-12-15T22:26:33Z |
+---------------------------+--------------------------------------+
+-------------------+--------------------------------------+
| Field | Value |
+-------------------+--------------------------------------+
| allocation_pools | 172.16.0.20-172.16.0.49 |
| cidr | 172.16.0.0/24 |
| created_at | 2018-12-15T22:26:37Z |
| description | |
| dns_nameservers | |
| enable_dhcp | False |
| gateway_ip | 172.16.0.1 |
| host_routes | |
| id | a7788289-2d7a-4229-a81d-370c947365e0 |
| ip_version | 4 |
| ipv6_address_mode | None |
| ipv6_ra_mode | None |
| name | external |
| network_id | 90ba298d-238f-4c77-bd5c-c0a6d4e7e8d9 |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| revision_number | 0 |
| segment_id | None |
| service_types | |
| subnetpool_id | None |
| tags | |
| updated_at | 2018-12-15T22:26:37Z |
+-------------------+--------------------------------------+
+-------------------------+--------------------------------------+
| Field | Value |
+-------------------------+--------------------------------------+
| admin_state_up | UP |
| availability_zone_hints | |
| availability_zones | |
| created_at | 2018-12-15T22:26:40Z |
| description | |
| distributed | False |
| external_gateway_info | None |
| flavor_id | None |
| ha | False |
| id | 385804ad-7190-49d7-8e6f-bc9ee0a639c7 |
| name | router0 |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| revision_number | 1 |
| routes | |
| status | ACTIVE |
| tags | |
| updated_at | 2018-12-15T22:26:40Z |
+-------------------------+--------------------------------------+
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
Set gateway for router router0
+---------------------------+--------------------------------------+
| Field | Value |
+---------------------------+--------------------------------------+
| admin_state_up | UP |
| availability_zone_hints | |
| availability_zones | |
| created_at | 2018-12-15T22:26:46Z |
| description | |
| dns_domain | None |
| id | 39172871-1615-4b9e-bad6-9b8338d12899 |
| ipv4_address_scope | None |
| ipv6_address_scope | None |
| is_default | False |
| is_vlan_transparent | None |
| mtu | 1450 |
| name | internal0 |
| port_security_enabled | True |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| provider:network_type | vxlan |
| provider:physical_network | None |
| provider:segmentation_id | 22 |
| qos_policy_id | None |
| revision_number | 3 |
| router:external | Internal |
| segments | None |
| shared | False |
| status | ACTIVE |
| subnets | |
| tags | |
| updated_at | 2018-12-15T22:26:47Z |
+---------------------------+--------------------------------------+
+-------------------+--------------------------------------+
| Field | Value |
+-------------------+--------------------------------------+
| allocation_pools | 172.31.0.2-172.31.0.254 |
| cidr | 172.31.0.0/24 |
| created_at | 2018-12-15T22:26:49Z |
| description | |
| dns_nameservers | 10.16.36.29 |
| enable_dhcp | True |
| gateway_ip | 172.31.0.1 |
| host_routes | |
| id | d0fae940-632d-4373-9863-7a132d005ffe |
| ip_version | 4 |
| ipv6_address_mode | None |
| ipv6_ra_mode | None |
| name | subnet0 |
| network_id | 39172871-1615-4b9e-bad6-9b8338d12899 |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| revision_number | 0 |
| segment_id | None |
| service_types | |
| subnetpool_id | None |
| tags | |
| updated_at | 2018-12-15T22:26:49Z |
+-------------------+--------------------------------------+
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1140 100 1140 0 0 1084 0 0:00:01 0:00:01 --:--:-- 1085
+-------------+-------------------------------------------------+
| Field | Value |
+-------------+-------------------------------------------------+
| fingerprint | b7:fd:f8:00:c1:98:9f:62:d3:62:59:a9:65:37:3c:f6 |
| name | lambda |
| user_id | 85fe0aaf303449b19d8d9a4aae487d07 |
+-------------+-------------------------------------------------+
+----------------------------+--------------------------------------+
| Field | Value |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| disk | 40 |
| id | 96e0a0dd-8b1d-4c1d-9c7b-48b055284211 |
| name | m1.small |
| os-flavor-access:is_public | True |
| properties | |
| ram | 1024 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 2 |
+----------------------------+--------------------------------------+
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 12.6M 100 12.6M 0 0 9775k 0 0:00:01 0:00:01 --:--:-- 9778k
+------------------+------------------------------------------------------------------------------+
| Field | Value |
+------------------+------------------------------------------------------------------------------+
| checksum | f8ab98ff5e73ebab884d80c9dc9c7290 |
| container_format | bare |
| created_at | 2018-12-15T22:27:07Z |
| disk_format | qcow2 |
| file | /v2/images/5c81fc9c-ddd6-4ff5-9578-0e9be63e9467/file |
| id | 5c81fc9c-ddd6-4ff5-9578-0e9be63e9467 |
| min_disk | 0 |
| min_ram | 0 |
| name | cirros035 |
| owner | 4235d708dd2e4514a02e69f0b4f1d7fb |
| properties | direct_url='swift+config://ref1/glance/5c81fc9c-ddd6-4ff5-9578-0e9be63e9467' |
| protected | False |
| schema | /v2/schemas/image |
| size | 13267968 |
| status | active |
| tags | |
| updated_at | 2018-12-15T22:27:09Z |
| virtual_size | None |
| visibility | public |
+------------------+------------------------------------------------------------------------------+
+---------------------+--------------------------------------+
| Field | Value |
+---------------------+--------------------------------------+
| created_at | 2018-12-15T22:27:13Z |
| description | |
| fixed_ip_address | None |
| floating_ip_address | 172.16.0.31 |
| floating_network_id | 90ba298d-238f-4c77-bd5c-c0a6d4e7e8d9 |
| id | 689f6f90-f8b4-4d45-bfa0-8c77bcda2ce4 |
| name | 172.16.0.31 |
| port_id | None |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| qos_policy_id | None |
| revision_number | 0 |
| router_id | None |
| status | DOWN |
| subnet_id | None |
| updated_at | 2018-12-15T22:27:13Z |
+---------------------+--------------------------------------+
+---------------------+--------------------------------------+
| Field | Value |
+---------------------+--------------------------------------+
| created_at | 2018-12-15T22:27:18Z |
| description | |
| fixed_ip_address | None |
| floating_ip_address | 172.16.0.22 |
| floating_network_id | 90ba298d-238f-4c77-bd5c-c0a6d4e7e8d9 |
| id | 30a0f20a-6dd2-4c27-8299-50aa45c333b4 |
| name | 172.16.0.22 |
| port_id | None |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| qos_policy_id | None |
| revision_number | 0 |
| router_id | None |
| status | DOWN |
| subnet_id | None |
| updated_at | 2018-12-15T22:27:18Z |
+---------------------+--------------------------------------+
+---------------------+--------------------------------------+
| Field | Value |
+---------------------+--------------------------------------+
| created_at | 2018-12-15T22:27:21Z |
| description | |
| fixed_ip_address | None |
| floating_ip_address | 172.16.0.21 |
| floating_network_id | 90ba298d-238f-4c77-bd5c-c0a6d4e7e8d9 |
| id | 30c715e7-f0b4-4654-b392-1119c3ff2f37 |
| name | 172.16.0.21 |
| port_id | None |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| qos_policy_id | None |
| revision_number | 0 |
| router_id | None |
| status | DOWN |
| subnet_id | None |
| updated_at | 2018-12-15T22:27:21Z |
+---------------------+--------------------------------------+
+---------------------+--------------------------------------+
| Field | Value |
+---------------------+--------------------------------------+
| created_at | 2018-12-15T22:27:26Z |
| description | |
| fixed_ip_address | None |
| floating_ip_address | 172.16.0.26 |
| floating_network_id | 90ba298d-238f-4c77-bd5c-c0a6d4e7e8d9 |
| id | 10cbcec6-ab3d-4c9e-babb-93e0a8b33f3b |
| name | 172.16.0.26 |
| port_id | None |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| qos_policy_id | None |
| revision_number | 0 |
| router_id | None |
| status | DOWN |
| subnet_id | None |
| updated_at | 2018-12-15T22:27:26Z |
+---------------------+--------------------------------------+
+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
| created_at | 2018-12-15T22:27:28Z |
| description | Web servers |
| id | 5c46d8ce-51f6-4d6f-b57e-b15767234162 |
| name | web |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| revision_number | 2 |
| rules | created_at='2018-12-15T22:27:29Z', direction='egress', ethertype='IPv4', id='5e1e714a-410d-42da-aeb6-bb89664c1741', updated_at='2018-12-15T22:27:29Z' |
| | created_at='2018-12-15T22:27:29Z', direction='egress', ethertype='IPv6', id='a345d22f-8f15-4856-bc2b-4c270fe0d0a7', updated_at='2018-12-15T22:27:29Z' |
| updated_at | 2018-12-15T22:27:29Z |
+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+-------------------+--------------------------------------+
| Field | Value |
+-------------------+--------------------------------------+
| created_at | 2018-12-15T22:27:31Z |
| description | |
| direction | ingress |
| ether_type | IPv4 |
| id | 95669427-d054-47ec-abeb-e9fb378eb424 |
| name | None |
| port_range_max | None |
| port_range_min | None |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| protocol | icmp |
| remote_group_id | None |
| remote_ip_prefix | 0.0.0.0/0 |
| revision_number | 0 |
| security_group_id | 5c46d8ce-51f6-4d6f-b57e-b15767234162 |
| updated_at | 2018-12-15T22:27:31Z |
+-------------------+--------------------------------------+
The --src-ip option is deprecated, please use --remote-ip instead.
+-------------------+--------------------------------------+
| Field | Value |
+-------------------+--------------------------------------+
| created_at | 2018-12-15T22:27:34Z |
| description | |
| direction | ingress |
| ether_type | IPv4 |
| id | 9d219daa-4d2a-4fa0-868c-dc1a46668de2 |
| name | None |
| port_range_max | 22 |
| port_range_min | 22 |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| protocol | tcp |
| remote_group_id | None |
| remote_ip_prefix | 0.0.0.0/0 |
| revision_number | 0 |
| security_group_id | 5c46d8ce-51f6-4d6f-b57e-b15767234162 |
| updated_at | 2018-12-15T22:27:34Z |
+-------------------+--------------------------------------+
+-------------------------------------+--------------------------------------------------+
| Field | Value |
+-------------------------------------+--------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-SRV-ATTR:host | None |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None |
| OS-EXT-SRV-ATTR:instance_name | |
| OS-EXT-STS:power_state | NOSTATE |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| OS-SRV-USG:launched_at | None |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | |
| adminPass | aBzcqd7Nzhxc |
| config_drive | |
| created | 2018-12-15T22:27:38Z |
| flavor | m1.small (96e0a0dd-8b1d-4c1d-9c7b-48b055284211) |
| hostId | |
| id | 6f7b6572-3bd8-4d6f-958f-2012e2cdfd84 |
| image | cirros035 (5c81fc9c-ddd6-4ff5-9578-0e9be63e9467) |
| key_name | lambda |
| name | instance0 |
| progress | 0 |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| properties | |
| security_groups | name='5c46d8ce-51f6-4d6f-b57e-b15767234162' |
| status | BUILD |
| updated | 2018-12-15T22:27:38Z |
| user_id | 85fe0aaf303449b19d8d9a4aae487d07 |
| volumes_attached | |
+-------------------------------------+--------------------------------------------------+
+--------------------------------------+-----------+--------+------------------------------------+-----------+----------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+-----------+--------+------------------------------------+-----------+----------+
| 6f7b6572-3bd8-4d6f-958f-2012e2cdfd84 | instance0 | ACTIVE | internal0=172.31.0.10, 172.16.0.26 | cirros035 | m1.small |
+--------------------------------------+-----------+--------+------------------------------------+-----------+----------+
To connect to the test instance use: openstack server ssh instance0 --login cirros
Log in to the first instance 'instance0':
(overcloud) [stack@lab-director ~]$ openstack server ssh instance0 --login cirros
The authenticity of host '172.16.0.26 (172.16.0.26)' can't be established.
RSA key fingerprint is SHA256:H3Fw5Fd2P8MZjapZp2AiDKpy7B0SWRwNtQV/bHaSeIQ.
RSA key fingerprint is MD5:2f:2b:40:e5:b7:5f:da:b1:7a:ad:eb:90:7c:b5:21:c8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '172.16.0.26' (RSA) to the list of known hosts.
cirros@172.16.0.26's password:
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast qlen 1000
link/ether fa:16:3e:15:cd:24 brd ff:ff:ff:ff:ff:ff
inet 172.31.0.10/24 brd 172.31.0.255 scope global eth0
inet6 fe80::f816:3eff:fe15:cd24/64 scope link
valid_lft forever preferred_lft forever
$ uptime
22:37:20 up 1 min, 1 users, load average: 0.00, 0.01, 0.00
Go to https://access.redhat.com/downloads/content/69/ver=/rhel---7/7.6/x86_64/product-software, download the RHEL 7.6 KVM guest qcow2 image, and upload it to Glance:
[stack@lab-director ~]$ cd /var/images/x86_64/
[stack@lab-director x86_64]$ curl -O "https://access.cdn.redhat.com/content/origin/files/sha256/3b/XXXXXXXXXXXXXXXX/rhel-server-7.6-x86_64-kvm.qcow2?user=XXXXXXXX&_auth_=XXXXX_XXXXXXXX"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
(overcloud) [stack@lab-director ~]$ openstack image create rhel76 --file /var/images/x86_64/rhel-server-7.6-x86_64-kvm.qcow2 --disk-format qcow2 --container-format bare --public
+------------------+------------------------------------------------------------------------------+
| Field | Value |
+------------------+------------------------------------------------------------------------------+
| checksum | c51410acbe8bb67a797f520e68274b00 |
| container_format | bare |
| created_at | 2018-12-15T22:54:36Z |
| disk_format | qcow2 |
| file | /v2/images/826e9d7f-cf8c-48e8-8f05-bb821514baef/file |
| id | 826e9d7f-cf8c-48e8-8f05-bb821514baef |
| min_disk | 0 |
| min_ram | 0 |
| name | rhel76 |
| owner | 4235d708dd2e4514a02e69f0b4f1d7fb |
| properties | direct_url='swift+config://ref1/glance/826e9d7f-cf8c-48e8-8f05-bb821514baef' |
| protected | False |
| schema | /v2/schemas/image |
| size | 685295616 |
| status | active |
| tags | |
| updated_at | 2018-12-15T22:54:45Z |
| virtual_size | None |
| visibility | public |
+------------------+------------------------------------------------------------------------------+
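Before booting from the image, it is worth confirming that the checksum Glance recorded matches the local file (Glance reports an MD5 checksum in this release). A minimal sketch; check_image_checksum is an illustrative helper, not an OpenStack command:

```shell
# Compare the local file's MD5 with the checksum Glance recorded.
# check_image_checksum is an illustrative helper name.
check_image_checksum() {
  local file="$1" image="$2"
  local local_sum glance_sum
  local_sum=$(md5sum "$file" | awk '{print $1}')
  glance_sum=$(openstack image show "$image" -f value -c checksum)
  [ "$local_sum" = "$glance_sum" ]
}

# check_image_checksum /var/images/x86_64/rhel-server-7.6-x86_64-kvm.qcow2 rhel76 \
#   && echo "checksum OK"
```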
Create a RHEL 7.6 instance:
(overcloud) [stack@lab-director ~]$ openstack server create --flavor m1.small --image rhel76 --security-group web --nic net-id=internal0 --key-name lambda instance1
+-------------------------------------+-------------------------------------------------+
| Field | Value |
+-------------------------------------+-------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-SRV-ATTR:host | None |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None |
| OS-EXT-SRV-ATTR:instance_name | |
| OS-EXT-STS:power_state | NOSTATE |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| OS-SRV-USG:launched_at | None |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | |
| adminPass | rBT8AQSr9EfP |
| config_drive | |
| created | 2018-12-15T22:55:48Z |
| flavor | m1.small (96e0a0dd-8b1d-4c1d-9c7b-48b055284211) |
| hostId | |
| id | 041e4bf8-3679-4919-ad54-144ccfcc7ce9 |
| image | rhel76 (826e9d7f-cf8c-48e8-8f05-bb821514baef) |
| key_name | lambda |
| name | instance1 |
| progress | 0 |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| properties | |
| security_groups | name='5c46d8ce-51f6-4d6f-b57e-b15767234162' |
| status | BUILD |
| updated | 2018-12-15T22:55:48Z |
| user_id | 85fe0aaf303449b19d8d9a4aae487d07 |
| volumes_attached | |
+-------------------------------------+-------------------------------------------------+
Attach a floating IP:
(overcloud) [stack@lab-director ~]$ FLOATING_IP_ID=$( openstack floating ip list -f value -c ID --status 'DOWN' | head -n 1 )
(overcloud) [stack@lab-director ~]$ openstack server add floating ip instance1 $FLOATING_IP_ID
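On a busy cloud the server may still be building when you try to attach the floating IP. A small polling helper avoids racing the scheduler; wait_for_active is an illustrative helper, not part of the OpenStack CLI:

```shell
# Poll a server's status until it reaches ACTIVE (or fails/times out).
wait_for_active() {
  local server="$1" timeout="${2:-120}" elapsed=0 status=""
  while [ "$elapsed" -lt "$timeout" ]; do
    status=$(openstack server show "$server" -f value -c status)
    [ "$status" = "ACTIVE" ] && return 0
    [ "$status" = "ERROR" ] && return 1
    sleep 5; elapsed=$((elapsed + 5))
  done
  return 1
}

# wait_for_active instance1 && openstack server add floating ip instance1 $FLOATING_IP_ID
```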
List OpenStack instances:
(overcloud) [stack@lab-director ~]$ openstack server list
+--------------------------------------+-----------+--------+------------------------------------+-----------+----------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+-----------+--------+------------------------------------+-----------+----------+
| 041e4bf8-3679-4919-ad54-144ccfcc7ce9 | instance1 | ACTIVE | internal0=172.31.0.21, 172.16.0.22 | rhel76 | m1.small |
| 6f7b6572-3bd8-4d6f-958f-2012e2cdfd84 | instance0 | ACTIVE | internal0=172.31.0.10, 172.16.0.26 | cirros035 | m1.small |
+--------------------------------------+-----------+--------+------------------------------------+-----------+----------+
Connect via ssh to the new RHEL instance:
(overcloud) [stack@lab-director ~]$ ssh cloud-user@172.16.0.22
The authenticity of host '172.16.0.22 (172.16.0.22)' can't be established.
ECDSA key fingerprint is SHA256:Qv6oTDYcgNtwnR1MukCzh5eseA/fevWsGhyYRBQkzsQ.
ECDSA key fingerprint is MD5:2d:2b:db:53:a4:f8:ab:ee:d0:29:c3:72:8f:ee:06:31.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '172.16.0.22' (ECDSA) to the list of known hosts.
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
[cloud-user@instance1 ~]$ uptime
18:00:48 up 1 min, 1 user, load average: 0.28, 0.19, 0.08
Register instance1 with Red Hat Subscription Management:
[cloud-user@instance1 ~]$ sudo subscription-manager register --username mycdnaccount
[cloud-user@instance1 ~]$ sudo subscription-manager attach --pool=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[cloud-user@instance1 ~]$ sudo subscription-manager repos --disable=*
[cloud-user@instance1 ~]$ sudo subscription-manager repos --enable=rhel-7-server-rpms
[cloud-user@instance1 ~]$ sudo yum install pciutils -y
Loaded plugins: product-id, search-disabled-repos, subscription-manager
Resolving Dependencies
--> Running transaction check
---> Package pciutils.x86_64 0:3.5.1-3.el7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
===================================================================================================================================================
Package Arch Version Repository Size
===================================================================================================================================================
Installing:
pciutils x86_64 3.5.1-3.el7 rhel-7-server-rpms 93 k
Transaction Summary
===================================================================================================================================================
Install 1 Package
Total download size: 93 k
Installed size: 196 k
Is this ok [y/d/N]: y
Downloading packages:
warning: /var/cache/yum/x86_64/7Server/rhel-7-server-rpms/packages/pciutils-3.5.1-3.el7.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID fd431d51: NOKEY
Public key for pciutils-3.5.1-3.el7.x86_64.rpm is not installed
pciutils-3.5.1-3.el7.x86_64.rpm | 93 kB 00:00:00
Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
Importing GPG key 0xFD431D51:
Userid : "Red Hat, Inc. (release key 2) <security@redhat.com>"
Fingerprint: 567e 347a d004 4ade 55ba 8a5f 199e 2f91 fd43 1d51
Package : redhat-release-server-7.6-4.el7.x86_64 (installed)
From : /etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
Is this ok [y/N]: y
Importing GPG key 0x2FA658E0:
Userid : "Red Hat, Inc. (auxiliary key) <security@redhat.com>"
Fingerprint: 43a6 e49c 4a38 f4be 9abf 2a53 4568 9c88 2fa6 58e0
Package : redhat-release-server-7.6-4.el7.x86_64 (installed)
From : /etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
Is this ok [y/N]: y
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : pciutils-3.5.1-3.el7.x86_64 1/1
rhel-7-server-rpms/7Server/x86_64/productid | 2.1 kB 00:00:00
Verifying : pciutils-3.5.1-3.el7.x86_64 1/1
Installed:
pciutils.x86_64 0:3.5.1-3.el7
Complete!
List PCI devices:
[cloud-user@instance1 ~]$ lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation 440FX - 82441FX PMC [Natoma] [8086:1237] (rev 02)
00:01.0 ISA bridge [0601]: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] [8086:7000]
00:01.1 IDE interface [0101]: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] [8086:7010]
00:01.2 USB controller [0c03]: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] [8086:7020] (rev 01)
00:01.3 Bridge [0680]: Intel Corporation 82371AB/EB/MB PIIX4 ACPI [8086:7113] (rev 03)
00:02.0 VGA compatible controller [0300]: Cirrus Logic GD 5446 [1013:00b8]
00:03.0 Ethernet controller [0200]: Red Hat, Inc. Virtio network device [1af4:1000]
00:04.0 SCSI storage controller [0100]: Red Hat, Inc. Virtio block device [1af4:1001]
00:05.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon [1af4:1002]
List flavors:
(overcloud) [stack@lab-director ~]$ openstack flavor list
+--------------------------------------+----------+------+------+-----------+-------+-----------+
| ID | Name | RAM | Disk | Ephemeral | VCPUs | Is Public |
+--------------------------------------+----------+------+------+-----------+-------+-----------+
| 96e0a0dd-8b1d-4c1d-9c7b-48b055284211 | m1.small | 1024 | 40 | 0 | 2 | True |
+--------------------------------------+----------+------+------+-----------+-------+-----------+
Attach one physical GPU M10 to an instance
Create an M10 flavor:
(overcloud) [stack@lab-director ~]$ openstack flavor create --ram 4096 --disk 10 --vcpus 4 m1.small-gpu-m10
+----------------------------+--------------------------------------+
| Field | Value |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| disk | 10 |
| id | 7fdf2c3f-e7c9-4e25-a3e1-d25ed49646de |
| name | m1.small-gpu-m10 |
| os-flavor-access:is_public | True |
| properties | |
| ram | 4096 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 4 |
+----------------------------+--------------------------------------+
(overcloud) [stack@lab-director ~]$ openstack flavor set m1.small-gpu-m10 --property "pci_passthrough:alias"="gpu_m10:1"
(overcloud) [stack@lab-director ~]$ openstack flavor show m1.small-gpu-m10
+----------------------------+--------------------------------------+
| Field | Value |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| access_project_ids | None |
| disk | 10 |
| id | 7fdf2c3f-e7c9-4e25-a3e1-d25ed49646de |
| name | m1.small-gpu-m10 |
| os-flavor-access:is_public | True |
| properties | pci_passthrough:alias='gpu_m10:1' |
| ram | 4096 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 4 |
+----------------------------+--------------------------------------+
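If you maintain one passthrough flavor per GPU type, the create-and-tag pattern above can be scripted. A sketch, assuming the alias names (gpu_m10, gpu_m60) match the nova PCI alias configuration; flavor_name_for_alias and create_gpu_flavors are illustrative helpers:

```shell
# Derive a flavor name from a nova PCI alias (illustrative naming scheme).
flavor_name_for_alias() {
  echo "m1.small-${1//_/-}"
}

# Create one small passthrough flavor per GPU alias, each requesting
# exactly one pGPU of that type per instance.
create_gpu_flavors() {
  local alias name
  for alias in gpu_m10 gpu_m60; do
    name=$(flavor_name_for_alias "$alias")
    openstack flavor create --ram 4096 --disk 10 --vcpus 4 "$name"
    openstack flavor set "$name" --property "pci_passthrough:alias"="${alias}:1"
  done
}

# create_gpu_flavors   # run from a session with the overcloudrc sourced
```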
Create an instance with one pGPU M10 attached:
(overcloud) [stack@lab-director ~]$ openstack server create --flavor m1.small-gpu-m10 --image rhel76 --security-group web --nic net-id=internal0 --key-name lambda instance-m10-0
+-------------------------------------+---------------------------------------------------------+
| Field | Value |
+-------------------------------------+---------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-SRV-ATTR:host | None |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None |
| OS-EXT-SRV-ATTR:instance_name | |
| OS-EXT-STS:power_state | NOSTATE |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| OS-SRV-USG:launched_at | None |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | |
| adminPass | uZ3QBNmuSR3D |
| config_drive | |
| created | 2018-12-15T23:26:58Z |
| flavor | m1.small-gpu-m10 (7fdf2c3f-e7c9-4e25-a3e1-d25ed49646de) |
| hostId | |
| id | 4e29af65-670b-4251-b04c-ce48c1cb9b0a |
| image | rhel76 (826e9d7f-cf8c-48e8-8f05-bb821514baef) |
| key_name | lambda |
| name | instance-m10-0 |
| progress | 0 |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| properties | |
| security_groups | name='5c46d8ce-51f6-4d6f-b57e-b15767234162' |
| status | BUILD |
| updated | 2018-12-15T23:26:58Z |
| user_id | 85fe0aaf303449b19d8d9a4aae487d07 |
| volumes_attached | |
+-------------------------------------+---------------------------------------------------------+
Associate a floating IP to instance-m10-0:
(overcloud) [stack@lab-director ~]$ FLOATING_IP_ID=$( openstack floating ip list -f value -c ID --status 'DOWN' | head -n 1 )
(overcloud) [stack@lab-director ~]$ echo $FLOATING_IP_ID
10cbcec6-ab3d-4c9e-babb-93e0a8b33f3b
(overcloud) [stack@lab-director ~]$ openstack server add floating ip instance-m10-0 $FLOATING_IP_ID
List instances:
(overcloud) [stack@lab-director ~]$ openstack server list
+--------------------------------------+----------------+--------+------------------------------------+--------+------------------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+----------------+--------+------------------------------------+--------+------------------+
| 4e29af65-670b-4251-b04c-ce48c1cb9b0a | instance-m10-0 | ACTIVE | internal0=172.31.0.5, 172.16.0.26 | rhel76 | m1.small-gpu-m10 |
| 041e4bf8-3679-4919-ad54-144ccfcc7ce9 | instance1 | ACTIVE | internal0=172.31.0.21, 172.16.0.22 | rhel76 | m1.small |
+--------------------------------------+----------------+--------+------------------------------------+--------+------------------+
Connect to instance-m10-0:
(overcloud) [stack@lab-director ~]$ ssh cloud-user@172.16.0.26
The authenticity of host '172.16.0.26 (172.16.0.26)' can't be established.
ECDSA key fingerprint is SHA256:EDyzYr1Ru4HFxVO6upNWTg1yee9xs1tBBUsEdjDsonI.
ECDSA key fingerprint is MD5:16:d7:ee:66:49:d8:f0:4b:13:ac:8e:74:b1:55:17:97.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '172.16.0.26' (ECDSA) to the list of known hosts.
[cloud-user@instance-m10-0 ~]$ uptime
18:29:59 up 2 min, 1 user, load average: 0.13, 0.19, 0.09
List the PCI devices; the NVIDIA device “NVIDIA Corporation GM107GL [Tesla M10]” is now visible:
[cloud-user@instance-m10-0 ~]$ lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation 440FX - 82441FX PMC [Natoma] [8086:1237] (rev 02)
00:01.0 ISA bridge [0601]: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] [8086:7000]
00:01.1 IDE interface [0101]: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] [8086:7010]
00:01.2 USB controller [0c03]: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] [8086:7020] (rev 01)
00:01.3 Bridge [0680]: Intel Corporation 82371AB/EB/MB PIIX4 ACPI [8086:7113] (rev 03)
00:02.0 VGA compatible controller [0300]: Cirrus Logic GD 5446 [1013:00b8]
00:03.0 Ethernet controller [0200]: Red Hat, Inc. Virtio network device [1af4:1000]
00:04.0 SCSI storage controller [0100]: Red Hat, Inc. Virtio block device [1af4:1001]
00:05.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
00:06.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon [1af4:1002]
Check RHEL and kernel version:
[cloud-user@instance-m10-0 ~]$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.6 (Maipo)
[cloud-user@instance-m10-0 ~]$ uname -a
Linux instance-m10-0 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 15 17:36:42 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Apply the latest RHEL updates:
[cloud-user@instance-m10-0 ~]$ sudo yum upgrade -y
Manual guest CUDA driver installation
Install the CUDA drivers manually:
[cloud-user@instance-m10-0 ~]$ sudo yum -y install kernel-devel kernel-headers pciutils
[cloud-user@instance-m10-0 ~]$ sudo yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
[cloud-user@instance-m10-0 ~]$ sudo yum -y install https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-10.0.130-1.x86_64.rpm
[cloud-user@instance-m10-0 ~]$ sudo yum clean all
[cloud-user@instance-m10-0 ~]$ time sudo yum -y install cuda
...
Complete!
real 11m46.938s
user 5m44.575s
sys 1m7.395s
[cloud-user@instance-m10-0 ~]$ sudo reboot
Check the NVIDIA System Management Interface status; one M10 pGPU is present:
[cloud-user@instance-m10-0 ~]$ nvidia-smi -L
GPU 0: Tesla M10 (UUID: GPU-41eaf008-3743-9469-fc3e-a632ac13c957)
[cloud-user@instance-m10-0 ~]$ nvidia-smi
Sat Dec 15 19:08:03 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79 Driver Version: 410.79 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla M10 Off | 00000000:00:05.0 Off | N/A |
| N/A 27C P0 15W / 53W | 0MiB / 8129MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
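Inside the guest, a quick sanity check can assert that the pGPU count matches what the flavor requested. A sketch; check_gpu_count is an illustrative helper around nvidia-smi:

```shell
# Succeed only if nvidia-smi enumerates the expected number of pGPUs.
check_gpu_count() {
  local expected="$1" actual
  actual=$(nvidia-smi -L | wc -l)
  [ "$actual" -eq "$expected" ]
}

# For the m1.small-gpu-m10 flavor we expect exactly one GPU:
# check_gpu_count 1 && nvidia-smi --query-gpu=name,memory.total --format=csv
```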
Attach four M10 and two M60 pGPUs to an instance
Create a large flavor with four M10 and two M60 pGPUs:
(overcloud) [stack@lab-director ~]$ openstack flavor create --ram 40960 --disk 100 --vcpus 40 m2.large-gpu
+----------------------------+--------------------------------------+
| Field | Value |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| disk | 100 |
| id | b021ebcf-b9a1-4074-91ac-b7dfc74d5a6b |
| name | m2.large-gpu |
| os-flavor-access:is_public | True |
| properties | |
| ram | 40960 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 40 |
+----------------------------+--------------------------------------+
(overcloud) [stack@lab-director ~]$ openstack flavor set m2.large-gpu --property "pci_passthrough:alias"="gpu_m10:4, gpu_m60:2"
(overcloud) [stack@lab-director ~]$ openstack flavor show m2.large-gpu
+----------------------------+----------------------------------------------+
| Field | Value |
+----------------------------+----------------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| access_project_ids | None |
| disk | 100 |
| id | b021ebcf-b9a1-4074-91ac-b7dfc74d5a6b |
| name | m2.large-gpu |
| os-flavor-access:is_public | True |
| properties | pci_passthrough:alias='gpu_m10:4, gpu_m60:2' |
| ram | 40960 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 40 |
+----------------------------+----------------------------------------------+
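The alias property takes a comma-separated list of name:count pairs, as in "gpu_m10:4, gpu_m60:2". A tiny helper (illustrative, not part of the CLI) makes composing multi-GPU flavors less error-prone:

```shell
# Join "alias:count" pairs into the string format nova expects,
# e.g. "gpu_m10:4, gpu_m60:2".
build_alias_property() {
  local out="" pair
  for pair in "$@"; do
    out="${out:+${out}, }${pair}"
  done
  echo "$out"
}

# openstack flavor set m2.large-gpu \
#   --property "pci_passthrough:alias"="$(build_alias_property gpu_m10:4 gpu_m60:2)"
```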
From the undercloud, connect to the overcloud controller node:
(overcloud) [stack@lab-director ~]$ source stackrc
(undercloud) [stack@lab-director ~]$ openstack server list
+--------------------------------------+------------------------+--------+------------------------+----------------+---------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+------------------------+--------+------------------------+----------------+---------+
| c02dbff9-b5c1-4cd2-923e-c6797c85f8bb | overcloud-controller-0 | ACTIVE | ctlplane=172.16.16.203 | overcloud-full | control |
| 8df71b94-b236-475e-b3fe-f197fc483ac3 | overcloud-compute-0 | ACTIVE | ctlplane=172.16.16.212 | overcloud-full | compute |
+--------------------------------------+------------------------+--------+------------------------+----------------+---------+
(undercloud) [stack@lab-director ~]$ ssh heat-admin@172.16.16.203
The authenticity of host '172.16.16.203 (172.16.16.203)' can't be established.
ECDSA key fingerprint is SHA256:iy/GIib0NpY0qbiLdzwQ1BUiZkk+d0Qqv9Di7vjHK50.
ECDSA key fingerprint is MD5:6a:f6:39:4f:b9:dd:2f:83:83:9e:6e:68:ba:42:4e:b5.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '172.16.16.203' (ECDSA) to the list of known hosts.
Automatic guest CUDA driver preparation in a glance image
Copy the default RHEL 7.6 qcow2 image to a new GPU-specific one:
(overcloud) [stack@lab-director x86_64]$ cp rhel-server-7.6-x86_64-kvm.qcow2 rhel-server-7.6-x86_64-kvm-GPU.qcow2
Enable the RHN repositories in the qcow2 image with libguestfs:
[root@lab607 egallen]# virt-customize -m 4096 -a rhel-server-7.6-x86_64-kvm-GPU.qcow2 --run-command 'subscription-manager register --username=mycdnaccount --password=XXXXXX --force' --run-command 'subscription-manager attach --pool=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' --run-command 'subscription-manager repos --disable=*' --run-command 'subscription-manager repos --enable=rhel-7-server-rpms' --run-command 'yum update -y -v --exclude=selinux*' --update --selinux-relabel
[ 0.0] Examining the guest ...
[ 2.7] Setting a random seed
[ 2.7] Setting the machine ID in /etc/machine-id
[ 2.7] Running: subscription-manager register --username=mycdnaccount --password=XXXXXX --force
[ 17.5] Running: subscription-manager attach --pool=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[ 56.2] Running: subscription-manager repos --disable=*
[ 88.1] Running: subscription-manager repos --enable=rhel-7-server-rpms
[ 122.9] Running: yum update -y -v --exclude=selinux*
[ 343.6] Updating packages
[ 364.6] SELinux relabelling
[ 374.2] Finishing off
Enable the CUDA repositories in the qcow2 image with libguestfs:
[egallen@lab607 ~]$ time virt-customize -m 4096 -a rhel-server-7.6-x86_64-kvm-GPU.qcow2 --run-command 'yum -y install kernel-devel kernel-headers pciutils tmux' --run-command 'yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm' --run-command 'yum -y install https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-10.0.130-1.x86_64.rpm' --run-command 'yum clean all' --selinux-relabel
[ 0.0] Examining the guest ...
[ 2.9] Setting a random seed
[ 2.9] Running: yum -y install kernel-devel kernel-headers pciutils tmux
[ 94.7] Running: yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
[ 103.7] Running: yum -y install https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-10.0.130-1.x86_64.rpm
[ 146.9] Running: yum clean all
[ 153.0] SELinux relabelling
[ 167.4] Finishing off
real 2m47.604s
user 0m0.079s
sys 0m0.056s
Install the CUDA drivers in the qcow2 image with libguestfs:
[egallen@lab607 ~]$ time virt-customize -m 4096 -a rhel-server-7.6-x86_64-kvm-GPU.qcow2 --run-command 'yum -y install cuda' --run-command 'dracut --force' --selinux-relabel
[ 0.0] Examining the guest ...
[ 2.5] Setting a random seed
[ 2.5] Running: yum -y install cuda
[ 255.7] Running: dracut --force
[ 279.3] SELinux relabelling
[ 302.7] Finishing off
real 5m3.606s
user 0m0.082s
sys 0m0.051s
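Before uploading the customized image, the CUDA install can be verified offline with the libguestfs tools, without booting the guest. A sketch; cuda_in_image is an illustrative helper, and the check assumes the cuda rpm's default /usr/local prefix:

```shell
# Succeed if a cuda* directory exists under /usr/local inside the image.
cuda_in_image() {
  virt-ls -a "$1" /usr/local/ | grep -q '^cuda'
}

# cuda_in_image rhel-server-7.6-x86_64-kvm-GPU.qcow2 && echo "CUDA present"
```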
Attach two M10 and two M60 pGPUs to an instance
Create an instance with the double flavor m1.small-gpu-double-m10-m60:
(overcloud) [stack@lab-director ~]$ openstack server create --flavor m1.small-gpu-double-m10-m60 --image rhel76 --security-group web --nic net-id=internal0 --key-name lambda instance-double-0
+-------------------------------------+--------------------------------------------------------------------+
| Field | Value |
+-------------------------------------+--------------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-SRV-ATTR:host | None |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None |
| OS-EXT-SRV-ATTR:instance_name | |
| OS-EXT-STS:power_state | NOSTATE |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| OS-SRV-USG:launched_at | None |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | |
| adminPass | TuVKspjam4Z5 |
| config_drive | |
| created | 2018-12-16T01:05:55Z |
| flavor | m1.small-gpu-double-m10-m60 (d7a0e791-8940-4045-b57c-9a8c9ecad226) |
| hostId | |
| id | 10ec3a1e-39e4-4673-b8fb-1b17b8af9378 |
| image | rhel76 (826e9d7f-cf8c-48e8-8f05-bb821514baef) |
| key_name | lambda |
| name | instance-double-0 |
| progress | 0 |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| properties | |
| security_groups | name='5c46d8ce-51f6-4d6f-b57e-b15767234162' |
| status | BUILD |
| updated | 2018-12-16T01:05:55Z |
| user_id | 85fe0aaf303449b19d8d9a4aae487d07 |
| volumes_attached | |
+-------------------------------------+--------------------------------------------------------------------+
(overcloud) [stack@lab-director ~]$ FLOATING_IP_ID=$( openstack floating ip list -f value -c ID --status 'DOWN' | head -n 1 )
(overcloud) [stack@lab-director ~]$ openstack server add floating ip instance-double-0 $FLOATING_IP_ID
(overcloud) [stack@lab-director ~]$ openstack server list
+--------------------------------------+-------------------+--------+------------------------------------+--------+-----------------------------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+-------------------+--------+------------------------------------+--------+-----------------------------+
| 10ec3a1e-39e4-4673-b8fb-1b17b8af9378 | instance-double-0 | ACTIVE | internal0=172.31.0.14, 172.16.0.22 | rhel76 | m1.small-gpu-double-m10-m60 |
+--------------------------------------+-------------------+--------+------------------------------------+--------+-----------------------------+
Check PCI devices available on the instance:
[root@instance-double-0 cloud-user]# lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation 440FX - 82441FX PMC [Natoma] [8086:1237] (rev 02)
00:01.0 ISA bridge [0601]: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] [8086:7000]
00:01.1 IDE interface [0101]: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] [8086:7010]
00:01.2 USB controller [0c03]: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] [8086:7020] (rev 01)
00:01.3 Bridge [0680]: Intel Corporation 82371AB/EB/MB PIIX4 ACPI [8086:7113] (rev 03)
00:02.0 VGA compatible controller [0300]: Cirrus Logic GD 5446 [1013:00b8]
00:03.0 Ethernet controller [0200]: Red Hat, Inc. Virtio network device [1af4:1000]
00:04.0 SCSI storage controller [0100]: Red Hat, Inc. Virtio block device [1af4:1001]
00:05.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
00:06.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
00:07.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
00:08.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
00:09.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon [1af4:1002]
Check NVIDIA System Management Interface status:
[cloud-user@instance-double-0 ~]$ nvidia-smi
Sun Dec 16 03:43:27 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79 Driver Version: 410.79 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla M10 Off | 00000000:00:05.0 Off | N/A |
| N/A 27C P0 15W / 53W | 0MiB / 8129MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla M10 Off | 00000000:00:06.0 Off | N/A |
| N/A 27C P0 15W / 53W | 0MiB / 8129MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla M60 Off | 00000000:00:07.0 Off | Off |
| N/A 31C P0 37W / 150W | 0MiB / 8129MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla M60 Off | 00000000:00:08.0 Off | Off |
| N/A 30C P0 38W / 150W | 0MiB / 8129MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Create a Glance image with the CUDA drivers preinstalled:
(overcloud) [stack@lab-director ~]$ openstack image create rhel76-gpu --file /var/images/x86_64/rhel-server-7.6-x86_64-kvm-GPU.qcow2 --disk-format qcow2 --container-format bare --public
+------------------+------------------------------------------------------------------------------+
| Field | Value |
+------------------+------------------------------------------------------------------------------+
| checksum | 503060eb06291afba1a2f3012508a0db |
| container_format | bare |
| created_at | 2018-12-16T19:07:30Z |
| disk_format | qcow2 |
| file | /v2/images/f39214a6-9e75-4128-b161-428d36815336/file |
| id | f39214a6-9e75-4128-b161-428d36815336 |
| min_disk | 0 |
| min_ram | 0 |
| name | rhel76-gpu |
| owner | 4235d708dd2e4514a02e69f0b4f1d7fb |
| properties | direct_url='swift+config://ref1/glance/f39214a6-9e75-4128-b161-428d36815336' |
| protected | False |
| schema | /v2/schemas/image |
| size | 7420510208 |
| status | active |
| tags | |
| updated_at | 2018-12-16T19:08:59Z |
| virtual_size | None |
| visibility | public |
+------------------+------------------------------------------------------------------------------+
Extend quotas:
(overcloud) [stack@lab-director ~]$ openstack quota set --cores 400 admin --ram 819200
Create an instance instancelarge0:
(overcloud) [stack@lab-director ~]$ openstack server create --flavor m2.large-gpu --image rhel76-gpu --security-group web --nic net-id=internal0 --key-name lambda instancelarge0
+-------------------------------------+-----------------------------------------------------+
| Field | Value |
+-------------------------------------+-----------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-SRV-ATTR:host | None |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None |
| OS-EXT-SRV-ATTR:instance_name | |
| OS-EXT-STS:power_state | NOSTATE |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| OS-SRV-USG:launched_at | None |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | |
| adminPass | bXihjR9ZV5WC |
| config_drive | |
| created | 2018-12-16T19:09:52Z |
| flavor | m2.large-gpu (b021ebcf-b9a1-4074-91ac-b7dfc74d5a6b) |
| hostId | |
| id | dc5922ee-0cdd-4429-84d1-4978f9367e4d |
| image | rhel76-gpu (f39214a6-9e75-4128-b161-428d36815336) |
| key_name | lambda |
| name | instancelarge0 |
| progress | 0 |
| project_id | 4235d708dd2e4514a02e69f0b4f1d7fb |
| properties | |
| security_groups | name='5c46d8ce-51f6-4d6f-b57e-b15767234162' |
| status | BUILD |
| updated | 2018-12-16T19:09:52Z |
| user_id | 85fe0aaf303449b19d8d9a4aae487d07 |
| volumes_attached | |
+-------------------------------------+-----------------------------------------------------+
Attach a floating IP:
(overcloud) [stack@lab-director ~]$ FLOATING_IP_ID=$( openstack floating ip list -f value -c ID --status 'DOWN' | head -n 1 ) ; openstack server add floating ip instancelarge0 $FLOATING_IP_ID ; openstack server list
+--------------------------------------+----------------+--------+-----------------------------------+------------+--------------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+----------------+--------+-----------------------------------+------------+--------------+
| dc5922ee-0cdd-4429-84d1-4978f9367e4d | instancelarge0 | ACTIVE | internal0=172.31.0.3, 172.16.0.26 | rhel76-gpu | m2.large-gpu |
+--------------------------------------+----------------+--------+-----------------------------------+------------+--------------+
Log in to the instance via SSH:
(overcloud) [stack@lab-director ~]$ ssh cloud-user@172.16.0.26
The authenticity of host '172.16.0.26 (172.16.0.26)' can't be established.
ECDSA key fingerprint is SHA256:vOShSeykwX+cVvBdFO0mTIr912vbHD173eLlmXKVJ2c.
ECDSA key fingerprint is MD5:1f:6d:ca:6b:98:26:60:d2:26:83:2f:88:02:de:b1:55.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '172.16.0.26' (ECDSA) to the list of known hosts.
[cloud-user@instancelarge0 ~]$ uptime
13:21:22 up 0 min, 1 user, load average: 1.75, 0.42, 0.14
[cloud-user@instancelarge0 ~]$ lspci -nn | grep -i nvidia
00:05.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
00:06.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
00:07.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
00:08.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
00:09.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
00:0a.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
[cloud-user@instancelarge0 ~]$ nvidia-smi
Sun Dec 16 14:20:59 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79 Driver Version: 410.79 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla M10 Off | 00000000:00:05.0 Off | N/A |
| N/A 28C P0 15W / 53W | 0MiB / 8129MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla M10 Off | 00000000:00:06.0 Off | N/A |
| N/A 27C P0 15W / 53W | 0MiB / 8129MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla M10 Off | 00000000:00:07.0 Off | N/A |
| N/A 27C P0 15W / 53W | 0MiB / 8129MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla M10 Off | 00000000:00:08.0 Off | N/A |
| N/A 28C P0 15W / 53W | 0MiB / 8129MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 4 Tesla M60 Off | 00000000:00:09.0 Off | Off |
| N/A 31C P0 39W / 150W | 0MiB / 8129MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 5 Tesla M60 Off | 00000000:00:0A.0 Off | Off |
| N/A 31C P0 40W / 150W | 0MiB / 8129MiB | 3% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
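The `lspci -nn` output above can also be checked programmatically, which is handy when validating many instances. A minimal sketch (hypothetical helper, not part of the deployment) that counts the passed-through NVIDIA devices per PCI device ID, using the lines captured from the instance:

```python
import re
from collections import Counter

def count_nvidia_gpus(lspci_output):
    """Count NVIDIA devices per PCI device ID in `lspci -nn` output.

    NVIDIA's PCI vendor ID is 10de; the device ID distinguishes
    the Tesla M10 (13bd) from the Tesla M60 (13f2).
    """
    ids = re.findall(r"NVIDIA Corporation .*?\[10de:([0-9a-f]{4})\]", lspci_output)
    return Counter(ids)

# Sample lines captured from instancelarge0 above
sample = """\
00:05.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
00:06.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
00:07.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
00:08.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Tesla M10] [10de:13bd] (rev a2)
00:09.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
00:0a.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
"""
print(count_nvidia_gpus(sample))  # Counter({'13bd': 4, '13f2': 2})
```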
GPU burn testing in an instance
Install gpu-burn to test the pGPU:
[cloud-user@instancelarge0 ~]$ git clone https://github.com/wilicc/gpu-burn.git
Cloning into 'gpu-burn'...
remote: Enumerating objects: 3, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 41 (delta 0), reused 1 (delta 0), pack-reused 38
Unpacking objects: 100% (41/41), done.
[cloud-user@instancelarge0 ~]$ export PATH=$PATH:/usr/local/cuda/bin
[cloud-user@instancelarge0 ~]$ cd gpu-burn
[cloud-user@instancelarge0 gpu-burn]$ make
By default gpu-burn runs for 10 seconds; a longer duration in seconds can be passed as an argument (for example, ./gpu_burn 120).
Test the pGPU on instance-m10-0; the GPU is detected as valid:
[cloud-user@instance-m10-0 gpu-burn]$ ./gpu_burn
Run length not specified in the command line. Burning for 10 secs
GPU 0: Tesla M10 (UUID: GPU-41eaf008-3743-9469-fc3e-a632ac13c957)
Initialized device 0 with 8129 MB of memory (8060 MB available, using 7254 MB of it), using FLOATS
80.0% proc'd: 451 (1033 Gflop/s) errors: 0 temps: 36 C
Summary at: Sat Dec 15 19:10:09 EST 2018
100.0% proc'd: 451 (1033 Gflop/s) errors: 0 temps: 36 C
Summary at: Sat Dec 15 19:10:11 EST 2018
100.0% proc'd: 902 (1244 Gflop/s) errors: 0 temps: 37 C
Killing processes.. done
Tested 1 GPUs:
GPU 0: OK
Test the pGPUs on instancelarge0; all six GPUs are detected as valid:
[cloud-user@instancelarge0 gpu-burn]$ ./gpu_burn
Run length not specified in the command line. Burning for 10 secs
GPU 0: Tesla M60 (UUID: GPU-728c39f6-315e-98de-1a77-75fb72b764a0)
GPU 1: Tesla M60 (UUID: GPU-ab79a696-a4a9-4c7c-95b7-56560e431b19)
GPU 2: Tesla M10 (UUID: GPU-ff674f92-c6f1-a0a3-cca4-7f213aa5ccdd)
GPU 3: Tesla M10 (UUID: GPU-41eaf008-3743-9469-fc3e-a632ac13c957)
GPU 4: Tesla M10 (UUID: GPU-3932760c-5f3a-85f8-9392-dbaab12ec4b7)
GPU 5: Tesla M10 (UUID: GPU-9e46a4a6-de20-28de-fb16-7efe49a21f04)
Initialized device 0 with 8129 MB of memory (8015 MB available, using 7213 MB of it), using FLOATS
Initialized device 3 with 8129 MB of memory (8060 MB available, using 7254 MB of it), using FLOATS
Initialized device 4 with 8129 MB of memory (8060 MB available, using 7254 MB of it), using FLOATS
Initialized device 2 with 8129 MB of memory (8060 MB available, using 7254 MB of it), using FLOATS
Initialized device 5 with 8129 MB of memory (8060 MB available, using 7254 MB of it), using FLOATS
Initialized device 1 with 8129 MB of memory (8015 MB available, using 7213 MB of it), using FLOATS
…
100.0% proc'd: 2240 (3818 Gflop/s) - 1792 (3827 Gflop/s) - 451 (942 Gflop/s) - 451 (938 Gflop/s) - 451 (928 Gflop/s) - 451 (931 Gflop/s) errors: 0 - 0 - 0 - 0 - 0 - 0 temps: 47 C - 43 C - 46 C - 43 C - 38 C - 41 C
Killing processes.. done
Tested 6 GPUs:
GPU 0: OK
GPU 1: OK
GPU 2: OK
GPU 3: OK
GPU 4: OK
GPU 5: OK
TensorFlow benchmarking in an instance with NVIDIA CUDA Deep Neural Network library (cuDNN)
Install the Python libraries:
[cloud-user@instance-double-0 ~]$ sudo yum install python-pip -y
[cloud-user@instance-double-0 ~]$ sudo pip install absl-py
[cloud-user@instance-double-0 ~]$ sudo pip install tf-nightly-gpu
[cloud-user@instance-double-0 ~]$ tar xvzf cudnn-10.0-linux-x64-v7.4.1.5.tgz
[cloud-user@instance-double-0 ~]$ sudo cp ~/cuda/lib64/libcudnn* /usr/local/cuda/lib64
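The copy into /usr/local/cuda requires root, and NVIDIA's cuDNN installation guide also recommends making the libraries world-readable and refreshing the dynamic linker cache afterwards. A sketch of those follow-up steps (assuming the tarball layout above):

```shell
# Make the cuDNN libraries readable by all users
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
# Rebuild the linker cache so TensorFlow can locate libcudnn at runtime
sudo ldconfig /usr/local/cuda/lib64
```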
[cloud-user@instance-double-0 ~]$ git clone https://github.com/tensorflow/benchmarks.git
Cloning into 'benchmarks'...
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 3320 (delta 1), reused 0 (delta 0), pack-reused 3315
Receiving objects: 100% (3320/3320), 1.81 MiB | 0 bytes/s, done.
Resolving deltas: 100% (2322/2322), done.
[cloud-user@instance-double-0 ~]$ cd benchmarks/tf_cnn_benchmarks
[cloud-user@instancelarge0 tf_cnn_benchmarks]$ export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH
[cloud-user@instancelarge0 tf_cnn_benchmarks]$ export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
[cloud-user@instancelarge0 tf_cnn_benchmarks]$ export TF_MIN_GPU_MULTIPROCESSOR_COUNT=4
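TensorFlow skips GPUs with fewer multiprocessors than TF_MIN_GPU_MULTIPROCESSOR_COUNT (8 by default at the time of writing), and the GM107-based Tesla M10 GPUs have only 5 SMs each, so lowering the threshold to 4 keeps them visible. To avoid re-exporting the variables in every session, they can be persisted to the shell profile; a sketch, assuming the paths used above:

```shell
# Persist the CUDA library path and the GPU multiprocessor threshold
cat >> ~/.bashrc <<'EOF'
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH
export TF_MIN_GPU_MULTIPROCESSOR_COUNT=4
EOF
```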
Benchmark TensorFlow with six GPUs and a batch size of 32:
[root@instancelarge0 tf_cnn_benchmarks]# python tf_cnn_benchmarks.py --num_gpus=6 --batch_size=32 --model=resnet50 --variable_update=parameter_server
2018-12-16 15:24:56.958577: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-12-16 15:24:59.651280: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-16 15:24:59.658925: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-16 15:24:59.687935: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-16 15:24:59.716490: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-16 15:24:59.743680: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-16 15:24:59.762933: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-16 15:24:59.765277: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x4fd9f00 executing computations on platform CUDA. Devices:
2018-12-16 15:24:59.765333: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): Tesla M60, Compute Capability 5.2
2018-12-16 15:24:59.765350: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (1): Tesla M60, Compute Capability 5.2
2018-12-16 15:24:59.765366: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (2): Tesla M10, Compute Capability 5.0
2018-12-16 15:24:59.765381: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (3): Tesla M10, Compute Capability 5.0
2018-12-16 15:24:59.765396: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (4): Tesla M10, Compute Capability 5.0
2018-12-16 15:24:59.765425: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (5): Tesla M10, Compute Capability 5.0
2018-12-16 15:24:59.771973: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2199995000 Hz
2018-12-16 15:24:59.778213: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5130c40 executing computations on platform Host. Devices:
2018-12-16 15:24:59.778244: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
2018-12-16 15:24:59.779844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:00:05.0
totalMemory: 7.94GiB freeMemory: 7.86GiB
2018-12-16 15:24:59.780603: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties:
name: Tesla M60 major: 5 minor: 2 memoryClockRate(GHz): 1.1775
pciBusID: 0000:00:06.0
totalMemory: 7.94GiB freeMemory: 7.86GiB
2018-12-16 15:24:59.781355: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 2 with properties:
name: Tesla M10 major: 5 minor: 0 memoryClockRate(GHz): 1.0325
pciBusID: 0000:00:07.0
totalMemory: 7.94GiB freeMemory: 7.90GiB
2018-12-16 15:24:59.782098: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 3 with properties:
name: Tesla M10 major: 5 minor: 0 memoryClockRate(GHz): 1.0325
pciBusID: 0000:00:08.0
totalMemory: 7.94GiB freeMemory: 7.90GiB
2018-12-16 15:24:59.782835: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 4 with properties:
name: Tesla M10 major: 5 minor: 0 memoryClockRate(GHz): 1.0325
pciBusID: 0000:00:09.0
totalMemory: 7.94GiB freeMemory: 7.90GiB
2018-12-16 15:24:59.783581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 5 with properties:
name: Tesla M10 major: 5 minor: 0 memoryClockRate(GHz): 1.0325
pciBusID: 0000:00:0a.0
totalMemory: 7.94GiB freeMemory: 7.90GiB
2018-12-16 15:24:59.783895: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1, 2, 3, 4, 5
2018-12-16 15:24:59.794990: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-16 15:24:59.795015: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1 2 3 4 5
2018-12-16 15:24:59.795030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N N N N N N
2018-12-16 15:24:59.795042: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: N N N N N N
2018-12-16 15:24:59.795054: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 2: N N N N N N
2018-12-16 15:24:59.795065: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 3: N N N N N N
2018-12-16 15:24:59.795077: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 4: N N N N N N
2018-12-16 15:24:59.795089: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 5: N N N N N N
2018-12-16 15:24:59.799464: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7643 MB memory) -> physical GPU (device: 0, name: Tesla M60, pci bus id: 0000:00:05.0, compute capability: 5.2)
2018-12-16 15:24:59.800034: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 7643 MB memory) -> physical GPU (device: 1, name: Tesla M60, pci bus id: 0000:00:06.0, compute capability: 5.2)
2018-12-16 15:24:59.801199: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 7686 MB memory) -> physical GPU (device: 2, name: Tesla M10, pci bus id: 0000:00:07.0, compute capability: 5.0)
2018-12-16 15:24:59.801746: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 7686 MB memory) -> physical GPU (device: 3, name: Tesla M10, pci bus id: 0000:00:08.0, compute capability: 5.0)
2018-12-16 15:24:59.803043: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:4 with 7686 MB memory) -> physical GPU (device: 4, name: Tesla M10, pci bus id: 0000:00:09.0, compute capability: 5.0)
2018-12-16 15:24:59.803559: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:5 with 7686 MB memory) -> physical GPU (device: 5, name: Tesla M10, pci bus id: 0000:00:0a.0, compute capability: 5.0)
TensorFlow: 1.13
Model: resnet50
Dataset: imagenet (synthetic)
Mode: training
SingleSess: False
Batch size: 192 global
32 per device
Num batches: 100
Num epochs: 0.01
Devices: ['/gpu:0', '/gpu:1', '/gpu:2', '/gpu:3', '/gpu:4', '/gpu:5']
NUMA bind: False
Data format: NCHW
Optimizer: sgd
Variables: parameter_server
==========
Generating training model
W1216 15:24:59.819736 139685064017728 deprecation.py:317] From /usr/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
W1216 15:24:59.836664 139685064017728 deprecation.py:317] From /home/cloud-user/benchmarks/scripts/tf_cnn_benchmarks/convnet_builder.py:129: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv2d instead.
W1216 15:24:59.878568 139685064017728 deprecation.py:317] From /home/cloud-user/benchmarks/scripts/tf_cnn_benchmarks/convnet_builder.py:261: max_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.max_pooling2d instead.
W1216 15:25:02.439440 139685064017728 deprecation.py:317] From /usr/lib/python2.7/site-packages/tensorflow/python/ops/losses/losses_impl.py:209: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
W1216 15:25:02.514906 139685064017728 deprecation.py:317] From /usr/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Initializing graph
W1216 15:25:13.340526 139685064017728 deprecation.py:317] From /home/cloud-user/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py:2252: __init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
2018-12-16 15:25:14.945666: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1, 2, 3, 4, 5
2018-12-16 15:25:14.946143: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-16 15:25:14.946169: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1 2 3 4 5
2018-12-16 15:25:14.946183: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N N N N N N
2018-12-16 15:25:14.946197: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: N N N N N N
2018-12-16 15:25:14.946209: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 2: N N N N N N
2018-12-16 15:25:14.946222: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 3: N N N N N N
2018-12-16 15:25:14.946234: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 4: N N N N N N
2018-12-16 15:25:14.946247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 5: N N N N N N
2018-12-16 15:25:14.950722: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7643 MB memory) -> physical GPU (device: 0, name: Tesla M60, pci bus id: 0000:00:05.0, compute capability: 5.2)
2018-12-16 15:25:14.951848: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 7643 MB memory) -> physical GPU (device: 1, name: Tesla M60, pci bus id: 0000:00:06.0, compute capability: 5.2)
2018-12-16 15:25:14.952305: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 7686 MB memory) -> physical GPU (device: 2, name: Tesla M10, pci bus id: 0000:00:07.0, compute capability: 5.0)
2018-12-16 15:25:14.953354: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 7686 MB memory) -> physical GPU (device: 3, name: Tesla M10, pci bus id: 0000:00:08.0, compute capability: 5.0)
2018-12-16 15:25:14.953680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:4 with 7686 MB memory) -> physical GPU (device: 4, name: Tesla M10, pci bus id: 0000:00:09.0, compute capability: 5.0)
2018-12-16 15:25:14.954606: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:5 with 7686 MB memory) -> physical GPU (device: 5, name: Tesla M10, pci bus id: 0000:00:0a.0, compute capability: 5.0)
I1216 15:25:23.474859 139685064017728 session_manager.py:491] Running local_init_op.
I1216 15:25:23.728400 139685064017728 session_manager.py:493] Done running local_init_op.
Running warm up
2018-12-16 15:25:29.187238: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
Done warm up
Step Img/sec total_loss
1 images/sec: 179.3 +/- 0.0 (jitter = 0.0) 7.903
10 images/sec: 174.8 +/- 1.9 (jitter = 2.7) 7.890
20 images/sec: 176.3 +/- 1.1 (jitter = 1.9) 7.765
30 images/sec: 175.9 +/- 1.1 (jitter = 2.2) 7.826
40 images/sec: 176.2 +/- 0.9 (jitter = 2.2) 7.983
50 images/sec: 176.7 +/- 0.8 (jitter = 2.7) 7.854
60 images/sec: 176.3 +/- 0.7 (jitter = 2.8) 7.986
70 images/sec: 176.2 +/- 0.6 (jitter = 3.0) 7.889
80 images/sec: 176.2 +/- 0.6 (jitter = 3.5) 7.921
90 images/sec: 175.9 +/- 0.6 (jitter = 4.0) 7.892
100 images/sec: 176.0 +/- 0.6 (jitter = 4.1) 7.901
----------------------------------------------------------------
total images/sec: 176.01
----------------------------------------------------------------
Test with one GPU and a batch size of 32:
[cloud-user@instance-double-0 ~]$ python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=32 --model=resnet50 --variable_update=parameter_server
…
Done warm up
Step Img/sec total_loss
1 images/sec: 80.4 +/- 0.0 (jitter = 0.0) 8.169
10 images/sec: 80.7 +/- 0.1 (jitter = 0.1) 7.593
20 images/sec: 80.6 +/- 0.0 (jitter = 0.2) 7.696
30 images/sec: 80.6 +/- 0.0 (jitter = 0.2) 7.753
40 images/sec: 80.6 +/- 0.0 (jitter = 0.2) 8.007
50 images/sec: 80.5 +/- 0.0 (jitter = 0.2) 7.520
60 images/sec: 80.5 +/- 0.0 (jitter = 0.2) 7.989
70 images/sec: 80.5 +/- 0.0 (jitter = 0.2) 8.027
80 images/sec: 80.5 +/- 0.0 (jitter = 0.2) 7.930
90 images/sec: 80.5 +/- 0.0 (jitter = 0.2) 7.849
100 images/sec: 80.5 +/- 0.0 (jitter = 0.2) 7.793
----------------------------------------------------------------
total images/sec: 80.52
----------------------------------------------------------------
Test with one GPU and a batch size of 16:
[cloud-user@instance-double-0 ~]$ python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=16 --model=resnet50 --variable_update=parameter_server
Done warm up
Step Img/sec total_loss
1 images/sec: 153.0 +/- 0.0 (jitter = 0.0) 8.012
10 images/sec: 157.6 +/- 1.4 (jitter = 5.0) 7.965
20 images/sec: 158.0 +/- 0.9 (jitter = 4.5) 7.920
30 images/sec: 158.4 +/- 0.7 (jitter = 4.5) 7.914
40 images/sec: 159.0 +/- 0.6 (jitter = 4.1) 7.900
50 images/sec: 159.1 +/- 0.5 (jitter = 3.9) 7.862
60 images/sec: 158.4 +/- 0.5 (jitter = 4.3) 7.682
70 images/sec: 158.8 +/- 0.5 (jitter = 4.6) 7.949
80 images/sec: 158.6 +/- 0.5 (jitter = 5.0) 7.755
90 images/sec: 158.8 +/- 0.5 (jitter = 5.4) 7.962
100 images/sec: 158.8 +/- 0.4 (jitter = 5.1) 7.897
----------------------------------------------------------------
total images/sec: 158.83
----------------------------------------------------------------
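From the numbers above, the scaling can be summarized: six mixed M10/M60 GPUs sustain 176.01 images/sec versus 80.52 images/sec on a single GPU at batch size 32. A quick check of the implied speedup (rough, since the single-GPU baseline ran on whichever device TensorFlow selected as GPU:0):

```python
# Throughputs reported by tf_cnn_benchmarks above (images/sec, batch size 32)
six_gpu = 176.01
one_gpu = 80.52

speedup = six_gpu / one_gpu
efficiency = speedup / 6  # naive per-GPU scaling efficiency

print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")
```

Sub-linear scaling is expected here: with parameter_server variable updates each step is gated by the slowest devices, and the M10 GPUs are considerably slower than the M60s, so mixing GPU generations in one instance trades peak throughput for density.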