Enterprise OpenStack: RHEL OSP

Overview

As of today there are over eleven OpenStack services and more are coming. Each service is completely isolated from the others, which allows OpenStack to scale far beyond the reach of current computing platforms. However, because of all these independent services, OpenStack can be very complicated to operationalize in enterprise environments.

Each service has a database or other means of storing stateful data, requires load balancing and can scale at a different pace from the other services. In addition, OpenStack has over 1,000 different configuration parameters, making it impossible to deploy and manage without complete automation through configuration management.

Enterprise OpenStack is about high availability, scalability, automation, configuration management, life-cycle management and of course support. Choosing the right distribution is the most important decision you will ever make in regard to OpenStack. Below are some points to consider:

  • Support - How much experience does the company have with Open Source, Linux, Ceph and KVM?
  • Vision - How complete is the OpenStack story?  Does it extend to applications?
  • Enterprise Features - High availability, scalability and support for diversified hardware?
  • Configuration Management - Can OpenStack be managed centrally and deployment be 100% automated?
  • Enterprise Management - Charge-back, governance policies, single-pane-of-glass, hybrid cloud and support for traditional platforms (VMware, Microsoft Hyper-V, RHEV)?
  • Linux Containers - How will OpenStack provide infrastructure for DevOps and containerized next-gen applications?
  • Lock-in - Are you free to choose your underlying hardware vendor for compute, network and storage?

I know at least one company that can talk about all of these points and that is Red Hat. In this post we will focus on the Red Hat OpenStack Platform (RHEL OSP).

Node Types

RHEL OSP defines four node types: admin, controller, compute and Ceph storage.

Admin Node

The admin node is responsible for management of an OpenStack environment. It provides configuration management through Puppet and automated installation through Foreman. The admin node is constantly evolving and mixes best-of-breed open source technologies together. It allows an OpenStack administrator to deploy OpenStack with 100% automation on bare-metal or virtual infrastructure. Because the admin node maintains deployment configuration centrally, you can not only deploy the initial environment but also grow and scale out the environment in an automated manner. The admin node provides the following additional services for an OpenStack deployment:

  • DHCP
  • DNS
  • PXE
  • TFTP

These services enable automated provisioning and configuration management. The admin node can provision two types of OpenStack nodes today: controller and compute.
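
To sanity-check these services after installation, verify they are running and listening on the admin node. This is a sketch that assumes the installer's default layout (dhcpd for DHCP, named for DNS); service names may vary by version. The ss command checks for DNS (53), DHCP (67) and TFTP (69) UDP listeners.

#systemctl status dhcpd named httpd foreman-proxy
#ss -ulnp | grep -E ':(53|67|69)\b'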

Controller Node

The controller node runs all OpenStack services except Nova (compute). The Red Hat best practice is to run three controller nodes. The admin node and the RHEL OSP installer will configure a Pacemaker cluster that uses corosync for the cluster network and fences failed nodes. All OpenStack services are individually clustered, with HAProxy providing load balancing. The main reason for running three controller nodes, aside from scalability and availability, is fencing: three nodes provide quorum in a Pacemaker cluster, which is why two nodes is not a valid configuration.

The controller node runs the following OpenStack services:

  • Horizon (dashboard)
  • Keystone (identity)
  • Nova scheduler (compute)
  • Neutron (network)
  • Neutron metadata (metadata - cloud-init)
  • Neutron L3 Agent (L3 networking)
  • Cinder (block storage)
  • Swift (object storage)
  • Heat (orchestration)
  • Ceilometer (telemetry)
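
Once a deployment completes, you can verify the Pacemaker cluster and the clustered services from any controller node. A minimal sanity check is shown below; the exact resource names depend on the installer version.

#pcs status
#pcs resource
#systemctl status haproxy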

Compute Node

The compute node runs Nova (compute) and the Neutron Open vSwitch agent. Compute nodes do not provide any high availability; if a compute node goes down, all instances hosted on that node go down with it. The point of OpenStack, though, is horizontal scaling: if you need more compute resources, you simply add more compute nodes. Applications must be resilient and able to handle instances going down. This requirement is often misunderstood when discussing cloud computing.
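
After deployment you can confirm compute nodes have registered with Nova from a controller node. The keystonerc_admin path below is an assumption; source whichever file holds your admin credentials.

#source /root/keystonerc_admin
#nova service-list
#nova hypervisor-list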

Ceph Storage Node

The Ceph storage node runs Ceph services such as the Object Storage Daemon (OSD) that provides storage to a Ceph cluster. These nodes are not at this time provisioned by the admin node and must be configured separately. The admin node will let the administrator configure Ceph as a storage back-end for Glance or Cinder, but the Ceph cluster must already be available. Ceph does, however, require monitor (MON) services; it is ideal to run these on the controller nodes, but other than the Ceph monitors nothing else Ceph-related should run on OpenStack controller nodes.

Ceph is the de facto storage for OpenStack because it meets the requirements of OpenStack storage very well. Ceph can scale far beyond traditional storage systems and, like OpenStack, it abstracts hardware, leaving you free to choose your hardware vendors.
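
If you wire an existing Ceph cluster into Cinder, first verify the cluster is healthy and then point the RBD driver at it. Below is a minimal cinder.conf sketch assuming a pool named volumes, a cephx user named cinder and a libvirt secret you have already created; your values will differ.

#ceph -s

volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_pool=volumes
rbd_ceph_conf=/etc/ceph/ceph.conf
rbd_user=cinder
rbd_secret_uuid=<libvirt secret uuid>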

Installing Admin Node

The first step to deploying enterprise OpenStack is to install and configure the admin node. The admin node must at a minimum be connected to the provisioning network. If the admin node goes down you will be unable to provision. In addition, if the admin node is providing DNS or DHCP to the OpenStack environment, those services will be offline as well. Infrastructure decisions regarding the admin node therefore require thoughtful planning.

Prepare

Install RHEL 7.1
#subscription-manager register
#subscription-manager list --available
#subscription-manager attach --pool=<pool id>
#systemctl disable NetworkManager.service
#systemctl stop NetworkManager.service
#yum remove dnsmasq
#subscription-manager repos --disable="*"
#subscription-manager repos --enable=rhel-7-server-rpms
#subscription-manager repos --enable=rhel-7-server-openstack-6.0-installer-rpms
#subscription-manager repos --enable=rhel-server-rhscl-7-rpms
#yum update -y
#yum install -y rhel-osp-installer
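
Before moving on, confirm that only the intended repositories are enabled:

#yum repolist enabled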

Setup Provisioning Network

Before starting the installer, ensure your network interface is set up correctly. Below is an example.

DEVICE=eth0
BOOTPROTO=none
HWADDR=52:54:00:a5:fe:26
ONBOOT=yes
HOTPLUG=yes
TYPE=Ethernet
IPADDR=192.168.122.99
NETMASK=255.255.255.0
PEERDNS=yes
DNS1=192.168.122.99
DNS2=192.168.122.1
NM_CONTROLLED=no

Install and Configure

#rhel-osp-installer
Please select NIC on which you want provisioning enabled: 
1. eth0
2. eth1
? 1

Networking setup:
Network interface: 'eth0'
IP address: '192.168.122.99'
Network mask: '255.255.255.0'
Network address: '192.168.122.0'
Host Gateway: '192.168.122.1'
DHCP range start: '192.168.122.100'
DHCP range end: '192.168.122.254'
DHCP Gateway: '192.168.122.99'
DNS forwarder: '192.168.122.1'
Domain: 'lab.local'
NTP sync host: '0.rhel.pool.ntp.org'
Timezone: 'Europe/Berlin'
Configure networking on this machine: ✓
Configure firewall on this machine: ✓

Ensure your DNS forwarder is a system that can resolve external DNS. The DHCP gateway defaults to the admin node itself since it provides DHCP; unless you want the admin node to handle routing, change it to your network's real gateway, as done below.

How would you like to proceed?:

1. Proceed with the above values
2. Change Network interface
3. Change IP address
4. Change Network mask
5. Change Network address
6. Change Host Gateway
7. Change DHCP range start
8. Change DHCP range end
9. Change DHCP Gateway
10. Change DNS forwarder
11. Change Domain
12. Change NTP sync host
13. Change Timezone
14. Do not configure networking
15. Do not configure firewall
16. Cancel Installation

9

new value for DHCP Gateway
192.168.122.1

Change the NTP server. It is critical to provide an internal NTP server, or an external one that is reachable, as OpenStack relies heavily on NTP.

How would you like to proceed?:

1. Proceed with the above values
2. Change Network interface
3. Change IP address
4. Change Network mask
5. Change Network address
6. Change Host Gateway
7. Change DHCP range start
8. Change DHCP range end
9. Change DHCP Gateway
10. Change DNS forwarder
11. Change Domain
12. Change NTP sync host
13. Change Timezone
14. Do not configure networking
15. Do not configure firewall
16. Cancel Installation

12

Enter a list of NTP hosts, separated by commas. First in the list will be the default.
clock.redhat.com
Configure client authentication
SSH public key: ''
Root password: '*******************************************'

Please set a default root password for newly provisioned machines. If you choose not to set a password, it will be generated randomly. The password must be a minimum of 8 characters. You can also set a public ssh key which will be deployed to newly provisioned machines.

How would you like to proceed?:
1. Proceed with the above values
2. Change SSH public key
3. Change Root password
4. Toggle Root password visibility
3

new value for root password
********

enter new root password again to confirm
********
Now you should configure installation media which will be used for provisioning.
Note that if you don't configure it properly, host provisioning won't work until you configure installation media manually.

Enter RHEL repo path:
1. Set RHEL repo path (http or https URL): http://
2. Proceed with configuration
3. Skip this step (provisioning won't work)
1
Path: http://192.168.122.99:8120/RHEL7
Enter RHEL repo path:
1. Set RHEL repo path (http or https URL): http://192.168.122.99:8120/RHEL7
2. Proceed with configuration
3. Skip this step (provisioning won't work)
2
Enter your subscription manager credentials:

1. Subscription manager username: myuser
2. Subscription manager password: ********
3. Comma or Space separated repositories: rhel-7-server-openstack-6.0-rpms rhel-7-server-openstack-6.0-installer-rpms rhel-7-server-rh-common-rpms
4. Subscription manager pool (recommended): mypool
5. Subscription manager proxy hostname:
6. Subscription manager proxy port:
7. Subscription manager proxy username:
8. Subscription manager proxy password:
9. Proceed with configuration
10. Skip this step (provisioning won't subscribe your machines)
9
Starting to seed provisioning data

Use 'base_RedHat_7' hostgroup for provisioning

Success!

* Foreman is running at https://admin.lab.local

Initial credentials are admin / 7wHcE3YZYHSRffmh

* Foreman Proxy is running at https://admin.lab.local:8443

* Puppetmaster is running at port 8140

The full log is at /var/log/rhel-osp-installer/rhel-osp-installer.log
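
Before provisioning anything, it is worth verifying that the admin node answers local DNS queries and forwards external ones (addresses from this example):

#dig @192.168.122.99 admin.lab.local +short
#dig @192.168.122.99 www.redhat.com +short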

Configure Provisioning Media

The admin node needs to install RHEL on the nodes it provisions. To do this we need to expose the RHEL install media over HTTP. Below is the process:

mount RHEL 7.1 install media in cdrom
#mkdir /RHEL7
#mount -o ro /dev/cdrom /RHEL7
#cp -dpR /RHEL7 /var/www/html/.
#chmod -R 755 /var/www/html/RHEL7
#semanage port -a -t http_port_t -p tcp 8120
vi /etc/httpd/conf.d/medium.conf
Listen 8120
NameVirtualHost *:8120
<VirtualHost *:8120>
DocumentRoot /var/www/html/
ServerName 192.168.122.99
<Directory "/var/www/html/">
Options All Indexes FollowSymLinks
Order allow,deny
Allow from all
</Directory>
</VirtualHost>
#iptables -I INPUT 1 -p tcp -m multiport --ports 8120 -m comment --comment "8120 accept - medium" -j ACCEPT
#iptables-save > /etc/sysconfig/iptables
#systemctl restart network.service
#systemctl restart httpd
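
A quick check that the install media is actually being served; an HTTP 200 response means provisioning hosts will be able to reach it:

#curl -s -o /dev/null -w "%{http_code}\n" http://192.168.122.99:8120/RHEL7/
200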

Host Discovery

Now that the admin node has been installed and configured, we can start our OpenStack deployment. The first step is host discovery. As mentioned, OpenStack nodes can run on bare metal or virtual machines. In either case you must configure the hosts to boot via PXE. At minimum, the controller node should have three NICs connected to the provisioning / management, external and tenant networks. The compute node should have at minimum two NICs, for provisioning / management and tenant. You generally also want an additional network for the public APIs and of course storage. Configure host networking properly before booting.
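
If you are testing on virtual infrastructure, a PXE-booting VM for discovery can be created with libvirt as in the sketch below. The network name provisioning and the sizing are assumptions; adjust for your environment.

#virt-install --name ostack-ctr1 --ram 8192 --vcpus 4 --disk size=40 \
 --pxe --network network=provisioning --os-variant rhel7 --noautoconsole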

Once the hosts are booted, Foreman will discover them and they will show up in the RHEL OSP installer under Hosts->Discovered Hosts.

Create networks in the RHEL OSP installer under Infrastructure->Subnets. You should have at minimum three subnets (provisioning / management, external and tenant). It is also recommended to separate public API and storage traffic.

Below both hosts have been discovered based on MAC address (names can be changed later).

[Screenshot: RHEL_OSP_6_Deployment_Networks]

Create an external subnet (you need to do this for every network).

[Screenshot: OSP_6_Create_External_Network]

Deploying OpenStack

The first step to deploying OpenStack is to create a new deployment by going to OpenStack Installer->Deployments. Creating a new deployment is a four-step process, shown below.

[Screenshots: OSP_6_Deployment_1 through OSP_6_Deployment_4]

Glance requires a storage back-end; in this case NFS was chosen.

[Screenshot: OSP_6_Deployment_Glance]

Cinder also requires a storage back-end; again NFS was chosen. Note that NetApp is an option as well: Red Hat partners such as NetApp have started integrating into the RHEL OSP installer.

[Screenshot: OSP_6_Deployment_Cinder]

Once the deployment has been created we can assign hosts. In this configuration we have one controller and one compute node (the minimum setup). Below we assign the controller and compute nodes to the deployment. Keep in mind the beauty of the installer is that it allows you to grow the environment and scale out more controllers or compute nodes as required. I would recommend starting small until the kinks are worked out (see the troubleshooting section).

[Screenshots: OSP_6_Assign_Controllers, OSP_6_Deployment_assign_compute]

Once our hosts have been assigned to a deployment we can update them. In this case I changed the hostnames, but you can also configure network, domain and realm information for every host.

[Screenshot: OSP_6_Deployment_rename_hosts]

Finally we are ready for deployment. Under OpenStack Installer->Deployments, select the deployment and deploy. The progress can be followed from the UI. It typically takes around two hours to deploy an OpenStack environment, and you will want to check on the progress periodically. Once RHEL is installed you can log into the controller and compute hosts to follow progress at a more granular level. The install log is located at /var/log/foreman-installer/foreman-installer.log.
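
To follow a node's progress during the Puppet phase, tail the install log on the node itself:

#tail -f /var/log/foreman-installer/foreman-installer.log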

[Screenshot: OSP_6_Deployment_status]

Congrats, you have deployed an Enterprise OpenStack environment!

[Screenshot: OSP_6_Deployment_success]

Monitoring

One of the great features of the RHEL OSP installer is monitoring. You get dashboards, reports and detailed host information that allow administrators to proactively monitor their deployments. Below are a few dashboards to give you an idea of the capability.

[Screenshots: RHEL_OSP_Monitoring dashboards]

Neutron Networking

At this point we have a running OpenStack environment. The last step is to set up OpenStack networking. In this example we will use VXLAN for tenant tunneling traffic and a flat network for external access via floating IPs.

Configure the internal tenant network using VXLAN.

neutron net-create internal --provider:network_type vxlan
neutron subnet-create internal --name internal_subnet --allocation-pool start=10.10.1.100,end=10.10.1.200 10.10.1.0/24

Configure external flat provider network

neutron net-create external --provider:network_type flat --provider:physical_network physnet-external --router:external=True
neutron subnet-create external --name external_subnet --allocation-pool start=192.168.123.100,end=192.168.123.200 --disable-dhcp --gateway 192.168.123.1 192.168.123.0/24

Configure an OpenStack router without HA

neutron router-create prod-router --ha False

Set the router gateway to our external provider network and add an interface to the tenant network

neutron router-gateway-set prod-router external
neutron router-interface-add prod-router internal_subnet

When complete, the network topology should look something like the diagram below.

[Screenshot: OSP_6_Network_Topology]
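
With the router and networks in place, you can verify everything end to end by booting a test instance and attaching a floating IP. The image and flavor names below are assumptions, and the floating IP shown is whatever neutron allocated for you:

#neutron net-list
#nova boot --flavor m1.small --image rhel7 --nic net-id=<internal net UUID> testvm
#neutron floatingip-create external
#nova floating-ip-associate testvm 192.168.123.101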

Troubleshooting

OpenStack is not for the faint of heart. If you are expecting to click next and grab a coffee, you are in for a rude awakening. OpenStack requires a certain level of Open Source and Linux knowledge. You should understand how Puppet and Foreman work; these skills are essential for troubleshooting (especially Puppet). You also need decent skills in OpenStack networking (Open vSwitch) in general. Just because the RHEL OSP installer automates everything doesn't mean it runs on autopilot.

Before we get into troubleshooting, let's review the basic workflow of a RHEL OSP deployment:

  1. Provision nodes and install base RHEL
  2. Register nodes with subscription manager
  3. Download packages from the appropriate channels
  4. Configure base networking
  5. Puppet run on all controller nodes (OpenStack, Open vSwitch and the Pacemaker cluster configured)
  6. Puppet run on all compute nodes (OpenStack Nova and Open vSwitch configured)

Here are some common problems I have run into; hopefully the solutions are helpful.

  •    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Local ip for ovs agent must be set when tunneling is enabled at
    /etc/puppet/environments/production/modules/neutron/manifests/agents/ovs.pp:32 on node controller1.lab.local

Solution: this is a configuration problem. It will occur on the controller node and means that the controller does not have access to a particular network. In this case it was due to giving only the compute node access to the tenant network. This error requires re-configuring the network and deployment on the admin node and starting over.
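
You can confirm whether the OVS agent on a node was given a tunnel IP by checking its plugin configuration. The path below is typical for this release and the address shown is hypothetical:

#grep local_ip /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini
local_ip=192.168.124.11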

  • Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Failed when searching for node ostack-ctr1.lab.local: Failed to find ostack-ctr1.lab.local via exec: Execution of '/etc/puppet/node.rb ostack-ctr1.lab.local' returned 1:
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run

Solution: this error was caused mainly by a bug fixed in RHEL OSP 6.0.1. Ensure you are running RHEL OSP 6.0.1 on RHEL 7.1 and have performed a yum update. This error requires a re-install of the admin node itself to update to the latest version.

  • Error: Deployment completes but br-ex is missing.

Solution: br-ex is an Open vSwitch bridge that handles external access for instances via floating IPs. If br-ex is missing, you won't have the ability to assign floating IPs to instances for external access. This could be a configuration problem in the deployment. Edit the controller host and ensure the external network has the correct subnet.

[Screenshot: RHEL_OSP_DEPLOYMENT_EXTERNAL_SUBNET]

In order to create the br-ex Open vSwitch bridge manually, follow these steps:

  • Ensure the physical NIC (in this case eth1) is configured correctly
#cat /etc/sysconfig/network-scripts/ifcfg-eth1 
DEVICE=eth1
DEVICETYPE=ovs
TYPE=OVSPort
OVS_BRIDGE=br-ex
ONBOOT=yes
BOOTPROTO=none
  • Ensure the br-ex Open vSwitch interface is configured correctly
#cat /etc/sysconfig/network-scripts/ifcfg-br-ex 
IPADDR="192.168.123.43"
NETMASK="255.255.255.0"
GATEWAY="192.168.123.1"
ONBOOT=yes
PEERROUTES=no
NM_CONTROLLED=no
DEFROUTE=no
PEERDNS=no
DEVICE=br-ex
DEVICETYPE=ovs
OVSBOOTPROTO="none"
TYPE=OVSBridge
  • Create br-ex bridge
#ovs-vsctl add-br br-ex
  • Add the physical NIC to the bridge
#ovs-vsctl add-port br-ex eth1
  • Create a patch port on br-int that patches over to br-ex
#ovs-vsctl add-port br-int br-int-ex -- set Interface br-int-ex type=patch options:peer=phy-br-ex
  • Create an Open vSwitch patch port on br-ex that patches over to br-int
#ovs-vsctl add-port br-ex phy-br-ex -- set Interface phy-br-ex type=patch options:peer=br-int-ex
When finished, the bridge should look like this in the output of ovs-vsctl show:
     Bridge br-ex
        Port "eth1"
            Interface "eth1"
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=br-int-ex}
        Port br-ex
            Interface br-ex
                type: internal

  • Error: registering host with foreman (https://admin.osp.lab.com) could not send facts to foreman: connection refused - connect(2)

Solution: this error occurs during discovery of hosts by Foreman. It indicates a DNS or firewall problem. RHEL OSP 6.0.1 fixed a lot of problems with discovery, but it is important to ensure DNS and DHCP are working correctly from the admin node.

  • Error: Open vSwitch interfaces show down when issuing the "ip a" command

Solution: this is generally not a problem. Open vSwitch creates an interface for every bridge for Linux compatibility reasons; they are not otherwise used. The kernel does not manage these devices and therefore sees them as down interfaces.
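
You can see this for yourself: the kernel reports the device as down while the bridge is perfectly healthy in Open vSwitch:

#ip a show br-ex | grep state
#ovs-vsctl br-exists br-ex && echo "br-ex is present in OVS"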

  • Error: Puppet throws error no certificate found and waitforcert is disabled

Solution: this problem is fairly uncommon; however, if it happens you need to regenerate certificates. Below is the process.

On puppet master (admin node)

#puppet cert sign --all
#puppet cert clean --all

On puppet agent (controller or compute)

#rm -rf /var/lib/puppet/ssl/*
vi /etc/puppet/puppet.conf
certificate_revocation = false
#puppet agent --no-daemonize --server admin.lab.local --onetime --verbose

On puppet master (admin node)

#puppet cert --list
#puppet cert sign "controller or compute hostname"

Restarting Puppet

If Puppet fails for any reason, you need to run the following command on the controller or compute node where the error occurred in order to re-run the Puppet agent:

#puppet agent -td

Summary

Enterprise OpenStack is not just about support; it is so much more. As we have seen, operationalizing OpenStack in enterprise environments requires automation, provisioning, central management and monitoring, declarative configuration management, Linux experience and expertise, and of course world-class support from a premier Open Source company like Red Hat. The biggest difference between OpenStack distributions, the feature disparity if you will, lies within the admin node. OpenStack is OpenStack, but how it is deployed and maintained will determine your success. This is just the beginning: if you like the current capabilities, you are going to love what is coming down the pipe with TripleO (OpenStack on OpenStack) and overcloud / undercloud architectures. Things are just getting started!

As always if you have feedback, ideas or suggestions I would love to hear about them.

Happy Stacking!

(c) 2015 Keith Tenzer