Auto Scaling Instances with OpenStack
Overview
Intelligently and automatically scaling applications based on resource requirements is at the heart of the OpenStack value proposition. It is one of the key capabilities that differentiate cloud from traditional virtualization infrastructure such as VMware. For those of us who have been in IT a while, auto scaling was always a white unicorn: often discussed but never actually seen. In this article we will walk through how to do it in OpenStack using Heat and Ceilometer.
Heat
Orchestration and automation within OpenStack is handled by Heat, the brains of your cloud. Heat provides a declarative structure for defining IT processes using YAML templates. You could say the value of OpenStack is implemented through Heat, as this is where your business processes live. Heat lets you automatically provision infrastructure (compute, network and storage) from YAML templates, and it also lets you define policies around running infrastructure, such as auto scaling policies.
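To give a feel for the declarative style, here is a minimal Heat template sketch that launches a single server. The flavor and image names are placeholders and will vary by environment.
[code language="yaml"]
heat_template_version: 2014-10-16
description: Minimal sketch of a declarative Heat template

resources:
  # A single Nova server; the flavor and image names below are
  # assumptions and must match what exists in your environment.
  my_server:
    type: OS::Nova::Server
    properties:
      flavor: m1.nano
      image: 'Cirros 0.3.4'
[/code]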
Ceilometer
Collecting and persisting utilization measurements within OpenStack is handled by Ceilometer. OpenStack treats IT infrastructure as a utility, and as such metering is a critical aspect. Enabling billing systems to provide pay-as-you-go consumption models requires exact metering. Beyond billing, metering is also key for auto scaling: decisions made by Heat about when to scale applications are based on data collected by Ceilometer.
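As an illustration of the data involved, the cpu_util meter that drives the scaling decisions later in this article can be inspected directly with the Ceilometer CLI (the exact output will depend on your environment and on what instances are running):
#ceilometer meter-list | grep cpu_util
#ceilometer statistics -m cpu_util -p 60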
Auto Scaling Heat Templates
A lot of the configuration information in this article comes from an article written by Christian Berendt. In this auto scaling example, Heat and Ceilometer will be used to scale CPU-bound virtual machines. Heat has the concept of a stack, which is simply the environment itself. The Heat stack template describes the process or logic around how a Heat stack will be built and managed. This is where you create the auto scaling group and configure the Ceilometer thresholds. The environment template describes the resources that make up the stack: what image or volume to use, the network configuration, the software to install and everything an instance needs to function properly. You could put everything into the Heat stack template, but separating the stack template from the environment template is much cleaner, at least in more complex configurations such as auto scaling.
Environment Template
Below we will create an environment template for a Cirros image. The template will create an instance based on the Cirros image, back it with a Cinder volume, attach a port with an IP from the private network, associate a floating IP from the public network, and apply a security group and SSH key pair. Note: you will need to make changes below depending on your OpenStack configuration. The OpenStack installation and configuration used for these examples can be found in this article.
#vi /etc/heat/templates/cirros.yaml
[code language="java"]
heat_template_version: 2014-10-16
description: A base Cirros 0.3.4 server
resources:
server:
type: OS::Nova::Server
properties:
block_device_mapping:
- device_name: vda
delete_on_termination: true
volume_id: { get_resource: volume }
flavor: m1.nano
key_name: admin
networks:
- port: { get_resource: port }
port:
type: OS::Neutron::Port
properties:
network: private
security_groups:
- all
floating_ip:
type: OS::Neutron::FloatingIP
properties:
floating_network: public
floating_ip_assoc:
type: OS::Neutron::FloatingIPAssociation
properties:
floatingip_id: { get_resource: floating_ip }
port_id: { get_resource: port }
volume:
type: OS::Cinder::Volume
properties:
image: 'Cirros 0.3.4'
size: 1
[/code]
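Before launching anything, it may help to confirm that the flavor, key pair and image referenced in the template actually exist in your environment; otherwise adjust the template accordingly. The names below are simply the ones this example assumes.
#nova flavor-list | grep m1.nano
#nova keypair-list | grep admin
#glance image-list | grep -i cirros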
Now that we have an environment template, we need to register a custom Heat resource type and link it to the above file, /etc/heat/templates/cirros.yaml.
#vi /root/environment.yaml
[code language="java"]
resource_registry:
"OS::Nova::Server::Cirros": "file:///etc/heat/templates/cirros.yaml"
[/code]
Stack Template
Below we will define our Heat stack template. It creates the following resources: scaleup_group, scaleup_policy, scaledown_policy, cpu_alarm_high and cpu_alarm_low. The scaleup_group defines how the instances should be scaled (minimum, maximum and desired capacity) and uses the custom resource type (OS::Nova::Server::Cirros) that points to the environment YAML file /etc/heat/templates/cirros.yaml. The scaleup_policy and scaledown_policy define how to handle scale-up and scale-down events. Finally, the cpu_alarm_high and cpu_alarm_low resources define the Ceilometer thresholds and the alarm actions that trigger those events.
#vi /root/example.yaml
[code language="java"]
heat_template_version: 2014-10-16
description: Example auto scale group, policy and alarm
resources:
scaleup_group:
type: OS::Heat::AutoScalingGroup
properties:
cooldown: 60
desired_capacity: 1
max_size: 3
min_size: 1
resource:
type: OS::Nova::Server::Cirros
scaleup_policy:
type: OS::Heat::ScalingPolicy
properties:
adjustment_type: change_in_capacity
auto_scaling_group_id: { get_resource: scaleup_group }
cooldown: 60
scaling_adjustment: 1
scaledown_policy:
type: OS::Heat::ScalingPolicy
properties:
adjustment_type: change_in_capacity
auto_scaling_group_id: { get_resource: scaleup_group }
cooldown: 60
scaling_adjustment: -1
cpu_alarm_high:
type: OS::Ceilometer::Alarm
properties:
meter_name: cpu_util
statistic: avg
period: 60
evaluation_periods: 1
threshold: 50
alarm_actions:
- {get_attr: [scaleup_policy, alarm_url]}
comparison_operator: gt
cpu_alarm_low:
type: OS::Ceilometer::Alarm
properties:
meter_name: cpu_util
statistic: avg
period: 60
evaluation_periods: 1
threshold: 10
alarm_actions:
- {get_attr: [scaledown_policy, alarm_url]}
comparison_operator: lt
[/code]
Update Ceilometer Collection Interval
By default Ceilometer collects CPU data from instances every 10 minutes. For this example we want to change that to 60 seconds. Change the interval of the cpu_source to 60 in the pipeline.yaml file and restart the OpenStack services.
#vi /etc/ceilometer/pipeline.yaml
[code language="java"]
- name: cpu_source
interval: 60
meters:
- "cpu"
sinks:
- cpu_sink
[/code]
#openstack-service restart
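Once the services are back up, you can optionally verify that cpu_util samples are arriving roughly every 60 seconds, assuming there are instances running that produce samples:
#ceilometer sample-list -m cpu_util -l 10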
Running Heat Stack
At this point we are ready to launch our auto scaling Heat stack. The expected result is a single Cirros instance with both a private and a floating IP.
#heat stack-create example -f /root/example.yaml -e /root/environment.yaml
+--------------------------------------+------------+--------------------+----------------------+
| id                                   | stack_name | stack_status       | creation_time        |
+--------------------------------------+------------+--------------------+----------------------+
| 6fca513c-25a1-4849-b7ab-909e37f52eca | example    | CREATE_IN_PROGRESS | 2015-08-31T16:18:02Z |
+--------------------------------------+------------+--------------------+----------------------+
Heat will create the stack and launch a single Cirros instance.
#nova list
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
| ID                                   | Name                                                  | Status | Task State | Power State | Networks                             |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
| 3f627c84-06aa-4782-8c12-29409964cc73 | ex-qeki-3azno6me5gvm-pqmr5zd6kuhm-server-gieck7uoyrwc | ACTIVE | -          | Running     | private=10.10.1.156, 192.168.122.234 |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
Heat will also create two CPU alarms, which are used to trigger scale-up and scale-down events.
#ceilometer alarm-list
+--------------------------------------+-------------------------------------+-------------------+----------+---------+------------+--------------------------------+------------------+
| Alarm ID                             | Name                                | State             | Severity | Enabled | Continuous | Alarm condition                | Time constraints |
+--------------------------------------+-------------------------------------+-------------------+----------+---------+------------+--------------------------------+------------------+
| 04b4f845-f5b6-4c5a-8af0-59e03c22e6fa | example-cpu_alarm_high-rd5kysmlahvx | ok                | low      | True    | True       | cpu_util > 50.0 during 1 x 60s | None             |
| ac81cd81-20b3-45f9-bea4-e51f00499602 | example-cpu_alarm_low-6t65kswutupz  | ok                | low      | True    | True       | cpu_util < 10.0 during 1 x 60s | None             |
+--------------------------------------+-------------------------------------+-------------------+----------+---------+------------+--------------------------------+------------------+
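If you want to see how these pieces map back to the stack, the Heat CLI can list the resources it created (the exact output will vary by environment):
#heat resource-list example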
Automatically Scaling Up
Heat scales instances up based on the cpu_alarm_high threshold: once CPU utilization is above 50%, additional instances are launched. In order to generate CPU load, log into the instance and run the "dd" command.
$ssh -i admin.pem cirros@192.168.122.232
$sudo -i
#dd if=/dev/zero of=/dev/null &
#dd if=/dev/zero of=/dev/null &
#dd if=/dev/zero of=/dev/null &
Upon running "dd" commands we should have close to 100% CPU utilization in our cirros instance. After 60 seconds we should see that Heat has scaled and we have two instances.
#ceilometer alarm-list
+--------------------------------------+-------------------------------------+-------+----------+---------+------------+--------------------------------+------------------+
| Alarm ID                             | Name                                | State | Severity | Enabled | Continuous | Alarm condition                | Time constraints |
+--------------------------------------+-------------------------------------+-------+----------+---------+------------+--------------------------------+------------------+
| 04b4f845-f5b6-4c5a-8af0-59e03c22e6fa | example-cpu_alarm_high-rd5kysmlahvx | ok    | low      | True    | True       | cpu_util > 50.0 during 1 x 60s | None             |
| ac81cd81-20b3-45f9-bea4-e51f00499602 | example-cpu_alarm_low-6t65kswutupz  | alarm | low      | True    | True       | cpu_util < 10.0 during 1 x 60s | None             |
+--------------------------------------+-------------------------------------+-------+----------+---------+------------+--------------------------------+------------------+
#nova list
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
| ID                                   | Name                                                  | Status | Task State | Power State | Networks                             |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
| 3f627c84-06aa-4782-8c12-29409964cc73 | ex-qeki-3azno6me5gvm-pqmr5zd6kuhm-server-gieck7uoyrwc | ACTIVE | -          | Running     | private=10.10.1.156, 192.168.122.234 |
| 0f69dfbe-4654-474f-9308-1b64de3f5c18 | ex-qeki-qmvor5rkptj7-krq7i66h6n7b-server-b4pk3dzjvbpi | ACTIVE | -          | Running     | private=10.10.1.157, 192.168.122.235 |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
After an additional 60 seconds we should see that Heat has scaled again and we now have three instances. Since three is the maximum for this configuration, we will not scale any higher.
#nova list
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
| ID                                   | Name                                                  | Status | Task State | Power State | Networks                             |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
| 3f627c84-06aa-4782-8c12-29409964cc73 | ex-qeki-3azno6me5gvm-pqmr5zd6kuhm-server-gieck7uoyrwc | ACTIVE | -          | Running     | private=10.10.1.156, 192.168.122.234 |
| 0e805e75-aa6f-4375-b057-2c173b68f172 | ex-qeki-gajdwmu2cgm2-vckf4g2gpwis-server-r3smbhtqij76 | ACTIVE | -          | Running     | private=10.10.1.158, 192.168.122.236 |
| 0f69dfbe-4654-474f-9308-1b64de3f5c18 | ex-qeki-qmvor5rkptj7-krq7i66h6n7b-server-b4pk3dzjvbpi | ACTIVE | -          | Running     | private=10.10.1.157, 192.168.122.235 |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
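If you do not want to wait for the alarms while testing, a scaling policy can generally also be triggered by signaling it directly through the Heat CLI, which is a convenient way to verify that the group scales independently of Ceilometer:
#heat resource-signal example scaleup_policy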
Automatically Scaling Down
Heat scales instances down based on the cpu_alarm_low threshold: once CPU utilization falls below 10%, instances are removed. We can simply kill the "dd" processes and watch Heat scale the instances back down.
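One quick way to stop the load, assuming the BusyBox-based userland in the Cirros image provides killall, is to log back into the instance and kill the processes:
$ssh -i admin.pem cirros@192.168.122.232
$sudo killall dd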
After stopping ""dd" processes we should see that the cpu_alarm_low event is triggered. This will cause Heat to scale down and remove instance.
#ceilometer alarm-list
+--------------------------------------+-------------------------------------+-------+----------+---------+------------+--------------------------------+------------------+
| Alarm ID                             | Name                                | State | Severity | Enabled | Continuous | Alarm condition                | Time constraints |
+--------------------------------------+-------------------------------------+-------+----------+---------+------------+--------------------------------+------------------+
| 04b4f845-f5b6-4c5a-8af0-59e03c22e6fa | example-cpu_alarm_high-rd5kysmlahvx | ok    | low      | True    | True       | cpu_util > 50.0 during 1 x 60s | None             |
| ac81cd81-20b3-45f9-bea4-e51f00499602 | example-cpu_alarm_low-6t65kswutupz  | alarm | low      | True    | True       | cpu_util < 10.0 during 1 x 60s | None             |
+--------------------------------------+-------------------------------------+-------+----------+---------+------------+--------------------------------+------------------+
After a few minutes we should be back to a single instance.
Summary
In this article we talked about how Heat and Ceilometer work together within OpenStack, providing the brains behind your cloud. We looked at a typical cloud use case, auto scaling, and how to configure it through Heat. Hopefully you found this article informative. As always, any feedback is greatly appreciated and, as we say at Red Hat, sharing is caring.
Happy Auto Scaling!
(c) 2015 Keith Tenzer