Red Hat Ceph Storage 2.0 Lab + Object Storage Configuration Guide


Overview

Ceph has become the de facto standard for software-defined storage. Ceph is 100% open source, built on open standards, and as such is offered by many vendors, not just Red Hat. If you are new to Ceph or software-defined storage, I recommend the following article to understand some high-level concepts before proceeding:

Ceph - the future of storage

In this article we will configure a Red Hat Ceph Storage 2.0 cluster and set it up for object storage. We will configure the RADOS Gateway (RGW) and the Red Hat Storage Console (RHSC), and show how to configure the S3 and Swift interfaces of the RGW. Using Python we will access both the S3 and Swift interfaces.

If you are interested in configuring Ceph for OpenStack see the following article:

OpenStack - Integrating Ceph as Storage Backend

Prerequisites

Ceph has a few different components to be aware of: monitors (MONs), storage or OSD nodes (OSDs), Red Hat Storage Console (RHSC), RHSC agents, Calamari, clients and gateways.

Monitors - Maintain maps (CRUSH, PG, OSD, etc.) and cluster state. Monitors use Paxos to establish consensus.

Storage or OSD Node - Provides one or more OSDs. Each OSD represents a disk and has a running daemon process controlled by systemctl. There are two types of disks in Ceph: data and journal. The journal enables Ceph to commit small writes quickly and guarantees atomic compound operations. Journals can be colocated with data on the same disks or kept separate. Splitting journals out to SSDs provides higher performance for certain use cases such as block.

Red Hat Storage Console (optional) - UI and dashboard that can monitor multiple clusters, not only Ceph but Gluster as well.

RHSC Agents (optional) - Each monitor and osd node runs an agent that reports to the RHSC.

Calamari (optional) - Runs on one of the monitors, collects statistics and events from the Ceph cluster and provides a REST endpoint. RHSC talks to Calamari.

Clients - Ceph provides an RBD (RADOS Block Device) client for block storage, CephFS for file storage and a FUSE client as well. The RADOS Gateway itself can be viewed as a Ceph client. Each client requires authentication if cephx is enabled. The cephx protocol works similarly to Kerberos.

Gateways (optional) - Ceph is based on RADOS (Reliable Autonomic Distributed Object Store). The RADOS Gateway is a web server that provides S3 and Swift endpoints and sends those requests to Ceph via RADOS. Similarly, there is an iSCSI gateway that provides an iSCSI target to clients and talks to Ceph via RADOS. Ceph itself is of course an object store that supports not only object but file and block clients as well.

Red Hat recommends at minimum three monitors and 10 storage nodes, all of which should be physical machines rather than VMs. For the gateways and RHSC, VMs can be used. Since the purpose of this article is to build a lab environment, we are doing everything on just three VMs. The VMs should be configured as follows with Red Hat Enterprise Linux (7.2 or 7.3):

  • ceph1: 4096 MB RAM, 2 Cores, 30GB root disk, 2 X 100 GB data disk, 192.168.122.81/24.
  • ceph2: 4096 MB RAM, 2 Cores, 30GB root disk , 2 X 100 GB data disk, 192.168.122.82/24.
  • ceph3: 4096 MB RAM, 2 Cores, 30GB root disk, 2 X 100 GB data disk, 192.168.122.83/24.

Note: this entire environment runs on my 12 GB ThinkPad laptop. If memory is tight you can cut ceph2 down to 2048 MB RAM.

The roles will be divided across the nodes as follows:

  • Ceph1: RHSC, Rados Gateway, Monitor and OSD
  • Ceph2: Calamari, Monitor and OSD
  • Ceph3: Monitor and OSD

ceph_lab_setup

Install Ceph Cluster

Register subscription and enable repositories.

# subscription-manager register
# subscription-manager list --available
# subscription-manager attach --pool=8a85f981weuweu63628333293829
# subscription-manager repos --disable=*
# subscription-manager repos --enable=rhel-7-server-rpms
# subscription-manager repos --enable=rhel-7-server-rhceph-2-mon-rpms
# subscription-manager repos --enable=rhel-7-server-rhceph-2-osd-rpms
# subscription-manager repos --enable=rhel-7-server-rhscon-2-agent-rpms

Note: If you are using CentOS you will need to install Ansible and get the ceph-ansible playbooks from GitHub.

Disable firewall

Since this is a lab environment we can make life a bit easier by disabling the firewall. If you are interested in enabling the firewall, follow the official documentation.

# systemctl stop firewalld
# systemctl disable firewalld

Configure NTP

Time synchronization is absolutely critical for Ceph. Make sure it is reliable.

# yum install -y ntp
# systemctl enable ntpd
# systemctl start ntpd

Test to ensure ntp is working properly.

# ntpq -p

Update hosts file

If DNS is working you can skip this step.

# vi /etc/hosts
192.168.122.81 ceph1.lab.com ceph1
192.168.122.82 ceph2.lab.com ceph2
192.168.122.83 ceph3.lab.com ceph3

Create Ansible User

Ceph 2.0 now uses Ansible to deploy, configure and update the cluster. A user with sudo permissions is required.

# useradd ansible
# passwd ansible
# cat << EOF > /etc/sudoers.d/ansible
ansible ALL = (root) NOPASSWD:ALL
Defaults:ansible !requiretty
EOF

Enable repositories for RHSC

# subscription-manager repos --enable=rhel-7-server-rhscon-2-installer-rpms
# subscription-manager repos --enable=rhel-7-server-rhscon-2-main-rpms

Install Ceph-Ansible

# yum install -y ceph-ansible

Setup ssh keys for ansible user

# su - ansible
$ ssh-keygen
$ ssh-copy-id ceph1
$ ssh-copy-id ceph2
$ ssh-copy-id ceph3
$ mkdir ~/ceph-ansible-keys

Update Ansible Hosts file

$ sudo vi /etc/ansible/hosts

[mons]
ceph1
ceph2
ceph3

[osds]
ceph1
ceph2
ceph3

[rgws]
ceph1
ceph3

Update Ansible Group Vars

The Ceph configuration is maintained through group vars. Sample files are provided by default; we need to copy these and then update them. For this deployment we need to update the group vars for all, mons and osds.

You can find these group var files on GitHub.

$ cd /usr/share/ceph-ansible/group_vars

Update general group vars

$ cp all.sample all
$ vi all
fetch_directory: /home/ansible/ceph-ansible-keys
cluster: ceph
ceph_stable_rh_storage: true
ceph_stable_rh_storage_cdn_install: true
generate_fsid: true
cephx: true
monitor_interface: eth0
journal_size: 1024
public_network: 192.168.122.0/24
cluster_network: ""
osd_mkfs_type: xfs
osd_mkfs_options_xfs: -f -i size=2048
radosgw_frontend: civetweb
radosgw_civetweb_port: 8080
radosgw_keystone: false

Update monitor group vars

$ cp mons.sample mons

Update osd group vars

$ cp osds.sample osds
$ vi osds
osd_auto_discovery: true
journal_collocation: true

Update rados gateway group vars

We are just going with the defaults here, so no changes are needed.

$ cp rgws.sample rgws

Run Ansible playbook

$ cd /usr/share/ceph-ansible
$ sudo cp site.yml.sample site.yml
$ ansible-playbook site.yml -vvvv

If everything is successful you should see a message similar to the one below. If something fails, simply fix the problem and re-run the playbook until it succeeds.

PLAY RECAP ********************************************************************
 ceph1 : ok=370 changed=17 unreachable=0 failed=0
 ceph2 : ok=286 changed=14 unreachable=0 failed=0
 ceph3 : ok=286 changed=13 unreachable=0 failed=0

Check Ceph Health

You should see HEALTH_OK. If it is not OK, you can run "ceph health detail" to get more information.

$ sudo ceph -s
cluster 1e0c9c34-901d-4b46-8001-0d1f93ca5f4d
health HEALTH_OK
monmap e1: 3 mons at {ceph1=192.168.122.81:6789/0,ceph2=192.168.122.82:6789/0,ceph3=192.168.122.83:6789/0}
election epoch 6, quorum 0,1,2 ceph1,ceph2,ceph3
osdmap e14: 3 osds: 3 up, 3 in
flags sortbitwise
pgmap v26: 104 pgs, 6 pools, 1636 bytes data, 171 objects
103 MB used, 296 GB / 296 GB avail
104 active+clean

Configure erasure coded pool for RADOSGW

By default the pool default.rgw.data.root, which holds data for the RADOS Gateway, is configured for replication rather than erasure coding. In order to change to erasure coding you need to delete the pool and re-create it. For object storage we usually recommend erasure coding as it is much more efficient and brings down costs.

# ceph osd pool delete default.rgw.data.root default.rgw.data.root --yes-i-really-really-mean-it

Ceph supports many different erasure coding schemes.

# ceph osd erasure-code-profile ls
default
k4m2
k6m3
k8m4

The default profile is 2+1 (k=2, m=1). Since we only have three nodes this is the only listed profile that can actually work, so we will use it.

# ceph osd erasure-code-profile get default
k=2
m=1
plugin=jerasure
technique=reed_sol_van
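
As a rough sketch of why erasure coding "brings down costs": with a k+m profile each object is split into k data chunks plus m coding chunks, so raw usage is (k+m)/k times the object size. The short Python snippet below is illustrative only (it is not part of any Ceph tooling) and compares the default 2+1 profile with 3x replication.

#!/usr/bin/python
# Illustrative only: raw-capacity overhead of erasure coding vs replication.
# With k data chunks and m coding chunks, each object consumes (k+m)/k times
# its size in raw space and survives the loss of up to m chunks.

def ec_overhead(k, m):
    return float(k + m) / k

def replication_overhead(size):
    return float(size)

print("EC 2+1 overhead:  %.2fx" % ec_overhead(2, 1))          # 1.50x
print("3x replication:   %.2fx" % replication_overhead(3))    # 3.00x

With 2+1, an object consumes 1.5x its size in raw capacity versus 3x for a default replicated pool, at the cost of tolerating the loss of only one chunk per placement group.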

Create new erasure coded pool using 2+1.

# ceph osd pool create default.rgw.data.root 128 128 erasure default

Here we are creating 128 placement groups for the pool. This was calculated using the PG calculation tool: https://access.redhat.com/labs/cephpgc/
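
The linked calculator applies a refined version of the common rule of thumb: total PGs ≈ (number of OSDs × 100) / pool size, rounded up to the nearest power of two, where pool size is the replica count for replicated pools or k+m for erasure coded pools. Below is a minimal sketch of that rule, assuming this pool holds essentially all of the cluster's data; for real sizing defer to the calculator.

#!/usr/bin/python
# Rule-of-thumb PG sizing sketch (the official calculator linked above is
# more refined and should be treated as authoritative).
import math

def pg_count(num_osds, pool_size, target_pgs_per_osd=100, data_percent=1.0):
    """Approximate PG count for one pool, rounded up to a power of two.

    pool_size is the replica count for replicated pools, or k+m for
    erasure coded pools (3 for our 2+1 profile). data_percent is the
    fraction of cluster data expected to land in this pool.
    """
    raw = (num_osds * target_pgs_per_osd * data_percent) / float(pool_size)
    return 2 ** int(math.ceil(math.log(raw, 2)))

# 3 OSDs, 2+1 erasure coding, one dominant pool
print(pg_count(num_osds=3, pool_size=3))  # -> 128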

Configure rados gateway s3 user

The RADOS Gateway was installed and configured on ceph1 by Ansible; however, a user still needs to be created for S3 access. This is not part of the Ansible installation by default.

# radosgw-admin user create --uid="s3user" --display-name="S3user"
 {
 "user_id": "s3user",
 "display_name": "S3user",
 "email": "",
 "suspended": 0,
 "max_buckets": 1000,
 "auid": 0,
 "subusers": [],
 "keys": [
 {
 "user": "s3user",
 "access_key": "PYVPOGO2ODDQU24NXPXZ",
 "secret_key": "pM1QULv2YgAEbvzFr9zHRwdQwpQiT9uJ8hG6JUZK"
 }
 ],
 "swift_keys": [],
 "caps": [],
 "op_mask": "read, write, delete",
 "default_placement": "",
 "placement_tags": [],
 "bucket_quota": {
 "enabled": false,
 "max_size_kb": -1,
 "max_objects": -1
 },
 "user_quota": {
 "enabled": false,
 "max_size_kb": -1,
 "max_objects": -1
 },
 "temp_url_keys": []
 }

Test S3 Access

In order to test S3 access we will use a basic Python script that uses the boto library.

# pip install boto

Create the script and update it with the information from above. The script is also available on GitHub.

# cd /root
# vi s3upload.py
#!/usr/bin/python
from __future__ import print_function

import sys

import boto
import boto.s3.connection
from boto.s3.key import Key

access_key = 'PYVPOGO2ODDQU24NXPXZ'
secret_key = 'pM1QULv2YgAEbvzFr9zHRwdQwpQiT9uJ8hG6JUZK'
rgw_hostname = 'ceph1'
rgw_port = 8080
local_testfile = '/tmp/testfile'   # must exist; create it first, e.g. with dd
bucketname = 'mybucket'


conn = boto.connect_s3(
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key,
    host=rgw_hostname,
    port=rgw_port,
    is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)

def printProgressBar(iteration, total, prefix='', suffix='', decimals=1, length=100, fill='#'):
    # Render a simple text progress bar, overwriting the same line each call
    percent = ("{0:." + str(decimals) + "f}").format(100 * (iteration / float(total)))
    filledLength = int(length * iteration // total)
    bar = fill * filledLength + '-' * (length - filledLength)
    print('\r%s |%s| %s%% %s' % (prefix, bar, percent, suffix), end='')
    sys.stdout.flush()
    if iteration == total:
        print()

def percent_cb(complete, total):
    # boto upload callback: complete = bytes sent so far, total = file size
    printProgressBar(complete, total)

# Create the bucket and list all buckets owned by this user
bucket = conn.create_bucket(bucketname)
for b in conn.get_all_buckets():
    print("{name}\t{created}".format(name=b.name, created=b.creation_date))

# Upload the test file, reporting progress via the callback
bucket = conn.get_bucket(bucketname)
k = Key(bucket)
k.key = 'my test file'
k.set_contents_from_filename(local_testfile, cb=percent_cb, num_cb=20)

Change permissions and run script

# chmod 755 s3upload.py
# ./s3upload.py

Watch the 'ceph -s' command.

# watch ceph -s
Every 2.0s: ceph -s Thu Feb 2 19:09:58 2017

cluster 1e0c9c34-901d-4b46-8001-0d1f93ca5f4d
 health HEALTH_OK
 monmap e1: 3 mons at {ceph1=192.168.122.81:6789/0,ceph2=192.168.122.82:6789/0,ceph3=192.168.122.83:6789/0}
 election epoch 36, quorum 0,1,2 ceph1,ceph2,ceph3
 osdmap e102: 3 osds: 3 up, 3 in
 flags sortbitwise
 pgmap v1543: 272 pgs, 12 pools, 2707 MB data, 871 objects
 9102 MB used, 287 GB / 296 GB avail
 272 active+clean
 client io 14706 kB/s wr, 0 op/s rd, 32 op/s wr
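
If you want to confirm the object actually landed in the gateway, a short boto snippet along the lines of the sketch below (reusing the same example credentials and endpoint as s3upload.py) can list the bucket contents and read the object back:

#!/usr/bin/python
# Optional verification sketch: list the objects in 'mybucket' and download
# the test object back. Uses the same (example) credentials and endpoint
# as s3upload.py above.
import boto
import boto.s3.connection

conn = boto.connect_s3(
    aws_access_key_id='PYVPOGO2ODDQU24NXPXZ',
    aws_secret_access_key='pM1QULv2YgAEbvzFr9zHRwdQwpQiT9uJ8hG6JUZK',
    host='ceph1', port=8080, is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)

bucket = conn.get_bucket('mybucket')
for key in bucket.list():
    print("%s\t%d bytes" % (key.name, key.size))

# Read the uploaded object back to a local file
key = bucket.get_key('my test file')
key.get_contents_to_filename('/tmp/testfile.downloaded')

The downloaded file should match the original /tmp/testfile.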

Configure rados gateway swift user

In order to enable access to the object store using Swift, you need to create a sub-user (nested user) for Swift access. This user is created under an already existing user; we will use the s3user created earlier. From the outside, the Swift user appears as its own user.

# radosgw-admin subuser create --uid=s3user --subuser=s3user:swift --access=full
{
"user_id": "s3user",
"display_name": "S3user",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"auid": 0,
"subusers": [
{
"id": "s3user:swift",
"permissions": "full-control"
}
],
"keys": [
{
"user": "s3user",
"access_key": "PYVPOGO2ODDQU24NXPXZ",
"secret_key": "pM1QULv2YgAEbvzFr9zHRwdQwpQiT9uJ8hG6JUZK"
}
],
"swift_keys": [
{
"user": "s3user:swift",
"secret_key": "vzo0KErmx5I9zaE3Y7bIOGGbJaECpJmNtNikFEYh"
}
],
"caps": [],
"op_mask": "read, write, delete",
"default_placement": "",
"placement_tags": [],
"bucket_quota": {
"enabled": false,
"max_size_kb": -1,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"max_size_kb": -1,
"max_objects": -1
},
"temp_url_keys": []
}

Generate keys

This is done by default, but in case you want to regenerate keys you can do so at any time.

# radosgw-admin key create --subuser=s3user:swift --key-type=swift --gen-secret

Test swift access

We will use Python again, this time via the Swift CLI, which is written in Python.

# pip install --upgrade setuptools
# pip install python-swiftclient

List buckets using swift

# swift -A http://192.168.122.81:8080/auth/1.0 -U s3user:swift -K 'DvvYI2uzd9phjHNTa4gag6VkWCrX29M17A0mATRg' list

If everything worked you should see the bucket called 'mybucket'.
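
The same access can also be scripted with the python-swiftclient library directly. Below is a minimal sketch using v1 auth against the gateway; the user and endpoint match the CLI example above, and you would substitute the Swift secret key generated on your own system.

#!/usr/bin/python
# Minimal python-swiftclient sketch using v1 auth against the RADOS Gateway.
# Substitute the swift secret key generated on your system.
import swiftclient

conn = swiftclient.Connection(
    authurl='http://192.168.122.81:8080/auth/1.0',
    user='s3user:swift',
    key='DvvYI2uzd9phjHNTa4gag6VkWCrX29M17A0mATRg',
)

# List containers (S3 buckets show up here as containers)
for container in conn.get_account()[1]:
    print(container['name'])

# Upload a small object into the existing container
conn.put_object('mybucket', 'hello-swift.txt', contents='hello from swift')

# List objects in the container
for obj in conn.get_container('mybucket')[1]:
    print("%s\t%d bytes" % (obj['name'], obj['bytes']))

Objects uploaded this way are visible through the S3 interface as well, since both front ends store into the same RADOS pools.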

Configure Red Hat Storage Console (RHSC)

Next we will configure the storage console and import the existing cluster. Ansible does not take care of setting up RHSC.

Install RHSC

# yum install -y rhscon-core rhscon-ceph rhscon-ui

Configure RHSC

# skyring-setup
Would you like to create one now? (yes/no): yes
 Username (leave blank to use 'root'):
 Email address:
 Password:
 Password (again):
 Superuser created successfully.
 Installing custom SQL ...
 Installing indexes ...
 Installed 0 object(s) from 0 fixture(s)
 ValueError: Type carbon_var_lib_t is invalid, must be a file or device type
 Created symlink from /etc/systemd/system/multi-user.target.wants/carbon-cache.service to /usr/lib/systemd/system/carbon-cache.service.
 Job for httpd.service failed because the control process exited with error code. See "systemctl status httpd.service" and "journalctl -xe" for details.
 Please enter the FQDN of server [ceph1.lab.com]:
 Skyring uses HTTPS to secure the web interface.
 Do you wish to generate and use a self-signed certificate? (Y/N): y

-------------------------------------------------------
 Now the skyring setup is ready!
 You can start/stop/restart the server by executing the command
 systemctl start/stop/restart skyring
 Skyring log directory: /var/log/skyring
 URLs to access skyring services
 - http://ceph1.lab.com/skyring
 - http://ceph1.lab.com/graphite-web
 -------------------------------------------------------
 Done!

Once installation is complete you can access RHSC via web browser.

https://192.168.122.81
user: admin password: admin

Configure RHSC Agent

Each ceph monitor and osd node requires an RHSC agent.

Install agent

Ansible likely took care of this, but it doesn't hurt to check.

# yum install -y rhscon-agent
# curl 192.168.122.81:8181/setup/agent/ | bash

Setup calamari server

The Calamari server runs on one of the monitor nodes. Its purpose is to collect cluster health, events and statistics. In this case we will run the Calamari server on ceph2.

# yum install calamari-server
# calamari-ctl clear --yes-i-am-sure

Gotta love the --yes-i-am-sure flags.

# calamari-ctl initialize --admin-username admin --admin-password admin --admin-email ktenzer@redhat.com

Accept nodes in RHSC

https://192.168.122.81

Click on tasks icon at top right and accept all hosts.

ceph_accept_hosts

Once the hosts are accepted you should see them listed as available.

ceph_hosts

Import cluster in RHSC

Go to clusters and import a cluster, making sure to select the monitor node running the Calamari server, in this case ceph2.

Note: we could have also deployed Ceph using RHSC (that is what "New Cluster" does), but that is no fun.

ceph_import_cluster

Click import to start the import process.

ceph_import_cluster_2

The cluster should now be visible in RHSC.

ceph_cluster_complete

The RHSC dashboard shows a high-level view of the Ceph cluster.

ceph_dashboard

Summary

In this article we saw how to deploy a Ceph 2.0 cluster from scratch using VMs that can run on your laptop. We enabled object storage through the RADOS Gateway and configured both S3 and Swift access. Finally we set up the Red Hat Storage Console (RHSC) to provide insight into our Ceph cluster. This gives you a good starting point for your journey into Ceph and the future of storage, which is software-defined as well as object based. Unlike other storage systems, Ceph is open source, built on open standards and truly unified (object, block, file). Ceph is supported by many vendors and can run on any x86 hardware, even commodity gear. What more is there to say?

Happy Cephing!

(c) 2017 Keith Tenzer