
Automating S3-compliant Object stores via the Nutanix Objects API

As part of an API-first strategy within the company, the Objects team at Nutanix has developed a REST API to enable the automated creation, deletion, management and monitoring of S3-compliant Object stores. I was fortunate to be given early access to the API while it was still under development. As part of this preview work, I have been looking at how to use CALM’s built-in support for “chaining” REST calls together in order to build a JSON payload that creates an object store via the API.

POST /objectstores

Let’s take a brief look at a subset of the Objects API. In order to create our objectstore, we need to make several intermediate calls to the standard v3 API. These calls are used to obtain (for example) reference UUIDs for entities like the underlying Nutanix cluster or the required networks. The image below shows how the desired objectstore payload is pre-populated using macro variables that are either entered as part of the initial CALM blueprint configuration – @@{objectstore_name}@@ – or generated by CALM tasks that pass in a variable at runtime – @@{CLUSTER}@@. We’ll discuss the latter shortly.

The Objects API (OSS) is accessed via a Prism Central (PC) endpoint. Notice the Objects API endpoint URL, where @@{address}@@ defines the PC IP address.

https://@@{address}@@:9440/oss/api/nutanix/v3/objectstores

The REST call to create the objectstore is then handled by the CALM-provided URL request function, urlreq(). The underlying call is still made via the Python requests module, however. See below for how it was used in this scenario. More details on the various supported CALM functions can be found on the Nutanix documentation portal.
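
As a rough illustration of what that call boils down to, here is a sketch written directly against the requests module rather than urlreq(). The credential macros and the field names inside spec are assumptions for the purposes of this example; the authoritative payload structure comes from the Objects API spec.

# Sketch only: the spec field names and credential macros below are illustrative,
# not the definitive Objects API payload.
import json
import requests

url = "https://@@{address}@@:9440/oss/api/nutanix/v3/objectstores"
headers = {"Content-Type": "application/json", "Accept": "application/json"}

payload = {
    "api_version": "3.0",
    "metadata": {"kind": "objectstore"},
    "spec": {
        "name": "@@{objectstore_name}@@",   # entered at blueprint launch
        "resources": {
            # UUIDs below are produced by the Set Variable tasks described later
            "cluster_reference": {"kind": "cluster", "uuid": "@@{CLUSTER}@@"},
        },
    },
}

resp = requests.post(url,
                     data=json.dumps(payload),
                     headers=headers,
                     auth=("@@{pc_user}@@", "@@{pc_password}@@"),  # hypothetical credential macros
                     verify=False)
print(resp.status_code)
print(resp.text)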

Task type: Set Variable

Let’s look at how we generate the various saved UUIDs and other required entities in order to pass them around our code. Recall that such entities are used to build the final JSON payload for the objectstore creation step covered above. CALM provides a task framework that performs various functions – for example, running a script or some Python code. There is also a task type that sets a required variable. Once such a variable is created or set, it becomes available to all other tasks. The next image below shows how we configure a task to set a variable.
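
Independent of the screenshots, a minimal sketch of that mechanism (assuming the usual Set Variable convention of printing NAME=value for each declared output variable) looks like this:

# Minimal Set Variable sketch: CLUSTER is declared as the task's output variable,
# and the script sets it by printing "CLUSTER=<value>".
cluster_uuid = "00000000-0000-0000-0000-000000000000"  # placeholder; normally looked up via the v3 API
print("CLUSTER={}".format(cluster_uuid))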

Application profile: Objects

In the left-hand pane in the above image, you will see an Application profile entitled Objects. This profile gives me a set of default actions for my object store, such as Create, Start, Restart, etc. It also allows the creation of custom actions. We will look at REST_Create as an example of a custom action. From the list of tasks associated with REST_Create in the central canvas, we have a task entitled GetClusterUUID. The right-hand pane shows how this task is configured. Note that the task type is “Set Variable”. We also run a Python request in the Script canvas. This populates an output variable entitled CLUSTER, which contains the Nutanix cluster UUID. We can see how this works in a little more detail below.

Script

First, we set the credentials for Prism Central access. How credentials get set up in this kind of configuration will be discussed later in the post. The next step is to populate the REST headers, URL and JSON payload. The payload here is empty, but you can choose either to limit the number of clusters returned or to use pagination if preferred. Pagination will, however, require additional coding.

We cycle through the response content of cluster entities looking for a match against our supplied cluster name – @@{cluster_name}@@. If a match is found, we have guardrail code that ensures we only proceed if both the hypervisor and the AOS version are supported. We do this in the GetClusterUUID task as it is the first call we make; that way we exit as early as possible if we find a problem.

The matching cluster UUID from the response is saved into the CLUSTER variable. This UUID is then available to other tasks in the blueprint. Similar patterns are repeated in the tasks GetInfraNetUUID and GetClientNetUUID; both populate a variable with their respective network reference (UUID). These variables are both used in the CreateObjectstore task, covered above. Without going into too much detail, the Objects feature set is built on a microservices architecture, and the networks mentioned are required for the internal Kubernetes inter-node/pod communication.
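
Below is a rough sketch of the GetClusterUUID pattern. The response field paths and the guardrail checks are illustrative and should be validated against the v3 API spec, and the credential macros are placeholders for the credentials configured in the CALM service (covered later).

# Sketch of the GetClusterUUID Set Variable task.
import json
import requests

url = "https://@@{address}@@:9440/api/nutanix/v3/clusters/list"
headers = {"Content-Type": "application/json", "Accept": "application/json"}
payload = {"kind": "cluster"}   # empty-ish payload; add "length"/"offset" here for pagination

resp = requests.post(url, data=json.dumps(payload), headers=headers,
                     auth=("@@{pc_user}@@", "@@{pc_password}@@"), verify=False)

cluster_uuid = None
for entity in resp.json().get("entities", []):
    if entity["status"]["name"] == "@@{cluster_name}@@":
        # Guardrail: check the hypervisor type and AOS version here before proceeding
        # (exact field locations depend on the v3 cluster spec).
        cluster_uuid = entity["metadata"]["uuid"]
        break

if cluster_uuid is None:
    print("Cluster @@{cluster_name}@@ not found or not supported")
    exit(1)   # fail the task as early as possible

# Output variable for the Set Variable task
print("CLUSTER={}".format(cluster_uuid))

GetInfraNetUUID and GetClientNetUUID follow the same pattern against the v3 subnets/list endpoint, each printing its own output variable.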

CALM Service

I will quickly go over the creation of the required Objects_Store service in CALM. This covers the credentials setup mentioned earlier, among other things. I think the image below is fairly self-explanatory. It shows how to configure a blueprint to run against the incumbent Prism Central instance and deploy the application (in our case an Object store) on existing cluster infrastructure.

The CALM blueprint for automated Object store creation discussed in this post is available (in its current form) here:

https://github.com/rayhassan/calm-bp-objects

As the API develops towards General Availability, I hope to add more functionality to the blueprint (DELETE, Replace Certs, and so on). For now, here’s a quick run-through of how the blueprint deploys the objectstore via the API. The image below shows the running application after the blueprint is launched.

The objectstore is then “managed” via the now provisioned application. To create an objectstore according to the options set at blueprint launch, we run the custom actions we created earlier: first select the Manage tab and then the REST_Create task.

While the objectstore is being created, we can run other tasks whose API calls monitor objectstore progress and status. The output shown in the Audit tab is simply however we chose to format the JSON response in our REST_Status task. For example….

This ties in exactly with what we see in the Prism GUI at the same point in time.
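
For completeness, a REST_Status-style task might look something like the sketch below; the objectstores list endpoint and response fields are assumptions in the style of the v3 API rather than confirmed parts of the Objects spec.

# Sketch only: print a condensed status for our objectstore in the Audit output.
import json
import requests

url = "https://@@{address}@@:9440/oss/api/nutanix/v3/objectstores/list"
headers = {"Content-Type": "application/json", "Accept": "application/json"}

resp = requests.post(url, data=json.dumps({"kind": "objectstore"}), headers=headers,
                     auth=("@@{pc_user}@@", "@@{pc_password}@@"), verify=False)

for entity in resp.json().get("entities", []):
    if entity["spec"]["name"] == "@@{objectstore_name}@@":
        # Format whichever status fields we care about
        print(json.dumps(entity.get("status", {}), indent=2))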

Big Data use case

In addition to the use cases outlined below, I am interested in investigating how Nutanix Objects will play in the Big Data space. In particular, I want to see how Objects can be used to create standby environments for a Hadoop ecosystem, ideally in another location. This is something that usually requires a large amount of work, and using Objects there’s the potential to de-risk the data lake replication part to a large extent. I hope to make this investigation part of our upcoming Hadoop certification work.

Current Use Cases

  • Backup: Consolidate backups from Nutanix and non-Nutanix primary infrastructure.
  • Long Term Retention (e.g. Splunk cold tier, Doc archives, Images/Videos): Cheap & deep, with regulatory content retention.
  • DevOps: Enable IT to provide an AWS S3-like service, on-premises, for cloud-native applications.

Let me know if you find the Objects blueprint useful or feel free to share your experience of Nutanix Objects and how we can make things work better.

Configuring Docker Storage on Nutanix

I have recently been looking at how best to deploy a container ecosystem on my Nutanix XCP environment. At present I can run a number of containers in a virtual machine (VM). This gives me the required networking, persistent storage and the ability to migrate between hosts, something that containers themselves are only just becoming capable of in many cases. So that I can scale out my container deployment within my Docker host VM, I am going to have to consider increasing the available space within /var/lib/docker. By default, if you provide no additional storage/disk for your Docker install, loopback files get created to store your containers/images. This configuration is not particularly performant, so it’s not supported for production. You can see below how the default setup looks…

# docker info
Containers: 0
Images: 0
Storage Driver: devicemapper
 Pool Name: docker-253:1-33883287-pool
 Pool Blocksize: 65.54 kB
 Backing Filesystem: xfs
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 1.821 GB
 Data Space Total: 107.4 GB
 Data Space Available: 50.2 GB
 Metadata Space Used: 1.479 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.146 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.93 (2015-01-30)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.2.5-201.fc22.x86_64
Operating System: Fedora 22 (Twenty Two)
CPUs: 1
Total Memory: 993.5 MiB
Name: docker-client
ID: VHCA:JO3X:IRF5:44RG:CFZ6:WETN:YBJ2:6IL5:BNDT:FK32:KH6E:UZED

Configuring Docker Storage Options

Let’s look at the various methods we can use to provide dedicated block storage for Docker containers. With the devicemapper storage driver, Docker automatically creates a base thin device from two block devices, one for data and one for metadata. The thin device is automatically formatted with an empty filesystem on creation. This device is the base of all Docker images and containers: all base images are snapshots of this device, and those images are in turn used as snapshots for other images and, eventually, containers. This is the Docker-supported production setup. Also, by using LVM-based devices as the underlying storage, you access them as raw devices and no longer go through the VFS layer.

Devicemapper: direct-lvm

To begin, create two LVM devices, one for container data and another to hold metadata. By default the loopback method creates a storage pool with 100GB of space. In this example I am creating a 200G LVM volume for data and a 10G metadata volume. I prefer separate volumes where possible for performance reasons. We start by hot-adding the required Nutanix vDisks to the virtual machine (docker-directlvm) running the guest OS (Fedora 22):

<acropolis> vm.disk_create docker-directlvm create_size=200g container=DEFAULT-CTR
DiskCreate: complete
<acropolis> vm.disk_create docker-directlvm create_size=10g container=DEFAULT-CTR
DiskCreate: complete

[root@docker-directlvm ~]# lsscsi 
 
[2:0:1:0] disk NUTANIX VDISK 0 /dev/sdb 
[2:0:2:0] disk NUTANIX VDISK 0 /dev/sdc 

The next step is to create the individual LVM volumes

# pvcreate /dev/sdb /dev/sdc
# vgcreate direct-lvm /dev/sdb /dev/sdc

# lvcreate --wipesignatures y -n data direct-lvm -l 95%VG
# lvcreate --wipesignatures y -n metadata direct-lvm -l 5%VG

If setting up a new metadata pool, you need to zero the first 4k to indicate empty metadata:
# dd if=/dev/zero of=/dev/direct-lvm/metadata bs=1M count=1

For sizing the metadata volume above, the rule of thumb seems to be 0.1% of the data volume. This is somewhat anecdotal, so size with a little headroom: 0.1% of the 200G data volume here would only be around 200MB, for example, whereas the roughly 10G volume created above leaves plenty of room. Next, set the required options in the file /etc/sysconfig/docker-storage and start the Docker daemon.

DOCKER_STORAGE_OPTIONS="--storage-opt dm.datadev=/dev/direct-lvm/data --storage-opt \
dm.metadatadev=/dev/direct-lvm/metadata --storage-opt dm.fs=xfs"

You can then verify that the requested underlying storage is in use with the docker info command:

# docker info
Containers: 5
Images: 2
Storage Driver: devicemapper
 Pool Name: docker-253:1-33883287-pool
 Pool Blocksize: 65.54 kB
 Backing Filesystem: xfs
 Data file: /dev/direct-lvm/data
 Metadata file: /dev/direct-lvm/metadata
 Data Space Used: 10.8 GB
 Data Space Total: 199.5 GB
 Data Space Available: 188.7 GB
 Metadata Space Used: 7.078 MB
 Metadata Space Total: 10.5 GB
 Metadata Space Available: 10.49 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Library Version: 1.02.93 (2015-01-30)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.2.5-201.fc22.x86_64
Operating System: Fedora 22 (Twenty Two)
CPUs: 1
Total Memory: 993.5 MiB
Name: docker-directlvm
ID: VHCA:JO3X:IRF5:44RG:CFZ6:WETN:YBJ2:6IL5:BNDT:FK32:KH6E:UZED

All well and good so far, but the storage options used to expose the data and metadata locations, namely dm.datadev and dm.metadatadev, have been deprecated in favour of a preferred model: a thin pool reserved outside of Docker and passed to the daemon via the dm.thinpooldev storage option. There’s a helper script in some Linux distros, docker-storage-setup, configured via /etc/sysconfig/docker-storage-setup. It does all the heavy lifting; you just need to supply a device and/or a volume group name.

Devicemapper: Thinpool

Once again, start by creating the virtual device – a Nutanix vDisk – and adding it to the virtual machine guest OS:

<acropolis> vm.disk_create docker-thinp create_size=200g container=DEFAULT-CTR
DiskCreate: complete

[root@localhost sysconfig]# lsscsi
[0:0:0:0] cd/dvd QEMU QEMU DVD-ROM 1.5. /dev/sr0
[2:0:0:0] disk NUTANIX VDISK 0 /dev/sda
[2:0:1:0] disk NUTANIX VDISK 0 /dev/sdd

Edit the file /etc/sysconfig/docker-storage-setup as follows:

[root@localhost sysconfig]# cat /etc/sysconfig/docker-storage-setup
# Edit this file to override any configuration options specified in
# /usr/lib/docker-storage-setup/docker-storage-setup.
#
# For more details refer to "man docker-storage-setup"
DEVS=/dev/sdd
VG=docker

Then create the volume group and run the storage helper script:

[root@docker-thinp ~]# pvcreate /dev/sdd
 Physical volume "/dev/sdd" successfully created

[root@docker-thinp ~]# vgcreate docker /dev/sdd 
 Volume group "docker" successfully created 

[root@docker-thinp ~]# docker-storage-setup 
Rounding up size to full physical extent 192.00 MiB 
Logical volume "docker-poolmeta" created. 
Wiping xfs signature on /dev/docker/docker-pool. 
Logical volume "docker-pool" created. 
WARNING: Converting logical volume docker/docker-pool and docker/docker-poolmeta to pool's data and metadata volumes. 
THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.) 
Converted docker/docker-pool to thin pool. 
Logical volume "docker-pool" changed.

You can then verify the underlying storage being used in the usual way

[root@docker-thinp ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT

sdd 8:48 0 186.3G 0 disk
├─docker-docker--pool_tmeta 253:5 0 192M 0 lvm
│ └─docker-docker--pool 253:7 0 74.4G 0 lvm
└─docker-docker--pool_tdata 253:6 0 74.4G 0 lvm
 └─docker-docker--pool 253:7 0 74.4G 0 lvm


[root@docker-thinp ~]# docker info
Containers: 0
Images: 0
Storage Driver: devicemapper
 Pool Name: docker-docker--pool
 Pool Blocksize: 524.3 kB
 Backing Filesystem: xfs
 Data file:
 Metadata file:
 Data Space Used: 62.39 MB
 Data Space Total: 79.92 GB
 Data Space Available: 79.86 GB
 Metadata Space Used: 90.11 kB
 Metadata Space Total: 201.3 MB
 Metadata Space Available: 201.2 MB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Library Version: 1.02.93 (2015-01-30)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.2.5-201.fc22.x86_64
Operating System: Fedora 22 (Twenty Two)
CPUs: 1
Total Memory: 993.5 MiB
Name: docker-thinp
ID: VHCA:JO3X:IRF5:44RG:CFZ6:WETN:YBJ2:6IL5:BNDT:FK32:KH6E:UZED

On completion, this will have created the correct entries in /etc/sysconfig/docker-storage:

[root@docker-thinp ~]# cat /etc/sysconfig/docker-storage
DOCKER_STORAGE_OPTIONS=--storage-driver devicemapper --storage-opt dm.fs=xfs \
 --storage-opt dm.thinpooldev=/dev/mapper/docker-docker--pool

and at runtime it looks like this…

[root@docker-thinp ~]# ps -ef | grep docker
root 8988 1 0 16:13 ? 00:00:11 /usr/bin/docker daemon --selinux-enabled
--storage-driver devicemapper --storage-opt dm.fs=xfs 
--storage-opt dm.thinpooldev=/dev/mapper/docker-docker--pool

Bear in mind that when you change the underlying Docker storage driver or storage options, as in the examples described above, the following destructive command sequence is typically run (be sure to back up any important data before running it):

$ sudo systemctl stop docker
$ sudo rm -rf /var/lib/docker 

and then, after the changes:

$ systemctl daemon-reload
$ systemctl start docker

Additional Info

http://developerblog.redhat.com/2014/09/30/overview-storage-scalability-docker/

https://jpetazzo.github.io/2014/01/29/docker-device-mapper-resize/

http://docs.docker.com/engine/reference/commandline/daemon/#storage-driver-options