Tag Archives: AHV

Using CALM Blueprints – Automation is the new punk!

Repeatability

I can imagine there are a lot of people like me who are continually setting up and tearing down environments in order to run application benchmarks, test out APIs, or try out various new features. The consequence of this is that I have umpteen sources of best practice notes for each and every technology stack I get involved with. What’s worse is that some of the configurations and their changes are identical across multiple applications. So I often find myself digging around in directories with somewhat unhelpful names like “Notes” and “Best_Practices” or … wait for it … “Tunings”.

As part of an ongoing move towards Infrastructure as Code, I am now on a mission to get all of the crufty bits of info I keep here, there and everywhere into a source code repository. To that end I have been looking at using the Multi-VM blueprint functionality of Nutanix Calm (Cloud Application Lifecycle Management). Calm allows me to create a blueprint and reuse all my original code snippets and config edits, whether they are in Bash, Python, PowerShell and so on. Once created, the blueprint can be stored in a repository, on GitHub for example. Then every time I use that blueprint I get a repeatable deployment that is the same each and every time I run it.

Here’s one I made earlier

Let’s take a look at building out a stack to benchmark Elasticsearch using esrally. I covered some of this in my last post. I want to start off by discussing a few prerequisites. First and foremost – the image used to create the virtual machines (VMs). I used CentOS 7 cloud images, which require SSH key-based access for the default user (centos). This means I need to store both the public and private keys in the relevant parts of the configuration. See below for the Configuration > DOWNLOADABLE IMAGE CONFIGURATION and Credentials sections in the blueprint.

The blueprint automatically creates three virtual machines (VMs), one to host a single Elasticsearch instance, one for the Kibana instance and another that will run the esrally workload generator. See below for the basic layout of the blueprint. As the Kibana instance needs to know the address of the Elasticsearch instance, I need to create a dependency between the Elasticsearch and Kibana services. I do this by creating “an edge” between the services. This is delineated by the white line. That way the Kibana configuration/install only proceeds when the Elasticsearch configuration/install has completed. However, all underlying VMs are created simultaneously.

Services and dependencies

Each service requires a virtual machine in order to provide that service. So configure each VM with storage (vDISKS), network (NIC), ssh access (Credentials), along with any guest customisation and so on.  For the Search_Index (Elasticsearch) service, I built the Elasticsearch VMs to host six 200GB vdisks, and used the cloud-config already installed in the image to set access keys and permissions. See below…

Application Profiles and variables

Application profiles not only allow you to specify the platform (or substrate in Calm speak), they also let you encapsulate variables which are then passed to the application. I am deploying to a Nutanix platform in this case, but this works just as well with AWS, GCP and Azure. You can see the variables I have created in the application profile below. I could very quickly deploy several application stacks from this blueprint, each with a different Java heap size, and then make performance comparisons between them. Each application stack would be exactly the same apart from the one changed variable. By extension I could add other variables I am interested in, like LVM stripe width or filesystem block size and so on.

Application Installation and Configuration

The package install task below shows how the variables in the application profile get used. The bulk of any configuration is done here. Tasks can be assigned to any action related to a service or the application profile, so a start, restart, stop or delete can each have an associated task. For each service there’s a package install task, and that’s where we use the application profile variables. Below is the task for the Elasticsearch/Search_Index service:

The canvas (above) shows a number of ways to update or edit files based on various patterns. Note that all config file edits/updates are done in place. You should avoid using a CLI that relies on creating temporary files, otherwise your package install script could end up trying to write to or access files outside of the deployment environment. This is a potential security hole which Calm will not allow. Notice how the variable macros in the above package tasks are invoked below:

...
sudo sed -i 's/-Xms1g/-Xms@@{java_heap_size}@@g/' /etc/elasticsearch/jvm.options
...
sudo sed -i 's%path.data: /var/lib/elasticsearch%path.data: @@{elastic_data_path}@@%' /etc/elasticsearch/elasticsearch.yml
...
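
To see what the first of these edits does once the macro has been expanded, here is a minimal sketch that runs the same substitution against a scratch file rather than the real /etc/elasticsearch/jvm.options. The heap value of 8 and the scratch file contents are assumptions for illustration; in the blueprint, Calm replaces @@{java_heap_size}@@ with the profile variable before the script runs on the VM. It also assumes GNU sed, as used in the package task.

```shell
#!/usr/bin/env bash
set -eu

# Assumed value for illustration; Calm would substitute @@{java_heap_size}@@ here.
java_heap_size=8

# Scratch file standing in for /etc/elasticsearch/jvm.options.
scratch=$(mktemp)
printf '%s\n' '-Xms1g' '-Xmx1g' > "$scratch"

# Same pattern as the package install task, with the macro already expanded.
sed -i "s/-Xms1g/-Xms${java_heap_size}g/" "$scratch"

# Show the result: only the -Xms line has changed.
cat "$scratch"
```

Because the trailing g sits outside the macro, the substitution suggests that java_heap_size is expected to be a bare number of gigabytes rather than a value such as 8g.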

Calm internal macros are also available. For example, passing the address of one service into another – this is from the package task for the Data Visualisation service (Kibana instance):

...
sudo sed -i 's%^#elasticsearch.hosts: \["http://localhost:9200"\]%elasticsearch.hosts: \["http://@@{Search_Index.address}@@:@@{elastic_http_port}@@"\]%' /etc/kibana/kibana.yml
...

or generating an index number to give each VM a unique name (see the VM configuration section of any service):

elastic-@@{calm_array_index}@@
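
As a rough illustration of the names this produces, the sketch below mimics the expansion for a service with several replicas. It assumes the index starts at 0 and that the service has three replicas; both are assumptions, the function name is hypothetical, and in a real blueprint Calm performs the substitution itself.

```shell
#!/usr/bin/env bash
set -eu

# Mimic how elastic-@@{calm_array_index}@@ expands for each replica.
# Assumption: the index is zero-based.
vm_names() {
  local replicas=$1 i
  for (( i = 0; i < replicas; i++ )); do
    echo "elastic-${i}"
  done
}

# Print one name per replica for a three-replica service.
vm_names 3
```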

Provisioning and Auditing

That’s the blueprint complete. It should be saved without errors or warnings. Now it’s time to launch the blueprint to build the application stack. At this point you can name what will be your running application instance and change or set any runtime variables. Once launched, the blueprint is queued, verified and then cloned, ready to run. While it’s running you can audit the steps of the workflow in the blueprint:

Once the application is marked RUNNING, you can then either connect to individual VMs, or access an application via a browser. It’s common for all means of VM or application access to be placed in the blueprint description (Note: it also expands macro variables – see below):

The following is an example of the /etc/motd shown when logging into the VM installed with esrally:

# ssh -i ./keys.pem -l centos 10.68.58.87
Last login: Wed Jul 17 15:46:57 2019 from 10.68.64.60

Configuration successfully written to /home/centos/.rally/rally.ini. Happy benchmarking!

More info about Rally:

* Type esrally --help
* Read the documentation at https://esrally.readthedocs.io/en/1.2.1/
* Ask a question on the forum at https://discuss.elastic.co/c/elasticsearch/rally

To get started:
esrally list tracks

Or....

esrally --pipeline=benchmark-only --target-hosts=10.68.58.177:9200 \
--track=eventdata --track-repository=eventdata --challenge=bulk-size-evaluation

Conclusion 

The final version (for now) of the blueprint is available to clone or download at:

https://github.com/rayhassan/calm-bp-elastic

Upload the blueprint to the Calm service on Prism Central, then work through it as you read this post. Make your own changes if required. At the end (~10 minutes) you will have a running environment with which to test various Elasticsearch workloads. I intend to work through more blueprints related to other cloud native applications, with a view to developing larger scale deployments. Stay tuned.

Openstack + Nutanix : Nova and Cinder integration

Now that we have set up an all-in-one deployment of the Acropolis OVM and configured networking and an image registry, it’s time to look at the steps required to launch virtual machine (VM) instances and set up appropriate storage. The first step is to provide the necessary network access rules for the VMs if they don’t already exist. The easiest way to do this is to create rules that allow SSH (port 22) access from any address range and make the VMs pingable.

Compute > Access & Security > Security Groups
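
For anyone who prefers the command line to the dashboard, the same two rules can be sketched with the unified openstack client. This is an illustration rather than what was used for this post: it assumes the client is installed and authenticated, and that the rules go into the security group named default. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
set -eu

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Assumption: rules are added to the security group named "default".
group=default

# Allow SSH (TCP port 22) from any address.
run openstack security group rule create --protocol tcp --dst-port 22 --remote-ip 0.0.0.0/0 "$group"

# Allow ICMP from any address so the VMs respond to ping.
run openstack security group rule create --protocol icmp --remote-ip 0.0.0.0/0 "$group"
```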

Next, create an SSH key-pair that can be assigned to your instances, so that remote login to the VMs is restricted to holders of the matching private key. I will show how this is used later in the post, when we launch an instance. First, select the Key Pairs tab in the Access & Security frame and save the resulting PEM file to be used when accessing your VMs.

access-kp-create

Create a named key-pair (for example fedora-kp) for the set of instances you will create.

As an example, I am going to create a single volume using the Cinder service, in order to show we can attach this to a running VM. In this instance, Cinder gets redirected to the Acropolis Volume API and the subsequent volume gets attached to the instance as an iSCSI block device.

volume-create

The next step is to spin up a number of VM instances. I have given a generic instance prefix for the name, and I am choosing to boot a Fedora 23 Cloud image. You can see the Flavour Details in the side panel in the screenshot below. Note that the root disk size is big enough to accommodate the base image.

instances-launch

I also need to specify the SSH key-pair I am using and the Network on which the instances get launched. See below :

instances-network

instances-kps

At this point I can go ahead and launch my instances. We can see the 10 instances chosen all get created below, along with the assigned IP addresses from the already defined network, the instance flavour, and the named key-pair ….

instance-list

So now, if we were to take a look at the Nutanix cluster backend via Prism, we can see those VM instances created on the cluster and how they are spread across the hypervisor hosts. That’s all down to Acropolis management and placement.

prims-vm-list

We can dig a little deeper into the Acropolis functionality and show how the steps taken by the Acropolis REST API calls have built and deployed the VMs on the backend. Here’s the list of VMs that were created, as shown on the http://<CVM-IP>:2030 page.

2030-vm-list

We can also see the breakdown of the individual task steps: how long each one took, how long it might have queued for, whether it was ultimately successful, and so on. The key takeaway from all this is that the speed of creation of the VM instances is largely down to the Acropolis management interfaces consumed by the REST API calls.

ergon-task-list

Let’s take one of those VMs and add some volumes to it: a data volume and a log volume for fedvm-10. First of all we need to create the iSCSI volumes:

volume-attach

 

Then we can attach the volumes to the VM instance ….

attach-volume

We now have the two volumes attached to the VM ….

volume-attachment-list

The two volumes should show up as virtual disks under /dev in the VM itself. We can verify this by logging into the VM directly using the private key I created earlier as part of the key-pair assigned to this series of instances.

# ssh -i ./fedora-kp.pem fedora@10.68.56.29
Last login: Thu Apr 7 21:28:21 2016 from 10.68.64.172
[fedora@fedvm-10 ~]$ 

[fedora@fedvm-10 ~]$ sudo fdisk -l
Disk /dev/sda: 3 GiB, 3221225472 bytes, 6291456 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x6e3892a8

Device     Boot Start     End Sectors Size Id Type
/dev/sda1  *     2048 6291455 6289408   3G 83 Linux


Disk /dev/sdb: 10 GiB, 10737418240 bytes, 20971520 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdc: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

So from here, we can format the newly assigned disks and mount them as needed.
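
A minimal sketch of that last step follows. The choice of ext4 and the mount points /mnt/vol1 and /mnt/vol2 are assumptions, not something taken from the lab setup; the device names are those reported by fdisk above. Because mkfs destroys any existing data on the device, the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
set -eu

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Format a block device with ext4 and mount it at the given directory.
format_and_mount() {
  local dev=$1 dir=$2
  run sudo mkfs.ext4 "$dev"
  run sudo mkdir -p "$dir"
  run sudo mount "$dev" "$dir"
}

# Hypothetical mount points for the two attached volumes.
format_and_mount /dev/sdb /mnt/vol1
format_and_mount /dev/sdc /mnt/vol2
```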

That’s it for this post. Hopefully this series of posts has gone a little way to clarify how a Nutanix cluster can be used to scale out an Openstack deployment to form a highly available on-premises cloud, the deployment of which is radically simplified by using Nutanix as the Compute, Volume, Image and Network backend.

In future posts I intend to look at deploying an upstream Openstack controller, have a play around with snapshots within Openstack and their use as images. Also, some additional troubleshooting perhaps. Let me know what you find useful.

Sharded MongoDB config on Nutanix (1) : Deployment

So far I have posted on MongoDB deployments either as standalone or as part of a replica set. This is fine when you can size your VM memory to hold the entire database working set. However, if your VM’s RAM will not accommodate the working set in memory, you will need to shard to aggregate RAM from multiple replica sets and form a MongoDB cluster.

Having already discussed using clones of gold image VMs to create members for a replica set, we can build on that here: the most basic of MongoDB clusters requires at least two replica sets. On top of these we need a number of MongoDB “infrastructure” VMs that make MongoDB cluster operation possible. These comprise a minimum of three (3) Configuration Databases (mongod --configsvr) per cluster and around one (1) Query Router (mongos) for every two shards. Here is the layout of a cluster deployment on my lab system:

2shard-system
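
The sizing rule of thumb above can be turned into a quick calculation. The sketch below assumes three members per replica set, as in the lab layout, and rounds the query router count up so that an odd number of shards still gets enough mongos instances. The function name is hypothetical.

```shell
#!/usr/bin/env bash
set -eu

# Estimate the number of VMs in a sharded cluster.
#   $1 = number of shards (replica sets)
#   $2 = members per replica set (default 3, an assumption)
cluster_vm_count() {
  local shards=$1 members=${2:-3}
  local config_servers=3                  # minimum of three config databases
  local routers=$(( (shards + 1) / 2 ))   # about one mongos per two shards, rounded up
  echo $(( shards * members + config_servers + routers ))
}

# Two shards of three members each: 6 replica VMs + 3 config DBs + 1 mongos.
cluster_vm_count 2
```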

In the above lab deployment, for availability reasons, I avoid co-locating any primary replica VM on the same physical host, and likewise any of the Query Router or ConfigDB VMs. One thing to bear in mind is that sharding is done on a per-collection basis. Simply put, the idea behind sharding is that you split the collections across the replica sets, and then by connecting to a mongos process you are routed to the appropriate shard holding the part of the collection that can serve your query. The following commands show the syntax to create one of the three required configdbs (run on three separate VMs, which need to be started first) and a Query Router, or mongos process (where we add the IP addresses of each configdb server VM):

Config DB Servers – each run as:
mongod --configsvr --dbpath /data/configdb --port 27019

Query Router – run as:
mongos --configdb 10.68.64.142:27019,10.68.64.143:27019,10.68.64.145:27019

- the above IP addresses in mongos command line are the addresses of each config DB.
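
If the config server addresses are kept in a list, the --configdb argument can be assembled rather than typed by hand. A small sketch, using the three lab addresses from above and a hypothetical helper name; it only prints the resulting command line and does not start mongos.

```shell
#!/usr/bin/env bash
set -eu

# Join config server IPs into the comma-separated host:port list
# passed to mongos --configdb.
configdb_string() {
  local port=$1; shift
  local out="" ip
  for ip in "$@"; do
    out="${out:+$out,}${ip}:${port}"
  done
  echo "$out"
}

configdb=$(configdb_string 27019 10.68.64.142 10.68.64.143 10.68.64.145)

# Print the command line rather than running it.
echo "mongos --configdb ${configdb}"
```

This matches the plain host list used with the MongoDB 3.0 shell shown in this post; later releases that run the config servers as a replica set expect a replica set name prefix on this argument.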

This brings up an issue if you are not cloning replica VMs from “blank” gold VMs. If you clone a new replica set from a current working replica set, each replica set essentially holds a full copy of all your databases and their collections. Then, when you come to add such a replica set as a shard, you generate the error condition shown below.

Here’s an example of what can happen when you attempt to shard and your new replica set (rs02) is simply cloned off a currently running replica set (rs01):

mongos> sh.addShard("rs02/192.168.1.52")
{
 "ok" : 0,
 "errmsg" : "can't add shard rs02/192.168.1.52:27017 because a local database 'ycsb' exists in another rs01:rs01/192.168.1.27:27017,192.168.1.32:27017,192.168.1.65:27017"
}

This is the successful workflow adding both shards (the primary of each replica set) via the mongos router VM:

$ mongo --host localhost --port 27017
MongoDB shell version: 3.0.3
connecting to: localhost:27017/test
mongos>
 
mongos> sh.addShard("rs01/10.68.64.111")
{ "shardAdded" : "rs01", "ok" : 1 }
mongos> sh.addShard("rs02/10.68.64.110")
{ "shardAdded" : "rs02", "ok" : 1 }

We next need to enable sharding on the database and subsequently shard on the collection we want to distribute across the replica sets available. The choice of shard key is crucial to future MongoDB cluster performance. Issues such as read and write scaling, cardinality etc are covered here. For my test cluster I am using the _id field for demonstration purposes.

mongos> sh.enableSharding("ycsb")
{ "ok" : 1 }

mongos> sh.shardCollection("ycsb.usertable", { "_id": 1})
{ "collectionsharded" : "ycsb.usertable", "ok" : 1 }

The balancer process will run for the period of time needed to migrate data between the available shards. This can take anywhere from a number of hours to a number of days depending on the size of the collection, the number of shards, the current workload etc. Once complete, however, this results in the following sharding status output. Notice that the “chunks” of the usertable collection held in the ycsb database are now spread evenly across both shards (522 chunks in each shard):

 mongos> sh.status()
--- Sharding Status ---
 sharding version: {
 "_id" : 1,
 "minCompatibleVersion" : 5,
 "currentVersion" : 6,
 "clusterId" : ObjectId("55f96e6c5dfc4a5c6490bea3")
}
 shards:
 { "_id" : "rs01", "host" : "rs01/10.68.64.111:27017,10.68.64.131:27017,10.68.64.144:27017" }
 { "_id" : "rs02", "host" : "rs02/10.68.64.110:27017,10.68.64.114:27017,10.68.64.137:27017" }
 balancer:
 Currently enabled: yes
 Currently running: no
 Failed balancer rounds in last 5 attempts: 0
 Migration Results for the last 24 hours:
 No recent migrations
 databases:
 { "_id" : "admin", "partitioned" : false, "primary" : "config" }
 { "_id" : "enron_mail", "partitioned" : false, "primary" : "rs01" }
 { "_id" : "mydocs", "partitioned" : false, "primary" : "rs01" }
 { "_id" : "sbtest", "partitioned" : false, "primary" : "rs01" }
 { "_id" : "ycsb", "partitioned" : true, "primary" : "rs01" }
 ycsb.usertable
 shard key: { "_id" : 1 }
 chunks:
 rs01 522
 rs02 522
 too many chunks to print, use verbose if you want to force print
 { "_id" : "test", "partitioned" : false, "primary" : "rs02" }
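
To check the balance without reading the whole status output, the per-shard chunk counts can be compared directly. The sketch below works on shard/count pairs such as the rs01 522 and rs02 522 lines above, supplied on standard input. How those lines are extracted from sh.status() is left out, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
set -eu

# Read "shard count" pairs on stdin and print the difference between the
# largest and smallest chunk count (0 means perfectly balanced).
chunk_spread() {
  awk '
    NR == 1  { min = $2; max = $2 }
    $2 < min { min = $2 }
    $2 > max { max = $2 }
    END      { print max - min }
  '
}

# The two counts from the status output above.
printf '%s\n' 'rs01 522' 'rs02 522' | chunk_spread
```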

Additional Links: