Tag Archives: XCP

Using Ansible to deploy ELK stack on Nutanix

Just recently my colleague Andrew Nelson (@vmwnelson) posted an article on setting up Ansible on the Nutanix platform. I am also using Ansible, in my case developing playbooks to deploy the ELK stack components (Elasticsearch, Logstash, Kibana) on a block here at Nutanix. My initial aim is to set up a single index in an Elasticsearch cluster (single node for now) and use Logstash to pipe in data to be indexed. On top of that I intend to use Kibana and the Marvel plugin to measure the point at which my index begins to struggle, based on metrics such as OS-level resource consumption, as viewed from Marvel.

From a virtual machine perspective I have a Fedora 22 based gold image. From this base image I clone one VM to be the Ansible master that I will run playbooks (orchestration) from, and another VM which I will deploy my ELK stack to. This second “target” VM has had seven vDisks added to it. The idea here is that Elasticsearch (ES) accepts a comma-separated list of data paths (in my case six linear LVM volumes, plus one for logs). ES writes to these in a round-robin fashion, so the data gets “striped”, and since Nutanix vDisks are already redundant we are getting a kind of RAID 10 for free! Here’s how my disk layout looks once configured and mounted (I am using XFS as my filesystem) on the target VM:

[root@elkhost01 ~]# df -h
/dev/mapper/esdata05-esdata05 200G 271M 200G 1% /esdata/data05
/dev/mapper/esdata03-esdata03 200G 291M 200G 1% /esdata/data03
/dev/mapper/esdata04-esdata04 200G 273M 200G 1% /esdata/data04
/dev/mapper/esdata02-esdata02 200G 271M 200G 1% /esdata/data02
/dev/mapper/esdata06-esdata06 200G 291M 200G 1% /esdata/data06
/dev/mapper/eslog-eslog 100G 150M 100G 1% /var/log/elasticsearch
/dev/mapper/esdata01-esdata01 200G 279M 200G 1% /esdata/data01

and

[root@elkhost01 ~]# lvs
 LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
 esdata01 esdata01 -wi-ao---- 200.00g
 esdata02 esdata02 -wi-ao---- 200.00g
 esdata03 esdata03 -wi-ao---- 200.00g
 esdata04 esdata04 -wi-ao---- 200.00g
 esdata05 esdata05 -wi-ao---- 200.00g
 esdata06 esdata06 -wi-ao---- 200.00g
 eslog eslog -wi-ao---- 100.00g
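
For reference, here is roughly how each data volume was built; a minimal sketch for one vDisk, assuming it shows up in the guest as /dev/sdb (device names and mount points here are illustrative):

# carve a single linear LV out of one vDisk and format it with XFS
pvcreate /dev/sdb
vgcreate esdata01 /dev/sdb
lvcreate -l 100%FREE -n esdata01 esdata01
mkfs.xfs /dev/esdata01/esdata01
mkdir -p /esdata/data01
echo "/dev/esdata01/esdata01 /esdata/data01 xfs defaults,noatime 0 0" >> /etc/fstab
mount /esdata/data01

Elasticsearch then consumes all six mounts via a comma-separated path.data entry in elasticsearch.yml:

path.data: /esdata/data01,/esdata/data02,/esdata/data03,/esdata/data04,/esdata/data05,/esdata/data06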

The next step is to install and configure Ansible. First off, create an ansible user on both the orchestration host and the target host and exchange ssh keys between the two (there’s an Ansible module that handles ssh key distribution, which I will cover at some stage), like so:

on both VMs :

useradd ansible
passwd ansible

# generate pub and priv keys ....
ssh-keygen -t rsa

# If using StrictModes (the default) in the sshd_config file,
# ensure correct permissions on the .ssh directory and files:

chmod 700 ~/.ssh 
chmod 600 ~/.ssh/authorized_keys
[ansible@elkhost01 ~]$ ls -l ~/.ssh
total 12
-rw-------. 1 ansible ansible 404 Oct 1 13:38 authorized_keys
-rw-------. 1 ansible ansible 1675 Oct 1 13:31 id_rsa
-rw-------. 1 ansible ansible 402 Oct 1 13:31 id_rsa.pub

Exchange public keys (copy each into the remote host's authorized_keys file) for passwordless access:

[ansible@ansible-host01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub 10.68.64.117
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
ansible@10.68.64.117's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh '10.68.64.117'"
and check to make sure that only the key(s) you wanted were added.

[ansible@ansible-host01 ~]$ 

[ansible@ansible-host01 ~]$ ssh 10.68.64.117
Last login: Thu Oct 1 13:38:35 2015 from 10.68.64.113
[ansible@elkhost01 ~]$
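
As an aside, the Ansible module alluded to above for distributing keys is authorized_key; a minimal sketch of a task using it might look like this (the key file path is an assumption for illustration):

- name: distribute the ansible user public key
  authorized_key:
    user: ansible
    state: present
    key: "{{ lookup('file', '/home/ansible/.ssh/id_rsa.pub') }}"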

Once you have passwordless ssh configured between your hosts – go ahead and install Ansible on the orchestration host:

# yum install ansible -y

Once installed, there are a few post-install steps and tests to make sure that Ansible is working. First off, set up an Ansible hosts inventory file that will eventually contain all the hostnames, broken out by deployment type. The default location for this file is /etc/ansible/hosts. In this instance I have chosen to specify a non-standard name/location in order to keep my hosts file within my proposed playbook directory.

[ansible@ansible-host01 elk]$ pwd
/home/ansible/elk
[ansible@ansible-host01 elk]$ cat production
# file: production

[elastic-hosts]
10.68.64.117

[kibana-hosts]
10.68.64.117

[nginx-hosts]
10.68.64.126

If the passwordless ssh setup is correct, we can test as follows:

[ansible@ansible-host01 elk]$ ansible all -m ping
10.68.64.117 | success >> {
 "changed": false,
 "ping": "pong"
}
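
Note that since the inventory file lives in a non-default location, it can be passed explicitly with -i (or pointed to from ansible.cfg), e.g.:

ansible all -i production -m ping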

Ansible machine configuration is done via playbooks, which are written in YAML. There’s a great best practice guide here, and I have followed its recommended playbook directory layout below:

elk
├── elastic.yml
├── group_vars
├── host_vars
├── kibana.yml
├── production
├── roles
│   ├── common
│   │   ├── files
│   │   ├── handlers
│   │   ├── tasks
│   │   │   └── main.yml
│   │   ├── templates
│   │   └── vars
│   │       └── main.yml
│   ├── elastic
│   │   ├── files
│   │   │   └── elasticsearch.repo
│   │   ├── handlers
│   │   │   └── main.yml
│   │   ├── tasks
│   │   │   └── main.yml
│   │   ├── templates
│   │   │   ├── elasticsearch.default.j2
│   │   │   ├── elasticsearch.in.sh.j2
│   │   │   └── elasticsearch.yml.j2
│   │   └── vars
│   │       └── main.yml
│   └── kibana
│       ├── files
│       ├── handlers
│       │   └── main.yml
│       ├── tasks
│       │   └── main.yml
│       ├── templates
│       │   └── kibana4.service.j2
│       └── vars
│           └── main.yml
└── site.yml

I am going to cover the individual roles for elasticsearch, logstash and kibana in subsequent posts. For now there’s a main site-wide playbook:

[ansible@ansible-host01 elk]$ cat site.yml
---
# file: site.yml
- include: elastic.yml
- include: kibana.yml
- include: logstash.yml
#- include: log-forwarder.yml
#- include: redis.yml
#- include: nginx.yml

Which is then broken up into individual service specific playbooks :

[ansible@ansible-host01 elk]$ cat elastic.yml
---
# file: elastic.yml
- hosts: elastic-hosts
  roles:
    - common
    - elastic
[ansible@ansible-host01 elk]$ cat kibana.yml
---
# file: kibana.yml
- hosts: kibana-hosts
  roles:
    - kibana
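
With the inventory and playbooks in place, the whole stack (or a single service) can then be rolled out with ansible-playbook, for example:

[ansible@ansible-host01 elk]$ ansible-playbook -i production site.yml
[ansible@ansible-host01 elk]$ ansible-playbook -i production elastic.yml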

I will discuss the individual roles and their associated tasks etc next time. For now this should be enough to get basic Ansible functionality going.

Sharded MongoDB config on Nutanix (2) : High Availability

One of the prime availability considerations for any horizontally scaled-out application, like a MongoDB cluster, is how that cluster behaves under a failure event. We have seen (in the MongoDB case) how replica sets are configured with additional secondary instances to handle the failure of a primary instance in a replica set. We also create a mini quorum of configuration database servers and query routers to give redundancy to the cluster “infrastructure”. However, the Nutanix XCP environment provides further protection through certain features of the Acropolis management interface. Your key VMs need to be enabled to run under high availability, so that when the underlying hypervisor host fails for any reason, these VMs fail over to another host with sufficient CPU and RAM resources. The screenshot below shows how this (Tech Preview) feature can be enabled (pre NOS 4.5) on a per-VM basis:

enable-HA

The underlying migration functionality is also used for the manual placement of key VMs. As an example, let’s consider the following layout, where two of the configdb VMs in a MongoDB cluster are co-located on the same AHV host:

mongodb-colocated-vms

Notice in the screen capture above that there are two configdb VMs on host “D”. Ideally we want to migrate one of these MongoDB config DBs to another AHV host. Let’s move the VM mongo-configdb02 to AHV host “C”…

mongodb-migrate-VM

Note that the migration process could have automatically chosen an appropriate AHV host to receive the VM. In the above case however, we have instead specified the desired host ourselves.

We can monitor the progress and duration of any migration via the VM tasks frame in Prism:

mongodb-vm-tasks-migration

As always, this workflow can also be done manually (or scripted) through the acli interface. In this example I am migrating the VM running a query router (mongos process)….

<acropolis> vm.migrate mongos01 host=10.68.64.41 
mongos01: complete 
<acropolis>

As of the time of writing this post, Acropolis Base Software (NOS) 4.5 has been released and this feature is now generally available (GA). It can be enabled cluster-wide:

ha-enable-menu-4

enable-ha-4

Nutanix customers who require HA functionality for their VMs are strongly encouraged to enable this feature.

In my next post I will be completing this short blog series on sharded MongoDB configs on Nutanix. I intend to cover how Nutanix Acropolis managed snapshots and clones can be employed to create backups and then to perform rapid build-out of dev/QA type environments. Stay tuned.

Sharded MongoDB config on Nutanix (1) : Deployment

So far I have posted on MongoDB deployments either standalone or as part of a replica set. This is fine when you can size your VM memory to hold the entire database working set. However, if your VM’s RAM will not accommodate the working set in memory, you will need to shard in order to aggregate RAM across multiple replica sets and form a MongoDB cluster.

Having already discussed using clones of gold image VMs to create members for a replica set, note that the most basic MongoDB cluster requires at least two replica sets. On top of those we need a number of MongoDB “infrastructure” VMs that make cluster operation possible: a minimum of three (3) configuration databases (mongod --configsvr) per cluster and around one (1) query router (mongos) for every two shards. Here is the layout of a cluster deployment on my lab system:

2shard-system

In the above lab deployment, for availability, I avoid co-locating any primary replica VM on the same physical host, and likewise any of the query router or configdb VMs. One thing to bear in mind is that sharding is done on a per-collection basis. Simply put, the idea behind sharding is that you split collections across the replica sets; by connecting to a mongos process you are then routed to the appropriate shard holding the part of the collection that can serve your query. The following commands show the syntax to create one of the three required configdbs (run on three separate VMs, and started first), and a query router, or mongos process (where we pass the IP addresses of each configdb server VM):

Config DB servers – each run as:
mongod --configsvr --dbpath /data/configdb --port 27019

Query router – run as:
mongos --configdb 10.68.64.142:27019,10.68.64.143:27019,10.68.64.145:27019

(The IP addresses in the mongos command line above are the addresses of each config DB server.)

This brings up an issue if you are not cloning replica VMs from “blank” gold VMs. If you clone a new replica set from a currently working replica set – so that each replica set essentially holds a full copy of all your databases and their collections – then when you come to add that replica set as a shard, you generate the error condition shown below.

Here’s an example of what can happen when you attempt to shard and your new replica set (rs02) is simply cloned off a currently running replica set (rs01):

mongos> sh.addShard("rs02/192.168.1.52")
{
 "ok" : 0,
 "errmsg" : "can't add shard rs02/192.168.1.52:27017 because a local database 'ycsb' 
exists in another rs01:rs01/192.168.1.27:27017,192.168.1.32:27017,192.168.1.65:27017"
}

This is the successful workflow adding both shards (the primary of each replica set) via the mongos router VM:

$ mongo --host localhost --port 27017
MongoDB shell version: 3.0.3
connecting to: localhost:27017/test
mongos>
 
mongos> sh.addShard("rs01/10.68.64.111")
{ "shardAdded" : "rs01", "ok" : 1 }
mongos> sh.addShard("rs02/10.68.64.110")
{ "shardAdded" : "rs02", "ok" : 1 }

We next need to enable sharding on the database and subsequently shard the collection we want to distribute across the available replica sets. The choice of shard key is crucial to future MongoDB cluster performance; issues such as read and write scaling, cardinality, etc. are covered here. For my test cluster I am using the _id field for demonstration purposes.

mongos> sh.enableSharding("ycsb")
{ "ok" : 1 }

mongos> sh.shardCollection("ycsb.usertable", { "_id": 1})
{ "collectionsharded" : "ycsb.usertable", "ok" : 1 }
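
As an aside, for keys like _id that increase monotonically (ObjectIds do), a ranged shard key concentrates new writes on a single shard; MongoDB also supports hashed shard keys, which spread writes evenly at the cost of ranged queries. The hashed variant of the command above would be:

mongos> sh.shardCollection("ycsb.usertable", { "_id": "hashed" })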

The balancer process will run for the period of time needed to migrate data between the available shards. This can take anywhere from hours to days depending on the size of the collection, the number of shards, the current workload, etc. Once complete, it results in the following sharding status output. Notice that the “chunks” of the usertable collection held in the ycsb database are now spread across both shards (522 chunks in each shard):

 mongos> sh.status()
--- Sharding Status ---
 sharding version: {
 "_id" : 1,
 "minCompatibleVersion" : 5,
 "currentVersion" : 6,
 "clusterId" : ObjectId("55f96e6c5dfc4a5c6490bea3")
}
 shards:
 { "_id" : "rs01", "host" : "rs01/10.68.64.111:27017,10.68.64.131:27017,10.68.64.144:27017" }
 { "_id" : "rs02", "host" : "rs02/10.68.64.110:27017,10.68.64.114:27017,10.68.64.137:27017" }
 balancer:
 Currently enabled: yes
 Currently running: no
 Failed balancer rounds in last 5 attempts: 0
 Migration Results for the last 24 hours:
 No recent migrations
 databases:
 { "_id" : "admin", "partitioned" : false, "primary" : "config" }
 { "_id" : "enron_mail", "partitioned" : false, "primary" : "rs01" }
 { "_id" : "mydocs", "partitioned" : false, "primary" : "rs01" }
 { "_id" : "sbtest", "partitioned" : false, "primary" : "rs01" }
 { "_id" : "ycsb", "partitioned" : true, "primary" : "rs01" }
 ycsb.usertable
 shard key: { "_id" : 1 }
 chunks:
 rs01 522
 rs02 522
 too many chunks to print, use verbose if you want to force print
 { "_id" : "test", "partitioned" : false, "primary" : "rs02" }
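
While a migration round is in flight you can also query the balancer directly from the mongos shell (the return values shown here are illustrative):

mongos> sh.getBalancerState()
true
mongos> sh.isBalancerRunning()
false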


Using Nutanix clones to deploy MongoDB replica set

In this post I am going to look at setting up a replica set to support high availability in a MongoDB environment. Replica sets contain a primary MongoDB database and a number of additional secondary replica databases. Any one of the eligible replicas can become primary in the event that the original primary fails for whatever reason. Replica set membership is usually an odd number so that primary elections cannot be tied.

Building out an HA MongoDB setup on Nutanix is relatively easy to do. Each MongoDB instance is hosted in a separate, sandboxed environment – in our case a virtual machine (VM) – and each VM is located on a separate physical hypervisor host. I have a gold image VM with a MongoDB instance installed along recommended best practice guidelines. This VM gets cloned as required whenever I need to build out a new MongoDB environment, so for a three-member replica set I need three clones.

three-replicaset

From a cluster CVM node, type:

$ acli 

<acropolis> vm.clone mongodb01,mongodb02,mongodb03 clone_from_vm=mongodb30-gold
mongodb01: complete
mongodb02: complete
mongodb03: complete
 
<acropolis> vm.list
...
mongodb01: 2b9498c1-502e-454e-93c8-931a45a321b6
mongodb02: 9a445d26-caf9-4ddf-9d8e-296ea8b6e19e
mongodb03: 9a5512fa-3d19-4ddc-8cac-11721f999459
...

<acropolis> vm.on mongodb01,mongodb02,mongodb03
mongodb01: complete
mongodb02: complete
mongodb03: complete

After powering on the VMs, check that mongod starts correctly on the default port 27017 on each VM. The first thing to verify is that the mongod process is listening on the correct address. I have set my VMs to use DHCP, and this is the address that the service needs to listen on.

# ip a

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
 link/ether 52:54:00:db:17:76 brd ff:ff:ff:ff:ff:ff
 inet 10.68.64.111/24 brd 10.68.64.255 scope global eth0


# cat /etc/mongod.conf | grep -i bind_ip
 bind_ip=127.0.0.1,10.68.64.111

# service mongod restart
# service mongod status

Once all of the VMs are up and running on their respective address:port tuples, we need to enable firewall access via iptables. Each VM that will form part of the replica set needs to allow access from the other members on mongod port 27017. So for a replica set with members 10.68.64.111, 10.68.64.114 and 10.68.64.113, run the following on each member (shown here for 10.68.64.111):

# iptables -A INPUT -s 10.68.64.113 -p tcp --destination-port 27017 -m state \
--state NEW,ESTABLISHED -j ACCEPT
# iptables -A INPUT -s 10.68.64.114 -p tcp --destination-port 27017 -m state \
--state NEW,ESTABLISHED -j ACCEPT

# service iptables save
iptables: Saving firewall rules to /etc/sysconfig/iptables:[ OK ]
# service iptables reload

Abridged iptables -L output after the above changes:

Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT tcp -- 10.68.64.113 anywhere tcp dpt:27017 state NEW,ESTABLISHED
ACCEPT tcp -- 10.68.64.114 anywhere tcp dpt:27017 state NEW,ESTABLISHED

Check access by performing a series of bi-directional tests between all the replica set members:

<10.68.64.111>$ mongo --host 10.68.64.113 --port 27017
MongoDB shell version: 3.0.3
connecting to: 10.68.64.113:27017/test
>
> quit()

Should any of the connection tests fail, revisit the iptables entries. The usual troubleshooting applies: telnet or nc, netstat, etc.
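
For example, a quick port probe with nc from one member to another (the exact output wording varies by nc version):

<10.68.64.111>$ nc -zv 10.68.64.113 27017
Connection to 10.68.64.113 27017 port [tcp/*] succeeded!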

In order to create the replica set, connect via ssh to each VM and edit mongod.conf to enable the replSet functionality:

$ grep -i replSet /etc/mongod.conf
replSet=rs01

Restart the mongod process (sudo service mongod restart) and then start a mongo shell session. On the first member of the set (the intended primary) run:

$ mongo
MongoDB shell version: 3.0.3
connecting to: test
> rs.initiate()
{
 "info2" : "no configuration explicitly specified -- making one",
 "me" : "10.68.64.111:27017",
 "ok" : 1
}
rs01:PRIMARY>

You can use the shell commands rs.conf() and rs.status() to check the replica set at any point; we’ll look at one of these outputs after completing the replica set creation. Next, from the same mongo shell session, add the other two replica nodes:

rs01:PRIMARY> rs.add("10.68.64.113")
{ "ok" : 1 }

rs01:PRIMARY> rs.add("10.68.64.114")
{ "ok" : 1 }

Potential error scenarios

  • if you didn’t clone the VMs for the replica set from a blank gold image, but rather from a VM already running a replicated MongoDB configuration, then the replication commands report errors similar to this:
{
 "info2" : "no configuration explicitly specified -- making one",
 "me" : "10.68.64.111:27017",
 "info" : "try querying local.system.replset to see current configuration",
 "ok" : 0,
 "errmsg" : "already initialized",
 "code" : 23
}

Provided that this is a greenfield install, delete the local database files in the data directory and re-run rs.initiate(), as sketched below.
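
A minimal sketch of that cleanup, assuming the default dbpath of /var/lib/mongo and the MMAPv1 file layout used by MongoDB 3.0 (both are assumptions, check your mongod.conf):

# on the affected member, with the mongod stopped
$ sudo service mongod stop
$ sudo rm /var/lib/mongo/local.*
$ sudo service mongod start

Then re-run rs.initiate() from a mongo shell session.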

  • if the firewall rules are not set correctly, then the following error message is thrown:
 "errmsg" : "Quorum check failed because not enough voting nodes responded; 
required 2 but only the following 1 voting nodes responded: 10.68.64.111:27017; 
the following nodes did not respond affirmatively: 
10.68.64.131:27017 failed with Failed attempt to connect to 10.68.64.131:27017; 
couldn't connect to server 10.68.64.131:27017 (10.68.64.131), 
connection attempt failed",

Ensure that the firewall rules allow proper access between the VM’s.

  • if replication is not enabled correctly in the mongod configuration file on each host of the replica set:
"errmsg" : "Quorum check failed because not enough voting nodes responded; 
required 2 but only the following 1 voting nodes responded: 10.68.64.110:27017; 
the following nodes did not respond affirmatively: 
10.68.64.114:27017 failed with not running with --replSet",

Once the replica set configuration is complete, check the setup by running rs.status() or rs.conf() to confirm:

rs01:PRIMARY> rs.conf()
{
 "_id" : "rs01",
 "version" : 3,
 "members" : [
 {
 "_id" : 0,
 "host" : "10.68.64.111:27017",
 "arbiterOnly" : false,
 "buildIndexes" : true,
 "hidden" : false,
 "priority" : 1,
 "tags" : {

 },
 "slaveDelay" : 0,
 "votes" : 1
 },
 {
 "_id" : 1,
 "host" : "10.68.64.113:27017",
 "arbiterOnly" : false,
 "buildIndexes" : true,
 "hidden" : false,
 "priority" : 1,
 "tags" : {

 },
 "slaveDelay" : 0,
 "votes" : 1
 },
 {
 "_id" : 2,
 "host" : "10.68.64.114:27017",
 "arbiterOnly" : false,
 "buildIndexes" : true,
 "hidden" : false,
 "priority" : 1,
 "tags" : {

 },
 "slaveDelay" : 0,
 "votes" : 1
 }
 ],
 "settings" : {
 "chainingAllowed" : true,
 "heartbeatTimeoutSecs" : 10,
 "getLastErrorModes" : {

 },
 "getLastErrorDefaults" : {
 "w" : 1,
 "wtimeout" : 0
 }
 }
}

From the output above we can see the full replica set membership, along with each member’s function and status: priority settings, whether or not the replica is hidden from user application queries, whether a replica is a full mongod instance or an arbiter (present simply to mitigate against tied primary elections), and whether any of the replicas have a delay enabled (used for backup/reporting duties).
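
These member properties can be changed after the fact with rs.reconfig(); as an illustrative sketch (not part of the build above), turning the third member into a hidden, delayed replica for backup duties would look something like:

rs01:PRIMARY> cfg = rs.conf()
rs01:PRIMARY> cfg.members[2].priority = 0
rs01:PRIMARY> cfg.members[2].hidden = true
rs01:PRIMARY> cfg.members[2].slaveDelay = 3600
rs01:PRIMARY> rs.reconfig(cfg)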

In an earlier post I showed the mongo shell commands available to calculate the working set for the database. For read-intensive workloads, where your working set is sized to fit the available RAM in the mongod server VMs, a replica set deployment can be used to run MongoDB and support high availability.

That ‘One Click’ upgrade again, in full

One way of demonstrating the concept of ‘Invisible Infrastructure’ is the ability to complete a full system upgrade with minimal service interruption. In this post I will show the “One Click” upgrade facility available on the Nutanix platform. This facility allows the admin to upgrade the Nutanix Operating System (NOS), the hypervisor, any required storage firmware, and the appropriate version of Nutanix Cluster Check (NCC) for the target NOS release.

You can choose to either upload the NOS upgrade tarball or have it automatically downloaded to a landing area – just check the Enable Automatic Download box. Here I am uploading the software to the platform.

Similar to the NOS version, the hypervisor can also be upgraded to a newer version when available.

You can either run the pre-upgrade checks standalone, without performing an upgrade, or select to upgrade directly, in which case the same checks are run before the upgrade starts.

Selecting upgrade will show the progress of the various stages of the upgrade as they occur. CVMs are upgraded sequentially and only one CVM is rebooted at a time; a CVM is always back in the cluster membership before the next CVM is restarted.

kvm-preupgrade

You can choose to upgrade the underlying hypervisor as well at this stage.

As always you can monitor progress in the Prism main window. Here we see the upgrade process has completed successfully.

kvm-upgrade-events

Nutanix Prism also shows the individual task info, i.e. task stage, CVM/host involved, time taken, etc.

The Nutanix platform upgrade takes care of all the intermediate steps and just works, regardless of the size of the cluster. There’s minimal impact and disruption as the upgrade takes place, and it enables you to carry out such tasks within normal working hours rather than losing a weekend to the usual rigours of a traditional hardware upgrade cycle.

Webscalin’ – adding Nutanix nodes

Most modern web-scale applications (NoSQL, search, big data, etc.) achieve massive elastic scale through horizontal scale-out techniques. The admins for such apps require the ability to add nodes and storage for the required scale-out without interruption to service. The workflow for adding a node to a Nutanix cluster allows such seamless addition, without any of the complex storage operations such as multipathing, zoning/masking, etc. A node is simply added to the chassis, the autodiscovery service detects the new node, and the user is then asked to push a button to complete the process. The following are some screenshots of the prescribed workflow.

After inserting the new node into the chassis slot, connect to the node’s lights-out management (IPMI) webapp via a browser (enter the IPMI address) and log in using the ADMIN credentials. You may need to enable Java in your browser and configure Java to allow the IPMI address.

Launch the remote console to access the hypervisor.

Using the ‘Power Control’ drop-down on the menu bar across the top of the frame, power on the node (if needed); otherwise log in and configure network addressing. You can at this point also set up any L2 networking, such as VLAN tagging.

Select ‘Expand Cluster’ from the right drop-down menu in the Prism GUI. The node should be auto-discovered.

Configure the required network addresses and select ‘Save’ to add the node to the cluster.

The progress of the node addition can be monitored in the Prism GUI. Note that the hypervisor was automatically upgraded in order to maintain the same software functionality across the cluster nodes.

That’s it. Once the node is added and the metadata is rebalanced across all the nodes, the new node’s storage (HDD/SSD) is added to the storage pool alongside the rest of the cluster, at which point all containers (datastores) are automatically mounted onto the newly added host and the new host is ready to receive guests! This kind of ease-of-use story is becoming paramount in terms of time to value for many web-scale applications. It’s all well and good having applications on top of NoSQL DBs that allow for rapid development and deployment; however, if the upfront planning for the underlying architecture holds everything back for days if not weeks, then modern DevOps-style operations are much harder to achieve.

Switch to Simplicity …

With the recent announcement by Nutanix of the Xtreme Computing Platform (XCP), built on a KVM-based hypervisor and the Acropolis management solution, I thought I would use this step change in technology as the basis for my inaugural blog! What I would like to highlight is how much simpler this has made deploying applications in virtual machines, particularly on a KVM platform. As most of us who have had some exposure to KVM know, KVM is in fact the amalgamation of three distinct open source projects. These are:

QEMU (Quick Emulator): an emulator and virtualizer for Linux. KVM leverages QEMU for machine and device emulation, while guest instructions run directly on the host CPU to achieve near-native performance.

KVM kernel modules: loadable kernel components which provide the virtualization infrastructure (other than device emulation). Specifically, kvm.ko provides the core virtualization infrastructure, and a processor-specific module (kvm-intel.ko or kvm-amd.ko) interacts with QEMU.

libvirt: an API for the management of virtualization environments.
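
A quick way to see all three of these layers on a running KVM host (these are generic Linux/libvirt checks, not Nutanix-specific commands):

$ lsmod | grep kvm                 # kvm.ko plus kvm_intel.ko or kvm_amd.ko loaded?
$ ls -l /dev/kvm                   # the device node QEMU opens for hardware acceleration
$ virsh -c qemu:///system version  # confirms libvirt can talk to the QEMU/KVM stack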

Let’s take a look at how a VM is created using the Nutanix Prism GUI…

Selecting the Network Create box in the VM tab: we assign a VLAN tag (64) and leave the network externally managed, i.e. the current (external to Nutanix) network infrastructure manages things such as DHCP.

Next select +VM Create and fill out the details as required above. We will add a NIC, a boot disk, and attach the CDROM image in the next steps.

Add a NIC from the previously created L2 network (VLAN 64).

Attach the CDROM image by selecting CLONE FROM NDFS FILE and specifying the path to the image. Images are stored on an NFS container created specifically for that purpose.

Add Disk – create a 100GB vDisk, stored on DEFAULT-CTR, to act as the permanent boot disk.

Power on the VM and launch the console from the Prism GUI. The VM should power on and install.

The finished product (remember to “eject” the cdrom)…

Next, I am going to step through the manual creation of a VM using the standard APIs, to show how much of their complexity has been abstracted away by doing things the Nutanix way. First off, we are going to need a virtual disk image:

$ qemu-img create -f qcow2 libvirt-example.qcow 4G

Formatting 'libvirt-example.qcow', fmt=qcow2 size=4294967296 encryption=off cluster_size=65536 lazy_refcounts=off

Here’s the syntax to create a very basic VM using the libvirt API. I am specifying the cdrom image, the virtual disk location, a name for the VM and the connection to the local libvirt instance:

$ sudo virt-install \
--cdrom=/var/lib/libvirt/images/ttylinux-virtio_x86_64-16.1.iso \
--disk=/var/lib/libvirt/images/libvirt-example.qcow,format=qcow2 \
--name=libvirt-example --ram=512 --connect qemu:///system

You can obtain the above ttylinux image here. Note also that libvirt has created a default network for the VM:

$ sudo virsh net-list --all
Name      State    Autostart   Persistent
----------------------------------------------------
default   active   yes         yes

Next, we can create another VM but this time using the QEMU interface. In this example we create a VNC endpoint to connect to the VM after start up:

sudo qemu-system-x86_64 -enable-kvm -name qemu-example \
-m 1G -hda /var/lib/libvirt/images/qemu-example.qcow2 \
-cdrom /var/lib/libvirt/images/ttylinux-virtio_x86_64-16.1.iso \
-vnc 127.0.0.1:1
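
The -vnc 127.0.0.1:1 option above places the guest console on VNC display 1, which maps to TCP port 5901, so any VNC client can then attach to it, for example:

$ vncviewer 127.0.0.1:1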

These images can of course be managed by utilities such as virt-manager, virt-viewer, etc. Equally, I have not shown the full complexity of the command line options exposed by the standard KVM APIs. I have shown, though, how the Nutanix software simplifies and abstracts away the complexity of the APIs that most provisioning and orchestration stacks have to deal with. The Nutanix platform does provide a management API and a command line syntax to build out your VMs, but I will leave that for another post in the future. Thanks for reading.