ELK on Nutanix : Kibana

It might seem like I am doing things out of sequence by looking at the visualisation layer of the ELK stack next. However, recall from my original post that I wanted to build sets of unreplicated indexes and then use Logstash to fire test workloads at them. Hence, I am covering Elasticsearch and Kibana first. This brings me to another technical point that I need to cover. For a single set of indexes to be recoverable when running on a single node, we need to set the following parameters in our Elasticsearch playbook:

So in file: roles/elastic/vars/main.yml
...
elasticsearch_gateway_recover_after_nodes: 1
elasticsearch_gateway_recover_after_time: 5m
elasticsearch_gateway_expected_nodes: 1
...

These are then set in the elasticsearch.yml.j2 file as follows:

# file: roles/elastic/templates/elasticsearch.yml.j2
#{{ ansible_managed }}

...

# Allow recovery process after N nodes in a cluster are up:
#
#gateway.recover_after_nodes: 2
{% if elasticsearch_gateway_recover_after_nodes is defined %}
gateway.recover_after_nodes: {{ elasticsearch_gateway_recover_after_nodes }}
{% endif %}

and so on ....

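This allows the indexes to be recovered when there is only a single node in the cluster. See below for the state of my indexes after a reboot. The yellow status is expected on a single node: all four primary shards are active, while the four unassigned shards are replicas with no second node to live on: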

[root@elkhost01 elasticsearch]# curl -XGET http://localhost:9200/_cluster/health?pretty
{
 "cluster_name" : "nx-elastic",
 "status" : "yellow",
 "timed_out" : false,
 "number_of_nodes" : 1,
 "number_of_data_nodes" : 1,
 "active_primary_shards" : 4,
 "active_shards" : 4,
 "relocating_shards" : 0,
 "initializing_shards" : 0,
 "unassigned_shards" : 4,
 "delayed_unassigned_shards" : 0,
 "number_of_pending_tasks" : 0,
 "number_of_in_flight_fetch" : 0
}
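
The same check can be made part of a play, so that later tasks only run once the node has recovered its indexes. What follows is a minimal sketch rather than part of my playbook. It assumes the uri module is usable on the target (older Ansible releases may need httplib2 installed there) and that ES answers on localhost:9200 as in the curl call above. The task name, retry count and delay are made up for illustration.

```yaml
# Sketch only: poll the health API until the single node reports yellow or green.
- name: wait for elasticsearch to recover its indexes
  uri:
    url: http://localhost:9200/_cluster/health
    return_content: yes
  register: es_health
  # yellow is the best a single node can do while replicas are configured
  until: es_health.json is defined and es_health.json.status in ['yellow', 'green']
  retries: 30   # illustrative values, not tuned
  delay: 10
  tags: elastic
```

Thirty retries at ten-second intervals gives roughly the same five-minute window as the recover_after_time value above.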

Let's now look at the Kibana playbook I am attempting. Unfortunately, Kibana is distributed as a compressed tar archive, which means that the yum and dnf modules are no help here. There is, however, a very useful unarchive module, but first we need to download the tar bundle using get_url as follows:

- name: download kibana tar file
 get_url: url=https://download.elasticsearch.org/kibana/kibana/kibana-{{ kibana_version }}-linux-x64.tar.gz
          dest=/tmp/kibana-{{ kibana_version }}-linux-x64.tar.gz mode=755
 tags: kibana

I initially tried unarchiving the Kibana bundle into /tmp. I then intended to copy everything below the version-specific directory (/tmp/kibana-4.0.1-linux-x64) into the Ansible-created /opt/kibana directory. This proved problematic, as neither the synchronize nor the copy module seemed set up to do a mass copy from one directory structure to another. Maybe I am just not getting it. I even tried using with_items loops, but no joy, as fileglobs are not recursive. Answers on a postcard are always appreciated. In the end I just did this:

- name: create kibana directory
 become: true
 file: owner=kibana group=kibana path=/opt/kibana state=directory
 tags: kibana

- name: extract kibana tar file
 become: true
 unarchive: src=/tmp/kibana-{{ kibana_version }}-linux-x64.tar.gz dest=/opt/kibana copy=no
 tags: kibana
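
For what it is worth, one possible answer to the copy problem is to never create the versioned directory in the first place. This is a minimal sketch and not what I ran above. It assumes Ansible 2.2 or later (for extra_opts, and for remote_src, which replaces copy=no) and GNU tar on the target:

```yaml
# Sketch only: unpack straight into /opt/kibana, dropping the
# kibana-<version>-linux-x64 top-level directory from every path.
- name: extract kibana tar file without the versioned directory
  become: true
  unarchive:
    src: /tmp/kibana-{{ kibana_version }}-linux-x64.tar.gz
    dest: /opt/kibana
    remote_src: yes
    extra_opts:
      - --strip-components=1
    # skip the task once the binary is already in place
    creates: /opt/kibana/bin/kibana
  tags: kibana
```

With this layout, any path that includes the versioned directory (such as the start command for the service) would shorten to /opt/kibana/bin/kibana.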

The next thing to do was to create a systemd service unit. There isn’t one for Kibana as there is no rpm package available. Usual templating applies here :

- name: install kibana as systemd service
 become: true
 template: src=kibana4.service.j2 dest=/etc/systemd/system/kibana4.service owner=root
           group=root mode=0644
 notify:
 - restart kibana
 tags: kibana

And the service unit file looked like:

[ansible@ansible-host01 templates]$ cat kibana4.service.j2
# {{ ansible_managed }}

[Service]
ExecStart=/opt/kibana/kibana-{{ kibana_version }}-linux-x64/bin/kibana
Restart=always
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=kibana4
User=root
Group=root
Environment=NODE_ENV=production

[Install]
WantedBy=multi-user.target

This all seemed to work as I could now access Kibana via my browser. No indexes yet of course :

[Screenshot: kibana_initial_install]
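
Rather than relying on the browser, a play can also check that Kibana has come up. This is a small sketch, assuming Kibana listens on its default port of 5601 on the target host. The delay and timeout values are illustrative:

```yaml
# Sketch only: block until something is listening on the Kibana port.
- name: wait for kibana to start listening
  wait_for: port=5601 delay=5 timeout=60
  tags: kibana
```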

There are one or two plays I would still like to document. Firstly, the ‘notify’ actions in some of the plays. These are used to call, in my case, the restart handlers, which in turn cause the service in question to be restarted. See the next section:

# file: roles/kibana/handlers/main.yml

- name: restart kibana
 become: true
 service: name=kibana4 state=restarted
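
One thing worth guarding against: systemd only picks up a new or changed unit file after a daemon-reload. Below is a hedged sketch of how the handlers file could cover that. It assumes the service is named after the kibana4.service unit file installed earlier. The 'reload systemd' handler is hypothetical and not part of the role shown here:

```yaml
# file: roles/kibana/handlers/main.yml (sketch)
# Handlers run in the order they are defined here, so the reload comes first.
- name: reload systemd
  become: true
  command: systemctl daemon-reload

- name: restart kibana
  become: true
  service: name=kibana4 state=restarted
```

The task that installs the unit file would then notify both 'reload systemd' and 'restart kibana'.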

I wanted to document this next feature simply because it’s so useful: tags. As you will have noticed, I have assigned a tag to every play/task in the playbook so far. For testing purposes, they allow you to run specific plays. You can then troubleshoot just that particular play and see what’s going on.

 ansible-playbook -i ./production site.yml --tags "kibana" --ask-sudo-pass
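
Tags can also be attached once per role instead of once per task. Here is a sketch of what that might look like in site.yml, assuming the three roles are all applied to the elastic-hosts group. My actual site.yml is not shown in this post, so treat the layout as hypothetical:

```yaml
# file: site.yml (hypothetical layout)
# Every task inside a role inherits the tag given to that role.
- hosts: elastic-hosts
  roles:
    - { role: config, tags: config }
    - { role: elastic, tags: elastic }
    - { role: kibana, tags: kibana }
```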

Now that I have the basic plays to get my Elasticsearch and Kibana services up and running via Ansible, it’s time to start looking at Logstash. Next time I post on ELK type stuff, I will try to look at logging and search use cases. Once I crack how they work of course.

ELK on Nutanix : Elasticsearch

In this second post on using Ansible to deploy the ELK stack on Nutanix, I will cover my initial draft of a playbook for Elasticsearch (ES). Recall from my previous post that the playbook layout looks like:

[ansible@ansible-host01 roles]$ tree elastic
elastic
├── files
│   └── elasticsearch.repo
├── handlers
│   └── main.yml
├── tasks
│   └── main.yml
├── templates
│   ├── elasticsearch.default.j2
│   ├── elasticsearch.in.sh.j2
│   └── elasticsearch.yml.j2
└── vars
 └── main.yml

There’s also an additional role at play here, config, which holds the basic configuration for the underlying VM guest OS. We need to look at that too:

[ansible@ansible-host01 roles]$ tree config
config
├── files
├── handlers
├── tasks
│   └── main.yml
├── templates
└── vars
 └── main.yml

The config role is where I set things via the Ansible sysctl module, or add entries to files (using lineinfile) in order to set max memory, ulimits and so on. It’s generic system configuration. For example:

#installing java runtime pkgs (pre-req for ELK)
- name: install java 8 runtime
 become: true
 yum: name=java state=installed
 tags: config

#set system max/min numbers...
- name: set maximum map count in sysctl/systemd
 become: true
 sysctl: name=vm.max_map_count value={{ os_max_map_count }} state=present
 tags: config

...

- name: set soft limits for open files
 become: true
 lineinfile: dest=/etc/security/limits.conf line="{{ elasticsearch_user }} soft nofile {{ elasticsearch_max_open_files }}" insertafter=EOF backup=yes
 tags: config

- name: set max locked memory
 become: true
 lineinfile: dest=/etc/security/limits.conf line="{{ elasticsearch_user }} - memlock {{ elasticsearch_max_locked_memory }}" insertafter=EOF backup=yes
 tags: config

...
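
As an alternative to lineinfile, the pam_limits module (Ansible 2.0 and later) manages entries in /etc/security/limits.conf directly. Below is a minimal sketch of the soft nofile task in that style, reusing the same variables. I have not run this variant:

```yaml
# Sketch only: same limit as the lineinfile task above, via pam_limits.
- name: set soft limits for open files
  become: true
  pam_limits:
    domain: "{{ elasticsearch_user }}"
    limit_type: soft
    limit_item: nofile
    value: "{{ elasticsearch_max_open_files }}"
  tags: config
```

The attraction is that changing the variable later should update the existing entry rather than append a second line to the file.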

Here might be a good time to touch upon how Ansible allows you to set variables. Within the directory of each role there’s a subdir called vars and all the variables needed for that role are contained in the YAML file (main.yml). Here’s a snippet:

# can use vars to set versioning and user 
elasticsearch_version: 1.7.0
elasticsearch_user: elasticsearch
...

# here's how we can specify the data volumes that ES will use 
elasticsearch_data_dir: /esdata/data01,/esdata/data02,/esdata/data03,/esdata/data04,/esdata/data05,/esdata/data06

...

# Virtual memory settings - ES heap is set to half my current VM RAM
# but no greater than 32GB for performance reasons
elasticsearch_heap_size: 16g
elasticsearch_max_locked_memory: unlimited
elasticsearch_memory_bootstrap_mlockall: "true"

....

# Good idea not to go with the ES default names of Franz Kafka etc
elasticsearch_cluster_name: nx-elastic
elasticsearch_node_name: nx-esnode01

# My initial nodes will be both cluster quorum members and data "workhorse" nodes.
# I will separate duties as I scale. Also I set the min master nodes to 1 so that
# my ES cluster comes up while initially testing a single index
elasticsearch_node_master: "true"
elasticsearch_node_data: "true"
elasticsearch_discovery_zen_minimum_master_nodes: 1
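
The heap rule in the comments above could be computed from facts rather than hard-coded. This is a sketch, assuming fact gathering is enabled so that ansible_memtotal_mb is available. es_heap_cap_mb is a hypothetical helper variable, set just under the 32GB ceiling:

```yaml
# hypothetical cap: 31GB expressed in MB, just under the 32GB ceiling
es_heap_cap_mb: 31744
# half of the VM's RAM in MB, but never more than the cap
elasticsearch_heap_size: "{{ [ansible_memtotal_mb // 2, es_heap_cap_mb] | min }}m"
```

The result is a value in megabytes, so it will not come out as a round 16g even on a 32GB VM.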

We’ll see how we use these variables as we cover more features. Next up, I used the shell module together with registered variables to provide conditional behaviour for the package install:

- name: check for previous elasticsearch installation
 shell: if [ -e /usr/share/elasticsearch/lib/elasticsearch-{{ elasticsearch_version }}.jar ]; then echo yes; else echo no; fi;
 register: version_exists
 always_run: True
 tags: elastic

- name: uninstalling previous version if applicable
 become: true
 command: yum erase -y elasticsearch
 when: version_exists.stdout == 'no'
 ignore_errors: true
 tags: elastic
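
The same test can be written without a shell one-liner. This is a sketch using the stat and yum modules, assuming the same jar path. es_jar is a hypothetical register name:

```yaml
# Sketch only: stat replaces the shell test, yum replaces "yum erase".
- name: check for previous elasticsearch installation
  stat: path=/usr/share/elasticsearch/lib/elasticsearch-{{ elasticsearch_version }}.jar
  register: es_jar
  tags: elastic

- name: uninstalling previous version if applicable
  become: true
  yum: name=elasticsearch state=absent
  # only remove the package when the jar for the wanted version is missing
  when: not es_jar.stat.exists
  tags: elastic
```

A side effect is that the check no longer reports 'changed' on every run, since stat only reads state.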

The Marvel plugin gets similar treatment:

- name: check marvel plugin installed
 become: true
 stat: path={{ elasticsearch_home_dir }}/plugins/marvel
 register: marvel_installed
 tags: elastic

- name: install marvel plugin
 become: true
 command: "{{ elasticsearch_home_dir }}/bin/plugin -i elasticsearch/marvel/latest"
 notify:
 - restart elasticsearch
 when: not marvel_installed.stat.exists
 tags: elastic

The Marvel plugin stanza above also makes use of the stat module, which is a really great module. It returns all kinds of goodness you would normally expect from a stat() system call, and yet you are doing it in your Ansible playbook.

There are a couple more things I will cover and then leave the rest for when I talk about Kibana and Logstash in a follow-up post. First up, then, are templates. Ansible uses Jinja2 templating to transform a file and install it on your host. You can create a file with appropriate templating as below. The variables in {{ .. }} come from the role’s vars directory, which contains the YAML file described earlier.

Note: I stripped all comment lines for the sake of brevity:

[ansible@ansible-host01 templates]$ pwd
/home/ansible/elk/roles/elastic/templates
[ansible@ansible-host01 templates]$ grep -v ^# elasticsearch.yml.j2
{% if elasticsearch_cluster_name is defined %}
cluster.name: {{ elasticsearch_cluster_name }}
{% endif %}

...

{% if elasticsearch_node_name is defined %}
node.name: {{ elasticsearch_node_name }}
{% endif %}

...

{% if elasticsearch_node_master is defined %}
node.master: {{ elasticsearch_node_master }}
{% endif %}
{% if elasticsearch_node_data is defined %}
node.data: {{ elasticsearch_node_data }}
{% endif %}

...

{% if elasticsearch_memory_bootstrap_mlockall is defined %}
bootstrap.mlockall: {{ elasticsearch_memory_bootstrap_mlockall }}
{% endif %}
....

The template file, when run in the play, is then transformed using the provided variables and copied into place on my ELK host target VM…

- name: copy elasticsearch defaults file
 become: true
 template: src=elasticsearch.default.j2 dest=/etc/sysconfig/elasticsearch owner={{ elasticsearch_user }} group={{ elasticsearch_group }} mode=0644
 notify:
 - restart elasticsearch
 tags: elastic

So let’s see how our playbook runs and what the output looks like:

[ansible@ansible-host01 elk]$ ansible-playbook -i ./production site.yml \
--tags "config,elastic" --ask-sudo-pass
SUDO password:

PLAY [elastic-hosts] **********************************************************

GATHERING FACTS ***************************************************************
ok: [10.68.64.117]

TASK: [config | install java 8 runtime] ***************************************
ok: [10.68.64.117]

TASK: [config | set swappiness in sysctl/systemd] *****************************
ok: [10.68.64.117]

TASK: [config | set maximum map count in sysctl/systemd] **********************
ok: [10.68.64.117]

TASK: [config | set hard limits for open files] *******************************
ok: [10.68.64.117]

TASK: [config | set soft limits for open files] *******************************
ok: [10.68.64.117]

TASK: [config | set max locked memory] ****************************************
ok: [10.68.64.117]

TASK: [config | Install wget package (Fedora based)] **************************
ok: [10.68.64.117]

TASK: [elastic | install elasticsearch signing key] ***************************
changed: [10.68.64.117]

TASK: [elastic | copy elasticsearch repo] *************************************
ok: [10.68.64.117]

TASK: [elastic | check for previous elasticsearch installation] ***************
changed: [10.68.64.117]

TASK: [elastic | uninstalling previous version if applicable] *****************
skipping: [10.68.64.117]

TASK: [elastic | install elasticsearch pkgs] **********************************
skipping: [10.68.64.117]

TASK: [elastic | copy elasticsearch configuration file] ***********************
ok: [10.68.64.117]

TASK: [elastic | copy elasticsearch defaults file] ****************************
ok: [10.68.64.117]

TASK: [elastic | set max memory limit in systemd file (RHEL/CentOS 7+)] *******
changed: [10.68.64.117]

TASK: [elastic | set log directory permissions] *******************************
ok: [10.68.64.117]

TASK: [elastic | set data directory permissions] ******************************
ok: [10.68.64.117]

TASK: [elastic | ensure elasticsearch running and enabled] ********************
ok: [10.68.64.117]

TASK: [elastic | check marvel plugin installed] *******************************
ok: [10.68.64.117]

TASK: [elastic | install marvel plugin] ***************************************
skipping: [10.68.64.117]

NOTIFIED: [elastic | restart elasticsearch] ***********************************
changed: [10.68.64.117]

PLAY RECAP ********************************************************************
10.68.64.117 : ok=19 changed=4 unreachable=0 failed=0

[ansible@ansible-host01 elk]$

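I can verify that my ES cluster is responding by querying the Cluster API. Note that the red status is not simply down to having no other cluster nodes on which to replicate the index shards, as missing replicas alone would normally give yellow. With zero active and zero unassigned shards, red more likely means the cluster state has not yet been recovered, which is what the gateway settings covered in the Kibana post above address: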

# curl -XGET http://localhost:9200/_cluster/health?pretty
{
 "cluster_name" : "nx-elastic",
 "status" : "red",
 "timed_out" : false,
 "number_of_nodes" : 1,
 "number_of_data_nodes" : 1,
 "active_primary_shards" : 0,
 "active_shards" : 0,
 "relocating_shards" : 0,
 "initializing_shards" : 0,
 "unassigned_shards" : 0,
 "delayed_unassigned_shards" : 0,
 "number_of_pending_tasks" : 0,
 "number_of_in_flight_fetch" : 0
}

You can use further API queries to verify that the desired configuration is in place. At that point you have a solid, repeatable deployment with a known outcome, i.e. you are doing DevOps.
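
As one hedged example of such a check, the health output can be compared with the role variables from inside a play. This is a sketch, not part of the playbook above. It assumes the uri and assert modules and the localhost:9200 endpoint used in the curl call:

```yaml
# Sketch only: fail the play if the running cluster does not match the vars.
- name: fetch cluster health
  uri:
    url: http://localhost:9200/_cluster/health
    return_content: yes
  register: es_health
  tags: elastic

- name: verify the cluster matches the role variables
  assert:
    that:
      - es_health.json.cluster_name == elasticsearch_cluster_name
      - es_health.json.number_of_nodes == 1
  tags: elastic
```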