Parsing mongostat data with Logstash

On my way to a complete MongoDB monitoring solution, I’ve been playing with mongostat to see what I can achieve with it. I tested mongostat on a simple architecture made of two shards, each shard being a replica set composed of three members. Of course, we also have three configuration servers and one query router.

First, I discovered a few bugs in the tool when using it with the --discover option. This parameter can be used to automatically retrieve statistics from all members of a replica set or a sharded cluster. Using it with version 2.4.9 of mongostat causes some other parameters to be ignored: --rowcount and --noheaders. So I dove into the code on GitHub and found that these bugs had already been fixed; we just have to wait for the fixes to make their way into a stable release.

mongostat --host localhost:24000 --discover --noheaders -n 2 30 > mongostat.log

Here I’m connecting to my query router and asking mongostat to find the other MongoDB instances by itself. The --noheaders and -n options don’t work at the moment, but that’s not a problem. With this setup, I will receive stats every 30s.

There are two types of log line: the ones coming from the mongos and the ones coming from the other members.

localhost:21000        *0     *0     *0     *0       0     1|0       0   800m  1.04g    30m      0 local:0.0%          0       0|0     0|0   198b   924b    13 rs0  PRI   15:36:55
 localhost:21001        *0     *0     *0     *0       0     1|0       0   800m  1.01g    29m      0  test:0.0%          0       0|0     0|0   138b   359b     6 rs0  SEC   15:36:55
 localhost:21002        *0     *0     *0     *0       0     1|0       0   800m  1.01g    29m      0  test:0.0%          0       0|0     0|0   138b   359b     6 rs0  SEC   15:36:55
 localhost:21100        *0     *0     *0     *0       0     1|0       0   800m  1.05g    35m      0 local:0.0%          0       0|0     0|0   198b   924b    13 rs1  PRI   15:36:55
 localhost:21101        *0     *0     *0     *0       0     1|0       0   800m  1.01g    34m      0  test:0.0%          0       0|0     0|0   138b   359b     6 rs1  SEC   15:36:55
 localhost:21102        *0     *0     *0     *0       0     1|0       0   800m  1.01g    34m      0  test:0.0%          0       0|0     0|0   138b   359b     6 rs1  SEC   15:36:55
 localhost:24000         0      0      0      0       0       0                  174m     5m      0                                             2b    23b     2      RTR   15:36:55

Output from mongostat.


The last line, coming from the mongos, has empty fields, so we will need to deal with that when parsing the log. Now that we understand how mongostat works, it is time to see if we can easily plug it into Logstash. Let’s take a look at our Logstash configuration file.

input {
  file {
    type => "mongostat"
    path => ["/path/to/mongostat.log"]
  }
}

We define where to find the log file.
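
By default, the file input only tails new lines appended to the file. If you want Logstash to also pick up data already present in mongostat.log, you may need the start_position option (a small variation of the input above; start_position is available in recent Logstash versions):

input {
  file {
    type => "mongostat"
    path => ["/path/to/mongostat.log"]
    # read the whole file on first start instead of only tailing new lines
    start_position => "beginning"
  }
}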


filter {
  if [type] == "mongostat" {
    grok {
      patterns_dir => "./patterns"
      match => ["message", "%{HOSTNAME:host}:%{INT:port}%{SPACE}%{METRIC:insert}%{SPACE}%{METRIC:query}%{SPACE}%{METRIC:update}%{SPACE}%{METRIC:delete}%{SPACE}%{METRIC:getmore}%{SPACE}%{COMMAND:command}%{MONGOTYPE1}%{SIZE:vsize}%{SPACE}%{SIZE:res}%{SPACE}%{NUMBER:fault}%{MONGOTYPE2}%{SIZE:netIn}%{SPACE}%{SIZE:netOut}%{SPACE}%{NUMBER:connections}%{SPACE}%{USERNAME:replicaset}%{SPACE}%{WORD:replicaMember}%{SPACE}%{TIME:time}"]
    }
  }
  if "_grokparsefailure" in [tags] {
    drop { }
  }
  if [message] == "" {
    drop { }
  }
}

We apply the filter to each event, provided it comes from mongostat. If the message is empty or grok fails to parse it, we drop the event.


output {
  stdout { }
  elasticsearch_http {
    host => "127.0.0.1"
  }
}

Simple output: logs are stored in Elasticsearch so that we can examine them later with Kibana, and they are also written to stdout for immediate debugging and verification.
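
To quickly check that events actually reach Elasticsearch, you can query the logstash-* indices directly (that index naming is the Logstash default; adjust it if you changed it):

curl 'http://127.0.0.1:9200/logstash-*/_search?q=type:mongostat&size=1&pretty=true'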

Let’s look at the filter part in a little more detail. We give grok a directory for our custom patterns: ./patterns. This directory contains a file named mongostat with the following patterns:

METRIC (\*%{NUMBER})|(%{NUMBER})
COMMAND (%{NUMBER}\|%{NUMBER})|(%{NUMBER})
SIZE (%{NUMBER}[a-z])|(%{NUMBER})
LOCKEDDB (%{WORD}\:%{NUMBER}%)
MONGOTYPE2 (%{SPACE}%{LOCKEDDB:lockedDb}%{SPACE}%{NUMBER:indexMissedPercent}%{SPACE}%{COMMAND:QrQw}%{SPACE}%{COMMAND:ArAw}%{SPACE})|%{SPACE}
MONGOTYPE1 (%{SPACE}%{NUMBER:flushes}%{SPACE}%{SIZE:mapped}%{SPACE})|%{SPACE}

The MONGOTYPE patterns are used to deal with the empty fields of the mongos log line.
The rest of the match directive is just about capturing each field from mongostat to produce a more readable and analysable output.
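
As an illustration, feeding the first sample line (localhost:21000) through this match should produce roughly the following fields (values simply read off the log line above):

host => localhost, port => 21000
insert, query, update, delete => *0, getmore => 0, command => 1|0
flushes => 0, mapped => 800m, vsize => 1.04g, res => 30m, fault => 0
lockedDb => local:0.0%, indexMissedPercent => 0, QrQw => 0|0, ArAw => 0|0
netIn => 198b, netOut => 924b, connections => 13
replicaset => rs0, replicaMember => PRI, time => 15:36:55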

To build this, I used an online Grok debugger, which is very useful because you don’t need to reload Logstash every time you want to test your work, and it gives you instant feedback.

I’m now waiting for the bugs to be fixed in a stable release so that this solution can become more useful for monitoring MongoDB, and maybe be used in production.

List of available patterns on Github.

Easily deploy a local MongoDB cluster

When working with MongoDB, there surely comes a time when you want to try to make it scale: first by adding a replica set, then sharding, and finally mixing the two to create a simple cluster with replication on each shard. Then you might want to test monitoring or other features, so you will need to launch your cluster and configure it.

If everything goes right, you’ll only need to start all your MongoDB instances. Not a big deal: it’s easy to do at each restart of your computer, you just copy and paste the right commands. But what if I need a fresh, clean cluster for my tests? To answer this question, I wrote a simple script to easily deploy a local mongo cluster composed of one query router, three configuration servers and two shards, each shard being a replica set composed of one primary mongod and two secondary mongods.

This simple project is on my GitHub under the name simple-mongo-cluster. The readme describes on which port you’ll find which mongo instance. The script provides 5 operations: {init|start|configure|stop|clean}. I will quickly describe each of them.

The init operation creates all the directories needed to store logs and data. All mongo instances are started with the start parameter. To configure the cluster, use configure. You might need to call configure twice, because the shards are sometimes not added on the first call. The stop operation simply shuts down the whole cluster. Finally, clean deletes all directories containing the cluster configuration files, data and logs, so that you can start a fresh new cluster.
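
For the curious, the configure step essentially amounts to registering the two replica sets as shards on the query router. Here is a minimal sketch of the kind of commands involved, using the ports visible in the mongostat output above (the script in the repository remains the reference, this only illustrates the idea):

# run against the query router listening on port 24000
mongo --port 24000 <<'EOF'
sh.addShard("rs0/localhost:21000,localhost:21001,localhost:21002")
sh.addShard("rs1/localhost:21100,localhost:21101,localhost:21102")
sh.status()
EOF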

I hope it will help other developers who want to easily test and discover MongoDB cluster features in a local environment, as it has helped me. If you have any suggestions, just let me know, or fork the project, add your changes and send me a pull request.

Discovering log management with ELK

As part of my internship, I am currently looking into monitoring solutions, and I had the opportunity to try the Elasticsearch, Logstash and Kibana trio, known by the abbreviation ELK. Logstash makes it easy to aggregate logs coming from different sources, Elasticsearch stores them and makes them searchable, and Kibana displays them on a highly customizable dashboard. The following instructions gave me a quick overview of how the ELK stack works locally, in a very simple system log management scenario.

Downloading the software

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.0.1.tar.gz
wget https://download.elasticsearch.org/logstash/logstash/logstash-1.3.3-flatjar.jar
wget https://download.elasticsearch.org/kibana/kibana/kibana-3.0.0milestone5.tar.gz

Extraction

tar xvf elasticsearch-1.0.1.tar.gz
tar xvf kibana-3.0.0milestone5.tar.gz

Elasticsearch

cd elasticsearch-1.0.1/

The elasticsearch.yml configuration file is located in config/. There is no need to touch it for a local test, but you could modify the cluster.name and node.name parameters to customize the installation.
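
For example, to give the cluster and node explicit names (the values below are arbitrary):

# config/elasticsearch.yml
cluster.name: elk-test
node.name: "node-local"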

Start Elasticsearch:

./bin/elasticsearch

Logstash

Create a logstash.conf configuration file:

touch logstash.conf

We are going to read the system log files, so it may be necessary to run Logstash as root so that it can read them. This approach should only be used during the test phase.
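
On a Debian-like system, an alternative to running Logstash as root is to add your user to the adm group, which can read most files under /var/log (this is an assumption about your distribution's log permissions, so check them first):

sudo usermod -aG adm "$USER"
# log out and back in for the new group membership to take effect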

Contents of the file:

input {
    file {
        type => "linux-syslog"
        path => [ "/var/log/*.log", "/var/log/messages", "/var/log/syslog" ]
    }
}
output {
    stdout { }
    elasticsearch_http {
        host => "127.0.0.1"
    }
}

Documentation for the elasticsearch_http output.

Start Logstash:

sudo java -jar logstash-1.3.3-flatjar.jar agent -f logstash.conf

New log entries should now be picked up by Logstash and stored by Elasticsearch. We can now visualize them with Kibana.
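
Before opening Kibana, a quick way to confirm that Logstash indices are being created (the _cat API is available starting with Elasticsearch 1.0):

curl 'http://127.0.0.1:9200/_cat/indices?v'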

Kibana

cd kibana-3.0.0milestone5/

Edit the config.js file and change the line:

elasticsearch: "http://"+window.location.hostname+":9200",

to

elasticsearch: "http://127.0.0.1:9200",

This change lets us open the index.html file directly in our browser to access Kibana, without having to set up a server such as Apache to serve the files.

Result

Kibana dashboard screenshot.

Adding an authentication mechanism in front of Kibana can be done simply by using the fangli/kibana-authentication-proxy project.

Installing wallabag

After hearing about wallabag several times, I decided to install it to try it out, use it, and maybe lighten my Firefox bookmarks. Here are the steps and commands that allowed me to complete the installation. I hope I did not forget any of them while writing this article.

Get the latest version of wallabag:

wget http://wllbg.org/latest

Unpack it:

unzip latest

Get composer and run it in the wallabag directory:

curl -s http://getcomposer.org/installer | php
php composer.phar install

Create the database:

mysql -p -u root

mysql> CREATE DATABASE wallabag;

mysql> GRANT ALL PRIVILEGES ON `wallabag`.* TO 'wallabag'@'localhost' IDENTIFIED BY 'VotreMotdePasse';

mysql> exit

After creating our wallabag database, load the MySQL setup script:

mysql -p -u root wallabag < install/mysql.sql

Rename the configuration file:

mv config.inc.php.new config.inc.php

Edit the configuration:

nano inc/poche/config.inc.php

In particular, we will modify the following lines:

define ('SALT', 'A strong string of characters of your own choosing');
define ('STORAGE', 'mysql');

define ('STORAGE_SERVER', 'localhost');
define ('STORAGE_DB', 'wallabag');
define ('STORAGE_USER', 'wallabag');
define ('STORAGE_PASSWORD', 'VotreMotdePasse');

Set the required permissions on the directories:

chmod 777 -R assets/ cache/ db/

Finish the installation:

rm -rf install/

And if needed:

chown -R www-data:www-data wallabag/

And finally, the Apache configuration:

<VirtualHost *:80>
  ServerAdmin webmaster@localhost
  ServerName wallabag.domain.org
  DocumentRoot /var/www/wallabag
  <Directory />
    Options FollowSymLinks
    AllowOverride None
  </Directory>
  <Directory /var/www/wallabag>
    Options Indexes FollowSymLinks MultiViews
    AllowOverride All
    Order allow,deny
    Allow from all
  </Directory>
</VirtualHost>
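
If this virtual host is saved as its own file, for example /etc/apache2/sites-available/wallabag (a hypothetical name, adapt it to your setup), it still needs to be enabled and Apache reloaded:

# enable the site and reload Apache (Debian/Ubuntu layout)
sudo a2ensite wallabag
sudo service apache2 reload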

If everything went well, you just have to go to wallabag.domain.org to land on the user account creation page.