Friday, April 8, 2016

Setting up Consul Service Discovery for Mesos in 10 Minutes

This will be a short series on using Consul in your microservices environment. Consul provides service discovery and many other nice features for microservices, which you can read more about in the links below. Once you do, you will understand why it is such a popular choice for anyone running microservices, or anything else that needs service discovery for that matter. I have chosen Consul for my PaaS offering backed by Apache Mesos, with integration for a tool called consul-template and for DNS for containers. I'll kick off a small series about different ways to utilize Consul in your microservices architecture and how I have been using it for service discovery and several other things with Docker. I won't try to explain in depth how it works, because it is best to read as much as possible on your own, so for more information please see the Consul documentation:

More info on Consul: https://www.consul.io/

Documentation: https://www.consul.io/docs/index.html
Free Online Demo!! : http://demo.consul.io/ui/
MUST UNDERSTAND: https://www.consul.io/docs/guides/outage.html

We will start off by installing a cluster of 3 server nodes and 1 client with the UI and then end with creating systemd units for the entire cluster.
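Throughout this post the nodes are referred to as server1, server2, server3, and client. Those names are assumed to resolve on every node, for example via DNS or /etc/hosts entries like the sketch below (the addresses are placeholders, use your own):

    # /etc/hosts (example addresses only)
    10.0.0.11   server1
    10.0.0.12   server2
    10.0.0.13   server3
    10.0.0.14   client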


1) Pull down the HashiCorp Consul zip file on ALL nodes and unzip it. The same package is used for server and client.

    cd /usr/local/bin/ && wget https://releases.hashicorp.com/consul/0.6.4/consul_0.6.4_linux_amd64.zip
    unzip consul*
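Since the binary was unzipped straight into /usr/local/bin, a quick sanity check that it is on your PATH:

    # consul version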



2) Pull down the UI package on the node that will serve the Web UI for the cluster. It can be any node, but I chose the client. Unzip it in the desired directory.

    mkdir -p /opt/consul && wget -O /opt/consul/web-ui.zip https://releases.hashicorp.com/consul/0.6.4/consul_0.6.4_web_ui.zip && cd /opt/consul/ && unzip web-ui.zip
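Quick check that the UI files landed where ui_dir will point later on; with the 0.6.x UI zip you should see an index.html plus its static assets:

    # ls /opt/consul/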



3) Focusing on the server config first, create the initial files/directories on all servers. One of them will act as the bootstrap server initially until we get the cluster in quorum. 

    /etc/consul.d/bootstrap/config.json  ### This only gets created on 1 of the servers
    {
        "bootstrap": true,
        "server": true,
        "datacenter": "your-dc",
        "data_dir": "/var/lib/consul",
        "log_level": "INFO",
        "advertise_addr": "$BSTRAP_LOCAL_IP",
        "enable_syslog": true
    }

    
    /etc/consul.d/server/config.json
    {
        "bootstrap": false,
        "advertise_addr": "$LOCAL_IP",
        "server": true,
        "datacenter": "your-dc",
        "data_dir": "/var/lib/consul",
        "log_level": "INFO",
        "enable_syslog": true,
        "start_join": ["server1", "server2","server3"]
    }

    mkdir -pv /var/lib/consul   ### Used as our data directory
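A note on $BSTRAP_LOCAL_IP and $LOCAL_IP: these are placeholders, not variables Consul expands, so put the node's real address in the file (or substitute it yourself). A minimal sketch of one way to do that, assuming the first address reported by hostname -I is the one you want to advertise:

    LOCAL_IP=$(hostname -I | awk '{print $1}')   ## assumes the first address is the one to advertise
    sed -i "s/\$LOCAL_IP/${LOCAL_IP}/" /etc/consul.d/server/config.json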


We can also go ahead and create our systemd unit files on each server and enable them on boot.

    /etc/systemd/system/consul-server.service
    [Unit]
    Description=Consul Server
    After=network.target
    
    [Service]
    User=root
    Group=root
    Environment="GOMAXPROCS=2"
    ExecStart=/usr/local/bin/consul agent -config-dir /etc/consul.d/server
    ExecReload=/bin/kill -HUP $MAINPID
    KillSignal=SIGINT
    Restart=on-failure
    
    
    [Install]
    WantedBy=multi-user.target
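If systemd was already running when you dropped the unit file in place, reload it so the new unit is picked up:

    # systemctl daemon-reload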

      

    # systemctl enable consul-server



4) Run the following commands in order on each of the servers to get quorum. You will need a bootstrap server to start with (server1). You will need lots of terminals here.

On Server1:
    # consul agent -config-dir /etc/consul.d/bootstrap -advertise $BSTRAP_LOCAL_IP

On Server2 (-bootstrap-expect tells Consul how many servers to wait for before bootstrapping the cluster):
    # consul agent -config-dir /etc/consul.d/server -advertise $LOCAL_IP -bootstrap-expect 3

On Server3:
    # consul agent -config-dir /etc/consul.d/server -advertise $LOCAL_IP -bootstrap-expect 3

Back on Server1, press CTRL+C to kill the Consul process and then start it as a regular server.
    CTRL+C 
    # consul agent -config-dir /etc/consul.d/server -advertise $LOCAL_IP -bootstrap-expect 3

The servers should elect a leader and reach quorum. Each time you lose quorum, this is roughly how you will have to recover it; a few other steps may be needed along with it, see the outage documentation above for reference.
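To confirm the servers found each other and elected a leader, run the following from any of them (assuming the default client address of 127.0.0.1):

    # consul members
    # curl -s http://127.0.0.1:8500/v1/status/leader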




5) Let's go ahead and get our client with the Web UI up and running before we do step 6, so we can watch from the UI what Consul looks like during service failures.

    /etc/consul.d/client/config.json
    {
        "server": false,
        "datacenter": "your-dc",
        "advertise_addr": "$LOCAL_IP",
        "client_addr": "$LOCAL_IP",
        "data_dir": "/var/lib/consul",
        "ui_dir": "/opt/consul/",
        "log_level": "INFO",
        "enable_syslog": true,
        "start_join": ["server1", "server2", "server3"]
    }


Create the systemd unit file.
    /etc/systemd/system/consul-client.service
    [Unit]
    Description=Consul Client
    After=network.target

    [Service]
    User=root
    Group=root
    Environment="GOMAXPROCS=2"
    ExecStart=/usr/local/bin/consul agent -config-dir /etc/consul.d/client
    ExecReload=/bin/kill -HUP $MAINPID
    KillSignal=SIGINT
    Restart=on-failure


    [Install]
    WantedBy=multi-user.target

Start the service:
    # systemctl start consul-client && systemctl status consul-client -l

You should see "agent: synced nod info" in the output of status. Go to the UI:
    http://client:8500/ui/
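If you prefer the command line, the same information is available from the HTTP API on the client (port 8500, bound to the address you set in client_addr above):

    # curl -s http://client:8500/v1/catalog/nodes | python -m json.tool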




If it was successful, you should see your nodes in the UI with 3 passing checks on the consul service. Watch the UI during the next step to see how it reacts to health-check failures.


6) In order to get Consul running as a background process instead of in the current window, we will need to kill the current process and reboot each of the servers one at a time, letting them rejoin one at a time so as not to lose quorum. DO NOT CTRL+C the current process; KILL the process! See the outage doc above about graceful leaves. Yes, you will need yet another terminal for this. Run the following, one server at a time:

    # ps -ef  |grep consul | grep -v grep  ## to get pid of current consul process
    # kill -9 $consul_pid

Go to your Consul UI and take a look at the nodes and the consul service. You will see the consul service has 1 failure. Pretty cool?! No worries, it will come back after you restart it.





    # reboot 
    OR
    # systemctl start consul-server && systemctl status consul-server -l

You should see that your consul server has rejoined and you didn't lose quorum because the other 2 stayed online. 

Rinse and repeat step 6 for all servers and you have a working Consul cluster. Next we will discuss how to register services and show some of the things I have been doing to integrate it with Apache Mesos.







Friday, April 1, 2016

Multihost Docker Networking

One of the major issues people have when running Docker is that, out of the box, container networking is local to each host. By default only the local host and its services know about its containers, so a container on Host A cannot talk to a container on Host B. Here is a quick demo of how to use CoreOS's networking project Flannel so that you can have a multi-host Docker environment where all the hosts and their containers can communicate.
More info on Flannel: https://github.com/coreos/flannel
More info on ETCD: https://github.com/coreos/etcd

You will need to have 1 or more etcd server(s). We will be using a single node for this demo.
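The config paths below assume the RHEL/CentOS/Fedora packages; on those distros both pieces are a yum install away (package names may differ on other distributions):

    # yum install -y etcd             ## on the etcd server
    # yum install -y flannel docker   ## on each docker host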

On etcd Server(s)
1. Install etcd
2. Configure etcd - /etc/etcd/etcd.conf
    
    # cat /etc/etcd/etcd.conf | grep -v '^#'
    ETCD_NAME=default
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="http://0.0.0.0:7001,http://localhost:2380"
    ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
    ETCD_ADVERTISE_CLIENT_URLS="http://localhost:2379"

3. Enable and start etcd 
    # systemctl enable etcd && systemctl start etcd

4. Define etcd network:
    # etcdctl mk /blah.com/network/config '{"Network":"172.17.0.0/16"}'

You should be able to get the json for that key.
    # curl -s -L http://ETCD_SERVER:2379/v2/keys/blah.com/network/config | python -m json.tool
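If you have etcdctl handy on the etcd server itself, the same key can be read directly:

    # etcdctl get /blah.com/network/config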



On worker/slave/client(s)… any machine that is going to have flannel running
1.  Install flannel
2. Configure flannel - /etc/sysconfig/flanneld
    # cat /etc/sysconfig/flanneld | grep -v '^#'
    FLANNEL_ETCD="http://ETCD_SERVER:2379"
    FLANNEL_ETCD_KEY="/blah.com/network"

3. If Docker is already installed, stop it and remove the docker0 interface before starting flannel.
    # systemctl stop docker
    # ip link delete docker0
    # systemctl start flanneld && systemctl enable flanneld
    # systemctl start docker
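Once flanneld is up, it leases a subnet for the host and records it in /run/flannel/subnet.env (flannel's default subnet file), which the distro's Docker integration uses when Docker starts. A quick look to confirm the lease and the interfaces (the overlay interface is flannel0 with the default udp backend):

    # cat /run/flannel/subnet.env
    # ip -4 addr show flannel0
    # ip -4 addr show docker0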

Rinse and repeat for all other desired Docker hosts.

You should be able to see the networking configs and subnets being created.

# curl -s -L http://master:2379/v2/keys/blah.com/network/config | python -m json.tool

# curl -s -L http://master:2379/v2/keys/blah.com/network/subnets | python -m json.tool
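The per-host subnet leases can also be listed with etcdctl from the etcd server:

    # etcdctl ls /blah.com/network/subnets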


You should now be able to ping containers between different hosts!


After starting a few containers on different hosts, try it out. Notice below that each host gets its own subnet.

Host A:
# docker inspect f41cd57b4ef5 | grep -i ipaddress
        "IPAddress": "172.17.24.3",
        "SecondaryIPAddresses": null,

Host B:
# docker inspect 1b5b48c6be47 | grep -i ipaddress
        "IPAddress": "172.17.80.16",
        "SecondaryIPAddresses": null,


From Host A container, ping container 1b5b48c6be47 on Host B:
# docker exec -it f41cd57b4ef5 ping 172.17.80.16
PING 172.17.80.16 (172.17.80.16): 56 data bytes
64 bytes from 172.17.80.16: seq=0 ttl=62 time=2.336 ms
64 bytes from 172.17.80.16: seq=1 ttl=62 time=0.438 ms
64 bytes from 172.17.80.16: seq=2 ttl=62 time=0.506 ms
^C

Host A is even able to ping container on Host B:
# ping 172.17.80.16
PING 172.17.80.16 (172.17.80.16) 56(84) bytes of data.
64 bytes from 172.17.80.16: icmp_seq=1 ttl=63 time=0.386 ms
64 bytes from 172.17.80.16: icmp_seq=2 ttl=63 time=0.438 ms
^C


Host B is running an nginx container on port 80. Let's curl port 80 on that container from a container on Host A (this only works if curl is installed in the container):
# docker exec -it f41cd57b4ef5 curl 172.17.80.16:80
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
100   612  100   612    0     0   321k      0 --:--:-- --:--:-- --:--:--  597k

And from Host A:
# curl 172.17.80.16:80
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>