Installation

Intended audience

Skyline is not really a localhost application. It needs lots of data, so running it on localhost is only practical if you have a local Graphite or can pickle Graphite data to your localhost.

Given the specific nature of Skyline, it is assumed that the audience has a certain level of technical knowledge. In particular, the user is expected to be familiar with the installation, configuration, operation and security practices and considerations relating to the following components:

  • Graphite
  • Redis
  • MySQL
  • Apache
  • memcached

This installation document covers only the installs and configurations that are directly related to Skyline. For notes regarding automation and configuration management see the section at the end of this page.

What the components do

  • Graphite - sends metric data to Skyline Horizon via a pickle
  • Redis - stores settings.FULL_DURATION seconds (usefully, 24 hours' worth) of timeseries data
  • MySQL - stores data about anomalies and timeseries features fingerprints, which are used for learning things that are not anomalous
  • Apache - serves the Skyline webapp via gunicorn and handles basic HTTP auth
  • memcached - caches Ionosphere MySQL data

sudo

Use sudo appropriately for your environment wherever necessary.

Steps

Note

All the documentation and testing is based on running Skyline in a Python-2.7.12 virtualenv. If you choose to deploy Skyline another way, you are on your own. Although it is possible to run Skyline in a different type of environment, doing so does not lend itself to repeatability or a common known state.

  • Create a Python-2.7.12 virtualenv for Skyline to run in - see Running in Python virtualenv
  • Set up firewall rules to restrict access to the following:
    • settings.WEBAPP_IP - default is 127.0.0.1
    • settings.WEBAPP_PORT - default 1500
    • The IP address and port being used to reverse proxy the Webapp (if implementing) e.g. <YOUR_SERVER_IP_ADDRESS>:8080
    • The IP address and port being used by MySQL (if implementing)
    • The IP address and ports 2024 and 2025
    • The IP address and port being used by Redis
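As a rough illustration of the restrictions listed above, the following sketch prints (rather than applies) iptables rules that allow a single trusted address to reach a port and drop all other traffic to it. The function name, the use of iptables and the placeholder address 192.0.2.10 are assumptions for illustration only. The protocol is passed explicitly, so confirm which protocol each service in your setup actually uses before applying anything.

```shell
#!/bin/sh
# Hypothetical helper: print iptables rules restricting one port to one
# trusted source address. Nothing is applied; review the output first.
print_restrict_rules() {
  trusted_ip="$1"   # address allowed to connect (assumption: a single host)
  proto="$2"        # tcp or udp
  port="$3"         # port to protect, e.g. 1500 for settings.WEBAPP_PORT
  echo "iptables -A INPUT -p ${proto} -s ${trusted_ip} --dport ${port} -j ACCEPT"
  echo "iptables -A INPUT -p ${proto} --dport ${port} -j DROP"
}

# Example: print rules for the Webapp port and port 2024.
# 192.0.2.10 is a placeholder documentation address.
# print_restrict_rules 192.0.2.10 tcp 1500
# print_restrict_rules 192.0.2.10 tcp 2024
```

Printing the rules first keeps the sketch safe to run anywhere; the output can be reviewed and then applied by whatever firewall tooling you already use.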
  • Install Redis - see Redis.io
  • Ensure Redis has its Unix socket enabled with the following path and permissions in your redis.conf
unixsocket /tmp/redis.sock
unixsocketperm 777

Note

The unixsocket on the apt redis-server package is /var/run/redis/redis.sock. If you use this path, ensure you change settings.REDIS_SOCKET_PATH to match it.
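The requirement in the note above can be checked mechanically. This is a minimal sketch, assuming that redis.conf contains an uncommented `unixsocket <path>` line and that settings.py assigns REDIS_SOCKET_PATH as a simple quoted string on one line. The function names and the example file locations are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers: compare the unixsocket path in redis.conf with
# REDIS_SOCKET_PATH in Skyline's settings.py.

# Print the path from the first uncommented "unixsocket <path>" line.
redis_conf_socket() {
  awk '$1 == "unixsocket" { print $2; exit }' "$1"
}

# Print the quoted value assigned to REDIS_SOCKET_PATH in settings.py
# (assumption: a simple single-line assignment of a quoted string).
settings_socket() {
  sed -n "s/^REDIS_SOCKET_PATH[[:space:]]*=[[:space:]]*['\"]\([^'\"]*\)['\"].*/\1/p" "$1" | head -n 1
}

# Succeed only if both files name the same, non-empty socket path.
socket_paths_match() {
  conf_path="$(redis_conf_socket "$1")"
  settings_path="$(settings_socket "$2")"
  [ -n "$conf_path" ] && [ "$conf_path" = "$settings_path" ]
}

# Example (file locations are assumptions, adjust to your install):
# socket_paths_match /etc/redis/redis.conf \
#   /opt/skyline/github/skyline/skyline/settings.py \
#   && echo "socket paths match" || echo "socket paths differ"
```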

  • Start Redis
  • Install and start memcached - see memcached.org
  • Make the required directories
mkdir /var/log/skyline
mkdir /var/run/skyline
mkdir /var/dump

mkdir -p /opt/skyline/panorama/check
mkdir -p /opt/skyline/mirage/check
mkdir -p /opt/skyline/crucible/check
mkdir -p /opt/skyline/crucible/data
mkdir /etc/skyline
mkdir /tmp/skyline
mkdir -p /opt/skyline/github
cd /opt/skyline/github
git clone https://github.com/earthgecko/skyline.git
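If the directory step needs to be re-run, or rehearsed somewhere harmless, the same directories can be created in a loop with mkdir -p, which does not fail when a directory already exists. This is a sketch only; the function name and the optional prefix argument are hypothetical additions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: create the Skyline directories under an optional
# prefix. mkdir -p makes the function safe to re-run.
make_skyline_dirs() {
  prefix="${1:-}"   # empty means the real filesystem root
  for dir in \
    /var/log/skyline /var/run/skyline /var/dump \
    /opt/skyline/panorama/check /opt/skyline/mirage/check \
    /opt/skyline/crucible/check /opt/skyline/crucible/data \
    /etc/skyline /tmp/skyline /opt/skyline/github
  do
    mkdir -p "${prefix}${dir}" || return 1
  done
}

# Example: rehearse in a scratch directory before running for real.
# make_skyline_dirs /tmp/skyline-dir-test
```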
  • Once again using the Python-2.7.12 virtualenv, install the requirements using the virtualenv pip. This can take a long time, as the pandas install in particular takes quite a while.

Warning

When working with virtualenv Python versions you must always remember to use the activate and deactivate commands to ensure you are using the correct version of Python. Although running a virtualenv does not affect the system Python, not using activate can lead to errors that MAY affect the system Python and packages. For example, a user who skips activate and runs pip rather than bin/pip2.7 may install packages into the system Python. Get into the habit of always using explicit bin/pip2.7 and bin/python2.7 commands so that it is harder for you to err.

PYTHON_MAJOR_VERSION="2.7"
PYTHON_VIRTUALENV_DIR="/opt/python_virtualenv"
PROJECT="skyline-py2712"

cd "${PYTHON_VIRTUALENV_DIR}/projects/${PROJECT}"
source bin/activate

# Install the mysql-connector-python package first, on its own. It has to be
# downloaded and installed from MySQL, so if it is not already installed an
# install -r will fail as pip cannot find mysql-connector-python
bin/"pip${PYTHON_MAJOR_VERSION}" install http://cdn.mysql.com/Downloads/Connector-Python/mysql-connector-python-1.2.3.zip#md5=6d42998cfec6e85b902d4ffa5a35ce86

# The MySQL download source can now be filtered out of requirements.txt
grep -v "cdn.mysql.com/Downloads\|mysql-connector" /opt/skyline/github/skyline/requirements.txt > /tmp/requirements.txt

# This can take lots and lots of minutes...
bin/"pip${PYTHON_MAJOR_VERSION}" install -r /tmp/requirements.txt

# NOW wait at least 7 minutes (on a Linode 4 vCPU, 4GB RAM, SSD cloud node anyway)
# and once completed, deactivate the virtualenv

deactivate
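The filtering step in the block above can be tried on a copy of the requirements file before running pip. This is a minimal sketch with a hypothetical function name. It uses two -e patterns instead of the `\|` alternation, which is a portability choice made here and not something the Skyline documentation prescribes.

```shell
#!/bin/sh
# Hypothetical helper: copy a requirements file, leaving out the MySQL
# download URL and any mysql-connector line, as the grep -v step does.
filter_requirements() {
  # $1 = source requirements.txt, $2 = filtered output file
  grep -v -e 'cdn.mysql.com/Downloads' -e 'mysql-connector' "$1" > "$2"
}

# Example:
# filter_requirements /opt/skyline/github/skyline/requirements.txt /tmp/requirements.txt
# diff /opt/skyline/github/skyline/requirements.txt /tmp/requirements.txt
```

Running diff on the two files afterwards shows exactly which lines were dropped.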
  • Copy the skyline.conf and edit USE_PYTHON as appropriate to your setup if your Python is not at the path /opt/python_virtualenv/projects/skyline-py2712/bin/python2.7
cp /opt/skyline/github/skyline/etc/skyline.conf /etc/skyline/skyline.conf
vi /etc/skyline/skyline.conf # Set USE_PYTHON as appropriate to your setup
  • OPTIONAL but recommended, serving the Webapp via gunicorn with an Apache reverse proxy.
    • Set up Apache (httpd) and see the example configuration file in your cloned directory, /opt/skyline/github/skyline/etc/skyline.httpd.conf.d.example. Modify all the <YOUR_ variables as appropriate for your environment - see Apache and gunicorn
    • Add a user and password for HTTP authentication, e.g.
htpasswd -c /etc/httpd/conf.d/.skyline_htpasswd admin

Note

Ensure that the user and password for Apache match the user and password that you provide in settings.py for settings.WEBAPP_AUTH_USER and settings.WEBAPP_AUTH_USER_PASSWORD

cd /opt/skyline/github/skyline/skyline
vi settings.py
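The user name half of the note above can be checked with a small script. This is a sketch under stated assumptions: settings.py assigns WEBAPP_AUTH_USER as a simple quoted string on one line, and the htpasswd file uses the usual `user:hash` line format. The function names are hypothetical. The password cannot be compared this way because htpasswd stores a hash.

```shell
#!/bin/sh
# Hypothetical helpers: check that the user named by WEBAPP_AUTH_USER in
# settings.py also has an entry in the Apache htpasswd file.

# Print the quoted value of WEBAPP_AUTH_USER (assumption: a simple
# single-line assignment of a quoted string).
settings_auth_user() {
  sed -n "s/^WEBAPP_AUTH_USER[[:space:]]*=[[:space:]]*['\"]\([^'\"]*\)['\"].*/\1/p" "$1" | head -n 1
}

# Succeed if the htpasswd file has a "user:hash" line for that exact user.
htpasswd_has_user() {
  # $1 = htpasswd file, $2 = user name
  cut -d: -f1 "$1" | grep -qxF -- "$2"
}

# Example:
# user="$(settings_auth_user /opt/skyline/github/skyline/skyline/settings.py)"
# htpasswd_has_user /etc/httpd/conf.d/.skyline_htpasswd "$user" \
#   && echo "user found" || echo "user missing"
```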
  • If you are upgrading, at this point return to the Upgrading page.
  • Before you test Skyline by seeding Redis with some test data, ensure that you have configured the firewall/iptables with the appropriate restricted access.
  • Start the Skyline apps
/opt/skyline/github/skyline/bin/horizon.d start
/opt/skyline/github/skyline/bin/analyzer.d start
/opt/skyline/github/skyline/bin/webapp.d start
# And Panorama if you have set up the DB at this stage
/opt/skyline/github/skyline/bin/panorama.d start
/opt/skyline/github/skyline/bin/ionosphere.d start
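When the apps are started from a script, it can help to stop at the first failure so that a broken app is noticed straight away. The following is a sketch only: the function name is hypothetical, and it assumes each init script exits non-zero when it fails to start, which you should confirm for your version.

```shell
#!/bin/sh
# Hypothetical helper: start the Skyline init scripts in the given order
# and stop at the first one that reports failure.
# Assumption: each <app>.d script exits non-zero if it fails to start.
start_skyline_apps() {
  bin_dir="$1"; shift   # e.g. /opt/skyline/github/skyline/bin
  for app in "$@"; do
    if ! "${bin_dir}/${app}.d" start; then
      echo "failed to start ${app}" >&2
      return 1
    fi
  done
}

# Example:
# start_skyline_apps /opt/skyline/github/skyline/bin horizon analyzer webapp
```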
  • Check the log files to ensure things started OK and are running and there are no errors.

Note

When checking a log, make sure you check it for the appropriate time. Skyline can log a lot very quickly, so a short tail may miss an event you expect between the restart and the tail.

# Check what the logs reported when the apps started
head -n 20 /var/log/skyline/*.log

# How are they running
tail -n 20 /var/log/skyline/*.log

# Any errors - each app
find /var/log/skyline -type f -name "*.log" | while IFS= read -r skyline_logfile
do
  echo "#####
# Checking for errors in $skyline_logfile"
  grep -B2 -A10 -i "error ::\|traceback" "$skyline_logfile" | tail -n 60
  echo ""
  echo ""
done
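Because a short tail can miss events, it may be easier to filter a log by time instead. The following is a minimal sketch that prints only the lines logged at or after a given timestamp. It assumes each log line begins with a `YYYY-MM-DD HH:MM:SS` timestamp, which should be checked against your own log files. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print only the log lines at or after a given time.
# Assumption: each line starts with "YYYY-MM-DD HH:MM:SS", so plain string
# comparison orders timestamps correctly. Lines without a leading
# timestamp (e.g. traceback continuation lines) follow the state of the
# last timestamped line.
log_since() {
  # $1 = timestamp such as "2017-01-01 12:00:00", $2 = log file
  awk -v since="$1" '
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9:]+/ {
      show = (($1 " " $2) >= since)
    }
    show
  ' "$2"
}

# Example: show what horizon logged since a restart at 12:00:00.
# log_since "2017-01-01 12:00:00" /var/log/skyline/horizon.log
```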
  • Seed Redis with some test data.

Note

If you are UPGRADING and you are using an already populated Redis store, you can skip seeding data.

cd "${PYTHON_VIRTUALENV_DIR}/projects/${PROJECT}"
source bin/activate
bin/python2.7 /opt/skyline/github/skyline/utils/seed_data.py
deactivate
  • Check the Skyline Webapp frontend on the Skyline machine’s IP address and the appropriate port, depending on whether you are serving it proxied or direct, e.g. http://YOUR_SKYLINE_IP:8080 or http://YOUR_SKYLINE_IP:1500. The horizon.test.udp metric anomaly should be in the dashboard after seed_data.py has completed. If Panorama is set up you will be able to see that in the /panorama view and in the rebrow view as well.
  • Check the log files again to ensure things are running and there are no errors.
  • Seeding the test data confirms that the Horizon service is properly set up and can receive data. For real data, you have some options for getting a data pickle from Graphite - see Getting data into Skyline
  • Once you have configured your settings.ALERTS, test them - see Alert testing
  • If you have opted not to set up Panorama, you can do so later - see setup Panorama
  • For Mirage setup see Mirage
  • For Boundary setup see Boundary
  • For Ionosphere setup see Ionosphere

Automation and configuration management notes

The installation of packages in the requirements.txt can take a long time, specifically the pandas build. This will usually take longer than the default timeouts in most configuration management.

That said, requirements.txt can be run in an idempotent manner, however a few things need to be highlighted:

  1. A first-time execution of bin/"pip${PYTHON_MAJOR_VERSION}" install -r /opt/skyline/github/skyline/requirements.txt will time out under configuration management, so consider running it manually first. Once pip has installed all the packages, the requirements.txt run is idempotent and can be used to upgrade via a configuration management run whenever requirements.txt is updated with new package versions (with the possible exception of pandas). It is of course possible to provision each requirement individually in configuration management rather than using pip install -r with requirements.txt. If you do, remember that the virtualenv pip must be used and that pandas needs a LONG timeout value, which not all package classes provide. If you use an exec of any sort, ensure the pandas install has a long timeout.
  2. The mysql-connector-python package is pulled directly from MySQL as no pip version exists. Therefore during the build process it is recommended to pip install the MySQL source package first and then comment that line out in requirements.txt. The mysql-connector-python==1.2.3 line then ensures the dependency is fulfilled.
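Where an exec-style resource is used, the long timeout can also be made explicit in the command itself. This is a sketch, assuming the coreutils timeout command is available, which exits with status 124 when the limit is reached. The function name and the 3600 second figure are illustrative assumptions, not recommended values.

```shell
#!/bin/sh
# Hypothetical helper: run a command under an explicit time limit using
# the coreutils timeout command, and report clearly when the limit is hit.
run_with_timeout() {
  limit_seconds="$1"; shift
  timeout "$limit_seconds" "$@"
  status=$?
  if [ "$status" -eq 124 ]; then
    echo "timed out after ${limit_seconds}s: $*" >&2
  fi
  return "$status"
}

# Example (3600 seconds is an arbitrary illustrative limit):
# run_with_timeout 3600 bin/pip2.7 install -r /tmp/requirements.txt
```

The limit set here only helps if the configuration management tool's own timeout for the resource is at least as long.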