skyline package

Subpackages

Submodules

skyline.algorithm_exceptions module

exception TooShort[source]

Bases: Exception

exception Stale[source]

Bases: Exception

exception Boring[source]

Bases: Exception

exception EmptyTimeseries[source]

Bases: Exception

An algorithm_exceptions class to handle metrics with empty time series
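
These exceptions serve as control flow for skipping unusable time series during analysis. A minimal sketch of how a caller might raise them, assuming the [timestamp, value] time series format and the MIN_TOLERABLE_LENGTH and STALE_PERIOD settings documented later on this page (the check_timeseries helper here is hypothetical, not part of this module):

    import time

    from algorithm_exceptions import TooShort, Stale, EmptyTimeseries

    MIN_TOLERABLE_LENGTH = 100
    STALE_PERIOD = 500

    def check_timeseries(timeseries):
        """Raise the appropriate algorithm exception for an unusable series."""
        if not timeseries:
            raise EmptyTimeseries()
        if len(timeseries) < MIN_TOLERABLE_LENGTH:
            raise TooShort()
        # Stale means no datapoint has been added for STALE_PERIOD seconds
        if time.time() - timeseries[-1][0] > STALE_PERIOD:
            raise Stale()

Boring is raised analogously when a series keeps repeating the same value; see MAX_TOLERABLE_BOREDOM and BOREDOM_SET_SIZE below.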

skyline.algorithm_scores_plot module

algorithm_scores_plot.py

get_algorithm_scores_plot(current_skyline_app, output_file, timeseries, algorithm, anomalous, anomalies, scores, anomaly_window=1, anomalies_in_window=None, unreliable=False, low_entropy_value=None)[source]

Creates a png graph image using the vortex results data.

Parameters:
  • current_skyline_app (str) – the Skyline app name calling the function

  • output_file (str) – full path and filename to output where the png image is to be saved to

  • timeseries (list) – the time series

  • algorithm (str) – the algorithm

  • anomalous (boolean) – whether the time series is anomalous

  • anomalies (dict) – the anomalies dict

  • scores (list) – the scores list

  • anomaly_window (int) – the anomaly window

  • anomalies_in_window (int) – the number of anomalies in the anomaly_window

  • unreliable (boolean) – whether the result is deemed unreliable

  • low_entropy_value (float) – the spectral_entropy low_entropy_value if there is one

Returns:

file

Return type:

boolean|str
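
A hedged usage sketch of the above; the time series is a list of [timestamp, value] pairs, the scores list aligns with the time series, and all values, paths and the shape of the anomalies dict shown here are illustrative only:

    from algorithm_scores_plot import get_algorithm_scores_plot

    timeseries = [[1672531200, 1.0], [1672531260, 1.1], [1672531320, 9.9]]
    scores = [0.0, 0.0, 1.0]
    # Illustrative anomalies dict keyed by timestamp
    anomalies = {1672531320: {'value': 9.9}}

    # Returns the output file path on success, or False on failure
    result = get_algorithm_scores_plot(
        'webapp', '/tmp/skyline/example.scores.png', timeseries,
        'spectral_residual', True, anomalies, scores, anomaly_window=1)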

skyline.create_matplotlib_graph module

create_matplotlib_graph

create_matplotlib_graph(current_skyline_app, output_file, graph_title, timeseries, anomalies=[], monotonic_timeseries=[])[source]

Creates a png graph image using the features profile time series data provided, or from the features profile time series data in the DB if an empty list is provided.

Parameters:
  • current_skyline_app (str) – the Skyline app name calling the function

  • output_file (str) – full path and filename to output where the png image is to be saved to

  • graph_title (str) – the graph image title

  • timeseries (list) – the time series

  • anomalies (list) – the anomaly timestamps [optional to plot anomalies]

Returns:

(status, file)

Return type:

(boolean, str)
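
A usage sketch under the signature above; the paths and values are illustrative:

    from create_matplotlib_graph import create_matplotlib_graph

    timeseries = [[1672531200, 1.0], [1672531260, 1.2], [1672531320, 12.3]]
    # Optional anomaly timestamps to plot on the graph
    anomalies = [1672531320]

    status, output_png = create_matplotlib_graph(
        'analyzer', '/tmp/skyline/example_graph.png', 'example.metric',
        timeseries, anomalies)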

skyline.custom_algorithms_to_run module

get_custom_algorithms_to_run(current_skyline_app, base_name, custom_algorithms, debug)[source]

Return a dictionary of custom algorithms to run on a metric, determined from the settings.CUSTOM_ALGORITHMS dictionary.
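
A usage sketch, assuming a metric name that matches a namespace declared in settings.CUSTOM_ALGORITHMS (see the CUSTOM_ALGORITHMS setting below):

    import settings
    from custom_algorithms_to_run import get_custom_algorithms_to_run

    # Returns only the custom algorithms whose namespaces match the base_name
    custom_algorithms = get_custom_algorithms_to_run(
        'analyzer', 'telegraf.cpu-total.cpu.usage_system',
        settings.CUSTOM_ALGORITHMS, False)
    for algorithm, config in custom_algorithms.items():
        print(algorithm, config['algorithm_source'])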

skyline.database module

database.py

get_engine(current_skyline_app)[source]

Added 20161209 - Branch #922: ionosphere, Task #1658: Patterning Skyline Ionosphere. Use SQLAlchemy; mysql.connector is still around, but the move to SQLAlchemy starts now that all the webapp Ionosphere SQLAlchemy patterns work.

Initialize a sqlalchemy engine.

Parameters:

current_skyline_app (str) – the app calling the function

engine_disposal(current_skyline_app, engine)[source]

Dispose of the sqlalchemy engine.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

ionosphere_table_meta(current_skyline_app, engine)[source]

Autoload the ionosphere table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple
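
A sketch of the typical engine lifecycle across these functions, assuming get_engine follows the same (object, fail_msg, trace) tuple convention as the table_meta functions:

    from database import get_engine, engine_disposal, ionosphere_table_meta

    engine, fail_msg, trace = get_engine('webapp')
    try:
        # Autoload the table and query it with SQLAlchemy
        ionosphere_table, fail_msg, trace = ionosphere_table_meta('webapp', engine)
        connection = engine.connect()
        stmt = ionosphere_table.select().where(ionosphere_table.c.id == 1)
        result = connection.execute(stmt)
        connection.close()
    finally:
        # Always dispose of the engine when done
        engine_disposal('webapp', engine)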

metrics_table_meta(current_skyline_app, engine)[source]

Autoload the metrics table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

anomalies_table_meta(current_skyline_app, engine)[source]

Autoload the anomalies table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

ionosphere_matched_table_meta(current_skyline_app, engine)[source]

Autoload the ionosphere_matched table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

ionosphere_layers_table_meta(current_skyline_app, engine)[source]

Autoload the ionosphere_layers table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

layers_algorithms_table_meta(current_skyline_app, engine)[source]

Autoload the layers_algorithms table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

ionosphere_layers_matched_table_meta(current_skyline_app, engine)[source]

Autoload the ionosphere_layers_matched table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

luminosity_table_meta(current_skyline_app, engine)[source]

Autoload the luminosity table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

snab_table_meta(current_skyline_app, engine)[source]

Autoload the snab table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

motifs_matched_table_meta(current_skyline_app, engine)[source]

Autoload the motifs_matched table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

not_anomalous_motifs_table_meta(current_skyline_app, engine)[source]

Autoload the not_anomalous_motifs table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

cloudburst_table_meta(current_skyline_app, engine)[source]

Autoload the cloudburst table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

cloudbursts_table_meta(current_skyline_app, engine)[source]

Autoload the cloudbursts table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

metric_group_table_meta(current_skyline_app, engine)[source]

Autoload the metric_group table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

metric_group_info_table_meta(current_skyline_app, engine)[source]

Autoload the metric_group_info table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

ionosphere_minmax_table_meta(current_skyline_app, engine)[source]

Autoload the ionosphere_minmax table.

Parameters:
  • current_skyline_app (str) – the app calling the function

  • engine (object) – the sqlalchemy engine object

Returns:

table_object, fail_msg, trace

Return type:

tuple

skyline.external_alert_configs module

external_alert_configs

get_external_alert_configs(current_skyline_app)[source]

Return a dictionary of the concatenated alert configs from settings.EXTERNAL_ALERTS and any fetched external alert configs, an all_alerts list which is a concatenated and deduplicated list of those alert configs, and whether the configs were retrieved from the cache or fetched from source.

Parameters:

current_skyline_app (str) – the app calling the function so the function knows which log to write to.

Returns:

(external_alert_configs, all_alerts, external_from_cache)

Return type:

(dict, list, boolean)
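
A usage sketch following the documented return shape:

    from external_alert_configs import get_external_alert_configs

    external_alert_configs, all_alerts, external_from_cache = \
        get_external_alert_configs('analyzer')
    if external_from_cache:
        print('external alert configs were served from cache')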

skyline.features_profile module

features_profile.py

feature_name_id(current_skyline_app, feature_name)[source]

Determine the Skyline id of a tsfresh feature name

Parameters:

feature_name (str) – the tsfresh feature name

Returns:

id

Return type:

int

calculate_features_profile(current_skyline_app, timestamp, metric, context)[source]

Calculates a tsfresh features profile from a training data set

Parameters:
  • timestamp (str) – the timestamp of the metric anomaly with training data

  • metric (str) – the base_name of the metric

  • context – the context

Returns:

(features_profile_csv_file_path, successful, fail_msg, traceback_format_exc, calc_time)

Return type:

(str, boolean, str, str, str)
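
A usage sketch following the documented return shape; the timestamp, metric and context values are illustrative:

    from features_profile import calculate_features_profile

    (fp_csv, successful, fail_msg, traceback_format_exc,
     calc_time) = calculate_features_profile(
        'webapp', '1672531200', 'stats.example.metric', 'training_data')
    if successful:
        print('features profile csv written to %s in %s' % (fp_csv, calc_time))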

skyline.fp_match_plots module

Plot motif related graphs

plot_fp_match(current_skyline_app, metric, fp_id, fp_values, not_anomalous_timeseries, output_file, strip_prefix=False)[source]

Creates a png graph image using the features profile time series data and the training data, if it exists, otherwise it is fetched from Graphite.

Parameters:
  • current_skyline_app (str) – the Skyline app name calling the function

  • output_file (str) – full path and filename to output where the png image is to be saved to

  • graph_title (str) – the graph image title

  • timeseries (list) – the time series

Returns:

(status, file)

Return type:

(boolean, str)

skyline.ionosphere_functions module

fp_create_get_an_engine(current_skyline_app)[source]
fp_create_engine_disposal(current_skyline_app, engine)[source]
get_ionosphere_learn_details(current_skyline_app, base_name)[source]

Determines what the default IONOSPHERE_LEARN_DEFAULT_ values are, and what the specific override values are if the metric matches a pattern defined in settings.IONOSPHERE_LEARN_NAMESPACE_CONFIG. This is used in Panorama and webapp/ionosphere_backend.

Parameters:
  • current_skyline_app (str) – the Skyline app name calling the function

  • base_name (str) – the base_name of the metric

Returns:

tuple

Returns:

(use_full_duration, valid_learning_duration, use_full_duration_days, max_generations, max_percent_diff_from_origin)

Return type:

(int, int, int, int, float)
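
A usage sketch following the documented return shape:

    from ionosphere_functions import get_ionosphere_learn_details

    (use_full_duration, valid_learning_duration, use_full_duration_days,
     max_generations, max_percent_diff_from_origin) = get_ionosphere_learn_details(
        'webapp', 'stats.example.metric')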

create_fp_ts_graph(current_skyline_app, metric_data_dir, base_name, fp_id, anomaly_timestamp, timeseries)[source]

Creates a png graph image using the features profile time series data provided, or from the features profile time series data in the DB if an empty list is provided.

Parameters:
  • current_skyline_app (str) – the Skyline app name calling the function

  • metric_data_dir (str) – the training_data or features profile directory where the png image is to be saved to

  • base_name (str) – the base_name of the metric

  • fp_id (int) – the features profile id

  • anomaly_timestamp (int) – the anomaly timestamp

  • timeseries (list) – the time series

Returns:

boolean

Return type:

boolean

create_features_profile(current_skyline_app, requested_timestamp, data_for_metric, context, ionosphere_job, fp_parent_id, fp_generation, fp_learn, slack_ionosphere_job, user_id, label)[source]

Add a features_profile to the Skyline ionosphere database table.

Parameters:
  • current_skyline_app (str) – Skyline app name

  • requested_timestamp (int) – The timestamp of the dir that the features profile data is in

  • data_for_metric (str) – The base_name of the metric

  • context (str) – The context of the caller

  • ionosphere_job (str) – The ionosphere_job name related to the creation request. Valid jobs are learn_fp_human, learn_fp_generation, learn_fp_learnt, learn_fp_automatic and learn_repetitive_patterns.

  • fp_parent_id (int) – The id of the parent features profile that this was learnt from, 0 being an original human generated features profile

  • fp_generation (int) – The number of generations away from the original human generated features profile, 0 being an original human generated features profile.

  • fp_learn (boolean) – Whether Ionosphere should learn at use_full_duration_days

  • slack_ionosphere_job (str) – The originating ionosphere_job name

  • user_id (int) – The user id of the user creating the features profile

  • label (str) – A label for the feature profile

Returns:

fp_id, fp_in_successful, fp_exists, fail_msg, traceback_format_exc

Return type:

str, boolean, boolean, str, str
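
A usage sketch following the parameters above; all values are illustrative and describe an original human generated features profile (fp_parent_id and fp_generation of 0):

    from ionosphere_functions import create_features_profile

    (fp_id, fp_in_successful, fp_exists, fail_msg,
     traceback_format_exc) = create_features_profile(
        'webapp', 1672531200, 'stats.example.metric', 'training_data',
        'learn_fp_human', 0, 0, False, 'learn_fp_human', 1, 'example label')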

get_correlations(current_skyline_app, anomaly_id)[source]

Get all the correlations for an anomaly from the database

Parameters:
  • current_skyline_app (str) – the Skyline app name calling the function

  • anomaly_id (int) – the panorama anomaly id

Returns:

list

Returns:

[[metric_name, coefficient, shifted, shifted_coefficient],[metric_name, coefficient, …]]

Return type:

[[str, float, float, float]]
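
A usage sketch following the documented return shape; the anomaly id is illustrative:

    from ionosphere_functions import get_correlations

    correlations = get_correlations('webapp', 1234)
    for metric_name, coefficient, shifted, shifted_coefficient in correlations:
        print(metric_name, coefficient)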

get_related(current_skyline_app, anomaly_id, anomaly_timestamp)[source]

Get all the related anomalies from the database

Parameters:
  • current_skyline_app (str) – the Skyline app name calling the function

  • anomaly_id (int) – the panorama anomaly id

  • anomaly_timestamp (int) – the anomaly timestamp

Returns:

list

Returns:

[[metric_name, related_timestamp],[metric_name, related_timestamp],…], [[timestamp, label]]

Return type:

[[str, int]]

skyline.matched_or_regexed_in_list module

matched_or_regexed_in_list

matched_or_regexed_in_list(current_skyline_app, base_name, match_list, debug_log=False)[source]

Determine if a pattern is in a list as: 1) an absolute match, 2) a match by dotted elements, 3) a match by a regex.

Parameters:
  • current_skyline_app (str) – the app calling the function so the function knows which log to write to.

  • base_name (str) – the metric name

  • match_list (list) – the list of items to try and match the metric name in

Returns:

(matched, matched_by), e.g. (False, {})

Return type:

(boolean, dict)
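
A usage sketch illustrating the match types; here 'stats.web-01' matches by dotted namespace elements and the second item is a regex:

    from matched_or_regexed_in_list import matched_or_regexed_in_list

    match_list = ['stats.web-01', 'carbon\..*\.errors']
    matched, matched_by = matched_or_regexed_in_list(
        'analyzer', 'stats.web-01.cpu.user', match_list)
    if matched:
        # matched_by is a dict describing which item matched and how
        print(matched_by)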

skyline.motif_match_types module

motif_match_types_dict()[source]

skyline.motif_plots module

Plot motif related graphs

plot_motif_match(current_skyline_app, metric, metric_timestamp, fp_id, full_duration, generation_str, motif_id, index, size, distance, type_id, fp_motif, not_anomalous_motif_sequence, output_file, on_demand_motif_analysis=False)[source]

Creates a png graph image using the features profile time series data and the training data, if it exists, otherwise it is fetched from Graphite.

Parameters:
  • current_skyline_app (str) – the Skyline app name calling the function

  • output_file (str) – full path and filename to output where the png image is to be saved to

  • graph_title (str) – the graph image title

  • timeseries (list) – the time series

Returns:

(status, file)

Return type:

(boolean, str)

skyline.plot_motif_window module

Plot motif window overlaid on fp

plot_motif_window(current_skyline_app, metric, metric_timestamp, fp_id, full_duration, generation_str, motif_id, index, size, distance, type_id, fp_motif, fp_timeseries, matched_timeseries, output_file, strip_prefix=False)[source]

Creates a png graph image using the features profile time series data and the training data, if it exists, otherwise it is fetched from Graphite.

Parameters:
  • current_skyline_app (str) – the Skyline app name calling the function

  • output_file (str) – full path and filename to output where the png image is to be saved to

  • graph_title (str) – the graph image title

  • timeseries (list) – the time series

Returns:

(status, file)

Return type:

(boolean, str)

skyline.settings module

Shared settings

IMPORTANT NOTE:

These settings are described with docstrings for the purpose of automated documentation. You may find some of the docstrings easier to read in the rendered documentation itself.

http://earthgecko-skyline.readthedocs.io/en/latest/skyline.html#module-settings

REDIS_SOCKET_PATH = '/tmp/redis.sock'
Variables:

REDIS_SOCKET_PATH (str) – The path for the Redis unix socket [USER_DEFINED]

REDIS_PASSWORD = None
Variables:

REDIS_PASSWORD (str) – The password for Redis. Even though Skyline uses a unix socket it is still advisable to set a password for Redis. If this is set to None, Skyline will not use Redis AUTH [USER_DEFINED]

Note

Please ensure that you do enable Redis authentication by setting the requirepass in your redis.conf with a very long Redis password. See https://redis.io/topics/security and http://antirez.com/news/96 for more info.
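
For illustration only (not Skyline's internal connection code), these two settings map onto a redis-py connection as follows:

    import redis

    REDIS_SOCKET_PATH = '/tmp/redis.sock'
    REDIS_PASSWORD = 'your-very-long-redis-password'

    # Connect via the unix socket; the password is only used if Redis AUTH
    # is configured (REDIS_PASSWORD is not None)
    redis_conn = redis.StrictRedis(
        unix_socket_path=REDIS_SOCKET_PATH, password=REDIS_PASSWORD)
    redis_conn.ping()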

SECRET_KEY = 'your-long_secret-key-to-encrypt_the_redis_password_in_url_parameters'
Variables:

SECRET_KEY (str) – A secret key that is used to encrypt the Redis password in the rebrow URL parameters.

LOG_PATH = '/var/log/skyline'
Variables:

LOG_PATH (str) – The Skyline logs directory. Do not include a trailing slash.

PID_PATH = '/var/run/skyline'
Variables:

PID_PATH (str) – The Skyline pids directory. Do not include a trailing slash.

SKYLINE_DIR = '/opt/skyline'
Variables:

SKYLINE_DIR (str) – The Skyline dir. All other Skyline directories are relative to this dir.

SKYLINE_TMP_DIR = '/tmp/skyline'
Variables:

SKYLINE_TMP_DIR (str) – The Skyline tmp dir. Do not include a trailing slash. It is recommended you keep this in the /tmp directory which normally uses tmpfs.

FULL_NAMESPACE = 'metrics.'
Variables:

FULL_NAMESPACE (str) – Metrics must be prefixed with this value in Redis.

GRAPHITE_SOURCE = ''
Variables:

GRAPHITE_SOURCE (str) – The data source

ENABLE_DEBUG = False
Variables:

ENABLE_DEBUG (boolean) – Enable additional debug logging - useful for development only. This should definitely be set to False on production systems.

MINI_NAMESPACE = 'mini.'
Variables:

MINI_NAMESPACE (str) – The Horizon agent will make T’d writes to both the full namespace and the mini namespace. Oculus gets its data from everything in the mini namespace.

FULL_DURATION = 86400
Variables:

FULL_DURATION (int) – This is the rolling duration that will be stored in Redis. Be sure to pick a value that suits your memory capacity, your CPU capacity and your overall metrics count. Longer durations take longer to analyze, but they can help the algorithms reduce the noise and provide more accurate anomaly detection. However, Mirage handles longer durations, so ideally this should be 86400.

MINI_DURATION = 3600
Variables:

MINI_DURATION (int) – This is the duration of the ‘mini’ namespace, if you are also using the Oculus service. It is also the duration of data that is displayed in the Webapp ‘mini’ view.

VERIFY_SSL = True
Variables:

VERIFY_SSL (boolean) – Whether to verify SSL certificates of requested endpoints. By default this is True, however it can be set to False to allow the use of self-signed SSL certificates.

GRAPHITE_AUTH_HEADER = False
Variables:

GRAPHITE_AUTH_HEADER (str) – the Authorization header for Graphite api

GRAPHITE_CUSTOM_HEADERS = {}
Variables:

GRAPHITE_CUSTOM_HEADERS (dict) – Dictionary with custom headers

GRAPHITE_HOST = 'YOUR_GRAPHITE_HOST.example.com'
Variables:

GRAPHITE_HOST (str) – If you have a Graphite host set up, set this variable to get graphs on Skyline and Horizon. Don’t include http:// since this can be used for CARBON_HOST as well. [USER_DEFINED]

GRAPHITE_PROTOCOL = 'http'
Variables:

GRAPHITE_PROTOCOL (str) – Graphite host protocol - http or https [USER_DEFINED]

GRAPHITE_PORT = '80'
Variables:

GRAPHITE_PORT (str) – Graphite host port - for a specific port if graphite runs on a port other than 80, e.g. ‘8888’ [USER_DEFINED]

GRAPHITE_CONNECT_TIMEOUT = 5
Variables:

GRAPHITE_CONNECT_TIMEOUT (int) – Graphite connect timeout - this allows for the gracefully failure of any graphite requests so that no graphite related functions ever block for too long.

GRAPHITE_READ_TIMEOUT = 10
Variables:

GRAPHITE_READ_TIMEOUT (int) – Graphite read timeout

GRAPHITE_GRAPH_SETTINGS = '&width=588&height=308&bgcolor=000000&fontBold=true&fgcolor=C0C0C0'
Variables:

GRAPHITE_GRAPH_SETTINGS (str) – These are graphite settings in terms of alert graphs - this is defaulted to a format that is more colourblind friendly than the default graphite graphs.

TARGET_HOURS = '7'
Variables:

TARGET_HOURS (str) – The number of hours data to graph in alerts.

GRAPHITE_RENDER_URI = 'render'
Variables:

GRAPHITE_RENDER_URI (str) – Base URI for graphite render, this can generally be render, api/datasources/1/render, api/datasources/proxy/1/render, etc, depending on how your Graphite (or grafana proxy) is set up.

GRAPH_URL = 'http://YOUR_GRAPHITE_HOST.example.com:80/render?width=1400&from=-7hour&target='
Variables:

GRAPH_URL (str) – The graphite URL for alert graphs will be appended with the relevant metric name in each alert.

Note

There is probably no need to change this unless you want a different size graph sent with alerts.

CARBON_HOST = 'YOUR_GRAPHITE_HOST.example.com'
Variables:

CARBON_HOST (str) – The endpoint to send metrics to so that they reach Graphite. If the CARBON_HOST is a different host to the GRAPHITE_HOST, set it here.

CARBON_PORT = 2003
Variables:

CARBON_PORT (int) – If you have a Graphite host set up, set its Carbon port. [USER_DEFINED]

SKYLINE_METRICS_CARBON_HOST = 'YOUR_GRAPHITE_HOST.example.com'
Variables:

SKYLINE_METRICS_CARBON_HOST (str) – If you want to send the Skyline metrics to a host other than the GRAPHITE_HOST, declare it here and see the SKYLINE_METRICS_CARBON_PORT setting below.

SKYLINE_METRICS_CARBON_PORT = 2003
Variables:

SKYLINE_METRICS_CARBON_PORT (int) – If you want to send the Skyline metrics to a different SKYLINE_METRICS_CARBON_HOST host other than the GRAPHITE_HOST and it has a different port to the CARBON_PORT, declare it here.

OCULUS_HOST = ''
Variables:

OCULUS_HOST (str) – If you have Oculus set up, set this to http://<OCULUS_HOST>

  • If you do not want to use Oculus, leave this empty. However if you comment this out, Skyline will not work! Speed improvements will occur when Oculus support is disabled.

SERVER_METRICS_NAME = 'YOUR_HOSTNAME'
Variables:

SERVER_METRICS_NAME (str) – The hostname of the Skyline host. [USER_DEFINED]

  • This is to allow for multiple Skyline nodes to send metrics to a Graphite instance on the Skyline namespace sharded by this setting, like carbon.relays. If you run multiple Skyline hosts, set the hostname of each Skyline host here and metrics will be namespaced as e.g. skyline.analyzer.skyline-01.run_time

SKYLINE_FEEDBACK_NAMESPACES = ['YOUR_HOSTNAME', 'YOUR_GRAPHITE_HOST.example.com']
Variables:

SKYLINE_FEEDBACK_NAMESPACES (list) –

This is a list of namespaces that can cause feedback in Skyline. If you are analysing the system metrics of the Skyline host (server or container), then if a lot of metrics become anomalous, the Skyline host/s are going to be working much more and pulling more data from the GRAPHITE_HOST, the Skyline mysql database metrics and Redis queries will all change substantially too. Although Skyline can be trained and learn them, when Skyline is in a known busy state, the monitoring of its own metrics and related metrics should take 2nd priority. In fact when the ionosphere_busy state is determined, Skyline will rate limit the analysis of any metrics in the namespaces declared here. This list works in the same way that Horizon SKIP_LIST does, it matches in the string or dotted namespace elements. [USER_DEFINED] For example:

SKYLINE_FEEDBACK_NAMESPACES = [
    SERVER_METRICS_NAME,
    'stats.skyline-docker-graphite-statsd-1',
    'stats.skyline-mysql']

DO_NOT_SKIP_SKYLINE_FEEDBACK_NAMESPACES = []
Variables:

DO_NOT_SKIP_SKYLINE_FEEDBACK_NAMESPACES (list) –

This is a list of namespaces or metrics that may be in the SKYLINE_FEEDBACK_NAMESPACES that you DO NOT want to skip as feedback metrics but always want analysed. [USER_DEFINED] Metrics will be evaluated against namespaces in this list using matched_or_regexed_in_list() which determines if a pattern is in a list as a: 1) absolute match 2) match by dotted elements 3) matched by a regex

For example:

DO_NOT_SKIP_SKYLINE_FEEDBACK_NAMESPACES = [
    'nginx',
    'disk.used_percent',
    'system.load15'
]

MIRAGE_CHECK_PATH = '/opt/skyline/mirage/check'
Variables:

MIRAGE_CHECK_PATH (str) – This is the location the Skyline analyzer will write the second order resolution anomalies to check to a file on disk - absolute path

CRUCIBLE_CHECK_PATH = '/opt/skyline/crucible/check'
Variables:

CRUCIBLE_CHECK_PATH (str) – This is the location the Skyline apps will write the anomalies to for crucible to check to a file on disk - absolute path

PANORAMA_CHECK_PATH = '/opt/skyline/panorama/check'
Variables:

PANORAMA_CHECK_PATH (str) – This is the location the Skyline apps will write the anomalies to for Panorama to check to a file on disk - absolute path

DATA_UPLOADS_PATH = '/tmp/skyline/data_uploads'
Variables:

DATA_UPLOADS_PATH (str) – The path that webapp writes uploaded data to and flux checks for data to process. Note the parent directory must be writable to the user that the Skyline processes are running as. This is related to the settings.FLUX_PROCESS_UPLOADS and settings.WEBAPP_ACCEPT_DATA_UPLOADS settings.

PANDAS_VERSION = '0.18.1'
Variables:

PANDAS_VERSION (str) – Pandas version in use (only applicable to Skyline < v2.0.0)

  • Declaring the version of pandas in use reduces a large amount of interpolating in all the skyline modules. There are some differences from pandas >= 0.18.0 however the original Skyline could run on lower versions of pandas.

ALERTERS_SETTINGS = True
Variables:

ALERTERS_SETTINGS (boolean) – just leave this as True

Note

Alerters can be enabled as required, due to the fact that not everyone will necessarily want all 3rd party alerters. Enable the 3rd party alerters you require here. This enables only the alerters that are required to be imported and means that not all alerter related modules in requirements.txt have to be installed, only those you require.

SYSLOG_ENABLED = True
Variables:

SYSLOG_ENABLED (boolean) – enables Skyline apps to submit anomalous metric details to syslog. Being set to True makes syslog a kind of alerter, like a SMTP alert. It also results in all anomalies being recorded in the database by Panorama and this is the desired default.

HIPCHAT_ENABLED = False
Variables:

HIPCHAT_ENABLED (boolean) – [DEPRECATED] Enables the Hipchat alerter

PAGERDUTY_ENABLED = False
Variables:

PAGERDUTY_ENABLED (boolean) – Enables the Pagerduty alerter [USER_DEFINED]

SLACK_ENABLED = False
Variables:

SLACK_ENABLED (boolean) – Enables the Slack alerter [USER_DEFINED]

HTTP_ALERTERS_ENABLED = False
Variables:

HTTP_ALERTERS_ENABLED (boolean) – Enables the http alerter

START_IF_NO_DB = False
Variables:

START_IF_NO_DB (boolean) – This allows the Skyline apps to start if there is a DB issue and/or the DB is not available. By default the apps will not start if the DB is not available, but this is useful for testing and allowing the Skyline apps to continue to function and alert, even if it is in a limited fashion that defaults to noisy 3-sigma alerting.

ANALYZER_ENABLED = True
Variables:

ANALYZER_ENABLED (boolean) – This enables analysis via Analyzer. For ADVANCED configurations only. If this is set to False, the Analyzer process can still be started and will process the metrics in the pipeline but it will NOT analyse them, therefore there will be no alerting, no feeding Mirage, etc. Analyzer will simply run as if there are 0 metrics. This allows for an advanced modular set up for running multiple distributed Skyline instances.

ANALYZER_VERBOSE_LOGGING = True
Variables:

ANALYZER_VERBOSE_LOGGING (boolean) – As of Skyline 3.0, apps log notices and errors only. To have additional info logged set this to True. Useful for debugging but less verbose than LOCAL_DEBUG.

ANOMALY_DUMP = 'webapp/static/dump/anomalies.json'
Variables:

ANOMALY_DUMP (str) – This is the location the Skyline agent will write the anomalies file to disk. It needs to be in a location accessible to the webapp.

ANALYZER_PROCESSES = 1
Variables:

ANALYZER_PROCESSES (int) – This is the number of processes that the Skyline Analyzer will spawn.

  • Analysis is a very CPU-intensive procedure. You will see optimal results if you set ANALYZER_PROCESSES to several less than the total number of CPUs on your server. Be sure to leave some CPU room for the Horizon workers and for Redis.

  • IMPORTANTLY bear in mind that your Analyzer run should be able to analyze all your metrics within the resolution of your metrics. So for example, if you have 1000 metrics at a resolution of 60 seconds (e.g. one datapoint per 60 seconds), you are aiming to analyze all of those within 60 seconds. If you do not, the anomaly detection begins to lag and it is no longer really near realtime. That stated, bear in mind that if you are not processing 10s of 1000s of metrics, you may only need one Analyzer process. To determine your optimal settings take note of the 'seconds to run' values in the Analyzer log.

ANALYZER_OPTIMUM_RUN_DURATION = 60
Variables:

ANALYZER_OPTIMUM_RUN_DURATION (int) – This is how many seconds it would be optimum for Analyzer to be able to analyze all your metrics in.

Note

In the original Skyline this was hardcoded to 5.

MAX_ANALYZER_PROCESS_RUNTIME = 180
Variables:

MAX_ANALYZER_PROCESS_RUNTIME (int) – The maximum number of seconds an Analyzer process should run analysing a set of assigned_metrics.

  • This is for Analyzer to self-monitor its own analysis threads and terminate any threads that have run longer than this. Although Analyzer and multiprocessing are very stable, there are edge cases in real world operations which can very infrequently cause a process to hang.

STALE_PERIOD = 500
Variables:

STALE_PERIOD (int) – This is the duration, in seconds, for a metric to become ‘stale’ and for the analyzer to ignore it until new datapoints are added. ‘Staleness’ means that a datapoint has not been added for STALE_PERIOD seconds.

CUSTOM_STALE_PERIOD = {}
Variables:

CUSTOM_STALE_PERIOD (dict) – This enables overriding the settings.STALE_PERIOD per metric namespace to become ‘stale’ and for the analyzer/boundary to ignore it until new datapoints are added. ‘Staleness’ means that a datapoint has not been added for the defined number of seconds. The namespaces can be absolute metric names, substrings (dotted elements) of a namespace or a regex of a namespace. First match wins so ensure metrics are not defined by multiple entries.

  • Example:

    CUSTOM_STALE_PERIOD = {
        'cswellsurf.buoys': 7200,
    }
    
ALERT_ON_STALE_METRICS = True
Variables:

ALERT_ON_STALE_METRICS (boolean) – Send a digest alert of all metrics that stop populating their time series data.

ALERT_ON_STALE_PERIOD = 300
Variables:

ALERT_ON_STALE_PERIOD (int) – This is the duration, in seconds, after which an alert will be sent for a metric if it stops sending data. The digest alert will only occur once while in the window between the ALERT_ON_STALE_PERIOD and the STALE_PERIOD (or CUSTOM_STALE_PERIOD).

MIN_TOLERABLE_LENGTH = 100
Variables:

MIN_TOLERABLE_LENGTH (int) – This is the minimum length of a timeseries, in datapoints, for the analyzer to recognize it as a complete series. It can be set at 1 but there is little point in analysing a timeseries with a single data point. It is set to 100 by default, but could be realistically brought down to 60 or 30. It is difficult for algorithms to work with so few samples. However if you have some very sparsely populated metrics for whatever reason, then perhaps 100 is too high and you may miss some analysis on these types of metrics. Generally your densely populated metrics are the vast majority of your metric population and if not, there are other settings that can be configured to handle sparsely populated metrics better. See: settings.ZERO_FILL_NAMESPACES, settings.FLUX_ZERO_FILL_NAMESPACES and settings.LAST_KNOWN_VALUE_NAMESPACES.

MAX_TOLERABLE_BOREDOM = 100
Variables:

MAX_TOLERABLE_BOREDOM (int) – Sometimes a metric will continually transmit the same number. There’s no need to analyze metrics that remain boring like this, so this setting determines the amount of boring datapoints that will be allowed to accumulate before the analyzer skips over the metric. If the metric becomes noisy again, the analyzer will stop ignoring it.

BOREDOM_SET_SIZE = 1
Variables:

BOREDOM_SET_SIZE (int) – By default, the analyzer skips a metric if it has transmitted a single number settings.MAX_TOLERABLE_BOREDOM times.

  • Change this setting if you wish the size of the ignored set to be higher (ie, ignore the metric if there have only been two different values for the past settings.MAX_TOLERABLE_BOREDOM datapoints). This is useful for timeseries that often oscillate between two values.
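
A minimal sketch of the boredom check these two settings describe, over the [timestamp, value] time series format (an illustration, not the analyzer's exact code):

    MAX_TOLERABLE_BOREDOM = 100
    BOREDOM_SET_SIZE = 1

    def is_boring(timeseries):
        # Boring if only BOREDOM_SET_SIZE (or fewer) distinct values occur
        # in the last MAX_TOLERABLE_BOREDOM datapoints
        recent_values = set(value for _, value in timeseries[-MAX_TOLERABLE_BOREDOM:])
        return len(recent_values) <= BOREDOM_SET_SIZE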

IDENTIFY_AIRGAPS = False
Variables:

IDENTIFY_AIRGAPS (boolean) – Identify metrics which have air gaps and publish to the analyzer.airgapped_metrics Redis set which is exposed via the webapp on /api?airgapped_metrics [ADVANCED FEATURE]

  • Note that the implementation of this feature has a computational cost and will increase analyzer.run_time and CPU usage. If you are not back filling airgaps via Flux or directly to carbon-relay, DO NOT enable this.

  • Enabling this also enables the IDENTIFY_UNORDERED_TIMESERIES feature, which is part of the IDENTIFY_AIRGAPS functionality.

  • If you do enable this, consider specifying specific metrics and/or namespaces in CHECK_AIRGAPS below to limit only checking metrics which will be back filled and not all your metrics.

MAX_AIRGAP_PERIOD = 21600
Variables:

MAX_AIRGAP_PERIOD (int) – If IDENTIFY_AIRGAPS is enabled, Analyzer will only flag metrics that have any air gaps in the last MAX_AIRGAP_PERIOD seconds as air gapped. [ADVANCED FEATURE]

CHECK_AIRGAPS = []
Variables:

CHECK_AIRGAPS (list) – If set to [] ALL metrics will be checked. List metrics and namespaces that you explicitly want to identify airgaps in; this is only applicable if you have IDENTIFY_AIRGAPS enabled. If metrics and/or namespaces are listed here, ONLY these will be checked. [ADVANCED FEATURE]

Seeing as IDENTIFY_AIRGAPS can be computationally expensive, this allows you to only check specific metrics for airgaps.

The CHECK_AIRGAPS items are also matched by dotted namespace elements; if a match is not found in the string, then the dotted elements are compared. For example, if an item such as 'skyline.analyzer.algorithm_breakdown' was added it would match any metric that matches all 3 dotted namespace elements, so it would match:

skyline.analyzer.skyline-1.algorithm_breakdown.histogram_bins.timing.median_time
skyline.analyzer.skyline-1.algorithm_breakdown.histogram_bins.timing.times_run
skyline.analyzer.skyline-1.algorithm_breakdown.ks_test.timing.times_run

Example:

    CHECK_AIRGAPS = [
        'remote_hosts',
        'external_hosts.dc1',
    ]

SKIP_AIRGAPS = []
Variables:

SKIP_AIRGAPS (list) – List metrics that you do not want to identify airgaps in; this is only applicable if you have IDENTIFY_AIRGAPS enabled. [ADVANCED FEATURE]

These are metrics that, for whatever reason, you do not want to check to see if any airgaps are present.

The SKIP_AIRGAPS items are also matched by dotted namespace elements; if a match is not found in the string, then the dotted elements are compared. For example, if an item such as 'skyline.analyzer.algorithm_breakdown' was added it would match any metric that matches all 3 dotted namespace elements, so it would match:

skyline.analyzer.skyline-1.algorithm_breakdown.histogram_bins.timing.median_time
skyline.analyzer.skyline-1.algorithm_breakdown.histogram_bins.timing.times_run
skyline.analyzer.skyline-1.algorithm_breakdown.ks_test.timing.times_run

Example:

    SKIP_AIRGAPS = [
        'carbon',
        'skyline',
        'stats',
    ]

IDENTIFY_UNORDERED_TIMESERIES = False
Variables:

IDENTIFY_UNORDERED_TIMESERIES (boolean) – Identify metrics that are not correctly ordered by timestamp, so that Analyzer can sort, deduplicate and recreate the Redis metric data with correctly sorted time series data, in a manner that does not lose any data. [ADVANCED FEATURE]

  • Note that the implementation of this feature has a small computational cost. If enabled it uses a small part of the IDENTIFY_AIRGAPS feature described above.

  • If IDENTIFY_AIRGAPS is enabled this is enabled by default, even if IDENTIFY_UNORDERED_TIMESERIES = False

  • This was introduced as external sources sending metrics via Flux could send metric data out of order, which Graphite will handle but will pickle to Horizon and make the metric Redis data unordered. This definitely occurs if a metric is back filled via Flux or directly to carbon-relay. Although Flux identifies these metrics for Analyzer via flux.filled Redis keys, Analyzer can undertake this check on its own to handle any cases where for whatever reason any metric data becomes unordered, even if IDENTIFY_AIRGAPS is not enabled. [ADVANCED FEATURE]

CHECK_DATA_SPARSITY = True
Variables:

CHECK_DATA_SPARSITY (boolean) – ADVANCED FEATURE - in Analyzer, metrics_manager determines how many metrics are fully populated (good), becoming increasingly sparsely populated (bad, not receiving data), becoming more densely populated (good), and the average data sparsity for the entire metric population.

SKIP_CHECK_DATA_SPARSITY_NAMESPACES = ['otel.traces', 'skyline.ionosphere.feature_calculation_time', 'skyline.mirage.run_time']
Variables:

SKIP_CHECK_DATA_SPARSITY_NAMESPACES (list) – ADVANCED FEATURE - if there are metrics in the population that you expect not to send data all the time, you can declare the namespaces here if you do not want them influencing the data sparsity measurements. This is a list of absolute metric names, substrings (dotted elements) of a namespace or a regex of a namespace.

FULLY_POPULATED_PERCENTAGE = 94.0
Variables:

FULLY_POPULATED_PERCENTAGE (float) – ADVANCED FEATURE - the percent of data points required in a time series for it to be considered as fully populated at settings.FULL_DURATION. Any time series with more than this is considered fully populated. Skyline calculates this value based on the metric resolution/frequency that Skyline automatically calculates from the time series data.

SPARSELY_POPULATED_PERCENTAGE = 40.0
Variables:

SPARSELY_POPULATED_PERCENTAGE (float) – ADVANCED FEATURE - the percent of data points required in a time series for it to be considered as sparsely populated at settings.FULL_DURATION. Any time series with less than this is considered sparsely populated. Skyline calculates this value based on the metric resolution/frequency that Skyline automatically calculates from the time series data.

ANALYZER_CHECK_LAST_TIMESTAMP = False
Variables:

ANALYZER_CHECK_LAST_TIMESTAMP (boolean) – ADVANCED FEATURE - whether to make Analyzer record the last analysed timestamp per metric and only submit the metric to be analysed if it has a timestamp newer than the last analysed one (or is stale). Where high frequency metrics are used this is generally not required, however it is useful and can substantially reduce the amount of analysis run if you have lots of lower frequency metrics.

BATCH_PROCESSING = False
Variables:

BATCH_PROCESSING (boolean) – Whether to apply batch processing to metrics which are received in batches. In general this should not be enabled for all metrics as it significantly increases the computational footprint and increases memory use and calls. It should only be enabled if you have metrics that are received in infrequent batches; metrics fed per minute do not require batch processing. For example, if metrics are sent to Skyline every 15 minutes with a data point for each minute in the period, Analyzer's default analysis would only analyse the latest data point against the metric time series data. With batch processing, Analyzer identifies batched metrics and when a batch of data is received Analyzer sends the metrics to analyzer_batch to analyse. To ensure that this can be achieved as computationally cheaply as possible, the BATCH_PROCESSING_NAMESPACES list can be applied to reduce the footprint of this functionality. ADVANCED FEATURE

BATCH_PROCESSING_STALE_PERIOD = 86400
Variables:

BATCH_PROCESSING_STALE_PERIOD (int) – This is the duration, in seconds, for a metric to be deemed as stale and for the analyzer_batch to ignore it until new datapoints are added. ADVANCED FEATURE

BATCH_PROCESSING_DEBUG = False
Variables:

BATCH_PROCESSING_DEBUG (boolean) – Whether to log batch processing info from Analyzer.

BATCH_PROCESSING_NAMESPACES = []
Variables:

BATCH_PROCESSING_NAMESPACES (list) – If BATCH_PROCESSING is enabled, to reduce the computational footprint of batch processing metric time series data, a list of metric namespaces which can be expected to be batch processed can be defined, so that BATCH_PROCESSING keys, checks and resources are not applied to all metrics. This list works in the same way that SKIP_LIST does, it matches in the string or dotted namespace elements. ADVANCED FEATURE

METRICS_INACTIVE_AFTER = 82800
Variables:

METRICS_INACTIVE_AFTER (int) – Identify metrics as inactive after the defined seconds.

CANARY_METRIC = 'skyline.horizon.YOUR_HOSTNAME.worker.metrics_received'
Variables:

CANARY_METRIC (str) – The metric name to use as the CANARY_METRIC [USER_DEFINED]

  • The canary metric should be a metric with a very high, reliable resolution that you can use to gauge the status of the system as a whole. Like a metric in the carbon. or a skyline. namespace, e.g.: CANARY_METRIC = 'skyline.%s.worker.metrics_received' % SERVER_METRICS_NAME

  • In the cluster context this is an ADVANCED_FEATURE because it is more difficult to decide upon: the metric must be assigned to the cluster node shard, meaning the metric must be available in this cluster node's local Redis and the node needs to be the authoritative_node. You can find one in /rebrow on the cluster node by searching for metrics.skyline.* to use as the CANARY_METRIC per cluster node. Further, if you are sending the skyline metrics from the cluster nodes to a different Skyline for analysis, then use metrics.carbon.* or another suitable namespace.

ALGORITHMS = ['histogram_bins', 'first_hour_average', 'stddev_from_average', 'grubbs', 'ks_test', 'mean_subtraction_cumulation', 'median_absolute_deviation', 'stddev_from_moving_average', 'least_squares']
Variables:

ALGORITHMS (list) – These are the algorithms that the Analyzer will run. To add a new algorithm, you must both define the algorithm in algorithms.py and add its name here.

CONSENSUS = 6
Variables:

CONSENSUS (int) – This is the number of algorithms that must return True before a metric is classified as anomalous by Analyzer.
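
A minimal sketch of the consensus rule, with a stand-in for running one algorithm (the real implementations live in the analyzer algorithms module):

    ALGORITHMS = [
        'histogram_bins', 'first_hour_average', 'stddev_from_average',
        'grubbs', 'ks_test', 'mean_subtraction_cumulation',
        'median_absolute_deviation', 'stddev_from_moving_average',
        'least_squares']
    CONSENSUS = 6

    def run_algorithm(name, timeseries):
        # Illustrative stub: each algorithm votes True (anomalous) or False
        return False

    timeseries = [[1672531200 + i * 60, 1.0] for i in range(100)]
    ensemble = [run_algorithm(name, timeseries) for name in ALGORITHMS]

    # The latest datapoint is classed as anomalous only if at least
    # CONSENSUS algorithms triggered
    anomalous = ensemble.count(True) >= CONSENSUS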

ANALYZER_ANALYZE_LOW_PRIORITY_METRICS = True
Variables:

ANALYZER_ANALYZE_LOW_PRIORITY_METRICS (boolean) – By default all low priority metrics are analysed normally like high priority metrics. For any type of analysis to be run against low priority metrics this must be set to True. Setting this value to False disables ALL analysis of low priority metrics, they are simply skipped (except in Luminosity correlations). To configure low priority metrics with any of the below LOW_PRIORITY_METRICS settings, this value must be set to True. ADVANCED FEATURE. See https://earthgecko-skyline.readthedocs.io/en/latest/analyzer.html#high-and-low-priority-metrics

ANALYZER_DYNAMICALLY_ANALYZE_LOW_PRIORITY_METRICS = False
Variables:

ANALYZER_DYNAMICALLY_ANALYZE_LOW_PRIORITY_METRICS (boolean) – ADVANCED FEATURE. This mode will attempt to dynamically analyse as many low priority metrics as possible in the available time, looping through the metrics on a best effort basis to analyse low priority metrics as frequently as possible without causing lag in the analysis of high priority metrics.

ANALYZER_MAD_LOW_PRIORITY_METRICS = 0
Variables:

ANALYZER_MAD_LOW_PRIORITY_METRICS (int) – ADVANCED FEATURE. This is the number of data points on which to calculate MAD to determine if a low priority metric should be analysed via the three-sigma algorithms. The default value of 0 disables this feature. If set, this should not be greater than 15 at the most as it will result in a performance loss, see https://earthgecko-skyline.readthedocs.io/en/latest/analyzer.html#ANALYZER_MAD_LOW_PRIORITY_METRICS Note that if ANALYZER_DYNAMICALLY_ANALYZE_LOW_PRIORITY_METRICS is set to True, ANALYZER_MAD_LOW_PRIORITY_METRICS will be automatically set to 10 if it is set to 0 here. ADVANCED FEATURE
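
For reference, a minimal median absolute deviation calculation over a short window (an illustration, not Skyline's exact implementation):

    import numpy as np

    def mad(values):
        # Median absolute deviation: the median distance from the median
        values = np.asarray(values)
        return np.median(np.abs(values - np.median(values)))

    recent = [1.0, 1.1, 0.9, 1.0, 1.2, 0.9, 1.0, 1.1, 0.9, 14.2]
    print(mad(recent))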

ANALYZER_SKIP = []
Variables:

ANALYZER_SKIP (list) – Namespaces to not analyse. These are metrics that you do not want Analyzer to analyse. It allows for disabling the analysis of certain namespaces. Works in the same way as SKIP_LIST does, it matches in the string or dotted namespace elements. If you never want to analyse a namespace, rather add it to SKIP_LIST. ANALYZER_SKIP is more suited to temporarily disabling analysis on a namespace while the metrics time series data still gets submitted to Redis, whereas metrics in SKIP_LIST just get dropped. HOWEVER, be aware that adding metrics to ANALYZER_SKIP for a short period means that anomalies will not be detected or recorded, and if you have AUTOMATICALLY_LEARN_NORMAL enabled then this may result in automatically learning things that are not normal for the metrics that have been skipped. Therefore, if you have AUTOMATICALLY_LEARN_NORMAL enabled, do not use this feature and instead consider adding the metrics via the MUTE_ALERTS_ON feature in the Skyline UI.

CUSTOM_ALGORITHMS = {}
Variables:

CUSTOM_ALGORITHMS (dict) – Custom algorithms to run. An empty dict {} disables this feature. Only available with analyzer, analyzer_batch and mirage. ADVANCED FEATURE.

  • For full documentation see https://earthgecko-skyline.readthedocs.io/en/latest/algorithms/custom-algorithms.html

  • CUSTOM_ALGORITHMS example:

    CUSTOM_ALGORITHMS = {
        'abs_stddev_from_median': {
            'namespaces': ['telegraf.cpu-total.cpu.usage_system'],
            'algorithm_source': '/opt/skyline/github/skyline/skyline/custom_algorithms/abs_stddev_from_median.py',
            'algorithm_parameters': {},
            'max_execution_time': 0.09,
            'consensus': 6,
            'algorithms_allowed_in_consensus': [],
            'run_3sigma_algorithms': True,
            'run_before_3sigma': True,
            'run_only_if_consensus': False,
            'trigger_history_override': 0,
            'use_with': ['analyzer', 'analyzer_batch', 'mirage'],
            'debug_logging': False,
        },
        'last_same_hours': {
            'namespaces': ['telegraf.cpu-total.cpu.usage_user'],
            'algorithm_source': '/opt/skyline/github/skyline/skyline/custom_algorithms/last_same_hours.py',
            # Pass the argument 604800 for the sample_period parameter and
            # enable debug_logging in the algorithm itself
            'algorithm_parameters': {
              'sample_period': 604800,
              'debug_logging': True
            },
            'max_execution_time': 0.3,
            'consensus': 6,
            'algorithms_allowed_in_consensus': [],
            'run_3sigma_algorithms': True,
            'run_before_3sigma': True,
            'run_only_if_consensus': False,
            'trigger_history_override': 0,
            # This does not run on analyzer as it is weekly data
            'use_with': ['mirage'],
            'debug_logging': False,
        },
        'detect_significant_change': {
            'namespaces': ['swell.buoy.*.Hm0'],
            # Algorithm source not in the Skyline code directory
            'algorithm_source': '/opt/skyline_custom_algorithms/detect_significant_change/detect_significant_change.py',
            'algorithm_parameters': {},
            'max_execution_time': 0.002,
            'consensus': 1,
            'algorithms_allowed_in_consensus': ['detect_significant_change'],
            'run_3sigma_algorithms': False,
            'run_before_3sigma': True,
            'run_only_if_consensus': False,
            'trigger_history_override': 0,
            'use_with': ['analyzer', 'crucible'],
            'debug_logging': True,
        },
        'skyline_matrixprofile': {
            'namespaces': ['telegraf'],
            'algorithm_source': '/opt/skyline/github/skyline/skyline/custom_algorithms/skyline_matrixprofile.py',
            'algorithm_parameters': {'windows': 5, 'k_discords': 20},
            'max_execution_time': 5.0,
            'consensus': 1,
            'algorithms_allowed_in_consensus': ['skyline_matrixprofile'],
            'run_3sigma_algorithms': True,
            'run_before_3sigma': False,
            'run_only_if_consensus': True,
            'trigger_history_override': 4,
            'use_with': ['mirage'],
            'debug_logging': False,
        },
        'skyline_ARTime': {
            'namespaces': ['telegraf', 'skyline'],
            'algorithm_source': '/opt/skyline/github/skyline/skyline/custom_algorithms/skyline_ARTime.py',
            'algorithm_parameters': {
                'windows': 16, 'probationary_period': 216, 'windows_per_pb': 13,
                'sstep': 1, 'discretize_chomp': 0.075, 'nlevels': 80,
                'mask_rho_after_anomaly': 80, 'trend_window': 20,
                'initial_rho': 0.80, 'learn_all': 'false'
            },
            'max_execution_time': 6.0,
            'consensus': 1,
            'algorithms_allowed_in_consensus': ['skyline_ARTime'],
            'run_3sigma_algorithms': True,
            'run_before_3sigma': False,
            'run_only_if_consensus': True,
            'trigger_history_override': 6,
            'use_with': ['mirage'],
            'debug_logging': False,
            'handler': 'flock',
        },
    }
    
  • Each dictionary item needs to be named the same as the algorithm to be run

  • CUSTOM_ALGORITHMS dictionary keys and values are:

Parameters:
  • namespaces (list) – a list of absolute metric names, substrings (dotted elements) of a namespace or a regex of a namespace

  • algorithm_source (str) – the full path to the custom algorithm Python file

  • algorithm_parameters (dict) – a dictionary of any parameters to pass to the custom algorithm

  • max_execution_time (float) – the maximum time the algorithm should run; any longer than this value and the algorithm process will be timed out and terminated. Bear in mind that algorithms have to run FAST, otherwise analysis stops being real time and the Skyline apps will terminate their own spawned processes. Consider that Skyline's 3-sigma algorithms take on average 0.0023 seconds to run, and all 9 are run on a metric in about 0.0207 seconds.

  • consensus (int) – The number of algorithms that need to trigger, including this one, for a data point to be classed as anomalous. You can declare the same as the settings.CONSENSUS value, or that +1, or simply 1 if you want an anomaly triggered whenever the custom algorithm triggers.

  • algorithms_allowed_in_consensus (list) – this is not implemented yet and is optional

  • run_3sigma_algorithms (boolean) – a boolean stating whether to run normal 3 sigma algorithms, this is optional and defaults to True if it is not passed in the dictionary. Read the full documentation referred to above to determine the effects of passing this as False.

  • run_before_3sigma (boolean) – a boolean stating whether to run the custom algorithm before the normal three-sigma algorithms, this defaults to True. If you want your custom algorithm to run after the three-sigma algorithms set this to False. Read the full documentation referred to above to determine the effects of passing this as False.

  • run_only_if_consensus (boolean) – a boolean stating whether to run the custom algorithm only if CONSENSUS or MIRAGE_CONSENSUS is achieved, it defaults to False. This only applies to custom algorithms that are run after three-sigma algorithms, e.g. with the parameter run_before_3sigma: False. Currently this parameter only uses the CONSENSUS or MIRAGE_CONSENSUS setting and does not apply the consensus parameter above.

  • trigger_history_override (int) – If the 3-sigma algorithms have reached CONSENSUS this many times in a row, override a custom algorithm result of not anomalous.

  • use_with (list) – a list of Skyline apps which should apply the algorithm if they handle the metric, it is only applied if the app handles the metric, generally set this to ['analyzer', 'analyzer_batch', 'mirage', 'crucible']

  • debug_logging (boolean) – whether to run debug logging on the custom algorithm, normally set this to False but for development and testing True is useful.

  • handler – this is an optional parameter and only needs to be set for algorithms that need to be run via flock.
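
A minimal custom algorithm skeleton, as a sketch only; the authoritative signature and return contract are defined in the custom algorithms documentation linked above. This sketch assumes the commonly documented pattern of a function, named the same as its CUSTOM_ALGORITHMS key, that returns (anomalous, anomaly_score):

    # e.g. saved as the algorithm_source file declared in CUSTOM_ALGORITHMS;
    # the function name must match its CUSTOM_ALGORITHMS dictionary key
    def always_not_anomalous(current_skyline_app, parent_pid, timeseries,
                             algorithm_parameters):
        try:
            # A real algorithm scores the series here; this sketch only
            # inspects the latest datapoint and never triggers
            last_timestamp, last_value = timeseries[-1]
            anomalous = False
            anomaly_score = 0.0
        except Exception:
            # Assumed error convention: signal a failed run to the caller
            return None, None
        return anomalous, anomaly_score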

DEBUG_CUSTOM_ALGORITHMS = False
Variables:

DEBUG_CUSTOM_ALGORITHMS (boolean) – a boolean to enable debugging.

RUN_OPTIMIZED_WORKFLOW = True
Variables:

RUN_OPTIMIZED_WORKFLOW (boolean) – This sets Analyzer to run in an optimized manner.

  • This sets Analyzer to run in an optimized manner in terms of using the CONSENSUS setting to dynamically determine in what order and how many algorithms need to be run to be able to achieve CONSENSUS. This reduces the amount of work that Analyzer has to do per run. It is recommended that this be set to True in most circumstances to ensure that Analyzer runs as efficiently as possible, UNLESS you are working on algorithm development, in which case you may want this to be False.

ENABLE_ALGORITHM_RUN_METRICS = True
Variables:

ENABLE_ALGORITHM_RUN_METRICS (boolean) – This enables algorithm timing metrics to Graphite

  • This will send additional metrics to the graphite namespaces of:

    skyline.analyzer.<hostname>.algorithm_breakdown.<algorithm_name>.timings.median_time
    skyline.analyzer.<hostname>.algorithm_breakdown.<algorithm_name>.timings.times_run
    skyline.analyzer.<hostname>.algorithm_breakdown.<algorithm_name>.timings.total_time

    These are related to the RUN_OPTIMIZED_WORKFLOW performance tuning.

ENABLE_ALL_ALGORITHMS_RUN_METRICS = False
Variables:

ENABLE_ALL_ALGORITHMS_RUN_METRICS (boolean) – DEVELOPMENT only - run and time all algorithms.

Warning

If set to True, Analyzer will revert to its original unoptimized workflow and will run and time all algorithms against all timeseries, overriding RUN_OPTIMIZED_WORKFLOW = True

ENABLE_SECOND_ORDER = False
Variables:

ENABLE_SECOND_ORDER (boolean) – This is to enable second order anomalies.

Warning

EXPERIMENTAL - This is an experimental feature, so it’s turned off by default and it does nothing currently.

ENABLE_ALERTS = True
Variables:

ENABLE_ALERTS (boolean) – This enables Analyzer alerting.

ENABLE_MIRAGE = False
Variables:

ENABLE_MIRAGE (boolean) – This enables Analyzer to output to Mirage [USER_DEFINED]

ENABLE_FULL_DURATION_ALERTS = False
Variables:

ENABLE_FULL_DURATION_ALERTS (boolean) – This enables Analyzer to alert on all FULL_DURATION anomalies.

  • This enables FULL_DURATION alerting for Analyzer. If True, Analyzer will send ALL alerts on any alert tuple that has a SECOND_ORDER_RESOLUTION_HOURS value defined for Mirage. If False, Analyzer will only add a Mirage check and allow Mirage to do the alerting.

Note

If you have Mirage enabled and have defined SECOND_ORDER_RESOLUTION_HOURS values in the desired metric alert tuples, you want this set to False

ANALYZER_CRUCIBLE_ENABLED = False
Variables:

ANALYZER_CRUCIBLE_ENABLED (boolean) – This enables Analyzer to output to Crucible

  • This enables Analyzer to send Crucible data, if this is set to True ensure that settings.CRUCIBLE_ENABLED is also set to True in the Crucible settings block.

Warning

Not recommended: this will make a LOT of data files in the settings.CRUCIBLE_DATA_FOLDER and is for development only.

ALERTS = (('skyline', 'smtp', 1800), ('skyline_test.alerters.test', 'smtp|http_alerter-mock_api_alerter_receiver', 1), ('horizon.test.udp', 'smtp|http_alerter-mock_api_alerter_receiver', 1), ('horizon.test.pickle', 'smtp|http_alerter-mock_api_alerter_receiver', 1), ('skyline_test.alerters.test', 'slack', 1))
Variables:

ALERTS (tuples) – This enables analyzer alerting and defines metrics to analyse with Mirage [USER_DEFINED]

This is the config for which metrics to alert on and which strategy to use for each. Alerts will not fire twice within EXPIRATION_TIME, even if they trigger again. NOTE any metrics you want to be analysed by Mirage must be covered by an smtp alert and must have a SECOND_ORDER_RESOLUTION_HOURS defined in the alert. NOTE smtp alerts must be declared first; all other alert tuples MUST be declared AFTER smtp alert tuples.

  • Tuple schema example:

    ALERTS = (
        # ('<metric_namespace>', '<alerter>', EXPIRATION_TIME, SECOND_ORDER_RESOLUTION_HOURS),
        # With SECOND_ORDER_RESOLUTION_HOURS being optional for analysing metrics with Mirage
        ('metric1', 'smtp', 1800),
        ('important_metric.total', 'smtp', 600),
        ('important_metric.total', 'pagerduty', 1800),
        # Log all anomalies to syslog
        ('stats.', 'syslog', 1),
        # Wildcard namespaces can be used as well
        ('metric4.thing.*.requests', 'smtp', 900),
        # However beware of wildcards, as the above wildcard should really be
        ('metric4\.thing\..*\.requests', 'smtp', 900),
        # mirage - SECOND_ORDER_RESOLUTION_HOURS - if added and Mirage is enabled
        ('metric5.thing.*.rpm', 'smtp', 900, 168),
        ('org_website.status_code.500', 'smtp', 1800),
        # NOTE: all other alert tuples MUST be declared AFTER smtp alert tuples
        ('metric3', 'slack', 600),
        ('stats.', 'http_alerter_external_endpoint', 30),
        # Send SMS alert via AWS SNS and to slack
        ('org_website.status_code.500', 'sms', 1800),
        ('org_website.status_code.500', 'slack', 1800),
    )
    
  • Alert tuple parameters are:

Parameters:
  • metric (str) – metric name or pattern.

  • alerter (str) – the alerter name e.g. smtp, syslog, slack, pagerduty, http_alerter_<name> or sms.

  • EXPIRATION_TIME (int) – Alerts will not fire twice within this amount of seconds, even if they trigger again.

  • SECOND_ORDER_RESOLUTION_HOURS (int) – (optional) The number of hours that Mirage should surface the metric timeseries for when being analysed. Adding this is what enables metrics to be sent to Mirage for analysis of a longer timeframe.

Note

Consider using the default skyline_test.alerters.test for testing alerts with.

Note

All other alerts must be declared AFTER smtp alerts as other alerts rely on the smtp resources.

EXTERNAL_ALERTS = {}
Variables:

EXTERNAL_ALERTS (dict) – ADVANCED FEATURE - Skyline can get json alert configs from external sources.

See the External alerts documentation for the elements that are required in a json alert config and how external alerts are applied. https://earthgecko-skyline.readthedocs.io/en/latest/alerts.html#external-alert-configs

  • Example:

    EXTERNAL_ALERTS = {
        'test_alert_config': {
            'url': 'http://127.0.0.1:1500/mock_api?test_alert_config',
            'method': 'GET',
        },
    }
    
DO_NOT_ALERT_ON_STALE_METRICS = ['skyline.ionosphere.feature_calculation_time', 'skyline.mirage.run_time', 'otel.traces']
Variables:

DO_NOT_ALERT_ON_STALE_METRICS (list) – Metrics to not digest alert on if they are becoming stale.

These are metrics that you do not want a Skyline stale digest alert on. This works in the same way that SKIP_LIST does; it matches in the string or dotted namespace elements.

PLOT_REDIS_DATA = True
Variables:

PLOT_REDIS_DATA (boolean) – Plot a graph using Redis timeseries data with Analyzer alerts.

  • There are times when Analyzer alerts have no data in the Graphite graphs and/or the data in the Graphite graph is skewed due to retentions aggregation. This mitigates that by creating a graph using the Redis timeseries data and embedding the image in the Analyzer alerts as well.

Note

The Redis data plot includes the following additional information: the 3sigma upper (and if applicable lower) bounds and the mean are plotted and reported too. Although less is often more effective, in this case a visualisation of the 3sigma boundaries is informative.

MONOTONIC_METRIC_NAMESPACES = []
Variables:

MONOTONIC_METRIC_NAMESPACES (list) – You can declare any metric namespaces as strictly increasing monotonic metrics and force analysis to always calculate the derivative values for matched metrics. A strictly increasing monotonic metric is one with a count value that always increases and occasionally resets to 0, for example when a service is restarted.

Skyline by default automatically converts strictly increasing monotonic metric values to their derivative values by calculating the delta between subsequent datapoints. The function ignores datapoints that trend down. This is useful for metrics that increase over time and then reset.

Although strictly increasing monotonic metrics are automatically determined, sometimes these metrics do not change very often and remain static for long periods, which means they may not be automatically classified as monotonic when a change occurs. You can specifically declare any namespaces that you know to have strictly increasing monotonicity so that they are always analysed at their derivative values. This list is used with matched_or_regexed_in_list and it matches in the string, the dotted namespace elements or a regex pattern.
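
  • As an illustration of the derivative conversion described above, a minimal sketch that calculates the delta between subsequent datapoints and skips counter resets - illustrative only, not Skyline's actual implementation:

    # Minimal sketch of derivative conversion for a strictly increasing
    # monotonic metric - illustrative, not Skyline's actual implementation
    def nonnegative_derivative(timeseries):
        """Convert (timestamp, count) datapoints to per-interval deltas,
        ignoring datapoints that trend down (e.g. a counter reset to 0)."""
        derivative = []
        last_value = None
        for timestamp, value in timeseries:
            if last_value is not None:
                delta = value - last_value
                # a negative delta indicates a counter reset, skip it
                if delta >= 0:
                    derivative.append((timestamp, delta))
            last_value = value
        return derivative

    # a counter that resets to 0 after the third datapoint
    print(nonnegative_derivative([(60, 10), (120, 15), (180, 21), (240, 0), (300, 4)]))
    # [(120, 5), (180, 6), (300, 4)]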

NON_DERIVATIVE_MONOTONIC_METRICS = []
Variables:

NON_DERIVATIVE_MONOTONIC_METRICS (list) – Strictly increasing monotonic metrics not to calculate the derivative values for, OR metrics that may at times exhibit strictly increasing monotonicity but are not strictly increasing monotonic metrics.

Skyline by default automatically converts strictly increasing monotonic metric values to their derivative values by calculating the delta between subsequent datapoints. The function ignores datapoints that trend down. This is useful for metrics that increase over time and then reset.

Any strictly increasing monotonic metrics that you do not want Skyline to convert to their derivative values are declared here. Complete metric names are required.

ZERO_FILL_NAMESPACES = []
Variables:

ZERO_FILL_NAMESPACES (list) – A list of metric namespaces that should be analysed with 0s filling missing data points.

This is similar to settings.LAST_KNOWN_VALUE_NAMESPACES below and the following description of problems related to sparsely populated metrics is applicable to both.

Some metrics are very sparsely populated and only send data infrequently. Sparsely populated metrics are more difficult to use because the amount of data points present in any given period can vary significantly. This can limit certain functions in the analysis process, so where appropriate Skyline can 0 fill missing data points.

An example of a type of metric suited to 0 filling is a page view metric that is only submitted when the page is viewed. If the page in question is only viewed once or twice a day, or a few times a week, the metric might have, say, 5 data points for an entire week. In terms of training and analysis there is not sufficient data there; however, if that metric is 0 filled at runtime there is a fully populated data set. There are many cases where instrumentation or telemetry is only sent if an event occurs, so this allows Skyline to handle and work with very sparsely populated data.

In Graphite the raw data for these metrics will still display as sparsely populated, but within Skyline contexts the data and graphs shown will be filled, as Skyline uses the Graphite transformNull function and a similar function in analysis.

Metrics that are declared in settings.MONOTONIC_METRIC_NAMESPACES should not be declared in ZERO_FILL_NAMESPACES.

Always look at your metrics and apply the transforms in Graphite to determine whether the desired outcomes will be achieved.
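
  • A minimal sketch of the 0 filling idea, assuming a fixed 60 second resolution - illustrative only, not the actual runtime implementation:

    # Minimal sketch of 0 filling a sparsely populated timeseries at a
    # fixed resolution - illustrative only, assuming 60 second datapoints
    def zero_fill(timeseries, resolution=60):
        """Fill any missing intervals between the first and last
        timestamps with 0 values."""
        if not timeseries:
            return []
        present = dict(timeseries)
        start = int(timeseries[0][0])
        end = int(timeseries[-1][0])
        return [(ts, present.get(ts, 0)) for ts in range(start, end + resolution, resolution)]

    print(zero_fill([(60, 3), (300, 1)]))
    # [(60, 3), (120, 0), (180, 0), (240, 0), (300, 1)]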

LAST_KNOWN_VALUE_NAMESPACES = ['otel.traces']
Variables:

LAST_KNOWN_VALUE_NAMESPACES (list) – A list of metric namespaces that should be analysed filling missing data points with the value of the last data point.

This is similar to settings.ZERO_FILL_NAMESPACES above, and the same description of the problems with sparsely populated metrics applies here. Please read the entire settings.ZERO_FILL_NAMESPACES docstring above.

An example of a type of metric suited to last known value filling is a monotonically increasing metric that does not submit a data point at every interval - for example laptop or desktop metrics, where the machine is suspended and count metrics pause for a night or a weekend, then resume at the same incrementing count without being reset to 0, as they would be if a reboot occurred.

Metrics that are declared in settings.MONOTONIC_METRIC_NAMESPACES can be declared in LAST_KNOWN_VALUE_NAMESPACES.
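
  • A minimal sketch of last known value filling, again assuming a fixed 60 second resolution - illustrative only:

    # Minimal sketch of last known value filling - illustrative only,
    # assuming 60 second datapoints
    def last_known_value_fill(timeseries, resolution=60):
        """Fill missing intervals with the value of the last datapoint seen."""
        if not timeseries:
            return []
        present = dict(timeseries)
        filled = []
        last_value = timeseries[0][1]
        for ts in range(int(timeseries[0][0]), int(timeseries[-1][0]) + resolution, resolution):
            last_value = present.get(ts, last_value)
            filled.append((ts, last_value))
        return filled

    print(last_known_value_fill([(60, 42), (300, 45)]))
    # [(60, 42), (120, 42), (180, 42), (240, 42), (300, 45)]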

SMTP_OPTS = {'default_recipient': ['you@your_domain.com'], 'embed-images': True, 'recipients': {'horizon.test': ['you@your_domain.com'], 'skyline': ['you@your_domain.com', 'them@your_domain.com'], 'skyline_test.alerters.test': ['you@your_domain.com']}, 'sender': 'skyline@your_domain.com', 'smtp_server': {'host': '127.0.0.1', 'password': None, 'port': 25, 'ssl': False, 'user': None}}
Variables:

SMTP_OPTS (dictionary) – Your SMTP settings. [USER_DEFINED]

It is possible to set the email addresses to no_email if you do not wish to receive SMTP alerts, but smtp alerters are still required, see https://earthgecko-skyline.readthedocs.io/en/latest/alerts.html

Note

For each alert tuple defined in settings.ALERTS you need a recipient defined that matches the namespace. The default_recipient acts as a catchall for any alert tuple that does not have a matching recipient defined.
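
  • For readability, the default value shown above expanded into the form you would declare in settings.py (the addresses are placeholders):

    SMTP_OPTS = {
        'sender': 'skyline@your_domain.com',
        'recipients': {
            'skyline': ['you@your_domain.com', 'them@your_domain.com'],
            'skyline_test.alerters.test': ['you@your_domain.com'],
            'horizon.test': ['you@your_domain.com'],
        },
        # catchall for any alert tuple without a matching recipient
        'default_recipient': ['you@your_domain.com'],
        'embed-images': True,
        'smtp_server': {
            'host': '127.0.0.1',
            'port': 25,
            'ssl': False,
            'user': None,
            'password': None,
        },
    }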

HIPCHAT_OPTS = {'auth_token': 'hipchat_auth_token', 'color': 'purple', 'rooms': {'horizon.udp.test': (12345,), 'skyline': (12345,), 'skyline_test.alerters.test': (12345,)}, 'sender': 'hostname or identifier'}
Variables:

HIPCHAT_OPTS (dictionary) – [DEPRECATED] Your Hipchat settings.

HipChat alerts require python-simple-hipchat

SLACK_OPTS = {'bot_user_oauth_access_token': 'YOUR_slack_bot_user_oauth_access_token', 'channel_ids': {'#skyline': 'YOUR_slack_channel_id', '#testing': 'YOUR_slack_other_channel_id'}, 'channels': {'horizon.udp.test': ('#skyline', '#testing'), 'skyline': ('#skyline',), 'skyline_test.alerters.test': ('#skyline',)}, 'default_channel': 'YOUR_default_slack_channel', 'default_channel_id': 'YOUR_default_slack_channel_id', 'icon_emoji': ':chart_with_upwards_trend:', 'message_on_features_profile_created': True, 'message_on_features_profile_created_reaction_emoji': 'thumbsup', 'message_on_features_profile_disabled': True, 'message_on_features_profile_disabled_reaction_emoji': 'x', 'message_on_features_profile_learnt': True, 'message_on_features_profile_learnt_reaction_emoji': 'heavy_check_mark', 'message_on_training_data_viewed': True, 'message_on_training_data_viewed_reaction_emoji': 'eyes', 'message_on_validated_features_profiles': True, 'thread_updates': True}
Variables:

SLACK_OPTS (dictionary) – Your slack settings. [USER_DEFINED]

slack alerts require the slackclient package

PAGERDUTY_OPTS = {'auth_token': 'your_pagerduty_auth_token', 'key': 'your_pagerduty_service_api_key', 'subdomain': 'example'}
Variables:

PAGERDUTY_OPTS (dictionary) – Your PagerDuty settings. [USER_DEFINED]

PagerDuty alerts require the pygerduty package

SYSLOG_OPTS = {'ident': 'skyline', 'level': 'warn'}
Variables:

SYSLOG_OPTS (dictionary) – Your syslog settings.

syslog alerts require an ident; this adds a LOG_WARNING message to LOG_LOCAL4 by default, which will ship to any syslog or rsyslog down the line. The EXPIRATION_TIME for the syslog alert method should be set to 1 so that every anomaly is fired into the syslog. The level key can be set to warn (4), notice (5) or info (6).

HTTP_ALERTERS_OPTS = {}
Variables:

HTTP_ALERTERS_OPTS (dictionary) – External alert endpoints - ADVANCED FEATURE.

  • Dictionary example:

    HTTP_ALERTERS_OPTS = {
        'http_alerter-mock_api_alerter_receiver': {
            'enabled': True,
            'endpoint': 'http://127.0.0.1:1500/mock_api?alert_reciever',
            'token': None
        },
        'http_alerter-otherapp': {
            'enabled': False,
            'endpoint': 'https://other-http-alerter.example.org/alerts',
            'token': '1234567890abcdefg'
        },
    }
    

All http_alerter alert names must be prefixed with http_alerter followed by the name you want to assign to it. HTTP_ALERTERS_OPTS is used by Analyzer (in settings.ALERTS) and Boundary (in settings.BOUNDARY_METRICS) in conjunction with the http_alerter definitions there.

AWS_SNS_SMS_ALERTS_ENABLED = False
Variables:

AWS_SNS_SMS_ALERTS_ENABLED (boolean) – Enables SMS alerting via AWS SNS. If this is set to True, settings.AWS_OPTS and boto3 must be configured [USER_DEFINED]

Skyline by default just uses the boto3 configuration, see settings.AWS_OPTS below.

AWS_OPTS = {'use_boto3_defaults': True}
Variables:

AWS_OPTS (dictionary) – Your AWS settings. [USER_DEFINED]

For SMS alerts via AWS SNS using boto3. Skyline by default just uses the boto3 configuration, which should be configured as normal; internally boto3 uses ~/.aws/credentials and ~/.aws/config. If you run Skyline as the skyline user with /opt/skyline as its $HOME directory, then boto3 will expect these files at /opt/skyline/.aws/credentials and /opt/skyline/.aws/config. For documentation on configuring AWS SNS to send SMS and on AWS users/IAM users and permissions, see the AWS docs.

SMS_ALERT_OPTS = {'namespaces': {}, 'recipients': {}}
Variables:

SMS_ALERT_OPTS (dict) – Define recipients and namespaces to route SMS alerts to. Both settings.AWS_SNS_SMS_ALERTS_ENABLED and settings.AWS_OPTS must be enabled and defined for SMS alerting. [USER_DEFINED]

SMS alerts require settings.AWS_OPTS to be set and are routed via AWS SNS.

  • Example:

    SMS_ALERT_OPTS = {
        'recipients': {
            'alice': '+1098765432',
            'bob': '+12345678901',
            'pager': '+11111111111',
        },
        'namespaces': {
            'org_website.status_code.500': ['pager'],
            'disk.used_percent': ['pager', 'alice'],
            'skyline.analyzer.runtime': ['pager', 'bob']
        }
    }
    
CUSTOM_ALERT_OPTS = {'analyzer_alert_heading': 'Analyzer', 'append_environment': '', 'boundary_alert_heading': 'Boundary', 'ionosphere_alert_heading': 'Ionosphere', 'ionosphere_link_path': 'ionosphere', 'main_alert_title': 'Skyline', 'mirage_alert_heading': 'Mirage'}
Variables:

CUSTOM_ALERT_OPTS (dictionary) – Any custom alert headings you want to use

Here you can specify any custom alert titles and headings you want for each alerting app. You also can use the append_environment option to append the environment from which the alert originated.
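
  • Example (the append_environment value here is illustrative):

    CUSTOM_ALERT_OPTS = {
        'main_alert_title': 'Skyline',
        'analyzer_alert_heading': 'Analyzer',
        'mirage_alert_heading': 'Mirage',
        'boundary_alert_heading': 'Boundary',
        'ionosphere_alert_heading': 'Ionosphere',
        'ionosphere_link_path': 'ionosphere',
        # appended to alerts so you can tell which environment fired them
        'append_environment': 'production-eu-1',
    }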

WORKER_PROCESSES = 2
Variables:

WORKER_PROCESSES (int) – This is the number of worker processes that will consume from the Horizon queue.

HORIZON_IP = 'YOUR_SKYLINE_INSTANCE_IP_ADDRESS'
Variables:

HORIZON_IP (str) – The IP address for Horizon to bind to. Skyline receives data from Graphite on this IP address. This previously defaulted to gethostname() but has been changed so that it is specifically set by the user. [USER_DEFINED]

PICKLE_PORT = 2024
Variables:

PICKLE_PORT (int) – This is the port that listens for Graphite pickles over TCP, sent by Graphite’s carbon-relay agent.

UDP_PORT = 2025
Variables:

UDP_PORT (int) – This is the port that listens for Messagepack-encoded UDP packets.

CHUNK_SIZE = 10
Variables:

CHUNK_SIZE (int) – This is how big a ‘chunk’ of metrics will be before they are added onto the shared queue for processing into Redis.

  • If you are noticing that Horizon is having trouble consuming metrics, try setting this value a bit higher.

MAX_QUEUE_SIZE = 50000
Variables:

MAX_QUEUE_SIZE (int) – Maximum allowable length of the processing queue

This is the maximum allowable length of the processing queue before new chunks are prevented from being added. If you consistently fill up the processing queue, a higher MAX_QUEUE_SIZE will not save you. It most likely means that the workers do not have enough CPU allotted to process the queue in time, or that there is too much I/O wait on the system. Try increasing settings.CHUNK_SIZE and decreasing settings.ANALYZER_PROCESSES or decreasing settings.ROOMBA_PROCESSES

ROOMBA_PROCESSES = 1
Variables:

ROOMBA_PROCESSES (int) – This is the number of Roomba processes that will be spawned by Horizon to trim timeseries in order to keep them at settings.FULL_DURATION. Keep this number small, as it is not important that metrics be exactly settings.FULL_DURATION all the time.

ROOMBA_GRACE_TIME = 600
Variables:

ROOMBA_GRACE_TIME (int) – Seconds of grace

Normally Roomba will clean up everything that is older than settings.FULL_DURATION. If you have metrics that are not coming in every second, you can end up with INCOMPLETE metrics. With this setting Roomba will clean up everything that is older than settings.FULL_DURATION + settings.ROOMBA_GRACE_TIME
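
  • A minimal sketch of the trim boundary this implies - illustrative only, Roomba's actual implementation operates on Redis:

    import time

    # illustrative only - Roomba's actual implementation operates on Redis
    FULL_DURATION = 86400
    ROOMBA_GRACE_TIME = 600

    def trim(timeseries):
        """Drop datapoints older than FULL_DURATION + ROOMBA_GRACE_TIME."""
        oldest_allowed = time.time() - (FULL_DURATION + ROOMBA_GRACE_TIME)
        return [(ts, value) for ts, value in timeseries if ts >= oldest_allowed]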

ROOMBA_OPTIMUM_RUN_DURATION = 60
Variables:

ROOMBA_OPTIMUM_RUN_DURATION (int) – Timeout in seconds

This is how often Horizon should run Roomba to prune the timeseries data in Redis to settings.FULL_DURATION. It allows Roomba to be tuned under heavy iowait conditions, where the default 60 seconds can result in sustained CPU and IO on Redis and the Horizon thread. Being able to adjust this to, say, 300 allows for a reduction in IO under these conditions. Changing this value can have an impact on Ionosphere echo features profiles. ADVANCED FEATURE - if you need to use this you have a problem on your host.

ROOMBA_TIMEOUT = 100
Variables:

ROOMBA_TIMEOUT (int) – Timeout in seconds

This is the number of seconds that a Roomba process can be expected to run for before it is terminated. Roomba should in general be expected to have run within 100 seconds. Roomba runs in a multiprocessing subprocess; however, certain conditions can arise that cause Roomba to stall, I/O wait being one such edge case. Although 99.999% of the time Roomba is fine, this ensures that no Roombas hang around longer than expected.

ROOMBA_DO_NOT_PROCESS_BATCH_METRICS = False
Variables:

ROOMBA_DO_NOT_PROCESS_BATCH_METRICS (boolean) – Whether Horizon roomba should euthanize batch processing metrics.

This should be left as False unless you are backfilling batch metrics and do not want roomba removing data points before analyzer_batch has analyzed them. If this is set to True, analyzer_batch euthanizes batch metrics.

ROOMBA_BATCH_METRICS_CUSTOM_DURATIONS = []
Variables:

ROOMBA_BATCH_METRICS_CUSTOM_DURATIONS (list) – A list of lists of namespaces and custom durations for batch metrics. Advanced feature for development and testing.

This allows for testing metrics via analyzer_batch with a different FULL_DURATION. It is only applied if settings.ROOMBA_DO_NOT_PROCESS_BATCH_METRICS is set to True. It allows for a metric to be fed to Skyline with historical data that is not aligned with the 1 data point per 60 seconds paradigm and has a greater duration than settings.FULL_DURATION - 1 data point per 10 mins for example - allowing analyzer_batch to fake Mirage analysis for historical data. analyzer_batch roomba will use this setting as the euthanize-older-than value if the metric name is in a metric namespace found in this list.

  • List example:

    ROOMBA_BATCH_METRICS_CUSTOM_DURATIONS = [
        ['test.app6.requests.10minutely', 604800],
    ]
    
BATCH_METRICS_CUSTOM_FULL_DURATIONS = {}
Variables:

BATCH_METRICS_CUSTOM_FULL_DURATIONS (dict) – This is ONLY applicable to metrics declared in settings.ROOMBA_BATCH_METRICS_CUSTOM_DURATIONS that are being back filled. Metrics that are being backfilled and not managed by roomba, but managed by analyzer_batch may have a very long custom duration period, but may have a shorter FULL_DURATION period. In order to backfill and analyse at the correct full duration for the metric, the metric full duration for analysis (not for analyzer batch roomba) can be declared here.

  • dict example:

    BATCH_METRICS_CUSTOM_FULL_DURATIONS = {
        'test.app6.requests.10minutely': 259200,
    }
    
MAX_RESOLUTION = 1000
Variables:

MAX_RESOLUTION (int) – The Horizon agent will ignore incoming datapoints if their timestamp is older than MAX_RESOLUTION seconds ago.

HORIZON_SHARDS = {}
Variables:

HORIZON_SHARDS (dict) – ADVANCED FEATURE - A dictionary of Skyline hostnames and their assigned shard values.

This setting is only applicable when running Skyline Horizon services on multiple servers (and Graphite instances) in a replicated fashion. It allows all the Skyline servers to receive all metrics but only analyse those metrics that are assigned to the specific server (shard). This enables all Skyline servers that are running Horizon to receive the entire metric population stream from multiple Graphite carbon-relays and drop (not submit to their Redis instance) any metrics that do not belong to their shard. Related settings are settings.REMOTE_SKYLINE_INSTANCES and settings.SYNC_CLUSTER_FILES

  • Example:

    HORIZON_SHARDS = {
        'skyline-server-1': 0,
        'skyline-server-2': 1,
        'skyline-server-3': 2,
    }
    

Shards are 0 indexed.
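
A minimal sketch of how a metric could be deterministically assigned to a shard, assuming a simple hash-mod scheme - Skyline's actual assignment function may differ:

    import zlib

    # illustrative sketch of shard assignment using a simple hash-mod
    # scheme - Skyline's actual assignment function may differ
    HORIZON_SHARDS = {
        'skyline-server-1': 0,
        'skyline-server-2': 1,
        'skyline-server-3': 2,
    }

    def assigned_shard(metric_name):
        """Deterministically map a metric name to a shard number."""
        return zlib.adler32(metric_name.encode()) % len(HORIZON_SHARDS)

    def belongs_to_this_host(metric_name, hostname):
        """True if this Horizon instance should submit the metric to Redis."""
        return assigned_shard(metric_name) == HORIZON_SHARDS[hostname]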

HORIZON_SHARD_PICKLE_PORT = 2026
Variables:

HORIZON_SHARD_PICKLE_PORT (int) – ADVANCED FEATURE - This is the port that listens for Graphite pickles over TCP, sent by Graphite’s carbon-relay-b agent. When running Skyline clustered with multiple Horizon instances, an additional Graphite carbon-relay-b instance is required to run on the remote Graphite servers to forward metrics on to the remote Horizons. See https://earthgecko-skyline.readthedocs.io/en/latest/horizon.html#horizon-shards

HORIZON_SHARD_DEBUG = False
Variables:

HORIZON_SHARD_DEBUG (boolean) – For development only to log some sharding debug info not for general use.

SYNC_CLUSTER_FILES = False
Variables:

SYNC_CLUSTER_FILES (boolean) – ADVANCED FEATURE - If Skyline is running in a clustered configuration, the settings.REMOTE_SKYLINE_INSTANCES can sync Ionosphere training data and features_profiles dirs and files between each other. This allows the relevant Ionosphere data to be distributed to each instance in the cluster, so that if an instance fails or is removed, all the nodes have that node's data. It also allows an instance added to the cluster to self populate. This self population is rate limited so that a new instance will not thunder against the other instances in the cluster to populate itself as quickly as possible; rather, eventual consistency is achieved. Related settings are settings.REMOTE_SKYLINE_INSTANCES, settings.HORIZON_SHARDS, settings.IONOSPHERE_DATA_FOLDER and settings.IONOSPHERE_PROFILES_FOLDER

SKIP_LIST = ['skyline.analyzer.', 'skyline.boundary.', 'skyline.ionosphere.', 'skyline.mirage.']
Variables:

SKIP_LIST (list) – Metrics to skip

These are metrics that, for whatever reason, you do not want to analyze in Skyline. The Worker will check whether each incoming metric contains anything in the skip list. It is generally wise to skip entire namespaces by adding a ‘.’ at the end of the skipped item - otherwise you might skip things you do not intend to. For example the default skyline.analyzer.anomaly_breakdown. MUST be skipped to prevent crazy feedback.

SKIP_LIST items are also matched against the dotted namespace elements; if a match is not found in the string, the dotted elements are compared. For example, if an item such as ‘skyline.analyzer.algorithm_breakdown’ were added, it would match any metric that matches all 3 dotted namespace elements, so it would match:

skyline.analyzer.skyline-1.algorithm_breakdown.histogram_bins.timing.median_time
skyline.analyzer.skyline-1.algorithm_breakdown.histogram_bins.timing.times_run
skyline.analyzer.skyline-1.algorithm_breakdown.ks_test.timing.times_run
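
  • A minimal sketch of the two-stage matching described above (substring first, then dotted element comparison) - illustrative only, not Skyline's actual matching function:

    # illustrative sketch of SKIP_LIST style matching - substring first,
    # then dotted namespace element comparison; not Skyline's actual code
    def matches_skip_list(metric, skip_list):
        for item in skip_list:
            # 1. simple substring match
            if item in metric:
                return True
            # 2. dotted element match - every element of the item must be
            # present in the metric's dotted elements
            item_elements = [e for e in item.split('.') if e]
            metric_elements = metric.split('.')
            if item_elements and all(e in metric_elements for e in item_elements):
                return True
        return False

    SKIP_LIST = ['skyline.analyzer.algorithm_breakdown']
    print(matches_skip_list(
        'skyline.analyzer.skyline-1.algorithm_breakdown.histogram_bins.timing.median_time',
        SKIP_LIST))  # True - all 3 dotted elements are present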

DO_NOT_SKIP_LIST = ['skyline.analyzer.run_time', 'skyline.boundary.run_time', 'skyline.analyzer.ionosphere_metrics', 'skyline.analyzer.mirage_metrics', 'skyline.analyzer.total_analyzed', 'skyline.analyzer.total_anomalies', 'skyline.analyzer.anomalous', 'skyline.analyzer.metrics_sparsity', 'skyline.exceptions', 'skyline.mirage.checks', 'skyline.logged_errors', 'skyline.mirage.run_time', 'skyline.ionosphere.features_calculation_time', 'skyline.analyzer.labelled_metrics.anomalous', 'skyline.analyzer.labelled_metrics.checked', 'skyline.horizon.prometheus.flux_received']
Variables:

DO_NOT_SKIP_LIST (list) – Metrics not to skip

These are metrics that you want Skyline to analyze even if they match a namespace in the SKIP_LIST. This works in the same way that SKIP_LIST does; it matches in the string or dotted namespace elements.

THUNDER_ENABLED = True
Variables:

THUNDER_ENABLED (boolean) – Enable Thunder. Thunder monitors the internals and operational changes on the Skyline apps and metrics. Although the user can monitor Skyline metrics with Skyline itself, Thunder is for convenience, so that Skyline has some default monitoring of important Skyline metrics and operations without the user having to add specific alert tuples on Skyline metrics. However this does not stop users from adding their own alerts on their own Skyline and other metrics once they have the system up and running and get to know it. WORK IN PROGRESS

THUNDER_CHECKS = {'analyzer': {'run_time': {'after_overruns': 5, 'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': True}, 'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': True}}, 'analyzer_batch': {'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': False}}, 'boundary': {'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': False}}, 'flux.listen': {'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': False}}, 'flux.worker': {'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': False}}, 'horizon': {'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': True}, 'worker.metrics_received': {'run': False}}, 'ionosphere': {'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': True}}, 'luminosity': {'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': True}}, 'mirage': {'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': True}}, 'panorama': {'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': True}}, 'redis': {'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': True}}, 'vista.fetcher': {'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': False}}, 'vista.worker': {'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': False}}, 'webapp': {'up': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 900, 'run': True}, 'webapp_features_profile': {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'expiry': 300, 'run': True}}}
Variables:

THUNDER_CHECKS (dict) – Enable/disable the checks, configure alert expiry times and define alert_via channels per Skyline app and check. If no alert_via_ items are defined the default is to alert_via_smtp for all thunder checks. Think of Thunder checks as built-in Skyline ALERTS and BOUNDARY_METRICS checks specifically for your Skyline instance metrics and external applications, without you having to know which Skyline metrics and things to watch, and which values to configure. WORK IN PROGRESS [USER_DEFINED]
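
  • For example, starting from the default THUNDER_CHECKS above, you could route the Analyzer up check via slack as well and enable the boundary up check (all keys shown exist in the default above):

    THUNDER_CHECKS['analyzer']['up']['alert_via_slack'] = True
    THUNDER_CHECKS['boundary']['up']['run'] = True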

THUNDER_OPTS = {'alert_via_pagerduty': False, 'alert_via_slack': False, 'alert_via_smtp': True, 'slack_channel': '#skyline', 'smtp_recipients': ['you@your_domain.com', 'them@your_domain.com']}
Variables:

THUNDER_OPTS (dict) – Thunder can alert via any combination of the following routes: SMTP, Slack and PagerDuty. These settings are very similar to the Analyzer and Boundary alert related settings and the values may be the same. However Thunder alerts are meant for the Skyline administrator/s, whereas Analyzer and Boundary related alerts can be routed to many different parties. [USER_DEFINED]

PANORAMA_ENABLED = True
Variables:

PANORAMA_ENABLED (boolean) – Enable Panorama [USER_DEFINED]

PANORAMA_PROCESSES = 1
Variables:

PANORAMA_PROCESSES (int) – Number of processes to assign to Panorama; you should never need more than 1

ENABLE_PANORAMA_DEBUG = False
Variables:

ENABLE_PANORAMA_DEBUG (boolean) – DEVELOPMENT only - enables additional debug logging useful for development only; this should definitely be set to False on a production system as it produces LOTS of output

PANORAMA_DATABASE = 'skyline'
Variables:

PANORAMA_DATABASE (str) – The database schema name

PANORAMA_DBHOST = '127.0.0.1'
Variables:

PANORAMA_DBHOST (str) – The IP address or FQDN of the database server [USER_DEFINED]

PANORAMA_DBPORT = '3306'
Variables:

PANORAMA_DBPORT (str) – The port to connect to the database server on

PANORAMA_DBUSER = 'skyline'
Variables:

PANORAMA_DBUSER (str) – The database user

PANORAMA_DBUSERPASS = 'the_user_password'
Variables:

PANORAMA_DBUSERPASS (str) – The database user password [USER_DEFINED]

NUMBER_OF_ANOMALIES_TO_STORE_IN_PANORAMA = 0
Variables:

NUMBER_OF_ANOMALIES_TO_STORE_IN_PANORAMA (int) – The number of anomalies to store in the Panorama database; the default is 0, which means UNLIMITED. This does nothing currently.

PANORAMA_EXPIRY_TIME = 900
Variables:

PANORAMA_EXPIRY_TIME (int) – Panorama will only store one anomaly for a metric every PANORAMA_EXPIRY_TIME seconds.

  • This is the Panorama sample rate. Bear in mind that Panorama does not use the ALERTS time expiry keys or matching; Panorama records every anomaly, even if the metric is not in an alert tuple. Consider that a metric can, and often does, fire as anomalous every minute until it no longer is. A sketch of this rate limiting idea follows below.
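
  • A minimal sketch of this sample rate idea using a Redis key with a TTL, which is a common Skyline pattern - illustrative only, the cache key name is hypothetical and this is not necessarily Panorama's exact implementation:

    import redis

    # illustrative sketch of the PANORAMA_EXPIRY_TIME sample rate using a
    # Redis key with a TTL - the key name is hypothetical and this is not
    # necessarily Panorama's exact implementation
    PANORAMA_EXPIRY_TIME = 900
    redis_conn = redis.StrictRedis(host='127.0.0.1', port=6379)

    def should_record_anomaly(metric_name):
        """Only record one anomaly per metric every PANORAMA_EXPIRY_TIME seconds."""
        cache_key = 'panorama.last_anomaly.%s' % metric_name
        # nx=True means the set only succeeds if no unexpired key exists
        return bool(redis_conn.set(cache_key, 1, ex=PANORAMA_EXPIRY_TIME, nx=True))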

PANORAMA_CHECK_MAX_AGE = 300
Variables:

PANORAMA_CHECK_MAX_AGE (int) – Panorama will only process a check file if it is not older than PANORAMA_CHECK_MAX_AGE seconds. If it is set to 0 all checks are processed. This setting just ensures that if Panorama stalls for some hours and is restarted, the user can choose to discard older checks, and miss recording those anomalies, to prevent Panorama stampeding against MySQL if something went down and Panorama comes back online with lots of checks.

PANORAMA_CHECK_INTERVAL = 20
Variables:

PANORAMA_CHECK_INTERVAL (int) – How often (in seconds) Panorama will check for anomalies to add to the database. This allows you to configure Panorama to insert anomalies into the database every second if you wish; however in most cases every 20 seconds is sufficient. NOTE, however, that if SNAB_ENABLE is True this is automatically overridden to 1, as snab needs anomaly ids available asap. This can increase disk I/O until such a time as Panorama changes from check files to Redis.

PANORAMA_INSERT_METRICS_IMMEDIATELY = True
Variables:

PANORAMA_INSERT_METRICS_IMMEDIATELY (boolean) – By default Panorama will only insert a metric into the metrics database table when an anomaly is registered for the metric. To have Panorama insert every metric into the database as soon as it appears in unique_metrics, set this to True. This functionality is for development and testing purposes. With the addition of labelled_metrics in Skyline v4.0.0, the default for this value has become True. As of v4.0.0 this settings variable is no longer considered in Panorama; it now automatically checks for new metrics every 60 seconds.

MIRAGE_PROCESSES = 1
Variables:

MIRAGE_PROCESSES (int) – This is the number of processes that Skyline Mirage will spawn to process checks. If you find the Mirage checks.pending and checks.stale_discarded metrics are above 0 too often, increase this accordingly. Adding additional processes can increase CPU and load on both Skyline and Graphite.

MIRAGE_DATA_FOLDER = '/opt/skyline/mirage/data'
Variables:

MIRAGE_DATA_FOLDER (str) – This is the path for the Mirage data folder where timeseries data that has been surfaced will be written - absolute path

MIRAGE_ALGORITHMS = ['first_hour_average', 'mean_subtraction_cumulation', 'stddev_from_average', 'stddev_from_moving_average', 'least_squares', 'grubbs', 'histogram_bins', 'median_absolute_deviation', 'ks_test']
Variables:

MIRAGE_ALGORITHMS (array) – These are the algorithms that the Mirage will run.

To add a new algorithm, you must both define the algorithm in mirage/mirage_algorithms.py and add its name here.

MIRAGE_STALE_SECONDS = 120
Variables:

MIRAGE_STALE_SECONDS (int) – The number of seconds after which a check is considered stale and discarded.

MIRAGE_CONSENSUS = 6
Variables:

MIRAGE_CONSENSUS (int) – This is the number of algorithms that must return True before a metric is classified as anomalous.

MIRAGE_ENABLE_SECOND_ORDER = False
Variables:

MIRAGE_ENABLE_SECOND_ORDER (boolean) – This is to enable second order anomalies.

Warning

EXPERIMENTAL - This is an experimental feature, so it is turned off by default.

MIRAGE_ENABLE_ALERTS = False
Variables:

MIRAGE_ENABLE_ALERTS (boolean) – This enables Mirage alerting. [USER_DEFINED]

NEGATE_ANALYZER_ALERTS = False
Variables:

NEGATE_ANALYZER_ALERTS (boolean) – DEVELOPMENT only - negates Analyzer alerts

This enables Mirage to negate Analyzer alerts. Mirage will send out an alert for every anomaly that Analyzer sends to Mirage that is NOT anomalous at the SECOND_ORDER_RESOLUTION_HOURS, with a SECOND_ORDER_RESOLUTION_HOURS graph and the Analyzer settings.FULL_DURATION graph embedded. Mostly for testing and comparison of analysis at different time ranges and/or algorithms.

MIRAGE_CRUCIBLE_ENABLED = False
Variables:

MIRAGE_CRUCIBLE_ENABLED (boolean) – This enables Mirage to output to Crucible

This enables Mirage to send Crucible data, if this is set to True ensure that settings.CRUCIBLE_ENABLED is also set to True in the Crucible settings block.

Warning

Not recommended, it will make a LOT of data files in the settings.CRUCIBLE_DATA_FOLDER

MIRAGE_PERIODIC_CHECK = True
Variables:

MIRAGE_PERIODIC_CHECK (boolean) – This enables Mirage to periodically check metrics matching the namespaces in settings.MIRAGE_PERIODIC_CHECK_NAMESPACES at every settings.MIRAGE_PERIODIC_CHECK_INTERVAL. Mirage should only be configured to periodically analyse key metrics. For further in-depth details regarding Mirage periodic checks and their impact, please see the Mirage Periodic Checks documentation at: https://earthgecko-skyline.readthedocs.io/en/latest/mirage.html#periodic-checks Further settings.MIRAGE_ONLY_METRICS are handled by

MIRAGE_PERIODIC_CHECK_INTERVAL = 3600
Variables:

MIRAGE_PERIODIC_CHECK_INTERVAL (int) – This is the interval in seconds at which Mirage should analyse metrics matching the namespaces in settings.MIRAGE_PERIODIC_CHECK_NAMESPACES

MIRAGE_PERIODIC_CHECK_NAMESPACES = []
Variables:

MIRAGE_PERIODIC_CHECK_NAMESPACES (list) – Mirage metric namespaces to periodically check with Mirage, even if Analyzer does not find them anomalous. Analyzer will ensure that these Mirage metric namespaces are analyzed by Mirage every settings.MIRAGE_PERIODIC_CHECK_INTERVAL seconds. This works in the same way that settings.SKIP_LIST does; it matches in the string or the dotted namespace elements.

MIRAGE_ALWAYS_METRICS = []
Variables:

MIRAGE_ALWAYS_METRICS (list) – These are metrics you want to always be checked by Mirage, every minute and not just by Analyzer. For this to be in effect, you must ensure that MIRAGE_PERIODIC_CHECK is set to True. This allows for a use case where you want to apply a specific settings.CUSTOM_ALGORITHMS algorithm on a metric, all the time. The metrics declared here must be absolute metric names, no element matching or regex is applied.

MIRAGE_AUTOFILL_TOOSHORT = False
Variables:

MIRAGE_AUTOFILL_TOOSHORT (boolean) – This is a convenience feature that allows Analyzer to send metrics that are classed TooShort to Mirage and have Mirage attempt to fetch the data from Graphite and populate Redis with the timeseries data. This is useful if for some reason Graphite stopped pickling data to Horizon, or when Skyline is first run against a populated Graphite that already has metric data. Mirage will only try to fetch data from Graphite and populate Redis once per metric in FULL_DURATION, so that any remote metrics from Vista are not fetched over and over, as this feature only works with the GRAPHITE_HOST, not remote data sources.

MIRAGE_CHECK_REPETITIVE_DAILY_PEAKS = 3
Variables:

MIRAGE_CHECK_REPETITIVE_DAILY_PEAKS (int) – This is the number of peaks (in the same daily time window) that need to exist for a metric to be deemed to exhibit expected daily peaks. This defaults to 3, which is the best performing setting in general. If set to 0 this analysis is disabled. Setting it to 3 means there must be at least 3 peaks that occur around the same time every day for peaks to be expected around that time each day. If there are 3 such peaks in a 7 day period, Skyline determines whether an anomaly is actually a normal peak in an acceptable range that is experienced on a daily basis around the same time. Mirage will apply the anomalous_daily_peak custom algorithm ONLY if a metric is found to be anomalous, and will override the anomalous result if the current peak values are within the bounds of other peak values that occur in the same period daily. This is a very useful method for things that occur on a somewhat regular basis and identifies a large number of false positives on these types of metrics. It is only run if the metric has more than 5.25 days worth of data. A rough sketch of the idea follows below.
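
  • A rough sketch of the repetitive daily peak idea - bucketing peaks by hour of day and requiring MIRAGE_CHECK_REPETITIVE_DAILY_PEAKS distinct days with a peak in the same window. This is illustrative only and not the anomalous_daily_peak algorithm itself:

    from datetime import datetime, timezone

    # illustrative sketch only - not the anomalous_daily_peak algorithm
    MIRAGE_CHECK_REPETITIVE_DAILY_PEAKS = 3

    def expected_daily_peak(peak_timestamps, anomaly_timestamp):
        """True if peaks occurred in the anomaly's hour-of-day window on
        at least MIRAGE_CHECK_REPETITIVE_DAILY_PEAKS distinct days."""
        anomaly_hour = datetime.fromtimestamp(anomaly_timestamp, tz=timezone.utc).hour
        days_with_peak_in_window = {
            datetime.fromtimestamp(ts, tz=timezone.utc).date()
            for ts in peak_timestamps
            if datetime.fromtimestamp(ts, tz=timezone.utc).hour == anomaly_hour
        }
        return len(days_with_peak_in_window) >= MIRAGE_CHECK_REPETITIVE_DAILY_PEAKS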

MIRAGE_SKIP_IRREGULAR_UNSTABLE = []
Variables:

MIRAGE_SKIP_IRREGULAR_UNSTABLE (list) – This is a list of namespaces, metrics or regex patterns of metric names to exclude from being analysed as irregular and unstable timeseries. Irregular and unstable metrics display a very low variance value. Metrics which can generally display this type of behaviour are metrics related to errors. For example HTTP status code 50x metrics may experience a number of errors, once or twice a week and when analysed at 7 days any prominent spikes will probably be deemed as anomalous.

MIRAGE_LONG_DURATION_ALGORITHMS = {'algorithms': {'adtk_level_shift': {'algorithm_parameters': {'anomaly_window': 1, 'c': 9.9, 'realtime_analysis': False, 'return_results': True, 'window': 10}, 'outlier_value': 1}, 'isolation_forest': {'algorithm_parameters': {'anomaly_window': 6, 'contamination': 0.01, 'return_results': True}, 'outlier_value': -1}, 'lof': {'algorithm_parameters': {'anomaly_window': 6, 'n_neighbors': 20, 'return_results': True}, 'outlier_value': -1}, 'm66': {'algorithm_parameters': {'anomaly_window': 1, 'minimum_sparsity': 70, 'nth_median': 6, 'return_results': True, 'sigma': 6, 'window': 5}, 'outlier_value': 1}, 'one_class_svm': {'algorithm_parameters': {'anomaly_window': 1, 'return_results': True}, 'outlier_value': -1}, 'pca': {'algorithm_parameters': {'anomaly_window': 6, 'return_results': True, 'threshold': 0.7}, 'outlier_value': 0.7}, 'sigma': {'algorithm_parameters': {'anomaly_window': 6, 'consensus': 6, 'return_results': True, 'sigma_value': 6}, 'outlier_value': 1}, 'spectral_residual': {'algorithm_parameters': {'anomaly_window': 6, 'return_results': True}, 'outlier_value': 1}}, 'consensus': 5}
Variables:

MIRAGE_LONG_DURATION_ALGORITHMS (dict) – The algorithms for Mirage to run on long duration analysis. Algorithm runtimes on 30 days of data (4320 data points):

sigma (6-sigma, last 6 data points): 1.1062438488006592
spectral_residual: 0.12788629531860352
lof: 0.6984493732452393
isolation_forest: 1.3740370273590088
pca: 1.5815119743347168
one_class_svm: 4.011308908462524
m66: 0.18551158905029297
adtk_level_shift: 2.2769649028778076
adtk_persist: 0.9912087917327881
adtk_seasonal: 1.1874268054962158
adtk_volatility_shift: 1.0059242248535156

BOUNDARY_PROCESSES = 1
Variables:

BOUNDARY_PROCESSES (int) – The number of processes that Boundary should spawn.

Seeing as Boundary analysis is focused on specific metrics, this should be less than the number of settings.ANALYZER_PROCESSES.

BOUNDARY_OPTIMUM_RUN_DURATION = 60
Variables:

BOUNDARY_OPTIMUM_RUN_DURATION (int) – This is how many seconds it would be optimum for Boundary to be able to analyze your Boundary defined metrics in.

This largely depends on your metric resolution e.g. 1 datapoint per 60 seconds and how many metrics you are running through Boundary.

ENABLE_BOUNDARY_DEBUG = False
Variables:

ENABLE_BOUNDARY_DEBUG (boolean) – Enables Boundary debug logging

  • Enable additional debug logging - useful for development only; this should definitely be set to False on a production system as it produces LOTS of output

BOUNDARY_ALGORITHMS = ['detect_drop_off_cliff', 'greater_than', 'less_than']
Variables:

BOUNDARY_ALGORITHMS (array) – Algorithms that Boundary can run

  • These are the algorithms that boundary can run. To add a new algorithm, you must both define the algorithm in boundary_algorithms.py and add its name here.

BOUNDARY_ENABLE_ALERTS = False
Variables:

BOUNDARY_ENABLE_ALERTS (boolean) – Enables Boundary alerting

BOUNDARY_CRUCIBLE_ENABLED = False
Variables:

BOUNDARY_CRUCIBLE_ENABLED (boolean) – Enables and disables Boundary pushing data to Crucible

This enables Boundary to send Crucible data, if this is set to True ensure that settings.CRUCIBLE_ENABLED is also set to True in the Crucible settings block.

Warning

Not recommended, it will make a LOT of data files in the settings.CRUCIBLE_DATA_FOLDER

BOUNDARY_METRICS = (('skyline.logged_errors', 'greater_than', 3600, 0, 0, 1, 10, 'smtp'), ('skyline_test.alerters.test', 'greater_than', 1, 0, 0, 0, 1, 'smtp|http_alerter-mock_api_alerter_receiver'), ('skyline_test.alerters.test', 'detect_drop_off_cliff', 1800, 500, 3600, 0, 2, 'smtp|http_alerter-mock_api_alerter_receiver'), ('skyline_test.alerters.test', 'less_than', 3600, 0, 0, 15, 2, 'smtp|http_alerter-mock_api_alerter_receiver'), ('some_metric1', 'detect_drop_off_cliff', 1800, 500, 3600, 0, 2, 'smtp|slack|pagerduty'), ('some_metric2.either', 'less_than', 3600, 0, 0, 15, 2, 'smtp'), ('some_nometric.other', 'greater_than', 3600, 0, 0, 100000, 1, 'smtp'), ('some_nometric.another', 'greater_than', 3600, 0, 0, 100000, 1, 'http_alerter-external_endpoint'))
Variables:

BOUNDARY_METRICS (tuple) – definitions of metrics for Boundary to analyze

This is the config for metrics to analyse with the Boundary algorithms. It is advisable that you only specify high rate metrics and global metrics here. Although the algorithms should work with low rate metrics, the smaller the range, the smaller a cliff drop or change is, meaning more noise; however some algorithms are pre-tuned to use different trigger values on different ranges to pre-filter some noise.

  • Tuple schema:

    BOUNDARY_METRICS = (
        ('metric1', 'algorithm1', EXPIRATION_TIME, MIN_AVERAGE, MIN_AVERAGE_SECONDS, TRIGGER_VALUE, ALERT_THRESHOLD, 'ALERT_VIAS'),
        ('metric2', 'algorithm2', EXPIRATION_TIME, MIN_AVERAGE, MIN_AVERAGE_SECONDS, TRIGGER_VALUE, ALERT_THRESHOLD, 'ALERT_VIAS'),
        # Wildcard namespaces can be used as well
        ('metric.thing.*.requests', 'algorithm1', EXPIRATION_TIME, MIN_AVERAGE, MIN_AVERAGE_SECONDS, TRIGGER_VALUE, ALERT_THRESHOLD, 'ALERT_VIAS'),
    )
    
  • Metric parameters (all are required):

Parameters:
  • metric (str) – metric name or pattern.

  • algorithm (str) – algorithm name.

  • EXPIRATION_TIME (int) – Alerts will not fire twice within this amount of seconds, even if they trigger again.

  • MIN_AVERAGE (int) – the minimum average value to evaluate for boundary_algorithms.detect_drop_off_cliff(); this allows for tuning the algorithm to only check when certain conditions are present. When a drop_off_cliff event happens it is often sustained for a period, and it is also possible for a metric to recover slightly and then drop again. In real world situations this could be something like a network partition: everything drops and then recovers for a very brief period. During that brief recovery the metric could receive 10% of its normal expected data and drop again, thus firing again and again; drop off cliff events often evolve like this. The algorithm tuning therefore allows a metric to only be checked for having dropped off a cliff if it is behaving in an expected manner. To disable the tuning simply set MIN_AVERAGE and MIN_AVERAGE_SECONDS to 0 in boundary_algorithms.detect_drop_off_cliff(). In the boundary_algorithms.greater_than() and boundary_algorithms.less_than() algorithm contexts set this to 0.

  • MIN_AVERAGE_SECONDS (int) – the number of seconds over which to calculate the minimum average value in boundary_algorithms.detect_drop_off_cliff(). So if MIN_AVERAGE is set to 100 and MIN_AVERAGE_SECONDS to 3600, a metric will only be analysed if its average value over 3600 seconds is greater than 100. To disable the tuning in boundary_algorithms.detect_drop_off_cliff() set both MIN_AVERAGE and MIN_AVERAGE_SECONDS to 0. In the boundary_algorithms.greater_than() and boundary_algorithms.less_than() algorithm contexts set this to 0.

  • TRIGGER_VALUE (int) – the less_than or greater_than trigger value. Set this to 0 for boundary_algorithms.detect_drop_off_cliff()

  • ALERT_THRESHOLD (int) – alert after detected x times. This allows you to set how many times a timeseries has to be detected by the algorithm as anomalous before alerting on it. The nature of distributed metric collection, storage and analysis can have a lag every now and then due to latency, I/O pause, etc, and Boundary algorithms can, not unexpectedly, be sensitive to this. This setting should be 1, or 2 at maximum, to ensure that signals are not being suppressed. Start with 1; if you are getting the occasional false positive, try 2. Note - any boundary_algorithms.greater_than() metrics should have this as 1.

  • ALERT_VIAS (str) – pipe separated alerters to send to, valid options smtp, pagerduty, http_alerter_<name> and slack. This could therefore be ‘smtp’, ‘smtp|slack’, ‘pagerduty|slack’, etc

  • Wildcard and absolute metric paths. Currently the only supported metric namespaces are a parent namespace and an absolute metric path e.g.

  • Examples:

    ('stats_counts.someapp.things', 'detect_drop_off_cliff', 1800, 500, 3600, 0, 2, 'smtp'),
    ('stats_counts.someapp.things.an_important_thing.requests', 'detect_drop_off_cliff', 600, 100, 3600, 0, 2, 'smtp|pagerduty'),
    ('stats_counts.otherapp.things.*.requests', 'detect_drop_off_cliff', 600, 500, 3600, 0, 2, 'smtp|slack'),
    
  • In the above, all stats_counts.someapp.things* metrics would be painted with a 1800 EXPIRATION_TIME and 500 MIN_AVERAGE, but those values would be overridden by 600 and 100 for stats_counts.someapp.things.an_important_thing.requests, with pagerduty added.

BOUNDARY_AUTOAGGRERATION = False
Variables:

BOUNDARY_AUTOAGGRERATION (boolean) – Enables autoaggregation of a timeseries

This is used to autoaggregate a timeseries with autoaggregate_ts(). If a timeseries dataset has 6 data points per minute but only one data value every minute is required, then autoaggregate can be used to aggregate the required sample.

BOUNDARY_AUTOAGGRERATION_METRICS = (('nometrics.either', 60),)
Variables:

BOUNDARY_AUTOAGGRERATION_METRICS (tuples) – The namespaces to autoaggregate

  • Tuple schema example:

    BOUNDARY_AUTOAGGRERATION_METRICS = (
        ('stats_counts', AGGREGATION_VALUE),
        ('metric1', AGGREGATION_VALUE),
    )
    
  • Metric tuple parameters are:

Parameters:
  • metric (str) – metric name.

  • AGGREGATION_VALUE (int) – window to aggregate in seconds.

Declare the namespace and aggregation value in seconds by which you want the timeseries aggregated. To aggregate a timeseries to minutely values use 60 as the AGGREGATION_VALUE, e.g. sum metric datapoints by minute
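
  • A minimal sketch of aggregating a timeseries into fixed windows by summing, e.g. to minutely values - illustrative only, not the actual autoaggregate_ts() implementation:

    # illustrative sketch of autoaggregation by summing datapoints into
    # AGGREGATION_VALUE second buckets - not autoaggregate_ts() itself
    def autoaggregate(timeseries, aggregation_value=60):
        """Sum datapoint values into aggregation_value second buckets."""
        buckets = {}
        for timestamp, value in timeseries:
            bucket_ts = int(timestamp) - (int(timestamp) % aggregation_value)
            buckets[bucket_ts] = buckets.get(bucket_ts, 0) + value
        return sorted(buckets.items())

    # six datapoints in one minute become a single minutely sum
    print(autoaggregate([(60, 1), (70, 1), (80, 1), (90, 1), (100, 1), (110, 1)]))
    # [(60, 6)]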

BOUNDARY_ALERTER_OPTS = {'alerter_expiration_time': {'pagerduty': 1800, 'slack': 1800, 'smtp': 60}, 'alerter_limit': {'pagerduty': 15, 'slack': 30, 'smtp': 100}}
Variables:

BOUNDARY_ALERTER_OPTS (dictionary) – Your Boundary alerter settings.

Note

Boundary Alerting: because you may want to alert multiple channels on each metric and algorithm, Boundary has its own alerting settings, similar to Analyzer. However, due to the nature of Boundary and its algorithms, it could be VERY noisy and expensive if all your metrics dropped off a cliff. So Boundary introduces alerting with the ability to limit overall alerts to an alerter channel. These limits use the same methodology that the alerts use, but each alerter is keyed too.

BOUNDARY_SMTP_OPTS = {'default_recipient': ['you@your_domain.com'], 'embed-images': True, 'graphite_graph_line_color': 'pink', 'graphite_previous_hours': 7, 'recipients': {'nometrics': ['you@your_domain.com', 'them@your_domain.com'], 'nometrics.either': ['you@your_domain.com', 'another@some-company.com'], 'skyline_test.alerters.test': ['you@your_domain.com']}, 'sender': 'skyline-boundary@your_domain.com'}
Variables:

BOUNDARY_SMTP_OPTS (dictionary) – Your SMTP settings. [USER_DEFINED]

BOUNDARY_HIPCHAT_OPTS = {'auth_token': 'hipchat_auth_token', 'color': 'purple', 'graphite_graph_line_color': 'pink', 'graphite_previous_hours': 7, 'rooms': {'nometrics': (12345,), 'skyline_test.alerters.test': (12345,)}, 'sender': 'hostname or identifier'}
Variables:

BOUNDARY_HIPCHAT_OPTS (dictionary) – [DEPRECATED] Your Hipchat settings.

HipChat alerts require python-simple-hipchat

BOUNDARY_PAGERDUTY_OPTS = {'auth_token': 'your_pagerduty_auth_token', 'key': 'your_pagerduty_service_api_key', 'subdomain': 'example'}
Variables:

BOUNDARY_PAGERDUTY_OPTS (dictionary) – Your PagerDuty settings. [USER_DEFINED]

PagerDuty alerts require pygerduty

BOUNDARY_SLACK_OPTS = {'bot_user_oauth_access_token': 'YOUR_slack_bot_user_oauth_access_token', 'channels': {'skyline': ('#general',), 'skyline_test.alerters.test': ('#general',)}, 'icon_emoji': ':chart_with_upwards_trend:'}
Variables:

BOUNDARY_SLACK_OPTS (dictionary) – Your slack settings. [USER_DEFINED]

slack alerts require slackclient

ENABLE_CRUCIBLE = True
Variables:

ENABLE_CRUCIBLE (boolean) – Enable Crucible.

CRUCIBLE_PROCESSES = 1
Variables:

CRUCIBLE_PROCESSES (int) – The number of processes that Crucible should spawn.

CRUCIBLE_TESTS_TIMEOUT = 60
Variables:

CRUCIBLE_TESTS_TIMEOUT (int) – This is the number of seconds that Crucible tests can take. 60 is a reasonable default for a run with a settings.FULL_DURATION of 86400

ENABLE_CRUCIBLE_DEBUG = False
Variables:

ENABLE_CRUCIBLE_DEBUG (boolean) – DEVELOPMENT only - enables additional debug logging useful for development only, this should definitely be set to False on production system as LOTS of output

CRUCIBLE_DATA_FOLDER = '/opt/skyline/crucible/data'
Variables:

CRUCIBLE_DATA_FOLDER (str) – This is the path for the Crucible data folder where anomaly data for timeseries will be stored - absolute path

WEBAPP_SERVER = 'gunicorn'
Variables:

WEBAPP_SERVER (str) – Run the Webapp via gunicorn (recommended) or the Flask development server, set this to either 'gunicorn' or 'flask'. Flask is no longer supported.

WEBAPP_GUNICORN_WORKERS = 2
Variables:

WEBAPP_GUNICORN_WORKERS (int) – How many gunicorn workers to run for the webapp. The normal recommendation for gunicorn is generally between 2-4 workers per core; however on a machine with lots of cores this is probably over-provisioning for the webapp, depending on the load on the server. Since switching to the gevent worker_class there is no requirement for more than 2 workers.

WEBAPP_GUNICORN_BACKLOG = 2048
Variables:

WEBAPP_GUNICORN_BACKLOG (int) – The maximum number of pending connections. This refers to the number of clients that can be waiting to be served. Exceeding this number results in the client getting an error when attempting to connect. It should only affect servers under significant load. It must be a positive integer. Generally set in the 64-2048 range.

WEBAPP_IP = '127.0.0.1'
Variables:

WEBAPP_IP (str) – The IP address for the Webapp to bind to

WEBAPP_PORT = 1500
Variables:

WEBAPP_PORT (int) – The port for the Webapp to listen on; note that webapp_features_profile will also listen on 127.0.0.1 at WEBAPP_PORT + 1, e.g. 1501.

WEBAPP_AUTH_ENABLED = True
Variables:

WEBAPP_AUTH_ENABLED (boolean) – To enable pseudo basic HTTP auth

WEBAPP_AUTH_USER = 'admin'
Variables:

WEBAPP_AUTH_USER (str) – The username for pseudo basic HTTP auth [USER_DEFINED]

WEBAPP_AUTH_USER_PASSWORD = 'aec9ffb075f9443c8e8f23c4f2d06faa'
Variables:

WEBAPP_AUTH_USER_PASSWORD (str) – The user password for pseudo basic HTTP auth [USER_DEFINED]

WEBAPP_IP_RESTRICTED = True
Variables:

WEBAPP_IP_RESTRICTED (boolean) – To enable restricted access from IP address declared in settings.WEBAPP_ALLOWED_IPS

WEBAPP_ALLOWED_IPS = ['127.0.0.1']
Variables:

WEBAPP_ALLOWED_IPS (array) – The allowed IP addresses

WEBAPP_USER_TIMEZONE = True
Variables:

WEBAPP_USER_TIMEZONE (boolean) – This determines the user’s timezone and renders graphs with the user’s date values. If this is set to False the timezone in settings.WEBAPP_FIXED_TIMEZONE is used.

WEBAPP_FIXED_TIMEZONE = 'Etc/GMT+0'
Variables:

WEBAPP_FIXED_TIMEZONE (str) – You can specify a timezone you want the client browser to render graph dates and times in. This setting is only used if settings.WEBAPP_USER_TIMEZONE is set to False. This must be a valid momentjs timezone name, see: https://github.com/moment/moment-timezone/blob/develop/data/packed/latest.json

Note

Timezones, UTC and javascript Date: you only need to use the first element of the momentjs timezone string, for example ‘Europe/London’, ‘Etc/UTC’, ‘America/Los_Angeles’. Because the Webapp graphs using UTC data timestamps, you may want to display the graphs to users with a fixed timezone rather than the browser timezone, so that the Webapp graphs are the same in any location.

WEBAPP_ACCEPT_DATA_UPLOADS = False
Variables:

WEBAPP_ACCEPT_DATA_UPLOADS (boolean) – Enables the webapp to accept data uploads for Flux to process. This is related to settings.FLUX_PROCESS_UPLOADS and uploads are saved to settings.DATA_UPLOADS_PATH

WEBAPP_JAVASCRIPT_DEBUG = False
Variables:

WEBAPP_JAVASCRIPT_DEBUG (boolean) – Enables some javascript console.log output.

ENABLE_WEBAPP_DEBUG = False
Variables:

ENABLE_WEBAPP_DEBUG (boolean) – Enables some app specific debug logging.

WEBAPP_PREPROCESS_TIMESERIES = False
Variables:

WEBAPP_PREPROCESS_TIMESERIES (boolean) – Allow the timeseries to be aggregated by median or sum per minute so that the webapp can return a reasonable number of data points for dygraph to load and display in the browser without causing lag. This is achieved by aggregating the timeseries using either the median of values or the sum, as defined by settings.WEBAPP_PREPROCESS_AGGREGATE_BY. At the interval defined by settings. Not implemented - UNDER DEVELOPMENT

WEBAPP_PREPROCESS_AGGREGATE_BY = 'median'
Variables:

WEBAPP_PREPROCESS_AGGREGATE_BY (str) – The method by which to aggregate the timeseries. Valid strings are ‘median’ and ‘sum’. See settings.WEBAPP_PREPROCESS_TIMESERIES. Not implemented - UNDER DEVELOPMENT

IONOSPHERE_CHECK_PATH = '/opt/skyline/ionosphere/check'
Variables:

IONOSPHERE_CHECK_PATH (str) – This is the location where the Skyline apps will write anomaly check files to disk for Ionosphere to check - absolute path

IONOSPHERE_ENABLED = True
Variables:

IONOSPHERE_ENABLED (boolean) – Enable Ionosphere

IONOSPHERE_VERBOSE_LOGGING = False
Variables:

IONOSPHERE_VERBOSE_LOGGING (boolean) – As of Skyline 3.0, apps log notices and errors only. To have additional info logged set this to True. Useful for debugging, but less verbose than LOCAL_DEBUG.

IONOSPHERE_PROCESSES = 1
Variables:

IONOSPHERE_PROCESSES (int) – Number of processes to assign to Ionosphere; however, Ionosphere should never need more than 1 and it is effectively hard coded as such currently. This variable is only declared for the purpose of maintaining a standard set up in each module, and to possibly enable more than one process for Ionosphere in the future, should there be a requirement for Ionosphere to analyse metrics quicker. Running Ionosphere with more than one process is untested and currently it is hard coded to be 1 (https://github.com/earthgecko/skyline/issues/69)

IONOSPHERE_MAX_RUNTIME = 120
Variables:

IONOSPHERE_MAX_RUNTIME (int) – The maximum number of seconds an Ionosphere check should run for.

ENABLE_IONOSPHERE_DEBUG = False
Variables:

ENABLE_IONOSPHERE_DEBUG (boolean) – DEVELOPMENT only - enables additional debug logging useful for development only, this should definitely be set to False on production system as LOTS of output

IONOSPHERE_DATA_FOLDER = '/opt/skyline/ionosphere/data'
Variables:

IONOSPHERE_DATA_FOLDER (str) – This is the path for the Ionosphere data folder where anomaly data for timeseries and training will be stored - absolute path

IONOSPHERE_HISTORICAL_DATA_FOLDER = '/opt/skyline/ionosphere/historical_data'
Variables:

IONOSPHERE_HISTORICAL_DATA_FOLDER (str) – The absolute path for the Ionosphere historical data folder, where anomaly and training data for timeseries will be moved to when it reaches its purge age, as defined for the namespace in settings.IONOSPHERE_CUSTOM_KEEP_TRAINING_TIMESERIES_FOR. Unless you are feeding in and analysing historical data, this can generally be ignored; it is an ADVANCED FEATURE. Note that if you do use this and settings.IONOSPHERE_CUSTOM_KEEP_TRAINING_TIMESERIES_FOR, be advised that there is NO purge of this directory; it must be purged MANUALLY when you have completed your historical analysis and training.

IONOSPHERE_PROFILES_FOLDER = '/opt/skyline/ionosphere/features_profiles'
Variables:

IONOSPHERE_PROFILES_FOLDER (str) – This is the path for the Ionosphere features profiles folder where features profile data for timeseries will be stored - absolute path

IONOSPHERE_LEARN_FOLDER = '/opt/skyline/ionosphere/learn'
Variables:

IONOSPHERE_LEARN_FOLDER (str) – This is the path for the Ionosphere learning data folder where learning data for timeseries will be processed - absolute path

IONOSPHERE_CHECK_MAX_AGE = 300
Variables:

IONOSPHERE_CHECK_MAX_AGE (int) – Ionosphere will only process a check file if it is not older than IONOSPHERE_CHECK_MAX_AGE seconds. If it is set to 0, all checks are processed regardless of age. This setting ensures that if Ionosphere stalls for some hours and is restarted, the user can choose to discard older checks, and miss having those anomalies recorded, to prevent Ionosphere stampeding.

IONOSPHERE_KEEP_TRAINING_TIMESERIES_FOR = 259200
Variables:

IONOSPHERE_KEEP_TRAINING_TIMESERIES_FOR (int) – Ionosphere will keep timeseries data files for this long, for the operator to review and train on.

IONOSPHERE_CUSTOM_KEEP_TRAINING_TIMESERIES_FOR = []
Variables:

IONOSPHERE_CUSTOM_KEEP_TRAINING_TIMESERIES_FOR (list) – After settings.IONOSPHERE_KEEP_TRAINING_TIMESERIES_FOR has elapsed, Ionosphere will move training data for metric namespaces declared in this list to the settings.IONOSPHERE_HISTORICAL_DATA_FOLDER. The metric namespaces defined here are only matched on a simple substring match, NOT on elements and/or a regex. If the substring is in the metric name it will be moved. Unless you are feeding in and analysing historical data, this can generally be ignored; it is an advanced setting.
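
  • Example (a hypothetical sketch; the namespaces are illustrative and each entry is assumed to pair a namespace substring with a purge age in seconds, per the purge age description in settings.IONOSPHERE_HISTORICAL_DATA_FOLDER above):

    IONOSPHERE_CUSTOM_KEEP_TRAINING_TIMESERIES_FOR = [
        # [<namespace substring>, <keep training data for seconds>]
        ['historical.warehouse', 2592000],  # keep for 30 days
        ['backfill', 604800],               # keep for 7 days
    ]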

IONOSPHERE_MANAGE_PURGE = True
Variables:

IONOSPHERE_MANAGE_PURGE (boolean) – Ionosphere will manage purging the training_data and learn data directories. Under normal running conditions with an SSD drive this is perfectly acceptable, however if for some reason your file system is not purely SSD, or you are using a distributed file system that is not extremely fast, managing purging via Ionosphere can cause Ionosphere analysis to lag. This can be set to False and you can manage purging with your own script or method, and Ionosphere will not purge data.

IONOSPHERE_GRAPHITE_NOW_GRAPHS_OVERRIDE = False
Variables:

IONOSPHERE_GRAPHITE_NOW_GRAPHS_OVERRIDE (boolean) – By default Graphite NOW graphs in training_data are retrieved from Graphite when the training data is viewed. Graphite NOW graphs are generated using the current timestamp, plotting graphs at now - 7h, now - 24h, now - 7days and now - 30days; this can cause the actual anomalous period to not be shown in the Graphite NOW graphs if the training data is only loaded days later. This is especially true if metrics are batch processed with historic data. If this setting is set to True, Graphite NOW graphs will be loaded as Graphite THEN graphs, using the anomaly timestamp as now.

SKYLINE_URL = 'https://skyline.example.com'
Variables:

SKYLINE_URL (str) – The http or https URL (and port if required) to access your Skyline on (no trailing slash). For example, if you were not using SSL termination and listening on a specific port it could be like http://skyline.example.com:8080 [USER_DEFINED]

SERVER_PYTZ_TIMEZONE = 'UTC'
Variables:

SERVER_PYTZ_TIMEZONE (str) – You must specify a pytz timezone you want Ionosphere to use for the creation of features profiles and converting datetimes to UTC. This must be a valid pytz timezone name, see: https://github.com/earthgecko/skyline/blob/ionosphere/docs/development/pytz.rst http://earthgecko-skyline.readthedocs.io/en/ionosphere/development/pytz.html#timezones-list-for-pytz-version [USER_DEFINED]

IONOSPHERE_FEATURES_PERCENT_SIMILAR = 1.0
Variables:

IONOSPHERE_FEATURES_PERCENT_SIMILAR (float) – The percentage difference between a features profile sum and a calculated profile sum to result in a match.
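
  • An illustrative calculation of this comparison (a sketch only, not Skyline's actual code):

    fp_sum = 1000.0    # the stored features profile sum
    calc_sum = 1008.0  # the features sum calculated from the current time series
    percent_different = abs((calc_sum - fp_sum) / fp_sum) * 100  # 0.8
    # a match, as 0.8 <= IONOSPHERE_FEATURES_PERCENT_SIMILAR (1.0)
    matched = percent_different <= 1.0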

IONOSPHERE_MINMAX_SCALING_ENABLED = True
Variables:

IONOSPHERE_MINMAX_SCALING_ENABLED (boolean) – Implement Min-Max scaling on features profile time series and an anomalous time series if the features profile sums do not match. This adds a form of standardization that significantly improves the Ionosphere features sum comparison technique of high range metrics within the IONOSPHERE_MINMAX_SCALING_RANGE_TOLERANCE boundaries.

IONOSPHERE_MINMAX_SCALING_RANGE_TOLERANCE = 0.15
Variables:

IONOSPHERE_MINMAX_SCALING_RANGE_TOLERANCE (float) – Min-Max scaling will only be implemented if the lower and upper ranges of both the features profile time series and the anomalous time series are within these margins. The default being 0.15 (or 15 percent). This prevents Ionosphere from Min-Max scaling and comparing time series that are in significantly different ranges and only applying Min-Max scaling comparisons when it is sensible to do so.
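
  • A minimal sketch of Min-Max scaling as described above (illustrative only, not Skyline's implementation):

    def min_max_scale(values):
        # scale a list of values into the 0.0 to 1.0 range
        min_v, max_v = min(values), max(values)
        if max_v == min_v:
            return [0.0 for _ in values]
        return [(v - min_v) / (max_v - min_v) for v in values]

    # both the features profile time series and the anomalous time series are
    # only scaled and compared when their lower and upper ranges fall within
    # the IONOSPHERE_MINMAX_SCALING_RANGE_TOLERANCE margins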

IONOSPHERE_ECHO_ENABLED = True
Variables:

IONOSPHERE_ECHO_ENABLED (boolean) – This enables Ionosphere to create and test features profiles for Mirage metrics at settings.FULL_DURATION as well. Features profiles will be made on the fly for any existing, validated Mirage metric features profiles. Ionosphere’s matching performance is increased by 30 to 50 percent when Ionosphere echo is run.

IONOSPHERE_ECHO_MAX_FP_CREATE_TIME = 55
Variables:

IONOSPHERE_ECHO_MAX_FP_CREATE_TIME (int) – The maximum number of seconds an Ionosphere echo process should run creating FULL_DURATION features profiles for created Mirage features profiles. This setting is specifically relevant for Skyline implementations pre Ionosphere echo (v1.2.12), to prevent timeouts if Ionosphere echo needs to make > 30 echo features profiles for Mirage metrics with lots of existing features profiles.

IONOSPHERE_ECHO_FEATURES_PERCENT_SIMILAR = 2.5
Variables:

IONOSPHERE_ECHO_FEATURES_PERCENT_SIMILAR (float) – In terms of Ionosphere echo, a value of 2.5 is the default. This default is above the normal IONOSPHERE_FEATURES_PERCENT_SIMILAR due to the fact that the resolution of Ionosphere echo is at FULL_DURATION. During testing this value was tested at 1.0, 2 and 2.5, with 2.5 resulting in the most desirable results in terms of matching time series that are similarly not anomalous.

IONOSPHERE_ECHO_MINMAX_SCALING_FEATURES_PERCENT_SIMILAR = 3.5
Variables:

IONOSPHERE_ECHO_MINMAX_SCALING_FEATURES_PERCENT_SIMILAR (float) – In terms of Ionosphere echo Min-Max scaling percentage similar, a value of 3.5 is the default. This default is above the normal IONOSPHERE_FEATURES_PERCENT_SIMILAR due to the fact that the resolution of Ionosphere echo is at FULL_DURATION, and echo uses the normal IONOSPHERE_MINMAX_SCALING_RANGE_TOLERANCE to determine if Min-Max scaling should be run. During testing this value was tested at 1, 2 and 3.5, with 3.5 resulting in the most desirable results in terms of matching time series that are similarly not anomalous.

IONOSPHERE_LAYERS_USE_APPROXIMATELY_CLOSE = True
Variables:

IONOSPHERE_LAYERS_USE_APPROXIMATELY_CLOSE (boolean) – The D and E boundary limits will be matched if the value is approximately close to the limit. This is only implemented on boundary values that are > 10. The approximately close value is calculated as within 10 percent for limit values between 11 and 29 and within 5 percent when the limit value >= 30. It is only applied to the D layer with a ‘>’ or ‘>=’ condition and to the E layer with a ‘<’ or ‘<=’ condition.
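
  • A sketch of the approximately close match for a D layer '>' condition, assuming the logic described above (illustrative only, not the actual layers code):

    def d_layer_matches(value, limit):
        if limit <= 10:
            return value > limit  # approximately close is not applied
        if limit < 30:
            # within 10 percent for limit values between 11 and 29
            return value > (limit * 0.90)
        # within 5 percent when the limit value >= 30
        return value > (limit * 0.95)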

IONOSPHERE_LEARN = True
Variables:

IONOSPHERE_LEARN (boolean) – Whether Ionosphere is set to learn

Note

The below IONOSPHERE_LEARN_DEFAULT_ variables are all overrideable in the IONOSPHERE_LEARN_NAMESPACE_CONFIG tuple per defined metric namespace. Further to this, ALL metrics and their settings in the Ionosphere learning context can also be modified via the webapp UI Ionosphere section. These settings are the defaults that are used in the creation of learnt features profiles and new metrics, HOWEVER the database is the preferred source of truth and will always be referred to first; the default or settings.IONOSPHERE_LEARN_NAMESPACE_CONFIG values shall only be used if database values are not determined. These settings are here so that it is easy to paint all metrics as a whole and others specifically; once a metric is added to Ionosphere via the creation of a features profile, it is painted with these defaults or the appropriate namespace settings in settings.IONOSPHERE_LEARN_NAMESPACE_CONFIG

Warning

Changes made to a metric's settings in the database directly, via the UI or your own SQL, will not be overridden by the IONOSPHERE_LEARN_DEFAULT_ variables or the IONOSPHERE_LEARN_NAMESPACE_CONFIG tuple per defined metric namespace, even if the metric matches the namespace; the database is the source of truth.

IONOSPHERE_LEARN_DEFAULT_MAX_GENERATIONS = 16
Variables:

IONOSPHERE_LEARN_DEFAULT_MAX_GENERATIONS (int) – The maximum number of generations that Ionosphere can automatically learn up to, from the original human created features profile, within the IONOSPHERE_LEARN_DEFAULT_MAX_PERCENT_DIFF_FROM_ORIGIN. Overridable per namespace in settings.IONOSPHERE_LEARN_NAMESPACE_CONFIG and via the webapp UI to update the DB

IONOSPHERE_LEARN_DEFAULT_MAX_PERCENT_DIFF_FROM_ORIGIN = 100.0
Variables:

IONOSPHERE_LEARN_DEFAULT_MAX_PERCENT_DIFF_FROM_ORIGIN (float) – The maximum percent that an automatically generated features profile can be from the original human created features profile. Any automatically generated features profile with a percent difference greater than this value, when summed common features are calculated, will be discarded. Anything below this value will be considered a valid learned features profile.

Note

This percent value will match -/+ e.g. it works both ways, x percent above or below. In terms of comparisons, a negative percent is simply multiplied by -1.0. The lower the value, the less Ionosphere can learn; to literally disable Ionosphere learning set this to 0. The difference can be much greater than 100, but between 7 and 100 is reasonable for learning. However, to really disable learning, also set all max_generations settings to 1.
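
  • A worked sketch of the described check (illustrative arithmetic only):

    origin_fp_sum = 1000.0  # sum of the original human created features profile
    learnt_fp_sum = 1850.0  # sum of the automatically learnt features profile
    percent_diff = ((learnt_fp_sum - origin_fp_sum) / origin_fp_sum) * 100  # 85.0
    if percent_diff < 0:
        # a negative percent is simply multiplied by -1.0
        percent_diff = percent_diff * -1.0
    # valid, as 85.0 <= IONOSPHERE_LEARN_DEFAULT_MAX_PERCENT_DIFF_FROM_ORIGIN (100.0)
    valid_learnt_fp = percent_diff <= 100.0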

IONOSPHERE_LEARN_DEFAULT_FULL_DURATION_DAYS = 30
Variables:

IONOSPHERE_LEARN_DEFAULT_FULL_DURATION_DAYS (int) – The default full duration in days at which Ionosphere should learn, the default is 30 days. Overridable per namespace in settings.IONOSPHERE_LEARN_NAMESPACE_CONFIG

IONOSPHERE_LEARN_DEFAULT_VALID_TIMESERIES_OLDER_THAN_SECONDS = 3661
Variables:

IONOSPHERE_LEARN_DEFAULT_VALID_TIMESERIES_OLDER_THAN_SECONDS (int) – The number of seconds that Ionosphere should wait before surfacing the metric timeseries to learn from. What Graphite aggregation do you want the retention to run before querying it to learn from? Overridable per namespace in settings.IONOSPHERE_LEARN_NAMESPACE_CONFIG

IONOSPHERE_LEARN_NAMESPACE_CONFIG = (('skyline_test.alerters.test', 30, 3661, 16, 100.0), ('.*', 30, 3661, 16, 100.0))
Variables:

IONOSPHERE_LEARN_NAMESPACE_CONFIG (tuple) – Configures specific namespaces at a specific learning full duration in days. Overrides settings.IONOSPHERE_LEARN_DEFAULT_FULL_DURATION_DAYS, settings.IONOSPHERE_LEARN_DEFAULT_VALID_TIMESERIES_OLDER_THAN_SECONDS, settings.IONOSPHERE_LEARN_DEFAULT_MAX_GENERATIONS and settings.IONOSPHERE_LEARN_DEFAULT_MAX_PERCENT_DIFF_FROM_ORIGIN per defined namespace; first matched, used. Order highest to lowest namespace resolution. Like settings.ALERTS, you know how this works now…

This is the config by which each declared namespace can be assigned a learning full duration in days. It is here to allow for overrides so that if a metric does not suit being learned at say 30 days, it could be learned at say 14 days instead if 14 days was a better suited learning full duration.

To specifically disable learning on a namespace, set LEARN_FULL_DURATION_DAYS to 0

  • Tuple schema example:

    IONOSPHERE_LEARN_NAMESPACE_CONFIG = (
        # ('<metric_namespace>', LEARN_FULL_DURATION_DAYS,
        #  LEARN_VALID_TIMESERIES_OLDER_THAN_SECONDS, MAX_GENERATIONS,
        #  MAX_PERCENT_DIFF_FROM_ORIGIN),
        # Wildcard namespaces can be used as well
        ('metric3.thing\..*', 90, 3661, 16, 100.0),
        ('metric4.thing\..*.\.requests', 14, 3661, 16, 100.0),
        # However beware of wildcards as the above wildcard should really be
        ('metric4.thing\..*.\.requests', 14, 7261, 3, 7.0),
        # Disable learning on a namespace
        ('metric5.thing\..*.\.rpm', 0, 3661, 5, 7.0),
        # Learn all Ionosphere enabled metrics at 30 days
        ('.*', 30, 3661, 16, 100.0),
    )
    
  • Namespace tuple parameters are:

Parameters:
  • metric_namespace (str) – metric_namespace pattern

  • LEARN_FULL_DURATION_DAYS (int) – The number of days that Ionosphere should surface the metric timeseries for

  • LEARN_VALID_TIMESERIES_OLDER_THAN_SECONDS (int) – The number of seconds that Ionosphere should wait before surfacing the metric timeseries to learn from. What Graphite aggregation do you want the retention at before querying it to learn from? REQUIRED, NOT optional. We could use the settings.IONOSPHERE_LEARN_DEFAULT_VALID_TIMESERIES_OLDER_THAN_SECONDS, but that would be more conditionals that we do not need; be precise. By now, if you are training Skyline well, you will understand that being precise helps :)

  • MAX_GENERATIONS (int) – The maximum number of generations that Ionosphere can automatically learn up to from the original human created features profile on this metric namespace.

  • MAX_PERCENT_DIFF_FROM_ORIGIN (float) – The maximum percent that an automatically generated features profile can be from the original human created features profile for a metric in the namespace.

IONOSPHERE_AUTOBUILD = True
Variables:

IONOSPHERE_AUTOBUILD (boolean) – Make best effort attempt to auto provision any features_profiles directory and resources that have been deleted or are missing. NOT IMPLEMENTED YET

Note

This is highlighted as a setting because the number of features_profiles dirs that Ionosphere learn could spawn, and the amount of data storage that would result, is unknown at this point. It is possible the operator is going to need to prune this data, a lot of which will probably never be looked at. Or a Skyline node is going to fail, not have the features_profiles dirs backed up, and all the data is going to be lost or deleted. So it should be possible for Ionosphere to recreate all the resources for a features profile under a best effort methodology. Although the original Redis graph image would not be available, nor the Graphite graphs in the resolution at which the features profile was created, the fp_ts is available, so the Redis plot could be remade and all the Graphite graphs could be made as best effort with whatever resolution is available for that time period. This allows the operator to delete/prune features profile dirs, possibly by least matched, by age, etc, or all, and still be able to surface the available features profile page data on-demand. NOT IMPLEMENTED YET

IONOSPHERE_UNTRAINABLES = []
Variables:

IONOSPHERE_UNTRAINABLES (list) – A list of metric names or namespaces that should be deemed as untrainable. For example, you may not want to allow http.status.500 to be trained to accept that an occasional 1 or 2 errors is normal and can be expected.
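
  • Example (the metric names shown are hypothetical):

    IONOSPHERE_UNTRAINABLES = [
        'http.status.500',
        'stats.web01.errors',
    ]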

IONOSPHERE_PERFORMANCE_DATA_POPULATE_CACHE = False
Variables:

IONOSPHERE_PERFORMANCE_DATA_POPULATE_CACHE (boolean) – Whether metrics_manager should pre-populate the performance data cache items. Under normal circumstances this is not required, it just makes the Ionosphere performance graphs build quicker.

IONOSPHERE_PERFORMANCE_DATA_POPULATE_CACHE_DEPTH = 0
Variables:

IONOSPHERE_PERFORMANCE_DATA_POPULATE_CACHE_DEPTH (int) – The namespace element depth to cache. If IONOSPHERE_PERFORMANCE_DATA_POPULATE_CACHE is enabled, this is the number of namespace elements (0 indexed) to populate the cache to (including settings.FULL_NAMESPACE, for which we shall use metrics. in the example below). For example, take the metric namespaces metrics.stats.* and metrics.telegraf.*: if you wanted to cache the performance data for the metrics.stats.* namespace as a whole and the metrics.telegraf.* namespace as a whole, you would set this to 1. A depth of 0 would cache performance data for all metrics as a whole; a depth of 1 would cache performance data for the sum of all metrics.*, the sum of all metrics.stats.* and the sum of all metrics.telegraf.*. A depth of 0 is the implied default.

IONOSPHERE_INFERENCE_MOTIFS_ENABLED = True
Variables:

IONOSPHERE_INFERENCE_MOTIFS_ENABLED (boolean) – Whether to have Ionosphere use the motif similarity matching method, based on mass-ts.

IONOSPHERE_INFERENCE_MOTIFS_SETTINGS = {'default_inference_batch_sizes': {180: {'find_exact_matches': False, 'max_area_percent_diff': 20.0, 'max_distance': 22, 'top_matches': 50}, 360: {'max_area_percent_diff': 20.0, 'max_distance': 23, 'top_matches': 50}, 720: {'max_area_percent_diff': 20.0, 'max_distance': 25, 'top_matches': 50}, 1440: {'find_exact_matches': False, 'max_area_percent_diff': 20.0, 'max_distance': 30, 'range_padding_percent': 10.0, 'top_matches': 50}}}
Variables:

IONOSPHERE_INFERENCE_MOTIFS_SETTINGS – The variables required for motif analysis, which can be added per namespace. Each variable is explained below. If settings.IONOSPHERE_INFERENCE_MOTIFS_ENABLED is enabled, a default_inference_batch_sizes must exist defining the default variables for all namespaces. Additional namespaces can be added with different variables if the defaults are not applicable to the namespace. It is important to note that the minimum batch_size should not be less than 180 data points, so as to represent a large enough sample of the data to exclude repetitive volatility shifts. Any batch_size setting of less than 180 can result in false negatives.
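
  • The default value shown above, formatted for readability:

    IONOSPHERE_INFERENCE_MOTIFS_SETTINGS = {
        'default_inference_batch_sizes': {
            180: {'find_exact_matches': False, 'max_area_percent_diff': 20.0,
                  'max_distance': 22, 'top_matches': 50},
            360: {'max_area_percent_diff': 20.0, 'max_distance': 23,
                  'top_matches': 50},
            720: {'max_area_percent_diff': 20.0, 'max_distance': 25,
                  'top_matches': 50},
            1440: {'find_exact_matches': False, 'max_area_percent_diff': 20.0,
                   'max_distance': 30, 'range_padding_percent': 10.0,
                   'top_matches': 50},
        }
    }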

IONOSPHERE_INFERENCE_MOTIFS_TOP_MATCHES = 50
Variables:

IONOSPHERE_INFERENCE_MOTIFS_TOP_MATCHES (int) – The total number of similar motifs to return. 10 is too little, 100 does not surface more, so 50 it is. Default if not defined.

IONOSPHERE_INFERENCE_MASS_TS_MAX_DISTANCE = 20.0
Variables:

IONOSPHERE_INFERENCE_MASS_TS_MAX_DISTANCE (float) – The maximum mass-ts distance value to consider a motif as similar. Any motif with a distance value above this is not similar enough to be used. 0.0 being exactly the same. Default if not defined.

IONOSPHERE_INFERENCE_MOTIFS_RANGE_PADDING = 10.0
Variables:

IONOSPHERE_INFERENCE_MOTIFS_RANGE_PADDING (float) – The amount of padding to be added to found similar motifs, in terms of percentage of the max(y) and min(y) in the motif. If the values in a potentially anomalous motif fall within the padded range, the motif is matched as not anomalous. A lower value will result in fewer matches; too high a value will result in false negatives (not desirable). Default if not defined.

IONOSPHERE_INFERENCE_MOTIFS_SINGLE_MATCH = True
Variables:

IONOSPHERE_INFERENCE_MOTIFS_SINGLE_MATCH (boolean) – ADVANCED FEATURE. By default Ionosphere returns as not_anomalous on the best matching motif/shapelet that is found. However, if this setting is set to False, Ionosphere returns as not anomalous with ALL the matched shapelets. This is useful for testing and debugging when using settings.IONOSPHERE_INFERENCE_MOTIFS_TEST_ONLY

IONOSPHERE_INFERENCE_MOTIFS_TEST_ONLY = False
Variables:

IONOSPHERE_INFERENCE_MOTIFS_TEST_ONLY (boolean) – ADVANCED FEATURE. If this is set to True, inference will only record results for testing purposes. Ionosphere will not classify any checks as not anomalous even if similar motifs were found; inference will simply record the results in the database as normal in the motif_matched database table, and no updates will be made to the ionosphere or ionosphere_matched tables. If you do wish to run inference in test mode, you can set the above IONOSPHERE_INFERENCE_MOTIFS_SINGLE_MATCH setting to False as well, to enable inference to check ALL trained data at all durations and record all valid similar motifs, rather than returning on the first match found.

MEMCACHE_ENABLED = False
Variables:

MEMCACHE_ENABLED (boolean) – Enables the use of memcache in Ionosphere to optimise DB usage [USER_DEFINED]

MEMCACHED_SERVER_IP = '127.0.0.1'
Variables:

MEMCACHED_SERVER_IP (str) – The IP address of the memcached server

MEMCACHED_SERVER_PORT = 11211
Variables:

MEMCACHED_SERVER_PORT (int) – The port of the memcached server

IONOSPHERE_LEARN_REPETITIVE_PATTERNS = False
Variables:

IONOSPHERE_LEARN_REPETITIVE_PATTERNS (boolean) – Whether to allow Ionosphere to learn repetitive, seasonal patterns from the existing training data (by default 3 days, defined by settings.IONOSPHERE_KEEP_TRAINING_TIMESERIES_FOR).

IONOSPHERE_FIND_REPETITIVE_PATTERNS = False
Variables:

IONOSPHERE_FIND_REPETITIVE_PATTERNS (boolean) – Whether to allow Ionosphere to find and learn repetitive patterns in anomalies over the previous 30 days.

IONOSPHERE_REPETITIVE_PATTERNS_MINMAX_AVG_VALUE = 0.0
Variables:

IONOSPHERE_REPETITIVE_PATTERNS_MINMAX_AVG_VALUE (float) – The threshold value on which to implement MinMax scaling on repetitive patterns learning. Setting this to 0.0 disables MinMax scaling. A sensible value is 1000.0

IONOSPHERE_REPETITIVE_PATTERNS_INCLUDE = {}
Variables:

IONOSPHERE_REPETITIVE_PATTERNS_INCLUDE (dict) – This is a convenience setting to allow for only certain metrics to be learnt from repetitive patterns. If defined, it is evaluated before settings.IONOSPHERE_REPETITIVE_PATTERNS_EXCLUDE to filter only the metrics that match before the exclude is evaluated. This setting is to allow for testing repetitive learning with a limited set of metrics before implementing it on the entire metric population. It has the same data structure as settings.IONOSPHERE_REPETITIVE_PATTERNS_EXCLUDE below.

IONOSPHERE_REPETITIVE_PATTERNS_EXCLUDE = {'^carbon\\.': {'carbon': ['errors', 'droppedCreates', 'fullQueueDrops']}, '^skyline\\.': {'skyline': ['logged_errors', 'run_time', 'total_anomalies', 'exceptions', '.*_breakdown', 'metrics_sparsity', 'http_alerter', 'discarded']}, '_tenant_id="1"': {'http_duration_sum': {'code': ['.*code="2.*', '.*code="3.*'], 'instance': ['127.0.0.1', '"localhost:9090"'], 'job': ['prometheus']}, 'http_requests_total': {'code': ['.*code="4.*', '.*code="5.*'], 'instance': ['127.0.0.1', '"localhost:9090"'], 'job': ['prometheus']}}}
Variables:

IONOSPHERE_REPETITIVE_PATTERNS_EXCLUDE (dict) – A dictionary of metric names, namespaces, metric elements, labels or values to match metrics which should be excluded from the learning of repetitive patterns. Metrics most suited to being declared here are metrics that are related to errors, 50x status codes, access_denied, etc, things you do not want learnt. The dictionary is keyed on some element, and then an additional dictionary of filters made up of strings, metric elements, labels or values can be passed which match the metrics to be excluded. This nested dictionary can be made up of a primary match filter, one or more secondary match filters (both being dictionary keys) and tertiary match filters which hold list values. Additionally there is a special key that can be used, _NOT. This key can either be a single list or a dict of keys each with a single list.

  • Annotated example - pay attention to the comments regarding using values with labelled metrics:

    IONOSPHERE_REPETITIVE_PATTERNS_EXCLUDE = {
        '^skyline\.': {  # The primary match filter, must be matched
            'skyline': [  # The secondary match filter, must be matched
                'logged_errors', 'run_time'
                ...,
            ], # A tertiary match filter list, at least one of these must be matched
        },
        '_tenant_id="1"': {  # The primary match filter, must be matched (with
                             # multiple secondary match filters)
            'http_requests_total': {  # A secondary match filter, must be matched
                                    # with multiple tertiary match lists and
                                    # at least one must match from each list
                'code': ['.*code="4.*', '.*code="5.*'],
                'job': ['prometheus'],
                'instance': ['127.0.0.1:9090', '"localhost:9090"'],
            },
            'prometheus_http_request_duration_seconds_sum': {
                'handler': ['.*handler="/api/v1/status/buildinfo".*', '.*handler="/static/\*filepath".*'],
                # NOTE WITH LABELLED METRICS if you declare a value it must be
                # defined by a valid regex including the label AND value,
                # enclosed by .* on either side.  If you just use a label and
                # not the value, then labels are matched without requiring a regex
                'job': ['prometheus'],
                'instance': ['127.0.0.1:9090', '"localhost:9090"'],
            },
        },
        'vm_http_request_errors_total': {
            'cluster': ['.*cluster="victoriametrics-cluster-dev-eu".*'],
            '_NOT': {  # An example of the _NOT key
                'component': ['.*component="vmselect".*'],
                'path': ['.*path="/select/{}/prometheus/api/v1/query_range".*'],
            }
        },
        'prometheus_target_scrape_pool_reloads_total': {
            'job': ['.*job=".*'],
            '_NOT': ['.*job="loki".*']  # Example of a _NOT list
        },
    }
    
IONOSPHERE_ENFORCE_DOWNSAMPLING = {}
Variables:

IONOSPHERE_ENFORCE_DOWNSAMPLING (dict) – Declare the period resolutions to enforce for the creation of features profiles. Features profiles work best on time series which are between 1000 and 5000 data points, for two reasons: speed and accuracy. Although it is possible to use the method on high resolution time series data, it is much slower and, more importantly, less accurate. Enforcing downsampling is RECOMMENDED for optimum performance.

  • Example:

    IONOSPHERE_ENFORCE_DOWNSAMPLING = {
        # duration, resolution
        86400: 60,
        604800: 600,
        2592000: 600,
    }
    
LUMINOSITY_PROCESSES = 1
Variables:

LUMINOSITY_PROCESSES (int) – This is the number of Luminosity processes to run.

ENABLE_LUMINOSITY_DEBUG = False
Variables:

ENABLE_LUMINOSITY_DEBUG (boolean) – To allow luminosity debug logging.

LUMINOSITY_DATA_FOLDER = '/opt/skyline/luminosity'
Variables:

LUMINOSITY_DATA_FOLDER (str) – This is the path for luminosity data where classify_metrics plot images, etc are stored - absolute path.

OTHER_SKYLINE_REDIS_INSTANCES = []
Variables:

OTHER_SKYLINE_REDIS_INSTANCES (list) – This is a nested list of any Redis instances that Skyline should query for correlation time series. ONLY applicable if there are multiple Skyline instances, each with their own Redis.

THIS IS TO BE DEPRECATED IN v1.2.5; there is no longer a requirement to access Redis remotely between Skyline instances, this has been replaced as of v1.2.4 by an API method which uses the REMOTE_SKYLINE_INSTANCES setting below. Each entry is the IP or FQDN as a string, the port as an int and the Redis password as a str, or None if there is no password:

  • Example:

    OTHER_SKYLINE_REDIS_INSTANCES = [
        ['192.168.1.10', 6379, 'this_is_the_redis_password'],
        ['192.168.1.15', 6379, None],
    ]

Note

If you run multiple Skyline instances and are going to run cross correlations and query another Redis please ensure that you have Redis authentication enabled. See https://redis.io/topics/security and http://antirez.com/news/96 for more info.

ALTERNATIVE_SKYLINE_URLS = []
Variables:

ALTERNATIVE_SKYLINE_URLS (list) – The alternative URLs of any other Skyline instances. This is ONLY applicable if there are multiple Skyline instances, each with their own Redis.

For example (note NO trailing slash):

    ALTERNATIVE_SKYLINE_URLS = ['http://skyline-na.example.com:8080', 'http://skyline-eu.example.com']

REMOTE_SKYLINE_INSTANCES = []
Variables:

REMOTE_SKYLINE_INSTANCES (list) – ADVANCED FEATURE - This is a nested list of any remote Skyline instances that Skyline should query for correlation time series. This is ONLY applicable if there are multiple Skyline instances, each with their own Redis data. It is used by Skyline Luminosity to query other Skyline instances via the luminosity_remote_data API and get the relevant time series fragments, by default the previous 12 minutes, for all the metrics on the other Skyline instance/s (gzipped), in order to run correlations on all metrics in the population. Related settings are settings.HORIZON_SHARDS and settings.SYNC_CLUSTER_FILES

For example, each entry is the URL, the username, the password and the hostname, all as strings (str):

REMOTE_SKYLINE_INSTANCES = [
    ['http://skyline-na.example.com:8080','remote_WEBAPP_AUTH_USER','remote_WEBAPP_AUTH_USER_PASSWORD', 'skyline-1'],
    ['http://skyline-eu.example.com', 'another_remote_WEBAPP_AUTH_USER','another_WEBAPP_AUTH_USER_PASSWORD', 'skyline-2']]

The hostname element must match the hostnames in settings.HORIZON_SHARDS for the instances to be able to determine which Skyline instance is authoritative for a metric.

CORRELATE_ALERTS_ONLY = True
Variables:

CORRELATE_ALERTS_ONLY (boolean) – Only cross correlate anomalies that have an alert setting (other than syslog). This reduces the number of correlations that are recorded in the database. Non alerter metrics are still cross correlated against when an anomaly triggers on an alerter metric.

LUMINOL_CROSS_CORRELATION_THRESHOLD = 0.9
Variables:

LUMINOL_CROSS_CORRELATION_THRESHOLD (float) – Only record Luminol cross correlated metrics where the correlation coefficient is > this float value. LinkedIn’s Luminol library is hardcoded to 0.8, however with lots of testing 0.8 proved to be too low a threshold and resulted in listing many metrics that were not related. You may find 0.9 too low as well, it can also record a lot, however in root cause analysis and determining relationships between metrics, 0.9 has proved more useful in seeing the trees in the forest. This can be a value between 0.0 and 1.0, with 1.0 being the STRONGEST cross correlation.

Variables:

LUMINOSITY_RELATED_TIME_PERIOD (int) – The time period (in seconds) either side of the anomaly that should be checked to report possible related anomalies.

LUMINOSITY_CORRELATE_ALL = True
Variables:

LUMINOSITY_CORRELATE_ALL (boolean) – By default all metrics will be correlated with the entire metric population.

LUMINOSITY_CORRELATE_NAMESPACES_ONLY = []
Variables:

LUMINOSITY_CORRELATE_NAMESPACES_ONLY (list) – A list of namespaces whose metrics should only be correlated within the same namespace. The default is an empty list, which results in all metrics being correlated with all metrics. If namespaces are declared in the list, all metrics will be evaluated as to whether they are in the list. Metrics will be evaluated against namespaces in this list using matched_or_regexed_in_list(), which determines if a pattern is in a list as: 1) an absolute match, 2) a match on dotted elements, 3) a match by a regex. Metrics in the list will only be correlated with metrics in the same namespace and excluded from correlations within ANY other namespace, unless defined in the settings.LUMINOSITY_CORRELATION_MAPS method below.

  • List example:

    LUMINOSITY_CORRELATE_NAMESPACES_ONLY = [
        'aws.euw1',
        'aws.use1',
        'gcp.us-east4',
    ]
    

In the above example, metrics with the aws.euw1 namespace would only be correlated against other aws.euw1 metrics, likewise for aws.use1 and gcp.us-east4. This also means that if there were other metric namespaces like gcp.southamerica-east1 and gcp.asia-east1 they would not be correlated with any of the above example namespaces.

LUMINOSITY_CORRELATION_MAPS = {}
Variables:

LUMINOSITY_CORRELATION_MAPS (dict) – A dictionary of lists of metrics which should be correlated. These lists hold absolute metric names; to correlate using namespaces use the settings.LUMINOSITY_CORRELATE_NAMESPACES_ONLY method above. Although both methods can be run simultaneously, this method allows for only correlating specific groups of metrics. It can be used on its own to only correlate certain metrics if nothing is defined in settings.LUMINOSITY_CORRELATE_NAMESPACES_ONLY. Or it can be used in conjunction with settings.LUMINOSITY_CORRELATE_NAMESPACES_ONLY to also correlate metrics in different namespaces. For example, say you defined aws.euw1 and aws.use1 as separate namespaces in settings.LUMINOSITY_CORRELATE_NAMESPACES_ONLY but you also wanted to correlate some specific group/s of metrics that occur in both aws.euw1 and aws.use1, you could define those here.

  • Dictionary example:

    LUMINOSITY_CORRELATION_MAPS = {
        'aws.webservers.nginx': [
            'aws.euw1.webserver-1.nginx.apm.mainsite.avg_request_timing',
            'aws.euw1.webserver-2.nginx.apm.mainsite.avg_request_timing',
            'aws.use1.webserver-6.nginx.apm.mainsite.avg_request_timing',
            'aws.use1.webserver-8.nginx.apm.mainsite.avg_request_timing',
        ]
    }
    

Note that if both methods are enabled, correlations will be done on a common result of both methods, meaning that a metric will be evaluated against both methods and the resulting list of metrics that it should be correlated against will be used.

LUMINOSITY_CLASSIFY_METRICS_LEVEL_SHIFT = False
Variables:

LUMINOSITY_CLASSIFY_METRICS_LEVEL_SHIFT (boolean) – ADVANCED FEATURE. Enable luminosity/classify_metrics to identify metrics that experience significant level shifts.

LUMINOSITY_LEVEL_SHIFT_SKIP_NAMESPACES = []
Variables:

LUMINOSITY_LEVEL_SHIFT_SKIP_NAMESPACES (list) – Namespaces to skip level shift classification on.

These are metrics that you do not want luminosity to attempt to classify as level shift metrics, generally sparsely populated namespaces. This works in the same way that SKIP_LIST works; it matches on the string or dotted namespace elements.
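
  • Example (hypothetical namespaces):

    LUMINOSITY_LEVEL_SHIFT_SKIP_NAMESPACES = [
        'nginx.errors',
        'warehouse.sparse_sensors',
    ]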

LUMINOSITY_CLASSIFY_ANOMALIES = False
Variables:

LUMINOSITY_CLASSIFY_ANOMALIES (boolean) – Whether to classify anomaly types.

LUMINOSITY_CLASSIFY_ANOMALY_ALGORITHMS = ['adtk_level_shift', 'adtk_volatility_shift', 'adtk_persist', 'adtk_seasonal']
Variables:

LUMINOSITY_CLASSIFY_ANOMALY_ALGORITHMS – ADVANCED FEATURE. List of custom algorithms to be used for classifying anomalies.

LUMINOSITY_CLASSIFY_ANOMALIES_SAVE_PLOTS = False
Variables:

LUMINOSITY_CLASSIFY_ANOMALIES_SAVE_PLOTS (boolean) – ADVANCED FEATURE. Whether to save anomaly classification plots in the training data.

LUMINOSITY_CLOUDBURST_ENABLED = False
Variables:

LUMINOSITY_CLOUDBURST_ENABLED (boolean) – Whether to enable Luminosity cloudburst to run and identify significant changepoints.

LUMINOSITY_CLOUDBURST_PROCESSES = 1
Variables:

LUMINOSITY_CLOUDBURST_PROCESSES (int) – The number of processes that luminosity/cloudbursts should divide all the metrics up between to identify significant changepoints in the metrics. In a very large metric population you may need to set this to more than 1. As a guide a single process can generally analyse and identify potentially significant changepoints in about 1500 metrics in 60 seconds.

LUMINOSITY_CLOUDBURST_RUN_EVERY = 900
Variables:

LUMINOSITY_CLOUDBURST_RUN_EVERY (int) – This is how often to run luminosity/cloudbursts to identify significant changepoints in metrics. To disable luminosity/cloudbursts set this to 0.

LUMINOSITY_CLOUDBURST_SKIP_METRICS = []
Variables:

LUMINOSITY_CLOUDBURST_SKIP_METRICS (list) – A list of metric names, namespaces, metric elements, labels and values of metrics to skip running cloudburst analysis on.
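
  • Example (hypothetical entries):

    LUMINOSITY_CLOUDBURST_SKIP_METRICS = [
        'skyline.analyzer.run_time',
        'test.snab',
    ]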

Variables:

LUMINOSITY_RELATED_METRICS (boolean) – Whether to enable Luminosity related_metrics to run and learn related metrics.

Variables:

LUMINOSITY_RELATED_METRICS_MAX_5MIN_LOADAVG (float) – Luminosity related_metrics will ONLY run when the 5min loadavg on the machine running the process is BELOW this loadavg. Because the process is an offline process, it can run as and when appropriate.

Variables:

LUMINOSITY_RELATED_METRICS_MIN_CORRELATION_COUNT_PERCENTILE (float) – The percentile of cross correlation counts to be included in the related metrics evaluation. This value should be >= 95.0 to ensure that metrics are only related when there is a high degree of confidence. This results in only the metrics which have the highest number of cross correlations being assessed as candidates to be considered related. Its purpose is to discard outlier correlations that were only recorded due to numeric significance.

Variables:

LUMINOSITY_RELATED_METRICS_MINIMUM_CORRELATIONS_COUNT (int) – The minimum number of cross correlations recorded to include the metric for evaluation in get_cross_correlation_relationships. This number should not be less than 3 to ensure that metrics are only related when there is a high degree of confidence.

DOCKER = False
Variables:

DOCKER (boolean) – Whether Skyline is running on Docker or not

DOCKER_DISPLAY_REDIS_PASSWORD_IN_REBROW = False
Variables:

DOCKER_DISPLAY_REDIS_PASSWORD_IN_REBROW (boolean) – Whether to show the Redis password in the webapp Rebrow login page

DOCKER_FAKE_EMAIL_ALERTS = False
Variables:

DOCKER_FAKE_EMAIL_ALERTS (boolean) – Whether to have docker fake email alerts. At the moment docker has no support for sending email alerts, however a number of Ionosphere resources are created when an email alert is sent. Therefore in the docker context email alerts are processed, only the SMTP action is not run. If Skyline is running on docker, this must be set to True.

FLUX_IP = '127.0.0.1'
Variables:

FLUX_IP (str) – The IP to bind the gunicorn flux server on.

FLUX_PORT = 8000
Variables:

FLUX_PORT (int) – The port for the gunicorn flux server to listen on.

FLUX_WORKERS = 1
Variables:

FLUX_WORKERS (int) – The number of gunicorn flux workers. There are 2 processes spawned per gunicorn process: FLUX_WORKERS = 1 will result in two active flux workers and one primary worker which manages counts and flux worker sets, one active flux aggregator and x standby aggregator processes. FLUX_WORKERS = 2 will result in 4 active flux workers, again with one primary worker for flux worker set management, etc. Should the primary worker or aggregator die, one of the other processes will become the primary.

FLUX_VERBOSE_LOGGING = True
Variables:

FLUX_VERBOSE_LOGGING (boolean) – If set to True, flux will log the data received in requests and the data sent to Graphite. It is set to True by default as it was the default before this option was added. If flux is going to ingest 1000s of metrics, consider setting this to False.

FLUX_SELF_API_KEY = 'YOURown32charSkylineAPIkeySecret'
Variables:

FLUX_SELF_API_KEY (str) – This is a 32 character alphanumeric string that is used to validate direct requests to Flux. Vista uses it; it connects directly to Flux, bypassing the reverse proxy, and authenticates itself. It can only be digits and letters e.g. [0-9][a-Z] [USER_DEFINED]

FLUX_API_KEYS = {}
Variables:

FLUX_API_KEYS (dict) – The flux /flux/metric_data and /flux/metric_data_post endpoints are controlled via API keys. Each API key can additionally specify a metric namespace prefix; all metric names submitted with the key are prefixed with the defined namespace prefix. Each API key must be a 32 character alphanumeric string [a-Z][0-9]. The trailing dot of the namespace prefix must not be specified; it will be automatically added as the separator between the namespace prefix and the metric name. For more see https://earthgecko-skyline.readthedocs.io/en/latest/upload-data-to-flux.html

  • Example:

    FLUX_API_KEYS = {
        'ZlJXpBL6QVuZg5KL4Vwrccvl8Bl3bBjC': 'warehouse-1',
        'KYRsv508FJpVg7pr11vnZTbeu11UvUqR': 'warehouse-1',  # allow multiple keys for a namespace to allow for key rotation
        'ntG9Tlk74FeV7Muy65EdHbZ07Mpvj7Gg': 'warehouse-2.floor.1',
    }
    
FLUX_BACKLOG = 254
Variables:

FLUX_BACKLOG (int) – The maximum number of pending connections. This refers to the number of clients that can be waiting to be served. Exceeding this number results in the client getting an error when attempting to connect. It should only affect servers under significant load. As per http://docs.gunicorn.org/en/stable/settings.html#backlog

FLUX_MAX_AGE = 3600
Variables:

FLUX_MAX_AGE (int) – The maximum age of a timestamp that flux will accept as valid. This can vary depending on retentions in question and this method may be altered in a future release.
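
  • A minimal sketch of the implied age check (illustrative only, not flux's actual code):

    import time

    def timestamp_valid(timestamp, max_age=3600):
        # reject data points with timestamps older than FLUX_MAX_AGE seconds
        return (int(time.time()) - int(timestamp)) <= max_age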

FLUX_PERSIST_QUEUE = False
Variables:

FLUX_PERSIST_QUEUE (boolean) – By default flux does not persist the incoming queue; on a flux restart any metrics in the queue will be lost. If flux is only accepting a small amount of data points, flux will probably get through the queue in seconds. However, if 1000s of metrics are being sent to flux per minute and you do not want to lose any metrics and data points when flux is restarted, you can set flux to persist the queue, which is done via the flux.queue Redis set. Persisting the queue has a computational cost, and if the queue is large then when flux restarts it may lag and not get through all the queue for some time, unless you adjust FLUX_WORKERS.

FLUX_CHECK_LAST_TIMESTAMP = True
Variables:

FLUX_CHECK_LAST_TIMESTAMP (boolean) – By default flux deduplicates data and only allows for one data point to be submitted per timestamp, however this has a cost in terms of requests and number of keys in Redis. If you have lots of metrics coming into flux consider setting this to False and ensure that the application/s submitting data to flux do not submit data unordered.

FLUX_SEND_TO_CARBON = True
Variables:

FLUX_SEND_TO_CARBON (boolean) – Whether to send metrics received by flux to Graphite.

FLUX_CARBON_HOST = 'YOUR_GRAPHITE_HOST.example.com'
Variables:

FLUX_CARBON_HOST (str) – The carbon host that flux should send metrics to if FLUX_SEND_TO_CARBON is enabled.

FLUX_CARBON_PORT = 2003
Variables:

FLUX_CARBON_PORT (int) – The carbon host port that flux should send metrics to if FLUX_SEND_TO_CARBON is enabled.

FLUX_CARBON_PICKLE_PORT = 2004
Variables:

FLUX_CARBON_PICKLE_PORT (int) – The port for the Carbon PICKLE_RECEIVER_PORT on Graphite, as defined in Graphite’s carbon.conf

FLUX_GRAPHITE_WHISPER_PATH = '/opt/graphite/storage/whisper'
Variables:

FLUX_GRAPHITE_WHISPER_PATH (str) – This is the absolute path on your GRAPHITE server; it is required by flux to determine that a metric name and its path does not exceed the maximum path length of 4096 characters.
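
  • A sketch of the constraint being checked (illustrative only; the metric name is hypothetical):

    FLUX_GRAPHITE_WHISPER_PATH = '/opt/graphite/storage/whisper'
    metric = 'warehouse-1.floor.1.sensor.temperature'
    # Graphite stores each metric as a .wsp file under the whisper path
    wsp_path = '%s/%s.wsp' % (FLUX_GRAPHITE_WHISPER_PATH, metric.replace('.', '/'))
    valid = len(wsp_path) <= 4096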

FLUX_PROCESS_UPLOADS = False
Variables:

FLUX_PROCESS_UPLOADS (boolean) – Whether flux is enabled to process uploaded data files in settings.DATA_UPLOADS_PATH. This is related to the settings.WEBAPP_ACCEPT_DATA_UPLOADS setting and files in settings.DATA_UPLOADS_PATH are processed and the data is sent to Graphite.

FLUX_SAVE_UPLOADS = False
Variables:

FLUX_SAVE_UPLOADS (boolean) – Whether flux should save processed upload data in settings.FLUX_SAVE_UPLOADS_PATH.

FLUX_SAVE_UPLOADS_PATH = '/opt/skyline/flux/processed_uploads'
Variables:

FLUX_SAVE_UPLOADS_PATH (str) – The path flux saves processed data to if settings.FLUX_SAVE_UPLOADS is True. Note that this directory must exist and be writable to the user that Skyline processes are running as. Or the parent directory must exist and be owned by the user that Skyline processes are running as.

FLUX_UPLOADS_KEYS = {}
Variables:

FLUX_UPLOADS_KEYS (dict) – For each parent_metric_namespace a key must be assigned to the namespace as the upload_data endpoint is not authenticated. For uploads via the webapp Flux page these are handled using the settings.FLUX_SELF_API_KEY key.

  • Example:

    FLUX_UPLOADS_KEYS = {
        'remote_sites.warehouse.1': 'c65909df-9e06-41b7-a455-4f10b99aa741',
        'remote_sites.warehouse.2': '1e8b1c63-10d3-4a24-bb27-d2513861dbf6'
    }
    
FLUX_ZERO_FILL_NAMESPACES = []
Variables:

FLUX_ZERO_FILL_NAMESPACES (list) – For each namespace or namespace elements declared in this list, flux will send a 0 if no data is received for a metric in the namespace in the last 60 seconds. This enables Skyline to fill sparsely populated metrics with 0 where appropriate; continuous time series data is better for analysing, especially in terms of features profiles. The namespaces declared here can be absolute metric names, elements or a regex of the namespace.

  • Example:

    FLUX_ZERO_FILL_NAMESPACES = [
        'nginx.errors',
        'external_sites.www_example_com.avg_pageload',
        'sites.avg_pageload',
    ]
    
FLUX_LAST_KNOWN_VALUE_NAMESPACES = []
Variables:

FLUX_LAST_KNOWN_VALUE_NAMESPACES (list) – For each namespace or namespace elements declared in this list, flux will send the last known value if no data is received for a metric in the namespace in the last 60 seconds. The namespaces declared here can be absolute metric names, elements or a regex of the namespace.

  • Example:

    FLUX_LAST_KNOWN_VALUE_NAMESPACES = [
        'external_sites.www_example_com.users',
        'external_sites.shop_example_com.products',
    ]
    
FLUX_AGGREGATE_NAMESPACES = {'otel.traces': {'interval': 60, 'last_known_value': False, 'method': ['avg'], 'method_suffix': False, 'zero_fill': False}}
Variables:

FLUX_AGGREGATE_NAMESPACES (dict) – For each namespace or namespace elements declared in this dict, flux/listen will send data points received to flux/aggregator which will then submit metrics to flux/worker at every interval, aggregating by the defined method/s. If multiple methods are used flux will submit a metric per method defined.

Each namespace is defined in a dict with the following keys:

  • method: a list of methods to apply to the metric aggregation; valid methods are avg, sum, min and max. More than one method can be applied, which will result in a metric being submitted for each method. If multiple methods are applied, method_suffix must be set to True and if it is not, it is automatically set.

  • interval: the interval in seconds at which to aggregate the metric data. If a data point is received for a metric every 5 seconds and the interval is set to 60, flux will submit the aggregated value/s for the metric to Graphite every 60 seconds.

  • zero_fill: if this is set to True, flux will submit a 0 to Graphite every interval seconds if no data is received for the metric in the interval period. Note that a namespace can either be set to zero_fill or last_known_value, not both.

  • last_known_value: if this is set to True and flux does not receive data for a metric in the interval period, flux will submit the last value that it submitted to Graphite for the current interval period. This is like a gauge metric. Note that a namespace can either be set to zero_fill or last_known_value, not both.

  • method_suffix: if set to True, flux will suffix the metric name with the method, for example mysite.events.pageloads.avg. If multiple methods are declared this must be set to True, and if not set it will be automatically added, otherwise all the method values would be submitted to a single metric name.

  • Example:

    FLUX_AGGREGATE_NAMESPACES = {
        'otel.traces': {
            'method': ['avg'],
            'interval': 60,
            'zero_fill': False,
            'last_known_value': False,
            'method_suffix': False},
        'mysite.events.loadtime': {
            'method': ['avg'],
            'interval': 60,
            'zero_fill': True,
            'last_known_value': False,
            'method_suffix': False},
        'mysite.events.pageloads': {
            'method': ['avg', 'sum', 'max', 'min'],
            'interval': 60,
            'zero_fill': False,
            'last_known_value': False,
            'method_suffix': True},
        'warehouse1.kwh.meter.reading': {
            'method': ['avg'],
            'interval': 60,
            'zero_fill': False,
            'last_known_value': False,
            'method_suffix': False},
    }
    
FLUX_EXTERNAL_AGGREGATE_NAMESPACES = False
Variables:

FLUX_EXTERNAL_AGGREGATE_NAMESPACES (boolean) – If there are aggregate metrics defined in any external settings and there are none defined in the above FLUX_AGGREGATE_NAMESPACES, this setting can force flux to check for aggregate metrics from the metrics_manager.

FLUX_NAMESPACE_QUOTAS = {}
Variables:

FLUX_NAMESPACE_QUOTAS (dict) – ADVANCED FEATURE. A top level namespace can be limited in terms of how many metrics flux will accept for the namespace. This only applies to metrics sent to flux with a FLUX_API_KEYS namespace. It cannot be applied to metrics sent to flux using the FLUX_SELF_API_KEY and currently only applies to POSTs with multiple metrics.

  • Example:

    FLUX_NAMESPACE_QUOTAS = {
        'warehouse-1': 300,
        'warehouse-2': 30,
    }
    
FLUX_SEND_TO_STATSD = False
Variables:

FLUX_SEND_TO_STATSD (boolean) – Whether to send metrics received by flux to statsd.

FLUX_STATSD_HOST = ''
Variables:

FLUX_STATSD_HOST (str) – The statsd host that flux should send metrics to if FLUX_SEND_TO_STATSD is enabled.

FLUX_STATSD_PORT = 8125
Variables:

FLUX_STATSD_PORT (int) – The statsd host port that flux should send metrics to if FLUX_SEND_TO_STATSD is enabled.

FLUX_OTEL_ENABLED = False
Variables:

FLUX_OTEL_ENABLED (boolean) – EXPERIMENTAL FEATURE. Whether to accept opentelemetry OTLP traces and convert them into metrics.

FLUX_DROP_BUCKET_METRICS = True
Variables:

FLUX_DROP_BUCKET_METRICS (boolean) – Whether to drop Prometheus/VictoriaMetrics _bucket{ metrics. These are histogram metrics and in Skyline currently have little value as they are difficult to group and analyse. The _count and _sum metrics are sufficient for signals. _bucket metrics in themselves tend to make noise.

VISTA_ENABLED = False
Variables:

VISTA_ENABLED (boolean) – Enables Skyline vista

VISTA_VERBOSE_LOGGING = False
Variables:

VISTA_VERBOSE_LOGGING (boolean) – As of Skyline 3.0, apps log notices and errors only. To have additional info logged set this to True. Useful for debugging but less verbose than LOCAL_DEBUG.

VISTA_FETCHER_PROCESSES = 1
Variables:

VISTA_FETCHER_PROCESSES (int) – the number of Vista fetcher processes to run. In all circumstances 1 process should be sufficient as the process runs asynchronous requests.

VISTA_FETCHER_PROCESS_MAX_RUNTIME = 50
Variables:

VISTA_FETCHER_PROCESS_MAX_RUNTIME (int) – the maximum number of seconds Vista fetcher process/es should run before being terminated.

VISTA_WORKER_PROCESSES = 1
Variables:

VISTA_WORKER_PROCESSES (int) – the number of Vista worker processes to run to validate and submit the metrics to Flux and Graphite.

VISTA_DO_NOT_SUBMIT_CURRENT_MINUTE = True
Variables:

VISTA_DO_NOT_SUBMIT_CURRENT_MINUTE (boolean) – Do not resample or send data that falls into the current minute bin to Graphite. This means that Skyline will only analyse data 60 seconds behind. In terms of fetching high frequency data this should always be the default, so that Skyline is analysing the last complete data point for a minute and is not analysing a partially populated data point which will result in false positives.

VISTA_FETCH_METRICS = ()
Variables:

VISTA_FETCH_METRICS (tuple) – Defines the metrics that Vista should fetch.

This is the config where metrics that need to be fetched are defined.

  • Tuple schema example:

    VISTA_FETCH_METRICS = (
        # (remote_host, remote_host_type, frequency, remote_target, graphite_target, uri, namespace_prefix, api_key, token, user, password, (populate_at_resolution_1, populate_at_resolution_2, ...)),
        # Example with no authentication
        ('https://graphite.example.org', 'graphite', 60, 'stats.web01.cpu.user', 'stats.web01.cpu.user', '/render/?from=-10minutes&format=json&target=', 'vista.graphite_example_org', None, None, None, None, ('90days', '7days', '24hours', '6hours')),
        ('https://graphite.example.org', 'graphite', 60, 'sumSeries(stats.*.cpu.user)', 'stats.cumulative.cpu.user', '/render/?from=-10minutes&format=json&target=', 'vista.graphite_example_org', None, None, None, None, ('90days', '7days', '24hours', '6hours')),
        ('https://graphite.example.org', 'graphite', 3600, 'swell.tar.hm0', 'swell.tar.hm0', '/render/?from=-120minutes&format=json&target=', 'graphite_example_org', None, None, None, None, ('90days', '7days', '24hours', '6hours')),
        ('http://prometheus.example.org:9090', 'prometheus', 60, 'node_load1', 'node_load1', 'default', 'vista.prometheus_example_org', None, None, None, None, ('15d',)),
        ('http://prometheus.example.org:9090', 'prometheus', 60, 'node_network_transmit_bytes_total{device="eth0"}', 'node_network_transmit_bytes_total.eth0', '/api/v1/query?query=node_network_transmit_bytes_total%7Bdevice%3D%22eth0%22%7D%5B5m%5D', 'vista.prometheus_example_org', None, None, None, None, None)
    )
    
  • All the fetch tuple parameters are required to be present in each fetch tuple.

Parameters:
  • remote_host (str) – the remote metric host base URL, including the protocol and port.

  • remote_host_type (str) – the type of remote host, valid options are graphite and prometheus.

  • frequency (int) – the frequency with which to fetch data in seconds.

  • remote_target (str) – the remote target to fetch, this can be a single metric, a single metric with function/s applied or a series of metrics with function/s applied which result is a single derived time series.

  • graphite_target (str) – the absolute metric name to be used to store in Graphite this excludes the namespace_prefix set by the namespace_prefix param below.

  • uri (str) – the metric host endpoint URI used to retrieve the metric. FOR GRAPHITE: valid Graphite URIs are only in the form of from=-<period><minutes|hours|days>; only minutes, hours and days can be passed, otherwise the regexes for back filling any missing data will not work. FOR PROMETHEUS: the uri can be passed as ‘default’, which will dynamically generate a URI in terms of the URLENCODED_TARGET, start= and end= parameters that are passed to query_range, based on the current time. A Prometheus API URI can be passed, but the query should be URL encoded and cannot have any dynamic date based parameters.

  • namespace_prefix (str) – the Graphite namespace prefix to use for submitting metrics to, can be passed as ‘’ if you do not want to prefix the metric names. The namespace_prefix must NOT have a trailing dot.

  • api_key (str) – an API key if one is required, otherwise pass None

  • token (str) – a token if one is required, otherwise pass None

  • user (str) – a username if one is required, otherwise pass None

  • password (str) – a password if one is required, otherwise pass None

  • populate_at_resolutions (tuple) – if you want Vista to populate the metric with historic data, this tuple allows you to declare at what resolutions to populate the data. If you do not want to pre-populate, do not declare a tuple and simply pass an empty tuple (). NOTE - you cannot declare a resolution for Prometheus metrics which use a custom uri; historic metrics cannot currently be pulled with a custom uri. For a detailed description of this functionality please see the Vista documentation page at: https://earthgecko-skyline.readthedocs.io/en/latest/vista.html#pre-populating-metrics-with-historic-data https://earthgecko-skyline.readthedocs.io/en/latest/vista.html#populate_at_resolutions

VISTA_GRAPHITE_BATCH_SIZE = 20
Variables:

VISTA_GRAPHITE_BATCH_SIZE (int) – The number of metrics that Vista should retrieve from a Graphite host in a single request, if the metrics being requested are being requested with the same from parameter (timestamp).

SNAB_ENABLED = False
Variables:

SNAB_ENABLED (boolean) – Whether SNAB is enabled or not.

SNAB_DATA_DIR = '/opt/skyline/SNAB'
Variables:

SNAB_DATA_DIR (str) – The directory where SNAB writes data files.

SNAB_anomalyScore = {}
Variables:

SNAB_anomalyScore (dict) – Each analysis app or all apps can record an anomalyScore for each analysis. This is an advanced feature for testing and development purposes. NOTE that the above settings.SNAB_ENABLED does not have to be set to True for the SNAB_anomalyScore function to work.

  • Examples:

    SNAB_anomalyScore = {}
    
    SNAB_anomalyScore = {
        'all': ['telegraf.test-server1'],
        'analyzer': ['telegraf.test-server1'],
        'analyzer_batch': ['telegraf.test-server1', 'test_batch_metrics.'],
        'mirage': ['telegraf.test-server1', 'test_batch_metrics.'],
        'SNAB': ['\.'],
    }
    
    SNAB_anomalyScore = {
        'all': [],
        'analyzer': ['telegraf.test-server1'],
        'analyzer_batch': ['telegraf.test-server1', 'test_batch_metrics.'],
        'mirage': ['telegraf.test-server1', 'test_batch_metrics.'],
    }
    
SNAB_CHECKS = {}
Variables:

SNAB_CHECKS (dict) – ADVANCED FEATURE. A dictionary that defines any SNAB checks for apps (mirage only) in terms of what namespaces should be submitted to snab to be checked by which algorithm/s. EXPERIMENTAL.

  • Example:

    SNAB_CHECKS = {
        'mirage': {
            'testing': {
                'skyline_matrixprofile': {
                    'namespaces': ['telegraf'],
                    'algorithm_source': '/opt/skyline/github/skyline/skyline/custom_algorithms/skyline_matrixprofile.py',
                    'algorithm_parameters': {'windows': 5, 'k_discords': 20},
                    'max_execution_time': 10.0,
                    'debug_logging': True,
                    'alert_slack_channel': '#skyline'
                }
            }
        },
    }
    
SNAB_LOAD_TEST_ANALYZER = 0
Variables:

SNAB_LOAD_TEST_ANALYZER (int) – ADVANCED and EXPERIMENTAL FEATURE. Declare the number of metrics you want to load test via Analyzer.

SNAB_FLUX_LOAD_TEST_ENABLED = False
Variables:

SNAB_FLUX_LOAD_TEST_ENABLED (boolean) – ADVANCED FEATURE. Run flux load testing with snab.

SNAB_FLUX_LOAD_TEST_METRICS = 0
Variables:

SNAB_FLUX_LOAD_TEST_METRICS (int) – ADVANCED FEATURE. Declare the number of metrics you want to load test via flux. 0 disables.

SNAB_FLUX_LOAD_TEST_METRICS_PER_POST = 480
Variables:

SNAB_FLUX_LOAD_TEST_METRICS_PER_POST (int) – ADVANCED FEATURE. Declare the number of metrics per POST to flux. This should ideally be less than the Graphite (not Skyline) MAX_DATAPOINTS_PER_MESSAGE setting, which defaults to 500 in Graphite.

SNAB_FLUX_LOAD_TEST_NAMESPACE_PREFIX = 'test.snab.flux_load_test'
Variables:

SNAB_FLUX_LOAD_TEST_NAMESPACE_PREFIX (str) – ADVANCED FEATURE. Declare the namespace for the test metrics. This is the namespace that will appear in Graphite with randomly generated metric names. NO trailing dot is required.

EXTERNAL_SETTINGS = {'mock_api_external_settings': {'enabled': False, 'endpoint': 'http://127.0.0.1:1500/mock_api?test_external_settings', 'post_data': {'token': None}}}
Variables:

EXTERNAL_SETTINGS (dict) – ADVANCED FEATURE. Skyline can fetch settings for namespaces from an external source. For full details of this functionality please see the External Settings documentation page at: https://earthgecko-skyline.readthedocs.io/en/latest/external_settings.html

LOCAL_EXTERNAL_SETTINGS = {}
Variables:

LOCAL_EXTERNAL_SETTINGS (dict) – ADVANCED FEATURE. You can declare local external settings in a dict which will override or inject the values that are retrieved from the EXTERNAL_SETTINGS endpoint. This functionality is for testing and is only documented in the code.

  • Example:

    LOCAL_EXTERNAL_SETTINGS = {
        '_global': {
            'correlate_alerts_only': {
                'use_key': None,
                'override': True,
                'type': 'boolean',
                'value': True,
            },
            'correlate_namespaces_only': {
                'use_key': 'namespace',
                'override': False,
                'type': 'list',
                'value': None,
            },
        },
        'external-test_external_settings': {
            'skip_metrics': ['skyline-test-external-settings.1.cpu[0-9]'],
            'correlation_maps': {
                'aws.webservers.nginx': [
                    'aws.euw1.webserver-1.nginx.apm.mainsite.avg_request_timing',
                    'aws.euw1.webserver-2.nginx.apm.mainsite.avg_request_timing',
                    'aws.use1.webserver-6.nginx.apm.mainsite.avg_request_timing',
                    'aws.use1.webserver-8.nginx.apm.mainsite.avg_request_timing',
                ],
                'aws.euw1.webservers': [
                    'aws.euw1.webserver-1', 'aws.euw1.webserver-2',
                ]
            },
            'override': False,
        }
    }
    

In the above example, if external-test_external_settings from EXTERNAL_SETTINGS did not have skip_metrics defined, the skip_metrics defined in LOCAL_EXTERNAL_SETTINGS would be added to the external-test_external_settings dict. If skip_metrics was defined in EXTERNAL_SETTINGS, the skip_metrics in LOCAL_EXTERNAL_SETTINGS would be ignored because override is set to False. However, if override was set to True, the key values defined in LOCAL_EXTERNAL_SETTINGS for external-test_external_settings would override the EXTERNAL_SETTINGS values for external-test_external_settings.
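
As a minimal Python sketch of these merge semantics (illustrative only; merge_local_external_settings is a hypothetical helper written for this description, not a Skyline function):

    def merge_local_external_settings(external, local):
        # external: the dict fetched from the EXTERNAL_SETTINGS endpoint
        # local: the LOCAL_EXTERNAL_SETTINGS dict
        merged = {namespace: dict(config) for namespace, config in external.items()}
        for namespace, local_config in local.items():
            if namespace == '_global':
                # _global items are applied per their own use_key/override rules
                continue
            override = local_config.get('override', False)
            config = merged.setdefault(namespace, {})
            for key, value in local_config.items():
                if key == 'override':
                    continue
                # inject the key if it is missing, only replace it if override is True
                if key not in config or override:
                    config[key] = value
        return merged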

PROMETHEUS_INGESTION = False
Variables:

PROMETHEUS_INGESTION (boolean) – ADVANCED FEATURE. Whether to enable Skyline to ingest remote writes from Prometheus. Skyline accepts both native Prometheus remote_write protobuf data and remote_storage_adapter writes with the influxdb format. Please see the Prometheus integration documentation page for full details: https://earthgecko-skyline.readthedocs.io/en/latest/prometheus.html

PROMETHEUS_SETTINGS = {}
Variables:

PROMETHEUS_SETTINGS (dict) – UNDER DEVELOPMENT (not functional) Prometheus can be enabled to push metrics to Skyline which Skyline will add to Redis and analyse. In the Mirage context Skyline will fetch data from Prometheus. For full details of this functionality please see the Prometheus integration documentation page at: https://earthgecko-skyline.readthedocs.io/en/latest/prometheus.html

  • Example:

    PROMETHEUS_SETTINGS = {
        'prometheus.example.org': {
            'scheme': 'https',
            'port': '443',
            # Usernames and hashed passwords that have full access to the web
            # server via basic authentication. If empty, no basic authentication is
            # required. Passwords are hashed with bcrypt.
            'basic_auth_users': {
                'prometheus': '<bcrypt_password_str>',
            },
            'endpoint': 'api/v1',
            'format': '<influxdb|graphite>',  # default influxdb
            'learn_key_metrics': True,
            'analyse_mode': '<key_metrics_and_group|key_metrics|all>',
            'key_metrics_config': '<path_to_key_metrics_file>',
            'longterm_storage': '<promscale|thanos|cortex>',
            'longterm_storage_endpoint': '<promscale|thanos|cortex>',
        }
    }
    
PROMETHEUS_METRIC_OPTS = {'_default': {'max_cardinality': 1000}, 'tenant_label': 'x_tenant_id', 'test.prometheus': {'max_cardinality': 1000, 'metrics': {'test.prometheus.'}}}

opentelemetry settings - EXPERIMENTAL and for DEVELOPMENT

OTEL_ENABLED = False
Variables:

OTEL_ENABLED (boolean) – EXPERIMENTAL FEATURE. Whether to enable opentelemetry traces on Skyline apps.

OTEL_JAEGEREXPORTER_AGENT_HOST_NAME = '127.0.0.1'
Variables:

OTEL_JAEGEREXPORTER_AGENT_HOST_NAME (str) – EXPERIMENTAL FEATURE. The IP address or FQDN of Jaeger (or an opentelemetry collector - otelcol).

OTEL_JAEGEREXPORTER_AGENT_PORT = 26831
Variables:

OTEL_JAEGEREXPORTER_AGENT_PORT (int) – EXPERIMENTAL FEATURE. The port for Jaeger or an opentelemetry collector (otelcol).

WEBAPP_SERVE_JAEGER = False
Variables:

WEBAPP_SERVE_JAEGER (boolean) – Whether to serve Jaeger via the webapp. This requires nginx configuration, see: https://github.com/earthgecko/skyline/blob/master/etc/skyline.nginx.conf.d.example

JULIA_OPTS = {'analyzer': {'enabled': False}, 'analyzer_batch': {'enabled': False}, 'mirage': {'enabled': False}}
Variables:

JULIA_OPTS (dict) – Options for running algorithms with julialang.

VICTORIAMETRICS_ENABLED = False
Variables:

VICTORIAMETRICS_ENABLED (boolean) – EXPERIMENTAL FEATURE. Whether victoriametrics is enabled as a backend store for labelled metrics from Prometheus, influxdb, etc.

VICTORIAMETRICS_OPTS = {'host': '127.0.0.1', 'jsonl_insert_path': '/api/v1/import', 'password': None, 'port': 8428, 'scheme': 'http', 'select_path': None, 'username': None}
Variables:

VICTORIAMETRICS_OPTS (dict) – EXPERIMENTAL FEATURE. A dictionary with the details for the victoriametrics backend store. The jsonl_insert_path and select_path default to the standalone VictoriaMetrics paths; for a clustered version of VictoriaMetrics these would be as in the cluster example below.

  • Example - local:

    VICTORIAMETRICS_OPTS = {
        'scheme': 'http',
        'host': '127.0.0.1',
        'port': 8428,
        'username': None,
        'password': None,
        'jsonl_insert_path': '/api/v1/import',
        'select_path': None,
    }
    
  • Example - cluster:

    VICTORIAMETRICS_OPTS = {
        'scheme': 'https',
        'host': 'victoriametrics-cluster-1.example.org',
        'port': 443,
        'username': None,
        'password': None,
        'jsonl_insert_path': '/insert/0/api/v1/import',
        'select_path': '/select/0',
    }
    
MEMRAY_ENABLED = False
Variables:

MEMRAY_ENABLED (boolean) – EXPERIMENTAL FEATURE AND DEVELOPMENT ONLY. Whether memray is enabled.

EXPOSE_PROMETHEUS_METRICS = False
Variables:

EXPOSE_PROMETHEUS_METRICS (boolean) – Whether to expose Skyline metrics in a Prometheus exporter style for scraping on /metrics. NOTE that the Prometheus skyline scrape config should have a scrape_interval of 60s; any value less than 60s will not surface the desired metrics. Skyline metrics have a 60s resolution.

VORTEX_ENABLED = True
Variables:

VORTEX_ENABLED (boolean) – Whether to enable Skyline vortex to allow ad hoc analysis of metrics.

VORTEX_TIMESERIES_JSON_TO_DISK = True
Variables:

VORTEX_TIMESERIES_JSON_TO_DISK (boolean) – By default Vortex saves submitted timeseries to disk to prevent Redis from exhausting memory if 1000s of timeseries are submitted at once.

VORTEX_FULL_DURATION_RESOLUTIONS = {86400: 60, 604800: 600}
Variables:

VORTEX_FULL_DURATION_RESOLUTIONS (dict) – The downsampling resolutions, keyed by duration in seconds, which are required to ensure analysis speed and effective features profiles (see the sketch below).
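
As a minimal sketch of how such a mapping can be applied (illustrative only; resolution_for_duration is a hypothetical helper, not the Vortex implementation):

    def resolution_for_duration(duration, full_duration_resolutions):
        # select the downsampling resolution of the longest configured
        # duration (in seconds) that the submitted time series covers
        resolution = None
        for full_duration in sorted(full_duration_resolutions):
            if duration >= full_duration:
                resolution = full_duration_resolutions[full_duration]
        return resolution

    # a series spanning 7 days (604800 seconds) downsamples to 600 second bins
    resolution_for_duration(604800, {86400: 60, 604800: 600})  # -> 600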

VORTEX_ALGORITHMS = {'adtk_level_shift': {'algorithm_parameters': {'anomaly_window': 1, 'c': 9.9, 'realtime_analysis': False, 'return_results': True, 'window': 10}, 'outlier_value': 1}, 'adtk_persist': {'algorithm_parameters': {'anomaly_window': 1, 'c': 9.9, 'realtime_analysis': False, 'return_results': True, 'window': 10}, 'outlier_value': 1}, 'adtk_seasonal': {'algorithm_parameters': {'anomaly_window': 1, 'c': 9.9, 'realtime_analysis': False, 'return_results': True, 'window': 10}, 'outlier_value': 1}, 'adtk_volatility_shift': {'algorithm_parameters': {'anomaly_window': 1, 'c': 9.9, 'realtime_analysis': False, 'return_results': True, 'window': 10}, 'outlier_value': 1}, 'dbscan': {'algorithm_parameters': {'anomaly_window': 1, 'min_samples': 4, 'n_neighbors': 2, 'return_results': True}, 'outlier_value': -1}, 'default': {'consensus': [['sigma', 'spectral_residual']], 'sigma': {'anomaly_window': 1, 'consensus': 6, 'return_results': True, 'sigma_value': 3}, 'spectral_residual': {'algorithm_parameters': {'anomaly_window': 1, 'return_results': True}, 'outlier_value': 1}}, 'isolation_forest': {'algorithm_parameters': {'anomaly_window': 1, 'contamination': 0.01, 'return_results': True}, 'outlier_value': -1}, 'lof': {'algorithm_parameters': {'anomaly_window': 1, 'n_neighbors': 20, 'return_results': True}, 'outlier_value': -1}, 'm66': {'algorithm_parameters': {'anomaly_window': 1, 'minimum_sparsity': 70, 'nth_median': 6, 'return_results': True, 'sigma': 6, 'window': 5}, 'outlier_value': 1}, 'matrixprofile': {'algorithm_parameters': {'anomaly_window': 1, 'k_discords': 20, 'return_results': True, 'windows': 5}, 'outlier_value': 1}, 'mstl': {'algorithm_parameters': {'anomaly_window': 1, 'horizon': 1, 'level': 99, 'max_execution_time': 180, 'return_results': True, 'season_days': 7, 'season_hours': 24}, 'outlier_value': 1}, 'one_class_svm': {'algorithm_parameters': {'anomaly_window': 1, 'return_results': True}, 'outlier_value': -1}, 'pca': {'algorithm_parameters': {'anomaly_window': 1, 'return_results': True, 'threshold': 0.7}, 'outlier_value': 0.7}, 'prophet': {'algorithm_parameters': {'anomaly_window': 1, 'return_results': True}, 'outlier_value': 1}, 'sigma': {'algorithm_parameters': {'anomaly_window': 1, 'consensus': 6, 'return_results': True, 'sigma_value': 3}, 'outlier_value': 1}, 'spectral_residual': {'algorithm_parameters': {'anomaly_window': 1, 'return_results': True}, 'outlier_value': 1}}
Variables:

VORTEX_ALGORITHMS (dict) – The algorithms available to Vortex and their default parameters.

NUMBA_CACHE_DIR = '/opt/skyline/.cache/numba'
Variables:

NUMBA_CACHE_DIR (str) – ADVANCED FEATURE. The default cache dir that numba uses to cache compiled jit files. Under the normal Skyline build on CentOS 8 numba defaults to /opt/skyline/.cache/numba WITHOUT this being set in an environment variable. Change with caution and only if you have a full understanding of the numba jit caching layout https://numba.readthedocs.io/en/stable/developer/caching.html
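
NUMBA_CACHE_DIR corresponds to numba's own NUMBA_CACHE_DIR environment variable. If it does need to be set explicitly, it must be exported before numba is imported, for example:

    import os
    # must be set before numba is imported for the cache dir to take effect
    os.environ['NUMBA_CACHE_DIR'] = '/opt/skyline/.cache/numba'
    import numba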

skyline.skyline_functions module

Skyline functions

These are shared functions that are required in multiple modules.

get_redis_conn(current_skyline_app)[source]

Get a Redis connection

Parameters:

current_skyline_app (str) – the skyline app using this function

Returns:

REDIS_CONN

Return type:

object

get_redis_conn_decoded(current_skyline_app)[source]

Get a Redis connection with decoded responses, to read sets

Parameters:

current_skyline_app (str) – the skyline app using this function

Returns:

REDIS_CONN_DECODED

Return type:

object
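
  • Example usage (a sketch; the app name and the Redis set name are hypothetical):

    from skyline_functions import get_redis_conn_decoded
    redis_conn_decoded = get_redis_conn_decoded('analyzer')
    # decoded responses return str items rather than bytes
    members = redis_conn_decoded.smembers('example.redis.set')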

send_graphite_metric(current_skyline_app, metric, value)[source]

Sends the skyline_app metrics to the GRAPHITE_HOST if a graphite host is defined.

Parameters:
  • current_skyline_app (str) – the skyline app using this function

  • metric (str) – the metric namespace

  • value (str) – the metric value (as a str not an int)

Returns:

True or False

Return type:

boolean
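
  • Example usage (a sketch; the app name, metric namespace and value are hypothetical):

    from skyline_functions import send_graphite_metric
    # the value must be passed as a str, not an int or float
    sent = send_graphite_metric('analyzer', 'skyline.analyzer.example-host.run_time', '3.4')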

mkdir_p(make_path)[source]

Create nested directories.

Parameters:

make_path (str) – directory path to create

Returns:

True

get_graphite_graph_image(current_skyline_app, url=None, image_file=None)[source]

Fetches a Graphite graph image of a metric and saves the image to the specified file.

Parameters:
  • current_skyline_app (str) – the app calling the function so the function knows which log to write to.

  • url (str) – the graph URL

  • image_file (str) – the absolute path and file name to save the graph png image as.

Returns:

True

Return type:

boolean

load_metric_vars(current_skyline_app, metric_vars_file)[source]

Import the metric variables for a check from a metric check variables file

Parameters:
  • current_skyline_app (str) – the skyline app using this function

  • metric_vars_file (str) – the path and filename to the metric variables files

Returns:

the metric_vars module object or False

Return type:

object or boolean

write_data_to_file(current_skyline_app, write_to_file, mode, data)[source]

Write data to a file

Parameters:
  • current_skyline_app (str) – the skyline app using this function

  • write_to_file (str) – the path and filename to write the data into

  • mode (str) – w to overwrite, a to append

  • data (str) – the data to write to the file

Returns:

True or False

Return type:

boolean

fail_check(current_skyline_app, failed_check_dir, check_file_to_fail)[source]

Move a failed check file.

Parameters:
  • current_skyline_app (str) – the skyline app using this function

  • failed_check_dir (str) – the directory where failed checks are moved to

  • check_file_to_fail (str) – failed check file to move

Returns:

True, False

Return type:

boolean

get_graphite_metric(current_skyline_app, metric, from_timestamp, until_timestamp, data_type, output_object, check_for_derivative=True)[source]

Fetch data from graphite and return it as object or save it as file

Parameters:
  • current_skyline_app (str) – the skyline app using this function

  • metric (str) – metric name

  • from_timestamp (str) – unix timestamp

  • until_timestamp (str) – unix timestamp

  • data_type (str) – image, json or list

  • output_object (str) – object or path and filename to save data as, if set to object, the object is returned

  • check_for_derivative (boolean) – check if the metric is a derivative and apply the nonNegativeDerivative Graphite function if it is

Returns:

timeseries string, True, False

Return type:

str or boolean
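
  • Example usage (a sketch; the app name and metric are hypothetical, with the timestamps passed as str per the parameter types above):

    import time
    from skyline_functions import get_graphite_metric
    until_timestamp = int(time.time())
    from_timestamp = until_timestamp - 86400
    # with output_object set to 'object' the timeseries is returned
    timeseries = get_graphite_metric(
        'webapp', 'stats.example-server-1.cpu.user', str(from_timestamp),
        str(until_timestamp), 'list', 'object')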

filesafe_metricname(metricname)[source]

Returns a file system safe name for a metric name in terms of creating check files, etc

send_anomalous_metric_to(current_skyline_app, send_to_app, timeseries_dir, metric_timestamp, base_name, datapoint, from_timestamp, triggered_algorithms, timeseries, full_duration, parent_id, algorithms_run=[])[source]

Assign a metric and timeseries to Crucible or Ionosphere.

RepresentsInt(s)[source]

As per http://stackoverflow.com/a/1267145 and @Aivar I must agree with @Triptycha > “This 5 line function is not a complex mechanism.”

mysql_select(current_skyline_app, select)[source]

Select data from mysql database

Parameters:
  • current_skyline_app (str) – the Skyline app that is calling the function

  • select (str) – the select string

Returns:

tuple

Return type:

tuple, boolean

  • Example usage:

    from skyline_functions import mysql_select
    current_skyline_app = 'webapp'  # the Skyline app calling the function
    query = 'select id, metric from anomalies'
    results = mysql_select(current_skyline_app, query)
    
  • Example of the 0 indexed results tuple, which can hold multiple results:

    >>> print('results: %s' % str(results))
    results: [(1, u'test1'), (2, u'test2')]
    
    >>> print('results[0]: %s' % str(results[0]))
    results[0]: (1, u'test1')
    

Note

  • If the MySQL query fails, a tuple is not returned; instead one of the following is returned:
    • False

    • None

nonNegativeDerivative(timeseries)[source]

This function is used to convert an integral or incrementing count to a derivative by calculating the delta between subsequent datapoints. The function ignores datapoints that trend down and is useful for metrics that increase over time and then reset. This is based on part of the Graphite render function nonNegativeDerivative at: https://github.com/graphite-project/graphite-web/blob/1e5cf9f659f5d4cc0fa53127f756a1916e62eb47/webapp/graphite/render/functions.py#L1627
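
A minimal sketch of the described behaviour (illustrative only; see the linked Graphite source for the actual implementation):

    def non_negative_derivative_sketch(timeseries):
        # timeseries is a list of (timestamp, value) datapoints
        derivative = []
        last_value = None
        for timestamp, value in timeseries:
            if last_value is not None and value >= last_value:
                derivative.append((timestamp, value - last_value))
            # a value lower than its predecessor is treated as a counter
            # reset and that datapoint is ignored
            last_value = value
        return derivative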

strictly_increasing_monotonicity(timeseries)[source]

This function is used to determine whether a timeseries is strictly increasing monotonically; it will only return True if the values are strictly increasing, i.e. an incrementing count.
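
The described check reduces to a pairwise comparison of consecutive values, for example (a minimal sketch, not the actual implementation):

    def is_strictly_increasing(values):
        # True only if every value is greater than its predecessor
        return all(later > earlier for earlier, later in zip(values, values[1:]))

    is_strictly_increasing([1, 2, 2, 3])  # False
    is_strictly_increasing([1, 2, 3, 7])  # True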

in_list(metric_name, check_list)[source]

Check if the metric is in list.

# @added 20170602 - Feature #2034: analyse_derivatives # Feature #1978: worker - DO_NOT_SKIP_LIST This is a partial copy of the SKIP_LIST logic which allows for a string match or a match on dotted elements within the metric namespace, as used in Horizon/worker.

get_memcache_metric_object(current_skyline_app, base_name)[source]

Return the metrics_db_object from memcache if it exists.

get_memcache_fp_ids_object(current_skyline_app, base_name)[source]

Return the fp_ids list from memcache if it exists.

move_file(current_skyline_app, dest_dir, file_to_move)[source]

Move a file.

Parameters:
  • current_skyline_app (str) – the skyline app using this function

  • dest_dir (str) – the directory the file is to be moved to

  • file_to_move (str) – path and filename of the file to move

Returns:

True, False

Return type:

boolean

is_derivative_metric(current_skyline_app, base_name)[source]

Determine if a metric is a known derivative metric.

Parameters:
  • current_skyline_app (str) – the Skyline app that is calling the function

  • base_name (str) – The metric base_name

Returns:

boolean

Return type:

boolean

set_metric_as_derivative(current_skyline_app, base_name)[source]

Add the metric to the derivative_metrics Redis set and create a z_derivative_metrics Redis key.

Parameters:
  • current_skyline_app (str) – the Skyline app that is calling the function

  • base_name (str) – the metric base_name

Returns:

boolean

Return type:

boolean

get_user_details(current_skyline_app, desired_value, key, value)[source]

Determines the user details for a user given the desired_value, key and value. For example, if you want the username of the user with id 1, get_user_details(current_skyline_app, 'username', 'id', 1) will return 'Skyline', and get_user_details(current_skyline_app, 'id', 'username', 'Skyline') will return '1'.

Parameters:
  • current_skyline_app (str) – the app calling the function so the function knows which log to write to.

  • desired_value (str) – the id or username

  • key (str) – the field for the value

  • value (str or int) – the value of the item you want to query in the key field

Returns:

tuple

Return type:

(boolean, str)

get_graphite_port(current_skyline_app)[source]

Returns graphite port based on configuration in settings.py

get_graphite_render_uri(current_skyline_app)[source]

Returns graphite render uri based on configuration in settings.py

get_graphite_custom_headers(current_skyline_app)[source]

Returns custom http headers

forward_alert(current_skyline_app, alert_data)[source]

Sends alert data to an HTTP endpoint

Parameters:
  • current_skyline_app (str) – the app calling the function so the function knows which log to write to.

  • alert_data (list) – a list containing the alert data

Returns:

tuple

Return type:

(boolean, str)

is_batch_metric(current_skyline_app, base_name)[source]

Determine if the metric is designated as an analyzer batch processing metric

Parameters:
  • current_skyline_app (str) – the app calling the function so the function knows which log to write to.

  • base_name (str) – the metric base_name

Returns:

True or False

Return type:

boolean

is_check_airgap_metric(base_name)[source]

Check if a metric matches a metric namespace in CHECK_AIRGAPS or one in SKIP_AIRGAPS.

Parameters:

base_name (str) – the metric base_name

Returns:

True or False

Return type:

boolean

sort_timeseries(timeseries)[source]

This function is used to sort a time series by timestamp to ensure that there are no unordered timestamps in the time series, which can be artefacts of the collector or carbon-relay. All Redis time series are therefore sorted by timestamp before analysis.
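
The described sort is equivalent to ordering the datapoints by their timestamp, for example (a minimal sketch with made-up datapoints):

    timeseries = [(1662480060, 1.0), (1662480000, 0.0), (1662480120, 2.0)]
    # each datapoint is a (timestamp, value) pair
    sorted_timeseries = sorted(timeseries, key=lambda datapoint: datapoint[0])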

historical_data_dir_exists(current_skyline_app, ionosphere_data_dir)[source]

This function is used to determine if an ionosphere data dir exists and if it does return the path.

add_panorama_alert(current_skyline_app, metric_timestamp, base_name)[source]

Send an event to Panorama to update an anomaly as alerted on.

Parameters:
  • current_skyline_app (str) – the Skyline app that is calling the function

  • metric_timestamp (int) – The anomaly timestamp

  • base_name (str) – The metric base_name

Returns:

boolean

Return type:

boolean

update_item_in_redis_set(current_skyline_app, redis_set, original_item, updated_item, log=False)[source]

Update a list, dict or str item in a Redis set.

Parameters:
  • current_skyline_app (str) – the Skyline app calling the function

  • redis_set (str) – the Redis set to operate on

  • original_item (object (list, dict or a str)) – the original set item; this must be the object as it exists in the set: if it is a list use the list, if it is a dict use the dict, and only use a str if the item is a string.

  • updated_item (object (list, dict or a str)) – the updated set item, the list, dict or str

  • log (boolean) – whether to write the update details to log

Returns:

boolean

Return type:

boolean
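
  • Example usage (a sketch; the app name, set name and items are hypothetical, and the original item must match the stored object exactly):

    from skyline_functions import update_item_in_redis_set
    original_item = [1662480000, 'test.metric.1', 'pending']
    updated_item = [1662480000, 'test.metric.1', 'done']
    updated = update_item_in_redis_set(
        'analyzer', 'analyzer.example.work_queue', original_item,
        updated_item, log=True)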

sanitise_graphite_url(current_skyline_app, graphite_url)[source]

Transform any targets in the URL that need modification, such as double encoding forward slashes, and return whether the URL was sanitised along with the url.

Parameters:
  • current_skyline_app (str) – the Skyline app calling the function

  • graphite_url (str) – the URL

Returns:

sanitised, url

Return type:

tuple

encode_graphite_metric_name(current_skyline_app, metric)[source]

Transform a metric name into a Graphite compliant encoded url name

Parameters:
  • current_skyline_app (str) – the Skyline app calling the function

  • metric (str) – the base_name

Returns:

encoded_graphite_metric_name

Return type:

str

get_anomaly_type(current_skyline_app, anomaly_id)[source]

Given an anomaly id return any classified types

Parameters:
  • current_skyline_app (str) – the Skyline app calling the function

  • anomaly_id (int) – the anomaly id

Returns:

(anomaly_type_list, anomaly_type_id_list)

Return type:

tuple

mirage_load_metric_vars(current_skyline_app, metric_vars_file, return_dict=False)[source]

Load the mirage metric variables for a check from a metric check variables file

Parameters:
  • current_skyline_app (str) – the skyline app using this function

  • metric_vars_file (str) – the path and filename to the metric variables files

  • return_dict (boolean) – whether to return the metric variables as a dict rather than a list

Returns:

the metric_vars list or False

Return type:

list

skyline.skyline_version module

version info

skyline.slack_functions module

slack functions

These are shared slack functions that are required in multiple modules.

slack_post_message(current_skyline_app, channel, thread_ts, message)[source]

Post a message to a slack channel or thread.

Parameters:
  • current_skyline_app (str) – the skyline app using this function

  • channel (str) – the slack channel

  • thread_ts (str or None) – the slack thread timestamp

  • message (str) – message

Returns:

slack response dict

Return type:

dict

slack_post_reaction(current_skyline_app, channel, thread_ts, emoji)[source]

Post a reaction (emoji) to a slack channel or thread message.

Parameters:
  • current_skyline_app (str) – the skyline app using this function

  • channel (str) – the slack channel

  • thread_ts (str) – the slack thread timestamp

  • emoji (str) – emoji e.g. thumbsup

Returns:

slack response dict

Return type:

dict

skyline.tsfresh_feature_names module

tsfresh_feature_names.py

TSFRESH_VERSION = '0.19.1'
Variables:

TSFRESH_VERSION (str) – The version of tsfresh installed by pip; this is important in terms of feature extraction baselines.

TSFRESH_BASELINE_VERSION = '0.19.1'
Variables:

TSFRESH_BASELINE_VERSION (str) – The version of tsfresh that was used to generate the feature extraction baselines.

TSFRESH_FEATURES = [[1, 'value__symmetry_looking__r_0.65'], [2, 'value__first_location_of_maximum'], [3, 'value__absolute_sum_of_changes'], [4, 'value__large_number_of_peaks__n_1'], [5, 'value__large_number_of_peaks__n_3'], [6, 'value__large_number_of_peaks__n_5'], [7, 'value__last_location_of_minimum'], [8, 'value__mean_abs_change_quantiles__qh_0.4__ql_0.0'], [9, 'value__mean_abs_change_quantiles__qh_0.4__ql_0.2'], [10, 'value__mean_abs_change_quantiles__qh_0.4__ql_0.4'], [11, 'value__mean_abs_change_quantiles__qh_0.4__ql_0.6'], [12, 'value__mean_abs_change_quantiles__qh_0.4__ql_0.8'], [13, 'value__maximum'], [14, 'value__value_count__value_-inf'], [15, 'value__skewness'], [16, 'value__number_peaks__n_3'], [17, 'value__longest_strike_above_mean'], [18, 'value__number_peaks__n_5'], [19, 'value__first_location_of_minimum'], [20, 'value__large_standard_deviation__r_0.25'], [21, 'value__augmented_dickey_fuller'], [22, 'value__count_above_mean'], [23, 'value__symmetry_looking__r_0.75'], [24, 'value__percentage_of_reoccurring_datapoints_to_all_datapoints'], [25, 'value__mean_abs_change'], [26, 'value__mean_change'], [27, 'value__value_count__value_1'], [28, 'value__value_count__value_0'], [29, 'value__minimum'], [30, 'value__autocorrelation__lag_5'], [31, 'value__median'], [32, 'value__symmetry_looking__r_0.85'], [33, 'value__mean_abs_change_quantiles__qh_0.8__ql_0.4'], [34, 'value__symmetry_looking__r_0.05'], [35, 'value__mean_abs_change_quantiles__qh_0.8__ql_0.6'], [36, 'value__value_count__value_inf'], [37, 'value__mean_abs_change_quantiles__qh_0.8__ql_0.0'], [38, 'value__mean_abs_change_quantiles__qh_0.8__ql_0.2'], [39, 'value__large_standard_deviation__r_0.45'], [40, 'value__mean_abs_change_quantiles__qh_0.8__ql_0.8'], [41, 'value__autocorrelation__lag_6'], [42, 'value__autocorrelation__lag_7'], [43, 'value__autocorrelation__lag_4'], [44, 'value__last_location_of_maximum'], [45, 'value__autocorrelation__lag_2'], [46, 'value__autocorrelation__lag_3'], [47, 'value__autocorrelation__lag_0'], [48, 'value__autocorrelation__lag_1'], [49, 'value__autocorrelation__lag_8'], [50, 'value__autocorrelation__lag_9'], [51, 'value__range_count__max_1__min_-1'], [52, 'value__variance'], [53, 'value__mean'], [54, 'value__standard_deviation'], [55, 'value__mean_abs_change_quantiles__qh_0.6__ql_0.6'], [56, 'value__mean_abs_change_quantiles__qh_0.6__ql_0.4'], [57, 'value__mean_abs_change_quantiles__qh_0.6__ql_0.2'], [58, 'value__mean_abs_change_quantiles__qh_0.6__ql_0.0'], [59, 'value__symmetry_looking__r_0.15'], [60, 'value__ratio_value_number_to_time_series_length'], [61, 'value__mean_second_derivate_central'], [62, 'value__number_peaks__n_1'], [63, 'value__length'], [64, 'value__mean_abs_change_quantiles__qh_1.0__ql_0.0'], [65, 'value__mean_abs_change_quantiles__qh_1.0__ql_0.2'], [66, 'value__mean_abs_change_quantiles__qh_1.0__ql_0.4'], [67, 'value__time_reversal_asymmetry_statistic__lag_3'], [68, 'value__mean_abs_change_quantiles__qh_1.0__ql_0.6'], [69, 'value__mean_abs_change_quantiles__qh_1.0__ql_0.8'], [70, 'value__sum_of_reoccurring_values'], [71, 'value__abs_energy'], [72, 'value__variance_larger_than_standard_deviation'], [73, 'value__mean_abs_change_quantiles__qh_0.6__ql_0.8'], [74, 'value__kurtosis'], [75, 'value__approximate_entropy__m_2__r_0.7'], [76, 'value__approximate_entropy__m_2__r_0.5'], [77, 'value__symmetry_looking__r_0.25'], [78, 'value__approximate_entropy__m_2__r_0.3'], [79, 'value__percentage_of_reoccurring_values_to_all_values'], [80, 'value__approximate_entropy__m_2__r_0.1'], [81, 
'value__time_reversal_asymmetry_statistic__lag_2'], [82, 'value__approximate_entropy__m_2__r_0.9'], [83, 'value__time_reversal_asymmetry_statistic__lag_1'], [84, 'value__symmetry_looking__r_0.35'], [85, 'value__large_standard_deviation__r_0.3'], [86, 'value__large_standard_deviation__r_0.2'], [87, 'value__large_standard_deviation__r_0.1'], [88, 'value__large_standard_deviation__r_0.0'], [89, 'value__large_standard_deviation__r_0.4'], [90, 'value__large_standard_deviation__r_0.15'], [91, 'value__mean_autocorrelation'], [92, 'value__binned_entropy__max_bins_10'], [93, 'value__large_standard_deviation__r_0.35'], [94, 'value__symmetry_looking__r_0.95'], [95, 'value__longest_strike_below_mean'], [96, 'value__sum_values'], [97, 'value__symmetry_looking__r_0.45'], [98, 'value__symmetry_looking__r_0.6'], [99, 'value__symmetry_looking__r_0.7'], [100, 'value__symmetry_looking__r_0.4'], [101, 'value__symmetry_looking__r_0.5'], [102, 'value__symmetry_looking__r_0.2'], [103, 'value__symmetry_looking__r_0.3'], [104, 'value__symmetry_looking__r_0.0'], [105, 'value__symmetry_looking__r_0.1'], [106, 'value__has_duplicate'], [107, 'value__symmetry_looking__r_0.8'], [108, 'value__symmetry_looking__r_0.9'], [109, 'value__value_count__value_nan'], [110, 'value__mean_abs_change_quantiles__qh_0.2__ql_0.8'], [111, 'value__large_standard_deviation__r_0.05'], [112, 'value__mean_abs_change_quantiles__qh_0.2__ql_0.2'], [113, 'value__has_duplicate_max'], [114, 'value__mean_abs_change_quantiles__qh_0.2__ql_0.0'], [115, 'value__mean_abs_change_quantiles__qh_0.2__ql_0.6'], [116, 'value__mean_abs_change_quantiles__qh_0.2__ql_0.4'], [117, 'value__number_cwt_peaks__n_5'], [118, 'value__number_cwt_peaks__n_1'], [119, 'value__sample_entropy'], [120, 'value__has_duplicate_min'], [121, 'value__symmetry_looking__r_0.55'], [122, 'value__count_below_mean'], [123, 'value__quantile__q_0.1'], [124, 'value__quantile__q_0.2'], [125, 'value__quantile__q_0.3'], [126, 'value__quantile__q_0.4'], [127, 'value__quantile__q_0.6'], [128, 'value__quantile__q_0.7'], [129, 'value__quantile__q_0.8'], [130, 'value__quantile__q_0.9'], [131, 'value__ar_coefficient__k_10__coeff_0'], [132, 'value__ar_coefficient__k_10__coeff_1'], [133, 'value__ar_coefficient__k_10__coeff_2'], [134, 'value__ar_coefficient__k_10__coeff_3'], [135, 'value__ar_coefficient__k_10__coeff_4'], [136, 'value__index_mass_quantile__q_0.1'], [137, 'value__index_mass_quantile__q_0.2'], [138, 'value__index_mass_quantile__q_0.3'], [139, 'value__index_mass_quantile__q_0.4'], [140, 'value__index_mass_quantile__q_0.6'], [141, 'value__index_mass_quantile__q_0.7'], [142, 'value__index_mass_quantile__q_0.8'], [143, 'value__index_mass_quantile__q_0.9'], [144, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_0__w_2"'], [145, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_1__w_2"'], [146, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_2__w_2"'], [147, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_3__w_2"'], [148, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_4__w_2"'], [149, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_5__w_2"'], [150, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_6__w_2"'], [151, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_7__w_2"'], [152, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_8__w_2"'], [153, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_9__w_2"'], [154, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_10__w_2"'], [155, '"value__cwt_coefficients__widths_(2, 5, 10, 
20)__coeff_11__w_2"'], [156, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_12__w_2"'], [157, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_13__w_2"'], [158, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_14__w_2"'], [159, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_0__w_5"'], [160, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_1__w_5"'], [161, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_2__w_5"'], [162, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_3__w_5"'], [163, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_4__w_5"'], [164, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_5__w_5"'], [165, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_6__w_5"'], [166, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_7__w_5"'], [167, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_8__w_5"'], [168, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_9__w_5"'], [169, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_10__w_5"'], [170, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_11__w_5"'], [171, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_12__w_5"'], [172, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_13__w_5"'], [173, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_14__w_5"'], [174, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_0__w_10"'], [175, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_1__w_10"'], [176, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_2__w_10"'], [177, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_3__w_10"'], [178, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_4__w_10"'], [179, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_5__w_10"'], [180, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_6__w_10"'], [181, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_7__w_10"'], [182, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_8__w_10"'], [183, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_9__w_10"'], [184, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_10__w_10"'], [185, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_11__w_10"'], [186, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_12__w_10"'], [187, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_13__w_10"'], [188, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_14__w_10"'], [189, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_0__w_20"'], [190, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_1__w_20"'], [191, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_2__w_20"'], [192, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_3__w_20"'], [193, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_4__w_20"'], [194, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_5__w_20"'], [195, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_6__w_20"'], [196, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_7__w_20"'], [197, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_8__w_20"'], [198, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_9__w_20"'], [199, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_10__w_20"'], [200, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_11__w_20"'], [201, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_12__w_20"'], [202, '"value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_13__w_20"'], [203, '"value__cwt_coefficients__widths_(2, 5, 10, 
20)__coeff_14__w_20"'], [204, 'value__spkt_welch_density__coeff_2'], [205, 'value__spkt_welch_density__coeff_5'], [206, 'value__spkt_welch_density__coeff_8'], [207, 'value__fft_coefficient__coeff_0'], [208, 'value__fft_coefficient__coeff_1'], [209, 'value__fft_coefficient__coeff_2'], [210, 'value__fft_coefficient__coeff_3'], [211, 'value__fft_coefficient__coeff_4'], [212, 'value__fft_coefficient__coeff_5'], [213, 'value__fft_coefficient__coeff_6'], [214, 'value__fft_coefficient__coeff_7'], [215, 'value__fft_coefficient__coeff_8'], [216, 'value__fft_coefficient__coeff_9']]
Variables:

TSFRESH_FEATURES (array) – This array defines the Skyline id for each known tsfresh feature.

Warning

This array is linked to relational fields and ids in the database and as such its entries should be considered immutable objects that must not be modified after they are created. This array should only ever be extended.

Note

There is a helper script to generate this array from the feature names returned by the current/running version of tsfresh and compare them to this array. The helper script outputs the changes and the full generated array for diffing against this array of known feature names. See: skyline/tsfresh_features/generate_tsfresh_features.py

skyline.validate_settings module

validate_settings_variables(current_skyline_app)[source]

This function is used by agent.py to validate that the variables in settings.py are valid

Parameters:

current_skyline_app (str) – the skyline app using this function

Returns:

True or False

Return type:

boolean

skyline.validate_settings_test module

Module contents

version info