Overview

Data forecasting and anomaly detection utilize your data history to either project the way your network behavior is changing into the future (forecasting), or to compare the current network/hardware behavior to what is considered to be typical for your environment (anomaly detection). Statseeker Baselining exposes the underlying data model used to support these features, presenting your data history in an easily consumed, summarized format.

There are 3 components of a baseline configuration:

  • Baseline History Time Range
  • Baseline Percentile
  • Baseline Format

Baselining History Time Range

The default data history range used in generating your baseline is the last 180 days. If your Statseeker installation has less than 180 days’ worth of data available for a device/entity, then it will use what is available. This default history period is used to establish what is considered typical behavior for your network.

Although the baseline history defaults to the previous 180 days, you can specify a different time range. Setting this time range to a specific period lets you exclude data that you don't want to include in your comparisons.

When applied to data forecasting, a restricted baseline means that you can forecast based on recent changes to your network (changes to demand, topology or infrastructure). The impact of those changes won't be diluted by the months' worth of data preceding them. Similar benefits can be seen with anomaly detection: excluding known periods of anomalous data prevents that data from skewing your baseline model of what is considered typical behavior.

Note: while you are able to restrict the range of data used to generate your baseline, a larger dataset will provide a more reliable, robust model. Where possible, keep your baseline time range to a minimum of 6 weeks' worth of data.

Baseline Range Editor

The Baseline Range Editor is the UI component used to specify your baseline history, that is, the period of historical data used to generate your baseline. This control is available from the Baseline History Edit button in report, dashboard and threshold configurations.

To modify the baseline history:

  • Click the Baseline History Edit button
  • Modify the start/end points for the history via the slider or the range fields

If an exclusion period is needed for any date range within the baseline history:

  • Click the Exclude Add button
  • Specify the start/end points of the exclusion period via the calendar controls

Time Filter Ranges - Inclusive/Exclusive Boundaries

The Range component of a filter has an inclusive start and an exclusive finish. For example, a time filter for the month of June 2020 would be displayed in the Query Info field as range = 2020-6-01 to 2020-7-01. This covers everything from, and including, the first second of June 1st up to, but excluding, the first second of July 1st.

  • Click Update when the baseline range configuration is complete
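
The inclusive-start, exclusive-finish rule described above can be illustrated with a minimal Python sketch. This is not Statseeker code or its time filter syntax: the function names are hypothetical, and the sketch assumes that exclusion periods follow the same boundary rule as the range itself.

```python
from datetime import datetime

def in_range(ts, start, end):
    """Half-open check: the start is included, the end is excluded."""
    return start <= ts < end

def in_baseline(ts, history, exclusions=()):
    """True if ts is inside the history range and outside every exclusion.

    Assumption: exclusion periods use the same half-open boundaries.
    """
    if not in_range(ts, *history):
        return False
    return not any(in_range(ts, start, end) for start, end in exclusions)

# Hypothetical baseline history (June 2020) with one excluded period
june_2020 = (datetime(2020, 6, 1), datetime(2020, 7, 1))
outage = (datetime(2020, 6, 10), datetime(2020, 6, 12))

print(in_baseline(datetime(2020, 6, 1, 0, 0, 0), june_2020))      # True
print(in_baseline(datetime(2020, 6, 30, 23, 59, 59), june_2020))  # True
print(in_baseline(datetime(2020, 7, 1, 0, 0, 0), june_2020))      # False
print(in_baseline(datetime(2020, 6, 11), june_2020, [outage]))    # False
```

Because the finish is exclusive, adjacent ranges such as June and July never overlap and never leave a gap between them.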

The Query string can also be manually edited, but this requires a working knowledge of the Statseeker time filter syntax. For details on the syntax required see Timefilter & Range-Query Syntax.

When working with complex time filter ranges you have the option to pass the configuration off to the Advanced Time Filter Editor. This editor also features an option to test a time filter and output all dates/times included in the specified range. For details on using the Advanced Time Filter Editor, and testing time filter ranges, see the Advanced Time Filter Editor.

Baseline Percentiles

While the allowed values are universal (0-100%), the functionality of the Baseline Percentile varies depending on where it is used: in custom graphical reports, or anywhere else in Statseeker.

Graphical Reports

When applied to custom graphical reports, the plotted baseline is always the 50th percentile (the median historical value), and the Baseline Percentile setting defines the range between the upper and lower baseline boundaries.

A Baseline Percentile value of 50% will set the upper/lower boundaries such that the shaded area contains 50% of the historical data values for that point in time. A setting of 95% will set boundaries so the shaded area contains 95% of the historical data values.

  • Baseline Percentile = 50:
    • Percentile = 50
    • upper boundary = 75
    • lower boundary = 25
  • Baseline Percentile = 95:
    • Percentile = 50
    • upper boundary = 97.5
    • lower boundary = 2.5
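
The two examples above follow a simple pattern: the shaded band is centered on the median, so each boundary sits half of the Baseline Percentile away from 50. The sketch below expresses that arithmetic in Python. The function name is hypothetical and the formula is inferred from the two listed examples rather than taken from Statseeker's implementation.

```python
def baseline_bounds(baseline_percentile):
    """Return (lower, upper) percentiles for a band centered on the median.

    Inferred from the documented examples: 50 -> (25, 75), 95 -> (2.5, 97.5).
    """
    if not 0 <= baseline_percentile <= 100:
        raise ValueError("Baseline Percentile must be between 0 and 100")
    half_width = baseline_percentile / 2
    return 50 - half_width, 50 + half_width

print(baseline_bounds(50))  # (25.0, 75.0)
print(baseline_bounds(95))  # (2.5, 97.5)
```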


Tabular Reports, Dashboards and Thresholds

Anywhere in the product other than graphical reports, there are separate field formats for Baseline Percentile, Upper, and Lower Boundaries, and all are set individually.

The Baseline Percentile format defines the historical data percentile referenced in the report/dashboard/threshold:

  • 95 - the 95th percentile of all historical data within the baseline history used
  • 50 - the 50th percentile, or median, of all historical data within the baseline history used
  • 10 - the 10th percentile of all historical data within the baseline history used
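
To make the percentile values above concrete, the following Python sketch computes a percentile over a small set of historical values. The sample data and the function are hypothetical, and the linear interpolation between neighboring values is an assumption: the method Statseeker uses internally is not described here and may differ.

```python
def percentile(values, p):
    """Percentile of values using linear interpolation between ranks.

    Assumption: the interpolation method is illustrative only.
    """
    if not values:
        raise ValueError("at least one historical value is required")
    if not 0 <= p <= 100:
        raise ValueError("percentile must be between 0 and 100")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100   # fractional index into the sorted data
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

# Hypothetical utilization history (%) containing one unusually high sample
history = [12, 15, 11, 40, 13, 14, 16, 12, 13, 15]

print(percentile(history, 50))  # 13.5, the median
print(percentile(history, 95))  # high end, pulled up by the outlier
print(percentile(history, 10))  # low end of typical values
```

Note how the single outlier (40) has no effect on the median but a large effect on the 95th percentile, which is one reason for excluding known anomalous periods from the baseline history.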

Baseline Formats

When using baselines in tabular reports, dashboards and thresholds, you are required to specify a Baseline Format. In dashboards and reports, this format is combined with an associated Baseline Percentile.

  • Baseline - baseline percentile values calculated from historical data
  • Average - the average of the specified baseline data percentile values
  • Comparison - the difference between the average of the specified baseline percentile values and the report data average
  • Percentage Comparison - percentage difference between the average of the specified baseline percentile values and the report data average
  • Lower Bound - the baseline lower confidence interval values; that interval is specified by the associated percentile value. For non-graph panels, the average of those values is returned.
  • Upper Bound - the baseline upper confidence interval values; that interval is specified by the associated percentile value. For non-graph panels, the average of those values is returned.
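
The Average, Comparison and Percentage Comparison formats listed above can be sketched as simple arithmetic over two lists of values. This is an illustration only: the function names and sample data are hypothetical, and both the sign convention (report average minus baseline average) and the use of the baseline average as the denominator are assumptions, not confirmed Statseeker behavior.

```python
from statistics import mean

def baseline_average(baseline_values):
    """Average format: mean of the baseline percentile values."""
    return mean(baseline_values)

def baseline_comparison(report_values, baseline_values):
    """Comparison format. Assumed sign: report average minus baseline average."""
    return mean(report_values) - mean(baseline_values)

def baseline_percentage_comparison(report_values, baseline_values):
    """Percentage Comparison format. Assumed denominator: the baseline average."""
    base = mean(baseline_values)
    if base == 0:
        raise ValueError("baseline average is zero, percentage is undefined")
    return 100 * (mean(report_values) - base) / base

# Hypothetical values for one interface over the same four intervals
baseline = [20, 22, 24, 22]   # baseline percentile values
report = [30, 34, 32, 36]     # observed report data

print(baseline_average(baseline))                        # 22
print(baseline_comparison(report, baseline))             # 11
print(baseline_percentage_comparison(report, baseline))  # 50.0
```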


Adding Baseline Data to Reports

Baselines can be referenced in reports explicitly, presenting the baseline data itself. Baselines are also used in conjunction with:

  • Data Forecasting - the specified baseline data is used to generate the forecast data, see Data Forecasting for details
  • Anomaly Detection - the specified baseline data is compared to the report data to calculate anomaly metrics, see Anomaly Detection for details

Graphical Report

Baseline data can be displayed on graphical reports as an overlay for any timeseries metric.

  • The Baseline History (default = last 180 days) specifies the period of data history used to generate the baseline
  • The Baseline Percentile (default = 95) determines the percentile range of historical data to display

To present baseline data:

  • Configure your report as needed (filters, content and presentation)
Note: the baseline data will be overlaid on top of your timeseries data. Presenting multiple timeseries metrics on a single graph may result in a graph which is difficult to interpret.
  • Edit the timeseries metric that will be baselined
  • Select Show Baseline
  • Specify the Baseline Percentile. We recommend the default value (95), as this will include all statistically 'typical' data within the baseline period and exclude any statistically anomalous data from that period
  • Specify the Baseline History. This is the period of historical data used to generate the baseline
  • Assign a color for the baseline data
  • Click Update

The baseline has been added and will be displayed on the next run of the report.

Tabular Reports

Both baseline data values and a comparison between those values and the average report data can be presented in tabular reports. For an example of baseline data in tabular reports, see the default Interfaces > Utilization Baselines report in the Console.

To add baseline data to tabular reports:

  • Assign the timeseries Attribute
  • Set Format = Baseline
  • Set the required Baseline Format (Average/Comparison)
  • Set the Baseline Percentile and History
  • Click Add to confirm the field configuration

Baseline History in Tabular Reports

The Baseline History (default value = last 180 days) is also used by Forecasting and Anomaly Metric/Strength in tabular reports. This period of historical data is used by:

  • Forecasting - to generate the forecast data
  • Anomaly Detection - to generate a comparison to the report data

By setting a report's baselining time range to a specific time period, you are able to exclude data that you don't want to include in your comparisons. When used with data forecasting, this ability to set the period of data used to generate the baseline means that you can forecast based on recent changes to your network (changes to demand, topology or infrastructure). The impact of those recent changes won't be diluted by the months' worth of data preceding those changes. When used with anomaly detection, you are able to exclude known anomalous data, preventing it from skewing what is considered typical behavior.

Adding Baseline Data to Dashboards

Baselines can be applied to timeseries fields used in the Fields, Filters and Sort By sections of your dashboard configurations. Baseline formats differ from most other dashboard formats in that they require additional options to be set, specifying both Baseline History and Baseline Percentile.

Note: for details on all aspects of dashboard use, configuration and management, see Statseeker Dashboards.

To add baseline data to your dashboards as either Field, Filter or Sort By content:

  • Set the field
  • Set the format to one of the baseline formats, see Dashboard Timeseries Data Formats for details
  • Use the Options button to specify Baseline History and Baseline Percentile

Dashboard Baseline Graphs

To achieve the same style of baseline graph as seen in custom reports we actually need to plot multiple baseline series:

  • Baseline - setting to 50th percentile will give us the baseline median
  • Baseline Upper Bound - this will give us the upper baseline data range
  • Baseline Lower Bound - the lower baseline data range

The default percentile values for Baseline, Baseline Upper Bound and Baseline Lower Bound will deliver the same data range output as a custom graph report percentile set to 95.

To achieve the same visual style as the custom graph report, a number of series level display overrides are required:



Dashboard Baseline Tables

The configuration of baseline data in dashboard table panels does not differ from custom tabular reports, see Baseline Tabular Reports for details.

Referencing Baselines in Threshold Configurations

Baseline metrics can also be utilized in Threshold configuration. The baseline percentile is set to 50 for the purpose of threshold configuration.

Baseline Formats:

  • Average: the average of the 50th percentile values throughout the baseline history range
  • Comparison: difference between the baseline 50th percentile and the observed value

To apply baseline data in threshold configuration:

  • Set the Attribute
  • Set the Format to Baseline and specify whether to use the baseline Average or Comparison as the threshold trigger metric
  • Specify Baseline History as needed
  • Assign the threshold values, triggers and filters as needed, see Threshold Configuration for details
