Configuration
Since 0.5.0, the config has been split into two different files.
System config is documented on the rest of this page, and has been moved
into bosun.toml
unless overridden on the command line. Changes to these
settings require a Bosun restart in order to take effect. The TOML
format is documented here.
The example config file is a good place to start when writing a configuration.
Warning
The file does not follow the tradition of commenting out defaults, which are instead mostly documented below.
The rule configuration file is documented here. It includes settings that do not require a Bosun restart to take effect, e.g. alerts, templates, and notifications.
Bosun records state in a Redis store, and much of this state is associated with particular host names. This works well when Bosun runs on a dedicated machine. However, when it runs within a container orchestration platform, where instance names can change very frequently, it can lead to low-quality recording of state and excessive disk and memory use (a new copy of this state is created every time the hostname changes). To overcome this, you can define a custom hostname so that Bosun continues to use the previously known state even when a container instance is recreated.
This setting will also be used during the construction of links in any template functions that need them.
Example:
Hostname = "bosun.example.com"
The URL scheme that Bosun uses to construct its links. The common use case is in any template functions that construct links. The default is “http”.
Example:
Scheme = "https"
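As a rough sketch, the two link-related settings can be set together. The exact shape of the generated links is an assumption here; the comment only illustrates the intent.

```toml
# Fixed identity for state and links, independent of the container's hostname.
Hostname = "bosun.example.com"
# Scheme used when building links (defaults to "http" if omitted).
Scheme = "https"
# Assumption: links in templates would then start with https://bosun.example.com
```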
HTTP IP and Port to listen on. The default if not specified is to
listen on :8070
. However, if HTTPSListen
is defined and HTTPListen
is not defined there will be no default, and Bosun will only listen on
HTTPS.
Example:
HTTPListen = ":8080"
HTTPS (SSL) IP and Port to listen on. This requires that TLSCertFile
and TLSKeyFile
are defined.
Path to the PEM encoded certificate Bosun should use for
HTTPSListen
. May contain intermediate certificates followed by the
leaf certificate to form a chain.
Example:
TLSCertFile = "/etc/ssl/certs/bosun.example.com.crt"
Path to the PEM encoded private key that corresponds to the TLSCertFile
.
Example:
TLSKeyFile = "/etc/ssl/certs/bosun.example.com.key"
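Taken together, an HTTPS-only setup might look like the sketch below. The port 8443 is chosen for illustration only; the certificate and key paths are the ones from the examples above.

```toml
# HTTPListen is deliberately omitted: with HTTPSListen defined and
# HTTPListen undefined, Bosun only listens on HTTPS.
HTTPSListen = ":8443"   # illustrative port, not a documented default
TLSCertFile = "/etc/ssl/certs/bosun.example.com.crt"
TLSKeyFile = "/etc/ssl/certs/bosun.example.com.key"
```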
Minimum interval that alert checks will run at on a schedule. The format
of the value is the same as Go’s duration
format. By default, alert
checks are run at every CheckFrequency
multiplied by the
DefaultRunEvery
value. This defaults to “5m”.
Example:
CheckFrequency = "1m"
By default, alert checks are run at every
CheckFrequency
multiplied by the
DefaultRunEvery
value. This can be overridden in an alert definition
with the runEvery
keyword. This defaults to
1.
For example, if CheckFrequency = "1m" and DefaultRunEvery = 5, alerts will run every 5 minutes by default. You could still have some alerts run as frequently as every “1m”, and others that run less often (any multiple of “1m”).
Example:
DefaultRunEvery = 5
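The arithmetic can be sketched as follows. The runEvery values in the comments are hypothetical per-alert overrides that would live in the rule file, not in this file.

```toml
CheckFrequency = "1m"   # scheduler interval
DefaultRunEvery = 5     # alerts without runEvery: 1m * 5 = every 5 minutes

# Hypothetical overrides in alert definitions (rule file):
#   runEvery = 1   -> 1m * 1  = every minute
#   runEvery = 10  -> 1m * 10 = every 10 minutes
```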
Selects the algorithm used to distribute alert checks in order to decrease system load spikes. There is no distribution by default: if several checks have the same period, they all run at the same points in time. This behavior is used if the option is not specified or is set to an empty string.
The only alternative option is simple. If specified, alert checks with the same period will be uniformly distributed on second marks.
Example: AlertCheckDistribution = "simple"
Path to the file containing definitions of alerts, macros, lookups, templates, notifications, and global variables which are documented here. If saving is enabled, this file can be written to by Bosun via the API or Save UI. Mandatory.
Example: RuleFilePath = "dev.sample.conf"
If set, this will allow Bosun to delete rendered templates from its data store. It will remove all rendered templates for alerts that have been closed for longer than this time (in days).
Example: MaxRenderedTemplateAge = 30 # retain old templates for only 30 days
Used to configure time zones that will be linked to in Bosun’s dashboard. It is an array of timeanddate.com zones (the page that gets linked to from Bosun’s UI). It has no impact on what time zone Bosun operates in. Bosun is expected to use UTC and does not support other time zones.
Example:
TimeAndDate = [ 202, 75, 179, 136 ]
Bosun’s UI can generate short links using Google’s goo.gl URL Shortener service. If you are hitting their API limits, you can get an API key and specify it here, and that key will be used.
Example:
ShortURLKey = "aKey"
Bosun’s dashboard applies grouping logic if there are many similar alerts (in name, or tag/key pairs). MinGroupSize sets the minimum number of alerts needed to create a group in the UI and defaults to 5.
If you have a lot of grouping, it often means you should refactor the alert to have a less granular scope. Therefore, it is recommended that this is used as a “protection mechanism” from flooding the dashboard with too many alerts.
Example: MinGroupSize = 5
Bosun will group all unknowns in a single check cycle (alerts on the same CheckFrequency
and RunEvery
) into a single email. This sets how many unknowns would be sent in a single check cycle before a group is created. The default value is 5.
This is done because unknowns are generally caused by the data “disappearing”. So if your TSDB Provider is broken or behind, it can generate a lot of unknowns. This alleviates flooding in the specific case of unknowns.
Example: UnknownThreshold = 5
If set to true
, Bosun will ping every value of the host tag that it has indexed and record that value to your TSDB. It currently only supports OpenTSDB-style data input, which means you must use either OpenTSDB or Influx with the OpenTSDB endpoint on Influx configured.
Example:
Ping = true
How long Bosun should wait before it stops pinging host tags it has seen. For example, with the default value of "24h", if Bosun has not indexed any data points for a given host value in the last 24 hours, it will stop attempting to ping that host until it again sees data points that have that tag.
Example:
PingDuration = "24h"
Controls how long autocomplete entries and items in the UI will continue to show up after being indexed by Bosun. The format of the value is the same as Go’s duration format and the default is 3 days. The goal is to keep old items from showing up in the UI. However, if you are using OpenTSDB and the graphing page, you can still query metrics that don’t autocomplete if you remember what they were (or look them up using OpenTSDB’s native UI autocomplete).
Example: SearchSince = "72h"
Enables saving via the web UI. It is disabled by default. When enabled, users will be able to save the rule configuration file via the UI and Bosun will then write to that file on the user’s behalf.
Example: EnableSave = true
Sets if reloading of the rule configuration should be enabled. If
EnableSave
is true
, then reloading gets enabled regardless of this
setting. Reloads can be triggered via /api/reload
in the REST
API as well.
Example:
EnableReload = true
When saving is enabled and a user issues a save, you have the option to run an executable or script by specifying this parameter. This allows you to do things like back up the file on writes or commit the file to a git repo.
This command is passed a filename, username, message, and vargs (vargs is currently not used). If the command exits with a non-zero exit code, the changes will be reverted (the file before the changes is copied back and Bosun doesn’t restart). When the configuration is saved via the user interface, any messages written to standard error will be shown to the user when there is a non-zero exit code.
Example:
CommandHookPath = "/Users/kbrandt/src/hook/hook"
Current code documentation says:
// GetInternetProxy sets a proxy for outgoing network requests from Bosun. Currently it
// only impacts requests made for shortlinks to https://goo.gl/
But not sure I trust that.
All your key value pairs must be defined before any sections are defined. Sections are used for things that have multiple values to configure them. In particular the various time series database providers.
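The ordering matters because, in TOML, any key that follows a section header belongs to that section. A minimal sketch of a correctly ordered file, reusing values from the examples on this page:

```toml
# Top-level key/value pairs come first.
RuleFilePath = "dev.sample.conf"
HTTPListen = ":8070"

# Sections come last. A top-level key placed below this header
# would be parsed as part of DBConf instead.
[DBConf]
RedisHost = "localhost:6389"
```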
Allows definition of variables that will be turned into global
variables available to definitions under
$sys.
. This is designed for when you have some secrets you don’t want
exposed in the definitions file. The values of these variables can
still be accessed from Bosun’s rule editor and expression UI. This
merely hides them so you don’t accidentally include them in
screenshots or when copying and pasting your config.
Example:
[RuleVars]
CHAT = "https://chat.example.com/1?key=mySecretKey"
foo = "baz"
The above could then be used in the rule configuration like:
notification chat {
post = ${sys.CHAT}
runOnActions = false
}
They can also be accessed in templates with the V()
function
Defines what internal storage Bosun should use. There are currently two choices, a built-in redis-like server called ledis or redis. Redis is recommended for production setups.
The default is to use ledis. If both Redis and ledis are defined, Redis will take precedence and the ledis configuration will be ignored. Because ledis is the default, it will be used whenever RedisHost is not specified, even if you have no DBConf configuration defined.
Warning
Upgrading the database to newer versions only works with redis. With ledis you will have to delete the database to use a new version that involves a migration (schema upgrade) to the db.
The Redis hostname and port.
Optional integer database to store bosun data. Defaults to 0.
Optional password to use when connecting to Redis.
Optional key defining the sending of client’s name bosun
to Redis. Defaults to true.
If you use Netflix/dynomite then RedisClientSetName must be set to false.
The list of Redis sentinels. The sentinel list will be used only if the RedisMasterName parameter is set as well.
The Redis master name within sentinel. If it is set, Bosun will use sentinel to receive information about the current Redis master.
Directory in which ledis will store data. Default: LedisDir = "ledis_data"
Ledis host and port. Default: LedisBindAddr = "127.0.0.1:9565"
.
Redis Configuration:
[DBConf]
RedisHost = "localhost:6389"
Ledis Configuration:
[DBConf]
LedisDir = "ledis_data"
LedisBindAddr = "127.0.0.1:9565"
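For a Sentinel-backed Redis setup, the settings described above might be combined as in the following sketch. Only RedisMasterName and RedisClientSetName are named in the text; the key names RedisSentinels, RedisDb, and RedisPassword, as well as the host names and the master name, are assumptions to verify against your Bosun version.

```toml
[DBConf]
# Assumed key name for the sentinel list; only used when RedisMasterName is set.
RedisSentinels = ["sentinel1.example.com:26379", "sentinel2.example.com:26379"]
RedisMasterName = "mymaster"    # hypothetical master name
RedisDb = 0                     # assumed key name; database 0 is the default
RedisPassword = "changeme"      # assumed key name; placeholder value
RedisClientSetName = true       # set to false when using Netflix/dynomite
```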
SMTP server configuration for email sending.
Address from which emails will be sent.
Outgoing SMTP server hostname or IP address.
SMTP username
SMTP password
[SMTPConf]
EmailFrom = "bosun@example.com"
Host = "mail.example.com"
Username = "username"
Password = "fe8h392wh"
AzureMonitorConf enables Azure Monitor specific functions in the expression language. Multiple clients may be defined, allowing you to query different subscriptions and tenants from a single Bosun instance.
Default Azure client to use when the Prefix key is absent or is present and set to “default”. When using multiple clients, the string default can be changed to whatever you want to use in expressions to access this particular client.
The Azure Subscription GUID for the client. See Getting your Azure Subscription GUID (new portal) for instructions on finding this with Azure’s portal.
The Azure Tenant GUID for the client. To get the tenant you can access it via “Azure Active Directory > Properties > Directory ID in the Azure portal” as per Azure’s How to get an Azure Active Directory tenant documentation.
The Azure Client GUID for the client. This will be the GUID listed as the “Application ID” when you create the app registration under Azure Active Directory.
The Azure generated secret for the client. This will be under Settings :: Keys for the application as referenced in ClientID above.
For expressions that need to make multiple http requests like azmulti()
, this sets the amount of concurrent http requests that will be made at a time. In other words, the number of request workers.
This is an optional parameter. If not set or set to 0, it will be the default value of 10 workers. A value of 1 means no concurrency since there will only be one worker.
If set to true
then HTTP requests to the Azure APIs will be logged to stdout. This is an optional parameter with a default of false
.
If set to true
then HTTP responses from the Azure APIs will be logged to stdout. This is an optional parameter with a default of false
.
[AzureMonitorConf]
# The default instance will be queried when the prefix key is absent or is present and set to "default". If only defining one client for Azure you would use this.
[AzureMonitorConf.default]
SubscriptionId = "52c5bef7-d925-4d0e-9bcd-969dbdbb1068"
TenantId = "60730c79-4f4c-4782-9eca-3325638b8f9c"
ClientId = "2a434dc4-48df-43b8-ad0e-020798bcb36c"
ClientSecret = "AzureMakes+Aweso//meSecrets="
Concurrency = 5
[AzureMonitorConf.foo]
SubscriptionId = "4b5922c6-0c5a-462f-876f-07072e842ade"
TenantId = "2e8abf15-1328-458b-8762-192139857055"
ClientId = "3a6d47ca-53eb-4a18-aa77-d3e2d764ba31"
ClientSecret = "AzureCreate+Grea/tSecrets="
Enables an OpenTSDB provider, and also enables OpenTSDB specific functions in the expression language. This also enables the Graph tab in Bosun’s UI as that is OpenTSDB specific. However, you can still graph other time series DBs in Bosun’s UI by using the Expression tab.
OpenTSDB hostname and port to connect to. Since 0.9.0 this can be specified with a scheme. e.g. https://tsdb-querier:4242
Either 2.1 or 2.2. Default: 2.1. If set to 2.2, certain features that correspond to new OpenTSDB features are enabled, e.g. filter support.
Causes requests to error if the OpenTSDB response is larger than this setting in bytes. This is useful to catch alerts that are pulling in more data than they need to (i.e. maybe downsampling can be used).
This does not cancel the query with OpenTSDB, but Bosun will stop processing the response.
[OpenTSDBConf]
Host = "https://ny-tsdb01:4242"
Version = 2.2
ResponseLimit = 25000000
Enables querying multiple Elastic clusters. The elastic expression functions become available when this is defined. The functions are designed more to be used for querying log formatted data and stats from those logs.
The functions that would allow you to use Elastic effectively as a time-series based backend do not currently exist.
Default cluster to query when PrefixKey is not passed to the elastic expression functions.
List of hosts that are members of the cluster. Elastic uses these hosts to discover all the other hosts in the cluster, so you don’t have to specify every host in the cluster. This article on Sniffing describes how this discovery functions.
Boolean. When set to true, periodic health checks and sniffing will be disabled. This is useful when you want to query from a single or fixed number of Elastic nodes.
Advanced Elastic client options e.g. basic authentication, http schema etc.
[ElasticConf]
[ElasticConf.default]
Hosts = ["http://ny-lselastic01.example.com:9200", "http://ny-lselastic02.example.com:9200"]
[ElasticConf.foo]
Hosts = ["http://ny-lselastic01.example.com:9200", "http://ny-lselastic02.example.com:9200"]
SimpleClient = true
[ElasticConf.bar]
Hosts = ["http://ny-lselastic01.example.com:9200", "http://ny-lselastic02.example.com:9200"]
[ElasticConf.bar.ClientOptions]
Enabled = true
BasicAuthUsername = "admin"
BasicAuthPassword = "testing"
Scheme = "https" # default: http
SnifferEnabled = false # default: true
SnifferTimeoutStartup = 10 # default: 5s
HealthcheckEnabled = false
Enables querying Graphite server and exposes its query functions to the expression language.
Graphite connection host and port, e.g. Host = "localhost:80"
.
Headers as key / value pairs (one per line) that will be sent with each Graphite request.
[GraphiteConf]
Host = "localhost:80"
[GraphiteConf.Headers]
X-Meow = "Mix"
Enables querying multiple Prometheus TSDBs via the Prometheus HTTP v1 endpoint. The Prometheus Query Expression Functions become available when this is defined.
Default cluster to query when PrefixKey is not passed to the prometheus query functions.
[PromConf]
[PromConf.default]
URL = "https://prometheus.kubea.example.com"
[PromConf.kubeb]
URL = "https://prometheus.kubeb.example.com"
Embeds the annotation service. This enables the ability to submit and
edit annotations via the UI or API. It also enables the annotation
related expression functions. Currently the only supported database for
annotate is Elastic. It can be the same cluster as the one defined in
ElasticConf
or a different one.
Warning
The format of annotation configuration may change before the final 0.6.0 release.
As for ElasticConf.
As for ElasticConf.
As for ElasticConf.
The Elastic index to store annotations in. If not set the default is “annotate”.
[AnnotateConf]
Hosts = ["http://ny-lselastic01.example.com:9200", "http://ny-lselastic02.example.com:9200"]
Index = "myAnnotate"
[AnnotateConf]
Hosts = ["http://ny-lselastic01.example.com:9200", "http://ny-lselastic02.example.com:9200"]
SimpleClient = true
Index = "myAnnotate"
[AnnotateConf]
Hosts = ["http://ny-lselastic01.example.com:9200", "http://ny-lselastic02.example.com:9200"]
Index = "myAnnotate"
[AnnotateConf.ClientOptions]
Enabled = true
BasicAuthUsername = "admin"
BasicAuthPassword = "testing"
Scheme = "https" # default: http
SnifferEnabled = false # default: true
SnifferTimeoutStartup = 10 # default: 5s
HealthcheckEnabled = false
Enables the Influx TSDBProvider and makes its query functions available via the API.
Full URL that Bosun should use to connect to Influx, e.g. URL = "https://myInfluxServer:1234"
Timeout for Influx queries, formatted as per the Go
duration format. e.g. Timeout = "5m"
Setting to true allows you to connect to an Influx server even if the HTTPS certificate cannot be validated.
Influx connection username.
Influx connection password.
User agent that Bosun should identify itself as when querying Influx.
[InfluxConf]
URL = "https://myInfluxServer:1234"
Timeout = "5m"
UnsafeSSL = true
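An authenticated connection might be configured as in the sketch below. The key names Username, Password, and UserAgent are assumed from the setting descriptions above, and all values are placeholders.

```toml
[InfluxConf]
URL = "https://myInfluxServer:1234"
Timeout = "30s"           # Go duration format
Username = "bosun"        # assumed key name, placeholder value
Password = "changeme"     # assumed key name, placeholder value
UserAgent = "bosun"       # assumed key name, placeholder value
```

UnsafeSSL is left out here so that certificate validation stays on.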
Enables querying CloudWatch metrics and exposes the query functions to the expression language. This functionality relies on Bosun having assumed an IAM role with the following capabilities:
ListMetrics
GetMetricData
You can supply credentials using any of the standard methods, such as passing an IAM role to the EC2 instance Bosun is running on, the AWS shared credentials file, or environment variables.
For complete details see the Specifying Credentials section of the AWS documentation.
Should the cloudwatch functionality be loaded.
If wildcards are used in a dimension string bosun must call the ListMetrics api to try and find matches. This parameter controls how many pages of results bosun will iterate through before giving up and throwing an error. 1 page corresponds to 500 metrics
When using wildcards, the expansion limit controls the maximum number of metrics that will be requested using the
getMetricData() api.
For example if you have a large infrastructure which uses spot instances and make a query with a dimension of InstanceId:*
it would match 1000s of metrics. This will both be slow and expensive as you will be billed for each series you request from
the cloudwatch API. The PagesLimit and ExpansionLimit settings act as safety valves to stop users inadvertently making very large requests.
The number of simultaneous queries to make to the cloudwatch api.
[CloudWatchConf]
Enabled = true
PagesLimit = 10
ExpansionLimit = 500
Concurrency = 2
Bosun authentication settings. If not specified, your instance will have no authentication, and will be open to anybody. When using Auth, TLS should be used so credentials are not sent in clear text.
A secret string used to encrypt cookies Bosun sets in client browsers to prevent token forgery. We recommend a longish random string here.
Supplying this will activate “token authentication”, where you can generate api tokens that external apps and services can use to interact with Bosun. Should be a long random string.
Disable authentication. Set to true to allow any user to access and change Bosun. The main reason to set this to true rather than omit AuthConf altogether is to enable token auth at the same time, so that tokens can be generated before authentication is fully activated.
LDAP authentication configuration.
LDAP Domain name.
Host and port of LDAP server
Set to true to skip certificate validation if you are running self-signed certs, for example.
Default permissions that will be applied to any user who can authenticate to LDAP.
Base search path for searching group and user memberships. Not needed if not specifying Group level permissions. Usually just DC=myOrg,DC=com
is sufficient.
Allows you to set permission levels per LDAP group. See example for usage.
Allows you to grant permissions to individual users. See example for usage.
Various parts of the config allow you to specify permissions. These fields accept a comma-separated list of roles or permissions. Available roles and permissions are defined in the Bosun source. Any of the description values can be used as a permission in the config. See the example for usage. If a user matches multiple Group or User permissions, they will have the aggregate of all permissions granted to those groups and users, as well as the defaults.
[AuthConf]
CookieSecret = "MAPpHDIjciqzTg708Ef0AXLeid0o9ghrwKReyj57RPUCk80QffmLvVVHqc4w+A=="
TokenSecret = "hEgeP7DJn0e2RmhOEQLaNItzNN0fm8fWyyX1F3PB3qUbD9859xFxdL1JPLknuQ=="
[AuthConf.LDAP]
Domain = "mycompany"
LdapAddr = "ldap.mycompany.com:3269"
DefaultPermission = "Reader"
RootSearchPath = "DC=ds,DC=mycompany,DC=com"
[[AuthConf.LDAP.Groups]]
Path = "CN=Sysadmins,OU=Security Groups,DC=mycompany,DC=com"
Role = "Admin"
[[AuthConf.LDAP.Groups]]
Path = "CN=Developers,OU=Security Groups,DC=mycompany,DC=com"
Role = "Writer"
[AuthConf.LDAP.Users]
jSmith = "Actions,Create Annotations,Silence"
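The token bootstrap case described earlier (authentication disabled, token auth enabled) might be sketched like this. The key name AuthDisabled is an assumption based on the description of the setting, and the secrets are placeholders to replace with long random strings.

```toml
[AuthConf]
# Assumed key name: leaves Bosun open to any user while tokens are generated.
AuthDisabled = true
CookieSecret = "replace-with-a-long-random-string"
# Supplying TokenSecret activates token authentication.
TokenSecret = "replace-with-another-long-random-string"
```

Once the tokens exist, the assumed AuthDisabled key would be removed or set to false and an LDAP section added as in the example above.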