Monitoring Block

Table of Contents

Features
Usage
Provisioning
Errors Dashboard
Performance Dashboard
Nextcloud Dashboard
Deluge Dashboard
Backups Dashboard and Alert
Requests Error Budget Alert
Options Reference

Defined in /modules/blocks/monitoring.nix.

This block sets up the monitoring stack for Self Host Blocks. It is composed of Grafana, Prometheus, Loki and Promtail.

Features

  • Declarative LDAP Configuration.

    • Needed LDAP groups are created automatically.

  • Declarative SSO Configuration.

    • When SSO is enabled, login with user and password is disabled.

    • Registration is enabled through SSO.

  • Access through subdomain using reverse proxy.

  • Access through HTTPS using reverse proxy.

Usage

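The following snippet assumes a few blocks have already been set up, including the SOPS secrets block (shb.sops) and the Authelia SSO block (shb.authelia) that it references: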

{
  shb.monitoring = {
    enable = true;
    subdomain = "grafana";
    # Assumes `domain` is bound in the enclosing scope.
    inherit domain;
    contactPoints = [ "me@example.com" ];
    adminPassword.result = config.shb.sops.secret."monitoring/admin_password".result;
    secretKey.result = config.shb.sops.secret."monitoring/secret_key".result;

    sso = {
      enable = true;
      authEndpoint = "https://${config.shb.authelia.subdomain}.${config.shb.authelia.domain}";

      sharedSecret.result = config.shb.sops.secret."monitoring/oidcSecret".result;
      sharedSecretForAuthelia.result = config.shb.sops.secret."monitoring/oidcAutheliaSecret".result;
    };
  };

  # Each secret is requested by the monitoring block and provided by the SOPS block.
  shb.sops.secret."monitoring/admin_password".request = config.shb.monitoring.adminPassword.request;
  shb.sops.secret."monitoring/secret_key".request = config.shb.monitoring.secretKey.request;
  shb.sops.secret."monitoring/oidcSecret".request = config.shb.monitoring.sso.sharedSecret.request;
  # Authelia reads the same value, stored under the monitoring/oidcSecret key.
  shb.sops.secret."monitoring/oidcAutheliaSecret" = {
    request = config.shb.monitoring.sso.sharedSecretForAuthelia.request;
    settings.key = "monitoring/oidcSecret";
  };
}

Secrets can be randomly generated with nix run nixpkgs#openssl -- rand -hex 64.

With that, Grafana, Prometheus, Loki and Promtail are set up! You can access Grafana at grafana.example.com with the user admin and the password from the sops key monitoring/admin_password.
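HTTPS access goes through the shb.monitoring.ssl option documented in the Options Reference. The following is a minimal sketch, not a definitive setup: both file paths are hypothetical, and cert-generator.service is simply the example value given in the reference. The reference describes ssl.paths as the contract output of the shb.certs.certs SSL block, so in practice the value likely comes from that block rather than being written by hand.

```nix
{
  shb.monitoring.ssl = {
    # Hypothetical locations; point these at wherever your certificate lives.
    paths.cert = "/var/lib/certs/grafana.example.com/cert.pem";
    paths.key = "/var/lib/certs/grafana.example.com/key.pem";
    # Oneshot unit that generates the certificate (example name from the reference).
    systemdService = "cert-generator.service";
  };
}
```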

The user and admin LDAP groups are created automatically.
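The group names and the SSO policy can be adjusted through the ldap and sso options listed in the Options Reference. In this minimal sketch, the group values are the documented defaults, written out only to make them visible, and two_factor is an illustrative choice rather than a recommendation from this manual.

```nix
{
  shb.monitoring = {
    ldap = {
      # Members of this group can log in to Grafana.
      userGroup = "monitoring_user";
      # Members of this group are Grafana admins.
      adminGroup = "monitoring_admin";
    };

    # Require a second factor instead of the default "one_factor".
    sso.authorization_policy = "two_factor";
  };
}
```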

SMTP

I recommend adding an SMTP server configuration so you receive alerts by email:

shb.monitoring.smtp = {
  from_address = "grafana@example.com";
  from_name = "Grafana";
  host = "smtp.mailgun.org";
  port = 587;
  username = "postmaster@mg.example.com";
  passwordFile = config.sops.secrets."monitoring/smtp".path;
};

sops.secrets."monitoring/smtp" = {
  sopsFile = ./secrets.yaml;
  mode = "0400";
  owner = "grafana";
  group = "grafana";
  restartUnits = [ "grafana.service" ];
};

Log Optimization

Since all logs are now stored in Loki, you can probably reduce the systemd journal retention time with:

# See https://www.freedesktop.org/software/systemd/man/journald.conf.html#SystemMaxUse=
services.journald.extraConfig = ''
SystemMaxUse=2G
SystemKeepFree=4G
SystemMaxFileSize=100M
MaxFileSec=day
'';

Other options are accessible through the upstream services modules. You might for example want to update the metrics retention time with:

services.prometheus.retentionTime = "60d";
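The block also exposes a few settings of its own, listed in the Options Reference. As a sketch, assuming the default ports clash with something else on the host (the port numbers below are arbitrary examples, not values from this manual):

```nix
{
  shb.monitoring = {
    grafanaPort = 4000;     # default: 3000
    prometheusPort = 4001;  # default: 3001
    lokiPort = 4002;        # default: 3002

    # Debug logging of the infrastructure serving Grafana.
    debugLog = true;
  };
}
```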

Provisioning

Self Host Blocks will automatically create the following resources:

  • For Grafana:

    • datasources

    • dashboards

    • contact points

    • notification policies

    • alerts

  • For Prometheus, the following exporters and related scrapers:

    • node

    • smartctl

    • nginx

  • For Loki, the following exporters and related scrapers:

    • systemd

Those resources are namespaced as appropriate under the Self Host Blocks namespace.
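Two options in the Options Reference relate to provisioning. The sketch below assumes you prefer to maintain your own dashboards. The reference only states that provisionDashboards governs the dashboards in the Self Host Blocks folder, so other provisioned resources are presumably unaffected.

```nix
{
  shb.monitoring = {
    # Skip the dashboards shipped by Self Host Blocks (default: true).
    provisionDashboards = false;

    # Grafana organization that holds the Self Host Blocks config (default: 1).
    orgId = 1;
  };
}
```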

Errors Dashboard

This dashboard is meant to be the first stop to understand why a service is misbehaving.

The yellow and red dashed vertical bars correspond to the Requests Error Budget Alert firing.

Performance Dashboard

This dashboard is meant to be the first stop to understand why a service is performing poorly.

[Screenshots: Performance Dashboard, top, middle and bottom parts]

Nextcloud Dashboard

See the Nextcloud service manual.

Deluge Dashboard

This dashboard is used to monitor a Deluge instance.

[Screenshots: Deluge Dashboard, top and bottom parts]

Backups Dashboard and Alert

This dashboard shows Restic and BorgBackup backup jobs, or any job with “backup” in the systemd service name.

Dashboard

Variables:

  • The “Job” variable lets you select one or more backup jobs. “All” is the default.

  • The “mountpoints” variable lets you select only the mountpoints relevant for backups. “All” is the default.

The most important graphs are the first three:

  • “Backup Jobs in the Past Week”: Shows stats on all backup jobs that ran in the past week. It is sorted by the “Failed” column in descending order, so jobs with failures appear at the top.

  • “Schedule”: Shows when a job will run. The unit is “Datetime from Now”, meaning it shows when a job ran or will run relative to the current time. An annotation shows up when the “Late Backups” alert fires or resolves.

  • “Backup jobs”: Shows when a backup job ran. Normally, jobs running for less than 15 seconds would not show up in the graph. The query is crafted so that they still appear, but they are drawn as 15 seconds long even if the job took less time to run.

[Screenshots: Backups Dashboard, top, middle and bottom parts]

Alerts

  • The “Late Backups” alert will fire if a backup job did not run at all in the last 24 hours or if all runs were failures in the last 24 hours. It will show up as annotations in the “Schedule” panel of the dashboard.
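The firing condition can be restated as a small Nix expression, purely for illustration. The real alert is a Grafana rule evaluated over metrics, not Nix code. The runs list and its success attribute are hypothetical stand-ins for a job's runs in the last 24 hours. Because builtins.all returns true for an empty list, a single predicate covers both "never ran" and "every run failed".

```nix
let
  # `runs` is a hypothetical list of a job's runs in the last 24 hours,
  # each an attribute set like { success = true; }.
  isLate = runs: builtins.all (run: !run.success) runs;
in
{
  neverRan = isLate [ ];                                               # true
  allFailed = isLate [ { success = false; } { success = false; } ];    # true
  oneSucceeded = isLate [ { success = false; } { success = true; } ];  # false
}
```

This can be evaluated with nix-instantiate --eval --strict on a file containing the expression.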

[Screenshots: Late Backups alert firing; backups alert showing up in the dashboard]

Requests Error Budget Alert

This alert will fire when the ratio between number of requests getting a 5XX response from a service and the total requests to that service exceeds 1%.
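As a worked example of the 1% threshold, the ratio can be computed in a few lines of Nix. This is only a sketch of the arithmetic: the errorRatio helper and the request counts are hypothetical, and the time window over which the real alert counts requests is not stated here.

```nix
let
  # Hypothetical helper: fraction of requests that got a 5XX response.
  # Multiplying by 1.0 forces floating-point division.
  errorRatio = errors: total:
    if total == 0 then 0.0 else errors * 1.0 / total;

  # Threshold from the alert description: 1%.
  threshold = 0.01;
in
{
  quiet = errorRatio 10 2000 > threshold;   # 0.5%  -> false
  firing = errorRatio 25 2000 > threshold;  # 1.25% -> true
}
```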

[Screenshots: Error Dashboard, top and bottom parts]

Options Reference

shb.monitoring.enable

Whether to enable selfhostblocks.monitoring.

Type: boolean

Default: false

Example: true

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.adminPassword

Initial admin password.

Type: submodule

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.adminPassword.request

Request part of the secret contract.

Options set by the requester module enforcing some properties the secret should have.

Type: submodule

Default: ""

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.adminPassword.request.group

Linux group owning the secret file.

Type: string

Default: "grafana"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.adminPassword.request.mode

Mode of the secret file.

Type: string

Default: "0400"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.adminPassword.request.owner

Linux user owning the secret file.

Type: string

Default: "grafana"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.adminPassword.request.restartUnits

Systemd units to restart after the secret is updated.

Type: list of string

Default:

[
  "grafana.service"
]

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.adminPassword.result

Result part of the secret contract.

Options set by the provider module that indicates where the secret can be found.

Type: submodule

Default:

{
  path = "/run/secrets/secret";
}

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.adminPassword.result.path

Path to the file containing the secret generated out of band.

This path will exist after deploying to a target host, it is not available through the nix store.

Type: absolute path

Default: "/run/secrets/secret"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.contactPoints

List of email addresses to send alerts to.

Type: list of string

Default: [ ]

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.debugLog

Set to true to enable debug logging of the infrastructure serving Grafana.

Type: boolean

Default: false

Example: true

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.domain

Domain under which Grafana will be served.

Type: string

Example: "mydomain.com"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.grafanaPort

Port where Grafana listens to HTTP requests.

Type: 16 bit unsigned integer; between 0 and 65535 (both inclusive)

Default: 3000

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.ldap

Setup LDAP integration.

Type: submodule

Default: { }

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.ldap.adminGroup

Group that users must belong to in order to be admins in Grafana.

Type: string

Default: "monitoring_admin"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.ldap.userGroup

Group that users must belong to in order to log in to Grafana.

Type: string

Default: "monitoring_user"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.lokiMajorVersion

Switching from version 2 to 3 requires manual intervention; see https://grafana.com/docs/loki/latest/setup/upgrade/#main--unreleased. This option lets users upgrade at their own pace.

Type: one of 2, 3

Default: 2

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.lokiPort

Port where Loki listens to HTTP requests.

Type: 16 bit unsigned integer; between 0 and 65535 (both inclusive)

Default: 3002

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.orgId

Org ID where all Self Host Blocks related config will be stored.

Type: signed integer

Default: 1

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.prometheusPort

Port where Prometheus listens to HTTP requests.

Type: 16 bit unsigned integer; between 0 and 65535 (both inclusive)

Default: 3001

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.provisionDashboards

Provision Self Host Blocks dashboards under ‘Self Host Blocks’ folder.

Type: boolean

Default: true

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.secretKey

Secret key used for signing.

Type: submodule

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.secretKey.request

Request part of the secret contract.

Options set by the requester module enforcing some properties the secret should have.

Type: submodule

Default: ""

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.secretKey.request.group

Linux group owning the secret file.

Type: string

Default: "grafana"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.secretKey.request.mode

Mode of the secret file.

Type: string

Default: "0400"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.secretKey.request.owner

Linux user owning the secret file.

Type: string

Default: "grafana"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.secretKey.request.restartUnits

Systemd units to restart after the secret is updated.

Type: list of string

Default:

[
  "grafana.service"
]

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.secretKey.result

Result part of the secret contract.

Options set by the provider module that indicates where the secret can be found.

Type: submodule

Default:

{
  path = "/run/secrets/secret";
}

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.secretKey.result.path

Path to the file containing the secret generated out of band.

This path will exist after deploying to a target host, it is not available through the nix store.

Type: absolute path

Default: "/run/secrets/secret"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.smtp

SMTP options.

Type: null or (submodule)

Default: null

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.smtp.from_address

SMTP address from which the emails originate.

Type: string

Example: "vaultwarden@mydomain.com"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.smtp.from_name

SMTP name from which the emails originate.

Type: string

Default: "Grafana"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.smtp.host

SMTP host to send the emails to.

Type: string

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.smtp.passwordFile

File containing the password to connect to the SMTP host.

Type: string

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.smtp.port

SMTP port to send the emails to.

Type: 16 bit unsigned integer; between 0 and 65535 (both inclusive)

Default: 25

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.smtp.username

Username to connect to the SMTP host.

Type: string

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.ssl

Path to SSL files.

Type: null or (open submodule of anything)

Default: null

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.ssl.paths

Paths where the files for the certificate will be located.

This option is the contract output of the shb.certs.certs SSL block.

Type: open submodule of anything

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.ssl.paths.cert

Path to the cert file.

Type: absolute path

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.ssl.paths.key

Path to the key file.

Type: absolute path

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.ssl.systemdService

Systemd oneshot service used to generate the certificate. Ends with the .service suffix.

Use this if downstream services must wait for the certificates to be generated before starting.

Type: string

Example: "cert-generator.service"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso

Setup SSO integration.

Type: submodule

Default: { }

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.enable

Whether to enable SSO integration.

Type: boolean

Default: false

Example: true

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.authEndpoint

Endpoint to the SSO provider.

Type: string

Default: null

Example: "https://authelia.example.com"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.authorization_policy

Require one factor (password) or two factor (device) authentication.

Type: one of “one_factor”, “two_factor”

Default: "one_factor"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.clientID

Client ID for the OIDC endpoint.

Type: string

Default: "grafana"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecret

OIDC shared secret for Grafana.

Type: submodule

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecret.request

Request part of the secret contract.

Options set by the requester module enforcing some properties the secret should have.

Type: submodule

Default: ""

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecret.request.group

Linux group owning the secret file.

Type: string

Default: "root"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecret.request.mode

Mode of the secret file.

Type: string

Default: "0400"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecret.request.owner

Linux user owning the secret file.

Type: string

Default: "grafana"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecret.request.restartUnits

Systemd units to restart after the secret is updated.

Type: list of string

Default:

[
  "grafana.service"
]

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecret.result

Result part of the secret contract.

Options set by the provider module that indicates where the secret can be found.

Type: submodule

Default:

{
  path = "/run/secrets/secret";
}

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecret.result.path

Path to the file containing the secret generated out of band.

This path will exist after deploying to a target host, it is not available through the nix store.

Type: absolute path

Default: "/run/secrets/secret"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecretForAuthelia

OIDC shared secret for Authelia. Must be the same as sharedSecret.

Type: submodule

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecretForAuthelia.request

Request part of the secret contract.

Options set by the requester module enforcing some properties the secret should have.

Type: submodule

Default: { mode = 0400; owner = config.shb.authelia.autheliaUser; group = root; restartUnits = [ ]; }

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecretForAuthelia.request.group

Linux group owning the secret file.

Type: string

Default: "root"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecretForAuthelia.request.mode

Mode of the secret file.

Type: string

Default: "0400"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecretForAuthelia.request.owner

Linux user owning the secret file.

Type: string

Default: config.shb.authelia.autheliaUser

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecretForAuthelia.request.restartUnits

Systemd units to restart after the secret is updated.

Type: list of string

Default: [ ]

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecretForAuthelia.result

Result part of the secret contract.

Options set by the provider module that indicates where the secret can be found.

Type: submodule

Default:

{
  path = "/run/secrets/secret";
}

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.sso.sharedSecretForAuthelia.result.path

Path to the file containing the secret generated out of band.

This path will exist after deploying to a target host, it is not available through the nix store.

Type: absolute path

Default: "/run/secrets/secret"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>
shb.monitoring.subdomain

Subdomain under which Grafana will be served.

Type: string

Example: "grafana"

Declared by:

<selfhostblocks/modules/blocks/monitoring.nix>