ivarch.com: Scheduled Command Wrapper: Online Manual

NAME

scw - scheduled command wrapper

SYNOPSIS

scw [--config FILE] [--set SETTING=VALUE]... ACTION [OPTION]...

scw run [--force] ITEM [-- COMMAND]

scw enable|disable ITEM

scw status ITEM

scw list [--enabled|--disabled]

scw update

scw -h|--help
scw -V|--version

DESCRIPTION

Wrap a scheduled command in a framework which adds concurrency locking, prerequisites, dependency checks, conflict avoidance, randomised startup delays, flexible logging, and monitoring metrics.

Each distinct scheduled command is referred to as an “item”, and can have its own configuration for each of these features:

Concurrency locking: Prevents an item from being run more than once at the same time, for example if it's scheduled to run every 2 minutes and occasionally takes longer than that to complete.
Prerequisites: Prevents an item from running if some prior condition is not met, for example checking whether this is the active node of a failover cluster, or whether some crucial underpinning service is running.
Dependency checks: Ensures that an item will run only if some other item has succeeded before it - with the possibility of waiting a short while for the dependency to finish rather than giving up straight away. Useful when linked items need to be scheduled separately at particular times but the later one can only run if the earlier one has succeeded. For example, a data load batch may have to run at some time in the early morning, and a subsequent data processing batch, which for business reasons has to run after a particular time of day, can only run if the early morning data load succeeded.
Conflict avoidance: Prevents an item from running if some other item is still running - with the possibility of waiting a short while for the conflicting item to finish rather than giving up straight away.
Randomised startup delays: Avoids resource overconsumption when the same item is scheduled to run on multiple systems.
Flexible logging: Item output can be sent to any combination of files, syslog, email, or HTTP, with or without timestamps. The standard output and standard error of items can be combined or separated, and a special status stream is also made available so that significant events (such as starting each step of a multi-step process) can be recorded separately.
Metrics: A collection of informational files is maintained for each item. Any monitoring agent can read these files and raise alerts based on their contents. An item list file (a JSON array of item descriptions) is automatically generated so that a system such as Zabbix can find and monitor all items without needing an operator to make adjustments when items are added or removed.

All of these features are optional.

In its simplest form, scw can be invoked from existing scheduler entries by using the “run” action, so for example this crontab(5) entry would change as follows:

# Original entry
0 * * * * /some/command --option ARGUMENT
 
# Replacement
0 * * * * scw run mycommand -- /some/command --option ARGUMENT

In this example, the scheduled command will be known to scw as the “mycommand” item, and all of the logs and metrics produced will use that name.

To make use of the full range of features, place scheduled commands or definition files into the item definition directory, and call “scw update” to generate the system crontab and the item list file.

ACTIONS

run ITEM: Start running the item named ITEM. The item is first checked to see whether it has been disabled (unless the “--force” option was passed, in which case it is run as if it were enabled); any prerequisite checks, dependency and conflict checks, startup delays, and concurrency locks are then applied. If there is no command defined for ITEM, all arguments after “--” will be passed to “sh -c”. If standard input, output, and error are all connected to a terminal, any configured randomised startup delay will be skipped, and output will be written to the terminal as well as to the usual logging destinations.
enable ITEM: Enable the item named ITEM so that it runs as scheduled.
disable ITEM: Disable the item named ITEM so that it will not run as scheduled, unless run with the “--force” option.
status ITEM: Show the current status of the item named ITEM. This only operates on items defined in the item definition directory.
list: List all defined items, or items that are defined and enabled (with the “--enabled” option), or items that are defined and disabled (with “--disabled”).
update: Update the system crontab and the item list file from the items defined in the item definition directories (see the FILES section).
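
The terminal test described for the “run” action can be reproduced in a wrapper script, for example to predict whether a manual run will skip its random delay. This is only a sketch of the condition as documented above; the function name is_interactive is hypothetical and is not part of scw.

```shell
# Succeeds only if standard input, output, and error are all terminals,
# which is the documented condition for "scw run" to skip the random
# startup delay and echo output to the terminal.
is_interactive() {
    [ -t 0 ] && [ -t 1 ] && [ -t 2 ]
}
```

Under cron none of the three streams is a terminal, so the function fails there and succeeds in an ordinary interactive shell.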

OPTIONS

With no action:

-h, --help: Print a usage message on standard output and exit successfully.
-V, --version: Print version information on standard output and exit successfully.

With any action:

-c, --config FILE: Read configuration from FILE instead of the system-wide default location.
-s, --set SETTING=VALUE: Set the item or configuration setting SETTING to VALUE (see the CONFIGURATION section).

With the “run” action:

-f, --force: Run the item regardless of whether it has been marked as disabled.

With the “list” action:

-e, --enabled: Only list items which are currently enabled.
-d, --disabled: Only list items which are currently disabled.

ITEM DEFINITIONS

The settings for an item are defined in a file named ITEM.cf under the user's item definition directory, by default /etc/scw/items/USER/ (see the FILES section).

When an item is defined in this way, its Command setting should be provided, to state what to run.

Alternatively, shell or Perl scripts can be placed directly into the directory, named ITEM.sh or ITEM.pl, so long as their first comment block at the top of the file contains the item settings prefixed with “scw”, like this:

#!/bin/sh
#
# Perform action ABC.
#
# scw Description = Do ABC
# scw Schedule = 0 2 * * Mon
# scw MaxRunTime = 30 minutes
# scw SuccessInterval = 1 day 30 minutes
#

This mechanism allows other packages to drop their own scheduled commands directly into this framework, and so long as they run “scw update” after installation, the items will be scheduled and configured correctly.

If an item has a script as well as a “.cf” file, the “.cf” settings are applied first, and the Command is implied.

Defining items in this way allows software developers to include scheduling information directly in their build artefacts, so that packaging and deployment teams don't need to worry about a developer-generated crontab running something with the wrong credentials (assuming that the package has been configured to run the software under its own user account).
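
As an illustration of the embedded-settings convention, the following sketch prints the settings found in a script's leading comment block. It assumes that the block ends at the first line which does not start with “#”, which may not match scw's own parser exactly; the function name scw_embedded_settings is hypothetical.

```shell
# Print the "SETTING = VALUE" lines embedded in the first comment block
# of the script named by $1.
scw_embedded_settings() {
    awk '
        # Assumption: the first non-comment line ends the comment block.
        !/^#/ { exit }
        # Keep only comment lines carrying the "scw" prefix, minus the prefix.
        /^#[ \t]*scw[ \t]+/ {
            sub(/^#[ \t]*scw[ \t]+/, "")
            print
        }
    ' "$1"
}
```

Running it against a packaged script before calling “scw update” is a quick way to confirm that the settings are where scw is expected to look for them.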

CONFIGURATION

Item definitions, and configuration files, share the same syntax: SETTING=VALUE pairs, one per line. Blank lines are ignored, as are comments (denoted by “#”). Leading and trailing whitespace is ignored.

Unless otherwise stated, each setting takes only one value, so setting it again overrides whatever it was set to earlier. To remove a setting, set it to an empty value.
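
The syntax rules above can be approximated with a short sed filter, which may be handy when inspecting configuration files from other scripts. This is a sketch under stated assumptions: it treats only whole-line comments as comments, and the function name parse_settings is hypothetical.

```shell
# Print the settings in the file named by $1 as normalised
# SETTING=VALUE lines.
parse_settings() {
    sed -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' \
        -e '/^#/d' \
        -e 's/[[:space:]]*=[[:space:]]*/=/' "$1"
}
```

The filter trims leading and trailing whitespace, drops blank lines and comment lines, and removes whitespace around the first “=”. A line such as “OutputMap =” comes out as “OutputMap=”, the empty value which removes the setting.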

Placeholders

The following placeholders can be used in values:

{ITEM}

The name of this item.

{USER}

The username of the account currently running scw, normalised to lower case.
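
To show how the placeholders combine, here is a minimal sketch of the substitution in shell. It assumes that the username contains no characters which are special to sed; the function name expand_placeholders is hypothetical, and scw performs the real expansion internally.

```shell
# expand_placeholders VALUE USER ITEM
# Replace {USER} (lower-cased, as documented) and {ITEM} in VALUE.
expand_placeholders() {
    value=$1
    user=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    item=$3
    printf '%s\n' "$value" | sed -e "s|{USER}|$user|g" -e "s|{ITEM}|$item|g"
}
```

For example, expanding “/var/spool/scw/{USER}/{ITEM}” for the user “AppUser” and the item “backup” gives /var/spool/scw/appuser/backup.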

Configuration settings

The following settings are available in configuration files:

ItemsDir: The directory in which to find item definitions and item scripts. The default value is usually /etc/scw/items/{USER}.
MetricsDir: The directory in which to place metrics files for an item. The directory must already exist and be writable by the current user, or if it doesn't exist, its parent must be writable by the current user so that scw can create it. The default value is usually /var/spool/scw/{USER}/{ITEM}.
CheckLockFile: The lock file to use when performing dependency or conflict checks. The default value is usually /var/spool/scw/{USER}/.lock.
UserConfigFile: The per-user configuration file. When scw runs, it first loads the global configuration file, and then this one. The default value is usually /etc/scw/settings/{USER}.cf. This setting can only be changed in the global configuration file.
ItemListFile: The file to which “scw update” will write a JSON array describing all items, which can be used by a monitoring system to discover what to monitor. The default value is usually /var/spool/scw/items.json. This setting can only be changed in the global configuration file.
CrontabFile: The file to which “scw update” will write a crontab. The default value is usually /etc/cron.d/scw. This setting can only be changed in the global configuration file.
UpdateLockFile: The lock file to use to ensure that only one instance of “scw update” is running at a time. The default value is usually /var/spool/scw/.update-lock. This setting can only be changed in the global configuration file.

Configuration files may also contain any of the other settings listed below for items, to set defaults which items can override.

Item settings

The following settings are available for items.

Description: A short one-line description of what this item does. This is recorded in the item list file by “scw update”.
Command: The command to run. This is passed to “sh -c”.
Schedule: When to run this item. This takes time and date fields in the same format as crontab(5). An item can have up to 16 Schedule values. This is used by “scw update” when generating the system crontab.
RandomDelay: Each time this item starts, there will be a random delay of up to this many seconds before continuing with checks and running the command. The time period can be specified in seconds, or with multiple numbers suffixed with “w” (weeks), “d” (days), “h” (hours), “m” (minutes), or “s” (seconds). Either the whole word can be given, or just the first letter. Spaces are optional. For example, “1d5h7m6s”, “1 day 5 hours 7 minutes 6 seconds”, and “104826” are all equivalent.
MaxRunTime: The maximum number of seconds to allow the command to run, after which it will be forcibly terminated. If this is not set, there is no limit. The same time period formatting rules as RandomDelay apply here.
Prerequisite: A command to run before attempting to run the item's Command. If the Prerequisite command exits with a non-zero status, the item is treated as if it was not scheduled to run at all, and so its command is not run. This can be used to, for instance, check that this server is the active node of a failover cluster, so that scheduled commands only run on the active node. The output of the Prerequisite command is always discarded. Note: The Prerequisite command may be run multiple times for an item if any delays are involved, since it is invoked prior to any delay, and then again after any delay just before the item's Command is to be run.
SuccessInterval: The number of seconds permitted between successful command runs before an alert should be raised. This is not used directly by scw, but is made available as a metrics file for your monitoring system to read - see the Metrics subsection under FILES. The same time period formatting rules as RandomDelay apply here.
ConcurrencyWait: If the item is already running, it will not start a second instance. Instead of abandoning the attempt immediately, it will wait up to this many seconds for the previous run to complete first. If the previous run finishes by then, the new run will proceed as normal. If ConcurrencyWait is not set, there will be no waiting. The same time period formatting rules as RandomDelay apply here.
SilentConcurrency: If the item was already running, and did not finish before the ConcurrencyWait timeout expired, then a second instance won't be started. By default, when this happens, the “overran” metrics file is created and no other metrics are affected. If the SilentConcurrency setting is “no”, “off”, “false”, or “0”, then this situation will instead be treated as if the command had run and failed.
DependsOn: The name of another item which must have successfully run since the previous run of this item. If the dependency is not met, this item will not run. An item can have up to 16 DependsOn values.
DependencyWait: If not all dependencies are met, keep waiting for them to be met for up to this many seconds. If, after waiting, the dependencies have been met, start the item as normal. The same time period formatting rules as RandomDelay apply here.
SilentDependency: By default, if an item's dependencies are not met, the item is treated as if it ran its command and it failed. If the SilentDependency setting is “yes”, “on”, “true”, or “1”, then this situation will instead be treated as if the item was not scheduled to be run at all, and no metrics will be updated.
ConflictsWith: The name of another item which must not be running at the same time as this item. If it is, this item will not run. An item can have up to 16 ConflictsWith values.
ConflictWait: If a conflicting item is running, keep waiting for it to finish for up to this many seconds. If, after waiting, the conflicts have been resolved, start the item as normal. The same time period formatting rules as RandomDelay apply here.
SilentConflict: By default, if an item can't start because of a conflict, it is treated as if it ran its command and it failed. If the SilentConflict setting is “yes”, “on”, “true”, or “1”, then this situation will instead be treated as if the item was not scheduled to be run at all, and no metrics will be updated.
StatusMode: A command can provide additional information about its progress through a status stream (see the STATUS REPORTING section). If StatusMode is set to “fd”, then status information is read from the command's file descriptor 3. If it is set to “stdout” or “stderr”, status information is derived from the command's standard output or standard error respectively, using the StatusTag.
StatusTag: When StatusMode is not “fd”, any command output lines in the appropriate stream which start with the StatusTag will have that tag removed and the remainder will be used as the status information. See the STATUS REPORTING section.
TimestampUTC: By default, timestamps are expressed in the system's local time zone. If the TimestampUTC setting is “yes”, “on”, “true”, or “1”, then timestamps are expressed in UTC. Note that this only affects timestamps in the output - the Schedule always refers to the system's local time zone, as cron(8) does.
OutputMap: Map the command's output to a destination. See below for more details. An item can have up to 16 OutputMap values.

Output mapping

The OutputMap settings take values of the form “STREAM FORMAT DESTINATION”, where STREAM selects one or more command output streams, FORMAT selects how the output should be formatted, and DESTINATION selects where to send it.

For example, an item might have this output map configuration, or it could even be in the global configuration file as the default for this server:

# Write stdout, stderr, and status, with timestamps, to a file
OutputMap = OES stamped /var/log/scw/{USER}/{ITEM}.log
 
# Email stdout and stderr, without timestamps, to root@localhost
OutputMap = OE raw root@localhost
 
# On failure, email stderr, without timestamps, to admin@company.com
OutputMap = !E raw admin@company.com
 
# Write status messages to syslog as facility "user", level "notice".
OutputMap = S raw user.notice
 
# Send status messages as JSON data via HTTPS POST
OutputMap = S json https://status.company.com/receiver

The STREAM is any combination of the following:

O

Standard output.

E

Standard error.

S

Status messages.

!

Spool until the command completes, and then only send to the destination if the exit status was non-zero, indicating failure.

The FORMAT is one of the following:

raw

Lines of text exactly as output by the command.

stamped

Lines prefixed with a timestamp and, if multiple streams were selected, an indicator of which stream they came from.

json

A JSON object containing integers named epoch and pid, and strings named hostname, user, item, stream, and message; the stream will be one of “stdout”, “stderr”, or “status”.

form

An HTTP form post (key=value pairs) containing the same fields as the json format.

The DESTINATION is a filename, a list of email addresses separated by semicolons, a syslog priority in the form facility.level, or a URL.
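
The three-part structure of an OutputMap value can be illustrated with a small decoder. This is a sketch which only splits the value and names the selected streams; it does not validate the FORMAT or classify the DESTINATION, and the function name describe_output_map is hypothetical.

```shell
# describe_output_map "STREAM FORMAT DESTINATION"
describe_output_map() {
    # Split on whitespace; the destination receives the rest of the line.
    read -r stream format dest <<EOF
$1
EOF
    selected=""
    case $stream in *O*) selected="$selected stdout" ;; esac
    case $stream in *E*) selected="$selected stderr" ;; esac
    case $stream in *S*) selected="$selected status" ;; esac
    selected=${selected# }
    # "!" means spool the output and deliver it only on failure.
    when="always"
    case $stream in *!*) when="on-failure" ;; esac
    echo "streams:$selected format:$format when:$when destination:$dest"
}
```

Given “!E raw admin@company.com”, it prints “streams:stderr format:raw when:on-failure destination:admin@company.com”.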

STATUS REPORTING

When an item's command runs through several steps, it can pass details of which step it's up to, and whether the previous step failed, to scw using the status reporting mechanism.

The first word of the status report should be one of “notice”, “ok”, “warning”, or “error”.

When running an item's command, scw inserts a special “begin” status report at the start, and an “end” status report at the end.

Depending on the StatusMode and StatusTag settings, status reporting might look like this:

#!/bin/sh
#
# scw StatusMode = fd
 
printf "%s %s\n" "notice" "Starting step 1" >&3
# do some work here, setting $succeeded to "true" or "false"
if $succeeded; then
    printf "%s %s\n" "ok" "Step 1 complete" >&3
else
    printf "%s %s\n" "error" "Step 1 failed" >&3
    exit 1
fi
 
printf "%s %s\n" "notice" "Starting step 2" >&3
# do some more work here
# ...

Adding status information like this makes analysis easier, for example discovering how the time taken for each step of a multi-step command fluctuates, or clearly highlighting where a command failed.

Using a StatusMode of “stdout” or “stderr” means prefixing the status message with the value of the StatusTag, like this:

#!/bin/sh
#
# scw StatusMode = stdout
# scw StatusTag = STATUS:
 
printf "STATUS: %s %s\n" "notice" "Starting step 1"
# ... and so on.

Writing status information this way may be easier than with the “fd” method in some circumstances. It has the side effect that the status messages will also be recorded in standard output or standard error logs.
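
When a script reports status in many places, a small helper keeps the messages consistent. The sketch below assumes StatusMode = stdout and StatusTag = STATUS: as in the example above; the function name report_status is hypothetical.

```shell
# report_status LEVEL MESSAGE...
# Write one tagged status line to standard output.
report_status() {
    level=$1
    shift
    # The first word of a status report must be one of these four.
    case $level in
        notice | ok | warning | error) ;;
        *) echo "report_status: unknown level: $level" >&2; return 1 ;;
    esac
    printf 'STATUS: %s %s\n' "$level" "$*"
}
```

A script would then call, for instance, report_status notice "Starting step 1" and report_status error "Step 1 failed".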

EXIT STATUS

The following exit status values apply to all actions:

0: Success: the action completed without error.
4: An unknown option, action, or setting was passed on the command line, or too many or too few arguments were provided for the chosen action. No action was taken.
5: The configuration file could not be read, or contains unrecoverable errors. No action was taken.
6: Some other error occurred that was not covered by any of the above. Action may have been partially completed.

Exit status values for “run”

The “run” action can exit with one of the following:

0: Success: the action completed without error.
1: Item failed: the item's command was run, and it exited non-zero.
2: Item timed out: the item's command was run, but it reached its configured maximum run time, and was forcibly terminated.
8: Item has no command: the specified item has no entry in the item definition directory and there was no command in the remaining command line arguments to scw, so no command has been run.
9: Item not enabled: the item is currently disabled, and the “--force” option was not provided, so the command has not been run.
10: Item prerequisites not met: the prerequisite check has failed, so the command has not been run.
11: Item dependencies not met: the items on which this item depends have not all run successfully, so the command has not been run.
12: Item conflict: one of the items which this item conflicts with is currently running, and all options for startup delays have been exhausted, so the command has not been run.
13: Item already running: the item is already running, and all of its configured options for startup delays have been exhausted, so the command has not been run.

Note that any exit status lower than 3 indicates that the item's command was definitely started; only an exit status of 0 indicates that it succeeded.
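
A calling script can group these values according to whether the command ran at all. The following is a sketch based only on the values listed above; the function name scw_run_outcome and the category names it prints are hypothetical.

```shell
# scw_run_outcome STATUS
# Classify the exit status of "scw run".
scw_run_outcome() {
    case $1 in
        0) echo "succeeded" ;;
        1) echo "failed" ;;
        2) echo "timed-out" ;;
        # The command was never started: no command, disabled, or a
        # prerequisite, dependency, conflict, or concurrency check failed.
        8 | 9 | 10 | 11 | 12 | 13) echo "not-started" ;;
        # Anything else, including 4, 5, and 6, is an error in scw itself.
        *) echo "scw-error" ;;
    esac
}
```

Typical usage is to run “scw run mycommand” and then pass “$?” to the function.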

Exit status values for “status”

The “status” action exits with the sum of the following values:

16: Added if the item does not exist (meaning that it has no entry in the item definition directory).
32: Added if the item is disabled.
64: Added if the item is currently running.

For example, an exit status of 0 indicates that the item exists, is enabled, and is not currently running. An exit status of 96 indicates an item which is disabled, but currently running.
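
Because the value is a sum of distinct powers of two, each condition can be recovered with a bitwise test. The sketch below assumes that the general error statuses 4, 5, and 6 can also be returned by “status”, since they are documented as applying to all actions; the function name scw_status_flags and the words it prints are hypothetical.

```shell
# scw_status_flags STATUS
# Decode the exit status of "scw status ITEM" into three words.
scw_status_flags() {
    code=$1
    # Non-zero values below 16 are general errors, not flag sums.
    if [ "$code" -ne 0 ] && [ "$code" -lt 16 ]; then
        echo "error"
        return 0
    fi
    if [ $((code & 16)) -ne 0 ]; then exists="missing"; else exists="defined"; fi
    if [ $((code & 32)) -ne 0 ]; then state="disabled"; else state="enabled"; fi
    if [ $((code & 64)) -ne 0 ]; then activity="running"; else activity="idle"; fi
    echo "$exists $state $activity"
}
```

An exit status of 96 decodes to “defined disabled running”, matching the example above.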

FILES

File locations may be adjusted by the installation process, so for example paths listed here under /etc may be under /usr/local/etc on your system. Locations may also be overridden by configuration settings.

/etc/scw/default.cf: Global default settings.
/etc/scw/settings/USER.cf: Settings to apply when running as user USER.
/etc/scw/items/USER/*.cf: Item definitions for user USER.
/etc/scw/items/USER/*.sh
/etc/scw/items/USER/*.pl: Item scripts for user USER, with their definitions embedded in a comment block at the top of the script (see the ITEM DEFINITIONS section).
/var/log/scw/USER/*.log: The default location for log files generated by items owned by USER.
/var/spool/scw/USER/ITEM/: Metrics files for the item ITEM owned by the user USER. See below for more details.
/var/spool/scw/USER/.lock: An empty file used for locking while checking for dependencies and conflicts.
/var/spool/scw/items.json: A JSON array describing all items, suitable for using as a Zabbix low-level discovery file. Updated by “scw update”.
/etc/cron.d/scw: The default crontab written by “scw update”.
/var/spool/scw/.update-lock: An empty file used for locking while running “scw update”.

Metrics

The metrics directory for an item can contain these files:

disabled: An empty file whose presence indicates that the item is disabled, and whose last-modification time indicates when it was disabled.
success-interval: The number of seconds permitted between successful command runs before an alert should be raised, followed by a newline. Monitoring systems should be instructed to raise an alert if the succeeded file's last-modification time is more than success-interval seconds ago and the prerequisites-met file exists.
prerequisites-met: An empty file which is created if the item's prerequisites are met, and deleted if they are not. Its last-modification time indicates when the prerequisites were last successfully checked.
started: An empty file whose last-modification time indicates when the item last started. It is not updated until the item's command actually starts running (so, after any startup delays).
ended: An empty file whose last-modification time indicates when the item last ended after running the item's command, regardless of whether the command succeeded. Its last-modification time is not updated unless the command actually ran.
succeeded: An empty file whose last-modification time indicates when the item's command last ran and ended with a zero exit status. It is not deleted on failure.
failed: An empty file which is created when the item's command runs and ends with a non-zero exit status. Its last-modification time indicates when the command first failed - it is not updated when subsequent runs fail. This file is deleted as soon as the command runs and exits with a zero exit status.
overran: An empty file which is created when the item could not run because it was already running and all startup delay options were exhausted. Its last-modification time indicates when this first happened. It is deleted the next time the item is able to start.
run-time: The number of seconds the item's command most recently took to run, followed by a newline. It is updated each time the item's command finishes a run, not counting any startup delays, and regardless of the command's exit status.
.lock: An empty file which the item will lock while running.
pid: While the item is running, this file contains the item's process ID, followed by a newline. The file is deleted on exit.
last-status: The status message most recently reported by the item (see the STATUS REPORTING section).
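
The alerting rule described for success-interval could be implemented by a monitoring agent along these lines. This is a sketch, not part of scw: it assumes GNU stat for reading modification times, it treats an item which has never succeeded as overdue, and the function name scw_item_overdue is hypothetical.

```shell
# scw_item_overdue METRICS_DIR
# Succeeds (meaning "raise an alert") if the item is overdue.
scw_item_overdue() {
    dir=$1
    # No alert unless the prerequisites were met at the last check.
    [ -e "$dir/prerequisites-met" ] || return 1
    # No alert if no SuccessInterval has been recorded for the item.
    [ -r "$dir/success-interval" ] || return 1
    read -r interval < "$dir/success-interval" || return 1
    # Assumption: an item which has never succeeded counts as overdue.
    [ -e "$dir/succeeded" ] || return 0
    now=$(date +%s)
    last=$(stat -c %Y "$dir/succeeded") || return 1   # GNU stat assumed
    [ $((now - last)) -gt "$interval" ]
}
```

An agent would call it with an item's metrics directory, for example /var/spool/scw/appuser/backup, and alert when it succeeds.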

NOTES

Time periods such as MaxRunTime can be written in seconds, or as any combination of weeks, days, hours, minutes, and seconds, each number suffixed with the unit. Spaces are allowed between each component of the time period, but no other words or punctuation. These are all equivalent:

MaxRunTime = 2 weeks 3 days 10 hours 5 minutes 2 seconds
MaxRunTime = 2w 3d 10h 5m 2s
MaxRunTime = 2w3d10h5m2s
MaxRunTime = 17 days 605 minutes 2 seconds
MaxRunTime = 1505102 seconds
MaxRunTime = 1505102
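
The conversion to seconds can be checked with a short script when writing monitoring thresholds by hand. The sketch below looks only at the first letter of each unit and treats a bare number as seconds; scw's own parser may be stricter, and the function name period_to_seconds is hypothetical.

```shell
# period_to_seconds "2w 3d 10h 5m 2s"
# Print the period in seconds, or fail on unrecognised input.
period_to_seconds() {
    awk -v p="$1" 'BEGIN {
        mult["w"] = 604800; mult["d"] = 86400; mult["h"] = 3600
        mult["m"] = 60; mult["s"] = 1
        s = tolower(p)
        total = 0
        # Consume one "NUMBER [unit]" component per iteration.
        while (match(s, /^[ \t]*[0-9]+[ \t]*[a-z]*/)) {
            tok = substr(s, RSTART, RLENGTH)
            s = substr(s, RSTART + RLENGTH)
            gsub(/[ \t]/, "", tok)
            num = tok
            sub(/[a-z]+$/, "", num)
            unit = substr(tok, length(num) + 1, 1)
            if (unit == "") unit = "s"       # a bare number is seconds
            if (!(unit in mult)) exit 1      # unknown unit letter
            total += num * mult[unit]
        }
        if (s !~ /^[ \t]*$/) exit 1          # leftover text is an error
        printf "%d\n", total
    }'
}
```

Each of the equivalent MaxRunTime values listed above yields 1505102.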

The name of an item can only contain letters, numbers, underscores, and hyphens.
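
A packaging script can check a proposed item name against this rule before installing a definition file. The sketch below assumes that “letters” means ASCII letters and that an empty name is invalid; the function name valid_item_name is hypothetical.

```shell
# valid_item_name NAME
# Succeeds if NAME contains only letters, numbers, underscores, and hyphens.
valid_item_name() {
    case $1 in
        '' | *[!A-Za-z0-9_-]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

For instance, “nightly-backup_2” passes, while “nightly backup” and “backup.sh” do not.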

Item settings files and scripts in ItemsDir, the per-user configuration file named by UserConfigFile, and the global configuration file must all be normal files, not symbolic links.

Each setting which takes multiple values - Schedule, DependsOn, ConflictsWith, OutputMap - is limited to 16 values in total after applying rules from all relevant sources. For example, if the global configuration defines 3 OutputMap values, an item can only add 13 more unless it first clears the list by assigning an empty value to OutputMap.

EXAMPLES

The global configuration file /etc/scw/default.cf could look like this:

# Configuration settings
ItemsDir = /etc/scw/items/{USER}
MetricsDir = /var/spool/scw/{USER}/{ITEM}
CheckLockFile = /var/spool/scw/{USER}/.lock
UserConfigFile = /etc/scw/settings/{USER}.cf
ItemListFile = /var/spool/scw/items.json
CrontabFile = /etc/cron.d/scw
UpdateLockFile = /var/spool/scw/.update-lock
 
# Item defaults
SilentConcurrency = yes
SilentDependency = no
SilentConflict = no
StatusMode = fd
TimestampUTC = no
 
# Write stdout, stderr, and status, with timestamps, to a file
OutputMap = OES stamped /var/log/scw/{USER}/{ITEM}.log
  
# On failure, email stderr, without timestamps, to root
OutputMap = !E raw root

An associated configuration file /etc/logrotate.d/scw for logrotate(8) would look like this:

/var/log/scw/*/*.log {
    weekly
    compress
    missingok
    notifempty
    nocreate
}

A local application “app” which runs under the user account appuser, with its own scheduled commands, could have a per-user configuration file /etc/scw/settings/appuser.cf like this:

# Replace the output map for the application's scheduled commands
OutputMap =
OutputMap = OES stamped /srv/app/logs-full/{ITEM}.log
OutputMap = S stamped /srv/app/logs-status/{ITEM}.log
OutputMap = S raw user.notice
OutputMap = S json https://monitoring.company.com/scw-receiver
 
# Keep the metrics near the logs
MetricsDir = /srv/app/metrics/{ITEM}
 
# Only run scheduled commands if the application is active
Prerequisite = systemctl --quiet is-active app.service
 
# The application keeps its scheduled commands alongside the rest of its
# deployed software
ItemsDir = /opt/app/scheduled

The deployment package for “app” could then include scheduled command definitions for the application, such as this example which would be named /opt/app/scheduled/backup.sh:

#!/bin/sh
# scw Description = Back up the application's files
# scw Schedule = 0 22 * * *      # 10pm every day
# scw SuccessInterval = 129600   # alert after a day and a half
 
printf "%s %s\n" "notice" "starting backup" >&3
 
/opt/app/sbin/trigger-app-backup
exitStatus=$?
 
if test $exitStatus -eq 0; then
    printf "%s %s\n" "ok" "backup successful" >&3
else
    printf "%s %s\n" "error" "backup failed - exit status $exitStatus" >&3
fi
 
exit $exitStatus

The post-installation script for the “app” package would include a call to “scw update”. This would define the item “backup” for the user “appuser”, and would regenerate the crontab file /etc/cron.d/scw. The script /opt/app/scheduled/backup.sh would then automatically run at 10pm daily, and write its logs as directed by the OutputMap settings in /etc/scw/settings/appuser.cf. If the monitoring system was directed to discover items by reading the global item list file /var/spool/scw/items.json, it would start monitoring this scheduled command automatically.

REPORTING BUGS

Please report any bugs to scw@ivarch.com.

Alternatively, use the issue tracker linked from the scw home page.

COPYRIGHT

License GPLv3+: GNU GPL version 3 or later.

This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.