NAME
scw - scheduled command wrapper
SYNOPSIS
scw [ --config FILE ] [--set SETTING=VALUE]... ACTION [OPTION]...
scw run [ --force ] [ --strict ] ITEM
scw enable|disable ITEM
scw status ITEM
scw list [--enabled|--disabled] [ --all-users ]
scw update [ --all-users ]
scw -h|--help
scw -V|--version
DESCRIPTION
Wrap a scheduled command in a framework which adds concurrency locking, prerequisites, dependency checks, conflict avoidance, randomised startup delays, flexible logging, and monitoring metrics.
Each distinct scheduled command is referred to as an “item”, and can have its own configuration for each of these features:
- Concurrency locking
-
Prevents an item from being run more than once at the same time, for example if it's scheduled to run every 2 minutes and occasionally takes longer than that to complete.
- Prerequisites
-
Prevents an item from running if some prior condition is not met, for example checking whether this is the active node of a failover cluster, or whether some crucial underpinning service is running.
- Dependency checks
-
Ensures that an item will run only if some other item has succeeded before it - with the possibility of waiting a short while for the dependency to finish rather than giving up straight away. Useful when linked items need to be scheduled separately at particular times but the later one can only run if the earlier one has succeeded. For example, a data load batch may have to run at some time in the early morning, and a subsequent data processing batch, which for business reasons has to run after a particular time of day, can only run if the early morning data load succeeded.
- Conflict avoidance
-
Prevents an item from running if some other item is still running - with the possibility of waiting a short while for the conflicting item to finish rather than giving up straight away.
- Randomised startup delays
-
Avoids resource overconsumption when the same item is scheduled to run on multiple systems.
- Flexible logging
-
Item output can be sent to any combination of files, syslog, email, or HTTP, with or without timestamps. The standard output and standard error of items can be combined or separated, and a special status stream is also made available so that significant events (such as starting each step of a multi-step process) can be recorded separately.
- Metrics
-
A collection of informational files is maintained for each item. Any monitoring agent can read these files and raise alerts based on their contents. An item list file (a JSON array of item descriptions) is automatically generated so that a system such as Zabbix can find and monitor all items without needing an operator to make adjustments when items are added or removed.
All of these features are optional.
In its simplest form, scw can be invoked from existing scheduler entries by using the “run” action, so for example this crontab(5) entry would change as follows:
    # Original entry
    0 * * * * /some/command --option ARGUMENT

    # Replacement
    0 * * * * scw run mycommand -s Command="/some/command --option ARGUMENT"
In this example, the scheduled command will be known to scw as the “mycommand” item, and all of the logs and metrics produced will use that name.
To make use of the full range of features, place scheduled commands or definition files into the item definition directory, and call “scw update -a” to generate the crontab and the item list file.
ACTIONS
- run ITEM
-
Start running the item named ITEM. First, scw checks whether the item has been disabled; if the “--force” option was passed, the item is run as if it were enabled. It then applies any prerequisite checks, dependency and conflict checks, startup delays, and concurrency locks. If standard input, output, and error are all connected to a terminal, any configured randomised startup delay will be skipped, and output will be written to the terminal as well as to the usual logging destinations.
- enable ITEM
-
Enable the item named ITEM so that it runs as scheduled.
- disable ITEM
-
Disable the item named ITEM so that it will not run as scheduled, unless run with the “--force” option.
- status ITEM
-
Show the current status of the item named ITEM. This only operates on items defined in the item definition directory.
- list
-
List all defined items, or items that are defined and enabled (with the “--enabled” option), or items that are defined and disabled (with “--disabled”). With the “--all-users” option, items for all users are listed, not just the current user.
- update
-
Update the crontab and the item list file from the items defined in the item definition directory (see the FILES section). With the “--all-users” option, items for all users are considered, not just the current user, and the crontab file is written in the form expected in /etc/cron.d/, with the additional username field.
OPTIONS
With no action:
- -h, --help
-
Print a usage message on standard output and exit successfully.
- -V, --version
-
Print version information on standard output and exit successfully.
With any action:
- -c, --config FILE
-
Read configuration from FILE instead of the system-wide default location.
- -s, --set SETTING=VALUE
-
Set the item or configuration setting SETTING to VALUE (see the CONFIGURATION section).
With the “run” action:
- -f, --force
-
Run the item regardless of whether it has been marked as disabled.
- -S, --strict
-
Refuse to run the item if the CheckLockFile cannot be opened, the MetricsDir cannot be written to, the item lock cannot be opened, or a file used in an OutputMap cannot be opened. Normally, if any of these occur, a warning is produced, and the item runs anyway.
With the “list” action:
- -e, --enabled
-
Only list items which are currently enabled.
- -d, --disabled
-
Only list items which are currently disabled.
With the “list” and “update” actions:
- -a, --all-users
-
Look at items for all users, not just the current one. See the NOTES section.
ITEM DEFINITIONS
The settings for an item are defined in a file named ITEM.cf under the user's item definition directory, by default /etc/scw/items/USER/ (see the FILES section).
When an item is defined in this way, its Command setting should be provided, to state what to run.
Alternatively, shell or Perl scripts can be placed directly into the directory, named ITEM.sh or ITEM.pl, so long as their first comment block at the top of the file contains the item settings prefixed with “scw”, like this:
    #!/bin/sh
    #
    # Perform action ABC.
    #
    # scw Description = Do ABC
    # scw Schedule = 0 2 * * Mon
    # scw MaxRunTime = 30 minutes
    # scw SuccessInterval = 1 day 30 minutes
    #
This mechanism allows other packages to drop their own scheduled commands directly into this framework, and so long as they run “scw update -a” after installation, the items will be scheduled and configured correctly.
If an item has a script as well as a “.cf” file, the “.cf” settings are applied first, and the Command is implied.
Defining items in this way allows software developers to include scheduling information directly in their build artefacts, so that packaging and deployment teams don't need to worry about a developer-generated crontab running something with the wrong credentials (assuming that the package has been configured to run the software under its own user account).
CONFIGURATION
Item definitions, and configuration files, share the same syntax: SETTING=VALUE pairs, one per line. Blank lines are ignored, as are comments (denoted by “#”). Leading and trailing whitespace is ignored.
Unless otherwise stated, each setting takes only one value, so setting it again overrides whatever it was set to earlier. To remove a setting, set it to an empty value.
Placeholders
The following placeholders can be used in values:
- {ITEM}
The name of this item.
- {USER}
The username of the account currently running scw.
A placeholder should only be used once in values for UserConfigFile and ItemsDir - see the NOTES section.
Configuration settings
The following settings are available in configuration files:
- ItemsDir
-
The directory in which to find item definitions and item scripts. The default value is usually /etc/scw/items/{USER}.
- MetricsDir
-
The directory in which to place metrics files for an item. The directory must already exist and be writable by the current user, or if it doesn't exist, its parent must be writable by the current user so that scw can create it. The default value is usually /var/spool/scw/{USER}/{ITEM}.
- CheckLockFile
-
The lock file to use when performing dependency or conflict checks. The default value is usually /var/spool/scw/{USER}/.lock.
- Sendmail
-
The command to use to send email. It should accept message headers and a message body on standard input. The default value is usually /usr/sbin/sendmail -FCronDaemon -i -odi -oem -oi -t.
- TransmitForm
-
The command to use to transmit messages encoded as a form message to an HTTP or HTTPS URL. When called, the environment variable SCW_TIMEOUT will be set to the item's HTTPTimeout value, the environment variable SCW_FILE will name a temporary file containing the data to post, and SCW_URL will contain the URL to post to. The default value is usually curl -s -S -m "$SCW_TIMEOUT" --data-binary "@$SCW_FILE" "$SCW_URL".
- TransmitJSON
-
The command to use to transmit messages encoded as a JSON array to an HTTP or HTTPS URL. The same environment variables are used as above. The default value is usually the same as the above, with the extra option -H "Content-Type: application/json" just before "$SCW_URL".
- UserConfigFile
-
The per-user configuration file. When scw runs, it first loads the global configuration file, and then this one. The default value is usually /etc/scw/settings/{USER}.cf. This setting can only be changed in the global configuration file.
- ItemListFile
-
The file to which “scw update” will write a JSON array describing all items, which can be used by a monitoring system to discover what to monitor. The default value is usually /var/spool/scw/items.json. This setting can only be changed in the global configuration file.
- CrontabFile
-
The file to which “scw update” will write a crontab. The default value is usually /etc/cron.d/scw. This setting can only be changed in the global configuration file.
- UpdateLockFile
-
The lock file to use to ensure that only one instance of “scw update” is running at a time. The default value is usually /var/spool/scw/.update-lock. This setting can only be changed in the global configuration file.
Configuration files may also contain any of the other settings listed below for items, to set defaults which items can override.
Item settings
The following settings are available for items.
- Description
-
A short one-line description of what this item does. This is recorded in the item list file by “scw update”.
- Command
-
The command to run. This is passed to “sh -c”.
- Schedule
-
When to run this item. This takes time and date fields in the same format as crontab(5). An item can have up to 16 Schedule values. This is used by “scw update” when generating a crontab.
- RandomDelay
-
Each time this item starts, there will be a random delay of up to this many seconds before continuing with checks and running the command. The time period can be specified in seconds, or with multiple numbers suffixed with “w” (weeks), “d” (days), “h” (hours), “m” (minutes), or “s” (seconds). Either the whole word can be given, or just the first letter. Spaces are optional. For example, “1d5h7m6s”, “1 day 5 hours 7 minutes 6 seconds”, and “104826” are all equivalent.
- MaxRunTime
-
The maximum number of seconds to allow the command to run, after which it will be forcibly terminated. If this is not set, there is no limit. The same time period formatting rules as RandomDelay apply here.
- Prerequisite
-
A command to run before attempting to run the item's Command. If the Prerequisite command exits with a non-zero status, the item is treated as if it was not scheduled to run at all, and so its command is not run. This can be used, for instance, to check that this server is the active node of a failover cluster, so that scheduled commands only run on the active node. The output of the Prerequisite command is always discarded. Note: the Prerequisite command may be run more than once for an item if any delays are involved, since it is invoked before any delay, and then again after the delay, just before the item's Command is run.
- SuccessInterval
-
The number of seconds permitted between successful command runs before an alert should be raised. This is not used directly by scw, but is made available as a metrics file for your monitoring system to read - see the Metrics subsection under FILES. The same time period formatting rules as RandomDelay apply here.
- ConcurrencyWait
-
If the item is already running, it will not start a second instance. Instead of abandoning the attempt immediately, it will wait up to this many seconds for the previous run to complete first. If the previous run finishes by then, the new run will proceed as normal. If ConcurrencyWait is not set, there will be no waiting. The same time period formatting rules as RandomDelay apply here.
- SilentConcurrency
-
If the item was already running, and did not finish before the ConcurrencyWait timeout expired, then a second instance won't be started. By default, when this happens, the overran metrics file is created and no other metrics are affected. If the SilentConcurrency setting is “no”, “off”, “false”, or “0”, then this situation will instead be treated as if the command had run and failed.
- DependsOn
-
The name of another item which must have successfully run since the previous run of this item. If the dependency is not met, this item will not run. An item can have up to 16 DependsOn values.
- DependencyWait
-
If not all dependencies are met, keep waiting for them to be met for up to this many seconds. If, after waiting, the dependencies have been met, start the item as normal. The same time period formatting rules as RandomDelay apply here.
- SilentDependency
-
By default, if an item's dependencies are not met, the item is treated as if it ran its command and it failed. If the SilentDependency setting is “yes”, “on”, “true”, or “1”, then this situation will instead be treated as if the item was not scheduled to be run at all, and no metrics will be updated.
- ConflictsWith
-
The name of another item which must not be running at the same time as this item. If it is, this item will not run. An item can have up to 16 ConflictsWith values.
- ConflictWait
-
If a conflicting item is running, keep waiting for it to finish for up to this many seconds. If, after waiting, the conflicts have been resolved, start the item as normal. The same time period formatting rules as RandomDelay apply here.
- SilentConflict
-
By default, if an item can't start because of a conflict, it is treated as if it ran its command and it failed. If the SilentConflict setting is “yes”, “on”, “true”, or “1”, then this situation will instead be treated as if the item was not scheduled to be run at all, and no metrics will be updated.
- StatusMode
-
A command can provide additional information about its progress through a status stream (see the STATUS REPORTING section). If StatusMode is set to “fd”, then status information is read from the command's file descriptor 3. If it is set to “stdout” or “stderr”, status information is derived from the command's standard output or standard error respectively, using the StatusTag.
- StatusTag
-
When StatusMode is not “fd”, any command output lines in the appropriate stream which start with the StatusTag will have that tag removed and the remainder will be used as the status information. See the STATUS REPORTING section.
- TimestampUTC
-
By default, timestamps are expressed in the system's local time zone. If the TimestampUTC setting is “yes”, “on”, “true”, or “1”, then timestamps are expressed in UTC. Note that this only affects timestamps in the output - the Schedule always refers to the system's local time zone, as cron(8) does.
- HTTPInterval
-
To improve efficiency when sending output to an HTTP or HTTPS URL (see below), lines are not sent immediately, but are collected and transmitted in batches, with this number of seconds between them. The same time period formatting rules as RandomDelay apply here.
- HTTPTimeout
-
Terminate transmissions after this number of seconds. The same time period formatting rules as RandomDelay apply here.
- OutputMap
-
Map the command's output to a destination. See below for more details. An item can have up to 16 OutputMap values.
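As an illustration of how several of these settings combine, here is a minimal sketch of an item script for the data-processing scenario from the DESCRIPTION section. The item names “load-data” and “backup”, the schedule, and all the time periods are hypothetical values chosen for the example, as is the process_data function standing in for the real work.

```shell
#!/bin/sh
#
# Process the data loaded earlier in the morning.
#
# scw Description = Process loaded data
# scw Schedule = 0 9 * * *
# scw RandomDelay = 5 minutes
# scw MaxRunTime = 2 hours
# scw DependsOn = load-data
# scw DependencyWait = 30 minutes
# scw ConflictsWith = backup
# scw ConflictWait = 10m
#

# Hypothetical placeholder for the real processing step.
process_data() {
    echo "processing data"
}

process_data
```

Saved as, say, process-data.sh in the item definition directory, this would only run if “load-data” has succeeded since the previous run, waiting up to 30 minutes for it, and would wait up to 10 minutes for “backup” to finish.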
Output mapping
The OutputMap settings take values of the form “STREAM FORMAT DESTINATION”, where STREAM selects one or more command output streams, FORMAT selects how it should be formatted, and DESTINATION selects where to send it to.
For example, an item might have this output map configuration, or it could even be in the global configuration file as the default for this server:
    # Write stdout, stderr, and status, with timestamps, to a file
    OutputMap = OES stamped /var/log/scw/{USER}/{ITEM}.log
    # Email stdout and stderr, without timestamps, to root@localhost
    OutputMap = OE raw root@localhost
    # On failure, email stderr, without timestamps, to admin@company.com
    OutputMap = !E raw admin@company.com
    # Write status messages to syslog as facility "user", level "notice".
    OutputMap = S raw user.notice
    # Send status messages as JSON data via HTTPS POST
    OutputMap = S json https://status.company.com/receiver
The STREAM is any combination of the following:
- O
Standard output. “-” may also be used.
- E
Standard error.
- S
Status messages.
- !
Spool until the command completes, and then only send to the destination if the exit status was non-zero, indicating failure.
The FORMAT is one of the following:
- raw
Lines of text exactly as output by the command.
- stamped
Lines prefixed with a timestamp and, if multiple streams were selected, an indicator of which stream they came from.
- json
An array of JSON objects. Each object contains integers named epoch and pid, and strings named hostname, user, item, stream, and message; the stream will be one of “stdout”, “stderr”, or “status”.
- form
An HTTP form post (key=value pairs) containing the same fields as the json format, each key being suffixed with a sequence number starting with 1, such as user1=root.
The DESTINATION is a filename, a list of email addresses separated by commas, a syslog priority in the form facility.level, or an HTTP or HTTPS URL.
When the DESTINATION is an HTTP or HTTPS URL, the FORMAT may only be “json” or “form”. These formats may only be used with URLs and no other destination types.
When the DESTINATION is one or more email addresses, the provided value is used as the “To:” header in an email which is sent when the command completes. No email is sent if there was no output at all. If the STREAM contains “!”, then the email will only be sent if the command fails (exits with a non-zero status).
STATUS REPORTING
When an item's command runs through several steps, it can pass details of which step it's up to, and whether the previous step failed, to scw using the status reporting mechanism.
The first word of the status report should be one of “notice”, “ok”, “warning”, or “error”.
When running an item's command, scw inserts a special “begin” status report at the start, and an “end” status report at the end.
Depending on the StatusMode and StatusTag settings, status reporting might look like this:
    #!/bin/sh
    #
    # scw StatusMode = fd

    printf "%s %s\n" "notice" "Starting step 1" >&3
    # do some work here
    if $succeeded; then
      printf "%s %s\n" "ok" "Step 1 complete" >&3
    else
      printf "%s %s\n" "error" "Step 1 failed" >&3
      exit 1
    fi
    printf "%s %s\n" "notice" "Starting step 2" >&3
    # do some more work here
    # ...
Adding status information like this makes analysis easier, for example discovering how the time taken for each step of a multi-step command fluctuates, or clearly highlighting where a command failed.
Using a StatusMode of “stdout” or “stderr” means prefixing the status message with the value of the StatusTag, like this:
    #!/bin/sh
    #
    # scw StatusMode = stdout
    # scw StatusTag = STATUS:

    printf "STATUS: %s %s\n" "notice" "Starting step 1"
    # ... and so on.
Writing status information this way may be easier than with the “fd” method in some circumstances. It has the side effect that the status messages will also be recorded in standard output or standard error logs.
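With the “fd” method, the repeated printf calls can be wrapped in a small helper. The following is a minimal sketch, not part of scw: the function name report_status is hypothetical, and the fallback to standard error when file descriptor 3 is not open (for example, when the script is run by hand) is an assumption about what is convenient rather than documented behaviour.

```shell
#!/bin/sh
#
# scw StatusMode = fd

# report_status LEVEL MESSAGE...
# LEVEL should be one of: notice, ok, warning, error.
report_status() {
    level=$1
    shift
    # Probe fd 3 in a subshell so that a closed descriptor cannot
    # terminate the calling script.
    if (: >&3) 2>/dev/null; then
        printf '%s %s\n' "$level" "$*" >&3
    else
        # Assumption: fd 3 is closed, so fall back to standard error.
        printf '%s %s\n' "$level" "$*" >&2
    fi
}
```

A script would then call, for example, report_status notice "Starting step 1" before each step, and report_status ok or report_status error afterwards.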
EXIT STATUS
The following exit status values apply to all actions:
- 0
-
Success: the action completed without error.
- 5
-
An unknown option, action, or setting was passed on the command line, or too many or too few arguments were provided for the chosen action. No action was taken.
- 6
-
The configuration file could not be read, or contains unrecoverable errors. No action was taken.
- 7
-
Some other error occurred that was not covered by any of the above. Action may have been partially completed.
Exit status values for “run”
The “run” action can exit with one of the following:
- 0
-
Success: the action completed without error.
- 1
-
Item failed: the item's command was run, and it exited non-zero.
- 2
-
Item timed out: the item's command was run, but it reached its configured maximum run time, and was forcibly terminated.
- 3
-
The item's command ran successfully, but there was a problem accessing the metrics directory or a lock file, so no concurrency, dependency, or conflict checks were possible.
- 4
-
The item's command was run, and it exited non-zero, and there was a problem accessing the metrics directory or a lock file, so no concurrency, dependency, or conflict checks were possible.
- 8
-
Item has no command: the specified item has no entry in the item definition directory and there was no command in the remaining command line arguments to scw, so no command has been run.
- 9
-
Item not enabled: the item is currently disabled, and the “--force” option was not provided, so the command has not been run.
- 10
-
Item prerequisites not met: the prerequisite check has failed, so the command has not been run.
- 11
-
Item dependencies not met: the items on which this item depends have not all run, so the command has not been run.
- 12
-
Item conflict: one of the items which this item conflicts with is currently running, and all options for startup delays have been exhausted, so the command has not been run.
- 13
-
Item already running: the item is already running, and all of its configured options for startup delays have been exhausted, so the command has not been run.
Note that any exit status lower than 5 indicates that the item's command was definitely started; only an exit status of 0 or 3 indicates that it succeeded.
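A caller that needs to react to these values can group them as follows. This is a minimal sketch based only on the tables above; the function name classify_run_status and the category labels it prints are hypothetical.

```shell
#!/bin/sh

# classify_run_status CODE
# Print a coarse category for an exit status of "scw run".
classify_run_status() {
    case "$1" in
        0|3)             echo "succeeded" ;;  # command ran and exited zero
        1|4)             echo "failed" ;;     # command ran and exited non-zero
        2)               echo "timed-out" ;;  # command hit MaxRunTime
        5|6|7)           echo "scw-error" ;;  # usage, configuration, or other error
        8|9|10|11|12|13) echo "not-run" ;;    # command was never started
        *)               echo "unknown" ;;
    esac
}
```

For example, after “scw run mycommand”, calling classify_run_status "$?" distinguishes a command that failed from one that was never started.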
Exit status values for “status”
The “status” action exits with the sum of the following values:
- 16
-
Added if the item does not exist (meaning that it has no entry in the item definition directory).
- 32
-
Added if the item is disabled.
- 64
-
Added if the item is currently running.
For example, an exit status of 0 indicates that the item exists, is enabled, and is not currently running. An exit status of 96 indicates an item which is disabled, but currently running.
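Because the values are summed, each one can be tested separately with a bitwise AND. The following is a minimal sketch; the function name describe_status and its output format are hypothetical. It treats non-zero values below 16 as errors, since the general exit status values 5 to 7 apply to all actions.

```shell
#!/bin/sh

# describe_status CODE
# Decode the exit status of "scw status ITEM" into three flags.
describe_status() {
    code=$1
    # Non-zero values below 16 are general errors, not flag sums.
    if [ "$code" -ne 0 ] && [ "$code" -lt 16 ]; then
        echo "error"
        return 1
    fi
    exists=yes
    enabled=yes
    running=no
    [ $((code & 16)) -ne 0 ] && exists=no
    [ $((code & 32)) -ne 0 ] && enabled=no
    [ $((code & 64)) -ne 0 ] && running=yes
    printf 'exists=%s enabled=%s running=%s\n' "$exists" "$enabled" "$running"
}

describe_status 96   # prints: exists=yes enabled=no running=yes
```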
FILES
File locations may be adjusted by the installation process, so for example paths listed here under /etc may be under /usr/local/etc on your system. Locations may also be overridden by configuration settings.
- /etc/scw/default.cf
-
Global default settings.
- /etc/scw/settings/USER.cf
-
Settings to apply when running as user USER.
- /etc/scw/items/USER/*.cf
-
Item definitions for user USER.
- /etc/scw/items/USER/*.sh
- /etc/scw/items/USER/*.pl
-
Item scripts for user USER, with their definitions embedded in a comment block at the top of the script (see the ITEM DEFINITIONS section).
- /var/log/scw/USER/*.log
-
The default location for log files generated by items owned by USER.
- /var/spool/scw/USER/ITEM/
-
Metrics files for the item ITEM owned by the user USER. See below for more details.
- /var/spool/scw/USER/.lock
-
An empty file used for locking while checking for dependencies and conflicts.
- /var/spool/scw/items.json
-
A JSON array describing all items, suitable for using as a Zabbix low-level discovery file. Updated by “scw update”.
- /etc/cron.d/scw
-
The default crontab written by “scw update”.
- /var/spool/scw/.update-lock
-
An empty file used for locking while running “scw update”.
Metrics
The metrics directory for an item can contain these files:
- disabled
-
An empty file whose presence indicates that the item is disabled, and whose last-modification time indicates when it was disabled.
- success-interval
-
The number of seconds permitted between successful command runs before an alert should be raised, followed by a newline. Monitoring systems should be instructed to raise an alert if the succeeded file's last-modification time is more than success-interval seconds ago and the prerequisites-met file exists.
- prerequisites-met
-
An empty file which is created if the item's prerequisites are met, and deleted if they are not. Its last-modification time indicates when the prerequisites were last successfully checked.
- started
-
An empty file whose last-modification time indicates when the item last started. It is not updated until the item's command actually starts running (so, after any startup delays).
- ended
-
An empty file whose last-modification time indicates when the item last ended after running the item's command, regardless of whether the command succeeded. Its last-modification time is not updated unless the command actually ran.
- succeeded
-
An empty file whose last-modification time indicates when the item's command last ran and ended with a zero exit status. It is not deleted on failure.
- failed
-
An empty file which is created when the item's command runs and ends with a non-zero exit status. Its last-modification time indicates when the command first failed - it is not updated when subsequent runs fail. This file is deleted as soon as the command runs and exits with a zero exit status.
- overran
-
An empty file which is created when the item could not run because it was already running and all startup delay options were exhausted. Its last-modification time indicates when this first happened. It is deleted the next time the item is able to start.
- run-time
-
The number of seconds the item's command most recently took to run, followed by a newline. It is updated each time the item's command finishes a run, not counting any startup delays, and regardless of the command's exit status.
- .lock
-
An empty file which the item will lock while running.
- pid
-
While the item is running, this file contains the item's process ID, followed by a newline. The file is deleted on exit.
- last-status
-
The status message most recently reported by the item (see the STATUS REPORTING section).
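The alert rule described under success-interval can be sketched as a shell check that a monitoring agent might run. This is an illustration only: the function name check_item_fresh and its output words are hypothetical, “stat -c %Y” assumes GNU coreutils (BSD systems use “stat -f %m”), and treating a missing succeeded file as stale is an assumption.

```shell
#!/bin/sh

# check_item_fresh METRICS_DIR
# Print "ok", "stale", "inactive", or "unknown" for one item.
check_item_fresh() {
    dir=$1
    # No prerequisites-met file: the rule above says not to alert.
    [ -e "$dir/prerequisites-met" ] || { echo "inactive"; return 0; }
    # No success-interval file: there is no threshold to test against.
    [ -r "$dir/success-interval" ] || { echo "unknown"; return 0; }
    interval=$(cat "$dir/success-interval")
    # Assumption: an item that has never succeeded counts as stale.
    [ -e "$dir/succeeded" ] || { echo "stale"; return 0; }
    now=$(date +%s)
    last=$(stat -c %Y "$dir/succeeded")   # GNU stat: mtime in epoch seconds
    if [ $((now - last)) -gt "$interval" ]; then
        echo "stale"
    else
        echo "ok"
    fi
}
```

With the default MetricsDir, this could be called as check_item_fresh /var/spool/scw/USER/ITEM, with USER and ITEM replaced as appropriate.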
NOTES
Commands will always appear to take at least 1 second to complete due to scw waiting for output to cease after the process has ended. This approach is required to avoid lines being recorded out of order when the command writes them to different streams.
Time periods such as MaxRunTime may be written in seconds, or as any combination of weeks, days, hours, minutes, and seconds, each number suffixed with the unit. Spaces are allowed between each component of the time period, but no other words or punctuation. These are all equivalent:
    MaxRunTime = 2 weeks 3 days 10 hours 5 minutes 2 seconds
    MaxRunTime = 2w 3d 10h 5m 2s
    MaxRunTime = 2w3d10h5m2s
    MaxRunTime = 17 days 605 minutes 2 seconds
    MaxRunTime = 1505102 seconds
    MaxRunTime = 1505102
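To check a time period by hand, the rules above can be approximated with a short shell function. This is a sketch of the documented format, not the parser scw itself uses; the name period_to_seconds is hypothetical, and as a simplification it accepts any word starting with a valid unit letter.

```shell
#!/bin/sh

# period_to_seconds PERIOD
# Convert a period such as "2w 3d 10h 5m 2s" to a number of seconds.
period_to_seconds() {
    printf '%s\n' "$1" | awk '
    {
        s = tolower($0)
        gsub(/[ \t]+/, "", s)          # spaces are optional
        total = 0
        while (length(s) > 0) {
            if (!match(s, /^[0-9]+/)) exit 1
            n = substr(s, 1, RLENGTH) + 0
            s = substr(s, RLENGTH + 1)
            unit = "s"                 # a bare number means seconds
            if (match(s, /^[a-z]+/)) {
                unit = substr(s, 1, 1) # only the first letter matters here
                s = substr(s, RLENGTH + 1)
            }
            if      (unit == "w") total += n * 604800
            else if (unit == "d") total += n * 86400
            else if (unit == "h") total += n * 3600
            else if (unit == "m") total += n * 60
            else if (unit == "s") total += n
            else exit 1
        }
        print total
    }'
}

period_to_seconds "2w 3d 10h 5m 2s"   # prints 1505102
```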
An item's Prerequisite command is run when “scw run” starts. If a delay then arises due to concurrency locks, dependencies not being met yet, or conflicting items still running, then it will be run again, in case conditions have changed. This means that the Prerequisite command should be carefully chosen to handle being run twice per item, and ideally have no side effects.
The name of an item must only contain letters, numbers, underscores, and hyphens.
Item settings files and scripts in ItemsDir, the per-user configuration file in UserConfigFile, and the global configuration file, must be normal files, and must not be symbolic links.
Each setting which takes multiple values - Schedule, DependsOn, ConflictsWith, OutputMap - is limited to 16 values in total after applying rules from all relevant sources. For example, if the global configuration defines 3 OutputMap values, an item may only add 13 more unless it first clears the list by assigning an empty value to OutputMap.
The CrontabFile written by “scw update” is in standard crontab(5) format, unless the filename starts with /etc/cron.d/, or the “--all-users” option was passed, in which case the system crontab format is used, where each command is preceded by the username of the user to run it as. This allows “scw update -a” to write a system-wide crontab for all users, so applications which run under their own user ID can have their packages place their schedules under /etc/scw/items/USER/, and they will be run as USER.
When using the “list” and “update” actions with the “--all-users” option, users are enumerated first, and then each user's items are enumerated. Users are found by replacing {USER} in the UserConfigFile setting with a “*” and using it as a glob(7) pattern, to find all users with their own distinct configuration. Then, {USER} in each ItemsDir setting (its global value, and any new values found in user config files) is replaced with a “*” and used as a glob(7) pattern as well. This means that the “--all-users” option will not work properly if UserConfigFile uses the placeholder twice in its value, and neither the “list” nor the “update” action will work properly if ItemsDir uses the placeholder twice, so values like /opt/{USER}/settings-{USER}.cf are not recommended.
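The substitution described above can be illustrated with sed. This is only a sketch of the idea, not scw's own code: the function name user_glob is hypothetical, and replacing every occurrence of the placeholder is an assumption.

```shell
#!/bin/sh

# user_glob TEMPLATE
# Replace each {USER} placeholder with "*" to form a glob(7) pattern.
user_glob() {
    printf '%s\n' "$1" | sed 's/{USER}/*/g'
}

user_glob '/etc/scw/settings/{USER}.cf'      # prints /etc/scw/settings/*.cf
user_glob '/opt/{USER}/settings-{USER}.cf'   # prints /opt/*/settings-*.cf
```

The second pattern would also match a path such as /opt/alice/settings-bob.cf, from which a single username cannot be recovered, which is one plausible reason for the restriction.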
Transmission over HTTP and HTTPS is implemented by calling curl(1), which must be in the path. Its output is discarded, and errors are written to stderr.
EXAMPLES
The setup guide contains several examples. This is usually installed as /usr/share/doc/scw/SETUP.md.
REPORTING BUGS
Please report any bugs to scw@ivarch.com.
Alternatively, use the issue tracker linked from the scw home page.
SEE ALSO
crontab(5), cron(8), curl(1)
COPYRIGHT
Copyright © 2024 Andrew Wood.
License GPLv3+: GNU GPL version 3 or later.
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.