Cron Alternatives

This document describes alternatives to the Scheduler interface. Arrower relies on jobs being scheduled by the job queue in the application. If you want to run repeating tasks outside of the application this is possible.

On the one end of the spectrum are kubernetes jobs and on the other end are crons that run directly inside the database.

Running a cron directly in the database can be a huge advantage, as it allows to execute SQL statements directly without any overhead. This is ideal for maintenance work or data aggregation pipelines. The drawback is that your infrastructure might not support those extensions, in which case you can just use Arrower's Scheduler.

Scheduling with pg_cron

Also consult the official documentation of pg_cron.

-- Delete old data on Saturday at 3:30am (GMT)
SELECT cron.schedule('30 3 * * 6', $$DELETE FROM events WHERE event_time < now() - interval '1 week'$$);
 schedule
----------
       42

-- Vacuum every day at 10:00am (GMT)
SELECT cron.schedule('nightly-vacuum', '0 10 * * *', 'VACUUM');
 schedule
----------
       43

-- Change to vacuum at 3:00am (GMT)
SELECT cron.schedule('nightly-vacuum', '0 3 * * *', 'VACUUM');
 schedule
----------
       43

-- Stop scheduling jobs
SELECT cron.unschedule('nightly-vacuum' );
 unschedule 
------------
 t

SELECT cron.unschedule(42);
 unschedule
------------
          t

-- Vacuum every Sunday at 4:00am (GMT) in a database other than the one pg_cron is installed in
SELECT cron.schedule_in_database('weekly-vacuum', '0 4 * * 0', 'VACUUM', 'some_other_database');
 schedule
----------
       44

-- Call a stored procedure every 5 seconds
SELECT cron.schedule('process-updates', '5 seconds', 'CALL process_updates()');

-- Process payroll at 12:00 of the last day of each month
SELECT cron.schedule('process-payroll', '0 12 $ * *', 'CALL process_payroll()');

 ┌───────────── min (0 - 59)
 │ ┌────────────── hour (0 - 23)
 │ │ ┌─────────────── day of month (1 - 31) or last day of the month ($)
 │ │ │ ┌──────────────── month (1 - 12)
 │ │ │ │ ┌───────────────── day of week (0 - 6) (0 to 6 are Sunday to
 │ │ │ │ │                  Saturday, or use names; 7 is also Sunday)
 │ │ │ │ │
 │ │ │ │ │
 * * * * *

An easy way to create a cron schedule is: crontab.guru.

Postgres Image With pg_cron

You can use any (managed) database that has the pg_cron extension installed. Alternatively, use the docker image ghcr.io/go-arrower/postgres

docker pull ghcr.io/go-arrower/postgres:latest

The image installs the extension into the default database postges and makes it available to your database, as if it was installed there.

The image is regenerated every week, so that you might have the latest version of postgres and its base image available to you.

note

Link ADL on why this is. Testing and limitations of pg_cron

If you want to use the cron from a different user or database, you might have to fine-tune your setup:

Ensure your user can access the cron schema:

-- Inside the database where pg_cron is installed:
-- grant usage to regular user
GRANT USAGE ON SCHEMA cron TO your_username;

The Arrower migrations assume pg_cron is installed in the database postgres if this is not the case update the database used in the migrations, see create_pg_cron_extension.up.sql

Limitations

pg_cron can only run in one database
=> two dbs and migrations to manage (outside of the project)
=> see hack to make it work with fdw

Scheduling with TimescaleDB actions

https://docs.timescale.com/api/latest/actions/

Limitations

Not included in the could licensing of TimescaleDB and only available in the open source / self-host version.

Inserting Jobs into the Database

The repeating tasks introduced above rely on and execute SQL statements only. In some cases this might be to limiting for your needs.
Any valid row in the database will be executed as a job. See the PersistencePayload to learn more about the payload format.

INSERT INTO arrower.gue_jobs(job_id, created_at, updated_at, run_at, queue, job_type, priority, args)
VALUES (generate_ulid(), now(), now(), now(),
        '', 'MyJob', 0,
        (json_build_object(
                 'jobData', json_object('Payload' VALUE 'your-payload')
         ) #>> '{}')::BYTEA -- JSON can not be converted to BYTEA: convert to TEXT first via #>> Get JSON object at specified path as text
       );

Scheduling Complex Tasks as Jobs

If you need to execute more complex business logic, use a cron to insert a row into the jobs table. This way the jobs system is taking over, allowing you to perform any kind of operation without the limits of SQL.

SELECT cron.schedule('your-custom-cron', '* * * * *',
$$INSERT INTO arrower.gue_jobs(job_id, created_at, updated_at, run_at, queue, job_type, priority, args)
VALUES (
generate_ulid(), now(), now(), now(),
'', 'MyJob', 0,
(json_build_object('jobData', '{}') #>> '{}')::BYTEA);$$);

Scheduling with pg_cron​

Postgres Image With pg_cron​

Limitations​

Scheduling with TimescaleDB actions​

Limitations​

Inserting Jobs into the Database​

Scheduling with pg_cron

Postgres Image With pg_cron

Limitations

Scheduling with TimescaleDB actions

Limitations

Inserting Jobs into the Database