Backfill
Overview
An easy-to-use backfill tool. It handles timezone conversion and rate limiting, and supports filtering by table, by source, or by both.
Usage
Run backfill --help to see the usage:
Usage: backfill [OPTIONS] <START> <END>
Arguments:
<START>
<END>
Options:
-d, --dir <DIR> [default: /opt/ardexa/logs]
-l, --log-file <LOG_FILE>
--go
--eps <EPS> [default: 0]
--table <TABLE>
--source <SOURCE>
-v, --verbose
-h, --help Print help
-V, --version Print version
The mandatory arguments are the START and END dates. These must be in RFC 3339 format (e.g. 2025-02-07T17:28:46+11:00). The plugin then scans the log directory (/opt/ardexa/logs by default) for lines that fall inside the given timeframe. NOTE: This plugin only works with logs that contain a date field. The position of the date field does not matter. If there is more than one date field, the first one is used.
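Because a malformed date is an easy mistake to make, it can help to check the START and END values before running the tool. The following is a minimal sketch, assuming a POSIX shell with grep; is_rfc3339 is a hypothetical helper and not part of backfill. It only checks that a value has the same shape as the example timestamp above, and it also accepts a trailing Z, which RFC 3339 allows but which this document does not confirm the tool accepts.

```shell
# Hypothetical helper (not part of backfill): succeeds if the argument looks
# like 2025-02-07T17:28:46+11:00. This checks the shape only; it does not
# verify that the date itself is valid (e.g. month 13 would still pass).
is_rfc3339() {
    printf '%s\n' "$1" | grep -Eq \
        '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(Z|[+-][0-9]{2}:[0-9]{2})$'
}
```

For example, is_rfc3339 "2025-02-07T17:28:46+11:00" succeeds, while is_rfc3339 "2025-02-07 17:28:46" fails because the T separator and the UTC offset are missing.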
By default, the plugin only scans for matching files and prints a summary of the number of matching records. It does not log any data until it receives the --go flag, which signals that you are happy with the arguments. Only then does the plugin start writing data to the corresponding latest.csv file.
For large volumes of data (more than 100,000 records), you are strongly encouraged to use the rate limiter. The --eps flag (events per second) accepts integers only. The plugin includes a time estimate in the summary so that you can tune the rate to meet your needs. Keep in mind that going too fast may overwhelm the connection to the internet, so choose the value carefully. If in doubt, you don't have to process the entire timeframe at once; it can be split into several smaller runs.
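To get a feel for how the rate affects the run time before trying different --eps values, the arithmetic can be sketched by hand. The following is a rough planning aid only, assuming the duration is simply the record count divided by the rate; estimate_seconds is a hypothetical helper, and the estimate printed by the plugin itself should take precedence.

```shell
# Hypothetical helper (not part of backfill): estimated run time in whole
# seconds for a given record count and events-per-second rate, rounded up.
estimate_seconds() {
    records=$1
    eps=$2
    # A rate of 0 or less gives nothing to divide by, so refuse to estimate.
    if [ "$eps" -le 0 ]; then
        echo "eps must be a positive integer" >&2
        return 1
    fi
    # Integer ceiling division: (records + eps - 1) / eps
    echo $(( (records + eps - 1) / eps ))
}
```

For example, estimate_seconds 3000000 300 prints 10000, so three million records at 300 events per second would take a little under three hours under this assumption.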
Once you are ready to commence the backfill, it is a good idea to disown the process if you think it will take more than a few minutes. Use the -l flag to specify a log file so that you can keep track of the progress. In the remote shell, the special DISOWN: prefix is an easy way to background the task, e.g.
DISOWN: backfill 2024-10-12T23:45:00+01:00 2025-01-27T11:25:00+01:00 --eps 300 --table solar -l /tmp/backfill.log --go
Use --table to target a single table, or --source to target a single source. These options can be used together.
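One way to keep the summary-only pass and the real run consistent is to build both command lines from the same arguments. The following is a minimal sketch, assuming a POSIX shell; backfill_cmd is a hypothetical helper that only prints the command for review and does not run it, and the source name inverter1 in the usage note is made up for illustration.

```shell
# Hypothetical helper (not part of backfill): prints the backfill command line
# for the given arguments. With mode "go" it appends --go; with any other mode
# it prints the summary-only form.
backfill_cmd() {
    mode=$1
    shift
    if [ "$mode" = "go" ]; then
        echo "backfill $* --go"
    else
        echo "backfill $*"
    fi
}
```

For example, backfill_cmd preview 2024-10-12T23:45:00+01:00 2025-01-27T11:25:00+01:00 --table solar --source inverter1 prints the summary-only command that combines both filters. Replacing preview with go prints the same command with --go appended, ready to run once the summary looks right.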