ISFDB:Configure Nightly Processing
Latest revision as of 13:49, 30 November 2023

Note: the following three files reside in the "nightly" subdirectory under INSTALL_HTML.

Weekly Processing

Configure crontab to run weekly_job.py once a week, e.g.:

00    01    *    *    7    /var/www/html/nightly/weekly_job.py > /dev/null 2>&1

When it runs, this task will regenerate database statistics AND rerun the cleanup reports.

Nightly Processing

Configure crontab to run nightly_job.py once a day EXCEPT when weekly processing runs, e.g.:

00    01    *    *    1-6    /var/www/html/nightly/nightly_job.py > /dev/null 2>&1

When it runs, this task will rerun the cleanup reports.
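The weekly and nightly entries above differ only in the day-of-week field: `7` (Sunday) for the weekly job and `1-6` (Monday through Saturday) for the nightly job, so exactly one of the two runs each day. A minimal sketch that checks this split — the helper `expand_dow` is illustrative only, not part of ISFDB:

```python
def expand_dow(field: str) -> set[int]:
    """Expand a crontab day-of-week field such as '7' or '1-6' into a set of days.

    Handles comma-separated lists and dashed ranges; crontab also accepts 0
    for Sunday, which this simple sketch does not normalize.
    """
    days = set()
    for part in field.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            days.update(range(int(lo), int(hi) + 1))
        else:
            days.add(int(part))
    return days

weekly = expand_dow("7")     # Sunday only
nightly = expand_dow("1-6")  # Monday through Saturday

assert weekly & nightly == set()                   # the two jobs never share a day
assert weekly | nightly == {1, 2, 3, 4, 5, 6, 7}  # together they cover every day
```

This is why the nightly entry uses `1-6` rather than `*`: with `*` both jobs would fire at 01:00 on Sundays.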

Monthly Processing

Configure crontab to run monthly_job.py once a month, e.g.:

00    02    7    *    *    /var/www/html/nightly/monthly_job.py > /dev/null 2>&1

When it runs, this task will recreate the "Suspected Duplicate Authors" cleanup report. Note that the algorithm doesn't scale well with the number of author records in the database: it takes a long time and puts a lot of stress on the server, so it should be run only infrequently. At the moment the entry is commented out on the live server due to these performance issues.
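Since the monthly job is currently disabled on the live server, its crontab entry would be kept in place but commented out by prefixing it with `#`, e.g.:

```
#00    02    7    *    *    /var/www/html/nightly/monthly_job.py > /dev/null 2>&1
```

Leaving the commented line in the crontab makes it easy to re-enable the job later if the scaling issue is resolved.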