
Clean Up Old Jobs Automatically

Synopsis

Old jobs left in Qube are often of little to no use. While their database entries aren't very detrimental to the operation of Qube, the logs from all of those jobs, when ignored, can balloon in size, sometimes filling the drive(s) on which they reside.

By removing old jobs, you reduce the number of files used by the database and, more importantly, reduce the disk space required to hold logs of jobs that are no longer useful to your production/facility.

We have included a Python script called job_cleanup.py that takes, as an argument, the number of days' worth of jobs that you would like to keep. It then removes all older jobs and their logs. The idea is that you run this script daily, always keeping, for example, the last 30 days' worth of jobs and deleting the rest. The script looks at a job's completion time rather than its submission or start time, so a job is not considered for removal until its most recent completion time is older than the specified number of days.
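The retention rule described above can be sketched in a few lines. This is only an illustration of the idea, not the code inside job_cleanup.py; the function name is_removable and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def is_removable(last_completed, days_to_keep, now=None):
    """Return True if the job's most recent completion is older than the cutoff.

    Hypothetical helper: job_cleanup.py may implement this differently.
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=days_to_keep)
    return last_completed < cutoff


# With a 30-day window, a job that completed 45 days ago qualifies for
# removal, while a job that completed 5 days ago is kept.
now = datetime(2024, 3, 19)
print(is_removable(now - timedelta(days=45), 30, now))  # True
print(is_removable(now - timedelta(days=5), 30, now))   # False
```

Because the comparison uses the most recent completion time, a job that was submitted long ago but finished recently stays in the "keep" window.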


Performance Charts data comes from a separate database, so even after removing old jobs, the Performance Charts remain fully intact.


Job removal is a fairly database- and disk-intensive operation; avoid removing a very large number of jobs at one time. Limit job removal to 1000 jobs at a time, and wait a few minutes between removals so that the supervisor's filesystem is not swamped with table deletion operations.
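If you ever remove jobs from your own tooling, the batching advice above might look like the following sketch. The remove_batch callback is hypothetical (it stands in for whatever actually removes a list of job ids), and the 180-second pause is just one reading of "a few minutes".

```python
import time


def batches(job_ids, batch_size=1000):
    """Yield successive slices of at most batch_size job ids."""
    for start in range(0, len(job_ids), batch_size):
        yield job_ids[start:start + batch_size]


def remove_in_batches(job_ids, remove_batch, batch_size=1000,
                      pause_seconds=180, sleep=time.sleep):
    """Call remove_batch on each slice, pausing between batches.

    remove_batch is a hypothetical callback supplied by the caller.
    The pause happens before every batch except the first, so there is
    no pointless wait after the final batch.
    """
    removed = 0
    for index, batch in enumerate(batches(job_ids, batch_size)):
        if index > 0:
            sleep(pause_seconds)
        remove_batch(batch)
        removed += len(batch)
    return removed
```

Passing sleep as a parameter keeps the function easy to exercise without actually waiting.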


Requirements

Python 2.x must be installed on the supervisor. For Windows, we recommend the installer from www.python.org. OS X should have Python 2.x already installed. On Linux, install Python 2.x from your distribution's package manager.

While not required, the MySQLdb Python module should be available for better performance. Assuming you have pip, running pip install python-mysql (or easy_install python-mysql if you don't have pip) in a terminal/command prompt should install the MySQLdb module for you. If that package name is not found, note that the MySQLdb module is published on PyPI as MySQL-python. You may need to use sudo on OS X or Linux when you run pip.
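A quick way to confirm whether the module is importable by a given interpreter is a check like the one below. The helper name module_available is hypothetical; run it with the same Python that will run job_cleanup.py.

```python
import importlib


def module_available(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


print("MySQLdb available: %s" % module_available("MySQLdb"))
```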

How to use the script

You will find this same information by running "job_cleanup.py -h":

 

$ ./job_cleanup.py -h
Usage:
    job_cleanup.py [options]: delete jobs and/or logs, either for jobs completed more than X days ago, and/or for all jobs removed from Qube.

Options:
  -h, --help            show this help message and exit
  -j, --removeJobs      Delete jobs from Qube completed more than X days ago,
                        must be used in conjunction with the "-d" days
                        argument
  -d DAYS, --days=DAYS  Delete logs for any jobs that were submitted more than
                        a certain number of days ago
  --removeLogs          Delete logs as part of the job removal
  --removeOrphanedLogs  Delete logs for jobs that no longer exist in Qube -
                        removed but their logs were left behind.
  -v, --verbosity       Increase verbose logging (to stdout). -vv is more
                        verbose than -v
  -q, --quiet           suppress all logging and output short of fatal errors
  -I, --ignore-sanity   Ignore sanity check (allows more than 10% of jobs to
                        be removed).
  -n, --dry-run         Show what would have been done, but do nothing.

 

Before you begin: Preparation

Before you set up the scheduled task/cron job, you need to be sure the script will run to completion without errors.

By default, the script will not delete more than 10% of the jobs in the database. The first time you run the script, you'll likely need to ignore that check, but you probably do not want to ignore it on a daily basis.
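The sanity check can be pictured as a simple percentage test. The sketch below is not the script's own code: the function name, the treatment of exactly 10% as acceptable, and the handling of an empty database are all assumptions made for illustration.

```python
def passes_sanity_check(jobs_to_remove, total_jobs, max_percent=10):
    """Return True if removing jobs_to_remove stays within max_percent of total_jobs.

    Assumption: exactly max_percent is allowed, and an empty database passes.
    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    if total_jobs == 0:
        return True
    return jobs_to_remove * 100 <= total_jobs * max_percent


# A first run on a long-neglected database typically exceeds the limit,
# which is why the -I flag exists.
print(passes_sanity_check(50, 1000))   # True
print(passes_sanity_check(800, 1000))  # False
```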

The job_cleanup script also provides a way for you to simulate the process without actually doing anything: a dry run. This way you can see what's going to happen and check that it's in line with your expectations.

For our example, we want our scheduled task to remove all but the last 30 days' worth of jobs, removing all of the old job logs and any orphaned logs (those jobs that have been deleted, but their job logs were left behind).
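The example above maps onto the documented flags as follows. The helper below is a hypothetical convenience for assembling the argument list; it only uses options that appear in the script's help text.

```python
def build_cleanup_args(days, remove_logs=True, remove_orphaned_logs=True,
                       dry_run=False, ignore_sanity=False):
    """Return the job_cleanup.py argument list for removing jobs older than days.

    Hypothetical helper: -j always travels with -d, since the help text
    says -j must be used in conjunction with the days argument.
    """
    days = int(days)
    if days < 0:
        raise ValueError("days must not be negative")
    args = ["-j", "-d", str(days)]
    if remove_logs:
        args.append("--removeLogs")
    if remove_orphaned_logs:
        args.append("--removeOrphanedLogs")
    if dry_run:
        args.append("-n")
    if ignore_sanity:
        args.append("-I")
    return args


# Keep the last 30 days, as a dry run:
print(" ".join(build_cleanup_args(30, dry_run=True)))
# -j -d 30 --removeLogs --removeOrphanedLogs -n
```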

 

Preparation, then, should go like this:

  1. cd QBDIR/utils
  2. Do a dry run:  job_cleanup.py -j -d 30 --removeLogs --removeOrphanedLogs -n

    This will probably fail the sanity check. That's OK. If it passes the sanity check, skip to step 4. If it fails, continue to step 3.
  3. Do a dry run, ignoring the sanity check:  job_cleanup.py -j -d 30 --removeLogs --removeOrphanedLogs -n -I

    This should print a long list of jobs that will be removed; each line should end with "(dry run)", letting you know it's not actually doing anything. Proceed to step 4 only when you're satisfied with what the dry run returns, in other words, when it is not reporting that it will delete jobs you want to keep.

  4. Now run the script without the -n. This actually deletes files and jobs and is irreversible:  job_cleanup.py -j -d 30 --removeLogs --removeOrphanedLogs -I

Step 4 may take a considerable amount of time.
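Before committing to step 4, it can help to count how many lines the dry run reported. The sketch below assumes only what is stated above, namely that each would-be removal line ends with "(dry run)"; the function names are hypothetical, and running the script through sys.executable assumes the current interpreter is one the script supports.

```python
import subprocess
import sys


def count_dry_run_lines(output):
    """Count the lines of dry-run output that end with "(dry run)"."""
    return sum(1 for line in output.splitlines()
               if line.rstrip().endswith("(dry run)"))


def dry_run_report(script_path, days=30):
    """Run step 3 (dry run, sanity check ignored) and return the line count.

    Assumption: script_path points at job_cleanup.py and the interpreter
    running this sketch is suitable for running it.
    """
    command = [sys.executable, script_path, "-j", "-d", str(days),
               "--removeLogs", "--removeOrphanedLogs", "-n", "-I"]
    output = subprocess.check_output(command, universal_newlines=True)
    return count_dry_run_lines(output)
```

A count far larger than you expect is a signal to re-check the -d value before running without -n.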


If you wanted to remove all but the last 7 days' worth of jobs, for example, you would change the "-d 30" to "-d 7".


Creating a scheduled task to clean up old jobs on a Windows supervisor

Use the Windows Task Scheduler wizard. Go to Start > Control Panel > Administrative Tools > Task Scheduler, then click on "Create Basic Task" and follow through the wizard.

You likely want the scheduled task to run daily, in the middle of the night. You want it to "Start a program", and the program should be "C:\Program Files\pfx\qube\utils\job_cleanup.py" with additional arguments of "-j -d 30 --removeLogs --removeOrphanedLogs" (without quotes).

Note: These arguments will keep the last 30 days' worth of jobs. If you would like to keep more or fewer, adjust the -d argument accordingly.
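As an alternative to the wizard, the same task could in principle be created from the command line with the Windows schtasks tool. The sketch below only assembles the command; the task name and the 00:03 start time are assumptions chosen for illustration, and it presumes that .py files are associated with a Python interpreter on the supervisor.

```python
def build_schtasks_command(script_path, days=30, start_time="00:03",
                           task_name="Qube job cleanup"):
    """Return a schtasks argument list that creates a daily cleanup task.

    task_name and start_time are hypothetical defaults. The script path is
    wrapped in quotes inside the /TR value because it contains spaces.
    """
    task_run = '"%s" -j -d %d --removeLogs --removeOrphanedLogs' % (
        script_path, days)
    return ["schtasks", "/Create", "/SC", "DAILY",
            "/TN", task_name, "/TR", task_run, "/ST", start_time]


command = build_schtasks_command(
    r"C:\Program Files\pfx\qube\utils\job_cleanup.py")
print(command)
```

The resulting list could be handed to subprocess.check_call from an elevated prompt; verify the created task in Task Scheduler before relying on it.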

Creating a scheduled task to clean up old jobs on an OS X supervisor

OS X uses launchctl and launchd to schedule scripts. To set this up, create a .plist file with contents similar to the following file, which runs the script once a day at 12:03am:

Attached file: com.pipelinefx.job_cleanup.plist (modified Dec 22, 2014 by John Burk)

qb_job_cleanup.plist
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.pipelinefx.job_cleanup</string>
    <key>OnDemand</key>
    <true/>
    <key>RunAtLoad</key>
    <false/>
    <!-- ProgramArguments is the full argument vector, so the script path comes first. -->
    <key>ProgramArguments</key>
    <array>
        <string>/Applications/pfx/qube/utils/job_cleanup.py</string>
        <string>-j</string>
        <string>-d</string>
        <string>30</string>
        <string>--removeLogs</string>
        <string>--removeOrphanedLogs</string>
        <string>-I</string>
    </array>
    <key>StandardOutPath</key>
    <string>/var/tmp/qb.cleanup.log</string>
    <key>StandardErrorPath</key>
    <string>/var/tmp/qb.cleanup.log</string>
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>0</integer>
        <key>Minute</key>
        <integer>3</integer>
    </dict>
</dict>
</plist>
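If you would rather generate the file than edit XML by hand, a sketch along these lines may help. It uses plistlib.dumps, which requires Python 3.4 or later (so it is separate from the Python 2.x requirement of job_cleanup.py itself). The function name and its parameters are hypothetical, it reproduces only the scheduling and logging keys, and it puts the script path first in ProgramArguments because launchd treats that array as the full argument vector.

```python
import plistlib


def build_launchd_plist(script_path, days=30, hour=0, minute=3,
                        log_path="/var/tmp/qb.cleanup.log", weekday=None):
    """Return a dict describing a launchd job that runs the cleanup script.

    Hypothetical helper. Pass weekday=0 to run weekly on Sunday instead
    of daily.
    """
    interval = {"Hour": hour, "Minute": minute}
    if weekday is not None:
        interval["Weekday"] = weekday
    return {
        "Label": "com.pipelinefx.job_cleanup",
        "RunAtLoad": False,
        "ProgramArguments": [script_path, "-j", "-d", str(days),
                             "--removeLogs", "--removeOrphanedLogs", "-I"],
        "StandardOutPath": log_path,
        "StandardErrorPath": log_path,
        "StartCalendarInterval": interval,
    }


job = build_launchd_plist("/Applications/pfx/qube/utils/job_cleanup.py")
print(plistlib.dumps(job).decode("utf-8"))
```

Redirect the printed XML into com.pipelinefx.job_cleanup.plist so that the file name continues to match the Label.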

Important


You can name this file whatever you like, but the name must end in .plist, and the name (without the .plist suffix) must match the Label string. In this example we have used com.pipelinefx.job_cleanup and com.pipelinefx.job_cleanup.plist.


Then perform these steps in a Mac Terminal (shell):

$ sudo mv com.pipelinefx.job_cleanup.plist /Library/LaunchDaemons/
$ sudo chown root /Library/LaunchDaemons/com.pipelinefx.job_cleanup.plist
$ sudo chmod 644 /Library/LaunchDaemons/com.pipelinefx.job_cleanup.plist
$ sudo launchctl load /Library/LaunchDaemons/com.pipelinefx.job_cleanup.plist

  • The sample file is written to run each day at 3 minutes past midnight. You can change the Hour and Minute values to suit your installation. You can also restrict this to running, say, weekly on Sunday by adding <key>Weekday</key> <integer>0</integer> to the StartCalendarInterval dictionary.
  • Note that stderr and stdout are both written to /var/tmp/qb.cleanup.log. You can change this to the location of your choice, and you can separate stderr and stdout into two different files if you prefer.
  • It is a good idea to test this setup first by adding '-n' as an argument in the file.

You can test the script without waiting for midnight by typing:

$ sudo launchctl start com.pipelinefx.job_cleanup

You can remove the script with these commands:

$ sudo launchctl unload /Library/LaunchDaemons/com.pipelinefx.job_cleanup.plist
$ sudo rm /Library/LaunchDaemons/com.pipelinefx.job_cleanup.plist

Creating a cron job to clean up old jobs on a Linux supervisor

Add a file in /etc/cron.daily called job_cleanup. Be sure it is executable by all (chmod a+x job_cleanup). This is a shell script that will be run daily. A working example looks like this:

 

#!/bin/bash
# Append the cleanup script's output to a log file.
logfile=/var/log/job_cleanup.log
/usr/local/pfx/qube/utils/job_cleanup.py -j -d 30 --removeLogs --removeOrphanedLogs >> "$logfile"

 

Note: These arguments will keep the last 30 days' worth of jobs. If you would like to keep more or fewer, adjust the -d argument accordingly.
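The shell wrapper above captures only stdout. If you also want stderr and a timestamp per run in the same log, a small Python wrapper is one option. This is a sketch, not part of Qube; the function name run_and_log and the header format are hypothetical.

```python
import subprocess
import time


def run_and_log(command, logfile):
    """Run command, appending its stdout and stderr to logfile.

    Returns the command's exit code. A timestamped header line is written
    first and flushed so that it precedes the child's output in the file.
    """
    with open(logfile, "a") as log:
        log.write("=== run started %s ===\n" % time.strftime("%Y-%m-%d %H:%M:%S"))
        log.flush()
        return subprocess.call(command, stdout=log, stderr=subprocess.STDOUT)


# Example call matching the cron script above (not executed here):
# run_and_log(["/usr/local/pfx/qube/utils/job_cleanup.py", "-j", "-d", "30",
#              "--removeLogs", "--removeOrphanedLogs"],
#             "/var/log/job_cleanup.log")
```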

See also

qbremove

 
