October 24, 2016

Truncating the Job Console logs

In the last post we talked about how to automate deleting the task flow logs.  Another place where log entries accumulate is the Library Job Console.

The Job Console log is more manageable than the Task Flow log.  In the Job Console tab you can display up to 200 entries per page and you can click the first record, scroll to the bottom of the list, shift+click to select all the entries, then right-click and delete.


This is better but can still be a bunch of work if you haven't cleared the logs in a while.  And wouldn't you rather have an automated solution that can be scheduled so you can work on other things?  Of course you would.

Analysis

The discovery was pretty easy.  In the EPMA database the table name [JM_Job] is a dead give away.  The contents show the same information we see in the Job Console tab.


There are three tables with the JM* prefix which hold the job log data: [JM_Job], [JM_Batch], and [JM_Attachment].  They are connected by the i_job_id key which has sequential values and is the primary key in the [JM_Job] table.  The [JM_Attachment] primary key combines the i_job_id and the i_attachment_id, where i_attachment_id is also a sequential value.  The primary key for the [JM_Batch] database combines the foreign keys i_job_id and i_attachment_id with the field i_batch_id.


The [JM_Job] database has the start and completion times for the various jobs.  So we can reference that to find our cut off date, get the i_job_id for the previous job, and delete from the three tables anything that is less than the cut off i_job_id.

Solution

The wrinkle is that because of foreign key constraints of i_job_id the [JM_Job] database has to be the last one truncated.  And because the i_attachment_id is a foreign key in [JM_Batch] that has to be the first table truncated.

Like with the task flow truncation we can either calculate or specify the cut off date.

DECLARE @CutOff datetime;
SET     @CutOff = DATEADD(M,-4,GetDate());
--DECLARE @CutOff char(10);
--SET     @CutOff = '2016-01-01'

There are several ways to tackle getting the i_job_id for our cut off date.  One approach is to assign the i_job_id to a variable and use that in the DELETE statement:

DECLARE @ID Int;
SET @ID = (SELECT Top 1 i_job_id FROM JM_Job WHERE d_started < @CutOff ORDER by i_job_id DESC)
DELETE FROM JM_Batch WHERE i_job_id <= @ID

But these are fairly small tables so it is easy to use a subselect in the DELETE:

DELETE from JM_Batch
where i_job_id in (
SELECT i_job_id
FROM JM_Job
WHERE d_started < @CutOff)

The last piece is to arrange the DELETEs in the correct order.  The complete TSQL script is:

use EPMA
go

-- Anything prior to cutoff date will be deleted
--   The DateAdd function can change depending on need
--
DECLARE @CutOff datetime;
SET     @CutOff = DATEADD(M,-4,GetDate());
--DECLARE @CutOff char(10);
--SET     @CutOff = '2015-01-01'
PRINT 'Cut off date = '+convert(VarChar(32),@CutOff)

-- Because of the key constraints the deletes need to happen in this order
--   JM_Batch has to be first because it has keys from JM_Attachment and JM_Job
--   JM_Attachment has key for JM_Job so deletes have to happen before
--     JM_Job but after JM_Batch
--   JM_Job has to be last because that has the date field we query against
--
-- For the first 2 deletes we get a list of i_job_id from JM_Job that are older
--   than the cutoff date, and compare the i_job_id in the table to that list
--
DELETE from JM_Batch
where i_job_id in (
SELECT i_job_id
FROM JM_Job
WHERE d_started < @CutOff)

DELETE from JM_Attachment
where i_job_id in (
SELECT i_job_id
FROM JM_Job
WHERE d_started < @CutOff)

DELETE from JM_Job
where d_started < @CutOff

This gives us a tidy bit of code that we can run on a regular basis to truncate the Library Job Console log to purge stale information.

This blog post is a bit short, so here's a picture of a kitty.