Implement SQL Server Agent Jobs with AlwaysOn Availability Groups

Implement SQL Server Agent Jobs with AlwaysOn Availability Groups

In this post I talk about SQL Server Agent jobs when implementing AlwaysOn Availability Groups.

AlwaysOn Availability Groups synchronizes the databases that are part of an Availability Group but any other object that is not part of these databases do not get synchronized (i.e. logins, linked servers, operators etc.) and this includes SQL Server Agent jobs.

For a tool that synchronizes all the objects that are beyond the scope of AlwaysOn Availability Groups see this post

Our requirement is to have all jobs that exist on the Primary replica to exist also on the Secondary replica(s) and to be exactly the same (and in the same state: enabled/disabled) so that in the event of a fail over of the primary replica to the secondary replica things keep working just the same as they do on the primary replica.

BUT we also need to assure that they actually get executed on the Primary replica only!.

I found that the simplest way to do so is by adding a job step (which I named ‘get_availability_group_role’) that checks whether this sql instance is a primary replica or not. If this is a primary replica then the job proceeds to the next step and if this is not a primary replica the job terminates by executing a stop job request thus never reaching the next step. This method allows us to have the same jobs exiting at all replicas and being executed on the primary replica only.

–Note that there are some jobs that you may want to have running on any replica such as jobs that manage a partition sliding window, index maintenance etc.

The function I use fn_hadr_group_is_primary from this post is a UDF (User Defined Function) that we create in the master database. The function gets an Availability Group name as an input parameter and returns a Boolean value indicating weather this instance is a primary replica.

-- fn_hadr_group_is_primary
USE master;
GO
IF OBJECT_ID('dbo.fn_hadr_group_is_primary', 'FN') IS NOT NULL
  DROP FUNCTION dbo.fn_hadr_group_is_primary;
GO
CREATE FUNCTION dbo.fn_hadr_group_is_primary (@AGName sysname)
RETURNS bit
AS
BEGIN;
  DECLARE @PrimaryReplica sysname; 

  SELECT
    @PrimaryReplica = hags.primary_replica
  FROM sys.dm_hadr_availability_group_states hags
  INNER JOIN sys.availability_groups ag ON ag.group_id = hags.group_id
  WHERE ag.name = @AGName;

  IF UPPER(@PrimaryReplica) =  UPPER(@@SERVERNAME)
    RETURN 1; -- primary

    RETURN 0; -- not primary
END; 

Use the fn_hadr_group_is_primary function within a new job step to find if this sql instance is a Primary replica. If this is not a primary replica we issue a stop job request while identifying the current job name using SQL Server Agent Tokens

-- Detect if this instance's role is a Primary Replica.
-- If this instance's role is NOT a Primary Replica stop the job so that it does not go on to the next job step
DECLARE @rc int; 
EXEC @rc = master.dbo.fn_hadr_group_is_primary N'my-ag';

IF @rc = 0
BEGIN;
    DECLARE @name sysname;
    SELECT  @name = (SELECT name FROM msdb.dbo.sysjobs WHERE job_id = CONVERT(uniqueidentifier, '$(ESCAPE_NONE(JOBID))'));
    
    EXEC msdb.dbo.sp_stop_job @job_name = @name;
    PRINT 'Stopped the job since this is not a Primary Replica';
END;

Putting it all together. This code snippet bellow loops over all jobs and prints a new job step called ‘get_availability_group_role’ with the code from the above snippet. The new job step is the first step within each job with a step id of 1.

Executing the printed code will result in a new job step with a step_id of 1 which is the first step to be executed, the previously step_id 1 will now become step_id 2 and so on

USE msdb;
SET NOCOUNT ON; SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; 

IF OBJECT_ID(N'tempdb.dbo.#data', N'U') IS NOT NULL DROP TABLE dbo.#data;
CREATE TABLE dbo.#data (id int IDENTITY PRIMARY KEY, name sysname);


-- Get all job names exclude jobs that already have a step named 'get_availability_group_role'
INSERT dbo.#data (name)
SELECT DISTINCT j.name--, s.step_name 
FROM dbo.sysjobs j
    EXCEPT
SELECT DISTINCT j.name
FROM dbo.sysjobs j
INNER JOIN dbo.sysjobsteps s ON j.job_id = s.job_id
WHERE s.step_name = N'get_availability_group_role';

-- Remove jobs that need to run on any replica
DELETE FROM #data WHERE name LIKE 'SQL Sentry%';
DELETE FROM #data WHERE name LIKE 'syspolicy_purge_history';
DELETE FROM #data WHERE name LIKE 'DBA - Backup%';
--SELECT * FROM #data ORDER BY 1;


DECLARE @command varchar(max), @min_id int, @max_id int, @job_name sysname, @availability_group sysname;
SELECT  @min_id = 1, @max_id = (SELECT MAX(d.id) FROM #data AS d);

SELECT @availability_group = (SELECT ag.name FROM sys.availability_groups ag);

-- If this is instance does not belong to HA exit here
IF @availability_group IS NULL 
BEGIN;
    PRINT 'This instance does not belong to AG. Terminating.';
    RETURN;
END;


DECLARE @debug bit = 1; --<------ print only 

-- Loop through the table and execute/print the command per each job
WHILE @min_id <= @max_id
BEGIN;
        SELECT @job_name = name FROM dbo.#data AS d WHERE d.id = @min_id;

        SELECT @command = 
        'BEGIN TRAN;
        DECLARE @ReturnCode INT;
        EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_name=''' + @job_name + ''', @step_name=N''get_availability_group_role'', 
                @step_id=1, 
                @cmdexec_success_code=0, 
                @on_success_action=3, 
                @on_success_step_id=0, 
                @on_fail_action=3, 
                @on_fail_step_id=0, 
                @retry_attempts=0, 
                @retry_interval=0, 
                @os_run_priority=0, @subsystem=N''TSQL'', 
                @command=
        N''-- Detect if this instance''''s role is a Primary Replica.
-- If this instance''''s role is NOT a Primary Replica stop the job so that it does not go on to the next job step
DECLARE @rc int; 
EXEC @rc = master.dbo.fn_hadr_group_is_primary N''''' + @availability_group + ''''';

IF @rc = 0
BEGIN;
    DECLARE @name sysname;
    SELECT  @name = (SELECT name FROM msdb.dbo.sysjobs WHERE job_id = CONVERT(uniqueidentifier, $(ESCAPE_NONE(JOBID))));
    
    EXEC msdb.dbo.sp_stop_job @job_name = @name;
    PRINT ''''Stopped the job since this is not a Primary Replica'''';
END;'', 
        @database_name=N''master'', 
        @flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) 
BEGIN; 
    PRINT ''-- Rollback: ''''' + @job_name + ''''''' ROLLBACK TRAN; 
END;
ELSE COMMIT TRAN;' + CHAR(10) + 'GO';

        PRINT @command;
        IF @debug = 0 EXEC (@command);

    SELECT @min_id += 1;
END;
The following two tabs change content below.
Yaniv Etrogi is an SQL Server consultant. He loves SQL Server and he is passionate about Performance Tuning, Automation and Monitoring.

Leave a Comment

Your email address will not be published. Required fields are marked *