This question circulates through the office about once a year. “Um, I have Windows service which processes requests which are queued in a database table. Everything is working great, but I would like to add a second instance of the service and I need to know the best way to ensure the services don’t step on each other. In other words, how do I mark each row as it is being processed so it won’t be picked up again?”
Here are the ingredients in my current ideal solution:
- The table must include the following columns: a Unique Identifier such as RequestID, Status, LockedBy, and LockedTime. I also like to include RequestedTime and CompletionTime columns as I’ve found that capturing these values help when/if system stats are required.
- The key is to lock the current request at the time of processing. To do so, you should use a transaction and an update lock.
- I think it is important to pull requests off the queue in an intelligent order. FIFO is the best bet under most circumstances. In this case, the RequestedTime column becomes imperative (unless your RequestID is an auto-incrementing identity column and you are comfortable ordering on it.)
- Finally, I like to make sure no request is left behind. Let’s say Service A locks a request for processing and then the application server crashes. I want the lock to eventually expire so Service B can pick up the row and save the day.
Assuming new requests are added to the queue as ‘Pending’ , one may use a routine similar to the one below to safely pull the next request off the queue:
create procedure [GetNextItemInQueue] ( @LockedBy varchar(75) ) as set nocount on declare @Error int, @ExitCode int, @RowCount int, @PendingStatus int, @MinutesBeforeBreakLock int, @RequestId int select @Error = 0, @ExitCode = 0, @RowCount = 0, @PendingStatus = 1, @MinutesBeforeBreakLock = 30 begin tran ProcessQueue select top 1 @RequestId = RequestId from Queue with (updlock) where Status = @PendingStatus and (LockedTime IS NULL OR DateDiff(mi, LockedTime, GetDate()) >= @MinutesBeforeBreakLock) order by RequestedTime Asc select @RowCount = @@RowCount, @Error = @@Error if (@RowCount = 0) goto ExitProc if (@Error <> 0) goto ErrorHandler update Queue set LockedBy = @LockedBy, LockedTime = GetDate() where RequestId = @RequestId select @Error = @@Error if (@Error <> 0) goto ErrorHandler select @RequestId as RequestId select @Error = @@Error if (@Error <> 0) goto ErrorHandler goto ExitProc ErrorHandler: raiserror (@Error, 16, 1) if @@trancount > 0 rollback tran ProcessQueue select @ExitCode = -100 ExitProc: commit tran ProcessQueue return (@ExitCode)
Once the request is processed, a database call will be made to update the row’s Status to ‘Complete’ and CompletionTime.
Note, you may choose to change the status of each row from ‘Pending’ to ‘Processing’ when you do the initial lock. This is perfectly fine, but I’ve found that is unnecessary as the Processing status can be determined based on the value of other columns.
Again, that’s my current preferred implementation. Perhaps there are better solutions out there?
November 11th, 2008 at 2:56 am
did u face any problems with this one? I am trying to do exactly same and wondering if this works fine or needs some improvement
November 11th, 2008 at 8:36 am
Hi TG. The implementation outlined above has been running in production for well over a year now without issues. We haven’t encountered any issues yet. Thanks for following up.
April 27th, 2009 at 2:22 pm
does this scale in a web cluster environment making sure one request is not processed by two clients?
April 27th, 2009 at 2:52 pm
@Joe Campbell - It does. As a matter of fact, that’s the primary reason for a routine like this. When a row is locked, it is timestamped and who locked it is captured. As long as the row is locked and the expiration time isn’t met, the row won’t be processed again - by any client. Thanks for asking.
April 27th, 2009 at 6:22 pm
Ben, thanks for the quick response. How does this handle concurrent client requests for a row which is not locked?
April 27th, 2009 at 7:16 pm
@Joe - That’s where the update lock (or updlock) comes in.
Let’s say Request A and B come in at exactly the same time. Well, SQL Server will ensure that only one of the requests will get a lock on the top row - in other words, the next row to be processed in the queue.
Once a row is locked, it is no longer available and it can’t be locked by another request. It should also be noted that the lock stays in place until the associated update is performed.
If Request A gets the initial lock, Request B will lock the next row in the queue and follow the same process as Request A (lock row and capture id, update row based on LockedBy and LockedTime values, implicitly release lock.)
Did I answer your question?
April 29th, 2009 at 7:35 am
Yes u did. Thank you.
May 25th, 2009 at 12:05 am
Hi,
very nice implementation!
what about if I need to re queue an item if processing encountered some external problem?
And what if I want to retry processing for a limited number of times, after which i drop the queue item.
May 25th, 2009 at 6:19 am
@Giorgio - You’re first case is covered using the @MinutesBeforeBreakLock variable. @MinutesBeforeBreakLock gives a thread a limited amount of time to complete the processing the queued item.
If, after the number of minutes established with @MinutesBeforeBreakLock, row processing isn’t completed, the row becomes available for processing by another thread. This logic was put in place for exactly the reason you mentioned — failures dues external issues.
Regarding your second case, that’s not built into the script right now, but it could be extended pretty easily. Off the top of my head (read: following queries should be vetted), a “NumAttempts” column could be added to the Queue table and incremented each time a given row is “pulled off” the queue as part of the update:
update Queue
set LockedBy = @LockedBy,
LockedTime = GetDate(),
NumAttempts = Coalesce(NumAttempts,0) + 1
where RequestId = @RequestId
If you want to limit the number of attempts, the select statement would be changed to something like this:
select top 1 @RequestId = RequestId
from Queue with (updlock)
where Status = @PendingStatus
and (LockedTime IS NULL OR
DateDiff(mi, LockedTime, GetDate()) >= @MinutesBeforeBreakLock)
and (NumAttempts < @AttemptsAllowed)
order by RequestedTime Asc
Note, the restriction based on NumAttempts being less than your established limit.
Thanks for the comment and the question.