Vikash Pathak February 2016

SQL SERVER - Query optimization 'like' causing most cpu uses 100%

I have two tables in database Products and Filters.

The schema:

enter image description here

I have created a query that find all the records from the filters table, loop with each record and call a procedure that set the category id for Products table.

Filter table data will be as follow.

enter image description here

The filter selection query is as follow..

DECLARE @TotalRecords INT, @Start INT, @Limit INT, @CatId INT, @Merchants NVARCHAR(max), @NotMatch NVARCHAR(max), @WillMatch NVARCHAR(max);
SELECT @TotalRecords = COUNT(*) FROM filters;

SET @Limit = 1;
SET @Start = 0;

WHILE(@TotalRecords > 0)
BEGIN       
    SELECT @CatId = category_id, @Merchants = merchant_name, @NotMatch = not_match, @WillMatch = will_match FROM 
    (
        SELECT TOP (@Start + @Limit) *, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rnum 
        FROM filters
    ) a
    WHERE rnum > @Start;

    -- call filter procedure.
    exec procSetProductCategory @CatId = @CatId, @Merchants = @Merchants, @WillMatch = @WillMatch, @NotMatch = @NotMatch;

    SET @Start += 1;
    SET @TotalRecords -= 1;
END

And the procSetProductCategory as follow..

CREATE PROC [dbo].[procSetProductCategory]
(
    @CatId INT = NULL,
    @Merchants NVARCHAR(max),
    @NotMatch NVARCHAR(max),
    @WillMatch NVARCHAR(max)
)
AS
BEGIN
SET NOCOUNT ON

    declare @query nvarchar(max), @orToken nvarchar(max), @andToken nvarchar(max);
     set @query = 'UPDATE Products SET category_id = '+ convert(nvarchar(20), @CatId) + ' WHERE category_id IS NULL AND merchant_name IN(' + @Merchants + ')';

    if(@WillMatch is not null AND LTRIM(RTRIM(@W        

Answers


Neville K February 2016

Without a query plan, it's hard to be sure, but I'm guessing this is because you are matching on '%something%', which means the query has to inspect every row.

That's always going to be slow, and there's nothing much you can do to help with indexing.

If you're doing text comparison, you may get better performance by using SQL Server's full text matching feature.


deroby February 2016

For starters, as pointed out already: there really is something wrong with the logic here. That said, assuming you are stuck with it there are some things you might want to try. My first question would be: how long does this thing run? You shouldn't worry too much that it takes 100% CPU; the question is how much time it takes to finish.

Query1:

It seems that you are creating a loop over the the filters table, fetching every single row, one by one.

  • SQL isn't optimised to do row-by-row operations; you really should consider changing the logic to something set-based
  • If you really want to do something row by row, then please use a CURSOR and not the current approach.
    • First you go over the entire table to count how many filters there are
    • Then you go over the entire table and order the records by SELECT 1
    • Out of the sorted list you pick one that has rnum bigger than your counter

=> This is wrong in so many ways, it actually hurts =(

  • If you sort/order by SELECT 1 then it could return the records in the order ABCD the first time and BADC the second time; and both answers would be correct because you're sorting by a constant: the actual order of the records doesn't matter!
  • Each and every time you go through the loop, the server has to sort the entire table before it can tell which rnum values will fit the requirement of being greater than @start; EVERY TIME!
  • There will be many records that fit rnum > @start, the returned record being used to fill up the records could be any one of them!

To 'fix' this I'd suggest to use the following approach:

DECLARE @TotalRecords INT, 
        @Start INT, 
        @Limit INT, 
        @CatId INT, 
        @Merchants NVARCHAR(max), 
        @NotMatch NVARCHAR(max), 
   

Post Status

Asked in February 2016
Viewed 2,285 times
Voted 13
Answered 2 times

Search




Leave an answer