JayC February 2016

Set unique value dynamically without loop using T-SQL

I need to join a table to itself on either of two values. I can do this easily enough with an OR in the join. However, I am trying to create a table to house this data to make it easier as I add new records daily. I want to create a table to hold this data, and keep a unique value for each result.

I have a working example below, but it uses a while loop and is extremely slow. I would like to convert this to a set-based operation, but keep coming up short.

Table @t is the source of the data. Table @hh is the new table to hold the unique value (hhid).

Example:

declare @t table (appid int, phone varchar(10), bcn varchar(10));
declare @hh table (bcn varchar(10), hhid int default(null));

insert @t 
      select 1, '1115551212','1'
union select 2, '1115551212','1'
union select 3, '1115551212','2'
union select 4, '9995551212','2'
union select 5, '8885551212','3'
union select 6, '1115551212','4'
union select 7, '1115551212','5'
union select 8, '7775551212','1'
union select 9, '7785551212','6'
union select 10, '7795551212','6'

insert @hh select distinct bcn,null from @t;

DECLARE @hhid int = -1;
DECLARE @bcn varchar(10);

SELECT TOP(1) @bcn =  bcn FROM @hh WHERE hhid is null;
SELECT @hhid = ISNULL((SELECT MAX(isnull(hhid,-1)) FROM @hh),-1);
if @hhid = -1
        SET @hhid = 4999999;
WHILE @bcn is not null
BEGIN
    SET @hhid += 1;

    UPDATE @hh SET hhid = @hhid WHERE bcn in (
    select distinct t2.bcn
    FROM @hh h
    JOIN @t t on h.bcn = t.bcn
    left join @t t2 on t.phone = t2.phone
    WHERE h.bcn = @bcn and t.phone <> '');

    SET @bcn = null;
    SELECT top(1) @bcn =  bcn from @hh where hhid is null;
END

select * from @hh

EDIT: Expected Results (as returned by the code):

BCN  HHID  
 1   5000000  
 2   5000000  
 3   5000001  
 4   5000000  
 5   5000000  
 6   5000002  

From the output, you can see

Answers


TT. February 2016

This script does it without loop/cursor. It creates two indexes that are important to speed up the query: t_phone_ind and t_bcn_ind. You can check the actual execution plan, you'll see they are both being used to speed up the query.

CREATE TABLE #t(appid INT PRIMARY KEY, phone VARCHAR(10), bcn VARCHAR(10));
CREATE NONCLUSTERED INDEX t_phone_ind ON #t(phone,bcn); -- for bcn->MIN(phone)
CREATE NONCLUSTERED INDEX t_bcn_ind ON #t(bcn,phone);   -- for phone->MIN(bcn)
insert #t(appid,phone,bcn) 
      select 1, '1115551212','1'
union select 2, '1115551212','1'
union select 3, '1115551212','2'
union select 4, '9995551212','2'
union select 5, '8885551212','3'
union select 6, '1115551212','4'
union select 7, '1115551212','5'
union select 8, '7775551212','1'
union select 9, '7785551212','6'
union select 10, '7795551212','6';

;WITH cte1 AS (
    SELECT bcn,MIN(phone) AS phone FROM #t GROUP BY bcn 
),
cte2 AS (
    SELECT phone,MIN(bcn) AS bcn FROM #t GROUP BY phone
)
SELECT
    cte1.bcn,
    hhid=4999999+DENSE_RANK() OVER (ORDER BY cte2.bcn)
FROM
    cte1
    LEFT JOIN cte2 ON
        cte2.phone=cte1.phone
ORDER BY
    cte1.bcn;

DROP TABLE #t;

Result:

+-----+---------+
| bcn |  hhid   |
+-----+---------+
|   1 | 5000000 |
|   2 | 5000000 |
|   3 | 5000001 |
|   4 | 5000000 |
|   5 | 5000000 |
|   6 | 5000002 |
+-----+---------+

Post Status

Asked in February 2016
Viewed 1,112 times
Voted 6
Answered 1 times

Search




Leave an answer