Thom Ash February 2016

Does UNION or UNION all build one massive query that locks all tables selected?

I'm being told by my lead DBA that I wrote poorly formed code because I used a UNION ALL to accumulate results of successive queries on different tables. I thought when a query with multiple select statements that had results UNIONed executed separately so when each select statement executes it places a shared lock on the table that is released when finished and the next select starts. I thought the results were accumulated in some buffer or tmp table.

Would someone kindly tell me what goes on behind the scenes and what resources consumed when a results of a hundred select statements are UNIONed. Each select operates on one table and collects schema, Table, and Column names.

Sorry, I don't have query plan. The DBA complained the query was too big to show much of the plan. His comments are below the query.

SELECT 'R_Stage' as TheSchema, 'DateFrozenSectionModF63x086' as TheTable, 'PersonModTextStaffSID' as TheColumn, COUNT(*) as NullCount 
FROM [R_Stage].[DateFrozenSectionModF63x086] WHERE [PersonModTextStaffSID] = -1  
UNION ALL 
SELECT 'R_Stage' as TheSchema, 'DateFrozenSectionModF63x086' as TheTable, 'LabDataLabSubjectSID' as TheColumn, COUNT(*) as NullCount 
FROM [R_Stage].[DateFrozenSectionModF63x086] WHERE [LabDataLabSubjectSID] = -1  
UNION ALL 
SELECT 'R_Stage' as TheSchema, 'DateFrozenSectionModF63x086' as TheTable, 'LabDataPatientSID' as TheColumn, COUNT(*) as NullCount 
FROM [R_Stage].[DateFrozenSectionModF63x086] WHERE [LabDataPatientSID] = -1  
UNION ALL 
SELECT 'R_Stage' as TheSchema, 'DateGrossDescChangedF63x087' as TheTable, 'PersonModTextStaffSID' as TheColumn, COUNT(*) as NullCount 
FROM [R_Stage].[DateGrossDescChangedF63x087] WHERE [PersonModTextStaffSID] = -1  
UNION 
ALL SELECT 'R_Stage' as TheSchema, 'DateGrossDescChangedF63x087' as TheTable, 'LabDataLabSubjectSID' as TheColumn, COUNT(*) as NullCount 
FROM [R_Stage].[DateGrossDescChangedF63x087] WHERE [LabDataLabSubjectSID] = -1  
UNION ALL 
SELECT 'R_Stage'        

Answers


Payson February 2016

The advice from your DBA seems quite reasonable. He/she doesn't mention locking, and it's not clear why you've mentioned that as the problem.

As the DBA states, you're executing 791 queries that the database engine then unions together. This will impose a load on the database. Assuming your DBA is correct about those queries being full table scans, that means the entire table is going to be read 791 times.

Regardless of any locking, that is going to thrash the disks, overrun file system and database caches, and load up the CPU running those queries.

Assuming your database is large enough that it doesn't fit in the RAM file system or database cache, that means it has to be read from disk in full each time. If the query were rewritten as your DBA advises so that it only made 1 full table scan through the database, the impact on the file system would be 1/791 of the query as currently written.

If your database does indeed take read locks at the same time, your query will impact updaters of that table 791 times.

Your DBA's recommendations have the effect of making the proposed query roughly 791 times as efficient.

If we assume just as a working example that your table is 100 meg, at a disk read speed of 100 mb/s it will take around 1 second to process each of 791 queries, so the full query would take around 14 minutes. Rewritten as your DBA advises it will take around 1 second.

This isn't a locking problem, it's a classic I/O performance problem. If you have locking problems as well, that just makes it worse.

The exact performance characteristics of your query depend on many factors, including how large the table is, what indexes are defined (noting that indexes can make a query slower in certain circumstances), how 'wide' the table is, the types of columns in the table, what hardware the query is running on, what database system you use, how fast the disks are, how much RAM your DB has, what else is


lrb February 2016

I'm sorry that post just made my eyes hurt. It sounds like you are needing to write a script to clean up or identify a problem. To make this easy You could automate as script that will spit out smallish testable sql statements code before you post up those 300 tables. If your dba will let you use cursors and temp tables, both of which should be avoided when possible, however, this seems more like an identify the problem and or clean up issue rather than focus on efficiency. That being said, I would not want to lock those tables up on a production system for periods of time...so do a lot of smaller task to reduce locks and reach the same goal. You can run this script in sql server admin and copy the output as input to give to your dba, maybe it helps.

SET NOCOUNT ON

DECLARE  @OUTPUT TABLE
(  
    TheSchema NVARCHAR(45),
    TheTable NVARCHAR(45),
    Field1 NVARCHAR(45),
    Field2 NVARCHAR(45),
    Field3 NVARCHAR(45)
)

INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x086','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x087','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x088','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x089','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x090','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x091','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x092','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x093','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTP 

Post Status

Asked in February 2016
Viewed 3,521 times
Voted 7
Answered 2 times

Search




Leave an answer