sagi February 2016

Oracle - delete based on join with two other tables

I'm having a little trouble designing a code that will run fast.

My requirments are:

I have a table SCD(180mil records) and a smaller table LOG(about 300 records).

LOG structure:

REAL_KEY | FIC_KEY

SCD structure :

Another_KEY | SERIAL_KEY ....

I need to delete from SCD all the real key's, where also exists a record with FIC_KEY, so something like

delete from scd t
where serial_number in(select real_key from log l1)
and exists(select 1 from scd s,log l2 where s.serial_key = l2.fic_key
           and l2.real_key = l1.real_key)

The problem is I cant use the first correlated query results to compare the second (l2.real_key = l1.real_key). Also, even if it will run, this probably will take a lot of time since SCD contains a lot of records.

Any help would be appriciated.

Answers


Marmite Bomber February 2016

First decide which keys should be deleted.

If you want to delete onlyreal_keys for which a fic_key exist in SCD, this is as follows:

select real_key 
from log join scd on log.fic_key = scd.serial_key

The delete is than

delete from scd where serial_key in ( 
select real_key 
from log join scd on log.fic_key = scd.serial_key); 

If you have an index on serial_key in SCD the delete will be done with two NL joins which should be quite instant. If no the performance is estimated with two hash joins of the SCD table with a small table. This should not be ages for 180M rows. You may also speed up a bit using parallel hash join.


sagi February 2016

Well I've managed to come up with the answer. I'll explain the logic:

We've had an error over the last few nights, when our sources gave us data with the new REAL serial and because of a problem we had we disabled our process that fix those serials in our data warehouse.

This caused some problems, when each real serial came as a NEW row, and inserted into our SCD tables as new records instead of updating the existing serials, so we had to reconstruct our tables to how they were few nights ago.

The best and fastest delete query is:

delete from scd t
where t.serial_number in(select s.real_serial_number
                         from UPD_SERIAL s,scd t2
                         where t2.serial_number = s.fic_serial_number)

Post Status

Asked in February 2016
Viewed 3,038 times
Voted 11
Answered 2 times

Search




Leave an answer