I am creating an SSIS package that essentially attempts to find all rows in Table A that are not in Table B. The join column is an Identity column in Table A that is the clustered index and a column in Table B that is not an Identity, but is indexed. And I am doing this in batches of 10,000 rows at a time. Both table A and table B have approximately 350M rows.
I initially thought a Lookup transformation would be appropriate but I cannot use Full Cache because it attempts to load 350M rows in the cache! If I use No Cache, the process of looking up just 10,000 rows is horrendously slow (even though the lookup column in Table B is indexed).
Also, Table A and Table B are in two different databases on two different servers.
Is there another transformation that would be more appropriate for what I want to do?
Using SQL Server 2014.