# Developers Planet

Nikolay Lysenko February 2016

### SQL, joining tables

Here is a problem. We have two tables A and B. Both have a field of Date type. Denote them Date_A and Date_B, correspondingly. The problem is to join these two tables to each other by the following method - each row X from the table A we need to join to some row Y from B such that among all rows Z from B row Y provides the minimum value of expression abs(Date_A(X)-Date_B(Z)), i.e. it's value in Date_B is the closest to value of Date_A in X. Assumingly, this minimum is always unique, but if it's not then it's good to choose one of them randomly (but only one). Example. Table A:

`"A", "B", "2015-10-01"`

Table B: ``` "C", "2015-10-07" "D", "2015-12-02"```

Expected result:

`"A", "B", "2015-10-01", "C", "2015-10-07"`

p.s. the platform is Teradata, if it matters

dnoeth February 2016

Of course you could write a join-condition based on your logic, but this worst_case in a parallel DBMS like Teradata. It will always result in a product join (probably followed by another step to return only one matching row).

For a closest-match-join I usually try to find the actual value using LAG/LEAD logic:

``````SELECT
dt as date_A,
-- find the previous date_B // LAG
MAX(CASE WHEN x = 2 THEN dt end) OVER (ORDER  BY dt, x ROWS UNBOUNDED PRECEDING ) AS prev_date,
-- find the next date_B // LEAD
MIN(CASE WHEN x = 2 THEN dt end) OVER (ORDER  BY dt, x ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ) AS next_date,
-- now find the nearest date
CASE WHEN prev_date IS NULL THEN next_date
WHEN next_date IS NULL THEN prev_date
WHEN dt - prev_date < next_date - dt THEN prev_date
ELSE next_date
END AS date_B
FROM
( -- get all dates, maybe UNION INSTEAD OF UNION ALL
SELECT date_A as dt, 1 AS x
FROM table_A
UNION ALL
SELECT date_B, 2
FROM table_B
) AS dt
QUALIFY x = 1 -- only rows from table_A
``````

This will need two STAT-steps, but now you got the correct `date_B` to equi-join. Put this in a Derived Table and join back to both `table_A` and `tabke_B`, should be significantly faster unless both tables are very small.

Of course you're probably joining on other columns as well, so add them to UNION and PARTITON BY.