Nikolay Lysenko February 2016
### SQL, joining tables

Here is a problem. We have two tables A and B. Both have a field of Date type. Denote them Date_A and Date_B, correspondingly. The problem is to join these two tables to each other by the following method - each row X from the table A we need to join to some row Y from B such that among all rows Z from B row Y provides the minimum value of expression abs(Date_A(X)-Date_B(Z)), i.e. it's value in Date_B is the closest to value of Date_A in X. Assumingly, this minimum is always unique, but if it's not then it's good to choose one of them randomly (but only one). Example. Table A:

`"A", "B", "2015-10-01"`

Table B:
```
"C", "2015-10-07"
"D", "2015-12-02"
```

Expected result:

`"A", "B", "2015-10-01", "C", "2015-10-07"`

p.s. the platform is Teradata, if it matters

Of course you could write a join-condition based on your logic, but this worst_case in a parallel DBMS like Teradata. It will always result in a product join (probably followed by another step to return only one matching row).

For a *closest-match-join* I usually try to find the actual value using LAG/LEAD logic:

```
SELECT
dt as date_A,
-- find the previous date_B // LAG
MAX(CASE WHEN x = 2 THEN dt end) OVER (ORDER BY dt, x ROWS UNBOUNDED PRECEDING ) AS prev_date,
-- find the next date_B // LEAD
MIN(CASE WHEN x = 2 THEN dt end) OVER (ORDER BY dt, x ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ) AS next_date,
-- now find the nearest date
CASE WHEN prev_date IS NULL THEN next_date
WHEN next_date IS NULL THEN prev_date
WHEN dt - prev_date < next_date - dt THEN prev_date
ELSE next_date
END AS date_B
FROM
( -- get all dates, maybe UNION INSTEAD OF UNION ALL
SELECT date_A as dt, 1 AS x
FROM table_A
UNION ALL
SELECT date_B, 2
FROM table_B
) AS dt
QUALIFY x = 1 -- only rows from table_A
```

This will need two STAT-steps, but now you got the correct `date_B`

to equi-join. Put this in a Derived Table and join back to both `table_A`

and `tabke_B`

, should be significantly faster unless both tables are very small.

Of course you're probably joining on other columns as well, so add them to UNION and PARTITON BY.

Asked in February 2016

Viewed 1,771 times

Voted 12

Answered 1 times

Viewed 1,771 times

Voted 12

Answered 1 times