Home Ask Login Register

Developers Planet

Your answer is one click away!

dStudios February 2016

PostgreSQL MAX() inner query really slow

I've been currently tasked with optimising some of the SQL queries that are ran on very large datasets. I have the query below which I'm trying to optimise but at the moment struggling with it

The query takes roughly around 40-45 minutes to comeback with an answer which is obviously not acceptable. It will even take longer as the dataset grows within DB.

I believe its the inner MAX() function that dramatically slows down the query?

I appreciate any help or advice in making the query better and faster.

SELECT 
  positionview.equipmenttimestamp, 
  positionview.name, 
  positionview.iNumber, 
  positionview.datauserprovidername, 
  positionview.latitude, 
  positionview.longitude
FROM 
  public.positionview
WHERE positionview.outbound = false
AND positionview.referenceIdentifier is null
AND positionview.responseType = 'XYZ'
AND positionview.equipmentTimestamp >='2016-02-02 14:44:08.213'
AND positionview.equipmentTimestamp <='2016-02-05 14:44:08.213'
AND positionview.equipmentTimestamp =(
  SELECT MAX(positionview2.equipmentTimestamp)
    FROM PositionView positionview2
    WHERE positionview.iNumber = positionview2.iNumber
    AND positionview2.outbound = false
    AND positionview2.referenceIdentifier is null
    AND positionview2.responseType = 'XYZ'
    AND positionview2.equipmentTimestamp >='2016-02-02 14:44:08.213'
    AND positionview2.equipmentTimestamp <='2016-02-05 14:44:08.213');

Many thanks in advance!

Answers


Gordon Linoff February 2016

I would suggest window functions:

SELECT pv.*
FROM (SELECT pv.*, MAX(pv.equipmenttimestamp) OVER (PARTITION BY pv.inumber) as maxet
      FROM public.positionview pv
      WHERE pv.outbound = false AND
            pv.referenceIdentifier is null AND
            pv.responseType = 'XYZ' AND
            pv.equipmentTimestamp >='2016-02-02 14:44:08.213' AND
            pv.equipmentTimestamp <='2016-02-05 14:44:08.213'
     ) pv
WHERE equipmenttimestamp = maxet;


a_horse_with_no_name February 2016

This kind of statement can be done using Postgres' distinct on() operator (which is usually faster than the equivalent solution using a standard window function):

SELECT distinct on (inumber)
  equipmenttimestamp, 
  name, 
  iNumber, 
  datauserprovidername, 
  latitude, 
  longitude
FROM 
  public.positionview
WHERE outbound = false
AND referenceIdentifier is null
AND responseType = 'XYZ'
AND equipmentTimestamp >='2016-02-02 14:44:08.213'
AND equipmentTimestamp <='2016-02-05 14:44:08.213'
order by inumber, equipmentTimestamp desc;

An index on (inumber,equipmenttimestamp) will probably help

Post Status

Asked in February 2016
Viewed 2,491 times
Voted 13
Answered 2 times

Search




Leave an answer


Quote of the day: live life