Home Ask Login Register

Developers Planet

Your answer is one click away!

Sean Laney February 2016

Postgres most common value query

I am trying to figure out how to structure some queries, and I am a bit lost.

Tables:

CREATE TABLE dv_customer(
   customer_id INTEGER PRIMARY KEY,
   first_name VARCHAR(50),
   last_name VARCHAR(50),
   email VARCHAR(50),
   address_id INTEGER,
   active BOOLEAN
);

CREATE TABLE dv_address(
    address_id INTEGER PRIMARY KEY,
    address VARCHAR(50),
    address2 VARCHAR(50),
    district VARCHAR(50),
    city_id INTEGER,
    postal_code VARCHAR(50),
    phone VARCHAR(50)
);

CREATE TYPE MPAA_RATING AS ENUM(
'G',
'PG',
'PG-13',
'R',
'NC-17'
);

CREATE TABLE dv_film(
    film_id INTEGER PRIMARY KEY,
    title VARCHAR(50),
    description TEXT,
    length SMALLINT,
    rating MPAA_RATING,
    release_year SMALLINT
);

CREATE TABLE cb_customers(
    last_name VARCHAR(50),
    first_name VARCHAR(50),
    PRIMARY KEY (last_name, first_name)
);

CREATE TABLE cb_books(
    title VARCHAR(50),
    author_id INTEGER,
    edition SMALLINT,
    publisher VARCHAR(50),
    PRIMARY KEY (title, author_id, edition)
);

CREATE TABLE cb_authors(
    author_id INTEGER PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50)
);

CREATE TABLE mg_customers(
    customer_id INTEGER PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(50),
    address_id INTEGER,
    active BOOLEAN
);

I need to figure out the following Queries:

What are the first and last names of all customers who live in the district having the most customers?

So far:

SELECT x.first_name, x.last_name
    FROM dv_customer x, dv_address y    
    WHERE x.address_id = y.address_id 
    AND (SELECT count(district)
    FROM dv_address >= SELECT count(district) FROM dv_address
   );

What are the first and last names of the top 10 authors when ranked by the number of books each has written? I want author name and book count, in descending order of book count.

So far:


      

Answers


a_horse_with_no_name February 2016

What are the first and last names of the top 10 authors when ranked by the number of books each has written

This kind of query is typically done using a window function:

select first_name, last_name, num_books
from (
  SELECT x.first_name, x.last_name, 
         dense_rank() over (order by count(y.title) desc) as rnk, 
         count(*) as num_books
  FROM cb_authors x
    join cb_books y on x.author_id = y.author_id
  GROUP BY x.author_id
) t
where rnk <= 10

Your from clause FROM cb_authors x, cb_books y is missing a join condition and thus creates a cartesian join between the two tables. It is a good example on why the implicit joins in the where clause are a bad thing. If you get in the habit of using an explicit JOIN operator you will never accidentally miss a join condition.

The above also uses x.author_id which is sufficient for grouping as it is the primary key of the column and all other (non-grouped) columns in the select list are functionally dependent on it.


FuzzyTree February 2016

The query below will give you the district with the most customers

select district
from dv_address
group by district
order by count(*) desc
limit 1

Then you can select all customers living in that district with a sub query

select c.* from dv_customer c
join dv_address a on c.address_id = a.address_id
where a.district = (
    select district
    from dv_address
    group by district
    order by count(*) desc
    limit 1
)

Similarly you can get the top 10 author_id's with this query

select author_id 
from cb_books
group by author_id
order by count(*) desc
limit 10

Similarly, with a dervied table

select a.*, t.cnt from cb_authors a
join (
    select author_id, count(*) cnt
    from cb_books
    group by author_id
    order by count(*) desc
    limit 10
) t on t.author_id = a.author_id
order by t.cnt desc

Post Status

Asked in February 2016
Viewed 1,883 times
Voted 9
Answered 2 times

Search




Leave an answer


Quote of the day: live life