Skip to content

Instantly share code, notes, and snippets.

@brookskindle
Last active October 21, 2019 06:03
Show Gist options
  • Save brookskindle/c5166eb27df1e42648fa63cd9e8d8203 to your computer and use it in GitHub Desktop.
Save brookskindle/c5166eb27df1e42648fa63cd9e8d8203 to your computer and use it in GitHub Desktop.
Queries against the Checkouts by Title dataset that the City of Seattle has. https://data.seattle.gov/Community/Checkouts-by-Title/tmmm-ytt6

Database looks like

$ psql checkouts <<< "\d"
          List of relations
 Schema |   Name    | Type  | Owner  
--------+-----------+-------+--------
 public | checkouts | table | brooks
(1 row)

$ psql checkouts <<< "\d checkouts"
                  Table "public.checkouts"
      Column      |  Type   | Collation | Nullable | Default 
------------------+---------+-----------+----------+---------
 usage_class      | text    |           |          | 
 checkout_type    | text    |           |          | 
 material_type    | text    |           |          | 
 checkout_year    | integer |           |          | 
 checkout_month   | integer |           |          | 
 checkouts        | integer |           |          | 
 title            | text    |           |          | 
 creator          | text    |           |          | 
 subjects         | text    |           |          | 
 publisher        | text    |           |          | 
 publication_year | text    |           |          | 
Indexes:
    "year_idx" btree (checkout_year)

Data looks like

 usage_class | checkout_type | material_type | checkout_year | checkout_month | checkouts |                                                                                                                              title                                                                                                                               |          creator           |                                                                                                        subjects                                                                                                        |                                  publisher                                   | publication_year 
-------------+---------------+---------------+---------------+----------------+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------+------------------
 Physical    | Horizon       | VIDEODISC     |          2017 |              7 |        10 | Shark tale [videorecording] / DreamWorks Animation ; DreamWorks SKG ; Pacific Data Images ; produced by Bill Damaschke, Janet Healy, Allison Lyon Segan ; screenplay, Rob Letterman, Michael J. Wilson ; directed by Bibo Bergeron, Vicky Jenson, Rob Letterman. |                            | Groundfishes Juvenile fiction, White shark Juvenile fiction, Tall tales, Friendship Fiction, Feature films, Animated films, Childrens films, Comedy films, Video recordings for the hearing impaired                   | DreamWorks Home Entertainment ; Distributed by Universal Studios Home Video, | [2005].
 Physical    | Horizon       | BOOK          |          2017 |              7 |         1 | Extreme food : what to eat when your life depends on it / Bear Grylis Bear Grylls.                                                                                                                                                                               | Grylls, Bear               | Wilderness survival, Self reliance, Outdoor cooking, Cooking Game, Hunting, Plants Edible, Trapping, Fishing                                                                                                           | William Morrow,                                                              | 2015.
 Digital     | Hoopla        | MUSIC         |          2017 |              7 |         1 | Run River North                                                                                                                                                                                                                                                  |                            | Alternative                                                                                                                                                                                                            | Nettwerk Records                                                             | 
 Digital     | OverDrive     | EBOOK         |          2017 |              7 |         1 | On The Ball: EJ12 Girl Hero Series, Book 6                                                                                                                                                                                                                       | Susannah McFarlane         | Juvenile Fiction                                                                                                                                                                                                       | Lemonfizz Media                                                              | 2017
 Physical    | Horizon       | VIDEODISC     |          2017 |              7 |         2 | Ghare-baire [videorecording] = The home and the world / Janus Films ; The National Film Development Corporation of India ; screenplay and direction by Satyajit Ray.                                                                                             |                            | Bengal India Drama, Government Resistance to India Drama, India History British occupation 1765 1947 Drama, Triangles Interpersonal relations Drama, Man woman relationships India Drama, Feature films, Fiction films | The Criterion Collection,                                                    | [2014?]
 Physical    | Horizon       | BOOK          |          2017 |              7 |         2 | Invasive / Chuck Wendig.                                                                                                                                                                                                                                         | Wendig, Chuck              | United States Federal Bureau of Investigation Fiction, Murder Investigation Fiction, Women detectives Fiction, Billionaires Fiction, Research teams Fiction, Genetic engineering Fiction, Science fiction              | Harper Voyager, an imprint of HarperCollins Publishers,                      | [2016]
 Physical    | Horizon       | BOOK          |          2017 |              7 |         9 | Hooray for bread / Allan Ahlberg ; illustrated by Bruce Ingman.                                                                                                                                                                                                  | Ahlberg, Allan             | Bread Juvenile fiction                                                                                                                                                                                                 | Candlewick Press,                                                            | 2013.
 Digital     | OverDrive     | EBOOK         |          2017 |              7 |         1 | Victory and Honor: Honor Bound Series, Book 6                                                                                                                                                                                                                    | W.E.B. Griffin             | Fiction, Historical Fiction, Thriller                                                                                                                                                                                  | Penguin Group (USA), Inc.                                                    | 2011
 Physical    | Horizon       | SOUNDDISC     |          2017 |              7 |         2 | Maui tears [sound recording] / Sleepy Sun.                                                                                                                                                                                                                       | Sleepy Sun (Musical group) | Rock music 2011 2020                                                                                                                                                                                                   | Dine Alone Records,                                                          | p2014.
 Physical    | Horizon       | BOOK          |          2017 |              7 |         3 | Junie B., first grader. Dumb bunny / Barbara Park ; illustrated by Denise Brunkus.                                                                                                                                                                               | Park, Barbara              | Jones Junie B Fictitious character Juvenile fiction, Easter egg hunts Juvenile fiction, Winning and losing Fiction, Parties Fiction, Humorous stories                                                                  | Random House,                                                                | [2009], c2007.
(10 rows)
/* TODO: create materialized view for this? */
WITH subject_popularity AS (
SELECT
checkout_year AS year,
checkout_month AS month,
checkouts AS checkouts,
TRIM(unnest(string_to_array(subjects, ','))) AS subject
FROM checkouts
)
SELECT
MAX(year) AS year,
SUM(checkouts) AS checkouts,
MAX(subject) AS subject
FROM subject_popularity
GROUP BY subject, year
ORDER BY year DESC, checkouts DESC
LIMIT 10000
year | checkouts | subject
------+-----------+-----------------------------------------------------------------------------------------------
2019 | 1348448 | Fiction
2019 | 854448 | Video recordings for the hearing impaired
2019 | 807154 | Nonfiction
2019 | 684857 | Feature films
2019 | 603138 | Fiction films
2019 | 524205 | Literature
2019 | 334981 | Thriller
2019 | 323856 | Mystery
2019 | 319389 | Romance
2019 | 318219 | Fantasy
2019 | 257778 | Video recordings for people with visual disabilities
2019 | 257108 | Historical Fiction
2019 | 241205 | Graphic novels
@brookskindle
Copy link
Author

brookskindle commented Oct 16, 2019

Running query.sql on my desktop machine took just under 15 minutes. What are some ways that I could make the query faster?

  • database normalization
  • materialized view
  • ???

Computer stats

                                       brooks@minty
 MMMMMMMMMMMMMMMMMMMMMMMMMmds+.        OS: Mint 19.2 tina
 MMm----::-://////////////oymNMd+`     Kernel: x86_64 Linux 4.15.0-60-generic
 MMd      /++                -sNMd:    Uptime: 6h 28m
 MMNso/`  dMM    `.::-. .-::.` .hMN:   Packages: 2373
 ddddMMh  dMM   :hNMNMNhNMNMNh: `NMm   Shell: bash 4.4.20
     NMm  dMM  .NMN/-+MMM+-/NMN` dMM   Resolution: 1920x1080
     NMm  dMM  -MMm  `MMM   dMM. dMM   DE: Cinnamon 4.2.3
     NMm  dMM  -MMm  `MMM   dMM. dMM   WM: Muffin
     NMm  dMM  .mmd  `mmm   yMM. dMM   WM Theme: Mint-Y-Dark (Mint-Y)
     NMm  dMM`  ..`   ...   ydm. dMM   GTK Theme: Mint-Y [GTK2/3]
     hMM- +MMd/-------...-:sdds  dMM   Icon Theme: Mint-Y
     -NMm- :hNMNNNmdddddddddy/`  dMM   Font: Ubuntu 10
      -dMNs-``-::::-------.``    dMM   CPU: AMD Phenom II X6 1045T @ 6x 2.7GHz [35.0°C]
       `/dMNmy+/:-------------:/yMMM   GPU: GeForce GTX 650 Ti BOOST
          ./ydNMMMMMMMMMMMMMMMMMMMMM   RAM: 3163MiB / 7975MiB
             \.MMMMMMMMMMMMMMMMMMM                                    

@brookskindle
Copy link
Author

I analyzed the query through http://tatiyants.com/pev/

query-runtime

The only reason this query is taking so long is because it has to group by (order) the subject, and then the year. Unfortunate, because without that group by, the query is somewhat useless.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment