@maggie-lee
Last active March 19, 2019 16:53
Python scraper of annual voting summaries from Georgia legislature website.
from bs4 import BeautifulSoup
import csv
import time
from pyvirtualdisplay import Display
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.select import Select

# Option values of the session dropdown on the vote list page.
sessions = ['27', '25', '24', '23', '21', '20', '18', '14']
# One vote list URL per chamber.
urls = ['http://www.legis.ga.gov/Legislation/en-US/VoteList.aspx?Chamber=2',
        'http://www.legis.ga.gov/Legislation/en-US/VoteList.aspx?Chamber=1']
# Vote rows alternate between these two background colors; scrape both.
row_styles = ["width:100%; background-color:#EEEFCE;",
              "width:100%; background-color:#FFFFFF;"]

for url in urls:
    time.sleep(10)
    for session in sessions:
        time.sleep(10)  # pause between requests
        print(url, session)
        with Display():
            # we can now start Firefox and it will run inside the virtual display
            driver = webdriver.Firefox()
            try:
                driver.get(url)
                # pick the session from the dropdown, then read the resulting page
                my_selection = Select(driver.find_element(
                    By.ID,
                    "ctl00_SPWebPartManager1_g_f97fdca8_f858_400b_9279_a6a8f76ec618_Session"))
                my_selection.select_by_value(session)
                page = driver.page_source
                soup = BeautifulSoup(page, 'html.parser')
                for row_style in row_styles:
                    divs = soup.find_all('div', style=row_style)
                    # print(divs)
                    for row in divs:
                        # each span in a row holds one field of the vote summary
                        vote = [span.get_text() for span in row.find_all('span')]
                        print(vote)
                        # append one line per vote to the output file
                        with open('votes.csv', 'a', newline='') as csvfile:
                            csvwriter = csv.writer(csvfile)
                            csvwriter.writerow(vote)
            finally:
                driver.quit()
maggie-lee commented Mar 19, 2019

This scrapes annual vote date/time summaries from the Georgia Legislature's website for 2005 through 2019, which are found here:
http://www.legis.ga.gov/Legislation/en-US/VoteList.aspx

It's what underlies the chart here: https://datawrapper.dwcdn.net/XYgAq/2/

It writes results to a .csv. If you'd like that csv, it's here.

My next step was summarizing the data in Excel with pivot tables: sum up the number of votes on each date, then assign a legislative day to each date. That's here if you want to review.
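If you'd rather skip Excel, the same summary could probably be done in Python. What follows is only a rough sketch, not what I actually did: it assumes the vote timestamp is in the first column of votes.csv and looks like `1/14/2019 10:05 AM` (both the column index and the format are guesses, so check your file), and the function names are made up for illustration. It also numbers "legislative days" by counting the distinct dates that have votes within each year, which is an approximation: a legislative day with no floor vote would be missed, so compare against the official calendar.

```python
import csv
from collections import Counter
from datetime import datetime


def votes_per_day(rows, date_col=0, date_format="%m/%d/%Y %I:%M %p"):
    """Count scraped rows per calendar date.

    date_col and date_format are assumptions about the CSV layout.
    Rows that are too short or whose date does not parse are skipped.
    """
    counts = Counter()
    for row in rows:
        if len(row) <= date_col:
            continue
        try:
            stamp = datetime.strptime(row[date_col].strip(), date_format)
        except ValueError:
            continue
        counts[stamp.date()] += 1
    return counts


def number_legislative_days(counts):
    """Number the voting dates 1, 2, 3, ... within each calendar year.

    Returns (year, day_number, iso_date, vote_count) tuples in date order.
    This approximates legislative days by dates that have at least one vote.
    """
    summary = []
    day_in_year = {}
    for day in sorted(counts):
        day_in_year[day.year] = day_in_year.get(day.year, 0) + 1
        summary.append((day.year, day_in_year[day.year],
                        day.isoformat(), counts[day]))
    return summary


if __name__ == "__main__":
    with open("votes.csv", newline="") as f:
        rows = list(csv.reader(f))
    for line in number_legislative_days(votes_per_day(rows)):
        print(line)
```

The output rows are in the shape a pivot table would give you (year, day number, date, vote count), so they can be written back out with `csv.writer` for the charting step.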

Then paste the data into Datawrapper. :)

There's almost certainly an easier way to do this; there's said to be an API under the GGA site, but I'm not sure how to get at it.

Votes counted here are all floor votes: regular votes, attendance, agree/disagree, etc.
