@campreb
Created September 7, 2014 21:36
NZ Opinion Poll Scraper

Usage

ruby scraper.rb > polls.json
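# scraper.rb
# Scrapes the first table on the Wikipedia polling page and prints one JSON
# record per poll/party pair.
require 'nokogiri'
require 'open-uri'
require 'json'

# URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs.
doc = Nokogiri::HTML(URI.open('http://en.wikipedia.org/wiki/Opinion_polling_for_the_next_New_Zealand_general_election'))
rows = doc.css('table:first tr')
headers = rows[0].css('th')

# Map column index => party name; the first two columns hold the poll and date.
parties = {}
headers[2..headers.length].each_with_index do |cell, i|
  parties[i + 2] = cell.text.strip
end

results = []
rows.each do |row|
  cells = row.css('td')
  next unless cells.length > 1

  # Strip footnote markers such as [1]; non-greedy so text between two markers survives.
  poll = cells[0].text.gsub(/\[.*?\]/, '').strip
  date = cells[1].text.gsub(/\[.*?\]/, '').strip
  parties.each do |key, party|
    cell = cells[key]
    next if cell.nil? # row has fewer cells than the header (e.g. a colspan)

    value = cell.text.strip
    next if value == ''

    results.push({
      poll: poll,
      date: date,
      party: party,
      value: value.to_f
    })
  end
end

puts results.to_json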
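
Consuming the output

A minimal sketch of one way polls.json might be used, assuming it holds the array of poll/date/party/value objects that the scraper prints. The helper name average_by_party and the sample records are hypothetical and not part of the gist; the sample stands in for the real file so the snippet runs on its own.

```ruby
require 'json'

# Hypothetical helper (not part of the gist): mean "value" per party,
# rounded to one decimal place.
def average_by_party(records)
  records.group_by { |r| r['party'] }.transform_values do |rows|
    (rows.sum { |r| r['value'] } / rows.length).round(1)
  end
end

# Made-up sample in the same shape as the scraper output. With a real run,
# this would instead be: records = JSON.parse(File.read('polls.json'))
sample = <<~JSON
  [
    {"poll": "Poll A", "date": "1 Jan", "party": "Party X", "value": 40.0},
    {"poll": "Poll B", "date": "2 Jan", "party": "Party X", "value": 44.0},
    {"poll": "Poll A", "date": "1 Jan", "party": "Party Y", "value": 30.0}
  ]
JSON

averages = average_by_party(JSON.parse(sample))
puts averages.to_json
```

JSON.parse returns string keys by default, which is why the helper reads r['party'] rather than r[:party] even though the scraper builds its records with symbol keys.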