@golbin
Last active July 4, 2018 08:00
Get Reddit Articles
import requests

REDDIT_URL = 'https://www.reddit.com'
ML_TOP_URL = 'https://www.reddit.com/r/machinelearning/top.json?redditWebClient=mweb2x&layout=classic&raw_json=1&withAds=true&subredditName=machinelearning&sort=top&t=day&feature=link_preview&sr_detail=true&app=2x-client-production'
HEADERS = {'User-agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 10_3 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) CriOS/56.0.2924.75 Mobile/14E5239e Safari/602.1'}


def get_articles():
    # Fetch the day's top posts from r/machinelearning as JSON
    res = requests.get(ML_TOP_URL, headers=HEADERS)
    articles = []
    # No error handling: this is a one-off script
    for article in res.json()['data']['children']:
        articles.append({
            'title': article['data']['title'],
            'reddit_url': REDDIT_URL + article['data']['permalink'],
            'origin_url': article['data']['url'],
        })
    return articles


# Quick manual test
articles = get_articles()
for article in articles:
    print(article)
# Sample output
# {
#     'title': '[R] Capture the Flag: the emergence of complex cooperative agents | DeepMind',
#     'reddit_url': 'https://www.reddit.com/r/MachineLearning/comments/8vu823/r_capture_the_flag_the_emergence_of_complex/',
#     'origin_url': 'https://deepmind.com/blog/capture-the-flag/'
# }
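
The script above indexes straight into the response and does no error handling, which is fine for one-time use. If the parsing step were reused, it could be separated from the network call so it can be checked offline. The following is only a minimal sketch under that assumption: `parse_listing` and the `sample` payload are hypothetical names made up for illustration, and the payload only mimics the `data` / `children` shape the script already relies on.

```python
def parse_listing(payload, base_url="https://www.reddit.com"):
    """Extract title and URLs from a listing-shaped dict (hypothetical helper)."""
    articles = []
    for child in payload.get("data", {}).get("children", []):
        data = child.get("data", {})
        # Skip entries that lack any of the three fields the script reads
        if not all(key in data for key in ("title", "permalink", "url")):
            continue
        articles.append({
            "title": data["title"],
            "reddit_url": base_url + data["permalink"],
            "origin_url": data["url"],
        })
    return articles


# Made-up payload for illustration; not real Reddit data
sample = {"data": {"children": [
    {"data": {"title": "Example post",
              "permalink": "/r/example/comments/abc/",
              "url": "https://example.com/"}},
    {"data": {"title": "Entry without links"}},
]}}

for article in parse_listing(sample):
    print(article)
```

With this split, `get_articles()` could pass `res.json()` to the helper, and an incomplete entry would be skipped instead of raising a `KeyError`.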