parker-jana

# Challenge:

Create a command line program that will take an internet domain name (e.g. “jana.com”) and print out a list of the email addresses that were found on that website. It should find email addresses on any discoverable page of the website, not just the home page.

## Examples:

> python find_email_addresses.py jana.com
Found these email addresses:
sales@jana.com

	\| I'm interested in this logic ("the url contains the root domain as part of the domain or subdomain"):
	\| https://github.com/amelehy/email_parse/blob/adc7497de476598f743cb0a83e752407cc069ca0/parser.py#L68-L71

	\| Can you talk me through what it does (what do you expect ROOT_DOMAIN to be? which urls will pass the check and which fail?) and why you chose to make it work that way?

	I expect ROOT_DOMAIN to be the base domain of the initial URL passed by the user. For example, if I were to pass “mit.edu”, “jana.com”, or “drive.google.com” as an argument, ROOT_DOMAIN would be “mit”, “jana”, or “google”, respectively.
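One way a value like this could be derived is sketched below, assuming ROOT_DOMAIN is simply the second-to-last label of the hostname. The helper name `root_domain` is hypothetical and this is not necessarily how parser.py computes it. Note that this naive rule would return "co" for a domain such as "example.co.uk".

```python
from urllib.parse import urlparse


def root_domain(user_input):
    """Return the second-to-last hostname label (hypothetical helper)."""
    # urlparse only fills in the host when a scheme or "//" is present,
    # so prepend "//" for bare inputs such as "jana.com".
    if "//" not in user_input:
        user_input = "//" + user_input
    host = urlparse(user_input).hostname or ""
    labels = host.split(".")
    # Assumption: the root is the label immediately before the TLD.
    return labels[-2] if len(labels) >= 2 else host


for arg in ("mit.edu", "jana.com", "drive.google.com"):
    print(arg, "->", root_domain(arg))
```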

	In the section where it checks whether ROOT_DOMAIN is part of the domain or subdomain of each parsed URL, the idea is to determine which URLs gathered from the page actually belong to (or are related to) the original website that was meant to be crawled, and which are “external links.”
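A check of this kind might look roughly like the sketch below. It assumes the test is an exact match against one of the dot-separated hostname labels; the linked parser.py may instead use a substring test, and the name `is_internal` is hypothetical.

```python
from urllib.parse import urlparse


def is_internal(url, root_domain):
    """True if root_domain is one of the hostname's dot-separated labels."""
    host = urlparse(url).hostname or ""
    # Only the hostname is inspected, so a match in the path does not count.
    return root_domain in host.split(".")


links = [
    "http://www.jana.com/contact",
    "http://blog.jana.com/post",
    "http://twitter.com/jana",
]
for link in links:
    print(link, is_internal(link, "jana"))
```

With a root domain of "jana", the first two links pass and the third fails, because "jana" appears only in its path and not in its hostname.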

	So for example if I were to pass “www.jana.com”, this script will gathe

	<html>
	<head>
	<title>A simplified user growth model</title>
	</head>
	</html>

	{
	"rows": [
	{
	"data": [
	0,
	0,
	0,
	0,
	1,
	0,


	<!DOCTYPE html>
	<html>
	<head>
	<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
	<link type="text/css" rel="stylesheet" href="style.css"/>
	<script src="https://d3js.org/d3.v3.min.js" charset="utf-8"></script>
	</head>
	<body>
	<div id="controls">

	from random import seed, choice
	from collections import Counter
	from math import sqrt

	Z = 1.9599 # 95% confidence level
	seed(0) # for reproducibility


	class Sample(Counter):
	    """

	import sys
	import boto


	def rename_alarm(alarm_name, new_alarm_name):
	    conn = boto.connect_cloudwatch()

	    def get_alarm():
	        alarms = conn.describe_alarms(alarm_names=[alarm_name])
	        if not alarms:

	mystery1 = {'': 0}
	mystery2 = {'': 0.}