This week’s assignment is to grab text from a page, so I tried to grab all the article titles from TechCrunch.

I opened the page and found that all the news titles are under h2 tags, so the CSS selector would be “h2 a”. Here is the code:

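from urllib.request import urlopen

from bs4 import BeautifulSoup

# URL of the page to scrape
start_url = ''
html = urlopen(start_url).read()

soup = BeautifulSoup(html, 'html.parser')

# Select every link nested inside an h2 heading
titles = soup.select('h2 a')

for title in titles:
    print(title.text)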


And the results:


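To try the idea of "links nested inside h2 headings" without touching the network, here is a minimal sketch that uses only Python's standard-library html.parser instead of BeautifulSoup. It is a stand-in for illustration, not the code used above: the class HeadingLinkParser, the helper extract_titles, and the SAMPLE_HTML snippet are all hypothetical names and data made up for this example, and the real TechCrunch markup may look different.

```python
from html.parser import HTMLParser

# Made-up sample markup (an assumption, not real TechCrunch HTML).
SAMPLE_HTML = """
<h2><a href="/one">First headline</a></h2>
<p><a href="/about">About us</a></p>
<h2><a href="/two">Second <em>headline</em></a></h2>
"""


class HeadingLinkParser(HTMLParser):
    """Collect the text of every <a> that sits inside an <h2>."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False    # True while we are between <h2> and </h2>
        self.in_link = False  # True while we are inside an <a> within an <h2>
        self.chunks = []      # text pieces of the link currently being read
        self.titles = []      # finished link texts

    def handle_starttag(self, tag, attrs):
        if tag == 'h2':
            self.in_h2 = True
        elif tag == 'a' and self.in_h2:
            self.in_link = True
            self.chunks = []

    def handle_endtag(self, tag):
        if tag == 'a' and self.in_link:
            # Join the pieces so nested tags like <em> do not split the title.
            self.titles.append(''.join(self.chunks).strip())
            self.in_link = False
        elif tag == 'h2':
            self.in_h2 = False

    def handle_data(self, data):
        if self.in_link:
            self.chunks.append(data)


def extract_titles(html):
    """Return the text of all links found inside h2 headings."""
    parser = HeadingLinkParser()
    parser.feed(html)
    parser.close()
    return parser.titles


for title in extract_titles(SAMPLE_HTML):
    print(title)
```

On the sample snippet this prints the two headline texts and skips the "About us" link, because that link is not inside an h2. With BeautifulSoup, a descendant selector does this filtering in a single call, which is why the scraping code can stay so short.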