- Beautiful Soup: We called him Tortoise because he taught us.
Beautiful Soup is a Python HTML/XML parser designed for quick turnaround projects like screen-scraping. Three features make it powerful:
in Public bookmarks with html open_source parser python scraper web xml by 3 users
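As a rough illustration of the kind of screen-scraping task described above, the sketch below pulls links out of an HTML fragment. It deliberately uses only Python's standard-library `html.parser` rather than Beautiful Soup's own API, so it says nothing about how Beautiful Soup works internally. The names `LinkCollector`, `extract_links`, and `SAMPLE` are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> in a document."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, text) tuples found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical input, not taken from any real page.
SAMPLE = '<p>See <a href="/docs">the <b>docs</b></a> and <a href="/faq">FAQ</a>.</p>'

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE):
        print(href, text)
```

A dedicated parser library is usually the better choice for messy real-world markup; this version only shows the shape of the task.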
- mechanize
Stateful programmatic web browsing in Python, after Andy Lester's Perl module WWW::Mechanize.
* mechanize.Browser is a subclass of mechanize.UserAgentBase, which is, in turn, a subclass of urllib2.OpenerDirector (in fact, of mechanize.OpenerDirect
client client-side cookie html meta python refresh web
in Public bookmarks with open_source python scraper web by 3 users
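The `OpenerDirector` that the description above refers to also exists in the Python 3 standard library as `urllib.request.OpenerDirector` (`urllib2` is its Python 2 name). The sketch below shows what "stateful" means at that level: an opener that keeps cookies between requests. It does not use mechanize itself, and the helper name `make_stateful_opener` and the User-Agent string are assumptions made for illustration.

```python
import http.cookiejar
import urllib.request


def make_stateful_opener():
    """Build an OpenerDirector that remembers cookies across requests."""
    jar = http.cookiejar.CookieJar()
    # HTTPCookieProcessor stores cookies from responses in `jar` and
    # sends them back on later requests made through the same opener.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Hypothetical User-Agent value, chosen only for this example.
    opener.addheaders = [("User-Agent", "example-stateful-client/0.1")]
    return opener, jar


if __name__ == "__main__":
    opener, jar = make_stateful_opener()
    print(type(opener).__name__, "with", len(jar), "cookies stored")
```

Calling `opener.open(url)` repeatedly would then carry session cookies from one request to the next; no request is made in the sketch itself so that it runs without network access.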
- twill: a simple scripting language for Web browsing
twill is a simple language that allows users to browse the Web from a command-line interface. With twill, you can navigate through Web sites that use forms, cookies, and most standard Web features.
in Public bookmarks with open_source python scraper scripting web by 5 users
- Web scraping software and services
screen-scraper.com
Use screen-scraper software for website data extraction: data mining, web scraping, and automated data extraction.
automated collection content data extraction html mining page ripper save scraping screen site software text web
in Public bookmarks with scraper software web by 3 users