Page 68 - Programming the Raspberry Pi Getting Started with Python
language in their own right; they are used for doing complex searches and validations of text. They are
not easy to learn or use, but they can simplify tasks like this one.
What we have done here is called web scraping, and it is not ideal for a number of reasons. First of
all, organizations often do not like people “scraping” their web pages with automated programs.
Therefore, you may get a warning or even be banned from some sites.
Second, scraping depends heavily on the structure of the web page. One small change to the site's
HTML and your program could stop working. A much better approach is to look for an official web
service interface to the site. Rather than returning the data as HTML, these services return much more
easily processed data, often in XML or JSON format.
If you want to learn more about using web services from your programs, search the Internet for
“web services in Python.”
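To give a feel for why JSON is easier to process than HTML, here is a minimal sketch. It is not taken from any real service: the reply text, the field names city and temp_c, and the helper functions fetch_json and describe_weather are all made up for illustration. The sketch only assumes that a service sends back a small JSON object, which Python's standard json module turns into an ordinary dictionary.

```python
import json
import urllib.request


def fetch_json(url):
    """Download a URL and decode its body as JSON.

    Hypothetical helper: it is defined here for illustration but not
    called, because no real service address is assumed.
    """
    with urllib.request.urlopen(url) as response:
        # Assume UTF-8 unless the server declares another charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return json.loads(response.read().decode(charset))


def describe_weather(data):
    """Build a one-line summary from a decoded JSON reply.

    The keys "city" and "temp_c" are invented for this example.
    """
    return "{}: {} C".format(data["city"], data["temp_c"])


# A made-up reply, standing in for what a weather service might send.
sample_reply = '{"city": "London", "temp_c": 14.5}'

# json.loads turns the text into a dictionary; no searching through
# HTML and no regular expressions are needed.
summary = describe_weather(json.loads(sample_reply))
print(summary)
```

With a real service, you would pass its documented address to fetch_json and use the field names given in that service's documentation instead of the invented ones shown here.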
Summary
This chapter has given you the basics of how to use files and access web pages from Python. There is
much more to Python and the Internet, including sending and reading e-mail and using other Internet protocols.
For more information on this, have a look at the Python documentation at
http://docs.python.org/release/3.1.5/library/internet.html.