Page 68 - Programming the Raspberry Pi Getting Started with Python

language in their own right; they are used for doing complex searches and validations of text. They are
          not easy to learn or use, but they can simplify tasks like this one.
             What we have done here is called web scraping, and it is not ideal for a number of reasons. First of
          all, organizations often do not like people “scraping” their web pages with automated programs.
          Therefore, you may get a warning or even be banned from some sites.
             Second, this technique depends heavily on the structure of the web page. One tiny change on the
          website and everything could stop working. A much better approach is to look for an official web
          service interface to the site. Rather than returning the data as HTML, these services return much more
          easily processed data, often in XML or JSON format.
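The difference can be sketched in a few lines. The example below is only an illustration: the HTML snippets, the JSON reply, and the function names (scrape_price, parse_price, fetch_json) are all made up for this sketch and do not come from any real site or service. It shows a regular expression that stops matching after a small change to the page markup, while the same value is read directly from JSON.

```python
import json
import re
import urllib.request

# Hypothetical markup, before and after a small site redesign.
OLD_PAGE = '<span class="price">42.50</span>'
NEW_PAGE = '<span class="price big">42.50</span>'

# Hypothetical JSON reply that a web service might send for the same data.
SERVICE_REPLY = '{"item": "widget", "price": 42.5}'


def scrape_price(html):
    """Pull the price out of HTML with a regular expression."""
    match = re.search(r'<span class="price">([\d.]+)</span>', html)
    return float(match.group(1)) if match else None


def parse_price(json_text):
    """Read the price from a JSON document."""
    return json.loads(json_text)["price"]


def fetch_json(url):
    """Download a URL and decode the body as JSON (defined but not called here)."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


print(scrape_price(OLD_PAGE))      # the pattern matches the old markup
print(scrape_price(NEW_PAGE))      # the extra class breaks the pattern
print(parse_price(SERVICE_REPLY))  # the JSON field is read by name
```

With a real service, fetch_json would be given the address published in that service's documentation. The address and the names of the fields in the reply vary from one service to another.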
             If you want to learn more about how to do this kind of thing, search the Internet for “web services in
          Python.”
          Summary
          This chapter has given you the basics of how to use files and access web pages from Python. There is
          actually a lot more to Python and the Internet, including accessing e-mail and other Internet protocols.
          For more information on this, have a look at the Python documentation at
          http://docs.python.org/release/3.1.5/library/internet.html.