
Usage

Respecting robots.txt

Whitepyges retrieves information from Whitepages.com using web scraping techniques. By default, the library checks and abides by the site's robots.txt rules to ensure responsible and ethical scraping. This means that if a request targets a path disallowed by robots.txt, Whitepyges will not send it.
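
For example, a default search call performs this check automatically. The sketch below reuses the Address class from the example further down; the exact outcome when a request is disallowed (for instance, which exception, if any, is raised) is an assumption rather than documented behavior:

from whitepyges import Address

address = Address(street="123 Main St", city="Springfield", state="IL")
# Default behavior: Whitepyges checks robots.txt before sending the request
# and will not perform it if the target path is disallowed
residents = address.search()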

Overriding robots.txt

If a search is blocked by robots.txt restrictions, you can override this behavior by passing ignore_robots=True to the search method. This instructs Whitepyges to skip the robots.txt check and proceed with the request.

Example:

from whitepyges import Address

address = Address(street="123 Main St", city="Springfield", state="IL")
# This will ignore robots.txt and perform the search anyway
residents = address.search(ignore_robots=True)

Note

By default, Whitepyges respects robots.txt to comply with website policies. Use the ignore_robots option responsibly and only if you understand the implications of bypassing these restrictions.