Pages blocked by robots.txt, or too few pages scanned TN-M03
If too few pages are scanned, there are several possible causes:
- The crawler only visits pages on the same domain as the home page, so pages on other domains do not appear on the map. To add other domains to the same report or sitemap, select the Options command from the View menu, then add the extra domain names to the Additional Domains box on the Links tab.
- Some pages were blocked by the Robot Exclusion Standard (robots.txt) or explicitly blocked in the Blocks tab on the Options window.
- To find out which links are blocked by robots.txt for a site (http://www.google.com, for example), open the address http://www.google.com/robots.txt. If you get a Not Found message, no links are blocked; if you get a text file back, it lists which links are blocked.
- You can ignore the Robot Exclusion Standard by selecting the Options command from the View menu and unchecking Obey Robots.txt.
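To see how a crawler interprets these rules, Python's standard-library urllib.robotparser can evaluate a robots.txt file against specific URLs. This is a minimal sketch using a made-up robots.txt (not Google's actual file), not PowerMapper's own implementation:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for illustration only.
# Note: Python's parser applies rules in order, so the more
# specific Allow line is listed before the broader Disallow.
rules = """\
User-agent: *
Allow: /search/about
Disallow: /search
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Paths matching "Disallow: /search" are blocked for all crawlers.
print(rp.can_fetch("*", "http://example.com/search"))        # False
# The explicit Allow line permits this sub-path.
print(rp.can_fetch("*", "http://example.com/search/about"))  # True
# Paths matching no rule are allowed by default.
print(rp.can_fetch("*", "http://example.com/index.html"))    # True
```

Any URL for which can_fetch returns False would be skipped by a crawler that obeys the Robot Exclusion Standard, which is one way pages end up missing from a scan.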
See Also: What is robots.txt
Applies To: PowerMapper 3.0 and SortSite 3.0 or later
Last Reviewed: January 29, 2015