Meta tags and robots.txt in Yahoo Search
As a webmaster, you can manage how your website appears in Yahoo Search by using meta tags and robots.txt.
Yahoo Search results come from the Yahoo Web crawler (Slurp), and Bing’s Web crawler. To learn more about optimizing for Bing, see the Bing Webmaster Center. Below, we’ll cover meta tags and robots.txt directives that the Yahoo Web crawler understands.
You can manage how Yahoo indexes your website by adding meta tags to—or configuring HTTP headers for—your website’s individual HTML pages.
When you use the noindex tag on a page, Yahoo crawls the page and extracts links from it, but doesn’t include the page in our Search index (the page won’t appear in search results). If Yahoo has not yet crawled the page, or is blocked from crawling it by a robots.txt directive, it cannot see the tag, and the page may still be indexed.
Make sure that the Yahoo crawler “Slurp” is allowed to crawl pages that you don’t want indexed so it can see the associated noindex tag.
Apply noindex in a robots meta tag
Place the following tag in the &lt;head&gt; section of an HTML page that you don’t want Slurp to index:
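The standard robots meta tag for this looks like:

```html
<meta name="robots" content="noindex">
```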
Apply noindex in an HTTP header
Instead of adding a meta tag to each page, you can place the directive in the HTTP header of one or more pages:
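Using the standard X-Robots-Tag header, the directive looks like:

```
X-Robots-Tag: noindex
```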
Use the “robots-nocontent” class to wrap the HTML code and page content that you don’t want the Yahoo Web crawler to index:
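For example, the class can be applied to any HTML container element (the div and its text here are illustrative):

```html
<div class="robots-nocontent">
  Boilerplate navigation or legal text that should not be indexed
</div>
```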
Yahoo caches snapshots of most of the pages it discovers by crawling. These cached pages are linked to in Yahoo Search results pages. To prevent your website from being cached in this way, apply the noarchive meta tag or HTTP header directive.
Apply noarchive in a robots meta tag
Place the following tag in the &lt;head&gt; section of an HTML page:
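The standard robots meta tag form is:

```html
<meta name="robots" content="noarchive">
```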
Apply noarchive in an HTTP header
Configure your web server to place the following directive in the HTTP header:
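Using the standard X-Robots-Tag header:

```
X-Robots-Tag: noarchive
```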
Note - After the next content refresh cycle, Yahoo Search will continue to index and follow links in any pages you configure with noarchive, but the cached version of those pages will not display.
Yahoo Search obeys the “nofollow” directive for links: it may still follow the link, but excludes it from ranking calculations.
You can apply a rel="nofollow" attribute to any hyperlink on a page, add the “nofollow” meta tag to an HTML page, or use the X-Robots-Tag: nofollow directive in a page’s HTTP header to indicate that the link or links on the page may not be approved or trusted.
While Yahoo Search may use the "nofollow" link for discovering content, the link will not be considered an approved link for consideration when ranking the target page.
This attribute helps reduce the benefits of comment spam. For instance, websites with public comment areas can apply a "nofollow" attribute to user-submitted links.
Apply nofollow in an HTML “a” element
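In standard HTML, the attribute is placed on the anchor element itself (the URL here is illustrative):

```html
<a href="http://www.example.com/" rel="nofollow">Example link</a>
```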
Apply nofollow in a robots meta tag
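The standard meta tag form, placed in a page’s &lt;head&gt; section, is:

```html
<meta name="robots" content="nofollow">
```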
Apply nofollow in an HTTP header
Configure your web server to place the following directive in the HTTP header used to serve the page:
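Using the standard X-Robots-Tag header:

```
X-Robots-Tag: nofollow
```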
To tell Yahoo not to use an ODP description and title as candidate titles and descriptions for one or more of your URLs, use the “noodp” value in the robots meta tag:
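The standard meta tag form is:

```html
<meta name="robots" content="noodp">
```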
When Yahoo finds any of these meta tags in a web document, it will not take ODP titles or abstracts into consideration when presenting the title and description for that URL in search results.
If you'd like to prevent Slurp from reading some portion of your site, create a /robots.txt file and add a rule for "User-agent: Slurp".
Caution - Disallowing crawling of a page doesn’t guarantee that it won’t be indexed. To stop it from being indexed, see the “prevent a page from being indexed” section, above.
Example of code in a robots.txt file:
User-agent: Slurp
Disallow: /cgi-bin/
You can add a "Crawl-delay: xx" instruction, where "xx" is a delay value between successive crawler accesses. If the crawler’s request rate is a problem for your server, you can increase the crawl delay. You can set the delay from 1 to 10; a crawl-delay of 5 indicates a very slow crawl rate, while 10 indicates an extremely slow one.
Setting a crawl-delay of 1 for Yahoo Slurp looks like this:
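Following the standard robots.txt syntax, the rule is grouped under a User-agent line:

```
User-agent: Slurp
Crawl-delay: 1
```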
It’s best to limit total crawler activity on your server by disallowing unimportant content with a “Disallow” robots.txt rule. If you feel a delay is necessary, use a small crawl-delay value to avoid blocking Yahoo Search from discovering and refreshing your key content.
You can submit your sitemap to Yahoo Search’s crawler (Slurp) through a robots.txt directive; just add the following to your robots.txt file:
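Per the Sitemaps protocol, the directive takes the full URL of your sitemap file (the URL below is illustrative):

```
Sitemap: http://www.example.com/sitemap.xml
```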
Submit your sitemap to Bing - learn more about how Bing accepts sitemap submissions.