Meta tags and robots.txt in Yahoo Search

As a webmaster, you can manage how your website appears in Yahoo Search by using meta tags and robots.txt.

Yahoo Search results come from the Yahoo web crawler (Slurp) and Bing’s web crawler. To learn more about optimizing for Bing, see the Bing Webmaster help center. Below, we’ll cover meta tags and robots.txt directives that the Yahoo web crawler understands.

Meta tags

You can manage how Yahoo indexes your website by adding meta tags to your website’s individual HTML pages, or configuring HTTP headers for them.

Prevent a page from being indexed

When you use the noindex tag on a page, Yahoo crawls the page and extracts links from it, but doesn’t include the page in the Yahoo Search index, so the page won’t appear in search results. The page may still be indexed, however, if Yahoo has found it linked from other pages but hasn't yet crawled it and "seen" the tag, or if a robots.txt directive blocks Yahoo from crawling it.

Make sure that the Yahoo crawler, Slurp, is allowed to crawl pages that you don’t want indexed so it can see the associated noindex tag.

Apply noindex in a robots meta tag

Place the following tag in the head section of an HTML page that you don’t want Slurp to index:

<meta name="robots" content="noindex">

Apply noindex in an HTTP header

Instead of adding a meta tag to each page, you can place the directive in the HTTP header of one or more pages:

X-Robots-Tag: noindex
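
To confirm that a page really carries the directive you intended, whether in a meta tag or in an HTTP header, a small checker can help. The following is a minimal Python sketch, not a description of how Slurp itself parses pages. The names RobotsMetaParser and robots_directives are hypothetical, and the sketch only looks at meta tags named "robots" and at a plain X-Robots-Tag header value.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        # Assumption: only the generic "robots" name is inspected here.
        if attr.get("name", "").lower() == "robots":
            for value in attr.get("content", "").split(","):
                if value.strip():
                    self.directives.add(value.strip().lower())


def robots_directives(html, headers=None):
    """Return the set of robots directives found in the HTML and the headers."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = set(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            found.update(v.strip().lower() for v in value.split(",") if v.strip())
    return found
```

For example, calling robots_directives(page_html, response_headers) and checking whether "noindex" is in the result shows whether either mechanism is in place for that page.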

Prevent specific content from being indexed

Use the “robots-nocontent” class to wrap the HTML code and page content that you don’t want the Yahoo web crawler to index:

<div class="robots-nocontent">Don't index this text.</div>

Prevent a page from being cached

Yahoo caches snapshots of most of the pages it discovers by crawling, and Yahoo Search results pages link to these cached copies. To prevent a page from being cached in this way, apply the noarchive meta tag or HTTP header directive.

Apply noarchive in a robots meta tag

Place the following tag in the head section of an HTML page:

<meta name="robots" content="noarchive">

Apply noarchive in an HTTP header

Configure your web server to place the following directive in the HTTP header:

X-Robots-Tag: noarchive

  Note - Yahoo Search continues to index and follow links in any pages you configure using noarchive, but after the next content refresh cycle the cached version of those pages won't display.
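
How you set the header depends on your web server. As one illustration, if your pages are served by a Python WSGI application, a thin wrapper can add the header to every response. This is a sketch under that assumption only; add_x_robots_tag and hello_app are hypothetical names, and other servers have their own configuration syntax.

```python
def add_x_robots_tag(app, directive="noarchive"):
    """Wrap a WSGI app so every response carries an X-Robots-Tag header."""

    def wrapped(environ, start_response):
        def start_with_header(status, headers, exc_info=None):
            # Append the directive to whatever headers the app already set.
            headers = list(headers) + [("X-Robots-Tag", directive)]
            return start_response(status, headers, exc_info)

        return app(environ, start_with_header)

    return wrapped


def hello_app(environ, start_response):
    """Hypothetical page handler, used only for illustration."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<html><body>Hello</body></html>"]


application = add_x_robots_tag(hello_app)
```

To try it locally, the standard library's wsgiref.simple_server.make_server can serve application on a test port. Passing directive="noindex" or directive="nofollow" to the wrapper would emit those header values instead.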

Prevent Yahoo Search from following links

Yahoo Search obeys the “nofollow” directive for links: it may still follow a link to discover content, but it excludes the link from ranking calculations.

You can apply a rel="nofollow" attribute to any hyperlink on a page, the “nofollow” meta tag to an HTML page, or the X-Robots-Tag: nofollow directive in a page’s HTTP header, to indicate that the link or links on the page may not be approved or trusted.

While Yahoo Search may use the "nofollow" link for discovering content, the link won't be considered an approved link when ranking the target page.

This attribute works to reduce the benefits of comment spamming. For instance, websites with public comment areas can apply a "nofollow" attribute to publicly entered links to help fight comment spam.
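
A site that renders visitor comments could apply the attribute automatically when it builds the link markup. The following Python sketch shows one way to do that; render_comment_link is a hypothetical helper, not part of any Yahoo tool, and it assumes the URL and link text arrive as plain strings from the comment form.

```python
from html import escape


def render_comment_link(url, text):
    """Build an anchor for a user-submitted link, marked rel="nofollow"."""
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True),  # escape &, <, > and quotes in the URL
        escape(text),             # escape the visible link text
    )
```

The helper only escapes its inputs for HTML. A production site would likely also check that the URL uses an expected scheme such as http or https before rendering it.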

Apply nofollow in an HTML “a” element

<a href="http://www.example.com/" rel="nofollow">Link text</a>

Apply nofollow in a robots meta tag

<meta name="robots" content="nofollow">

Apply nofollow in an HTTP header

Configure your web server to place the following directive in the HTTP header used to serve the page:

X-Robots-Tag: nofollow

Restrict use of DMOZ (Open Directory Project) titles and abstracts

To tell Yahoo not to use DMOZ titles and descriptions as candidates for the title and description of one or more of your URLs, use the “noodp” value in the robots meta tag:

<meta name="robots" content="noodp">
<meta name="slurp" content="noodp">

When Yahoo finds any of these meta tags in a web document, it won't take DMOZ titles or abstracts into consideration when presenting the title and description for that URL in search results.

Robots.txt directives

Prevent certain subdirectories from being crawled

If you'd like to prevent Slurp from reading some portion of your site, create a robots.txt file in the root directory (home folder) of your website, and add a rule for User-agent: Slurp.

  Caution - Disallowing crawling of a page doesn’t guarantee that it won’t be indexed. To stop it from being indexed, see the “prevent a page from being indexed” section, above.

Example of code in a robots.txt file:

User-agent: Slurp
Disallow: /cgi-bin/
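
Before you publish a rule like this, you can check which URLs it would block. The sketch below uses Python's standard urllib.robotparser module, which approximates how a crawler reads robots.txt; it is not Slurp's own parser, and the www.example.com URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A URL under /cgi-bin/ is blocked for Slurp; other paths remain crawlable.
blocked = parser.can_fetch("Slurp", "http://www.example.com/cgi-bin/search.cgi")
allowed = parser.can_fetch("Slurp", "http://www.example.com/index.html")

print("cgi-bin crawlable:", blocked)
print("index crawlable:", allowed)
```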

Limit crawler frequency

You can add a "Crawl-delay: xx" instruction, where "xx" is a delay value between successive crawler accesses. If the crawler rate is a problem for your server, you can increase the crawl delay.

Setting a crawl-delay of 1 for Yahoo Slurp looks like this:

User-agent: Slurp
Crawl-delay: 1
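
To verify that the delay is attached to the user agent you intended, the same standard-library parser can read the value back. This is a minimal sketch that assumes Python 3.6 or later, where RobotFileParser.crawl_delay is available; "OtherBot" is a made-up crawler name used for comparison.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse(["User-agent: Slurp", "Crawl-delay: 1"])

# The delay applies only to the user agent named in the rule.
slurp_delay = parser.crawl_delay("Slurp")
other_delay = parser.crawl_delay("OtherBot")

print("Slurp delay:", slurp_delay)
print("OtherBot delay:", other_delay)
```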

It’s best to restrict total crawler activity to your server by disallowing unimportant content with a “disallow” robots.txt rule. If you feel a delay is necessary, use a small crawl-delay value to avoid blocking Yahoo Search discovery and refresh of your key content.

Submit a sitemap

You can submit your sitemap to the Yahoo Search crawler, Slurp, through a robots.txt directive. Just add the following to your robots.txt file:

Sitemap: [Full URL of your sitemap xml file]
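
As a quick check that the Sitemap line is well formed, you can parse the file and read the sitemap URLs back. The sketch below assumes Python 3.8 or later, where RobotFileParser.site_maps is available, and uses a placeholder sitemap URL on www.example.com.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: Slurp",
    "Disallow: /cgi-bin/",
    "Sitemap: http://www.example.com/sitemap.xml",  # placeholder URL
])

# site_maps() returns the listed sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```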

To submit your sitemap to Bing, see Bing’s documentation on how it accepts sitemap submissions.

Further reading