Are all repo contents prevented from being crawled by search engines?

Hello GitHub experts!

Recently I noticed that https://github.com/robots.txt prevents all repository contents from being crawled by search engines. Here is the current content of the robots.txt:

# If you would like to crawl GitHub contact us at support@github.com.
# We also provide an extensive API: https://developer.github.com/

User-agent: baidu
crawl-delay: 1


User-agent: *

Disallow: */pulse
Disallow: */tree/
Disallow: */blob/
Disallow: */wiki/
Disallow: /gist/
Disallow: */forks
Disallow: */stars
Disallow: */download
Disallow: */revisions
Disallow: */issues/new
Disallow: */issues/search
Disallow: */commits/
Disallow: */commits/*?author
Disallow: */commits/*?path
Disallow: */branches
Disallow: */tags
Disallow: */contributors
Disallow: */comments
Disallow: */stargazers
Disallow: */archive/
Disallow: */blame/
Disallow: */watchers
Disallow: */network
Disallow: */graphs
Disallow: */raw/
Disallow: */compare/
Disallow: */cache/
Disallow: /.git/
Disallow: */.git/
Disallow: /*.git$
Disallow: /search/advanced
Disallow: /search
Disallow: */search
Disallow: /*q=
Disallow: /*.atom

Disallow: /ekansa/Open-Context-Data
Disallow: /ekansa/opencontext-*
Disallow: */tarball/
Disallow: */zipball/

Disallow: /account-login
Disallow: /Explodingstuff/

I believe that because of the Disallow: */blob/ rule, none of the repository contents can be crawled, since every file view URL contains /blob/.
I remember that GitHub previously allowed the master branch to be crawled. Could you share the intention behind this change? And what should I do if I want my repo contents to be indexed by Google?
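To illustrate why I think this rule blocks file pages, here is a rough sketch (my own illustration, not an official tool) of how a Google-style crawler matches a robots.txt pattern, where * is a wildcard and $ anchors the end of the path, against a typical repository file URL. The helper function and the example path are hypothetical:

```python
import re

def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Convert a Google-style robots.txt path pattern ('*' wildcard,
    optional trailing '$' anchor) into a regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape regex metacharacters, then restore '*' as "match anything".
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.compile(regex + ("$" if anchored else ""))

# The rule from GitHub's robots.txt and an example repository file path.
rule = robots_pattern_to_regex("*/blob/")
path = "/someuser/some-repo/blob/master/README.md"

# re.match tests from the start of the path, as robots.txt matching does.
print(bool(rule.match(path)))  # True -> disallowed for User-agent: *
```

If my understanding is right, this is why repository files no longer show up in search results while the rule is in place.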

Hi @zhiliangxu! :wave: Welcome to the Community!

I’ve been checking with our SEO team about this - it seems that our robots.txt was updated in May to block all crawling, but we are in the process of reverting this change.

We expect repositories to start getting indexed again in the next few weeks!
