Are all repo contents prevented from being crawled by search engines?

Hello GitHub experts!

Recently I noticed that https://github.com/robots.txt prevents all repository contents from being crawled by search engines. Here is the current content of the robots.txt:

# If you would like to crawl GitHub contact us at support@github.com.
# We also provide an extensive API: https://developer.github.com/

User-agent: baidu
crawl-delay: 1


User-agent: *

Disallow: */pulse
Disallow: */tree/
Disallow: */blob/
Disallow: */wiki/
Disallow: /gist/
Disallow: */forks
Disallow: */stars
Disallow: */download
Disallow: */revisions
Disallow: */issues/new
Disallow: */issues/search
Disallow: */commits/
Disallow: */commits/*?author
Disallow: */commits/*?path
Disallow: */branches
Disallow: */tags
Disallow: */contributors
Disallow: */comments
Disallow: */stargazers
Disallow: */archive/
Disallow: */blame/
Disallow: */watchers
Disallow: */network
Disallow: */graphs
Disallow: */raw/
Disallow: */compare/
Disallow: */cache/
Disallow: /.git/
Disallow: */.git/
Disallow: /*.git$
Disallow: /search/advanced
Disallow: /search
Disallow: */search
Disallow: /*q=
Disallow: /*.atom

Disallow: /ekansa/Open-Context-Data
Disallow: /ekansa/opencontext-*
Disallow: */tarball/
Disallow: */zipball/

Disallow: /account-login
Disallow: /Explodingstuff/

I believe that because of the Disallow: */blob/ rule, none of the repository contents can be crawled, since every file view URL contains /blob/.
I remember that GitHub previously allowed the master branch to be crawled. Could you share the intention behind this change? And what should I do if I want my repo contents to be indexed by Google?
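To illustrate why I think this rule blocks file pages, here is a rough sketch (my own illustration, not an official tool) of how a Google-style crawler matches a robots.txt pattern, where * is a wildcard and $ anchors the end of the path, against a typical repository file URL. The helper function and the example path are hypothetical:

```python
import re

def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Convert a Google-style robots.txt path pattern ('*' wildcard,
    optional trailing '$' anchor) into a regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape regex metacharacters, then restore '*' as "match anything".
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.compile(regex + ("$" if anchored else ""))

# The rule from GitHub's robots.txt and an example repository file path.
rule = robots_pattern_to_regex("*/blob/")
path = "/someuser/some-repo/blob/master/README.md"

# re.match tests from the start of the path, as robots.txt matching does.
print(bool(rule.match(path)))  # True -> disallowed for User-agent: *
```

If my understanding is right, this is why repository files no longer show up in search results while the rule is in place.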

Hi @zhiliangxu! :wave: Welcome to the Community!

I’ve been checking with our SEO team about this - it seems that our robots.txt was updated in May to block all crawling, but we are in the process of reverting this change.

We expect repositories to start getting indexed again in the next few weeks!
