N_Cook
Guest
I expected one of my sites to be on archive.org. Originally it was, but
it is not now.
Looking at the robots.txt file
(currently):
User-agent: *
Disallow: /cgi-bin/
Disallow: /cgi/
(Dec 2012 copy, from archive.org):
# Default /robots.txt File for all Community Architect Partner pages
User-agent: *
Disallow: /cgi-bin/
I looked at the wiki on this, but it does not explain what is or is not
covered by the "Disallow: /cgi/" statement.
I assume it was added because it's an ad-supported site and they get no
revenue from such caches; no remote linking is allowed, presumably for the
same reason.
1/ If I have access to that directory (I doubt I do), would changing the
file back to what it was before break anything? robots.txt has not appeared
in any directory listing I've made while I had upload/download access.
2/ On another site, via www access, there is no robots.txt. Does that
mean it should turn up on archive.org (if it is aware of the site and the
page, that is)?
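
A minimal sketch of how the two robots.txt versions above could be compared, using Python's standard urllib.robotparser. The crawler name "ExampleBot" and the test paths are made up for illustration, and this only shows how one standard parser reads the rules; it does not say how archive.org itself decides what to keep.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# The file as it is now (copied from the post).
CURRENT_RULES = """User-agent: *
Disallow: /cgi-bin/
Disallow: /cgi/
"""

# The Dec 2012 copy (copied from the post).
OLD_RULES = """# Default /robots.txt File for all Community Architect Partner pages
User-agent: *
Disallow: /cgi-bin/
"""

# Empty rules, standing in for a site that has no robots.txt at all.
NO_RULES = ""

AGENT = "ExampleBot"  # hypothetical crawler name; "User-agent: *" matches any
PATHS = ["/index.html", "/cgi-bin/count.pl", "/cgi/form.pl"]  # hypothetical

current = build_parser(CURRENT_RULES)
old = build_parser(OLD_RULES)
none = build_parser(NO_RULES)

for path in PATHS:
    print(
        path,
        "current:", current.can_fetch(AGENT, path),
        "old:", old.can_fetch(AGENT, path),
        "no file:", none.can_fetch(AGENT, path),
    )
```

Under this parser, the added "Disallow: /cgi/" line only blocks paths that start with /cgi/. Ordinary pages such as /index.html are allowed under both versions, and with empty rules nothing is blocked.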