OT: robots.txt query

N_Cook · Sep 2, 2016

I expected one of my sites to be on archive.org. It was there originally,
but it is not now.
Looking at the robots.txt file
(currently)
User-agent: *
Disallow: /cgi-bin/
Disallow: /cgi/

(Dec 2012, from archive.org)
# Default /robots.txt File for all Community Architect Partner pages

User-agent: *
Disallow: /cgi-bin/

I looked at the wiki on this, but it does not explain what is or is not
covered by the "/cgi/" statement.
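For what it's worth, the effect of the extra line can be checked locally. Below is a minimal sketch using Python's standard urllib.robotparser, fed with the two rule sets quoted above. The crawler name "ExampleBot" and the three test paths are made up for illustration, and this only shows how one standard parser reads the rules, not how archive.org's crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# The two rule sets quoted in the post.
CURRENT_RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /cgi/
"""

OLD_RULES = """\
# Default /robots.txt File for all Community Architect Partner pages

User-agent: *
Disallow: /cgi-bin/
"""


def allowed(rules, path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under the given rules.

    "ExampleBot" is a hypothetical crawler name; both rule sets use
    "User-agent: *", so any name is treated the same way.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


# Hypothetical paths, chosen only to exercise each rule.
for path in ("/index.html", "/cgi/form.pl", "/cgi-bin/form.pl"):
    print(path, "current:", allowed(CURRENT_RULES, path),
          "old:", allowed(OLD_RULES, path))
```

A Disallow line is a path prefix, so "/cgi/" only excludes URLs whose path starts with /cgi/. Ordinary pages are treated the same under both versions of the file.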

I assume it is because it's an ad-funded site and they get no revenue from
such caches; no remote linking is allowed, presumably for the same reason.
1/ If I have access to that directory, which I doubt, would changing it
back to the earlier version corrupt operations? robots.txt has not appeared
in any directory listing I've made while I had upload/download access.

2/ On another site, accessed via www, there is no robots.txt. Does that
mean it should turn up on archive.org (if it is aware of the site and page,
that is)?
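As a rough illustration of the "no robots.txt" case: the sketch below gives Python's standard urllib.robotparser an empty rule set, which is an assumption standing in for a site that serves no robots.txt at all. The crawler name "ExampleBot" and the path are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def allowed_without_rules(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` when no rules exist.

    An empty list of lines is used here as a stand-in for a missing
    robots.txt; "ExampleBot" is a hypothetical crawler name.
    """
    parser = RobotFileParser()
    parser.parse([])  # no User-agent or Disallow lines at all
    return parser.can_fetch(agent, path)


print(allowed_without_rules("/index.html"))
```

With no rules, nothing is excluded, so a crawler that follows the robots convention is free to fetch any page. That only removes the exclusion; whether archive.org holds a copy still depends on whether its crawler ever visited the page, which this sketch cannot show.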
