[rt-users] Google crawlers.

Mark Jenks mark.jenks at iodincorporated.com
Thu Dec 9 08:34:20 EST 2010


I guess I'm going with the DNS blocking method, since I don't see where
to put a robots.txt file in RT. I'll have to do some custom conf.d work
when I get back into the office next week.

 

-Mark

 

From: Jason Ledford [mailto:jledford at biltmore.com] 
Sent: Thursday, December 09, 2010 7:10 AM
To: Mark Jenks
Cc: RT-Users at lists.bestpractical.com
Subject: RE: Google crawlers.

 

http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=164734

If you own the site, you can verify your ownership in Webmaster Tools
and use the verified URL removal tool to remove an entire directory from
Google's search results.

Before using the URL Removal Tool, you must use robots.txt to block
crawler access to the directory
<http://www.google.com/support/webmasters/bin/answer.py?answer=156449>
(or, if you're removing a site, to your whole site). (For more
information about blocking search engines from confidential information,
see Blocking Google
<http://www.google.com/support/webmasters/bin/answer.py?answer=93708>.)
Returning a 404 HTTP status code isn't enough, because it's possible for
a directory to return a 404 status code, but still serve out files
underneath it. Using robots.txt to block a directory ensures that all of
its children are disallowed as well.
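
The directory rule described above can be checked locally before relying on it. The following is a minimal sketch using Python's standard urllib.robotparser module; the host rt.example.com and the /private/ and /public/ paths are hypothetical, chosen only for illustration, and are not taken from the original message.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body: block every crawler from /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The directory itself, a file beneath it, and an unrelated path.
    for path in ("/private/", "/private/a/b.html", "/public/index.html"):
        url = "http://rt.example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

To block a whole site rather than one directory, the rule would be "Disallow: /".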

Once you have completed one of the steps above, you can request removal
of the directory and all of its contents from search results using the
URL Removal Tool in Webmaster Tools.

1.	On the Webmaster Tools home page, click the site you want.
2.	On the Dashboard, click Site configuration in the left-hand
navigation.
3.	Click Crawler access, and then click Remove URL.
4.	Click New removal request.
5.	Type the URL of the directory you want removed from search
results and then click Continue. (See How to find the right URL:
<http://www.google.com/support/webmasters/bin/answer.py?answer=63758>.)
Note that the URL is case-sensitive: you will need to submit the URL
using exactly the same characters and the same capitalization that the
site uses. If you want to remove the whole site, you can leave this
blank.
6.	Click Remove directory.
7.	Select the checkbox to confirm that you have completed the
requirements listed in this article, and then click Submit Request.

Be careful when requesting removal of a site. The only reason you should
request a site removal is when you want all the contents of a site
permanently removed from Google's index. 

Removing https://www.example.com will also remove
http://www.example.com, as well as http://example.com and
https://example.com.

If you're worried that your site may have a penalty, or you want to
start from scratch after purchasing a domain from somebody else, we
recommend filing a reconsideration request
<http://www.google.com/support/webmasters/bin/answer.py?answer=35843>
letting us know what you're worried about and what has changed. If your
site has been hacked, check this article for recommendations:
<http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=163633>

 

 

 

From: rt-users-bounces at lists.bestpractical.com
[mailto:rt-users-bounces at lists.bestpractical.com] On Behalf Of Mark
Jenks
Sent: Thursday, December 09, 2010 7:37 AM
To: rt-users
Subject: [rt-users] Google crawlers.

 

I posted some links here when I was having problems with RT3. Now
Google is showing those links in its results and is trying to crawl my
site.

 

I tried to drop robots.txt into the html folder, but it goes to the
logon screen instead of showing me the robots.txt file.
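
One way to confirm the symptom described above is to fetch /robots.txt and see whether the response looks like a robots file or like a login page. This is only a rough sketch: the function names and the heuristic (status 200, a text/plain content type, and at least one User-agent line) are assumptions made for illustration, not anything RT itself provides.

```python
import urllib.request


def looks_like_robots_txt(status, content_type, body):
    """Heuristic: a 200 text/plain response containing a User-agent line."""
    if status != 200:
        return False
    if not content_type.lower().startswith("text/plain"):
        return False
    return any(
        line.strip().lower().startswith("user-agent:")
        for line in body.splitlines()
    )


def check_robots(base_url):
    """Fetch base_url + /robots.txt and apply the heuristic above.

    urlopen follows redirects, so a redirect to an HTML login page is
    reported as False. HTTP error statuses raise urllib.error.HTTPError.
    """
    url = base_url.rstrip("/") + "/robots.txt"
    with urllib.request.urlopen(url) as resp:
        content_type = resp.headers.get("Content-Type", "")
        body = resp.read().decode("utf-8", errors="replace")
        return looks_like_robots_txt(resp.status, content_type, body)
```

Calling check_robots with the site's base URL returns False when the login screen is served in place of the file.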

 

What do I need to do? Could we remove my original post from
gossamer? Or how do I get the robots.txt file to work?

 

I know they can only get to the front screen of RT without logging in,
but I don't want it to show up in Google search at all.

 

Thanks!

 

-Mark

 

 

Mark Jenks

Network Administrator

 

1030 Ontario Road  Green Bay, WI 54311  p: 920.406.3702

 

mark.jenks at iodincorporated.com

 


Electronic Privacy Notice. This e-mail, and any attachments, contains
information that is, or may be, covered by electronic communications
privacy laws, and is also confidential and proprietary in nature. If you
are not the intended recipient, please be advised that you are legally
prohibited from retaining, using, copying, distributing, or otherwise
disclosing this information in any manner. Instead, please reply to the
sender that you have received this communication in error, and then
immediately delete it. Thank you in advance for your cooperation.

