Get an email when Google visits
When Googlebot comes (if it does!) to a page that contains this code, an email is sent to the address you specify.
Insert the PHP code below into every page where you want to track whether Google is visiting.
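<?php
// Change the following to your email address
$email = "myemail@something.com";
// Case-insensitive check for "googlebot" in the user agent.
// stripos() is used because eregi() was removed in PHP 7.
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
if (stripos($agent, "googlebot") !== false)
{
    mail($email, "Googlebot detected",
        "Google has crawled: " . $_SERVER['REQUEST_URI']);
}
?>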
If you include a file like header.php or footer.php everywhere, you can simply place this code in one of those files to track all those pages. MT users can place it in their template and all their pages will be tracked.
PERL users can use:
July 16th, 2003 at 11:12 pm
From scriptygoddess When googlebot comes (if it does!) to pages that contains this code, you get an e-mail sent to the address you specify….
July 17th, 2003 at 7:57 am
I get a parse error every time I call this piece of code on my site.
It seems that the line “$email ” is failing.
Any ideas?
July 17th, 2003 at 8:00 am
Thanks to the editors for inserting that bug
Change
$email = myemail@something.com;
to
$email = "myemail@something.com";
July 17th, 2003 at 8:04 am
Had I been paying attention this morning, I would’ve noticed that!
Thanks for the quick response.
July 17th, 2003 at 8:17 am
Go get some caffeine
July 17th, 2003 at 8:28 am
“Thanks to the editors for inserting that bug :D”
Actually I posted exactly what you submitted (I’ll forward it to you).
Post is now updated with your correction.
July 17th, 2003 at 8:41 am
Reading the scriptygoddess I saw a post authored by Jayant Kumar Gandhi. In this post, Jayant states how using simple php, you can make your server email you whenever the famed "GoogleBot" visits your page. This was particularly interesting t…
July 17th, 2003 at 9:18 am
ah ya. probably it was just sleep taking over me when i posted it late @ night.
July 17th, 2003 at 9:54 am
hmmmm. only one thing. what happens when google comes in and visits about 800+ pages, or in my case 35000+ pages! you go to check your mail and BANG! mailserver explodes!
July 17th, 2003 at 11:05 am
Just do it for your main page or important ones or new ones.
wot say
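Another way to keep a large crawl from flooding your inbox is to limit how often the email goes out. The following is only a minimal sketch, assuming PHP 7 or later and a writable temp directory; the function name should_notify(), the timestamp file name, and the one-hour interval are all made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper (not from the original post): returns true when at
// least $minIntervalSeconds have passed since the time stored in $stampFile,
// and records $now as the new "last sent" time when it does.
function should_notify(string $stampFile, int $minIntervalSeconds, int $now): bool
{
    if (is_file($stampFile)) {
        $last = (int) file_get_contents($stampFile);
        if ($now - $last < $minIntervalSeconds) {
            return false; // too soon since the last email
        }
    }
    file_put_contents($stampFile, (string) $now, LOCK_EX);
    return true;
}

// Change the following to your email address
$email = "myemail@something.com";
// Assumed location for the timestamp file; any writable path would do.
$stampFile = sys_get_temp_dir() . '/googlebot_last_mail.txt';
$agent = $_SERVER['HTTP_USER_AGENT'] ?? '';

// Send at most one email per hour (3600 seconds), however many pages are crawled.
if (stripos($agent, 'googlebot') !== false && should_notify($stampFile, 3600, time())) {
    mail($email, "Googlebot detected",
        "Google has crawled: " . ($_SERVER['REQUEST_URI'] ?? ''));
}
```

With this in place you would only learn about the first page hit in each hour, so it trades detail for a quieter mailbox.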
July 18th, 2003 at 10:30 am
This is a pretty cool script that will email you when the googlebot indexes your pages. I think I may try to add this script this weekend, I’ll let you know how it goes. Nope I don’t have a life….
July 21st, 2003 at 4:26 pm
Well done scripties, this is really useful.
There is one thing that the MovableType users will need to remember though: it’s so trivial that we will easily forget it.
To add the PHP code to the MT templates you need to make sure that MT generates its files (or at least the main index) as PHP files and not simply as HTML. That way the PHP code is executed instead of being returned as text.
In the MT administration area, click on “Templates > Main Index” and change the “Output File” to “index.php” if it’s “index.html”. If you call it “main.html” or indeed anything else, just make sure that the file’s extension is “.php” and test it out just to make sure.
If you’re using Linux and have access to the “wget” program, you can have it pose as Googlebot by running the following:
wget -U googlebot http://www.put-your-site-url-here.com
That will let you know that the test is working.
ps: you *might* need to remove the old “index.html” file (I recommend you rename it first) in case your web server looks for “.html” files before it looks for “.php”. And remember, you *do not* need to change your archived files: they can remain as “.html” if you have many of them.
July 21st, 2003 at 4:36 pm
The scriptygoddesses have a very simple and very effective bit of PHP code to let you know that Google has crawled your site. It depends on your webserver running PHP and that you’ve access to sendmail (but you *may* be…
July 25th, 2003 at 1:31 pm
Does it work in a .html file? What if the web host doesn’t support PHP or Perl? Can I still get notified when Googlebot comes?
July 25th, 2003 at 1:58 pm
To Thomas:
Normally no, this code (in either PHP or Perl) will not work in a .html file. Web servers know to execute code that lives inside .php files *before* the pages are sent out to your web browser, and this is how the web server is able to send you the email to tell you that Google has visited. This is known as “server-side” code because it’s code that runs on the server. If your web server is not set up to run code and only serves up .html files, then this won’t work for you.
The other code that works on the web is JavaScript or VBScript. In that case, though, the code runs in your web browser, on the browser’s machine and not on the server. Your browser’s machine doesn’t have any way of knowing what other browsers have come to the website. Other browsers in this case include the GoogleBot, so there’s no easy way to make this work with client-side JavaScript or VBScript.
[I say no easy way, but it would be possible if your web server was set up to be frightfully insecure and there's no way I'm going to get into how to do that. You'll be so much better off finding a server with PHP or a hosting site that allows PHP instead.]
/e
July 26th, 2003 at 12:24 am
It may be possible using SSI if your host allows it. You can exec the sendmail program with the parameters of the email.
I am not sure how one would check whether the user agent is Google or not.
Perhaps you can make a shell script or an executable program and pass it all the related parameters, and that shell script/program will invoke sendmail.
At present I have no plans to implement it myself. If a lot of people request it, I might think about it.
July 26th, 2003 at 5:37 pm
Would there be any way to have the email also include the user agent and IP address?
I’m sort of using this as a notification of when Google visits so I know how often they’re crawling my site. I, for one, don’t want my site crawled and would like to have the IPs of the Google crawlers so that I can ban them.
Any other suggestions to aid me in this task?
July 26th, 2003 at 11:22 pm
I haven’t perused Scriptygoddess lately so I’m missing out on all kinds of little gems like this handy little piece of code which notifies you by email when a Googlebot visits your site and this code which allows you to…
July 27th, 2003 at 12:18 am
In PHP $_SERVER['HTTP_USER_AGENT'] will have the user-agent while $_SERVER['REMOTE_ADDR'] has the IP address.
The better option is:-
See http://www.google.com/bot.html
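As a rough sketch of how the two $_SERVER values mentioned above could be added to the email, the snippet below builds the message body in a small helper. It assumes PHP 7 or later; the function name build_crawl_report() and the "(unknown)" fallback text are invented for this example and do not come from the original script.

```php
<?php
// Hypothetical helper (not from the original post): builds the email body
// from a $_SERVER-style array, falling back to "(unknown)" for missing keys.
function build_crawl_report(array $server): string
{
    $uri   = $server['REQUEST_URI'] ?? '(unknown)';
    $agent = $server['HTTP_USER_AGENT'] ?? '(unknown)';
    $ip    = $server['REMOTE_ADDR'] ?? '(unknown)';

    return "Google has crawled: $uri\n"
         . "User agent: $agent\n"
         . "IP address: $ip";
}

// Change the following to your email address
$email = "myemail@something.com";

// Same check as the main script, with the extra details in the body.
if (stripos($_SERVER['HTTP_USER_AGENT'] ?? '', 'googlebot') !== false) {
    mail($email, "Googlebot detected", build_crawl_report($_SERVER));
}
```

Keep in mind that the user agent is supplied by the visitor, so anything can claim to be Googlebot; the IP address in the email is what you would check before banning anyone.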
July 27th, 2003 at 4:15 pm
It’s too bad that the Googlebot doesn’t seem to obey those tips.
I use both a robots.txt and the META tag for robots to no avail.
GoogleBot just doesn’t seem to care!
August 7th, 2003 at 7:24 am
Originally found at Scripty Goddess, here’s a code snippet you can add to your pages to tell you when the GoogleBot has found your site.
<?php
//Change the following to your email address
$email = “myemail@something.com”;
if(eregi(”googleb…
August 10th, 2003 at 7:29 pm
While visiting scriptygoddess I came across this script that gives you the ability to…
August 11th, 2003 at 12:05 am
Adi’s trying to see if Google crawls his site often. So we both installed this code snippet from :SG:. Since Google keeps hitting my site, I’m going to see if I can redirect them to him by the force of…
October 30th, 2003 at 10:28 pm
Well, first off, I’ve been reading over this PHP stuff and looking at the code. Some of it I get, but mostly I’m lost. I have a web server called Savant that I use to host my web site, and I’ve been trying to do some of the things Fil has posted, but I can’t get it to work. If you could tell me how to get it working, I would owe you one. I have a cgi-bin and can add whatever I like to it. What I would like to do is, when someone comes to my site, send them an email thanking them for visiting. They would know this email is coming, so it is not spam. If you could tell me how to go about doing this, that would be great. I’ve been doing HTML and have not gotten into all this other stuff, but I AM NOW. Thanks. You can email me at itsmcfly32@ma.rr.com
November 3rd, 2003 at 2:13 pm
One of my friends wanted me to do this for him. I thought I would share this with you too. See the following PHP code to achieve it. <?php if(eregi(”googlebot”,$_SERVER['HTTP_USER_AGENT'])) { mail(”myemail@something.com”, “Googlebot detected”, “…
November 5th, 2003 at 7:05 pm
See what I’ve done? The side bars (just left for now) are collapsable, and not in the stupid way that the blogrolling is collapsable… It actually doesn’t transmit the data if it doesn’t need to, meaning an even quicker download…
January 5th, 2004 at 11:14 pm
Where would you put the following pieces of info in the script?
$_SERVER['HTTP_USER_AGENT']
$_SERVER['REMOTE_ADDR']
Thanks!
January 6th, 2004 at 12:47 am
I’ve updated my RSS feeds. You can now choose from posts only or posts with comments. Although I’m not sure the posts with comments feed is working. I used Jennifer’s template and I just copied and pasted it, but still…
January 6th, 2004 at 1:06 pm
You mean to have those pieces of information mailed to you too?
January 6th, 2004 at 4:59 pm
Yes, but I figured out a better way. I just denied from googlebot.com in my .htaccess file. But thanks to the script, I knew they were crawling my site, so I could block them.
Thank you anyway!!
March 18th, 2004 at 2:47 pm
I would like to comment for Perl users.
$email = myemail@something.com;
After that is changed into:
$email = "myemail@something.com";
Be sure to add a "\" in front of the @ sign. Otherwise, it will not work. The end result would be:
$email = "myemail\@something.com";
Computer Central