scriptygoddess

18 Jan, 2003

Parse search keywords from referrer

Posted by: Jennifer In: Scripts

(POST UPDATED 8:07pm)
Here's a script snippet. As I said previously, I've had a number of requests for a script that strips the search keywords out of a search engine referrer. The code is taken exactly from MT RefSearch (I just copied and pasted this one function here).

Click here to see just the Extract Keyword function from MT RefSearch. (It returns an array with each keyword as one element in the array)

Here's one possible use…to print out each keyword from the referrer:
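<?php
// The referrer is sent by the visitor's browser, so treat it as untrusted input
$url = getenv("HTTP_REFERER");
$keywords = ExtractKeywords($url);
// ExtractKeywords() returns an array, so check for a non-empty array
if (is_array($keywords) && count($keywords) > 0) {
    foreach ($keywords as $value) {
        // Escape the keyword before printing it into the page
        echo "Value: " . htmlspecialchars($value) . "<br>\n";
    }
}
?>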
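
For anyone curious what a function like this does under the hood, here is a minimal sketch. It is not the MT RefSearch code: the name sketch_extract_keywords() and the short list of query parameter names are assumptions made for illustration, and MT RefSearch handles far more search engines than this would.

```php
<?php
// Hypothetical sketch, NOT the MT RefSearch function.
// It pulls the search phrase out of a referrer URL and splits it into words.
function sketch_extract_keywords($referrer)
{
    // Assumed parameter names; real search engines use many others.
    $paramNames = array('q', 'query', 'p');

    if (!is_string($referrer) || $referrer === '') {
        return array();
    }

    // parse_url() returns null when there is no query string
    // and false when the URL is badly malformed.
    $queryString = parse_url($referrer, PHP_URL_QUERY);
    if (!is_string($queryString)) {
        return array();
    }

    // parse_str() URL-decodes the values, so "+" and "%20" become spaces.
    parse_str($queryString, $params);

    foreach ($paramNames as $name) {
        if (isset($params[$name]) && is_string($params[$name])) {
            // Split on runs of spaces or tabs and drop empty pieces.
            return preg_split('/[ \t]+/', trim($params[$name]), -1, PREG_SPLIT_NO_EMPTY);
        }
    }

    return array();
}
```

Called with a referrer such as http://www.google.com/search?q=php+referrer+keywords&hl=en, this sketch should return the three words php, referrer and keywords. A referrer with no recognised parameter gives an empty array.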

4 Responses to "Parse search keywords from referrer"

1 | Richy C.

January 18th, 2003 at 7:26 pm

I take it you are aware that MT RefSearch does a similar thing and copes with around 240 search engines, including ones which can have 'multiple parameters' in the search field?

The code is reasonably documented – so if you want to take a peek – be my guest!

2 | Jennifer

January 18th, 2003 at 8:02 pm

Actually – I have MT RefSearch working here too – hadn't even thought to look to see if it had separated out a function that would do that… Changing the code above.
To download the full code to MT RefSearch – go here.

3 | Jason

May 25th, 2003 at 4:31 pm

I have been looking everywhere to find a function to strip out the keywords from the search engine's referring URL and looked at the code you have here on this site, but I seem to get lost at the end. See, what I need it to do is just…instead of writing to a flat file or log file, I want to store the keywords in a MySQL database so that I can include it with a stats program I have been working on. Any tips? Anything would be appreciated!

4 | RaulentRoi

July 11th, 2003 at 3:55 pm

Thank you goddess for this code. It works just fine and I couldn't have done it without you.

I did have a couple of problems with it having to do with French accents. I also chose to clean the keywords when some unwanted characters were in them, like dots or commas, and I eliminated all the one-letter words. This is what it turned out like (this is rough-looking code, but I'm just a beginner):

//*First I query the referral URLs in my database, from a column I named URLJustVisited

$URLJustVisited = mysql_result($result,$i,"URLJustVisited");
$keywords = ExtractKeywords($URLJustVisited);
foreach ($keywords as $value) {
//*Skip empty keywords ($keywords itself is an array, so the test has to be on $value)
if ($value != "") {

//*Now I clean up the unwanted characters – it may seem strange to replace a character by itself, but it does have an effect on the outcome
$value1 = str_replace("+", " ", $value);
$value2 = str_replace("-", "", $value1);
$value3a = str_replace("qu'", "", $value2);
$value3b = str_replace("d'", "", $value3a);
$value3c = str_replace("l'", "", $value3b);
$value3d = str_replace("n'", "", $value3c);
$value3e = str_replace("m'", "", $value3d);
$value3f = str_replace("j'", "", $value3e);
$value3g = str_replace("t'", "", $value3f);
$value3h = str_replace("'", "", $value3g);
$value4 = str_replace("é", "é", $value3h);
$value5 = str_replace("é", "é", $value4);
$value6 = str_replace("â", "â", $value5);
$value7 = str_replace("ç", "ç", $value6);
$value8 = str_replace("ç", "ç", $value7);
$value9 = str_replace("è", "è", $value8);
$value10 = str_replace("è", "è", $value9);
$value11a = str_replace(":", "", $value10);
$value11b = str_replace(".", "", $value11a);
$value11c = str_replace(";", "", $value11b);
$value11d = str_replace(",", "", $value11c);
$value12 = str_replace("\"", "", $value11d);
$value13 = str_replace(" ", " ", $value12);
print "$value13<br>";

//*Now I clear all one letter results since they are not keywords

if (strlen($value13) > 1) {

//********************************
//*Now I proceed with the insert – update database with $value13
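
For what it's worth, the chain of replacements above could probably be condensed into one helper. This is only a sketch: clean_keyword() is a hypothetical name, the accent replacements are left out because their exact search strings would have to be supplied, and collapsing repeated spaces at the end is an assumption about what the last replace was meant to do.

```php
<?php
// Sketch only: condenses the clean-up steps into a single helper.
// clean_keyword() is a hypothetical name, not part of MT RefSearch.
function clean_keyword($value)
{
    // str_replace() applies the search strings in order, each one to the
    // result of the previous step, just like the chain of separate calls.
    $search = array(
        '+', '-',
        "qu'", "d'", "l'", "n'", "m'", "j'", "t'", "'",
        ':', '.', ';', ',', '"',
    );

    // Only "+" becomes a space. When the replacement array is shorter than
    // the search array, the remaining searches are replaced with ''.
    $replace = array(' ');

    $value = str_replace($search, $replace, $value);

    // Assumption: collapse runs of spaces into one and trim the ends.
    return trim(preg_replace('/ +/', ' ', $value));
}
```

With this sketch, clean_keyword("qu'est-ce+que,+php") gives "estce que php". Note that strlen() counts bytes, so an accented one-letter word stored as UTF-8 would pass a strlen() > 1 test.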

I had a problem while inserting into the database: a "duplicate entry" warning (on die), despite the fact that I queried to make sure that $value did not already exist prior to attempting to insert it in the database.

Why this happens is still a mystery to me. I got around it by disabling the "OR die", which is not a big problem for my usage since it concerns very few words that are very unlikely to require analysis.
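
One possible cause, though there is not enough code shown to confirm it, is that the existence check ran on the raw $value while the insert used the cleaned $value13, so two different raw values could clean down to the same keyword. Whatever the cause, an alternative to silencing the error is to let MySQL resolve the duplicate itself with INSERT ... ON DUPLICATE KEY UPDATE. The sketch below makes several assumptions: the table name keywords, its columns keyword and hits, and the function name store_keyword() are all hypothetical, and it uses PDO rather than the mysql_* functions in the code above.

```php
<?php
// Sketch only. Assumes a hypothetical table such as:
//   CREATE TABLE keywords (
//       keyword VARCHAR(100) NOT NULL PRIMARY KEY,
//       hits    INT NOT NULL
//   );
// $pdo is expected to be a PDO connection to a MySQL database.
function store_keyword(PDO $pdo, $keyword)
{
    // Insert a new row, or bump the counter if the keyword already exists.
    // The unique key on "keyword" is what triggers the UPDATE branch.
    $sql = 'INSERT INTO keywords (keyword, hits) VALUES (:keyword, 1)
            ON DUPLICATE KEY UPDATE hits = hits + 1';

    // A prepared statement keeps the keyword out of the SQL text itself.
    $stmt = $pdo->prepare($sql);

    return $stmt->execute(array(':keyword' => $keyword));
}
```

Because the check and the write happen in one statement, there is no gap between "does it exist?" and "insert it" in which a duplicate can slip through.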

Finally, I added a delete command to get rid of a bunch of words that are not keywords (the, this, at, for, …).
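
A different way to get the same result, instead of deleting afterwards, would be to filter those words out before they ever reach the database. This is a rough sketch: remove_stop_words() is a hypothetical name and the word list is only the four examples mentioned above.

```php
<?php
// Sketch only: drops common words before the keywords are stored.
// The stop-word list is a tiny illustrative sample, not a complete one.
function remove_stop_words(array $keywords)
{
    $stopWords = array('the', 'this', 'at', 'for');

    $kept = array();
    foreach ($keywords as $word) {
        // Compare in lower case so "The" and "the" are treated alike.
        if (!in_array(strtolower($word), $stopWords, true)) {
            $kept[] = $word;
        }
    }

    return $kept;
}
```

For example, remove_stop_words(array('The', 'best', 'php', 'for', 'stats')) leaves best, php and stats.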

Thanks again.
