Using Perl, Cookies and MySQL to Record Web Site Visitor’s Data

I have built several web sites and applications in the past, and I have to admit that I am a data junkie. If there is user data or any other data to be captured when someone uses my application or visits a web page, then I like to capture it (I can always delete it later). Of course, you can use a third-party Javascript plug-in for web site analytics (which I also use), but I like to gather information myself.

Most of the pages that I have written use Perl – even for the home page when possible (yes, I am “old school”, and my lack of using modern Perl code is evident). Each time a user clicks to go to another page, I like to capture as much information as possible about their visit.

Perl, PHP and other scripting languages have ways for you to capture the information about your web page visitor. For Perl, this simple script will tell you a lot of information the visitor and your server. In order for this script to work, you have to call it from a web browser.


#!/usr/bin/perl -w
use strict;
use CGI qw(:standard);
use CGI::Carp qw(warningsToBrowser fatalsToBrowser);

print header;
print start_html("Environment");

foreach my $key (sort(keys(%ENV))) {
print "$key = $ENV{$key}
\n";
}

print end_html;

I took the above script and placed it in my webserver’s cgi-bin directory, made it executable, and opened it in my browser. I added a “?test=1234567″ at the end of the script so that you could see the value of the REQUEST_URI variable.

Here is the web output:


DOCUMENT_ROOT = /Library/WebServer/Documents
GATEWAY_INTERFACE = CGI/1.1
HTTP_ACCEPT = text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
HTTP_ACCEPT_CHARSET = ISO-8859-1,utf-8;q=0.7,*;q=0.7
HTTP_ACCEPT_ENCODING = gzip, deflate
HTTP_ACCEPT_LANGUAGE = en-us,en;q=0.5
HTTP_CONNECTION = keep-alive
HTTP_COOKIE =
HTTP_HOST = 192.168.1.2
HTTP_USER_AGENT = Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
PATH = /usr/bin:/bin:/usr/sbin:/sbin
QUERY_STRING = test=1234567
REMOTE_ADDR = 192.168.1.5
REMOTE_PORT = 60398
REQUEST_METHOD = GET
REQUEST_URI = /cgi-bin/mysql/perl_env.pl?test=1234567
SCRIPT_FILENAME = /Library/WebServer/cgi-bin/mysql/perl_env.pl
SCRIPT_NAME = /cgi-bin/mysql/perl_env.pl
SERVER_ADDR = 192.168.1.2
SERVER_ADMIN = email_address@scriptingmysql.com
SERVER_NAME = 192.168.1.2
SERVER_PORT = 80
SERVER_PROTOCOL = HTTP/1.1
SERVER_SIGNATURE =
SERVER_SOFTWARE = Apache/2.2.17 (Unix) mod_ssl/2.2.17 OpenSSL/0.9.7l DAV/2 PHP/5.2.15

This is a lot of information, and you probably don't need or want most of it. You will want to select information that you can actually or potentially use. For example, if you want to capture their computer info, HTTP referral and remote IP address, in your Perl script. you would create variables in your Perl script like this:


$HTTP_USER_AGENT = $ENV{'HTTP_USER_AGENT'};
$REMOTE_ADDR = $ENV{'REMOTE_ADDR'};
$HTTP_REFERER = $ENV{'HTTP_REFERER'};

You can then insert this information into your MySQL database, along with an "action" of what they are doing on the page. I create an activity table to record what the user is doing on various web pages. If they are visiting your dashboard page, you might set their "activity" to "Dashboard Visit". I also include a date and time stamp. A sample activity table might look like this:


CREATE TABLE `activity` (
`Customer_Serial_Number` varchar(15) NOT NULL,
`Action` varchar(50) NOT NULL,
`Date_Time` datetime NOT NULL,
`Info` varchar(50) NOT NULL,
`Notes` varchar(100) NOT NULL,
`HTTP_USER_AGENT` varchar(150) NOT NULL
`REMOTE_ADDR` varchar(15) NOT NULL
`HTTP_REFERER` varchar(300) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8

If the web site requires a user name and password to login, I also use cookies - which is usually how I retrieve the Customer_Serial_Number and/or any other personal information that they have provided to me. Or, if they fill out a "contact us" form, you can save their name and information in a cookie to be retrieved when they visit your web site later. Just be sure to set the cookie expiration date to a date in the future, or to have the cookie never expire.

Here is a quick way to create a cookie in Perl. In this example, we are creating two cookies, Customer Serial Number and Customer Email, both of which we have retrieved from MySQL after they logged into the site:


   # ===========
   # set cookie for Customer Serial Number
   # ===========
   $customer_serial_number = CGI::Cookie->new(-name => "CustSN",
      -value => $Customer_Serial_Number,
      -expires => "+15m",
      -domain => ".mydomain.com",
      -path => "/"
   );

   # ===========
   # set cookie for Customer Email
   # ===========
   $Email = CGI::Cookie->new(-name => "CustE",
      -value => "$Customer_Email_Address",
      -expires => "+1y",
      -domain => ".mydomain.com",
      -path => "/"
   );

# create the cookie for Customer Serial Number and Customer Email
$q = new CGI;
print $q->header(-cookie => [$customer_serial_number, $Email] );

Here are the definitions for the cookie variables (information from http://perldoc.perl.org/CGI/Cookie.html.


CSN - the name of the cookie. You will use this name to retrieve the cookie later.
value - the value of the cookie - the value can be a scalar, an array reference, or a hash reference.
expires - how long you want the cookie to be available. Here are some settings:
   - Cookie expires 10 seconds from now +10s
   - Cookie expires 10 minutes from now +10m
   - Cookie expires 10 hours from now +10h
   - Cookie expires 10 days from now +10d
   - Cookie expires 10 months from now +10M
   - Cookie expires 10 years from now +10y
   - to make it expire immediately, set it to a negative number, such as: -10M
domain - points to a domain name or to a fully qualified host name. If not specified, the cookie will be returned only to the web server that created it.
path - points to a partial URL on the current server. The cookie will be returned to all URLs beginning with the specified path. If not specified, it defaults to '/', which returns the cookie to all pages at your site.

There are a couple of other variables that you can use, so refer to the Perl cookie page for more information.

Deciding which information that you want to keep is up to you. Then, you can simply create a MySQL table to store this information. When a user registers on your site, you can store their email address in a cookie that never expires, so the next time they login, you can auto-populate the login form with their email address.

You can also redirect mobile users to a web page that is specific to their screen size. For example, when I connect to this script with my iPhone, I can tell that the connection is from an iPhone via the HTTP_USER_AGENT:


HTTP_USER_AGENT = Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_2_10 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8E600 Safari/6533.18.5


However, my iPhone's remote IP address is almost useless:


REMOTE_ADDR = 10.1.10.10


So, your Perl script can then detect that this user is on an iPhone, and then it can redirect them with a meta refresh tag to a version of the web page specifically designed for the iPhone screen resolution.

One web site that I designed required a user to login to get past the home screen. Once they logged in, I would grab their serial number from the MySQL database, and then create a cookie with the serial number value. With each page they visited, I would send a few bits of information to my MySQL visitor database.

Here is part of the Perl script that I would use to record customer information: (information on inserting data into MySQL via Perl may be found in an earlier post).


#!/usr/bin/perl -w
use CGI qw(:standard);
use CGI::Carp qw(warningsToBrowser fatalsToBrowser);

# ----------------------------------------------------------------------------------
# grab the environment variables
# ----------------------------------------------------------------------------------

$HTTP_USER_AGENT = $ENV{'HTTP_USER_AGENT'};
$REMOTE_ADDR = $ENV{'REMOTE_ADDR'};

# ----------------------------------------------------------------------------------
# fetch the cookie for the Customer Serial Number
# ----------------------------------------------------------------------------------

%cookies = CGI::Cookie->fetch;
if ($cookies{CustSN}) {
$Customer_Serial_Name = $cookies{CustSN}->value;
}

# ----------------------------------------------------------------------------------
# grab date and time
# ----------------------------------------------------------------------------------

my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime time;

$year = $year + 1900;
$mon = $mon + 1;

# add a zero if the value is less than 10

if ($sec < 10) { $sec = "0$sec"; }
if ($min < 10) { $min = "0$min"; }
if ($hour < 10) { $hour = "0$hour"; }
if ($mday < 10) { $mday = "0$mday"; }
if ($mon < 10) { $mon = "0$mon"; }
if ($year < 10) { $year = "0$year"; }

$DateTime = "$year-$mon-$mday $hour:$min:$sec";

# ---------------------------------------------------------------------------------
# insert customer activity into database
# ---------------------------------------------------------------------------------

$Action = "Dashboard Visit";

$dbh = ConnectToMySql($Database);

$query = "insert into Activity (Customer_Serial_Number, Action, Date_Time, HTTP_USER_AGENT, REMOTE_ADDR) values (?,?,?,?,?)";

$sth = $dbh->prepare($query);

$sth->execute('$Customer_Serial_Number', '$Action', '$DateTime', '$HTTP_USER_AGENT', '$REMOTE_ADDR');

# ---------------------------------------------------------------------------------
# ---------------------------------------------------------------------------------

Once you have enough customer data, you can figure out the paths that your customers are taking when visiting your site. Do they login and go straight to the forum page? Or if you have a web store, do they go straight to the clearance page? Or are they logging in and then doing nothing? If a customer never visits a certain page, what can you do to drive them to visit that page? Third-party tools can provide good overall web site analytical information, but nothing beats being able to see one particular customer's activity on your own.

 

-----------------------------------------

Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world's most popular open-source database program.

Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.