In today's Daily Golden Nugget, I'm continuing the log file conversation with a more detailed look at referrals.
Here's the same example I showed you yesterday:
66.240.78.172 - - [17/Apr/2015:09:33:14 -0400] "GET /jewelry-catalog/rings.html HTTP/1.1" 200 14896 "http://www.bing.com/search?q=gold+rings+14k&src=IE-SearchBox&FORM=IE8SRC" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0)"
The quoted URL that follows the status code and byte count (the one that begins with http://www.bing.com) shows the referring website that the visitor came from. Typically, the only way for this referral information to be captured in your log file is when someone clicks one of your website links on that referring page.
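If you want to pull the referrer out of a line without squinting, a short script can do it. This is only a minimal sketch in Python, assuming your server writes the same "combined" log layout as the example above; the pattern and the parse_line name are my own illustration, not part of any reporting tool.

```python
import re

# Assumed layout: ip, two unused fields, [time], "request", status, size,
# "referrer", "user agent" -- the same order as the example line above.
COMBINED_LOG = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED_LOG.match(line)
    return match.groupdict() if match else None

sample = ('66.240.78.172 - - [17/Apr/2015:09:33:14 -0400] '
          '"GET /jewelry-catalog/rings.html HTTP/1.1" 200 14896 '
          '"http://www.bing.com/search?q=gold+rings+14k&src=IE-SearchBox&FORM=IE8SRC" '
          '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0)"')

entry = parse_line(sample)
print(entry["referrer"])  # the Bing search URL the visitor came from
```

A line that does not follow this layout simply returns None, so it is worth counting how many lines fail before trusting any totals.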
The easiest example I can give you is when the referral comes from bing.com or google.com. In those cases, we can assume that the visitor clicked on one of your organic search results. Although referrals from search engines might be interesting to look at, they are not that exciting.
It might be more exciting to see referrals from one of the social networks. The referring website will always appear as a fully qualified URL, which includes the http:// or https:// prefix and the full host name.
When you quickly look through your log file, you will find several entries like this one:
173.252.102.112 - - [14/Apr/2015:20:29:43 -0400] "GET /images/log.jpg HTTP/1.1" 200 1271 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
Although you might think this is a referral from Facebook, this is actually what your web server sees when Facebook creates those web page previews while you are sharing a link. Basically, you can ignore anything that says facebookexternalhit.
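Skipping those preview hits is easy to automate. Here is a small Python sketch, assuming you already have the user agent string from each log line; the is_preview_bot name is hypothetical and only checks for the facebookexternalhit marker shown above.

```python
def is_preview_bot(user_agent):
    """True when the user agent identifies Facebook's link-preview fetcher."""
    return "facebookexternalhit" in user_agent.lower()

# The two user agents below are copied from the log entries in this Nugget.
agents = [
    "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)",
    "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0",
]

# Keep only the hits that came from a real browser.
human_visits = [ua for ua in agents if not is_preview_bot(ua)]
print(len(human_visits))
```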
This is what a real Facebook referral looks like in your log file:
108.16.117.32 - - [24/Dec/2014:11:21:08 -0700] "GET / HTTP/1.1" 200 12398 "http://l.facebook.com/l.php?u=http%3A%2F%2Fwww.jewelrystore.com%2F&h=MAQHYXYj3&s=1" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0"
Notice that it is not using "www.facebook.com" but rather "l.facebook.com." The "l" (that's a lowercase L) subdomain is how they control all their links. This particular Facebook referral was from December 24, 2014 and I recognize it as someone clicking the website link from the About box on the Facebook Page.
Here's a more complicated looking referral that was recorded when someone clicked a link to the jwag.biz website:
74.83.126.3 - - [23/Feb/2015:13:42:39 -0700] "GET /newsletters/2015/02/23/identifying-and-dealing-with-a-hacked-website.html HTTP/1.1" 200 57096 "http://l.facebook.com/l.php?u=http%3A%2F%2Fbit.ly%2F1AsyYTP&h=rAQGDMsWrAQE2Ku1RdS6jrrZJDPD72DRUcEyGi-k3QRu9mg&enc=AZNcjhFcpE90UrfTtkooriXAsQInBdvMhVaOX5TUDypw6H7uOl-z1efLZS_NSaNDC4RnFoDjzJOpHww0INdNZIzEMVlv_Y6gMggwQ4DAudaID44igDGu46ZCWc9fG_9P9HneUvBHy5vF2cdGMaDKCso5GcWjKc_HFBn-zwI3GIxscg&s=1" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0"
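In both Facebook referrals above, the link that was clicked is tucked inside the u= parameter in encoded form. The Python sketch below decodes it using the standard urllib.parse module; the facebook_link_target name is my own, and I'm assuming the referrer always uses the l.facebook.com host and the u= parameter seen in these two examples.

```python
from urllib.parse import urlparse, parse_qs

def facebook_link_target(referrer):
    """Return the URL a visitor clicked on Facebook, or None for other referrers."""
    parts = urlparse(referrer)
    if parts.hostname != "l.facebook.com":
        return None
    # parse_qs also decodes the %3A and %2F escapes back into ":" and "/".
    targets = parse_qs(parts.query).get("u")
    return targets[0] if targets else None

print(facebook_link_target(
    "http://l.facebook.com/l.php?u=http%3A%2F%2Fwww.jewelrystore.com%2F&h=MAQHYXYj3&s=1"
))
```

Run against the second, longer referral, the same function returns the bit.ly short link that was shared, which tells you which post the visitor clicked.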
Truthfully, looking at these log file entries can make you go cross-eyed, which is why you need some type of reporting software. I use Webalizer to run simple reports on my log files.
Here are some of the referrals I pulled from my March 2015 referral statistics from jwag.biz:
Hits Referrer
---------------- --------------------
2562 0.56% https://www.google.com/
405 0.09% https://www.google.co.uk/
403 0.09% http://www.google.com/url
399 0.09% http://www.google.com/
246 0.05% https://www.google.ca/
10 0.00% http://l.facebook.com/l.php
1 0.00% http://api.twitter.com/1/statuses/show/572335731965890560.json
My log files recorded 1952 different referring domain names in March 2015. The web server always records referrals when people click from one page to the next while on your website; for that reason you will always see your own domain name as the top referring source. I always find Google and all its different country versions to be one of the other top referral sources.
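If you don't have reporting software handy, a rough tally of referring domains can be sketched in a few lines of Python. This is not how Webalizer works internally; it is just an illustration that assumes you already have a list of referrer strings, and the internal jwag.biz page in the sample list is made up for the example.

```python
from collections import Counter
from urllib.parse import urlparse

def count_referring_domains(referrers, own_domain):
    """Tally referrer host names, skipping blanks and the site's own pages."""
    counts = Counter()
    for ref in referrers:
        host = urlparse(ref).hostname
        if host is None:
            continue  # "-" means the browser sent no referrer
        if host == own_domain or host.endswith("." + own_domain):
            continue  # internal click from one page to the next
        counts[host] += 1
    return counts

referrers = [
    "https://www.google.com/",
    "http://www.google.com/url",
    "https://www.google.co.uk/",
    "http://l.facebook.com/l.php",
    "http://www.jwag.biz/newsletters/",  # hypothetical internal page
    "-",
]

counts = count_referring_domains(referrers, "jwag.biz")
for host, hits in counts.most_common():
    print(hits, host)
```

Filtering out your own domain first keeps the internal page-to-page clicks from drowning out the outside sources you actually care about.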
Evil Within Referral Logs
The internet is a key component in modern marketing and building a successful business. Eventually you, or someone who works for you, will have to learn how to read your website log reports and Google Analytics.
Unfortunately, there are a lot of bad people online and they prey upon those of us who are simply trying to analyze these reports to pull out the important business information. These sleazy people often use your own log files against you for their own nefarious purposes.
Here's an example of one of the sleazy links I found in my own referral log in March.
Hits Referrer
---------------- --------------------
56 0.01% http://sleazyperson.blogspot.de/2011/09/just-like-first-x-men.html
Please note that I slightly changed that URL from the actual one in order to protect you, as you'll soon see.
As you add more blog posts to your site, you will eventually gain links from other bloggers. In my own Daily Nuggets, I link to Wikipedia and Google help docs all the time because they relate to my topic. Similarly, other bloggers will eventually link to your blogs as related sources for their posts.
Sometimes, on a boring Sunday afternoon, I'll look through my list of referrals for other blogs. Blogspot.com is a free blogging platform from Google. Google changes the .com TLD to a different one depending on the location of the reader. While I'm in France I always see Blogspot as "blogspot.fr." The blogspot.de URL shown above tells me that 56 people in Germany clicked on a shared link from a blog titled "Just Like First X-Men."
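Because the same blog can show up under several country endings, it can help to split the blog name from the ending before counting. The Python sketch below is one way to do that; the pattern and the split_blogspot_host name are my own assumptions, and "example" is a placeholder blog name.

```python
import re

# Assumed shape: <blog name>.blogspot.<country ending>, e.g. blogspot.de or blogspot.co.uk
BLOGSPOT_HOST = re.compile(r"^(?P<blog>[a-z0-9-]+)\.blogspot\.(?P<tld>[a-z.]+)$")

def split_blogspot_host(host):
    """Return (blog name, country ending) for a Blogspot host, else None."""
    match = BLOGSPOT_HOST.match(host.lower())
    if not match:
        return None
    return match.group("blog"), match.group("tld")

print(split_blogspot_host("sleazyperson.blogspot.de"))
print(split_blogspot_host("example.blogspot.co.uk"))
print(split_blogspot_host("www.google.com"))
```

Grouping on the blog name alone lets you add up the hits from blogspot.de, blogspot.fr, and blogspot.com as a single referring blog.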
I like Marvel's X-Men, so I decided to take a look at this blog myself. I copied that URL from my report and pasted it into my browser. I always take precautions when visiting random, unknown web pages: I make sure my anti-virus, anti-spam, and firewall software are running and that all cookies are blocked. It's a good thing too, because that particular blog post was actually a link to a Trojan virus that would have infected my computer.
Other than sneaky links to Trojan viruses, your referrer log is also loaded with what I'll call "porn landmines." Many porn websites have ways to artificially flood your referral logs with their own domain names, which look innocuous. Just like my blog example above, if you curiously investigate these sites you will land on a porn site.
In fact, while looking for examples to include in this Nugget I found more fake referrals from porn sites than any other type. My best suggestion for referral investigations is to use Google Chrome in incognito mode or a text-based browser. You also need to account for other people who might see the NSFW websites that could pop up on your computer.
Other Evil Uses of Referral Logs
In yesterday's Nugget I stressed that you should not allow your log files to be available to the public. It's best to configure your website so they are saved in a hidden directory.
Keeping these files hidden will protect you from the following:
1. Your competitors monitoring your traffic
2. Google reading them
3. Industrial espionage
While I don't think a retail jeweler will need to worry about their competitor snooping through log files, just know that it is possible.
On the other hand, Google has been known to access and process website log files as a way to better understand a site, and to find hidden pages that are otherwise not linked to from anywhere.
An experienced webmaster will know that Google can somehow find a hidden page on your site even though it isn't linked to from anywhere. When you carefully read this Google support page you'll see a vague reference to Google being able to read your referrer log.
Google saves these referral links as part of your website's linking profile, and it's theorized that they are then used against you as part of the Penguin Penalty. That's where the espionage comes into play. Let me stress that this is an unsubstantiated theory that few people have ever written about. If it's true, it certainly would be one of Google's own secrets.
All you need to do to protect yourself from this sort of referral log evil is simply hide your log files from everyone.
Referral Spam
In the last few months, the SEO community has been rocked by a lot of conversations about referral spam. That's a huge topic that I'll be writing about in the upcoming weeks. Stay tuned for that.