Tuesday, April 29, 2014

Advanced robots.txt techniques – Crawl Delay

We have a client for whom we have been carrying out search engine optimisation work over the last three months. However, we recently noticed that they had a bit of an issue with their robots.txt file.

Now, for those who don’t know, a robots.txt file sits in the root directory of your website and is there to tell search engines which files they should and should NOT index. (The only real exception to this rule is that some search engines also allow you to provide the address of your XML Sitemap in this file.)

Note: listing a file or directory to exclude in your robots.txt file is no guarantee that those pages will not be indexed by search engines. It is merely a way to indicate that you don’t want the page appearing in Google, Bing/Yahoo, etc. … but it still might.
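As a purely hypothetical illustration (the paths and domain below are made up), a typical robots.txt combining exclusion rules with a Sitemap reference might look like:

```
User-agent: *
Disallow: /admin/
Disallow: /checkout/

Sitemap: https://www.example.com/sitemap.xml
```

The Disallow lines ask crawlers to stay out of those directories, while the Sitemap line points them at the XML Sitemap mentioned above.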

Anyhow, back to this problematic little robots.txt file – here it is, in all its glory:
User-agent: *
Crawl-Delay: 10

The main issue here is with the command “Crawl-Delay: 10”

Note: This was put in automatically by the client’s eCommerce application and not manually by the client.
It is important to know that the “Crawl-Delay” directive is not a representation of crawl rate (e.g. the number of pages crawled at any one time); instead it defines the amount of time (typically from 1 to 30 seconds) that the search engine "bot" will wait between crawling each and every page of your site. This means that the higher the figure, the fewer pages on your site actually get crawled, and therefore indexed, in any given period.

The original purpose of this directive was to stop search engine spiders from tearing through a large site and affecting its performance (too many simultaneous requests could even bring a site to its knees). However, Google publicly states that it does not support the crawl delay directive (although Bing does), so the case for having it there in the first place is significantly reduced.
Moreover, this crawl delay directive poses an issue for some search engine optimisation services, such as Raven Tools, which can’t effectively crawl sites when this single line is present.
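To see how a polite crawler actually interprets that line, here is a minimal sketch using Python’s standard-library robots.txt parser, fed the client’s two-line file quoted above (the 50,000-page figure is an invented example, not the client’s actual page count):

```python
from urllib.robotparser import RobotFileParser

# The problematic robots.txt file, exactly as quoted above.
robots_txt = """User-agent: *
Crawl-Delay: 10
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# crawl_delay() returns the delay (in seconds) a compliant crawler
# should wait between requests, or None when no directive applies.
delay = parser.crawl_delay("*")
print(delay)  # → 10

# At 10 seconds per page, a hypothetical 50,000-page site would take
# roughly 5.8 days to crawl in full - which is why a high value can
# throttle how much of a site gets indexed.
pages = 50_000
print(f"{pages * delay / 86_400:.1f} days")
```

Note that the parser matches the directive case-insensitively, so “Crawl-Delay” and “Crawl-delay” behave the same.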

So my advice… unless you have less common search engine spiders visiting frequently and they have a noticeable effect on your site’s stability, I would remove this directive if at all possible.

Wednesday, April 23, 2014

Presenting at the ETAG Technology Solutions Conference 2014

Ouch, I've just found a rather cringe-worthy video of me going 'Umm' a lot, especially at the beginning of this presentation I gave in front of 200 people. This was at the ETAG (Edinburgh Tourism Action Group) Technology Solutions for Tourism Conference 2014 a couple of months ago.

My focus here was on the opportunity for international eCommerce, although I also covered a range of more general topics around selling tourism online.

Saturday, April 12, 2014

The unpaid invoices of Stephen Halpin of Merchant Soul

As regular blog readers will know, I'm owed a significant unpaid sum by Stephen Halpin of Merchant Soul. I therefore thought I would detail the specific unpaid invoices, to let others know of this man's inability to resolve his debts:
merchantsoul-invoice-oct012 (£1,920.00)
merchantsoul-invoice-nov-dec-2012 (£7,680.00)
merchantsoul-invoice-jan2013 (£2,400.00)
merchantsoul-invoice-feb2013 (£3,360.00)
merchantsoul-invoice-mar2013 (£3,480.00)
merchantsoul-invoice-apr2013 (£2,400.00)
merchantsoul-invoice-may2013 (£1,680.00)
merchantsoul-invoice-june2013 (£1,680.00)
merchantsoul-invoice-july2013 (£1,680.00)

As you can see, I not only extended Mr Halpin significant credit but also allowed him a great deal of time to pay. To date, Merchant Soul has paid none of the above invoices, nor has there been any explanation for this failure to pay.

Friday, April 11, 2014

UK financial comparison homepages - revisited

Over a year ago I posted an article on the homepage file size differences of several financial comparison websites.

I've therefore revisited this topic to see what's changed... once again using the Firebug and Google Page Speed plugins for Firefox.

Homepage weight: 384.1K
Number of items: 60
Page Speed Score: 80/100 (Desktop)
Page Speed Score: 65/100 (Mobile)
A decrease in page weight but an increase in the number of items means that the Page Speed score is now very slightly lower than it was.

Homepage weight: 468.2k
Number of items: 34
Page Speed Score: 82/100 (Desktop)
Page Speed Score: 76/100 (Mobile)
A slight decrease in page weight and a decrease in the number of items has not had the desired effect on the Page Speed score.

Homepage weight: 419.9k
Number of items: 38
Page Speed Score: 81/100 (Desktop)
Page Speed Score: 66/100 (Mobile)
A small increase in the homepage weight, but with a significant reduction in the number of items on the page, has not improved the Page Speed score.

Homepage weight: 1,100k (1.1Mb)
Number of items: 106 
Page Speed Score: 70/100 (Desktop)
Page Speed Score: 59/100 (Mobile)
An increase in the weight of the page by over 200k, coupled with an increase in the number of items, gives a significant reduction in Page Speed score (from 87 to 70 - the greatest difference in this test).

The transitional Chief Digital Officer role

We live in a changing world and titles come and go over time. One of the more modern roles to gain some of the recent spotlight has been that of the Chief Digital Officer, or CDO.

In a post a while back, I commented about why the role of CDO is needed, but that some organisations may choose to use a digitally-savvy Non-Exec Director instead.

However, I've given this topic further thought and can see the role of CDO becoming more popular over the next couple of years, but then gradually fading away... as the rest of the board becomes more digitally aware and knowledgeable.

If this scenario is likely to become a reality, it raises the question of who would put themselves forward to become a Chief Digital Officer now?