If you run an e-commerce website, chances are you use visitor tracking tools, or are at least considering some. The most obvious and attractive solution for website owners is Google Analytics (GA), but there are alternatives on the market such as Parse.ly, KISSMetrics, Clicky, Woopra, or Piwik.
First of all, we needed raw visitor data to build a custom BI model on top of it, and we didn’t feel like spending $150,000 per year on Google Analytics Premium. We needed a tool that can handle heavy load (up to 1000 actions per minute) and large volumes of data. What is more, we didn’t want to give up a convenient user interface. So, in a nutshell, we were looking for GA Premium, but free of charge. Piwik meets these criteria, which is why it is sometimes called a “serious contender”.
Piwik is a self-hosted, open source solution used by over 900,000 websites. It is particularly popular in Europe, e.g. it has a 16% market share in Germany (based on top-level domains). It is supported by a thriving community and a great team. Since it is mainly self-hosted software, it takes some time to set up, but that investment pays off.
(disclaimer - we are not affiliated with any of the providers listed here in any way)
Data privacy is a major concern for many users. Your Internet activity can reveal a great deal about your life or work, not to mention that the majority of websites are tracked by several companies. In recent years, awareness of this topic has been growing steadily.
Raw data science
Another advantage is the flexibility you get from having access to the raw data.
For big players who invest large amounts of money in marketing, spanning multiple marketing channels, technologies, campaigns, and countries, it’s crucial to be able to precisely measure the impact of all their marketing actions.
These companies usually employ complex custom reports, business intelligence solutions that aggregate data across multiple sources, or even machine-learning-aided solutions to help track defined KPIs.
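One possible way to get at visit-level data is Piwik's HTTP Reporting API, which exposes the visitor log through the `Live.getLastVisitsDetails` method. The sketch below only builds such a request URL. The host name, site ID, date, and token are placeholders, the helper name `build_visits_url` is hypothetical, and nothing here implies this is how the data in this post was actually extracted (querying the database directly is another option).

```python
from urllib.parse import urlencode


def build_visits_url(base_url, site_id, date, token_auth, limit=100):
    """Build a Reporting API URL that asks for raw visit details as JSON.

    All argument values are supplied by the caller; the ones used in the
    example below are placeholders, not real credentials or hosts.
    """
    params = {
        "module": "API",
        "method": "Live.getLastVisitsDetails",  # visitor log, one record per visit
        "idSite": site_id,
        "period": "day",
        "date": date,
        "format": "JSON",
        "filter_limit": limit,  # cap the number of visits returned
        "token_auth": token_auth,
    }
    return base_url.rstrip("/") + "/index.php?" + urlencode(params)


# Placeholder host and token, for illustration only.
url = build_visits_url("https://piwik.example.com/", 1, "2015-11-01", "YOUR_TOKEN")
print(url)
```

The resulting URL can be fetched with any HTTP client and the JSON response loaded into whatever BI or data science tooling you prefer.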
In a world of annoying ads, ad blocker software has become very popular. Some reports claim that “ad blocking grew by 41% globally in the last 12 months”. These ad blocking tools significantly limit the tracking capabilities of third-party advertisers. As a result, your customer analytics data is likely to be incomplete or biased.
Obviously, there are ways for developers to work around these limitations with GA, but it is usually a losing battle against aggressive and resilient blockers.
In short, Piwik with disabled cookies, anonymised visitor IP addresses, and an opt-out button on your website allows you to comply with European law without the annoying cookie notice, while still collecting the customer analytics that you need.
The Safe Harbor agreement is a set of rules prepared by the U.S. Department of Commerce. It was reached in 2000 and provided a convenient way for US companies to receive data from Europe without violating European law.
On October 6, 2015, however, the Safe Harbor agreement was declared invalid. What does that mean? Basically, as a European company, you have to be careful when sharing users’ personal or sensitive data with third parties, because it’s your responsibility to guarantee an adequate level of data protection.
Besides that, there are some areas in which Piwik seems to beat GA:
- tracking file downloads (GA requires sending a custom event to do that),
- tracking outbound links,
- tracking cart abandonment,
- a clear roadmap with long term goals to achieve,
- tracking a particular visitor along with behaviour and shopping cart history, as opposed to gathering only general traffic statistics,
- allowing third-party plugins for extra features.
There are some downsides to it as well:
- you won’t get insights into Google AdWords or AdSense,
- you won’t get the additional context information that Google provides, such as age, gender, and interests.
We used Piwik raw data to measure website traffic during a TV advertising campaign and to track visitor behaviour. In short, the goal was to measure the overall impact of a TV campaign.
The tricky part is that TV advertising is not a directly traceable marketing channel, i.e. it’s difficult to figure out how many paying customers you gained after airing a particular TV spot. The most obvious idea is to observe online traffic and assume that the people who visit your website within minutes of the spot’s airtime come from your campaign. Beyond that, there are more accurate techniques that take into account interference from other marketing channels and campaigns, as well as seasonal trends.
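The "most obvious idea" above can be sketched in a few lines. This is only an illustration of the naive approach, not the methodology we ended up using: the function names, the 10-minute window, and the choice to subtract the equally long window before the spot as a baseline are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def visits_in_window(visits, start, end):
    """Count visit timestamps t with start <= t < end."""
    return sum(1 for t in visits if start <= t < end)


def naive_tv_uplift(visits, airtime, window_minutes=10):
    """Visits in the window after the spot minus visits in the window before it.

    The pre-spot window serves as a crude baseline (an assumption of this
    sketch); it ignores other channels, campaigns, and seasonal trends.
    """
    window = timedelta(minutes=window_minutes)
    after = visits_in_window(visits, airtime, airtime + window)
    before = visits_in_window(visits, airtime - window, airtime)
    return after - before


# Made-up visit timestamps, given as minute offsets around a hypothetical spot.
airtime = datetime(2015, 11, 1, 20, 15)
offsets = (-9, -4, -1, 0, 1, 1, 2, 3, 5, 8, 12)
visits = [airtime + timedelta(minutes=m) for m in offsets]

uplift = naive_tv_uplift(visits, airtime)
print(uplift)  # 7 visits after the spot minus 3 before it -> 4
```

With raw visit timestamps available, this kind of calculation is trivial to run per spot, which is exactly why it is a tempting first step and also why it needs the refinements mentioned above.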
Together with our business partners, we used an indirect methodology related to Temporal Canonical Correlation Analysis. In a nutshell, by decomposing a KPI time series into a component related to a particular marketing channel and a component related to side effects, we were able to explain how the TV campaign affected a given KPI.
As a result, we helped increase the long-term ROI of TV commercial spending. Moreover, based on the gathered data, we were also able to improve the budget allocation for future commercials.
Piwik can be a serious competitor for Google Analytics. They both offer amazing features. While GA is super simple to set up, with Piwik you have to invest a few hours to set up the infrastructure and then probably a few more to fine-tune the software to your needs and traffic.
If you are a regular user with no specific requirements, you are good to go with GA. But if the benefit of complete control over the data far surpasses the disadvantages of self-hosting, you should definitely consider Piwik a good option.
Stay tuned - the next post in our Piwik series will shed more light on the technical side: setting up infrastructure on OpenShift and Amazon EC2, load testing and performance tuning, operations, and scalability.