Serverless Analytics

I once had to deal with a case where the analytics data had to be kept in house, which meant not using third-party analytics scripts such as GA or Matomo or anything else along those lines. That sparked the idea of playing with the open-source Matomo tracker (formerly Piwik). Using the Piwik tracker's public method setTrackerUrl, we could change the tracking URL to point at our own endpoint.

My first thought was the age-old apache-php-mysql setup, but I soon dropped it and steered toward AWS Serverless. This turned out to be much simpler than I had planned for. I prepared the CloudFormation template according to the illustrated architecture, and the deployment went quite smoothly. The initial index creation and mapping in ES was the only confusing part, since I had no prior experience in that area. StackOverflow was the guide here too, along with the elastic.co documentation.
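Since the index mapping was the confusing part, here is a rough sketch of what such a request body might look like. It assumes the typeless mapping format of Elasticsearch 7.x (older versions nest the properties under a type name). The index name, the field list, and the received_at timestamp are my illustrative assumptions, not the exact mapping that was deployed.

```python
import json

# Hypothetical index name, chosen for illustration only.
INDEX_NAME = "analytics"


def index_body():
    """Build a create-index request body (Elasticsearch 7.x typeless format)."""
    return {
        "mappings": {
            "properties": {
                # Exact-match fields, suitable for terms aggregations.
                "idsite": {"type": "keyword"},
                "url": {"type": "keyword"},
                # Free-text page title sent by the tracker.
                "action_name": {"type": "text"},
                # Assumed timestamp added on the server side.
                "received_at": {"type": "date"},
                # Custom variables flattened into key:value pairs.
                "custom": {"type": "object"},
            }
        }
    }


if __name__ == "__main__":
    # The body would be sent with a PUT request to the index URL.
    print(f"PUT /{INDEX_NAME}")
    print(json.dumps(index_body(), indent=2))
```

Declaring url and idsite as keyword keeps them unanalyzed, which is what terms aggregations need later on.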

The piwik.js file was deployed to S3 and delivered through CloudFront under an alias URL like www.example.com/tracker/piwik.js, which was configured in the CloudFront origin and behavior. The URL www.example.com/tracker/ was then mapped to API Gateway and from there to Lambda. The Lambda function is a simple one: it takes the query string and, if any custom variables exist, maps them as key:value pairs and builds a JSON object. This JSON object is stored in the Elasticsearch index.
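The Lambda step could look roughly like the sketch below. It is not the deployed function. It assumes the API Gateway Lambda proxy integration (which passes the query string as queryStringParameters) and the Piwik convention of sending custom variables as JSON in the _cvar and cvar parameters, for example {"1": ["plan", "pro"]}. The function names, the received_at field, the 204 response, and the store placeholder are all hypothetical. Actually writing to Elasticsearch is left out because it depends on how the domain is secured.

```python
import json
from datetime import datetime, timezone


def custom_variables(raw):
    """Flatten a Piwik custom-variable JSON string into {name: value}."""
    if not raw:
        return {}
    try:
        slots = json.loads(raw)
    except ValueError:
        # Malformed input is ignored rather than failing the request.
        return {}
    if not isinstance(slots, dict):
        return {}
    return {
        pair[0]: pair[1]
        for pair in slots.values()
        if isinstance(pair, list) and len(pair) == 2 and isinstance(pair[0], str)
    }


def build_document(params):
    """Turn the tracker's query-string parameters into one JSON document."""
    # Copy every plain parameter except the raw custom-variable strings.
    doc = {k: v for k, v in params.items() if k not in ("_cvar", "cvar")}
    # Merge visit-scope (_cvar) and page-scope (cvar) variables; page scope wins.
    doc["custom"] = {
        **custom_variables(params.get("_cvar")),
        **custom_variables(params.get("cvar")),
    }
    # Assumed extra field: server-side arrival time in UTC.
    doc["received_at"] = datetime.now(timezone.utc).isoformat()
    return doc


def store(doc):
    """Placeholder for the Elasticsearch write; here it only logs the document."""
    print(json.dumps(doc))


def handler(event, context):
    """Lambda entry point for the /tracker/ route."""
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    store(build_document(params))
    return {"statusCode": 204, "body": ""}
```

Keeping build_document separate from the handler makes the mapping logic easy to test locally without deploying anything.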

Visualization is yet to be done, though it should be easy with the Kibana dashboard that comes with the AWS Elasticsearch service. I am waiting for data to arrive in the analytics index, which can then be run through aggregations for visualization.
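As a small illustration of the kind of aggregation meant here, the sketch below builds a terms aggregation body in the Elasticsearch query DSL. The helper name and the field url.keyword are my assumptions: url.keyword relies on the default dynamic mapping, and a field mapped explicitly as keyword would be referenced by its plain name instead.

```python
import json


def top_values(field, size=10):
    """Build a search body that returns the most frequent values of a field."""
    return {
        # No individual hits are needed, only the aggregation buckets.
        "size": 0,
        "aggs": {
            "top_values": {"terms": {"field": field, "size": size}}
        },
    }


if __name__ == "__main__":
    # The body would be sent to the index's _search endpoint.
    print(json.dumps(top_values("url.keyword", size=5), indent=2))
```

The same body can be pasted into the Kibana Dev Tools console to check the buckets before building a dashboard on top of them.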