Showing posts from 2017

Spot your churning users sooner with statistics

The girl (or guy) you're dating has called you on average five times a week lately. As a neurotic analytics enthusiast/stalker you happen to know this. Now it's been three days without a call. Should you be worried?

Becoming a data-driven company

How does a company become data-driven? Here's what I've learned at Zervant. Radiate information If you want to turn the culture more data-driven, radiate information. You'll never know what will stick. Radiating information also has the side-effect of evangelizing the work of analytics: people will remember you exist and they will take analytics into consideration more easily. Oh right, when we're building this we need to talk to analytics about what to measure!  Write an internal blog, give short presentations about an information nugget, etc. Frequency is more important than the perfect quality of content. Teach People love to learn new things. They will first tell you they're in no way analytical-minded and far from technical, but they will eagerly learn. De-mystifying analytics will help everyone in your organization to speak the same language and be on the same page about what analytics can do for them. As I've held workshops on analyt

Dramatically improve your conversions with session recording

Are you satisfied with your sign-up rate? Or activation rate? Do you think it could be higher? What do you think you can do to improve them? How do you know that? One of the most powerful things that you can do to improve your activation rate is to watch what your users are doing: what confuses them, what path is logical to them and where they fail. When you do that, you will find out that your call-to-action is too mild, your layout is confusing, your responsive design isn't adapting well in all cases or that the navigation path you thought everyone  takes isn't common at all. You cannot solve what you do not know. Luckily most of your issues are likely low-hanging fruits you can find easy remedy to. There are a gazillion services you can use, HotJar  being perhaps the biggest name. But it wasn't the right choice, at least for me. HotJar lets you record towards a quota and you can define pretty flexibly who you want to start recording and when. But HotJar doesn'

Optimize your marketing efforts with the power of raw data

If you're in marketing (I'm not), you'd want to know how much bang-for-buck your campaigns are producing. Now I'm no Google Analytics expert, but I'm going to make a claim that you too might have an easy time measuring conversions across your campaigns, but not so much how your sign ups differ between them? Is one bringing in more high quality people than others? If you are running a web shop or some other transaction-based business then sure, you can set up Google Analytics Ecommerce  to follow up on customer life time value. But what if you are running a SaaS and you bill monthly? Or you have a freemium model and you hope to monetize the high volume users through some additional features, how would you compare your campaigns then? With only Google Analytics, it is tough to analyze how the users coming through different marketing campaigns differ from each other. What you need is raw data and to know the path each of your user has taken as they sign up to y

Calculating active users and churn in a database

Have you ever struggled with calculating your active users per day, or month? How about your new activations or churned users? It's a common problem, but often the first solution you find will not scale. You should avoid sub queries, and use a window function instead. But that's only a part of the solution. Assuming your active user definition is "someone who has signed in within 7 days" and you have this data in a table with columns user_id , timestamp . The first thing you want to do is calculate when a user activates for the first time, when they churn and when they resurrect. CREATE OR REPLACE VIEW "vw_login" AS SELECT * , LEAST (LEAD("timestamp") OVER w, "timestamp" + 7) AS "activeExpiry" , CASE WHEN LAG("timestamp") OVER w IS NULL THEN true ELSE false AS "activated" , CASE WHEN LEAD("timestamp") OVER w IS NULL THEN true WHEN LEAD("timestamp") OVER w - "t

Power BI, Tableau or QlikView

Ok ok I went along with Power BI but hear me out. A year ago I joined Zervant to be responsible for the company's BI and analytics. Pretty soon I had to make a choice: which BI package should we take into use. Before this, Zervant didn't use anything apart from Excel and a Geckoboard. From my previous positions at Nokia and Microsoft, I have experience in Tableau, QlikView  and Power BI, in that chronological order. It quickly came down to a choice between these three. Or to be honest - between Tableau and Power BI. Give a man a fish, and you feed him for a day. Teach a man to fish, and you feed him for a lifetime. I'm a big believer of self-service BI. I would rather build an analytics team that enables others solve their decision needs rather than  giving the right conclusions. Therefore, building something that can be used by anyone in the company was one of the goals. Also, joining a start-up meant cost was a concern. QlikView is in my experience complex

Why WINDOW functions are changing the game for MySQL

Back when we were deciding on an SQL flavor at Zervant , the fact that MySQL didn't have WINDOW functions was a deal breaker. We had plenty of MySQL experience and not that much in PostgreSQL but I had to go with PostgreSQL anyway - just because of WINDOW functions. But that's about to change, as MySQL is getting this support soon , too. Imagine you need to calculate the number of active users you have on any given day. Now imagine that the way you have defined an active user is as someone who has done something  on your service within e.g. 30 days. The something  might be e.g. uploaded something, downloaded something, made a purchase, edited their profile - doesn't matter. And imagine you want to split by dimensions. Without WINDOW functions you don't have good options. You can accomplish this by either sub queries, which gets very expensive: SELECT, t.count, (SELECT SUM(t2.count) FROM t as t2 WHERE <= AS cumulative_

Popular posts from this blog

Snowflake UPSERT operation (aka MERGE)