Google Search Console and fat tails

Today I exported the data from Google Search Console as a CSV file and played with it. I sorted the articles of this website from the most clicked to the least.

Here is the result:

The sample has 228 articles in total, and it only includes posts with at least one click:

| Articles | Total clicks | Contribution | µ | σ |
|---|---|---|---|---|
| [1 - 228] | 50264 | 100.00% | 220.46 | 1,024.14 |

The average (µ) is 220.46 clicks per article, with a standard deviation (σ) of 1,024.14 clicks.


The most clicked post got 26.76% of all clicks (13450), and the top 20 articles accounted for 80.53% of all clicks (40479):

| Articles | Total clicks | Contribution | µ | σ |
|---|---|---|---|---|
| [1 - 20] | 40479 | 80.53% | 2,023.95 | 2,947.97 |
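Each row of these tables can be reproduced from the sorted click counts. Here is a minimal Python sketch, assuming the clicks are already loaded into a list sorted from most to least clicked. The function name `summarize` and the sample numbers are hypothetical, and it uses the population standard deviation, which may or may not be the variant behind the figures in this post.

```python
from statistics import mean, pstdev

def summarize(clicks, first, last):
    """Summarize the articles ranked first..last (1-based, inclusive).

    `clicks` must be sorted from most to least clicked.
    """
    selected = clicks[first - 1:last]
    total = sum(selected)
    return {
        "articles": f"[{first} - {last}]",
        "total_clicks": total,
        "contribution": 100 * total / sum(clicks),  # percent of all clicks
        "mean": mean(selected),
        "std": pstdev(selected),  # assumption: population standard deviation
    }

# Hypothetical click counts, not the real Search Console export
sample = [100, 20, 10, 5, 3, 2, 1, 1, 1, 1]
row = summarize(sample, 1, 3)
print(row)
```

Calling `summarize(clicks, 1, 20)` on the real list would give the row for the top 20 articles.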


Let's calculate the z-score of the first three articles:
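As a worked check, the z-score is z = (x - µ) / σ. The Python sketch below applies it to the most clicked article, using only figures quoted in this post (13450 clicks, µ = 220.46, σ = 1,024.14). The helper name `z_score` is just illustrative.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

MU = 220.46      # mean clicks per article, all 228 articles
SIGMA = 1024.14  # standard deviation of clicks, all 228 articles

z_top = z_score(13450, MU, SIGMA)  # most clicked article
print(f"{z_top:.2f}")  # 12.92
```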


Assuming a standard normal distribution, the probability of observing a value up to z is given by the cumulative distribution function:

$$P(Z \leq z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-u^2/2} \, du$$

Using the previous formula to calculate the probability for $z = 12.92$:

# Standard normal probability density function
function y = f(u)
    y = (1 / sqrt(2 * pi)) * exp(-u.^2 / 2);
endfunction

# Integrate the density from -Inf up to z = 12.92
[P, ier, nfun, err] = quad("f", -Inf, 12.92)

# P = 1.0000

The probability of that event is $Q = 1 - P(Z \leq 12.92) \approx 0$.


Under a normal distribution, a 12.92σ event is practically impossible. The second post is a 5.09σ event, whose probability is also effectively 0 (less than one in a million).
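Subtracting from 1 hides how small these probabilities really are, because P rounds to 1 in floating point. Here is a sketch in Python (rather than Octave) that computes the upper tail directly with the complementary error function, Q(z) = erfc(z / √2) / 2. The names `upper_tail` and `cdf` are illustrative.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable, computed without 1 - P cancellation."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def cdf(z):
    """P(Z <= z) for a standard normal variable."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

for z in (12.92, 5.09):
    print(f"z = {z}: 1 - cdf = {1 - cdf(z):.3e}, upper tail = {upper_tail(z):.3e}")
```

For z = 12.92 the subtraction returns exactly 0, while the direct computation gives a tiny but nonzero probability.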

The conclusion is that a single post can move the mean and the standard deviation of the whole distribution. Thus, it's incorrect to assume a normal distribution here.

For example, let's assume that I didn't publish the most clicked post:

| Articles | Total clicks | Contribution | µ | σ |
|---|---|---|---|---|
| [2 - 228] | 36814 | 73.24% | 162.18 | 525.04 |

The average number of clicks would drop by about 26% (from 220.46 to 162.18), and the standard deviation would be reduced by approximately half (from 1,024.14 to 525.04).
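The size of that drop can be checked from the figures in the tables. A quick Python sketch, where `relative_drop` is an illustrative helper:

```python
def relative_drop(before, after):
    """Fractional decrease when going from `before` to `after`."""
    return (before - after) / before

mean_drop = relative_drop(220.46, 162.18)   # mean with vs. without the top post
std_drop = relative_drop(1024.14, 525.04)   # standard deviation with vs. without

print(f"mean: -{mean_drop:.1%}, std: -{std_drop:.1%}")  # mean: -26.4%, std: -48.7%
```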

In practical terms:

  1. Most of your articles will perform below average
  2. One single article will take it all
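Both points can be measured on any list of click counts. A minimal Python sketch with made-up numbers (not the real Search Console data); the function names are illustrative:

```python
from statistics import mean

def share_below_mean(clicks):
    """Fraction of articles with fewer clicks than the average."""
    mu = mean(clicks)
    return sum(1 for c in clicks if c < mu) / len(clicks)

def top_share(clicks):
    """Fraction of all clicks that went to the single most clicked article."""
    return max(clicks) / sum(clicks)

# Hypothetical click counts, for illustration only
sample = [100, 20, 10, 5, 3, 2, 1, 1, 1, 1]
print(share_below_mean(sample))     # 0.8
print(round(top_share(sample), 2))  # 0.69
```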

Hi, I'm Erik, an engineer from Barcelona. If you like the post or have any comments, say hi.