Quantcast
Channel: SQLBI
Viewing all articles
Browse latest Browse all 434

Calculate New, Returning, Lost, and Recovered Customers in #dax

$
0
0

Calculating the number of new and returning customers is a recurring question. I would say this is a “classical” Business Intelligence problem, very common in marketing department. I worked on these problems with many customers, with small and large datasets, and I wrote a DAX Pattern “New and Returning Customers” showing how to calculate:

  • New Customers: customers who never made any purchase
  • Returning Customers: customers who bought something in the past
  • Lost Customers: customers who bought something but did not buy in the last N days days
  • Recovered Customers: previously “lost customers” who made a new purchase

This is not a brand new topic, you can find many other blog posts on this topic (Chris Webb, Javier Guillén, Gerhard Brueckl, David Hager, Rob Collie), so my goal was to show very generic formulas that were generally the best solution in term of performance. This make the formula less readable, such as the following:

[Returning Customers] :=
COUNTROWS(
    CALCULATETABLE(
        VALUES(<customer_key_column> ),
        VALUES(<customer_key_column> ),
        FILTER(
            ALL(<date_column> ),
            <date_column> < MIN(<date_column> )
        )
    )
)

As you see, using CALCULATETABLE ( VALUES ( table[column] ), VALUES ( table[column] ), … ) seems a useless thing. Why counting the rows returned by VALUES and passing it also as a filter argument? This is a not so intuitive behavior of CALCULATE. The first argument is an expression that will be evaluated in a modified filter context. The third argument is a FILTER on the date column, which extends the range of dates considered, including all the past sales transactions. At this point, the first VALUES would return any customers who made a purchase in the past, but the second argument will only considered those that made a purchase in the current selection of time. The final result is an AND condition between two sets of customers (the intersection of the two sets), which is faster than trying to calculate the number of past transactions of each customer who made a purchase in the current selection of time, filtering only those that results in zero transactions.

In general, I prefer using more readable DAX formulas, also in DAX patterns, optimizing them only when necessary. But in this case the performance might be important (visible to the user) also with a few thousands of customers. As usual, any feedback on the New and Returning Customers pattern will be very welcome!


Viewing all articles
Browse latest Browse all 434

Trending Articles