Channel: SQLBI

Optimizing IF conditions using variables


This article describes a very common optimization pattern that uses variables to speed up conditional expressions in DAX.


In a previous article, we showed the importance of using variables to replace multiple instances of the same measure in a DAX expression. A very common use case is the IF function. This article focuses on the cost in the formula engine rather than in the storage engine.

Consider the following measure.

Margin := 
IF ( 
    [Sales Amount] > 0 && [Total Cost] > 0,
    [Sales Amount] - [Total Cost]
)

The basic idea is that the difference between Sales Amount and Total Cost should be evaluated only when both measures are greater than zero. With such a condition, the DAX engine produces a query plan that evaluates each measure twice. This is visible in the storage engine requests generated for the following query.

EVALUATE
SUMMARIZECOLUMNS ( 
    'Date'[Year], 
    "Margin", [Margin] 
)

It is also worth noting that the physical query plan has 216 rows, a baseline we will compare against in the following variations of the same measure.

Without repeating details already explained in a previous article, it is worth noting that multiple references to the same measure require separate evaluations, even if the result is the same. DAX is not very good at reusing the value of common subexpressions evaluated in the same filter context. This is evident in the following variation of the Margin measure. The two branches of the IF function are identical, but the query plan adds further evaluations in both the storage engine and the formula engine.

Margin 2 := 
IF ( 
    [Sales Amount] > 0 && [Total Cost] > 0,
    [Sales Amount] - [Total Cost],
    [Sales Amount] - [Total Cost]
)

In this case there is an additional storage engine query, and the number of rows in the physical query plan is now 342, over 50% more than the 216 rows of the previous plan.

The optimized version of this measure stores the two measures in two variables, so that each one is evaluated only once for the IF function.

Margin Optimized := 
VAR SalesAmount = [Sales Amount]
VAR TotalCost = [Total Cost]
RETURN
    IF ( 
        SalesAmount > 0 && TotalCost > 0,
        SalesAmount - TotalCost
    )

This is visible in the storage engine requests: there are now only two.

A version of the IF function with the second branch identical to the first one would produce the same storage engine queries.
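
As an illustration, a variable-based counterpart of Margin 2 could be sketched as follows. The name Margin 2 Optimized is hypothetical and does not come from the original model. It only mirrors Margin 2 with the measure references replaced by variables.

Margin 2 Optimized := 
-- Hypothetical measure name, used only for illustration
VAR SalesAmount = [Sales Amount]    -- evaluated once in the current filter context
VAR TotalCost = [Total Cost]        -- evaluated once in the current filter context
RETURN
    IF ( 
        SalesAmount > 0 && TotalCost > 0,
        SalesAmount - TotalCost,    -- both branches reuse the stored values
        SalesAmount - TotalCost
    )

Because both branches refer to the variables rather than to the measures, the duplicated branch does not introduce further references to Sales Amount or Total Cost.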

The number of rows in the physical query plan drops from 216 to 126.

This is an important result. The technique is particularly useful when there are multiple references to a measure that has a high cost in the formula engine, because the DAX cache operates only at the storage engine level.
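
One way to try the optimized version without changing the model is to define it as a query-scoped measure and run the same test query against it. The following is only a sketch: it assumes the measures are hosted in a table named Sales, which the article does not state, so the table name should be adapted to the actual model.

DEFINE
    -- Assumption: Sales is the table hosting the measure; replace it as needed
    MEASURE Sales[Margin Optimized] =
        VAR SalesAmount = [Sales Amount]
        VAR TotalCost = [Total Cost]
        RETURN
            IF ( 
                SalesAmount > 0 && TotalCost > 0,
                SalesAmount - TotalCost
            )
EVALUATE
SUMMARIZECOLUMNS ( 
    'Date'[Year], 
    "Margin Optimized", [Margin Optimized] 
)

Running this query and the original one separately, with the cache cleared before each run, keeps the storage engine requests and the physical query plans comparable.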

Conclusion

Multiple references to the same measure in the same filter context can produce multiple executions of the same DAX expression, each producing an identical result. Saving the result of the measure in a variable generates a better query plan, improving the performance of DAX code.

