This article explains how DAX handles dependencies between tables, columns and relationships, to help you avoid circular dependency errors.
If you work with Power BI data models, you create relationships between tables quite often. When building complex DAX code or creating calculated tables, you might encounter the problem of circular dependency errors as shown in the next figure:
You can avoid this error by paying attention to details that are not trivial at first sight.
This article explains how to avoid circular dependencies. Before we get to that, it is necessary to understand how the DAX engine manages dependencies. The sweet part of the article is at the end, but you need to go through and understand the bitter part first.
First, we need to define what a dependency is. Given two entities A and B, A depends on B when any change in B determines a change in A. For example, consider this calculated column:
Sales[Amount] = Sales[Quantity] * Sales[Net Price]
As a result, Amount depends on Quantity and Net Price. This example is simple, but in a data model there are multiple types of entities and the dependencies can become rather complex to analyze.
The entities involved in dependencies are tables, columns, and relationships. Each of these objects might depend on other objects. For example, a table might depend on a relationship, a column might depend on a table, and so on.
These are the two basic rules (a third rule will come later):
- An expression depends on all the columns, tables and relationships used in the expression.
- A relationship depends on the columns used for the relationship itself.
For example, in the following model – without any calculations – the presence of the two relationships already generates a set of dependencies:
Namely, relationship A depends on Customers[Customer] and Sales[Customer], whereas relationship B depends on Sales[Product] and Product[Product]. We use the following syntax to define dependencies:
A { Customers[Customer], Sales[Customer] }
B { Product[Product], Sales[Product] }
The following code defines a calculated column in Customers, creating additional dependencies:
Customers[TotalSales] = SUM ( Sales[Quantity] )
It is worth noting that there is no CALCULATE function in the expression. The TotalSales calculated column contains the total quantity of all the rows in Sales, without filtering by customer. Nevertheless, the DAX code scans the Sales[Quantity] column through an iteration – remember that SUM is just syntactic sugar for SUMX. The full set of dependencies is the following:
Customers[TotalSales] { Sales, Customers, Sales[Quantity], Sales[RowNumber] }
In this example, a simple calculated column generates four dependencies: two are from the source and target table, the other two are from the column referenced and from the internal RowNumber column.
The previous column example was for educational purposes only. Things become more complex if you want to compute a column that is actually useful, such as the sales of a given customer. In that case, the code requires the CALCULATE function and relationships get involved. Consider the following calculated column:
Customers[CustomerSales] = CALCULATE ( SUM ( Sales[Quantity] ) )
Now the calculation relies on relationship A, which is the relationship between Customers and Sales. Thus, the calculation will depend on all the entities that relationship A depends on.
Customers[CustomerSales] { Sales, Customers, Customers[Customer], Sales[RowNumber], Sales[Quantity], Relationship between Customers and Sales (A) }
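One way to see the effect of CALCULATE is to compute both expressions side by side in a DAX query. The following is a minimal sketch, assuming the Customers and Sales tables and relationship A described above. The EVALUATE form is meant for a query tool such as DAX Studio, and the quoted names are only labels for the columns of the query result.

```dax
-- ADDCOLUMNS creates a row context over each row of Customers.
EVALUATE
ADDCOLUMNS (
    Customers,
    -- No CALCULATE: no context transition, so every row shows the grand total.
    "TotalSales", SUM ( Sales[Quantity] ),
    -- CALCULATE: context transition filters Sales through relationship A.
    "CustomerSales", CALCULATE ( SUM ( Sales[Quantity] ) )
)
```

The first column is expected to repeat the same value on every row, whereas the second one varies by customer. This is why only the second expression depends on relationship A.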
Dependency on relationships happens when:
- You use either the RELATED or RELATEDTABLE function,
- You use CALCULATE in a row context generating a context transition,
- You rely on expanded tables to move filters from one table to other tables.
In short, your code depends on a relationship as soon as you directly or indirectly reference that relationship in a DAX function.
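The three cases listed above can be sketched with the sample model as follows. These are illustrative definitions written in the same notation used in this article. The names CustomerName, SalesRows, CustomerQuantity and CustomersWithSales are hypothetical and do not belong to the original model. Each definition is a separate calculated column or calculated table, not a single script.

```dax
-- 1. RELATED follows relationship A from Sales (many side) to Customers (one side).
Sales[CustomerName] = RELATED ( Customers[Customer] )

-- 1. RELATEDTABLE follows relationship A in the opposite direction.
Customers[SalesRows] = COUNTROWS ( RELATEDTABLE ( Sales ) )

-- 2. CALCULATE in a row context: the context transition filters Sales through A.
Customers[CustomerQuantity] = CALCULATE ( SUM ( Sales[Quantity] ) )

-- 3. Expanded tables: using Sales as a filter also filters Customers,
--    because the expanded version of Sales includes the columns of Customers.
CustomersWithSales = CALCULATETABLE ( DISTINCT ( Customers[Customer] ), Sales )
```

In all four definitions, relationship A is part of the dependency list even though it is never mentioned by name in the code.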
Now that you have seen the basic scenario, we can move one step further and build a calculated table containing the customer names. We will use two different functions to create the table: DISTINCT and VALUES. As you will see, the resulting dependencies are different. Let us start with DISTINCT:
CustomerNames = DISTINCT ( Customers[Customer] )
What are the dependencies of this new table? Since it only uses the Customers[Customer] column, it depends on both the Customers table and the Customers[Customer] column:
CustomerNames { Customers, Customers[Customer] }
The result is different when using VALUES instead of DISTINCT. The following code produces an unexpected dependency on the relationship between Customers and Sales.
CustomerNames = VALUES ( Customers[Customer] )
The calculated table depends on three entities.
CustomerNames { Customers, Customers[Customer], Relationship between Customers and Sales (A) }
The key to understanding unexpected circular dependencies lies in understanding why this calculated table depends on the relationship – which requires an understanding of the difference between DISTINCT and VALUES.
Both DISTINCT and VALUES return the list of values of a column. The difference is that DISTINCT only returns the actual values, whereas VALUES also returns the additional blank row created by the engine if there are invalid relationships. The same difference exists between ALLNOBLANKROW and ALL: the former does not return the optional blank row, whereas the latter does return it.
How does the blank row appear in the model? In the following picture, you can see Customers and Sales. The relationship is based on the Customer column of both tables. The last row in the Sales table has no corresponding row in Customers.
In this scenario, the engine adds an empty row (BLANK ROW) to the Customers table to ensure that all the rows in Sales have at least one corresponding row in Customers. This is standard behavior for DAX: the engine adds one and only one blank row. Any invalid reference from the Sales table relates to that same blank row added to Customers.
The presence of a blank row depends on the data, not on the model. If there is a customer in Sales with no matching row in Customers, the blank row is present. If the relationship is always valid and the Customer column in Sales always references existing rows in Customers, then there is no blank row. In other words, the presence of a blank row in Customers[Customer] depends on the content of Sales[Customer] if there is a relationship in place.
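To check whether the data currently produces a blank row, you can list the customers that appear in Sales but not in Customers. The following query is a minimal sketch, assuming the sample model described above and a query tool such as DAX Studio. It is only one possible way to perform the check.

```dax
-- Values of Sales[Customer] with no matching row in Customers.
-- DISTINCT is used on both sides so that the engine-generated blank row
-- is never part of the comparison.
EVALUATE
EXCEPT (
    DISTINCT ( Sales[Customer] ),
    DISTINCT ( Customers[Customer] )
)
```

If the query returns no rows, every row in Sales references an existing customer and the engine has no reason to add the blank row to Customers for this relationship.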
If you use a function that might return that blank row, the result depends on all the incoming relationships of that table and not only on the table it is scanning. Therefore, when you use VALUES or ALL, the engine needs to create a dependency on all the relationships directed towards Customers. Please note that in the previous example there was only one relationship. In a real-world scenario, you might deal with many.
This description served the sole purpose of stating the last rule in handling dependencies:
- When you use a table function that depends on the blank row, the expression depends on all the relationships targeting the table.
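The difference between the functions that return the blank row and those that do not can be observed by counting the rows they return. The following query is a sketch under the same assumptions as the sample model, to be run in a query tool such as DAX Studio. The quoted names are only labels for the result columns.

```dax
-- Compare the functions that ignore the blank row with those that return it.
EVALUATE
ROW (
    "Distinct", COUNTROWS ( DISTINCT ( Customers[Customer] ) ),
    "Values", COUNTROWS ( VALUES ( Customers[Customer] ) ),
    "AllNoBlankRow", COUNTROWS ( ALLNOBLANKROW ( Customers[Customer] ) ),
    "All", COUNTROWS ( ALL ( Customers[Customer] ) )
)
```

When Sales contains at least one customer that is missing from Customers, and Customers[Customer] does not already contain a blank value of its own, the Values and All columns are expected to be one higher than Distinct and AllNoBlankRow. With valid data, the four numbers match. Either way, the expressions using VALUES and ALL depend on the relationship, because their result might change with the content of Sales.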
In the following diagram, you see two versions of the customer names calculated table. CustomerDistinct uses DISTINCT and CustomerValues uses VALUES. The table created with DISTINCT can be related to Customers because it will depend solely on the Customers[Customer] column.
If you try to create a relationship between CustomerValues (on the many side) and Customers (on the one side), you will obtain this circular dependency error:
The error message is not very helpful. Translated into plain English, it reads:
“I am sorry, but if you create this relationship then you will end up having a relationship that depends on CustomerValues. At the same time, the CustomerValues table would depend on this new relationship. This circular dependency is not supported in DAX. Please pay attention to your use of VALUES and ALL, because they are likely to be the cause of this error. Namely, I have checked that in the code of the calculated table you are using VALUES ( Customers[Customer] ). Try to replace that with DISTINCT, if you need to create the relationship.”
Following the chain of relationships is easy when there are only two tables. In a more complex model, the graph of dependencies is likely to become so intricate that tracking down the circular dependency is more of a challenge. Nevertheless, most circular dependency errors originate from using VALUES instead of DISTINCT, and from using ALL instead of ALLNOBLANKROW.
As a final note, beware: sometimes the presence of ALL is hidden. For example, the following code seems totally legitimate:
ExpensiveProducts = CALCULATETABLE ( Product, Product[Price] >= 10 )
Although ALL does not appear anywhere in the code, it is present. If you remove the syntactic sugar of the conditions in CALCULATETABLE, the full expression reads:
ExpensiveProducts = CALCULATETABLE ( Product, FILTER ( ALL ( Product[Price] ), Product[Price] >= 10 ) )
If you author a calculated table using CALCULATE or CALCULATETABLE, you will likely implicitly reference several ALL functions. If you expect to create relationships with this calculated table, use the full syntax and replace ALL with ALLNOBLANKROW, as in the following code:
ExpensiveProducts = CALCULATETABLE ( Product, FILTER ( ALLNOBLANKROW ( Product[Price] ), Product[Price] >= 10 ) )
This latter expression does not depend on the blank row and will not create any issue with further relationships.
Conclusion
The issue of circular dependencies is quite frequent in data models. Most of the time, circular dependencies occur when you use calculated tables. You can easily avoid them by paying attention to your choice of functions. The difference between DISTINCT and VALUES, or between ALL and ALLNOBLANKROW is a subtle difference. But once you get used to it, your code will be safer when it comes to relationships and circular references.