Udostępnij za pośrednictwem


Relationships in Power BI: Fixing “One of the columns must have unique values” error message

One of the main benefit of using Power BI as a Self-Service BI tool is the capability of Power BI Desktop to build data models.

What that means is instead of forcing every source of data into a single table, which will inevitably require over-simplifications, data loss or even errors, in Power BI it is possible to import multiple datasets and define relationships between them as easily as dragging and dropping the linking column from one to the other.

Power BI Desktop : Full Data Model

Here we have 5 different queries, connected to 5 different data sources, that are pulled into a single model. They are then linked by relationships which will have an effect on visualizations and filters.

When starting to use this technique, one of the first and most frustrating issue that may be encountered is the following:

Power BI Desktop : You can't create a relationship error message

Dragging the key column from one table to the other results in a message saying: “You can’t create a relationship between these two columns because one of the columns must have unique values.”

The cause is simple: Power BI can only create relationships between 2 tables if at least one of the two selected columns hosts a distinct, unique list of the key values. It is to be noted that this is not a limitation specific to Power BI, it is in fact one of the basic design principle of all relational database technologies (the concept of primary and foreign keys). We will see more about that later, but it makes sense that in order to be able to find something, it needs to have a unique address.

We can work around that in 2 ways:

  • Edit one the query and make the column a distinct list of the key values by using "Remove Duplicate Rows" on it. The issue with this approach is that we will lose information, as usually the key is duplicated for a good reason (see example below)
  • Add an intermediary table made of the list of distinct key values to the model, which will then be linked to both original columns

Let’s illustrate that through an example.

Our data is going to be coming from 2 Excel tables, one containing the employee expenses by month, the other their monthly budget allowance by category:

Sample Data

We can quickly load those datasets into Power BI Desktop via 2 queries, but when we try to build a chart comparing monthly expenses versus budget, pulling from the 2 resulting tables, the numbers just don’t make any sense:

Power BI Desktop : Needing a new relationship

To solve that we know that we need a relationship between those 2 tables, using their common attribute: the employee column.

But of course, at that point we will get an error message when we try to create that relationship:

Power BI Destkop : Error when creating a relationship

Now we know why: in both tables the employees are duplicated, repeated either by months or by categories. Neither column contains a distinct list of employees, we need to build it ourselves. We have to take this approach here, as doing otherwise would result in an impacting loss of information (no more categories, or periods).

To do that, we first need to create a new query, appending values from both existing ones: in the query editor, Home tab, select Append Query as New.

We want data coming from both of our tables as we could be in a situation where an employee has an allocated budget but no expense, or the other way around. If we want the comprehensive list, we need to use both data sources to generate it.

We will append as new, and not just append, as we don’t want to lose one of our existing queries as the result of this operation. Just appending will override one of the existing ones.

Power BI Desktop : Append as New Query in the Query Editor

We will then select the tables that will be used to generate the distinct list of employees, in our case Expenses and Budget:

Power BI Desktop : Appending Queries

We will then rename the query as “Employees”, remove all columns other than Employee, and Remove Duplicates to end up with the distinct list of employees coming from both our tables:

Power BI Desktop : Query Editor

This method is not the most efficient one to build the unique list, but it is the simplest to illustrate here. A better approach would be to create reference queries (with disabled load) from our existing ones, make those distinct lists of employees, append them together and make them unique again. It would hold much better again a large volume of rows.

We can then close and apply, and get to the relationships pane to create them:

Power BI Desktop : Data Model

No error message should bother us anymore!

We usually recommend to “Hide in Report View” the employee columns in the original tables, so that users are not confused by the fact that they are multiple ones, and always use the common one we just created.

Power BI Desktop : Hide in Report View

Finally, we can build a report with all the benefits of a complete data model with well-defined relationships:

Power BI Report using relationships

This technique is part of a greater ensemble called dimensional modeling, and here we actually created the premise of an Employee dimension for our data model. Dimensional modeling is a fundamental design practice of Business Intelligence. It allows to build sound data models, both easily understandable by end users and efficient to use in terms of query performance. Some great resources to dig into this approach can be found in the literature section of the associated Wikipedia article.

That is another benefit of Power BI: we’re not only learning about the tool in itself, but also discovering more about data modeling and gaining new data related skills.

Comments

  • Anonymous
    January 19, 2017
    Slowly losing my mind as I still get the error after creating my appended table. I have a table called Tickets and a table called Invoices... The common field is GUIDCustomer.
    • Anonymous
      February 14, 2017
      Hi Jesse,A common issue is when you have duplicated values in your table, even when you shouldn't have / don't expect it. You could try de-duplicating your key field in one of the query by highlighting that column (and only it) and doing "remove rows > duplicates".If then you can create the relationship, that means you have a data quality issue: one of your unique key really is not.I hope that helps!
  • Anonymous
    April 23, 2017
    Hi!I've found if you want to work it by editing the query and removing duplicates, and you have empty cells, you'll get the same error: "you can't create a relationship between these two columns because one of the columns must have unique values"You have to filter rows with empty cells for the relationship:= Table.SelectRows(contacts_table, each [relationship_column] null and [relationship_column] "")And then remove duplicates.Cheers!Javier Ergas
    • Anonymous
      August 02, 2017
      Hi Javier, that's a very good remark, thanks a lot for making it!It's not perfect that Power BI Desktop uses the same error message for empty values and for duplicated keys, hopefully that will change in the future.