DA100-Flatten out a parent-child hierarchy

Power BI DA-100 Exam Tips
Design a data model
Flatten out a parent-child hierarchy

In this series of blog posts, we will be examining some of the key features of Power BI Desktop, Power Query and the Power BI Service needed to pass exam DA-100: Analyzing Data with Microsoft Power BI.

In this post, we will examine DA-100 exam – Design a data model: Flatten out a parent-child hierarchy.

Flatten out a Parent-Child Hierarchy

This topic applies specifically to the parent and child functions found in the DAX language, and these functions are useful for extracting a hierarchy, where you have hierarchical data embedded in a single column. The DAX path functions enable you to create separate columns from which you can then build a hierarchy.

In this example, we have some basic sales data, and let us say we are interested in showing the user the hierarchy of the staff who have made these sales. So, if we move across to data mode, and move across to the Staff table, we can see that each staff member has a staff ID, and a manager ID, and the manager ID tells us who each staff member reports to.

So, in this example, we have some hierarchical data, whereby staff member A reports to staff member B, who in turn reports to staff member C, and what the DAX path functions enable us to do is to extract that hierarchy from the staff ID and manager ID columns, so that we can then have each level as a separate column. We can then build a hierarchy consisting of level one, level two, etc.

The PATH Function

The first of these functions, the key function, if you like, is simply PATH, and this function requires the column that contains each identifier, and then the parent column. Based on that information, it then returns the entire path for each item in the first column.

So, let us see the PATH function in action. We need to add a new column into the staff table, and here we click new column.

Let us call the new column Paths. So, we use the PATH function which takes two arguments: ID Column and Parent Column.

In our example, the ID column is Staff[Staff ID]; and the parent column is Staff[Manager ID]. That is all the path function needs, and from that, it then returns the entire path structure for each item in the Staff ID column.

The PATHCONTAINS Function

Next, let us look at the path contains function, and this simply returns true or false to specify whether a given item exists within a path.

Let us use the PATH function with the item 2023. The first argument is the path, which will be the Path column we have just created, and we are trying this with the item 2023.

So, every row that contains the item 2023, returns True, and rows without 2033 return False.

So, probably not a function that you will use too often.

The PATHLENGTH Function

One function that is quite useful is PATHLENGTH; how many items are there within each path. PATHLENGTH simply requires a path as its argument, and it returns the number of items within the path.

So, our only argument is Staff[Path] and this returns the length of each path.

If we click on the filter arrow, we can see that we have five levels in total, because the maximum path length is five.

The PATHITEM Function

The final two functions, which are variations on the same theme are PATHITEM and PATHITEMREVERSE. These are the functions which actually generate the items that you place in your hierarchy.

PATHITEM is the norm, and it will return a given item from the path. In terms of arguments, you specify the path, then you specify the position, and then you have a third optional argument, which enables you to determine the type of result. If we want the result returned as an integer, we use a one. If we want it returned as text, we either use zero or omit the third parameter, since text is the default.

So that we are returning something meaningful, we can use the LOOKUPVALUE function. The first argument is our result column, and this, of course, is Staff Name, and then the search column is Staff ID, and finally, the item that we are looking for will be our path item.

The first argument of PATHITEM will be Staff[Path], position will be one, and type integer.

So, by doing this, we end up with the name of the person who is at the top of the hierarchy.

Then of course, we can simply copy this formula and create our four additional levels.

Creating a Hierarchy

Now, we can go back to our report and create a hierarchy containing our Level 1, Level 2, Level 3, Level 4 and Level 5 columns.

Then we can perhaps add this hierarchy to a slicer visual. So, we can expand Level, and see the people who report to Andrew Miles, see who reports to them and work their way right down to level five.

So, that is the gist of this topic. It is all about using the DAX path functions to extract and construct a hierarchy from two initial columns, an ID column, and a parent ID column.