Power BI: DAX: Statistical Functions
Data Analysis Expressions (DAX) provides many functions for creating aggregations such as sums, counts, and averages. These functions are very similar to aggregation functions used by Microsoft Excel.
Read more about the following statistical functions:
- AVERAGE
- AVERAGEA
- AVERAGEX
- COUNT
- COUNTA
- COUNTAX
- COUNTBLANK
- COUNTROWS
- COUNTX
- MAX
- MAXA
- MAXX
- MIN
- MINA
- MINX
Original article (before wiki edits) was written by Jeannine Takaki and JuanPablo Jofre and formatted by Mary Browning, Microsoft SQL Server Technical Writers.
AVERAGE
Returns the average (arithmetic mean) of all the numbers in a column.
This function takes the specified column as an argument and finds the average of the values in that column. If you want to find the average of an expression that evaluates to a set of numbers, use the AVERAGEX function instead.
Nonnumeric values in the column are handled as follows:
If the column contains text, no aggregation can be performed, and the functions returns blanks.
If the column contains logical values or empty cells, those values are ignored.
Cells with the value zero are included.
When you average cells, you must keep in mind the difference between an empty cell and a cell that contains the value 0 (zero). When a cell contains 0, it is added to the sum of numbers and the row is counted among the number of rows used as the divisor. However, when a cell contains a blank, the row is not counted.
When there are no rows to aggregate, the function returns a blank. However, if there are rows, but none of them meet the specified criteria, the function returns 0. Excel also returns a zero if no rows are found that meet the conditions.
Related Functions
The AVERAGEX function can take as its argument an expression that is evaluated for each row in a table. This enables you to perform calculations and then take the average of the calculated values.
The AVERAGEA function takes a column as its argument, but otherwise is like the Excel function of the same name. By using the AVERAGEA function, you can calculate a mean on a column that contains empty values.
Syntax
AVERAGE(<column>)
column. The column that contains the numbers for which you want the average.
Returns a number that represents the arithmetic mean of the numbers in the column.
Example
The following formula returns the average of the values in the column, ExtendedSalesAmount, in the table, InternetSales.
=AVERAGE(InternetSales[ExtendedSalesAmount])
AVERAGEA
Returns the average (arithmetic mean) of the values in a column. Handles text and non-numeric values.
The AVERAGEA function takes a column and averages the numbers in it, but also handles non-numeric data types according to the following rules:
Values that evaluates to TRUE count as 1.
Values that evaluate to FALSE count as 0 (zero).
Values that contain non-numeric text count as 0 (zero).
Empty text ("") counts as 0 (zero).
If you do not want to include logical values and text representations of numbers in a reference as part of the calculation, use the AVERAGE function.
Whenever there are no rows to aggregate, the function returns a blank. However, if there are rows, but none of them meet the specified criteria, the function returns 0. The function also returns a zero if no rows are found that meet the conditions.
Syntax
AVERAGEA(<column>)
column. The column that contains the values for which you want the average.
Returns a number (R8).
Example
The following example returns the average of non-blank cells in the referenced column, given the following table. If you used the AVERAGE function, the mean would be 21/2; with the AVERAGEA function, the result is 22/5.
Transaction ID | Amount | Result |
0000123 | 1 | Counts as 1 |
0000124 | 20 | Counts as 20 |
0000125 | n/a | Counts as 0 |
0000126 | Counts as 0 | |
0000126 | TRUE | Counts as 1 |
Formula: =AVERAGEA([Amount])
AVERAGEX
Calculates the average (arithmetic mean) of a set of expressions evaluated over a table.
The AVERAGEX function enables you to evaluate expressions for each row of a table, and then take the resulting set of values and calculate its arithmetic mean. Therefore, the function takes a table as its first argument, and an expression as the second argument.
In all other respects, AVERAGEX follows the same rules as AVERAGE. You cannot include non-numeric or null cells. Both the table and expression arguments are required.
When there are no rows to aggregate, the function returns a blank. When there are rows, but none of them meet the specified criteria, then the function returns 0.
Syntax
AVERAGEX(<table>,<expression>)
table. Name of a table, or an expression that specifies the table over which the aggregation can be performed.
expression. An expression with a scalar result, which will be evaluated for each row of the table in the first argument.
Returns a number (R8).
Example
The following example calculates the average freight and tax on each order in the InternetSales table, by first summing Freight plus TaxAmt in each row, and then averaging those sums.
=AVERAGEX(InternetSales, InternetSales[Freight]+ InternetSales[TaxAmt])
If you use multiple operations in the expression used as the second argument, you must use parentheses to control the order of calculations. For more information, see DAX Syntax Specification in the TechNet Library.
COUNT
Counts the number of cells in a column that contain numbers.
The only argument allowed to this function is a column. You can use columns containing any type of data, but only numbers are counted. The COUNT function counts rows that contain the following kinds of values:
Numbers
Dates
If the row contains text that cannot be translated into a number, the row is not counted.
When the function finds no rows to count, it returns a blank. When there are rows, but none of them meet the specified criteria, then the function returns 0.
Syntax
COUNT(<column>)
column. The column that contains the numbers to be counted.
Returns a number (I8).
Example
The following example shows how to count the number of numeric values in the column, ShipDate.
=COUNT([ShipDate])
To count logical values or text, use the COUNTA or COUNTAX functions.
COUNTA
Counts the number of cells in a column that are not empty. It counts not just rows that contain numeric values, but also rows that contain nonblank values, including text, dates, and logical values.
If you do not need to count cells that contain logical values or text, (in other words, if you want to count only cells that contain numbers), use the COUNT or COUNTX functions.
When the function does not find any rows to count, the function returns a blank. When there are rows, but none of them meet the specified criteria, then the function returns 0.
Syntax
COUNTA(<column>)
column. The column that contains the values to be counted.
Returns a number (I8).
Example
The following example returns all rows in the Reseller table that have any kind of value in the column that stores phone numbers. Because the table name does not contain any spaces, the quotation marks are optional.
=COUNTA('Reseller'[Phone])
COUNTAX
Counts nonblank results when evaluating the result of an expression over a table. That is, it works just like the COUNTA function, but is used to iterate through the rows in a table and count rows where the specified expressions results in a nonblank result.
Like the COUNTA function, the COUNTAX function counts cells containing any type of information, including other expressions.
For example, if the column contains an expression that evaluates to an empty string, the COUNTAX function treats that result as nonblank. Usually the COUNTAX function does not count empty cells but in this case the cell contains a formula, so it is counted.
If you do not need to count logical values or text, use the COUNTX function instead.
Whenever the function finds no rows to aggregate, the function returns a blank. However, if there are rows, but none of them meet the specified criteria, the function returns 0. Microsoft Excel also returns 0 if no rows are found that meet the condition.
Syntax
COUNTAX(<table>,<expression>)
table. The table containing the rows for which the expression will be evaluated.
expression. The expression to be evaluated for each row of the table.
Returns a number (I8).
Example
The following example counts the number of nonblank rows in the column, Phone, using the table that results from filtering the Reseller table on [Status] = Active.
=COUNTAX(FILTER('Reseller',[Status]="Active"),[Phone])
COUNTBLANK
Counts the number of blank cells in a column.
The only argument allowed to this function is a column. You can use columns containing any type of data, but only blank cells are counted. Cells that have the value zero (0) are not counted, as zero is considered a numeric value and not a blank.
Whenever there are no rows to aggregate, the function returns a blank. However, if there are rows, but none of them meet the specified criteria, the function returns 0. Microsoft Excel also returns a zero if no rows are found that meet the conditions.
In other words, if the COUNTBLANK function finds no blanks, the result will be zero, but if there are no rows to check, the result will be blank.
Syntax
COUNTBLANK(<column>)
column. The column that contains the blank cells to be counted.
Returns a number (I8). If no rows are found that meet the condition, blanks are returned.
Example
The following example shows how to count the number of rows in the table Reseller that have blank values for BankName.
=COUNTBLANK(Reseller[BankName])
To count logical values or text, use the COUNTA or COUNTAX functions.
COUNTROWS
Counts the number of rows in the specified table, or in a table defined by an expression.
This function can be used to count the number of rows in a base table, but more often is used to count the number of rows that result from filtering a table, or applying context to a table.
Whenever there are no rows to aggregate, the function returns a blank. However, if there are rows, but none of them meet the specified criteria, the function returns 0. Microsoft Excel also returns a zero if no rows are found that meet the conditions.
Syntax
COUNTROWS(<table>)
table. The name of the table that contains the rows to be counted, or an expression that returns a table.
Returns a number (I8).
Example
The following example shows how to count the number of rows in the table Orders. The expected result is 52761.
=COUNTROWS('Orders')
The following example demonstrates how to use COUNTROWS with a row context. In this scenario, there are two sets of data that are related by order number. The table Reseller contains one row for each reseller; the table ResellerSales contains multiple rows for each order, each row containing one order for a particular reseller. The tables are connected by a relationship on the column, ResellerKey.
The formula gets the value of ResellerKey and then counts the number of rows in the related table that have the same reseller ID. The result is output in the column, CalculatedColumn1.
=COUNTROWS(RELATEDTABLE(ResellerSales))
The following table shows a portion of the expected results:
ResellerKey | CalculatedColumn1 |
1 | 73 |
2 | 70 |
3 | 394 |
COUNTX
Counts the number of rows that contain a number or an expression that evaluates to a number, when evaluating an expression over a table.
The COUNTX function takes two arguments. The first argument must always be a table, or any expression that returns a table. The second argument is the column or expression that is searched by COUNTX.
The COUNTX function counts only numeric values, or dates. Arguments that are logical values or text that cannot be translated into numbers are not counted. If the function finds no rows to count, it returns a blank. When there are rows, but none meets the specified criteria, then the function returns 0.
If you want to count logical values, or text, use the COUNTA or COUNTAX functions.
Syntax
COUNTX(<table>,<expression>)
table. The table containing the rows to be counted.
expression. An expression that returns the set of values that contains the values you want to count.
Returns an integer.
Example
The following formula returns a count of all rows in the Product table that have a list price.
=COUNTX(Product,[ListPrice])
The following formula illustrates how to pass a filtered table to COUNTX for the first argument. The formula uses a filter expression to get only the rows in the Product table that meet the condition, ProductSubCategory = "Caps", and then counts the rows in the resulting table that have a list price. The FILTER expression applies to the table Products but uses a value that you look up in the related table, ProductSubCategory.
=COUNTX(FILTER(Product,RELATED(ProductSubcategory[EnglishProductSubcategoryName])="Caps", Product[ListPrice])
MAX
Returns the largest numeric value in a column.
The MAX function takes as an argument a column that contains numeric values. If the column contains no numbers, MAX returns a blank. If you want to evaluate values that are not numbers, use the MAXA function.
Syntax
MAX(<column>)
column. The column in which you want to find the largest numeric value.
Returns a number (R8).
Example
The following example returns the largest value found in the ExtendedAmount column of the InternetSales table.
=MAX(InternetSales[ExtendedAmount])
MAXA
Returns the largest value in a column. Logical values and blanks are counted.
The MAXA function takes as argument a column, and looks for the largest value among the following types of values:
Numbers
Dates
Logical values, such as TRUE and FALSE. Rows that evaluate to TRUE count as 1; rows that evaluate to FALSE count as 0 (zero).
Empty cells are ignored. If the column contains no values that can be used, MAXA returns 0 (zero).
If you do not want to include logical values and blanks as part of the calculation, use the MAX function.
Syntax
MAXA(<column>)
column. The column in which you want to find the largest value.
Returns a number (R8).
Example
The following example returns the greatest value from a calculated column, named ResellerMargin, that computes the difference between list price and reseller price.
=MAXA([ResellerMargin])
The following example returns the largest value from a column that contains dates and times. This formula therefore gets the most recent transaction date.
=MAXA([TransactionDate])
MAXX
Evaluates an expression for each row of a table and returns the largest numeric value.
The table argument to the MAXX function can be a table name, or an expression that evaluates to a table. The second argument indicates the expression to be evaluated for each row of the table.
Of the values to evaluate, only the following are counted:
Numbers. If the expression does not evaluate to a number, MAXX returns 0 (zero).
Dates.
Empty cells, logical values, and text values are ignored. If you want to include non-numeric values in the formula, use the MAXA function.
If a blank cell is included in the column or expression, MAXX returns an empty column.
Syntax
MAXX(<table>,<expression>)
table. The table containing the rows for which the expression will be evaluated.
expression. The expression to be evaluated for each row of the table.
Returns a number (R8).
Example
The following formula uses an expression as the second argument to calculate the total amount of taxes and shipping for each order in the table, InternetSales. The expected result is 375.7184.
=MAXX(InternetSales, InternetSales[TaxAmt]+ InternetSales[Freight])
The following formula first filters the table InternetSales, by using a FILTER expression, to return a subset of orders for a specific sales region, defined as [SalesTerritory] = 5. The MAXX function then evaluates the expression used as the second argument for each row of the filtered table, and returns the highest cost for tax and shipping for just those orders. The expected result is 250.3724.
=MAXX(FILTER(InternetSales,[SalesTerritoryCode]="5"), InternetSales[TaxAmt]+ InternetSales[Freight])
MIN
Returns the smallest numeric value in a column. Ignores logical values and text.
The MIN function takes a column as an argument, and returns the smallest numeric value in the column. The following types of values in the columns are counted:
Numbers
Dates
If the column contains no numerical data, MIN returns blanks.
Empty cells, logical values, and text are ignored. If you want to include logical values and text representations of numbers in a reference as part of the calculation, use the MINA function.
Syntax
MIN(<column>)
column. The column in which you want to find the smallest numeric value.
Returns a number (R8).
Example
The following example returns the smallest value from the calculated column, ResellerMargin.
=MIN([ResellerMargin])
The following example returns the smallest value from a column that contains dates and times, TransactionDate. This formula therefore returns the date that is earliest.
=MIN([TransactionDate])
MINA
Returns the smallest value in a column, including any logical values and numbers represented as text.
The MINA function takes as argument a column that contains numbers, and determines the smallest value as follows:
If the column contains no numeric values, MINA returns 0 (zero).
Rows in the column that evaluates to logical values, such as TRUE and FALSE are treated as 1 if TRUE and 0 (zero) if FALSE.
Empty cells are ignored.
If you do not want to include logical values and text as part of the calculation, use the MIN function instead.
Syntax
MINA(<column>)
column. The column for which you want to find the minimum value.
Returns a number (R8).
Example
The following expression returns the minimum freight charge from the table, InternetSales.
=MINA(InternetSales[Freight])
The following expression returns the minimum value in the column, PostalCode. Because the data type of the column is text, the function does not find any numeric values, and the formula returns zero (0).
=MINA([PostalCode])
MINX
Returns the smallest numeric value that results from evaluating an expression for each row of a table.
The MINX function takes as its first argument a table, or an expression that returns a table. The second argument contains the expression that is evaluated for each row of the table.
The MINX function evaluates the results of the expression in the second argument according to the following rules:
Only numbers are counted. If the expression does not result in a number, MINX returns 0 (zero).
Empty cells, logical values, and text values are ignored. Numbers represented as text are treated as text.
If you want to include logical values and text representations of numbers in a reference as part of the calculation, use the MINA function.
Syntax
MINX(<table>, < expression>)
table. The table containing the rows for which the expression will be evaluated.
expression. The expression to be evaluated for each row of the table.
Returns a number (R8).
Example
The following example filters the table, InternetSales, and returns only rows for a specific sales territory. The formula then finds the minimum value in the column, Freight.
=MINX( FILTER(InternetSales, [SalesTerritoryKey] = 5),[Freight])
The following example uses the same filtered table as in the previous example, but instead of merely looking up values in the column for each row of the filtered table, the function calculates the sum of two columns, Freight and TaxAmt, and returns the smallest value resulting from that calculation.
=MINX( FILTER(InternetSales, InternetSales[SalesTerritoryKey] = 5), InternetSales[Freight] + InternetSales[TaxAmt])