Output data for Frequently bought together

Raw output tables

StoreEntities

Description - Contains mapping of retail entity IDs to store and party IDs

Fields -

  • RetailEntityId, PK, LongType: a unique ID representing a single store or retailer

  • StoreId, FK to Store, IntegerType: a unique ID representing a single store

  • PartyId, FK to Party, LongType: a unique ID representing a single retailer

PartyEntities

Description - Contains mapping of retail entity IDs to party IDs

Fields -

  • RetailEntityId, PK, LongType: a unique ID representing a single store or retailer

  • PartyId, FK to Party, LongType: a unique ID representing a single retailer

ItemsetDefinitions

Description - Provides a mapping of itemset IDs to the product IDs contained in an itemset. An itemset can consist of one or more individual products. For example, given products A, B and C, the itemset {A, B, C} occurs when the three products are bought together in the same transaction.

Fields -

  • ItemsetId, PK, LongType: a unique ID for the itemset.

  • ProductId, FK to RetailProduct: a unique ID for the product

ItemsetAttributes

Description - Contains details and metrics related to itemsets for each retail entity and time period.

Fields -

  • ItemsetId, FK to ItemsetDefinitions, LongType

  • RetailEntityId, FK to StoreEntities and PartyEntities, LongType

  • TimePeriodId, FK to TimePeriods, IntegerType

  • ItemsetLen, IntegerType: indicate the number of items in the itemset. Can be one or more.

  • Frequency, IntegerType: indicates the number of times the items in the itemset were purchased together. This value can occasionally be slightly lower than the actual number of times items were purchased together due to input data preparation performed by the model. If a transaction contains more items than the threshold set by the max_basket_size parameter, excess items are removed from the transaction, starting with the least frequent.

  • PurchaseFrequency, IntegerType: for itemsets of length 1 (single items), PurchaseFrequency contains the actual number of times the item was present in a transaction. For other itemsets, PurchaseFrequency is equal to Frequency.

  • Support, FloatType: the support metric of the itemset. This value is equal to the percentage of transactions in which the itemset appears.

RuleAttributes

Description - Contains details about the association rules found. Association rules describe the relationship between itemsets. An itemset can consist of one or more products. Association rules are composed of a left-hand side (the antecedent) and a right-hand side (the consequent) itemset.

For example, given products A, B and C, the rule {A, B} => {C} has the antecedent itemset {A, B} and the consequent itemset {C}. This rule is interpreted as customers who buy products A and B together also tend to buy product C with a given probability. In this implementation, the antecedent itemset and the consequent itemset are limited to having one product each. Therefore, association rules in this table only describe the relationship between pairs of products. For example, for the itemset {A, B}, we could have rules {A} => {B} and {B} => {A}.

Fields -

  • RuleId, PK, LongType: a unique ID for the association rule

  • RetailEntityId, FK to StoreEntities and PartyEntities, LongType

  • TimePeriodId, FK to TimePeriods, IntegerType

  • ItemsetId, FK to ItemsetDefinitions, LongType: the itemset ID of the itemset the association rule is constructed from. This field can be mapped to ItemsetId in the ItemsetDefinitions table to obtain the product IDs contained in the itemset.

  • AntecedentItemsetId, FK to ItemsetDefinitions, LongType: the itemset ID of the antecedent itemset

  • ConsequentItemsetId, FK to ItemsetDefinitions, LongType: the itemset ID of the consequent itemset

  • ItemsetSupport, FloatType: the support of the itemset. The field is identical to the support value for the itemset in ItemsetAttributes

  • Confidence, FloatType: the confidence metric of the association rule

  • Lift, FloatType: the lift metric of the association rule

  • Chi2, FloatType: the chi squared statistic of the association rule. This metric can be used to measure the statistical significance of the association rule and indicate the strength of association between the antecedent and consequent itemset. Chi squared is computed taking into account the support, confidence, and lift metrics together. The higher the chi squared, the stronger the association between the products. For more information on using the chi squared metric for association rules, see the paper Chi-squared computation for association rules: Preliminary results.

  • Chi2IsValid, BooleanType: indicates whether the chi squared metric is valid. Chi squared is statistical test, which imposes conditions on the level of support, confidence and lift of an association rule for the test itself to be valid. A value of chi squared is high but if the test is invalid, the result can't be relied upon.

  • Chi2IsSignificant, BooleanType: indicates whether the association rule is statistically significant according to the chi squared statistical test. The significance level of the chi squared test can be configured with the chi_2_alpha model parameter.

DataStatistics

Description - Provides statistics about the analysis results for each retail entity and time period.

Fields -

  • TimePeriodId, FK to TimePeriods, IntegerType

  • RetailEntityId, FK to StoreEntities and PartyEntities, LongType

  • FirstTimestamp, TimestampType: first transaction timestamp that appears in the time period and retail entity

  • LastTimestamp, TimestampType: last transaction timestamp that appears in the time period and retail entity

  • NumTransaction, IntegerType: number of transactions in the time period and retail entity

  • NumUniqueItemsPurchased, IntegerType: number of unique items that were purchased in the time period and retail entity

  • AverageBasketSize, FloatType: the average number of unique products purchased in one transaction

  • NumRulesFound, IntegerType: the total number of association rules discovered

  • NumSignificantRulesFound, IntegerType: the total number of association rules discovered that are significant according to the chi squared test

  • MinSupportThreshold, FloatType: the minimum support for an itemset to be considered frequent

  • MinItemsetSupport, FloatType: the minimum itemset support among the discovered frequent itemsets

  • MaxItemsetSupport, FloatType: the maximum itemset support among the discovered frequent itemsets

  • AverageItemsetSupport, FloatType: the average itemset support among the discovered frequent itemsets

  • MinLift, FloatType: the minimum lift among the discovered association rules

  • MaxLift, FloatType: the maximum lift among the discovered association rules

  • AverageLift, FloatType: the average lift among the discovered association rules

  • MinChi2, FloatType: the minimum chi squared among the discovered association rules

  • Max Chi2, FloatType: the maximum chi squared among the discovered association rules

  • Average Chi2, FloatType: the average chi squared among the discovered association rules

TimePeriods

Description - Contains details of the analysis time periods defined in the analysis config.

Fields -

  • TimePeriodId, PK, IntegerType: a unique ID for the analysis time period

  • TimePeriodStart, TimestampType: the date and time of the start of the analysis time period

  • TimePeriodEnd, TimestampType: the date and time of the end of the analysis time period

  • TimePeriodName, StringType: the user defined name of the analysis period

  • TimePeriodDescription, StringType: the user defined description of the analysis period

Output used by the Power BI Dashboard

TimePeriods

Same table as explained earlier. TimePeriods

FBTProductsUI

Description - The FBTProductsUI is the primary table serving the Power BI dashboard containing association rules for Frequently bought together products. The results contained in this table are limited to itemsets containing two products only. Additionally, it contains one association rule per itemset with the rule having the highest chi squared value being selected.

Fields -

  • RuleId, PK, LongType

  • TimePeriodId, PK, FK to TimePeriods, IntegerType

  • StoreId, PK, FK to Store, IntegerType

  • PartyId, PK, FK to Party, LongType

  • IsoCurrencyCode, PK, FK to Currency, StringType

  • CombinationRank, IntegerType: the rank of the association rule in terms of chi squared value

  • Product1Id, FK to RetailProduct (input table), LongType: the antecedent product ID of the association rule

  • Product2Id, FK to RetailProduct (input table), LongType: the consequent product ID of the association rule

  • CombinationTransactionCount, IntegerType: Number of times the two products were purchased together

  • Product1TransactionCount, IntegerType: Number of times Product1 was purchased irrespectively of Product2 during the time period per store or retailer

  • Product2TransactionCount, IntegerType: Number of times Product2 was purchased irrespectively of Product1 during the time period per store or retailer irrespectively of Product2

  • Product1CombinationPurchaseQuantity, DecimalType: Number of items (quantity) of Product1 when the product was bought together with Product2

  • Product2CombinationPurchaseQuantity, DecimalType: Number of items (quantity) of Product2 when the product was bought together with Product1

  • Product1TotalPurchaseQuantity, DecimalType: Number of items (quantity) of Product1 that was purchased irrespectively of Product2 during the time period per store or retailer

  • Product2TotalPurchaseQuantity, DecimalType: Number of items (quantity) of Product2 that was purchased irrespectively of Product1 during the time period per store or retailer

  • TotalTransactions, IntegerType: The total number of transactions during the time period per store or retailer

  • Product1CombinationPurchaseAmount, DecimalType: The Product1's purchase amount based on all transactions when the product was bought together with Product2

  • Product2CombinationPurchaseAmount, DecimalType: The Product2's purchase amount based on all transactions when the product was bought together with Product1

  • Product1TotalPurchaseAmount, DecimalType: The Product1's purchase amount based on all transactions (irrespectively if that product was bought together with Product2)

  • Product2TotalPurchaseAmount, DecimalType: The Product2's purchase amount based on all transactions (irrespectively if that product was bought together with Product1)

  • RuleQualityCategoryId, IntegerType: a number indicating the strength of association between the products. 0 indicates low, 1 indicates medium, and 2 indicates high strength of association. These values are computed by ranking the association rules according to the chi squared metric and splitting them equally into three equally sized groups.

  • IsSignificant, BooleanType: indicates whether the association rule is statistically significant according to the chi squared test and that the chi squared test itself was valid. The significance level of the chi squared test can be configured with the chi_2_alpha model parameter.

FBTProductsAssociationsUI

Description - A table containing the IDs of the top association rules per product. The number of association rules per product can be configured with the num_top_associated_products parameter. This table can be joined onto FBTProductsUI to filter it by product ID.

Fields -

  • ProductId – the ID of the product to filter for

  • AssociatedProductId – the ID of the other product in the association rule

  • RuleId – the ID of the association rule