Output data for Frequently bought together
Raw output tables
StoreEntities
Description - Contains mapping of retail entity IDs to store and party IDs
Fields -
RetailEntityId, PK, LongType: a unique ID representing a single store or retailer
StoreId, FK to Store, IntegerType: a unique ID representing a single store
PartyId, FK to Party, LongType: a unique ID representing a single retailer
PartyEntities
Description - Contains mapping of retail entity IDs to party IDs
Fields -
RetailEntityId, PK, LongType: a unique ID representing a single store or retailer
PartyId, FK to Party, LongType: a unique ID representing a single retailer
ItemsetDefinitions
Description - Provides a mapping of itemset IDs to the product IDs contained in an itemset. An itemset can consist of one or more individual products. For example, given products A, B and C, the itemset {A, B, C} occurs when the three products are bought together in the same transaction.
Fields -
ItemsetId, PK, LongType: a unique ID for the itemset.
ProductId, FK to RetailProduct: a unique ID for the product
ItemsetAttributes
Description - Contains details and metrics related to itemsets for each retail entity and time period.
Fields -
ItemsetId, FK to ItemsetDefinitions, LongType
RetailEntityId, FK to StoreEntities and PartyEntities, LongType
TimePeriodId, FK to TimePeriods, IntegerType
ItemsetLen, IntegerType: indicate the number of items in the itemset. Can be one or more.
Frequency, IntegerType: indicates the number of times the items in the itemset were purchased together. This value can occasionally be slightly lower than the actual number of times items were purchased together due to input data preparation performed by the model. If a transaction contains more items than the threshold set by the max_basket_size parameter, excess items are removed from the transaction, starting with the least frequent.
PurchaseFrequency, IntegerType: for itemsets of length 1 (single items), PurchaseFrequency contains the actual number of times the item was present in a transaction. For other itemsets, PurchaseFrequency is equal to Frequency.
Support, FloatType: the support metric of the itemset. This value is equal to the percentage of transactions in which the itemset appears.
RuleAttributes
Description - Contains details about the association rules found. Association rules describe the relationship between itemsets. An itemset can consist of one or more products. Association rules are composed of a left-hand side (the antecedent) and a right-hand side (the consequent) itemset.
For example, given products A, B and C, the rule {A, B} => {C} has the antecedent itemset {A, B} and the consequent itemset {C}. This rule is interpreted as customers who buy products A and B together also tend to buy product C with a given probability. In this implementation, the antecedent itemset and the consequent itemset are limited to having one product each. Therefore, association rules in this table only describe the relationship between pairs of products. For example, for the itemset {A, B}, we could have rules {A} => {B} and {B} => {A}.
Fields -
RuleId, PK, LongType: a unique ID for the association rule
RetailEntityId, FK to StoreEntities and PartyEntities, LongType
TimePeriodId, FK to TimePeriods, IntegerType
ItemsetId, FK to ItemsetDefinitions, LongType: the itemset ID of the itemset the association rule is constructed from. This field can be mapped to ItemsetId in the ItemsetDefinitions table to obtain the product IDs contained in the itemset.
AntecedentItemsetId, FK to ItemsetDefinitions, LongType: the itemset ID of the antecedent itemset
ConsequentItemsetId, FK to ItemsetDefinitions, LongType: the itemset ID of the consequent itemset
ItemsetSupport, FloatType: the support of the itemset. The field is identical to the support value for the itemset in ItemsetAttributes
Confidence, FloatType: the confidence metric of the association rule
Lift, FloatType: the lift metric of the association rule
Chi2, FloatType: the chi squared statistic of the association rule. This metric can be used to measure the statistical significance of the association rule and indicate the strength of association between the antecedent and consequent itemset. Chi squared is computed taking into account the support, confidence, and lift metrics together. The higher the chi squared, the stronger the association between the products. For more information on using the chi squared metric for association rules, see the paper Chi-squared computation for association rules: Preliminary results.
Chi2IsValid, BooleanType: indicates whether the chi squared metric is valid. Chi squared is statistical test, which imposes conditions on the level of support, confidence and lift of an association rule for the test itself to be valid. A value of chi squared is high but if the test is invalid, the result can't be relied upon.
Chi2IsSignificant, BooleanType: indicates whether the association rule is statistically significant according to the chi squared statistical test. The significance level of the chi squared test can be configured with the chi_2_alpha model parameter.
DataStatistics
Description - Provides statistics about the analysis results for each retail entity and time period.
Fields -
TimePeriodId, FK to TimePeriods, IntegerType
RetailEntityId, FK to StoreEntities and PartyEntities, LongType
FirstTimestamp, TimestampType: first transaction timestamp that appears in the time period and retail entity
LastTimestamp, TimestampType: last transaction timestamp that appears in the time period and retail entity
NumTransaction, IntegerType: number of transactions in the time period and retail entity
NumUniqueItemsPurchased, IntegerType: number of unique items that were purchased in the time period and retail entity
AverageBasketSize, FloatType: the average number of unique products purchased in one transaction
NumRulesFound, IntegerType: the total number of association rules discovered
NumSignificantRulesFound, IntegerType: the total number of association rules discovered that are significant according to the chi squared test
MinSupportThreshold, FloatType: the minimum support for an itemset to be considered frequent
MinItemsetSupport, FloatType: the minimum itemset support among the discovered frequent itemsets
MaxItemsetSupport, FloatType: the maximum itemset support among the discovered frequent itemsets
AverageItemsetSupport, FloatType: the average itemset support among the discovered frequent itemsets
MinLift, FloatType: the minimum lift among the discovered association rules
MaxLift, FloatType: the maximum lift among the discovered association rules
AverageLift, FloatType: the average lift among the discovered association rules
MinChi2, FloatType: the minimum chi squared among the discovered association rules
Max Chi2, FloatType: the maximum chi squared among the discovered association rules
Average Chi2, FloatType: the average chi squared among the discovered association rules
TimePeriods
Description - Contains details of the analysis time periods defined in the analysis config.
Fields -
TimePeriodId, PK, IntegerType: a unique ID for the analysis time period
TimePeriodStart, TimestampType: the date and time of the start of the analysis time period
TimePeriodEnd, TimestampType: the date and time of the end of the analysis time period
TimePeriodName, StringType: the user defined name of the analysis period
TimePeriodDescription, StringType: the user defined description of the analysis period
Output used by the Power BI Dashboard
TimePeriods
Same table as explained earlier. TimePeriods
FBTProductsUI
Description - The FBTProductsUI is the primary table serving the Power BI dashboard containing association rules for Frequently bought together products. The results contained in this table are limited to itemsets containing two products only. Additionally, it contains one association rule per itemset with the rule having the highest chi squared value being selected.
Fields -
RuleId, PK, LongType
TimePeriodId, PK, FK to TimePeriods, IntegerType
StoreId, PK, FK to Store, IntegerType
PartyId, PK, FK to Party, LongType
IsoCurrencyCode, PK, FK to Currency, StringType
CombinationRank, IntegerType: the rank of the association rule in terms of chi squared value
Product1Id, FK to RetailProduct (input table), LongType: the antecedent product ID of the association rule
Product2Id, FK to RetailProduct (input table), LongType: the consequent product ID of the association rule
CombinationTransactionCount, IntegerType: Number of times the two products were purchased together
Product1TransactionCount, IntegerType: Number of times Product1 was purchased irrespectively of Product2 during the time period per store or retailer
Product2TransactionCount, IntegerType: Number of times Product2 was purchased irrespectively of Product1 during the time period per store or retailer irrespectively of Product2
Product1CombinationPurchaseQuantity, DecimalType: Number of items (quantity) of Product1 when the product was bought together with Product2
Product2CombinationPurchaseQuantity, DecimalType: Number of items (quantity) of Product2 when the product was bought together with Product1
Product1TotalPurchaseQuantity, DecimalType: Number of items (quantity) of Product1 that was purchased irrespectively of Product2 during the time period per store or retailer
Product2TotalPurchaseQuantity, DecimalType: Number of items (quantity) of Product2 that was purchased irrespectively of Product1 during the time period per store or retailer
TotalTransactions, IntegerType: The total number of transactions during the time period per store or retailer
Product1CombinationPurchaseAmount, DecimalType: The Product1's purchase amount based on all transactions when the product was bought together with Product2
Product2CombinationPurchaseAmount, DecimalType: The Product2's purchase amount based on all transactions when the product was bought together with Product1
Product1TotalPurchaseAmount, DecimalType: The Product1's purchase amount based on all transactions (irrespectively if that product was bought together with Product2)
Product2TotalPurchaseAmount, DecimalType: The Product2's purchase amount based on all transactions (irrespectively if that product was bought together with Product1)
RuleQualityCategoryId, IntegerType: a number indicating the strength of association between the products. 0 indicates low, 1 indicates medium, and 2 indicates high strength of association. These values are computed by ranking the association rules according to the chi squared metric and splitting them equally into three equally sized groups.
IsSignificant, BooleanType: indicates whether the association rule is statistically significant according to the chi squared test and that the chi squared test itself was valid. The significance level of the chi squared test can be configured with the chi_2_alpha model parameter.
FBTProductsAssociationsUI
Description - A table containing the IDs of the top association rules per product. The number of association rules per product can be configured with the num_top_associated_products parameter. This table can be joined onto FBTProductsUI to filter it by product ID.
Fields -
ProductId – the ID of the product to filter for
AssociatedProductId – the ID of the other product in the association rule
RuleId – the ID of the association rule