Difference between Matched Rewards and Observed Rewards in Azure Personalizer?

Ross Perry 1 Reputation point
2022-10-10T14:52:15.677+00:00

I've been searching around and can't seem to find much of an explanation about the two.

Say you have a model that uses 20% of the rank calls for exploration. I suspect matched rewards are how many times out of the 80% it was rewarded.

Can anyone confirm this?

By recording locally, I can confirm that learned events and observed rewards match up, but I'm struggling to explain why matched events are so low. Example graph:

249093-image.png

1 answer

  1. romungi-MSFT 48,126 Reputation points Microsoft Employee
    2022-10-12T08:56:13.987+00:00

    @Ross Perry Based on a discussion with the development team, matched rewards are the percentage of the time that Personalizer's best action matched the baseline policy. If exploration is set to 20%, matched rewards will, on average, not exceed 80%.
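
To make the arithmetic in this answer concrete, here is a minimal toy simulation in plain Python. It is not Personalizer's actual computation and calls no Personalizer API. It assumes that exploration events never count as matched, and that on the remaining events the learned policy's best action agrees with the baseline with a fixed probability. The names `simulate_matched_fraction` and `agree_rate` are hypothetical and exist only for this sketch.

```python
import random


def simulate_matched_fraction(n_events, exploration, agree_rate, seed=0):
    """Toy model of the fraction of events counted as 'matched'.

    Assumptions (not taken from the Personalizer docs):
    - an event is an exploration event with probability `exploration`
      and is never counted as matched;
    - otherwise the best action equals the baseline action with
      probability `agree_rate`, and the event is counted as matched.
    """
    rng = random.Random(seed)
    matched = 0
    for _ in range(n_events):
        if rng.random() < exploration:
            continue  # exploration event: not counted as matched
        if rng.random() < agree_rate:
            matched += 1
    return matched / n_events


if __name__ == "__main__":
    for agree_rate in (1.0, 0.5):
        frac = simulate_matched_fraction(
            100_000, exploration=0.2, agree_rate=agree_rate
        )
        print(f"agree_rate={agree_rate:.1f} -> matched fraction {frac:.3f}")
```

Under these assumptions the expected matched fraction is (1 - exploration) * agree_rate, so with 20% exploration it tops out near 80% even when the best action always agrees with the baseline, and it falls further whenever the two disagree.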
