Sending negative reward score to Azure Personalizer

Rman-1501 0 Reputation points
2023-07-13T00:53:24.1266667+00:00

The reward score I send is a number between 0 and 1 (as recommended in the docs). In my use case, I send multiple rewards for an event, depending on what the user does. Recently I changed the reward aggregation method from "Earliest" (the first reward sent) to "Sum", which I believe adds up all the rewards I send for an event.

My questions are:

  1. Would it be fine if the sum of all sent rewards is more than 1?
  2. What happens if I send a negative reward? Sometimes the user does something that shows they have become less interested. Since I use the "Sum" aggregation method, will a negative amount be subtracted from the final reward?
  3. What if the sum of all sent rewards becomes a negative number? What does a negative reward mean for the algorithm in Azure Personalizer?

Appreciate it,

Thank you.

Azure AI Personalizer
Azure AI services

1 answer

  1. romungi-MSFT 48,126 Reputation points Microsoft Employee
    2023-07-13T09:17:44.86+00:00

    @Rman-1501 Based on my understanding of the service, I think the following would happen for the scenarios mentioned.

    • Would it be fine if the sum of all sent rewards is more than 1?

    The reward value is a scalar between 0 and 1, inclusive. It represents the quality of the action that was taken: a higher value indicates a better action. Therefore, if the sum of all rewards is greater than 1, it means that multiple rewards were sent for the same event, and the algorithm will consider their combined value.
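To make the difference between the two aggregation settings concrete, here is a minimal sketch in plain Python. It only illustrates the arithmetic described in the question ("Earliest" keeps the first reward, "Sum" adds them all). The function name `aggregate_rewards` and the sample values are hypothetical; this is not the Personalizer SDK and not the service's actual implementation.

```python
from typing import Sequence


def aggregate_rewards(rewards: Sequence[float], method: str = "sum") -> float:
    """Combine the rewards received for one event (illustration only)."""
    if not rewards:
        # Assumption: 0.0 stands in for whatever default reward is configured.
        return 0.0
    if method == "earliest":
        # Keep only the first reward that was sent for the event.
        return rewards[0]
    if method == "sum":
        # Add every reward that was sent for the event.
        return sum(rewards)
    raise ValueError(f"unknown aggregation method: {method!r}")


# Three rewards sent for the same event, each within [0, 1].
event_rewards = [0.5, 0.25, 0.5]

print(aggregate_rewards(event_rewards, "earliest"))  # 0.5
print(aggregate_rewards(event_rewards, "sum"))       # 1.25
```

With "Sum", three individually valid scores add up to 1.25, which is how a total above 1 can arise even when every single reward is in range.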

    • What happens if I send a negative reward? Sometimes the user does something that shows they have become less interested. Since I use the "Sum" aggregation method, will a negative amount be subtracted from the final reward?

    You can send a zero reward score instead of a negative score. According to the documentation, negative scores are possible only in specific scenarios and should be used only if you are experienced with reinforcement learning (RL). Personalizer trains the model to achieve the highest possible sum of rewards over time.
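One way to follow this advice on the client side is sketched below, under stated assumptions: each user behaviour gets a weight, a "less interested" signal contributes 0 rather than a negative number, and the weights are chosen so that the total for one event cannot exceed 1. The signal names, the weights, and both helper functions are hypothetical and would need to be adapted to your application.

```python
from typing import Iterable

# Hypothetical weights: the positive signals add up to exactly 1.0, and a
# negative signal ("dismiss") is scored as 0.0 instead of a negative value.
SIGNAL_WEIGHTS = {
    "click": 0.25,
    "add_to_cart": 0.25,
    "purchase": 0.5,
    "dismiss": 0.0,
}


def reward_for(signal: str) -> float:
    """Reward to send for a single behaviour; unknown signals score 0."""
    return SIGNAL_WEIGHTS.get(signal, 0.0)


def expected_total(signals: Iterable[str]) -> float:
    """Total a 'Sum' aggregation would reach if each distinct signal is rewarded once."""
    distinct = dict.fromkeys(signals)  # drop repeats, keep first-seen order
    total = sum(reward_for(s) for s in distinct)
    # Defensive clamp in case the weight table is edited later.
    return min(1.0, max(0.0, total))


print(expected_total(["click", "dismiss", "purchase", "click"]))  # 0.75
```

The design choice here is to reward each distinct behaviour at most once per event, so that repeated clicks cannot push the summed reward above 1.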

    • What if the sum of all sent rewards becomes a negative number? What does a negative reward mean for the algorithm in Azure Personalizer?

    As noted above, Personalizer trains the model to achieve the highest possible sum of rewards over time. So, if the sum becomes negative, it might be treated as if the user did not perform any actions.

    I hope this helps!!

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further queries, do let us know.
