Rewards

The default task for the game, composed of the on_goal_reached and action_cost reward functions.

A reward function that returns a negative value when an action is taken. Every action incurs a penalty of cost, except for noops, which are free.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| prev_state | State | The previous state of the game. | required |
| action | Array | The action taken. | required |
| new_state | State | The new state of the game. | required |
| cost | float | The cost of taking an action. | 0.01 |

Returns:

| Type | Description |
| --- | --- |
| Array | A scalar array f32[] with value -cost if the action is not a noop, and 0 otherwise. |
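The behaviour described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the name action_cost_sketch and the constant NOOP (the integer id of the noop action) are assumptions made for the example, and the two state arguments are accepted but unused.

```python
import jax.numpy as jnp
from jax import Array

NOOP = 0  # assumption: hypothetical integer id of the noop action


def action_cost_sketch(prev_state, action: Array, new_state, cost: float = 0.01) -> Array:
    """Return -cost for any action other than the noop, and 0 for the noop."""
    # jnp.where selects the value without Python branching,
    # so the function stays traceable under jax.jit.
    penalty = jnp.where(action == NOOP, 0.0, -cost)
    return jnp.asarray(penalty, dtype=jnp.float32)


# The states are irrelevant to this sketch, so None is passed for both.
moved = action_cost_sketch(None, jnp.asarray(2), None)
idle = action_cost_sketch(None, jnp.asarray(NOOP), None)
```

Here moved holds the penalty for a non-noop action and idle holds the zero reward for a noop; both are scalar float32 arrays.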

Compose multiple reward functions into a single reward function. The functions are called in order and the results are reduced using the operator function.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| *reward_functions | Callable[[State, Array, State], Array] | The reward functions to compose, passed as positional arguments. | () |
| operator | Callable | The operator to reduce the results of the reward functions. | sum |

Returns:

| Type | Description |
| --- | --- |
| Callable | A composed reward function that applies the operator to the results of the reward functions. |
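One way such a composition could work is sketched below. It assumes the operator reduces a one-dimensional array of per-function rewards (as jnp.sum does); whether the library reduces in exactly this way is not stated here. The names compose_sketch, goal, and penalty are hypothetical stand-ins introduced for the example.

```python
from typing import Callable

import jax.numpy as jnp
from jax import Array

# Signature shared by all reward functions in this sketch.
RewardFn = Callable[[object, Array, object], Array]


def compose_sketch(*reward_functions: RewardFn, operator: Callable = jnp.sum) -> RewardFn:
    """Build one reward function that reduces the outputs of several."""

    def composed(prev_state, action, new_state) -> Array:
        # Call each function in order, collect the scalar results, then reduce.
        rewards = jnp.asarray(
            [f(prev_state, action, new_state) for f in reward_functions]
        )
        return operator(rewards)

    return composed


# Hypothetical stand-in reward functions, for illustration only.
def goal(prev_state, action, new_state) -> Array:
    return jnp.asarray(1.0, dtype=jnp.float32)


def penalty(prev_state, action, new_state) -> Array:
    return jnp.asarray(-0.01, dtype=jnp.float32)


total = compose_sketch(goal, penalty)(None, jnp.asarray(1), None)
worst = compose_sketch(goal, penalty, operator=jnp.min)(None, jnp.asarray(1), None)
```

With the default sum, total is the goal reward plus the penalty; swapping in jnp.min keeps only the smallest of the individual rewards.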

A reward function that always returns 0, to simulate reward-free learning.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| state | State | The current state of the game. | required |

Returns:

| Type | Description |
| --- | --- |
| Array | A scalar array f32[] with value 0. |

A reward function that returns a positive value when the agent uses the action done in front of a door.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| state | State | The current state of the game. | required |

Returns:

| Type | Description |
| --- | --- |
| Array | A scalar array f32[] with value 1 if the agent uses the action done in front of a door, and 0 otherwise. |

A reward function that returns 1 when the goal is reached, and 0 otherwise.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| state | State | The current state of the game. | required |

Returns:

| Type | Description |
| --- | --- |
| Array | A scalar array f32[] with value 1 if the goal is reached, and 0 otherwise. |
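As a rough sketch of a goal-reached reward, the example below assumes a toy state that stores the grid coordinates of the agent and of the goal, and treats the goal as reached when the two coincide. ToyState, its fields, and on_goal_reached_sketch are hypothetical names; the real State type and the real goal test may differ.

```python
from typing import NamedTuple

import jax.numpy as jnp
from jax import Array


class ToyState(NamedTuple):
    """Hypothetical minimal state: grid coordinates of the agent and the goal."""

    agent_pos: Array
    goal_pos: Array


def on_goal_reached_sketch(state: ToyState) -> Array:
    """Return 1.0 when the agent stands on the goal cell, and 0.0 otherwise."""
    # Assumption: "reached" means every coordinate matches.
    reached = jnp.all(state.agent_pos == state.goal_pos)
    return reached.astype(jnp.float32)


at_goal = on_goal_reached_sketch(ToyState(jnp.asarray([3, 4]), jnp.asarray([3, 4])))
away = on_goal_reached_sketch(ToyState(jnp.asarray([1, 4]), jnp.asarray([3, 4])))
```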

A reward function that penalizes elapsed time: it returns -cost at every time step.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| prev_state | State | The previous state of the game. | required |
| action | Array | The action taken. | required |
| new_state | State | The new state of the game. | required |
| cost | float | The cost of time passing. | 0.01 |

Returns:

| Type | Description |
| --- | --- |
| Array | A scalar array f32[] with value -cost. |
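To make the effect of a per-step cost concrete, the sketch below accumulates it over a short episode. The function time_cost_sketch is a hypothetical stand-in that ignores its state and action arguments, and the episode length of 50 steps is chosen only for illustration.

```python
import jax.numpy as jnp
from jax import Array


def time_cost_sketch(prev_state, action, new_state, cost: float = 0.01) -> Array:
    """Return -cost unconditionally, once per time step."""
    return jnp.asarray(-cost, dtype=jnp.float32)


# Accumulate the penalty over a hypothetical 50-step episode.
num_steps = 50
step_rewards = jnp.stack([time_cost_sketch(None, None, None) for _ in range(num_steps)])
episode_return = step_rewards.sum()
```

With the default cost of 0.01, the penalty accumulated over 50 steps is about -0.5, so shorter episodes yield a higher return.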

A reward function that returns a negative value when the agent hits a wall: each wall hit incurs a penalty of cost.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| state | State | The current state of the game. | required |
| cost | float | The cost of hitting a wall. | 0.01 |

Returns:

| Type | Description |
| --- | --- |
| Array | A scalar array f32[] with value -cost if the agent hits a wall, and 0 otherwise. |