**Elevator Planning During Evening Rush Hour**

Example name | Elevators |

Action space | Dict |

State space | Dict |

The Elevator domain models evening rush hours when people from different floors in a building want to go down to the bottom floor using elevators. Potential passengers arrive at a floor according to the Poisson process, with specific rates that can be different across different floors. An elevator can move upwards to pick up passengers; once it opens its door to let people in, it can only move down towards the bottom floor, though the elevator can stop at intermediate floors to pick up more passengers.

Per each person in an elevator, there is a small penalty in rewards. Additionally, there is a penalty for each person waiting for an elevator.

When an elevator delivers a passenger to their destination, a large positive reward is given per person.
So, a good policy should be able to minimize the penalties while maximizing the positive rewards by delivering many people so that they can go back home!

ARRIVE-PARAM(floor) float32 The rate of Poisson arrival process TOP-FLOOR(floor) bool Indicator for the top floor BOTTOM-FLOOR(floor) bool Indicator for the bottom floor ADJACENT-UP(floor, floor) bool Indicator for adjacent floors PRECEDENCE(elevator, elevator) bool Which elevator takes precedence over the other IN-ELEVATOR-PENALTY float32 Penalty for people in elevators PEOPLE-WAITING-PENALTY float32 Penalty for people waiting on floors REWARD-DELIVERED float32 Reward for delivering each person to the bottom floor

Constant | Type | Desc |
---|---|---|

ARRIVE-PARAM(floor) | float32 | The rate of Poisson arrival process |

TOP-FLOOR(floor) | bool | Indicator for the top floor |

BOTTOM-FLOOR(floor) | bool | Indicator for the bottom floor |

ADJACENT-UP(floor, floor) | bool | Indicator for adjacent floors |

PRECEDENCE(elevator, elevator) | float32 | Which elevator takes precedence over the other |

IN-ELEVATOR-PENALTY | float32 | Penalty for people in elevators |

PEOPLE-WAITING-PENALTY | float32 | Penalty for people waiting on floors |

REWARD-DELIVERED | float32 | Reward for delivering each person to the bottom floor |

All of these can be read from the RDDLEnv interface and from the RDDL files.

The actions are the elevator control operations.

Action | Type | Desc |
---|---|---|

move-current-dir(elevator) | Discrete(2) | Whether to move in the current direction (cannot move if door is open) |

open-door(elevator) | Discrete(2) | Whether to open the door to let people in (direction changes to down afterwards) |

close-door(elevator) | Discrete(2) | Whether to close door |

he state space represents the temerature and the previous production of each of the plants in the problem.

State | Type | Desc |
---|---|---|

num-person-waiting(floor) | int | Number of people waiting on a floor |

num-person-in-elevator(elevator) | int | Number of people in an elevator |

elevator-dir-up(elevator) | bool | Direction of an elevator (True if going up) |

elevator-closed(elevator) | bool | Whether the door is open or closed |

elevator-at-floor(elevator,floor) | bool | Whether an elevator is on a specific floor |

$r_t = - p_e * \sum_{e \in elevators} num-person-in-elevator(e) - p_w * \sum_{f \in floors} num-person-waiting(f) + rew * \sum_{e \in elevators} (num-person-in-elevator(e) * I_bottom(e))$

where,

- p_e: IN-ELEVATOR-PENALTY
- p_w: PEOPLE-WAITING-PENALTY
- rew: REWARD-DELIVERED
- I_bottom(e): 1 if e is at the bottom floor; otherwise, 0