The paper's key technical contribution is a general approximate dynamic programming (ADP) method that can learn from the integer linear programming (ILP) based assignment used in ride-pooling. The authors also handle the additional combinatorial complexity that arises from combinations of passenger requests by using a neural-network-based approximate value function, and they show a connection to deep reinforcement learning that allows this value function to be learned with increased stability and sample efficiency.
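To make the idea concrete, here is a minimal sketch of how a value function can feed into an assignment step. It is not the paper's implementation. Each feasible vehicle action is scored as its immediate reward plus a discounted estimate of future value, and the best joint assignment is then chosen subject to each request being served at most once. The function `value_estimate`, the discount `GAMMA`, the dictionary layout, and all numbers are assumptions made up for illustration. A hand-written function stands in for the neural network, and brute-force enumeration stands in for the ILP solver.

```python
from itertools import product

GAMMA = 0.9  # discount factor (assumed value, for illustration only)


def value_estimate(num_passengers):
    """Hypothetical stand-in for a learned neural value function.

    Returns a made-up estimate of future value with diminishing returns.
    """
    return 2.0 * num_passengers - 0.5 * num_passengers ** 2


def score(action):
    """Immediate reward plus discounted estimated value of the next state."""
    return action["reward"] + GAMMA * value_estimate(action["passengers_after"])


def best_assignment(feasible):
    """Pick one action per vehicle so that no request is served twice.

    Brute force replaces the ILP here; it is only practical for tiny inputs.
    """
    vehicles = sorted(feasible)
    best, best_total = None, float("-inf")
    for combo in product(*(feasible[v] for v in vehicles)):
        served = [r for a in combo for r in a["requests"]]
        if len(served) != len(set(served)):
            continue  # a request appears in two vehicles' actions
        total = sum(score(a) for a in combo)
        if total > best_total:
            best, best_total = dict(zip(vehicles, combo)), total
    return best, best_total


# Toy instance: two vehicles, two requests (all values are invented).
idle = {"requests": frozenset(), "reward": 0.0, "passengers_after": 0}
feasible = {
    "v1": [
        idle,
        {"requests": frozenset({"r1"}), "reward": 1.0, "passengers_after": 1},
        {"requests": frozenset({"r1", "r2"}), "reward": 2.0, "passengers_after": 2},
    ],
    "v2": [
        idle,
        {"requests": frozenset({"r2"}), "reward": 1.0, "passengers_after": 1},
    ],
}

best, total = best_assignment(feasible)
for vehicle, action in best.items():
    print(vehicle, sorted(action["requests"]))
```

In this toy instance the value estimate changes which assignment wins, which is the role the learned value function plays in the approach described above. A realistic system would replace the enumeration with an ILP solver and the stand-in function with a trained network.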
Attachment: Neural Approximate Dynamic Programming for On-Demand Ride-Pooling.pdf