Reinforcement Learning for Solving the Knight’s Tour Problem: Unterschied zwischen den Versionen
(Die Seite wurde neu angelegt: „{{Vortrag |vortragender=Dan Jia |email=ullrm@student.kit.edu |vortragstyp=Proposal |betreuer=Edouard Fouché |termin=Institutsseminar/2023-03-09 |vortragsmodus…“) |
Keine Bearbeitungszusammenfassung |
||
Zeile 6: | Zeile 6: | ||
|termin=Institutsseminar/2023-03-09 | |termin=Institutsseminar/2023-03-09 | ||
|vortragsmodus=online | |vortragsmodus=online | ||
|kurzfassung= | |kurzfassung=The knight’s tour problem is an instance of the Hamiltonian path problem that is a typical NP-hard problem. A knight makes L-shape moves on a chessboard and tries to visit all the squares exactly once. The tour is closed if a knight can finish a complete tour and end on a square that is a neighbourhood of its starting square; Otherwise, it is open. Many algorithms and heuristics have been proposed to solve this problem. The most well-known one is warnsdorff’s heuristic. Warnsdorff’s idea is to move to the square with the fewest possible moves in a greedy fashion. Although this heuristic is fast, it does not always return a closed tour. Also, it only works on boards of certain dimensions. Due to its greedy behaviour, it can get stuck into a local optimum easily. That is similar to the other existing approaches. Our goal in this thesis is to come up with a new strategy based on reinforcement learning. Ideally, it should be able to find a closed tour on chessboards of any size. We will consider several approaches: value-based methods, policy optimization and actor-critic methods. Compared to previous work, our approach is non-deterministic and sees the problem as a single-player game with a tradeoff between exploration and exploitation. We will evaluate the effectiveness and efficiency of the existing methods and new heuristics. | ||
}} | }} |
Aktuelle Version vom 23. Februar 2023, 15:14 Uhr
Vortragende(r) | Dan Jia | |
---|---|---|
Vortragstyp | Proposal | |
Betreuer(in) | Edouard Fouché | |
Termin | Do 9. März 2023 | |
Vortragssprache | ||
Vortragsmodus | online | |
Kurzfassung | The knight’s tour problem is an instance of the Hamiltonian path problem that is a typical NP-hard problem. A knight makes L-shape moves on a chessboard and tries to visit all the squares exactly once. The tour is closed if a knight can finish a complete tour and end on a square that is a neighbourhood of its starting square; Otherwise, it is open. Many algorithms and heuristics have been proposed to solve this problem. The most well-known one is warnsdorff’s heuristic. Warnsdorff’s idea is to move to the square with the fewest possible moves in a greedy fashion. Although this heuristic is fast, it does not always return a closed tour. Also, it only works on boards of certain dimensions. Due to its greedy behaviour, it can get stuck into a local optimum easily. That is similar to the other existing approaches. Our goal in this thesis is to come up with a new strategy based on reinforcement learning. Ideally, it should be able to find a closed tour on chessboards of any size. We will consider several approaches: value-based methods, policy optimization and actor-critic methods. Compared to previous work, our approach is non-deterministic and sees the problem as a single-player game with a tradeoff between exploration and exploitation. We will evaluate the effectiveness and efficiency of the existing methods and new heuristics. |