Institutsseminar/2023-03-09

Aus SDQ-Institutsseminar
Version vom 20. Februar 2023, 08:24 Uhr von Edouard Fouché (Diskussion | Beiträge) (Die Seite wurde neu angelegt: „{{Termin |datum=2023-03-09T10:00:00.000Z |online=https://kit-lecture.zoom.us/j/67744231815 }}“)
(Unterschied) ← Nächstältere Version | Aktuelle Version (Unterschied) | Nächstjüngere Version → (Unterschied)
Termin (Alle Termine)
Datum Donnerstag, 9. März 2023
Uhrzeit 10:00 – 10:20 Uhr (Dauer: 20 min)
Ort
Webkonferenz https://kit-lecture.zoom.us/j/67744231815
Vorheriger Termin Fr 3. März 2023
Nächster Termin Fr 17. März 2023

Termin in Kalender importieren: iCal (Download)

Vorträge

Vortragende(r) Dan Jia
Titel Reinforcement Learning for Solving the Knight’s Tour Problem
Vortragstyp Proposal
Betreuer(in) Edouard Fouché
Vortragssprache
Vortragsmodus online
Kurzfassung The knight’s tour problem is an instance of the Hamiltonian path problem that is a typical NP-hard problem. A knight makes L-shape moves on a chessboard and tries to visit all the squares exactly once. The tour is closed if a knight can finish a complete tour and end on a square that is a neighbourhood of its starting square; Otherwise, it is open. Many algorithms and heuristics have been proposed to solve this problem. The most well-known one is warnsdorff’s heuristic. Warnsdorff’s idea is to move to the square with the fewest possible moves in a greedy fashion. Although this heuristic is fast, it does not always return a closed tour. Also, it only works on boards of certain dimensions. Due to its greedy behaviour, it can get stuck into a local optimum easily. That is similar to the other existing approaches. Our goal in this thesis is to come up with a new strategy based on reinforcement learning. Ideally, it should be able to find a closed tour on chessboards of any size. We will consider several approaches: value-based methods, policy optimization and actor-critic methods. Compared to previous work, our approach is non-deterministic and sees the problem as a single-player game with a tradeoff between exploration and exploitation. We will evaluate the effectiveness and efficiency of the existing methods and new heuristics.
Neuen Vortrag erstellen

Hinweise