Day 16: Reindeer Maze
Megathread guidelines
- Keep top-level comments to solutions only; if you want to say something other than a solution, put it in a new post. (Replies to comments can be whatever.)
- You can send code in code blocks by using three backticks, the code, and then three backticks, or use something such as https://topaz.github.io/paste/ if you prefer sending it through a URL
FAQ
- What is this?: Here is a post with a large amount of details: https://programming.dev/post/6637268
- Where do I participate?: https://adventofcode.com/
- Is there a leaderboard for the community?: We have a programming.dev leaderboard with the info on how to join in this post: https://programming.dev/post/6631465
Python
Part 1: Run Dijkstra's algorithm to find the shortest path.
I chose to represent nodes using the location (i, j) as well as the direction dir faced by the reindeer.
Initially I tried creating the complete adjacency graph, but that led to hitting the maximum recursion depth, so I ended up populating the graph only for the nodes I was currently exploring.
Part 2: Track paths while performing Dijkstra's algorithm.
First, I modified the algorithm to look through neighbors with equal cost along with the ones with lower cost, so that it would go through all shortest paths.
Then, I kept track of the list of previous nodes for every node explored.
Finally, I used those lists to run through the paths backwards, taking note of all unique locations.
Code:
import os

# paths
here = os.path.dirname(os.path.abspath(__file__))
filepath = os.path.join(here, "input.txt")

# read input
with open(filepath, mode="r", encoding="utf8") as f:
    data = f.read()

from collections import defaultdict
from dataclasses import dataclass
import heapq as hq
import math

# up, right, down, left
DIRECTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]

# represent a node using its location and the direction
@dataclass(frozen=True)
class Node:
    i: int
    j: int
    dir: int

maze = data.splitlines()
m, n = len(maze), len(maze[0])

# we always start from the bottom-left corner (facing east)
start_node = Node(m - 2, 1, 1)
# we always end in the top-right corner (direction doesn't matter)
end_node = Node(1, n - 2, -1)

# the graph will be updated lazily because it is too much processing
# to completely populate it beforehand
graph = defaultdict(list)
# track nodes whose edges have all been explored
visited = set()
# heap to choose the next node to explore
# need to add id as the middle tuple element so that nodes don't get compared
min_heap = [(0, id(start_node), start_node)]
# min distance from start_node to each node so far
# missing values are treated as math.inf
min_dist = {}
min_dist[start_node] = 0
# keep track of all previous nodes for reconstructing paths
prev_nodes = defaultdict(list)

# utility function for debugging (prints the map)
def print_map(current_node, prev_nodes):
    pns = set((n.i, n.j) for n in prev_nodes)
    for i in range(m):
        for j in range(n):
            if i == current_node.i and j == current_node.j:
                print("X", end="")
            elif (i, j) in pns:
                print("O", end="")
            else:
                print(maze[i][j], end="")
        print()

# run Dijkstra's algorithm
while min_heap:
    cost_to_node, _, node = hq.heappop(min_heap)
    if node in visited:
        continue
    visited.add(node)
    # early exit once we have explored all paths to the finish
    if node.i == end_node.i and node.j == end_node.j:
        # reassign end_node so that we know which direction it was reached from
        end_node = node
        break
    # update the adjacency graph from the current node
    di, dj = DIRECTIONS[node.dir]
    if maze[node.i + di][node.j + dj] != "#":
        moved_node = Node(node.i + di, node.j + dj, node.dir)
        graph[node].append((moved_node, 1))
    for x in range(3):
        rotated_node = Node(node.i, node.j, (node.dir + x + 1) % 4)
        graph[node].append((rotated_node, 1000))
    # explore edges
    for neighbor, cost in graph[node]:
        cost_to_neighbor = cost_to_node + cost
        # the condition below was changed from > to >= because we also want to
        # explore paths with the same cost, not just a lower cost
        if min_dist.get(neighbor, math.inf) >= cost_to_neighbor:
            min_dist[neighbor] = cost_to_neighbor
            prev_nodes[neighbor].append(node)
            # need to add id as the middle tuple element so that nodes don't get compared
            hq.heappush(min_heap, (cost_to_neighbor, id(neighbor), neighbor))

print(f"Part 1: {min_dist[end_node]}")

# PART 2: run through the paths backwards, making note of all coordinates
visited = set([start_node])
path_locs = set([(start_node.i, start_node.j)])  # all unique locations in the paths
stack = [end_node]
while stack:
    node = stack.pop()
    if node in visited:
        continue
    visited.add(node)
    path_locs.add((node.i, node.j))
    for prev_node in prev_nodes[node]:
        stack.append(prev_node)

print(f"Part 2: {len(path_locs)}")
> prev_nodes[neighbor].append(node)
I think you're potentially adding too many neighbours to prev_nodes here. At the time you explore the edge, you're not yet sure whether the path to the edge's target via the current node will be the cheapest.
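One standard way to address this, sketched here on an abstract graph rather than the maze (the function and variable names are illustrative, not from the thread's code): reset the predecessor list when a strictly better path is found, and only append when a new path exactly ties the current best.

```python
import heapq
import math
from collections import defaultdict

def dijkstra_all_shortest(graph, start):
    """Dijkstra that records predecessors only along shortest paths.

    graph maps a node to a list of (neighbor, edge_cost) pairs; nodes must
    be comparable (e.g. strings) so heap ties break cleanly in this sketch.
    """
    min_dist = {start: 0}
    prev_nodes = defaultdict(list)
    heap = [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > min_dist.get(node, math.inf):
            continue  # stale heap entry, a cheaper path was found later
        for neighbor, edge_cost in graph.get(node, []):
            new_cost = cost + edge_cost
            old_cost = min_dist.get(neighbor, math.inf)
            if new_cost < old_cost:
                # strictly better path: discard previously recorded predecessors
                min_dist[neighbor] = new_cost
                prev_nodes[neighbor] = [node]
                heapq.heappush(heap, (new_cost, neighbor))
            elif new_cost == old_cost:
                # equally good path: record an additional predecessor
                prev_nodes[neighbor].append(node)
    return min_dist, prev_nodes
```

On a diamond-shaped graph with two equally cheap routes, both predecessors of the join node are kept while a more expensive direct edge is discarded.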
The only improvement I can think of is to implement a dead-end finder so that the search algorithm can skip all dead ends that do not contain the end tile ("E"). By "block" I mean artificially adding a wall at the entrance of the dead end; this should keep the search from going down it. It would be improbable, but there might be an input with a ridiculously long dead end.
Interesting, how would one write such a finder? I can only think of a backtracking DFS, but that seems like it would outweigh the savings.
ah well, my idea was at a high-level view. Here is a naive approach that should accomplish this; not sure how else I would do it without putting more thought into making it faster:
edit: whoops, sorry, I had broken the regex string and had to check that E and S are not deleted lol
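The regex-based script itself isn't included in this thread, so here is a hedged, non-regex sketch of the same naive idea: repeatedly wall off any open tile, other than S and E, that is surrounded by three or more walls, until nothing changes (the function name and structure are illustrative, not the commenter's actual code):

```python
def fill_dead_ends(maze):
    """Return a copy of the maze with all dead-end corridors walled off.

    A dead end is an open tile ('.') with at most one open neighbour;
    S and E are never filled. Filling repeats until a fixed point, so
    entire dead-end corridors collapse into walls.
    """
    grid = [list(row) for row in maze]
    m, n = len(grid), len(grid[0])
    changed = True
    while changed:
        changed = False
        for i in range(1, m - 1):
            for j in range(1, n - 1):
                if grid[i][j] != ".":
                    continue  # leave walls, S and E untouched
                walls = sum(
                    grid[i + di][j + dj] == "#"
                    for di, dj in ((-1, 0), (1, 0), (0, 1), (0, -1))
                )
                if walls >= 3:  # only one way in and out at most: a dead end
                    grid[i][j] = "#"
                    changed = True
    return ["".join(row) for row in grid]
```

A repeated sweep like this is quadratic in the worst case, which matches the "naive" framing; the filled maps below are what this kind of pass produces on the two examples.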
This is how the first example would look like:
###############
#...#####....E#
#.#.#####.###.#
#.....###...#.#
#.###.#####.#.#
#.###.......#.#
#.#######.###.#
#...........#.#
###.#.#####.#.#
#...#.....#.#.#
#.#.#.###.#.#.#
#.....#...#.#.#
#.###.#.#.#.#.#
#S###.....#...#
###############
This is how the second example would look like:
#################
#...#...#...#..E#
#.#.#.#.#.#.#.#.#
#.#.#.#...#...#.#
#.#.#.#####.#.#.#
#...#.###.....#.#
#.#.#.###.#####.#
#.#...###.#.....#
#.#.#####.#.###.#
#.#.###.....#...#
#.#.###.#####.###
#.#.#...###...###
#.#.#.#####.#####
#.#.#.......#####
#.#.#.###########
#S#...###########
#################
for this challenge, it will only have a noticeable improvement on larger maps, and it is especially effective if there are no loops (i.e. one path), because it would simply remove every path that leads to a dead end.
For smaller maps, there is no improvement, or even worse performance, as there are not enough dead ends for any search algorithm to waste much time on. So, for completeness' sake, you would test various map sizes with various numbers of dead ends and find the map size at which it makes sense to fill in all dead ends with walls. Also, when you know a maze has only one path, this is more optimal than any pathfinding algorithm, provided the map is big enough; on a small map you can find the path quickly enough that filling in dead ends is not needed and you can just pathfind it.
for our input, I think this would not help, as the map should NOT be large enough and this naive approach is too costly. It would probably be better if there were a faster approach than this naive one.
actually, testing this naive approach on the smaller examples, it does have a slight edge over not filling in dead ends. This suggests the regex is what slows down as the map gets larger, so something that can find dead ends faster would be a better choice than the one-line regex we have right now.
I guess the location of both S and E in the input does matter, because the maze could end up with S and E close enough together that most, if not all, dead ends never waste the Dijkstra algorithm's time. However, my input had S and E in opposite corners, so the regex is likely the culprit in why filling in dead ends gets slower on the larger map.
if you look at the profiler output, on the smaller examples the naive approach loses a negligible amount of time and improves your algorithm by a few tenths of a millisecond across part 1 and part 2. On the larger input, however, the naive approach takes a huge hit, losing about 350 ms to 400 ms on filling in dead ends while only saving your algorithm about 90 ms. So while filling in dead ends does improve your algorithm's performance, it just has too much overhead, which means a less naive approach could still yield a significant speedup for the solving algorithm.
took some time out of my day to implement a solution that beats running your solution alone by about 90 ms. This is because the algorithm for filling in all dead ends takes about 9-10 milliseconds and reduces the time your algorithm needs by about 95-105 ms!
a decent improvement for this many lines of code, but it is what it is. Using .index and .rindex on strings is just way too fast. There might be a faster replacement, or one could switch to complete binary bit manipulation for everything, but that is incredibly difficult to think through right now.
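The faster script itself isn't shown in the thread, but as an illustration of why `str.index` beats a regex for simple single-character scans, here is a hedged sketch (the function name is illustrative) that collects the open-tile columns of one maze row with repeated `str.index` calls:

```python
def open_tile_columns(row):
    """Return the column indices of open tiles ('.') in one maze row.

    Uses repeated str.index calls, which run as a tight C-level scan,
    rather than a regex; for single-character searches this is typically
    noticeably faster on long rows.
    """
    cols = []
    start = 0
    while True:
        try:
            i = row.index(".", start)  # next '.' at or after position start
        except ValueError:
            return cols  # no more open tiles in this row
        cols.append(i)
        start = i + 1
```

The same pattern with `str.rindex` scans a row from the right, which is presumably what the comment is alluding to for locating corridor ends quickly.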
but here is the monster script that seemingly does it about 90 ms faster than your current script version, because it eliminates wasted time in your Dijkstra's algorithm by filling all dead ends with minimal overhead. Could there be corner cases I didn't think of? Maybe, but saving time on your algorithm is better than trying to be extra sure to eliminate every dead end, and I am skipping loops because your algorithm handles those better than a flood-fill-type algorithm would. (Remember that the first run of a modified script will be a little slow.)
as of right now, the slowest part of the script is your Dijkstra's algorithm. I could try to implement my own solver that isn't piggybacking off yours, but I think that is more than I care to do right now. I also wasn't going to bother reducing the LOC of the giant match-case; it is fast and serves its purpose well enough.