Day 9: Disk Fragmenter

Megathread guidelines

  • Keep top level comments as only solutions, if you want to say something other than a solution put it in a new post. (replies to comments can be whatever)
  • You can send code in code blocks by using three backticks, the code, and then three backticks or use something such as https://topaz.github.io/paste/ if you prefer sending it through a URL

FAQ

4 points
*

Haskell

This was fun, I optimized away quite a bit, as a result it now runs in 0.04s for both parts together on my 2016 laptop.

In part 1 I just run through the array with a start- and an end-index whilst summing up the checksum the entire time.
In part 2 I build up Binary Trees of Free Space which allow me to efficiently search for and insert free spaces when I start traversing the disk from the back. Marking the moved files as free is omitted because the checksum is calculated for every file that is moved or not moved directly.

Code
import Control.Monad
import Data.Bifunctor

import Control.Arrow hiding (first, second)

import Data.Map (Map)
import Data.Set (Set)
import Data.Array.Unboxed (UArray)

import qualified Data.Map as Map
import qualified Data.Set as Set
import qualified Data.Ord as Ord
import qualified Data.List as List
import qualified Data.Char as Char
import qualified Data.Maybe as Maybe
import qualified Data.Array.Unboxed as UArray

toNumber = flip (-) (Char.ord '0') <<< Char.ord 

type FileID = Int
type FileLength = Int
type DiskPosition = Int
type File = (FileID, (DiskPosition, FileLength))
type EmptyMap = Map FileLength (Set DiskPosition)

readDisk :: DiskPosition -> [(Bool, FileLength)] -> [(Bool, (DiskPosition, FileLength))]
readDisk _ [] = []
readDisk o ((True, l):fs)  = (True, (o, l))  : readDisk (o+l) fs
readDisk o ((False, l):fs) = (False, (o, l)) : readDisk (o+l) fs

parse2 :: String -> ([File], EmptyMap)
parse2 s = takeWhile (/= '\n')
        >>> map toNumber
        >>> zip (cycle [True, False]) -- True is File, False is empty
        >>> readDisk 0
        >>> List.partition fst
        >>> join bimap (map snd)
        >>> first (zip [0..])
        >>> first List.reverse
        >>> second (filter (snd >>> (/= 0)))
        >>> second (List.sortOn snd)
        >>> second (List.groupBy (curry $ (snd *** snd) >>> uncurry (==)))
        >>> second (List.map (snd . head &&& map fst))
        >>> second (List.map (second Set.fromDistinctAscList))
        >>> second Map.fromDistinctAscList
        $ s

maybeMinimumBy :: (a -> a -> Ordering) -> [a] -> Maybe a
maybeMinimumBy _ [] = Nothing
maybeMinimumBy f as = Just $ List.minimumBy f as

fileChecksum fid fpos flen = fid * (fpos * flen + ((flen-1) * (flen-1) + (flen-1)) `div` 2)

type Checksum = Int
moveFilesAccumulate :: (Checksum, EmptyMap) -> File -> (Checksum, EmptyMap)
moveFilesAccumulate (check, spaces) (fid, (fpos, flen)) = do
        let bestFit = Map.map (Set.minView)
                >>> Map.toList
                >>> List.filter (fst >>> (>= flen))
                >>> List.filter (snd >>> Maybe.isJust)
                >>> List.map (second Maybe.fromJust) -- [(FileLength, (DiskPosition, Set DiskPosition))]
                >>> List.filter (snd >>> fst >>> (< fpos))
                >>> maybeMinimumBy (\ (_, (p, _)) (_, (p', _)) -> Ord.compare p p')
                $ spaces

        case bestFit of
                Nothing -> (check + fileChecksum fid fpos flen, spaces)
                Just (spaceLength, (spacePosition, remainingSet)) -> do
                        

                        -- remove the old empty entry by replacing the set
                        let updatedMap  = Map.update (const $! Just remainingSet) spaceLength spaces

                        -- add the remaining space, if any
                        let remainingSpace = spaceLength - flen
                        let remainingSpacePosition = spacePosition + flen
                        let updatedMap' = if remainingSpace == 0 then updatedMap else Map.insertWith (Set.union) remainingSpace (Set.singleton remainingSpacePosition) updatedMap

                        (check + fileChecksum fid spacePosition flen, updatedMap')

parse1 :: String -> UArray Int Int
parse1 s = UArray.listArray (0, sum lengthsOnly - 1) blocks
        where
                lengthsOnly = filter (/= '\n')
                        >>> map toNumber
                        $ s :: [Int]
                blocks = zip [0..]
                        >>> List.concatMap (\ (index, n) -> if index `mod` 2 == 0 then replicate n (index `div` 2) else replicate n (-1))
                        $ lengthsOnly :: [Int]

moveBlocksAccumulate :: Int -> Int -> UArray Int Int -> Int
moveBlocksAccumulate start stop array
        | start      == stop   = if startBlock == -1 then 0 else start * startBlock
        | start      >  stop   = 0
        | stopBlock  == -1     = moveBlocksAccumulate start (stop - 1) array
        | startBlock == -1     = movedChecksum + moveBlocksAccumulate (start + 1) (stop - 1) array
        | startBlock /= -1     = startChecksum + moveBlocksAccumulate (start + 1) stop array
        where
                startBlock    = array UArray.! start
                stopBlock     = array UArray.! stop
                movedChecksum = stopBlock * start
                startChecksum = startBlock * start

part1 a = moveBlocksAccumulate 0 arrayLength a
        where
                (_, arrayLength) = UArray.bounds a
part2 (files, spaces) = foldl moveFilesAccumulate (0, spaces)
        >>> fst
        $ files

main = getContents
        >>= print
        . (part1 . parse1 &&& part2 . parse2)
permalink
report
reply
4 points
*

Uiua

Just a port of my Dart solution from earlier really, and it shows, as it takes about 30 seconds on the live data.

(edit: I just noticed the little alien in the code (⋅⋅∘|⋅∘|∘) which is literally flipping the stack (╯°□°)╯︵ ┻━┻!)

How to read this

Try it live!

Data   ← "2333133121414131402"
FS     ← ↙⌊÷2⧻.▽≡⋕:♭⍉⊟⊃(⇡|↯:¯1)⧻.Data  # Build up a map of the FS.
MoveB  ← ⍜(⊡|⋅)⊃(⋅⋅∘|⋅∘|∘) ⊡¯1.:⊢⊚⌕¯1. # Find a space, move block into it.
MoveBs ← ⍢(⍢(↘¯1|=¯1⊣)↘¯1MoveB|>0⧻⊚⌕¯1)

TryMove ← ⨬(◌|∧⍜⊏⇌⍉)/×/>.
MoveFile ← (
  ⊃(⊚⌕↯:¯1⧻|∘)⊚⌕⊙.⊙.         # get posns from, start posn to.
  ⨬(◌◌|TryMove ⊟+⊙◌°⊏,⊢)>0⧻. # check posn to is good, swap.
)
Check ← /+/×⊟⇡⧻.↥0
&p Check MoveBs FS
&p Check ∧MoveFile⇌+1⇡/↥.FS

(edit: improved. Part1 is instant, part2 is about 17sec, but the alien has left)

Data   ← "2333133121414131402"
FS     ← ▽≡⋕:↙⧻:♭⍉⊟⊃(⇡|↯:¯1)⧻..Data # Build up a map of the FS.
Ixs    ← ⊃(⊚¬|⇌⊚)≥0                 # Get indices of space, and of blocks reversed.
SwapBs ← ▽⊸≡/>⍉⊟∩↙⟜:↧∩⧻,,           # Join them where space < block.

Files ← ⇌≡(□⊚)⊞=⇡+1/↥.
Move  ← ∧(⍜⊏⇌)⍉⊟+⇡⧻,⊢ # (tos, froms, fs)
MoveFile ← (
  ⊚⌕⊙,↯:¯1⧻.                # List of possible starts
  ⨬(◌◌|⨬(◌◌|Move)>∩⊢,,)>0⧻. # Only valid, leftwards starts 
)
Check ← /+/×⊟⇡⧻.↥0
&p Check ∧⍜⊏⇌SwapBs⊸Ixs FS
&p Check ∧◇MoveFile Files .FS
permalink
report
reply
3 points
*

PYTHON

Execution Time: Part1 = 0.02 seconds. Part2 = ~2.1 seconds. total = ~2.1 seconds

Aiming for simplicity over speed. This is pretty fast for not employing simple tricks like trees and all that.

code

because of text limit and this code being slow, I put it in a topaz paste: [ link ]

Edit:

New version that is using a dictionary to keep track of the next empty slot that fits the current index.

Execution Time: Part1 = 0.02 seconds. Part2 = ~0.08 seconds. total = ~0.08 seconds 80 ms

code

you can also find this code in the Topaz link: [ link ]

Edit: final revision. I just realized that the calculating for “last_consecutive_full_partition” was not necessary and very slow. if I know all the next available slots, and can end early once my current index dips below all next available slots then the last_consecutive_full_partition will never be reached. This drops the time now to less than ~0.1 seconds

Probably Final Edit: I found someone’s O(n) code for OCaml. I tried to convert it to be faith fully in pure python. seems to work really really fast. 30-50 ms time for most inputs. seems to scale linearly too

FastCode
def int_of_char(x):
    return ord(x) - ord('0')

# Represent content as tuples:
# ('Empty', size) or ('File', id, size)
def parse(line):
    arr = []
    for i in range(len(line)):
        c = int_of_char(line[i])
        if i % 2 == 0:
            arr.append(('File', i // 2, c))
        else:
            arr.append(('Empty', c))
    return arr

def int_sum(low, high):
    return (high - low + 1) * (high + low) // 2

def size(elem):
    t = elem[0]
    if t == 'Empty':
        return elem[1]
    else:
        return elem[2]

def part1(array):
    total = 0
    left = 0
    pos = 0
    right = len(array) - 1

    while left < right:
        if array[left][0] == 'File':
            # File
            _, fid, fsize = array[left]
            total += fid * int_sum(pos, pos + fsize - 1)
            pos += fsize
            left += 1
        else:
            # Empty
            _, esize = array[left]
            if array[right][0] == 'Empty':
                right -= 1
            else:
                # Right is File
                _, fid, fsize = array[right]
                if esize >= fsize:
                    array[left] = ('Empty', esize - fsize)
                    total += fid * int_sum(pos, pos + fsize - 1)
                    pos += fsize
                    right -= 1
                else:
                    array[right] = ('File', fid, fsize - esize)
                    total += fid * int_sum(pos, pos + esize - 1)
                    pos += esize
                    left += 1

    # If one element remains (left == right)
    if left == right and left < len(array):
        if array[left][0] == 'File':
            _, fid, fsize = array[left]
            total += fid * int_sum(pos, pos + fsize - 1)

    return total

def positions(arr):
    total = 0
    res = []
    for e in arr:
        res.append(total)
        total += size(e)
    return res

def array_fold_right_i(f, arr, acc):
    pos = len(arr) - 1
    for elt in reversed(arr):
        acc = f(elt, pos, acc)
        pos -= 1
    return acc

def part2(array):
    def find_empty(size_needed, max_pos, pos):
        while pos <= max_pos:
            if array[pos][0] == 'File':
                raise Exception("Unexpected: only empty at odd positions")
            # Empty
            _, esize = array[pos]
            if esize >= size_needed:
                array[pos] = ('Empty', esize - size_needed)
                return pos
            pos += 2
        return None

    emptys = [1 if i < 10 else None for i in range(10)]
    pos_arr = positions(array)

    def fold_fun(elt, i, total):
        if elt[0] == 'Empty':
            return total
        # File
        _, fid, fsize = elt
        init_pos = emptys[fsize]
        if init_pos is None:
            new_pos = pos_arr[i]
        else:
            opt = find_empty(fsize, i, init_pos)
            if opt is None:
                new_pos = pos_arr[i]
            else:
                new_pos = pos_arr[opt]
                pos_arr[opt] += fsize
                emptys[fsize] = opt
        return total + fid * int_sum(new_pos, new_pos + fsize - 1)

    return array_fold_right_i(fold_fun, array, 0)

def main():
    with open('largest_test', 'r') as f:
        line = f.read().replace('\r', '').replace('\n', '')
    arr = parse(line)
    arr_copy = arr[:]
    p1 = part1(arr_copy)
    print("Part 1 :", p1)
    p2 = part2(arr)
    print("Part 2 :", p2)

if __name__ == "__main__":
    main()

permalink
report
reply
2 points

So cool, I was very hyped when I managed to squeeze out the last bit of performance, hope you are too. Especially surprised you managed it with python, even without the simple tricks like trees ;)

I wanted to try it myself, can confirm it runs in under 0.1s in performance mode on my laptop, I am amazed though I must admin I don’t understand your newest revision. 🙈

permalink
report
parent
reply
3 points
*

Just to let you know, I posted the fastest python version I could come up with. Which took heavy inspiration from [ link to github ]

supposedly O(n) linear time, and does seem to work really fast.

permalink
report
parent
reply
2 points
*

Thanks! your Haskell solution is extremely fast and I don’t understand your solution, too. 🙈 lol

My latest revision just keeps a dict with lists of known empty slots with the length being the dict key, including partially filled slots. I iteratively find the slot that has the lowest index number and make sure the lists are properly ordered from lowest to highest index number.

looking at the challenge example/description, it shows a first pass only type of “fragmenting”. we can be confident that if something did not fit, it can just stay in the same spot even if another slot frees up enough space for it to fit. so just checking if current index is lower than the lowest index number of any of the slot lengths would just be enough to stop early. That is why I got rid of last_consecutive_full_partition because it was slowing it down by up to 2 seconds.

in example, even if 5555, 6666, or 8888 can fit in the new spot created by moving 44, they are staying put. Thus a first pass only sort from back to front.

00...111...2...333.44.5555.6666.777.888899
0099.111...2...333.44.5555.6666.777.8888..
0099.1117772...333.44.5555.6666.....8888..
0099.111777244.333....5555.6666.....8888..
00992111777.44.333....5555.6666.....8888..
permalink
report
parent
reply
2 points

Thank you for the detailed explanation!, it made me realize that our solutions are very similar. Instead of keeping a Dict[Int, List[Int]] where the value list is ordered I have a Dict[Int, Tree[Int]] which allows for easy (and fast!) lookup due to the nature of trees. (Also lists in haskell are horrible to mutate)

I also apply the your technique of only processing each file once, instead of calculating the checksum afterwards on the entire list of file blocks I calculate it all the time whenever I process a file. Using some maths I managed to reduce the sum to a constant expression.

permalink
report
parent
reply
2 points
*

I only now found your edit after I had finished my previous comment. I think splitting into two lists may be good: one List of Files and one of Empty Blocks, I think this may not work with your checksumming so maybe not.

permalink
report
parent
reply
3 points

Was really blanking on how to do this one nicely, so a bunch of stacked loops it is…
Also ended up writing two separate solutions for the first and second part, since I couldn’t get acceptable performance otherwise. Still takes half a second on my machine, mainly on the second part.

This is technically the second implementation, the first one took minutes to calculate, so I wasn’t really okay with stamping it as my solution-of-choice.

Can definitely still be improved, but I’ve been poking and prodding at this code for hours on end now, so it’s long past time to let it sit for a while and see if I get any better ideas later.

C#
int[] layout = new int[0];
public void Input(IEnumerable<string> lines)
{
  layout = string.Join("", lines).ToCharArray().Select(c => int.Parse(c.ToString())).ToArray();
}

public void Part1()
{
  ushort?[] blocks = BuildBlockmap().ToArray();

  var it = 0;
  for (var i = blocks.Length - 1; i > it; i--)
  {
    if (blocks[i] == null)
      continue;

    while (it < blocks.Length && blocks[it] != null)
      ++it;

    if (it >= blocks.Length)
      break;

    (blocks[it], blocks[i]) = (blocks[i], null);
  }

  long checksum = 0;
  foreach (var part in blocks.OfType<ushort>().Select((b, i) => i * b))
    checksum += part;

  Console.WriteLine($"Checksum: {checksum}");
}

public void Part2()
{
  var sparse = BuildSparsemap().ToList();

  for (var i = sparse.Count - 1; i >= 0; i--)
  {
    if (sparse[i].Item1 == null)
      continue;

    for (var j = 0; j < i; ++j)
    {
      if (sparse[j].Item1 != null)
        continue;

      if (sparse[i].Item2 > sparse[j].Item2)
        continue;

      var size = sparse[j].Item2;
      size -= sparse[i].Item2;

      (sparse[j], sparse[i]) = (sparse[i], (null, sparse[i].Item2));

      if (i + 1 < sparse.Count && sparse[i + 1].Item1 == null)
      {
        sparse[i] = (null, (ushort)(sparse[i].Item2 + sparse[i + 1].Item2));
        sparse.RemoveAt(i + 1);
      }

      if (sparse[i - 1].Item1 == null)
      {
        sparse[i - 1] = (null, (ushort)(sparse[i - 1].Item2 + sparse[i].Item2));
        sparse.RemoveAt(i);
      }

      if (size > 0)
        sparse.Insert(j + 1, (null, size));

      j = i + 1;
    }
  }

  int ind = 0;
  long checksum = 0;
  foreach (var (val, cnt) in sparse)
    for (var i = 0; i < cnt; ++i)
    {
      checksum += (val ?? 0) * ind;
      ++ind;
    }

  Console.WriteLine($"Checksum: {checksum}");
}

IEnumerable<ushort?> BuildBlockmap()
{
  ushort blockit = 0;
  bool block = true;
  foreach (var value in layout)
  {
    for (int i = 0; i < value; ++i)
      yield return block ? blockit : null;
    if (block)
      blockit++;
    block = !block;
  }
}

IEnumerable<(ushort?, ushort)> BuildSparsemap()
{
  ushort blockit = 0;
  bool block = true;
  foreach (var value in layout)
  {
    if (block)
      yield return (blockit++, (ushort)value);
    else
      yield return (null, (ushort)value);
    block = !block;
  }
}
permalink
report
reply
3 points

Julia

Oh today was a struggle. First I did not get what exactly the task wanted me to do and then in part 2 I tried a few different ideas which all failed because I changed the disk while I was indexing into it. Finally now I reworked part 2 not moving the blocks at all, just using indexes and it works.

I feel that there is definitely something to learn here and that’s what I like about AoC so far. This is my first AoC but I hope that I won’t have to put this much thought into the rest, since I should definitely use my time differently.

Code
function readInput(inputFile::String)
	f = open(inputFile,"r"); diskMap::String = readline(f); close(f)
	disk::Vector{String} = []
	id::Int = 0
	for (i,c) in enumerate(diskMap)
		if i%2 != 0 #used space
			for j=1 : parse(Int,c)
				push!(disk,string(id))
			end
			id += 1
		else #free space
			for j=1 : parse(Int,c)
				push!(disk,".")
			end
		end
	end
	return disk
end

function getDiscBlocks(disk::Vector{String})::Vector{Vector{Int}}
	diskBlocks::Vector{Vector{Int}} = []
	currBlock::Int = parse(Int,disk[1]) #-1 for free space
	blockLength::Int = 0; blockStartIndex::Int = 0
	for (i,b) in enumerate(map(x->(x=="." ? -1 : parse(Int,x)),disk))
		if b == currBlock
			blockLength += 1
		else #b!=currBlock
			push!(diskBlocks,[currBlock,blockLength,blockStartIndex,i-2])
			currBlock = b
			blockLength = 1
			blockStartIndex = i-1 #start of next block
		end
	end
	push!(diskBlocks,[currBlock,blockLength,blockStartIndex,length(disk)-1])
	return diskBlocks
end

function compressDisk(disk::Vector{String})::Vector{Int} #part 1
	compressedDisk::Vector{Int} = []
	startPtr::Int=1; endPtr::Int=length(disk)
	while endPtr >= startPtr
		while endPtr>startPtr && disk[endPtr]=="."
			endPtr -= 1
		end
		while startPtr<endPtr && disk[startPtr]!="."
			push!(compressedDisk,parse(Int,disk[startPtr])) about AoC
			startPtr += 1
		end
		push!(compressedDisk,parse(Int,disk[endPtr]))
		startPtr+=1;endPtr-=1
	end
	return compressedDisk
end

function compressBlocks(diskBlocks::Vector{Vector{Int}})
	for i=length(diskBlocks) : -1 : 1 #go through all blocks, starting from end
		diskBlocks[i][1] == -1 ? continue : nothing
		for j=1 : i-1 #look for large enough empty space
			diskBlocks[j][1]!=-1 || diskBlocks[j][2]<diskBlocks[i][2] ? continue : nothing #skip occupied blocks and empty blocks that are too short
			diskBlocks[i][3] = diskBlocks[j][3] #set start index
			diskBlocks[i][4] = diskBlocks[j][3]+diskBlocks[i][2]-1 #set end index
			diskBlocks[j][3] += diskBlocks[i][2] #move start of empty block
			diskBlocks[j][2] -= diskBlocks[i][2] #adjust length of empty block
			break
		end
	end
	return diskBlocks
end

function calcChecksum(compressedDisk::Vector{Int})::Int
	checksum::Int = 0
	for (i,n) in enumerate(compressedDisk)
		checksum += n*(i-1)
	end
	return checksum
end

function calcChecksumBlocks(diskBlocks::Vector{Vector{Int}})::Int
	checksum::Int = 0
	for b in diskBlocks
		b[1]==-1 ? continue : nothing
		for i=b[3] : b[4]
			checksum += b[1]*i
		end
	end
	return checksum
end

disk::Vector{String} = readInput("input/day09Input")
@info "Part 1"
println("checksum: $(calcChecksum(compressDisk(disk)))")
@info "Part 2"
println("checksum: $(calcChecksumBlocks(compressBlocks(getDiscBlocks(disk)))")
permalink
report
reply

Advent Of Code

!advent_of_code@programming.dev

Create post

An unofficial home for the advent of code community on programming.dev!

Advent of Code is an annual Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like.

AoC 2024

Solution Threads

M T W T F S S
1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25

Rules/Guidelines

  • Follow the programming.dev instance rules
  • Keep all content related to advent of code in some way
  • If what youre posting relates to a day, put in brackets the year and then day number in front of the post title (e.g. [2024 Day 10])
  • When an event is running, keep solutions in the solution megathread to avoid the community getting spammed with posts

Relevant Communities

Relevant Links

Credits

Icon base by Lorc under CC BY 3.0 with modifications to add a gradient

console.log('Hello World')

Community stats

  • 494

    Monthly active users

  • 107

    Posts

  • 1.1K

    Comments