AoC 2022 D7: No Space Left On Device

Haskell | Problem statement | Source code | Tags: Data structuresPuzzle

← Previous Back to AoC Index Next →

Part 1

The main goal is to construct a dirent tree, represented as:

data Dirent = File Int | Directory (Map Text Dirent)
hs

I also added a show for debugging purposes.

instance Show Dirent where
show (Directory subDir) = Map.foldlWithKey join "" subDir
where
join :: String -> Text -> Dirent -> String
join acc name dirent =
acc
++ "- "
++ T.unpack name
++ ( case dirent of
File _ -> show dirent
Directory _ -> ":\n" ++ unlines (map (" " ++) $ lines $ show dirent)
)
show (File size) = " (" ++ show size ++ ")\n"
hs

The main idea is to process the commands one by one. The state we keep track of is (Dirent, [Text]), representing the full directory tree and the current working directory as a list of path components. Note that the CWD is stored in reverse so that we can push/pop efficiently at the front. logs is the list of commands, each one starting immediately after "$ " and ending before the next "$ ".

(dir, _) = foldl' processCmd (Directory Map.empty, []) logs

processCmd :: TraverseState -> Text -> TraverseState
processCmd st log
| command == "cd" = cd (T.drop 3 log) st
| command == "ls" = ls (tail $ T.lines log) st
| otherwise = error ("Unknown command: " ++ command)
where
command = T.unpack $ T.take 2 log
hs

cd changes the CWD:

cd :: Text -> TraverseState -> TraverseState
cd dir (tree, path)
| dir == T.pack ".." = (tree, tail path)
| dir == T.pack "/" = (tree, [])
| otherwise = (tree, dir : path)
hs

ls updates the tree with the contents of the current directory.

ls logs (tree, path) = (newTree, path)
where
newTree = updateSubtree (reverse path) tree $ Directory $ addEntries $ getDirents path tree
hs

Here are the main steps:

  1. Go down the tree to reach the CWD
  2. Iterate through the ls logs and add entries to the current directory
  3. Traverse back up the tree, updating each parent directory with the modified subtree

First step is to go down:

getDirents :: [Text] -> Dirent -> Map Text Dirent
getDirents [] (Directory root) = root
getDirents _ (File _) = error "Cannot get subdirectories of a file"
getDirents (dirName : parentPath) dir = case Map.lookup dirName (getDirents parentPath dir) of
Just (Directory subDir) -> subDir
_ -> error ("Directory not found: " ++ T.unpack dirName)
hs

Second step is to add entries and create a new directory:

addEntries :: Map Text Dirent -> Map Text Dirent
addEntries dirents = foldr (addEntry . parseLsLog) dirents logs
where
addEntry (FileEntry name size) dirents = Map.insert name (File size) dirents
addEntry (DirEntry name) dirents = case Map.lookup name dirents of
Just _ -> dirents
_ -> Map.insert name (Directory Map.empty) dirents
hs

parseLsLog parses a single line of ls output:

data ParsedEntry = FileEntry Text Int | DirEntry Text deriving (Show)

parseLsLog :: Text -> ParsedEntry
parseLsLog log
| size == T.pack "dir" = DirEntry name
| otherwise = FileEntry name (readT size)
where
[size, name] = T.words log
hs

Third step is to update the subtree back up to the root, each time popping the highest path segment from CWD and updating the corresponding directory:

updateSubtree :: [Text] -> Dirent -> Dirent -> Dirent
updateSubtree [] _ newDir = newDir
updateSubtree _ (File _) _ = error "Cannot update a file"
updateSubtree (topSeg : rest) (Directory root) newDir =
Directory
( Map.insert topSeg (updateSubtree rest subtree newDir) root
)
where
subtree = case Map.lookup topSeg root of
Just dir -> dir
Nothing -> error ("Directory not found: " ++ T.unpack topSeg)
hs

Since we need to compute the sizes of all directories, I collect all of them in a list:

sizes :: Dirent -> [Int]
sizes (File size) = [size]
sizes (Directory subDir) = (subdirsSize + filesSize) : concat allSizes
where
(files, subdirs) = partition isFile $ Map.elems subDir
allSizes = map sizes subdirs
subdirsSize = sumMap head allSizes
filesSize = sumMap (\(File size) -> size) files

isFile :: Dirent -> Bool
isFile (File _) = True
isFile _ = False
hs

Part 2

Since the sizes list is ordered from top down, the root size is the head. To make the final size less than 40000000, we need to delete at least rootSize - 40000000. We find the smallest directory that is at least that big by sorting the sizes and using find:

fromJust $ find (>= total - 40000000) (sort dirSizes)
hs

← Previous Back to AoC Index Next →