My household has at all times beloved phrase puzzles. A clan of journalists, librarians, academics, and editors, it’s no shock that we ended up crowded round a recreation of Boggle or Scrabble at each vacation.
My mom and I nonetheless play Scrabble each time I go to, and we prefer to ship phrase video games forwards and backwards to one another after we’re aside. We’re, frankly, fed up by Wordle. Although it’s removed from new, we lately turned enamored with The New York Occasions’ Letter-Boxed, a refreshingly completely different and pretty difficult puzzle.
First, the guidelines of the puzzle. Letter Boxed presents 4 units of three letters every. Beginning anyplace, the participant should kind legitimate 3+ letter phrases, shifting to a special aspect with every letter.
As soon as a phrase is accomplished, the subsequent phrase should start with the final letter of the prior phrase. The puzzle is solved when all letters have been used no less than as soon as.
We discovered ourselves speculating on the precise problem: are options uncommon, or are they simply exhausting to seek out? The issue bar is normally set at 4- or 5-word options. How spectacular is it to discover a 3-word answer? 2 phrases? Because the lone software program engineer of the household, I set to discovering some solutions as a enjoyable weekend undertaking.
I by no means supposed to show it into an article, however the outcomes have been fascinating, and the answer showcases a less-known knowledge construction that I’ve a private fondness for, so right here we’re. With out additional ado, I’d prefer to stroll you thru the design of my solver.
Primarily based on the foundations of the puzzle, I got here up with a comparatively quick record of what’s wanted to unravel the issue:
For the official dictionary, I googled round and located just a few .txt information on GitHub purporting to be lists of all legitimate English phrases. After some testing, I concluded that Letter-Boxed is utilizing one thing just like the Scrabble dictionary, and that this is a good copy of it.*
*It’s price noting that this file solely incorporates phrases as much as 8 letters, as Scrabble requires uncommon board states for phrases longer than 8. A superior dictionary may doubtlessly expose extra few-word options, even 1-word options (!)
As for looking for options, this appears like a traditional use case for good ol’ recursive backtracking. Ideally, this idea is no less than considerably acquainted. If not, don’t fear, this must be a very good instance to study from. The standard issues apply: to be able to clear up shortly, we’ll want to cut back the dimensions and dimensionality of the issue house as a lot as we are able to.
Now the fascinating half: Quick checking of phrase validity. To easily verify “is that this string a sound phrase?” is trivial, since we have already got an inventory of legitimate phrases. Simply put them in a set or another O(1)
lookup construction and also you’re good to go. There are additionally some fascinating fuzzy options like Bloom Filters that may be extraordinarily quick.
However solely having a “phrase or not?” verify leaves us in a tough state of affairs. We may hold becoming a member of strings of letters and checking if the result’s legitimate, however even when that verify could be very quick, how would we all know when to cease? Exploring letters blindly will get fairly furry, fairly shortly. Exhausting simply all 8-letter combos in a 12-letter puzzle would take 12⁸ guesses (~430 million), and we may have to discover a phrase so long as 12 letters. Even when we wrangled that quantity down just a few orders of magnitude, we’d nonetheless find yourself going via all of the guesses for “clearly fallacious” begins, like ‘qzd-’.
So what we’d like is a knowledge construction that goes past “Is that this string of letters a sound phrase?”, and might additional reply “Are there any legitimate phrases that begin with this string?”
That is what a trie, a.ok.a. a prefix tree, is for. A trie permits us to go looking via a set of legitimate phrases in a pure and environment friendly method. At its core, recursive backtracking thrives in conditions the place the query “is it a waste of time to discover farther from right here?” is well requested and answered, so this can be a nice match.
Extra formally, a trie is a k-ary search tree the place mother or father nodes recursively symbolize the prefixes of their kids, and nodes have a boolean flag marking them as legitimate/invalid, with no less than all leaf nodes being legitimate. They can be utilized for something with an applicable notion of ‘prefix’ (numbers, strings, lists, and many others.), however we’ll be sticking to strings right here.