Wordle Sold: Is It Harder Now?

Picture created by Creator with Pixlr E

The present web sensation Wordle was lately acquired by The New York Times for an quantity within the low seven figures. The primary noticeable change is that Wordle has discovered a brand new dwelling in NYT’s recreation hub.

Nevertheless, individuals are questioning if there are extra delicate modifications below the hood:

  1. Have any phrases been eliminated? ➖
  2. Have any new phrases been added? ➕
  3. Which phrases stay the identical? 🟰
  4. Has the sport turn out to be tougher to play now? 😓

In a completely unrelated method, and nearly on the similar time, Google’s Information Studio launched new data blending enhancements.

Instantly one thing popped into my head: is it attainable to mix the 2 to reply these urgent Wordle questions?

Surprisingly… the reply is sure!

As I defined in my earlier story that mentioned the right way to enhance your profitable odds simply through the use of three particular phrases, Wordle makes use of two completely different lists of curated phrases:

  • Options: the phrases that resolve the each day recreation, the commonest ones. For instance: contemporary, crust, stool…
  • Guesses: phrases that you should utilize to resolve the sport however are by no means the answer itself, lots of them (fairly) obscure. For instance: kauru, sekts, zazen…

Wordle is coded to select a phrase from the answer listing primarily based on at the moment’s date: that’s what makes it attainable for everyone to play the identical phrase every day. Let’s suppose the answer listing seems one thing like this:

word_1, word_2, word_3, word_4, word_5, word_6, word_7, ... word_n

Say Wordle picked word_3 as the first February’s resolution, then word_4 shall be 2nd February’s resolution, word_5 shall be third February’s resolution and so forth and so forth. This truth will turn out to be useful in a while to find out if Wordle’s issue has modified.

Information mixing has been attainable in Information Studio since 2018. Nevertheless, it was restricted to left outer joins. With the most recent enhancements, the next be a part of operators at the moment are accessible:

  • Left outer be a part of: Returns matching rows from the correct desk, plus non-matching rows from the left desk.
  • Proper outer be a part of: Returns matching rows from the left tables, plus non-matching rows from the correct desk.
  • Internal be a part of: Returns solely matching rows from the left and proper tables.
  • Full outer be a part of: Returns all matching rows from the left tables or the correct desk.
  • Cross be a part of: Returns each attainable mixture of rows from the left and proper tables.
Graphical illustration of joins — Picture created by Creator with Pixlr E

If we contemplate the unique Wordle listing of phrases as a blue set, the NYT listing of phrases as a yellow set and we intersect them we get this diagram:

Graphical illustration of a full outer be a part of — Picture created by Creator with Pixlr E

The completely different colours might be interpreted as:

  • blue: phrases faraway from the NYT model of the sport
  • inexperienced: phrases frequent to each variations
  • yellow: phrases added by the NYT model of the sport

There’s a small downside although: in case you examine the picture above, you’ll discover there is no such thing as a be a part of that resembles the blue or yellow form. Happily, there’s a easy workaround: use a full outer be a part of mixed with a filter.

Let’s see how a full outer be a part of would work in a easy instance first. Take into account the next two tables:

Instance tables — Picture created by Creator with Pixlr E

The total outer be a part of of those two tables seems like this:

Instance full outer be a part of — Picture created by Creator with Pixlr E

The completely different colours might be interpreted as:

  • blue: phrases solely present in Desk A
  • inexperienced: phrases frequent to each Desk A and Desk B
  • yellow: phrases solely present in Desk B

Discover how all of the rows in blue have Phrase B equal to NULL. Likewise, all rows in yellow must Phrase A equal to NULL. And that is our cue to create a filter that can exclude rows to compose the blue or yellow form.

Let’s replicate these steps in Information Studio to compose the blue form:

  1. Create a brand new mix referred to as Traditional vs NYT that mixes the Authentic and NYT tables and choose all fields from each tables:
Blended knowledge supply — Picture by Creator

2. Choose the total outer be a part of operator and the be a part of situation on the phrase subject from every desk:

Blended knowledge supply be a part of configuration — Picture by Creator

3. Create a chart (for instance a desk) that makes use of the blended knowledge supply and choose the phrases from Wordle unique:

Chart desk configuration — Picture by Creator

4. Create a filter on the mix created earlier that simply consists of rows the place the phrase from the NYT desk is Null:

Filter configuration — Picture by Creator

5. Lastly, apply the filter to a chart that makes use of the mix:

Filter being utilized to chart — Picture by Creator

Discover that the chart was displaying phrases from Wordle unique and the filter was eradicating rows the place there was no NYT match: phrases that aren’t current in Wordle NYT (eliminated)

Conversely, if I have to recreate the yellow form, I simply have to observe the identical steps besides that in step 4 I have to create a filter that simply consists of rows the place the phrase from the Authentic desk is Null:

Filter configuration — Picture by Creator

Now that we perceive how Wordle works, the completely different sorts of joins that we are able to use, and the best way to configure Information Studio the best way we’d like, the subsequent order of enterprise is extracting the phrase lists from each video games.

For the reason that unique Wordle internet web page not exists and as an alternative redirects to The New York Instances video games hub, I have to resort to a cached model of it. I may even use a cached model of the brand new Wordle in order that the code I put up right here may even be just right for you if NYT decides to vary the sport code.

Each time I want a cached model of an online web page I instantly consider the Internet Archive. They’re a non-profit constructing a digital library of Web websites and different cultural artifacts in digital type. Right now they’ve 25+ years of internet historical past accessible via the Wayback Machine. Let’s put it to good use:

Web Archive Wayback Machine Authentic Wordle snapshots — Picture by Creator

Looking the unique URL (https://powerlanguage.co.uk/wordle/), Wayback Machine reveals that the web page was cached a number of instances, even a number of instances a day.

Enjoyable truth: In accordance with Wayback Machine snapshots, Wordle was moved to The New York Instances video games hub on Feb 10, 2022 someday between 18:56:02 GMT and 20:33:54 GMT.

As standard, I’ll be utilizing Google Cloud Shell as my serverless setting to extract phrases:

Google Cloud Shell, my shell of selection 😀 — Picture by Creator

The script that extracts the phrases from the unique Wordle is so simple as fetching the JavaScript recreation file cached by Wayback Machine on Feb 10, 2022, at 03:43:40 like this:

When the script is run, the output seems like this:

Fetching recreation file...
Extracting phrases...
Variety of phrases extracted from Wordle unique:
12972

Equally, to extract the phrases from the NYT Wordle, I simply have to fetch the JavaScript recreation file cached by Wayback Machine on Feb 10, 2022, at 18:56:02 and reserve it to a special CSV file:

Working the script produces this output:

Fetching recreation file...
Extracting phrases...
Variety of phrases extracted from Wordle NYT:
12947

Aha! The numbers don’t match. Let’s discover solutions!

Information Studio is in a position to hook up with a number of knowledge sources: Google Analytics, Google Adverts, BigQuery, Google Sheet, MySQL, Microsoft SQL Server and plenty of extra.

For this train, I’ll use the File Add connector which takes CSV information like those generated when extracting phrases from Wordle:

Information Studio Connectors — Picture by Creator

So let’s create a few knowledge sources by importing wordle_original.csv and wordle_nyt.csv:

Information Studio knowledge sources — Picture by Creator

Professional Tip: The File Add connector is ready to mix a number of CSV information supplied they’ve the identical schema.

Let’s do that characteristic by creating a brand new knowledge supply that mixes each CSV information:

A number of information uploaded to a single File Add knowledge connector — Picture by Creator

Now that the information sources are in place, let’s construct a fast chat to examine all knowledge has been appropriately uploaded:

Uploaded knowledge — Picture by Creator

Excellent, every little thing seems good. And simply by wanting on the knowledge that was loaded, we are able to see the NYT’s model of the sport has 25 much less phrases: 6 options and 19 guesses.

Now we have every little thing we’d like, let’s begin answering Wordle questions!

Utilizing the total outer be a part of operator plus an NYT phrase is a null filter, the next desk reveals any eliminated phrases:

Eliminated phrases (redacted) — Picture by Creator

The NYT has eliminated 25 obscure, insensitive and offensive phrases from the unique Wordle recreation. I’ve redacted them however you may simply fill within the lacking letters in case you are actually curious…

Once more, utilizing the total outer be a part of operator plus an unique phrase is a null filter, the next desk reveals any added phrases:

Added phrases — Picture by Creator

Seems, NYT has not added any new phrases. A minimum of not within the cached web page I fetched from the Web Archive.

Now let’s check out the phrases that stay the identical in NYT. Have they been modified in any manner? Have any guess phrases been promoted to resolution phrases? Conversely, have any resolution phrases been demoted to guess phrases?

Evaluating the “phrase kind” from each unique and NYT yields these solutions. Phrase’s kind has been not modified both:

Whole phrases and similar kind phrases — Picture by Creator

OK, let’s a recap: to date we all know NYT has eliminated some phrases, has not added any and no varieties have been modified.

So… if NYT has simply eliminated 25 phrases, is there any manner they may make the sport tougher to play?

Really… sure!

Sometimes, Wordle gamers will complain that at the moment’s phrase was laborious. Possibly it’s a uncommon phrase not used typically. Regardless of the purpose, individuals are unanimous: at the moment was tough!

The one manner NYT may probably alter the sport at this level is by reordering the phrases to place probably the most tough first. How can we examine this concept?

Once I extracted the phrases from each video games I took the time so as to add a quantity to every phrase that indicated their sequence within the full listing.

The answer phrases from the unique Wordle (wordle_original.csv) appear to be this:

recreation,sequence,phrase,word_type
wordle_original,1,cigar,resolution
wordle_original,2,rebut,resolution
wordle_original,3,sissy,resolution
wordle_original,4,humph,resolution
wordle_original,5,awake,resolution
...
wordle_original,2311,*****,resolution
wordle_original,2312,*****,resolution
wordle_original,2313,*****,resolution
wordle_original,2314,*****,resolution
wordle_original,2315,*****,resolution
...

Equally, the answer phrases from NYT (wordle_nyt.csv) appear to be this:

recreation,sequence,phrase,word_type
wordle_nyt,1,cigar,resolution
wordle_nyt,2,rebut,resolution
wordle_nyt,3,sissy,resolution
wordle_nyt,4,humph,resolution
wordle_nyt,5,awake,resolution
...
wordle_nyt,2305,*****,resolution
wordle_nyt,2306,*****,resolution
wordle_nyt,2307,*****,resolution
wordle_nyt,2308,*****,resolution
wordle_nyt,2309,*****,resolution

I’ve redacted phrases that haven’t been performed but to forestall spoilers… 🫣

Evaluating the sequence quantity from the unique listing to NYT listing it’s simple to see which phrases, if any, have been reordered. Moreover, calculating the delta (distinction) between the 2 sequence numbers actually provides it away.

More Posts