Wordle game analysis with Python

Wordle is an interesting word game in the style of the old Mastermind game. You can try it out on the NYT puzzle page or elsewhere. Check it out and try a round!

Analysis with a script

You can find some solutions for certain word games like Wordle using the dictionary word list found on most Linux systems:

with open('/usr/share/dict/words') as infile:
    for line in infile:
        # Do what you need to here with the line to see if it is a valid word.
        pass

To verify Wordle words, which have five letters, first strip the trailing whitespace and newline to get just the word. Then check that it contains only letters (by removing non-letter characters and comparing the result with the original) and that it is exactly five letters long:

import re

def validWords():
    # Yield each five-letter, letters-only word from the word list, lowercased.
    with open('/usr/share/dict/words') as infile:
        for line in infile:
            line = line.strip()
            # No special characters and exactly 5 letters:
            if line == re.sub(r'[^a-zA-Z]', '', line) and len(line) == 5:
                # This is valid.
                yield line.lower()

The “yield” works like a return, but it lets the caller keep pulling values from the function, one at a time, by iterating over it with a for loop. The next function does exactly that:

def validWith(partialWord):
    # Any non-letter (such as '_') in partialWord marks an unknown position.
    partialWord = partialWord.lower()
    possibilities = []
    for word in validWords():
        valid = True
        for i in range(5):
            partialChar = re.sub(r'[^a-zA-Z]', '', partialWord[i])
            # Only known letters constrain the word at position i.
            if len(partialChar):
                if word[i] != partialChar:
                    valid = False
        if valid:
            possibilities.append(word)
    return possibilities

Above you can see the function using the generator and collecting every possibility that has the given letters in the given positions. For example, validWith('w_rd_') would give words such as “wards” and “words” as possible solutions. This is pretty handy for searching out possible answers for your partial Wordle game!
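To see how yield and the pattern check behave without depending on the system dictionary, here is a minimal sketch that runs against a small in-memory list. The names five_letter_words and matches_pattern and the sample list are made up for illustration and are not part of the script above. The sketch also assumes the pattern is five characters long.

```python
import re

def five_letter_words(lines):
    """Lazily yield lowercased five-letter, letters-only words from any iterable of lines."""
    for line in lines:
        word = line.strip()
        if len(word) == 5 and re.fullmatch(r'[a-zA-Z]+', word):
            yield word.lower()

def matches_pattern(word, pattern):
    """True if word agrees with every known letter in pattern; non-letters are wildcards."""
    return all(not p.isalpha() or p == w for p, w in zip(pattern.lower(), word))

# Hypothetical sample input, standing in for /usr/share/dict/words.
sample = ["words\n", "Wards\n", "it's\n", "world\n", "wordy\n", "sword\n", "abc12\n"]

# The generator hands over one word at a time as the comprehension asks for it.
candidates = [w for w in five_letter_words(sample) if matches_pattern(w, "w_rd_")]
print(candidates)  # ['words', 'wards', 'wordy']
```

Because five_letter_words takes any iterable of lines, an open file object could be passed in place of the sample list.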

Divide and conquer the words!

If you have played that old grade school guess-the-number game, you may recall the best practice is to land right in the middle of the possibilities. Number between 1 and 100? Guess 50 first. Are you told it is greater than 50? Try 75. Greater than 75? Try 87 or so. In this way the number can be narrowed down quickly.
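The halving strategy can be sketched in a few lines. This is only an illustration of the guess-the-number idea; the function name guess_number and its parameters are made up here.

```python
def guess_number(secret, low=1, high=100):
    """Return the list of guesses made while halving the range until secret is found."""
    guesses = []
    while low <= high:
        guess = (low + high) // 2   # land in the middle of what is left
        guesses.append(guess)
        if guess == secret:
            return guesses
        if guess < secret:
            low = guess + 1         # told "greater": discard the lower half
        else:
            high = guess - 1        # told "less": discard the upper half
    raise ValueError("secret is outside the range")

print(guess_number(100))  # [50, 75, 88, 94, 97, 99, 100]
```

With this rounding, no number between 1 and 100 takes more than seven guesses.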

For every guessed word, one learns whether each letter is in exactly the right place in the answer, or is somewhere else in the answer. This gives information on which words can and cannot be the answer. Ideally, a good first guess splits the space of possible answers in half. Which words “split” the set of all possible words? Since there is a limited number of words and each try must be a real word, we can go through them all and count how each one “splits” the possible answers into those that match at least one of its letters in the same position and those that do not:

words = list(validWords())  # Read the word list once instead of once per word.
wordchoice = ''
off = 9999
for word in words:
    unmatched = 0
    matched = 0
    for possiblematch in words:
        # Does this possible answer share any letter in the same position?
        matches = 0
        for i in range(5):
            if word[i] == possiblematch[i]:
                matches += 1
        if matches > 0:
            matched += 1
        else:
            unmatched += 1
    if off > abs(unmatched - matched):
        # This is the closest to an even split so far.
        off = abs(unmatched - matched)
        wordchoice = word
        print('word %s %s %s' % (wordchoice, unmatched, matched))

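The output will take a while and should find increasingly better words that come closer and closer to splitting the possibilities exactly in half. Each line shows the word, then the unmatched count, then the matched count: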

word afaik 4807 1416
word ansis 3618 2605
word awacs 3521 2702
word agnes 3132 3091
word coors 3130 3093
word doris 3114 3109
word serbs 3112 3111

As you can see, these last words are good ones for splitting the possibilities into those that do and those that do not have a match! If the guess is a total non-match, the answer is in the half of the possibilities that do not match, so check words that use other letters not in the original word.

If the answer is in the half of the possible words that have some match, rearrange the letters and try words with those letters, as is the usual strategy 🙂
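The narrowing-down after each guess can also be sketched in code. The following is a minimal illustration and not part of the script above: feedback is a hypothetical helper that scores a guess the way Wordle is generally understood to (right place, wrong place, absent), and narrow keeps only the candidates that would have produced the same feedback. The handling of repeated letters is an assumption about the rules, and the sample list is made up.

```python
from collections import Counter

def feedback(guess, answer):
    """Score guess against answer: 'G' right place, 'Y' wrong place, '.' absent."""
    result = ['.'] * len(guess)
    # Answer letters that were not matched exactly are available for 'Y' marks.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = 'G'
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g != a and remaining[g] > 0:
            result[i] = 'Y'
            remaining[g] -= 1   # each answer letter can justify only one 'Y'
    return ''.join(result)

def narrow(candidates, guess, observed):
    """Keep the candidates that would have given the observed feedback for guess."""
    return [w for w in candidates if feedback(guess, w) == observed]

# Hypothetical candidate list, standing in for the output of validWords().
sample = ["words", "wards", "sword", "drown", "plant"]

print(feedback("serbs", "words"))        # ..G.G
print(narrow(sample, "serbs", "..G.G"))  # ['words', 'wards']
```

Repeating narrow after every guess shrinks the candidate list until only the answer is left.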
