janc
|
|
response 299 of 1578:
|
Jan 8 06:37 UTC 2003 |
I've posted notes on how I do analysis a couple times, but it's improved
a bit since the last time. At this point I have it worked out so it
would be pretty plausible to program (interesting too, since you'd
likely want to do some recursion), but I don't really intend to. I do
it by scribbling over a couple square inches of scrap paper, but I'll be
more formal here.
We keep three main data structures:
NEGATIVES: For each of the 5 positions, a list of letters that we
know cannot be there.
POSSIBLES: For each of the 5 positions, a list of letters that have
been guessed for that position, not including those in NEGATIVES.
PATTERNS: A list of five-letter patterns that the word must fit one
of. These can have ? marks in them to indicate positions that
can be filed by any letter not in NEGATIVES or POSSIBLES for that
position.
Suppose we have this data so far:
crash 0
felts 0
swear 0
fleat 0
grain 1
eerie 2
grasp 1
First step is to handle all the zeros buy loading them into NEGATIVES.
NEGATIVES[1] cfs
[2] rewl
[3] ale
[4] sta
[5] hsrt
Now we pick one of the other words to work on. The order doesn't effect
the final result, but some orders are easier to do than others. We
won't worry about it and just do them in the order given, starting with
"grain 1".
First, we knock out letters from "grain" that appear in NEGATIVES. This
leaves "g##in 1", that is one (and only one) of "g" "i" or "n" must
appear in the word. These letters get added to POSSIBLES. So the word
must be "g????" or "???i?" or "????n". So we add these three
posibilities to PATTERNS. So we now have:
POSSIBLES[1] g
[2]
[3]
[4] i
[5] n
PATTERNS
g????
???i?
????n
Next word is "eerie 2". Again, we knock out the negatives, giving
"e#rie 2". We need to merge this into our list of patterns, one pattern
at a time. Let's start with old pattern "???i?". The "i" is already
there, so we need one other letter from "e#rie" to ensure that there are
two matches with it. It has to be one of "e??i?" or "??ri?" or "???ie",
so those three will be in the new pattern list.
Next we do old pattern "g????". We need to find all ways that that can
match two letters of "e#rie". Well, obviously the initial "e" can't be
right in this case, because we already know the initial letter is "g".
So you can have two of "rie". But remember that in "g????" the ? can
not be any letter in either NEGATIVE or POSSIBLES. In particular the
fourth position cannot be an "i", because "i" is listed among the
POSSIBLES for that column. If we allowed an "i" here, "grain" would
have been a 2, not a 1, so we know it can't be an "i". Thus, the only
way to merge "e#rie 2" into "g????" is "g?r?e".
Similarly the only way to merge "e#rie 2" into "????n" is "e?r?n".
If you wanted to do this more mechanically, you could draw a grid, like
this, with all possible pairs from "e#rie" across the top, and the old
patterns down the side. Compare each pair to get the new PATTERN list.
e?r?? e??i? e???e ??ri? ??r?e ???ie
+-----------------------------------------
g???? | - - - - g?r?e -
???i? | - e??i? - ??ri? - ???ie
????n | e?r?n - - - - -
It's obvious why you can't combine "e?r??" with "g????". e!=g. You
can't combine "e??i?" with "g????" because in the 4th position or
"g????" the ? cannot be an "i". You can't combine "e?r??" with "???i?"
for almost the same reason.
Either way you do the analysis, the new result is
NEGATIVES (unchanged)
POSSIBLES[1] ge
[2]
[3] r
[4] i
[5] ne
PATTERNS
g?r?e
e??i?
??ri?
???ie
e?r?n
Next word is "grasp 1". Knocking out the negatives gives "g###p 1". So
either the first letter must be a 'g' or the last must be a 'p'. We can
merge that into the list of patterns easily enough. Note that the
pattern "e?r?n" is incompatible with this, and so is dropped. Note also
that ? in column 1 cannot be turned into 'g' because 'g' is on the
POSSIBLE list for column 1.
NEGATIVES[1] cfs
[2] rewl
[3] ale
[4] sta
[5] hsrt
POSSIBLES[1] ge
[2]
[3] r
[4] i
[5] nep
PATTERNS
g?r?e
e??ip
??rip
At this point, we are in pretty good shape to guess. "gorse" doesn't
work (the 's' is a negative). The second case could only be 'equip'.
The word 'atrip' is possible for the last. 'adrip' seems like more of a
word, but m-w.com does not agree. So 'equip' is a pretty solid guess at
this point.
This process wouldn't be too hard to implement. The merge step is the
hardest, and that'd probably work cutely with a recursive algorithm. Of
course the run time is exponential in the number of letters. I'm
willing to bet that the problem is actually NP-complete (it just stinks
of satisfiability), so there is little hope of doing better. But as
long is N is just 5, that needn't concern us.
Of course, if you have a good word list, a much simpler approach is just
to read through the word list, discarding any words that don't fit all
the results to date. But that's no fun.
|