Problems using nested loops to find words

chanyman · Mar 23, 2011

Hey guys,

Im trying to use nested loops to find words from letters. For example if I type in 'four' I would like to find 'for' from it. I can get it to find 'ship' from 'mothership' but I cant get it to find 'thor' from 'mothership'. It compares the letters that are all added up together to an array with the words i want to find.

Basically what im doing is looping but the next nexted loop has a offset of 1.
Its a really long trigger if anyone wants an example, I just dont want to post it here because its really long, and im sure it can be optimized but at this point im really just looking for an example.

Thanks

AionCannonzSC2 · Mar 23, 2011

you want a function that will find all the possible words that use the combination of letters in an entered word?

chanyman · Mar 23, 2011

Yeah, Ive worked out how to do it, but for 10 letters it works out to be something like 100,000,000,000 actions, which would take far to long, im going to shift it down to 6 letters.

Dr Super Good · Mar 23, 2011

You will need some form of efficent data system for this. A system which allows for constraints (letters and their quantities) and returns a subset of those. The quantity of data needed though will definatly need to be generated from a script otherwise you could spend years entering words.

chanyman · Mar 23, 2011

I dont quite understand what your saying their supergood. I've been trying to come up with a more efficient answer to it because at the moment anyone without a quad core who plays the game their system will take forever to generate the words.

I have got a working system but the problem is it takes a lot to go through all the possible combinations of a 10 letter word:
10 Letters - 10^10 - 10,000,000,000
9 Letters - 9^9 - 387,420,489
8 Letters - 8^8 - 16,777,216
7 Letters - 7^7 - 823,543

(this doesnt include checking the round up against the array)
Also triggers can only handle about 50,000 actions (depending on trigger) per execution

So if I used my current implementation with 7 letters comparing against the array it takes about 40 blizzard seconds, and by then people want to be playing, but this method does succeed in finding all possible answers.

Dr Super Good · Mar 24, 2011

You need to work out a better algorithm...

Such a problem can probably be solved in under a second via an indexed database.

chanyman · Mar 24, 2011

Ok, I sort of see where you going but could you clarify once more?

Do you mean haveing an array with all the possible words in it? If so thats what I already have.

Dr Super Good · Mar 24, 2011

No, I mean writie a proper algerthim to find the results. One with an efficiency better than O(N^N) even O(N!) would be a start.

chanyman · Mar 24, 2011

Lol My maths is terrible, so your going to have to dumb speak to me. i have no idea what O(N!) is.

EDIT: quick google means factorial,

I tried to do the factorial earlier but it doesnt go through all possible combinations.

Dr Super Good · Mar 24, 2011

YOu are not after going through all possible combinations. You are after going through stuff that could potentially have an answer. For example you could break the word down into a number of letters and store each word along with its number of letters. Thus you then search for results which have <= the entered numbers of each letters.

That would bring it down to O(numberofwords) complexity for any input. By arranging in orderd form you could reduce scope size dramatically to even less as if a word has no "A" you could skip itterating through all words which do have "A" (for laters ofcourse not just "A").

The end result could have a complexity of only a few thousand at most for all words in the dictionary, which is more than reasonable for a search such as that.

chanyman · Mar 24, 2011

Ok thanks for clearing that up, but I dont understand how you would do it, If you have time would you be able to do up a quick example?

Sorry if I'm being dumb, but I'm not sure how you would do the loops.

EDIT: Ok I just thought of something with that last post. this system is used to find all the words that come from a 10 letter word, and 10 is the maximum number of letters a word can be. So for example it would pick 'mothership' it would then go to find all the words in the word 'mothership' comparing each change to an array.

Also their is not that many 10 letter words so it would not end up reducing the number of iterations by that much.

One more thing, to exclude every word without a 'a' it would then require the letters or words to be stored as well in seperate letter form, this would create lots of arrays and ultimately create more iterations because you would need to compare them to the current word.

Dr Super Good · Mar 24, 2011

Are you even taking an algerthims and data structures course? You really need one to create effective systems such as this.

You need to organize the data such that when searching it will have to search through less (can stop itterating when an result has exceeded any possibility of being correct).

Lets say you organize a list of numbers in numerical order and perform a linear search on it. It is clear you can stop itterating through the elements once the elements being checked are greater than what you are after as from then on all elements will be greater.

This would give you a best case performance of O(1) if what you are after is less than or equal to the first element and O(n) if what you are after is greater than or equal to the last element. This is a minor improvement over an unsorted list where you have a best case performance of O(1) if it is the first element and O(n) when it is the last element or an element not in the list. By sorting it you can greatly improve the efficency of not in list elements down from O(n).

However, that does not abuse the fact the data is sorted well. Sorting it allows you to perform a binary search that has slightly more demanding itterations but a best case performance of O(1) if the middle element or worst case performance of O(log2(n)) for edge values or values not in the list. This is increadably fast as the number of elements checked per extra itteration increases exponentially.

A linear search on unsorted data with 1000000 elements would have a worst case order of O(1000000).
A binary search on sorted data with 1000000 elements would have a wost case order of O(20).

Even though itterations of binary search are slower than linear search, linear search needs 50000 times more itterations which more than makes up for it as it is only marganaly slower.

*log2(n) is the log to the base 2 of n which returns x from the equation n = 2^x. An inverse to the exponential opperation.

So where am I going with this? Well I am not trying to say this problem is solvable directly with binary search but I am saying that you need to use algorthims like it to greatly reduce the itterations needed.

So what algorthims should be used? I do not know for sure sadly and this very problem has been at the back of my mind for years since Highschool but I never have come around to try and impliment it. I would definatly say you need to use ones with O(~1) or O(log2(n)) order where ever possible if you expect it to be real time.

Examples.
O(~1) - hashtable with large enough supporting array
O(log2(n)) - binary search

chanyman · Mar 25, 2011

First of all, thank you so much for your help, you have really got me thinking on something that will help.

The initial method I was using was terrible with N^N because it means it will cycle through combinations that are not possible as you can only use each letter once.

So in order to marginalize the amount I trying to use factorials with what you said before O(n!). This only allows for each letter to be used once.

Would using instead of 10x9x8.....3x2x1 use 1x2x3....8x9x10 be more efficient? Like you said, I then only need to search for words which have a greater number of letters, this again reducing the required iterations.

I did just wikipedia the binary search and it seems way more efficient then factorials or any other algorithm for that matter. So in order to use a binary search I would have to order the words I'm looking for by the number of letters they have. This seems like a way better idea. So correct me if I'm wrong but is this how a binary search works. At the start it will compare the number of letters in the search string to the number of letters in the middle of the array. Should it equal the number of letters it returns the position of it, if it does not it performs a search on the remaining, albeit greater or lesser than the searchable number of letters.

The thing with this sort of search is that I'm not searching for a typical word in a string, what I am doing is I'm trying to find words inside a 10 letter word, 10 letters being the biggest number possible, and 4 being the least. In this case, I dont understand how you would apply a binary search to this as I would always be searching for exactly the same amount of iterations as I'm looking for all possible words instead of just 1 word.

On a side note, I have never taken a course on algorithms or data structures. I only finished high school last year. I have dealt with lots of databases as such but never with this sort of complexity such as hashtables.

Vex · Apr 9, 2011

I understand what you want and think the best course of action would be to ask in a more programming-related forum. The complexity of this request exceeds what normal "mortals" would know right off the bat

Like me, I've made some binary search trees, but would have to do a lot of research to learn if I could help with your problem.

Good luck figuring it out!

Edit: Ok, I came up with something. This is not a solution, more like an idea that will take up a lot of storage space, but will be relatively fast (don't ask me about math

).

- You make a search tree of depth n where each node represents a character in the alphabet and splits in to 26 nodes (a-z). The number of nodes will be 1+ 26^n I guess.

- Each bottom node will have a list of words that all start with the combination of characters on the path from the root to the last node. For instance, if you go through the nodes containing 's' and 'i' and 'i' was the last node on the path, you would find words like 'ck' (sick), 'ckle' (sickle), 'nce' (since) in its string list. Any combination of characters that doesn't produce any words, will just have a null or false variable, so you fast and easy know its a dead-end.

- The idea is to split up the string you want words from into a char array, then go through the tree from root to end recursively. Meaning, if you have the string 'apple', you start with a, mark 'a' as used, then from the 'a' node, go through the 'p' node, mark 'p' as used, then go through the 'p' node connected to the last 'p' node, mark the second 'p' as used, then 'l', then 'e' until you are
1. ...out of characters, if you didn't reach a string-list, the word you're going through doesn't exist (in your data-structure at least).
2. ...reach a string-list, in which you iterate through and use nested loops or whatever to filter out all words that matches the combination of the remaining characters (that haven't been marked as 'used').
3. ...reach a dead-end, in other words a false boolean or empty string-list, meaning the current combination of characters doesn't form any word known here.
For each node you come to where you have multiple unmarked characters remaining, you will have to visit one node for each one (make a "thread" for each).

- To summarize:
Lets say you have the word 'apple'. Instead of making 5! search calls. This method will try a,p,p,l,e then a,p,l and stop there (instead of trying a,p,l + any combination of the remaining characters), because no words exist that start with 'apl'. You could to the same with a string array, but then a search call would take forever.

Problems using nested loops to find words

chanyman

chanyman

AionCannonzSC2

AionCannonzSC2

chanyman

chanyman

Dr Super Good

Dr Super Good

Resources

chanyman

chanyman

Dr Super Good

Dr Super Good

Resources

chanyman

chanyman

Dr Super Good

Dr Super Good

Resources

chanyman

chanyman

Dr Super Good

Dr Super Good

Resources

chanyman

chanyman

Dr Super Good

Dr Super Good

Resources

chanyman

chanyman

Vex

Vex

Similar threads