Multiple hashtables

kingkwong · Nov 24, 2015

Purpose of having multiple hashtables

I usually only ever use a single hashtable in my maps.

Can someone please tell me possible reasons to have more than one hashtable?

I've seen other peoples maps and spells with different hashtables and I think I'm missing something important.

Wietlol · Nov 25, 2015

There is not really something important, except for easier usage, you may want to have multipl hashtables when you want to store data on the same id without having to bother yourself with adding an offset to the ids.

WaterKnight · Nov 25, 2015

The key dimensions may be lacking. Of course you won't fill 2^64 slots but the domain of definition may be large. Or you just want to hassle less with contingent management. Scattering the data has a better complexity. There are functions like FlushParentHashtable clearing the whole data structure efficiently in one swoop.

kingkwong · Nov 25, 2015

Wietlol said:
There is not really something important, except for easier usage, you may want to have multipl hashtables when you want to store data on the same id without having to bother yourself with adding an offset to the ids.

Can you please explain this id offset?

WaterKnight said:
The key dimensions may be lacking. Of course you won't fill 2^64 slots but the domain of definition may be large. Or you just want to hassle less with contingent management. Scattering the data has a better complexity. There are functions like FlushParentHashtable clearing the whole data structure efficiently in one swoop.

I was just playing around with it and realized that it would be easier to clean up.

PurgeandFire · Nov 25, 2015

If you are working on your own map, then you can usually manage keys easily by yourself. However, if you are designing systems for other people's maps, sometimes you'll just make a hashtable for your own system and ship it as it is. Since the system designer won't know what sort of hashtable implementation you have or which keys you are using, most of them will make their own. However, most of the GUI resources have a variable that you can switch to use your own hashtable, and most JASS resources will use a Table library that uses one hashtable for everything.

That is the main reason why other people's maps will have multiple hashtables.

Sometimes it might make it easier for organizational purposes, too. From a programming standpoint, you usually want to separate hashtables since their lookup/storage can degrade with more entries. It depends on the hashing/design heuristics they use. But with the wc3 implementation, you won't really notice any difference in performance even with millions of entries, so I wouldn't stress it.

DracoL1ch · Nov 25, 2015

sometimes its for extra readability, mostly - for vanilla jassers. I have no idea which keys has been used, and even more, which of them were cleared after usage. overlapping is bad.

kingkwong · Nov 25, 2015

Can you guys tell me how you use a new hashtables?

Do you use one for:
Each ability
Each hero type
Each player
Each ability type (eg. all knockback spells)

Im trying to set up everything nicely before I begin. Looking at some of my old maps, they just got messier and messier over time. I wish there was a Gold Standard format I could learn so no matter what I decide to add in the future, it will be easy to implement.

Wietlol · Nov 25, 2015

kingkwong said:
Can you please explain this id offset?

A hashtable works exactly like a 2-dimensional array of different data types.
(Possibly datasets in each slot allowing multiple data of different types to be stored under the same index, but I haven't tested it out

)

In any case, lets take this (String) array:

When I place "Text" in slot 0, I can load "Text" from slot 0.
But when I also place "Blablabla" in slot 0, then I cannot load "Text" any more.
"Text" has been overwritten by "Blablabla":

In hashtables, the exact same thing happens in it.
But instead of having 1 dimension, you have 2 dimensions (and you are independent from the object type):

In the hashtable, I can have "Text" in slot (0, 0) and "Blablabla" in slot (0, 1) and they will both be saved.

However, in most maps, you want to save something under the same index.
When I save "Text" in (0, 0) and "Blablabla" also in (0, 0) then "Text" will be overwritten again.
I want (for example) store the damage of a unit in my hastable.
So I store it under GetHandleId(whichUnit) and 0.
However, the same is already used by your missile system (or whatever) which uses that very same index to store a nice unique id for each missile.
Now you will have overwritten data because you use the same hashtable for everything.

To avoid things from using the same indexes, you add an offset to either of them to give it a different index.

For example, I want to have a "hashtable" with only 2 rows.
1, for damage and 1 for armor.
Then I use that to store the damage and armor of the unit.
Then I use "another" "hashtable" with only 1 row for the missile system.
That one will instead of writing to 0, write to 0 + offset (which is the total number of rows used by "other" "hashtables" before it... which is 2 in this case.

Then when I want another hashtable of 20 rows, those rows will be 23, 24, 25, ..., 42.

A very good implementation of this is Table.
NewTable (by Bribe) is something that pops up in my mind

Personally, I just don't really care about having everything inside one hashtable because I don't use hashtables that lot.
Table is basically made to avoid having 256+ hashtables (which exceed the limit) and because I don't make them that often, I don't really have the need for Table at all.

WaterKnight · Nov 25, 2015

My practice is that I have a set of hashtables for object instances. The object takes a parent key and child keys of that parent key are used for attributes. You can then use FlushChildHashtable to very efficiently get rid of all the object's data when it's being destroyed. It just matches the pattern well. To scatter the load you can randomize/cycle the used hashtable for an object and store the hashtable in a simple member variable.

Further hashtables are dedicated to special purposes like coordinates/multiple dimensions-based assignments (eg. the terrain type at a point), particular data structures or low level stuff (eg. caching boolexprs/triggers on the code primitive).

kingkwong · Nov 25, 2015

Wietlol said:
A hashtable works exactly like a 2-dimensional array of different data types.
(Possibly datasets in each slot allowing multiple data of different types to be stored under the same index, but I haven't tested it out )
it.

I normally have
save <variable> as <string ID for variable> of <key handle of unit> in Hashtable.
Surely if I use it liike that there should be no overlaps right? Is there a limitation on the rows that you can create? Is it the 2^64 that Waterknight mentioned before? The link is a bit complex for me right now but will try to learn the table.

WaterKnight said:
My practice is that I have a set of hashtables for object instances. The object takes a parent key and child keys of that parent key are used for attributes. You can then use FlushChildHashtable to very efficiently get rid of all the object's data when it's being destroyed. It just matches the pattern well. To scatter the load you can randomize/cycle the used hashtable for an object and store the hashtable in a simple member variable.

Further hashtables are dedicated to special purposes like coordinates/multiple dimensions-based assignments (eg. the terrain type at a point), particular data structures or low level stuff (eg. caching boolexprs/triggers on the code primitive).

How does scattering the load help? Sorry, although I'm a native english speaker, I do not understand a lot of what you are saying.
So for each unit you have, you save its data into this hashtable. When the unit dies, you clear the child hashtable. Is this right?

I don't really know how to use hashtables besides the simple methods they offer in the tutorials (missile system). How would you use it for other things.

Wietlol · Nov 25, 2015

kingkwong said:
I normally have
save <variable> as <string ID for variable> of <key handle of unit> in Hashtable.
Surely if I use it liike that there should be no overlaps right? Is there a limitation on the rows that you can create? Is it the 2^64 that Waterknight mentioned before? The link is a bit complex for me right now but will try to learn the table.

2^64 would be awesome

However the length of an integer is 4 bytes so that is 2^32.
Then there is the point that you also want to be able to use 0 (the first integer), so you would have 2^32 -1.
Then there are negative numbers which basically are the same but then the first bit is 1 instead of 0 so you have a limit of 2^(32-1) as negative value and 2^(32-1) -1 as positive limit.

But anyway, it doesn't really matter how low the chance is that you overwrite something, it just happens now and then.
So for stuff that require a massive sized table, you can create a new hashtable.
For each thing that only needs 1, 2, 3, 10, 100 rows, then you create a Table of it. (As long as you can use vJASS structs. If not, then try using a hashtable but only when necessary.)

Dr Super Good · Nov 26, 2015

There are only 2 reasons.

To avoid hash key collisions. One cannot combine keys derived by different mechanics together because there exists a chance that there might be a key collision. Such a collision can result in hard to detect game breaking bugs. As such for each key source you use you need a separate hashtable.
To improve performance. The underlying hashtable implementation seems to have a maximum limit on bucket array size. It will keep correctly storing and recalling mappings no matter how many are stored. As the bucket array load factor approaches 1 the performance starts to degrade towards O(n) as opposed to the usual and expected O(1). This is a problem when in the order of several hundred thousand mappings. Using multiple hashtables and spreading the data improves performance as you are equivalently increasing the size of underlying bucket array.

People should be using standard names for hashtables. These should describe the sources of parent and child keys. Some common examples...

JASS:

globals
    hashtable HT_TYPE_SKEY // Hashtable for object type parent and unique string derived key for mapping data to object types.
    hashtable HT_HANDLE_SKEY // Hashtable for handle parent and unique string derived key for mapping data to objects.
endglobals

Multiple hashtables

kingkwong

kingkwong

Wietlol

Wietlol

Resources

WaterKnight

WaterKnight

Resources

kingkwong

kingkwong

PurgeandFire

PurgeandFire

Resources

DracoL1ch

DracoL1ch

Resources

kingkwong

kingkwong

Wietlol

Wietlol

Resources

WaterKnight

WaterKnight

Resources

kingkwong

kingkwong

Wietlol

Wietlol

Resources

Dr Super Good

Dr Super Good

Resources

Similar threads