sized arrays vs hashtables

Aniki · Oct 7, 2015

vJass has this feature called sized arrays which turns an array declaration like this:

JASS:

globals
    integer array foo[40955]
endglobals

into something like this:

JASS:

globals
// processed:     integer array foo[40955]

integer array s__foo
integer array s__2foo
integer array s__3foo
integer array s__4foo
integer array s__5foo
integer array s__6foo // funny... generated but not used (silly vJass =))

endglobals

function sg__foo_get takes integer i returns integer
    if(i<8191) then
        return s__foo[i]
    elseif(i<24573) then
        if(i<16382) then
            return s__2foo[i-8191]
        else
            return s__3foo[i-16382]
        endif
    elseif(i<32764) then
        return s__4foo[i-24573]
    else
        return s__5foo[i-32764]
    endif
endfunction

I am curious if it's better than the hashtable native LoadInteger?
Doesn't seem like it, I mean a function call, some comparisons and an array read
vs whatever the hell Blizzard's hashtables do.
It would be rather funny if hashtables are slower right?

So yeah this is kind of a benchmark request, kind of =)...

Bribe · Oct 7, 2015

Hashtables could potentially be slower, but the biggest problems with sized arrays are: limited instance count; indices can be any integer with hashtables.

BPower · Oct 7, 2015

I guess you have to consider:
1. How large your sized array is.
2. How much data is stored within the hashtable.

But some stuff can't be squeezed easily and safely into arrays, like i.e. handle ids or type ids ( unit, destructable, items, ... ).

I guess that finally, like most times, we are talking about speed differences, which do not affect the
map and gameplay at all.

If I would put my money on sized arrays in case of a speed test.
I mean if you need an array size of 20000, you have to compare it with an hashtable with at least 10000 entries.

Zwiebelchen · Oct 8, 2015

Weren't hashtables benchmarked to be only like 60-80% slower than normal array reads (for flushed single-entry hashtables)?
In that case, I'd say hashtables are always superior to sized arrays beyond 8191 array indices.

However, the more stuff you put into the hashtable, the slower it gets (as it get closer to O(n) complexity), so for large quantities of data, a sized array might eventually break even in speed and even surpass hashtable reads.

The question is: how much data is needed for sized arrays to break even with hashtables? Someone should run a benchmark on this!

Aniki · Oct 9, 2015

Kind of off topic but I am really curious to know why vJass generates the if-elseif-else branches this way:

JASS:

function sg__big_array_get takes integer i returns integer
    if i < 8191 then
        return s__big_array[i]
    elseif i < 40955 then
        if i < 16382 then
            return s__2big_array[i - 8191]
        elseif i < 24573 then
                return s__3big_array[i - 16382]
        elseif i < 32764 then
            return s__4big_array[i - 24573]
        else
            return s__5big_array[i - 32764]
        endif
    elseif i < 49146 then
        return s__6big_array[i - 40955]
    elseif i < 65528 then
        if i < 57337 then
            return s__7big_array[i - 49146]
        else
            return s__8big_array[i - 57337]
        endif
    elseif i < 73719 then
        return s__9big_array[i - 65528]
    else
        return s__10big_array[i - 73719]
    endif
endfunction

instead of this:

JASS:

function sg__big_array_get takes integer i returns integer
    if i < 40955 then
        if i < 24573 then
            if i < 8191 then
                return s__big_array[i]
            elseif i < 16382 then
                return s__2big_array[i - 8191]
            else
                return s__3big_array[i - 16382]
            endif
        else
            if i < 32764 then
                return s__4big_array[i - 24573]
            else
                return s__5big_array[i - 32764]
            endif
        endif
    else
        if i < 65528 then
            if i < 49146 then
                return s__6big_array[i - 40955]
            elseif i < 57337 then
                return s__7big_array[i - 49146]
            else
                return s__8big_array[i - 57337]
            endif
        else
            if i < 73719 then
                return s__9big_array[i - 65528]
            else
                return s__10big_array[i - 73719]
            endif
        endif
    endif
endfunction

The second seems to require less comparisons (<) on average.

This is for 10 arrays (obviously).

Bribe · Oct 9, 2015

JassHelper is open source so you can mod it if you want. Otherwise you can code that using bracket operators "[]" to preserve array syntax.

LeP · Oct 9, 2015

Aniki said:

Kind of off topic but I am really curious to know why vJass generates the if-elseif-else branches this way:

JASS:

function sg__big_array_get takes integer i returns integer
    if i < 8191 then
        return s__big_array[i]
    elseif i < 40955 then
        if i < 16382 then
            return s__2big_array[i - 8191]
        elseif i < 24573 then
                return s__3big_array[i - 16382]
        elseif i < 32764 then
            return s__4big_array[i - 24573]
        else
            return s__5big_array[i - 32764]
        endif
    elseif i < 49146 then
        return s__6big_array[i - 40955]
    elseif i < 65528 then
        if i < 57337 then
            return s__7big_array[i - 49146]
        else
            return s__8big_array[i - 57337]
        endif
    elseif i < 73719 then
        return s__9big_array[i - 65528]
    else
        return s__10big_array[i - 73719]
    endif
endfunction

instead of this:

JASS:

function sg__big_array_get takes integer i returns integer
    if i < 40955 then
        if i < 24573 then
            if i < 8191 then
                return s__big_array[i]
            elseif i < 16382 then
                return s__2big_array[i - 8191]
            else
                return s__3big_array[i - 16382]
            endif
        else
            if i < 32764 then
                return s__4big_array[i - 24573]
            else
                return s__5big_array[i - 32764]
            endif
        endif
    else
        if i < 65528 then
            if i < 49146 then
                return s__6big_array[i - 40955]
            elseif i < 57337 then
                return s__7big_array[i - 49146]
            else
                return s__8big_array[i - 57337]
            endif
        else
            if i < 73719 then
                return s__9big_array[i - 65528]
            else
                return s__10big_array[i - 73719]
            endif
        endif
    endif
endfunction

The second seems to require less comparisons (<) on average.

This is for 10 arrays (obviously).

http://www.wc3c.net/showthread.php?t=99939

Vexorian said:
As for the binary split, I decided not to apply it directly, instead it will wait till there are more than 4 or 3 arrays and the remaining arrays are even. This speeds up the execution for low indexes, which are supposedly used more often.

Aniki · Oct 9, 2015

Yeah I guess that low indexes are more likely to be used than higher ones so... probably the right call by Vexorian here =).
But yeah that was before the hashtable natives were introduced which I am pretty sure are faster (1 native call vs "a function call, some comparisons and an array read").

Thank you for the reference LeP!

edo494 · Oct 9, 2015

I cant imagine hashtables being slow, just because native call is always faster than jass function call, + jass is interpreted and the native function is coded in well, native code(e.g. assembly in the binary)

Dr Super Good · Oct 9, 2015

I cant imagine hashtables being slow, just because native call is always faster than jass function call, + jass is interpreted and the native function is coded in well, native code(e.g. assembly in the binary)

Hashtables are slower than arrays due to the function call, multiple arguments and hashtable mechanics. Hashtables can also get very slow due to collisions as they lack a dynamic bucket array (or the bucket array has a maximum size).

JASS is not purely interpreted in the sense that it does compile to byte code during map load which is executed during a session. Sure the byte code is interpreted but that is done very efficiently. JASS is interpreted in the sense that every named element has to be dynamically looked up at run time (no linking) which is actually why it is so slow. This applies to custom functions, natives and all variables since they are named elements. Each lookup involves hashing the name at least once and performing a hashtable access to resolve a link to the underlying implementation of an element. This is why an optimization in JASS is to reduce name lengths since the hashing algorithm is O(n) in complexity where n is name length referring to a named element.

Native functions do run faster than custom functions but that is because native functions are not JASS so execute at native speed after they have been resolved and called.

edo494 · Oct 9, 2015

I know how it looks, and resolving collisions in native, compiled code is a lot faster than you may think. Probably 3 orders of magnitude faster than single Jass function call. Dont forget his array is nice, but this is basically a competition between native function call + some native code vs a Jass function call, a bunch of Jass comparisions, Jass math operation, array index and then finally Jass return.

Imo thats a lot of operations needed to be done.

sized arrays vs hashtables

Aniki

Aniki

Resources

Bribe

Bribe

Resources

BPower

BPower

Resources

Zwiebelchen

Zwiebelchen

Resources

Aniki

Aniki

Resources

Bribe

Bribe

Resources

LeP

LeP

Aniki

Aniki

Resources

edo494

edo494

Resources

Dr Super Good

Dr Super Good

Resources

edo494

edo494

Resources

Similar threads