LUA tests and benchmarks

Trokkin · Jun 26, 2019

I've been asked to check several statements about LUA, so here's a bunch of tests and benchmarks from me.
This compilation might be supplemented should any questions appear. Ask any!

I'm using Trokkin/CeresStdLib for utility.

This test shows that coroutines are not multithreaded and could even be seen as direct function calls.
Which is pretty expected results, that we don't have proper multithreading here, considering wc3 net architecture.

Code:

busy = false
multithreaded = false

function a(n)
    local i = 0
    if busy then
        multithreaded = true
    end
    busy = true
    while i < 1000000 do
        i = i + 1
    end
    busy = false
    print("Thread #" .. n .. " finished it's work")
end

function check()
    if busy then
        print("Still busy")
    end
    if multithreaded then
        print("We're multithreaded!")
    else
        print("We're not multithreaded :(")
    end
end

init(function ()
    for i = 1,10 do
        coroutine.resume(coroutine.create(a), i)
    end
    check()
    TimerStart(CreateTimer(), 0.00, false, check)
    TimerStart(CreateTimer(), 1.00, false, check)
end)

Yay, dynamic Lua execution.

Code:

shell = CreateTrigger()
TriggerRegisterPlayerChatEvent(shell, Player(0), '%', false)
TriggerAddAction(shell, function()
    local s = GetEventPlayerChatString()
    s = SubString(s, 1, StringLength(s))
    local f = load(s)
    if not f then
        print('invalid shell command \'' .. s .. '\'')
    end
    f()
end)

This test takes a lot of time to iterate 2^31 times but proves the statement.
Meanwhile, JASS outputs 230768 on an equivalent test.

Code:

i = 0
function a()
    while i < 2147483647 do
        i = i + 1
    end
end

function check()
    print(i)
end

init(function()
    TimerStart(CreateTimer(), 0.00, false, check)
    a()
end)

Please note that results are highly prone to errors, I never get same result twice. I estimate precision to be +-3% of 'true' value.

Also, never compare language feature tests done on different machines.

Results (ms / sec per 1e6 calls):

nothing -> 0.025
test -> 0,0923
direct call -> 0,1164
indirect call -> 0,1159
indirect call #2 -> 0,1176
indirect call #3 -> 0,1190
indirect call #4 -> 0,1176
coroutines -> 3,636
coroutines #2 -> 2,143
pcall -> 0,1530
pcall #2 -> 0,1770
pcall #3 -> 3,345
pcall #4 -> 5,070
ForForce -> 3,550

Thus direct function calls are way superior to any other function calling method. Btw, hail code arrays, lol.

Code:

init(function()
    local function test()
        for i=1,10 do
        end
    end

    benchmark('nothing', function() end)
    benchmark('test', test)
    benchmark('direct call', function()
        test()
    end)

    local f = test
    benchmark('indirect call', function()
        f()
    end)

    local f = function()
        for i=1,10 do
        end
    end
    benchmark('indirect call #2', function()
        f()
    end)
 
    local arr = {}
    arr[0] = function ()
        for i=1,10 do
        end
    end
    arr[1] = test
    benchmark('indirect call #3', function()
        arr[0]()
    end)
    benchmark('indirect call #4', function()
        arr[1]()
    end)

    benchmark('coroutines', function()
        local c = coroutine.create(test)
        coroutine.resume(c)
    end)

    local c = coroutine.create(function ()
        while true do
            test()
            coroutine.yield()
        end
    end)
    benchmark('coroutines #2', function()
        coroutine.resume(c)
    end)

    benchmark('pcall', function()
        pcall(test)
    end)

    benchmark('pcall #2', function()
        pcall(function ()
            test()
        end)
    end)

    benchmark('pcall #3', function()
        pcall(function ()
            test()
            error("oops")
        end)
    end)

    benchmark('pcall #3', function()
        pcall(function ()
            test()
            a = "a" + 1 -- error: attempt to perform arithmetic on a string value
        end)
    end)

    local force = CreateForce()
    ForceAddPlayer(force, GetLocalPlayer())
    benchmark('ForForce', function()
        ForForce(force, test)
    end)
    DestroyForce(force)
end)

Code:

replaceNative("Player", function(i) return players[i] end)

ceres.addHook("main::before", function()
    localplayer = GetLocalPlayer()
    replaceNative("GetLocalPlayer", function() return localplayer end)
end)

Results (ms / sec per 1e6 calls):

GetLocalPlayer() -> 0.062
Player(0) -> 0,079
Native.GetLocalPlayer() -> 0,402
Native.Player(0) -> 2,336
players[0] -> 0,057
localplayer -> 0,048

I'm building an assumption that the less JASS natives/functions you use, the faster your code runs.

Code:

init(function()
    benchmark(
        'GetLocalPlayer()',
        function()
            local a = GetLocalPlayer()
        end
    )
    benchmark(
        'Player(0)',
        function()
            local a = Player(0)
        end
    )
    benchmark(
        'Native.GetLocalPlayer()',
        function()
            local a = Native.GetLocalPlayer()
        end
    )
    benchmark(
        'Native.Player(0)',
        function()
            local a = Native.Player(0)
        end
    )
    benchmark(
        'localplayer',
        function()
            local a = localplayer
        end
    )
    benchmark(
        'players[0]',
        function()
            local a = players[0]
        end
    )
end)

Result - 4,302 vs 3,854
10% advantage in favor of not using JASS again.

Code:

local time1 = 0
local time2 = 0
local function benchmark_timers()
    doAfter(0.01, function()
        local clock
        local start
        doAfter(0, function()
            clock = os.clock
            start = clock()
        end)
        for i = 1, 10000 do
            TimerStart(CreateTimer(), 0, false, function()
                DestroyTimer(GetExpiredTimer())
            end)
        end
        doAfter(0, function()
            time1 = time1 + (clock() - start)
        end)
    end)
    doAfter(0.02, function()
        local clock
        local start
        doAfter(0, function()
            clock = os.clock
            start = clock()
        end)
        for i = 1, 10000 do
            local t = CreateTimer()
            TimerStart(t, 0, false, function()
                DestroyTimer(t)
            end)
        end
        doAfter(0, function()
            time2 = time2 + (clock() - start)
        end)
    end)
end

init(function()
    doPeriodicalyCounted(0.03, 100, benchmark_timers)
    doAfter(3, function()
        Log.info(time1 .. ' vs ' .. time2)
    end)
end)

Bribe · Jun 26, 2019

When you say "coroutines", do you mean like assigning variables to functions and then calling said function later on? I'm wondering if that has a performance impact VS simply calling said function directly.

ie.

function f()
end

f()

vs.

f = function()
end

f()

Trokkin · Jun 27, 2019

Bribe said:
When you say "coroutines", do you mean like assigning variables to functions and then calling said function later on? I'm wondering if that has a performance impact VS simply calling said function directly.

ie.

function f()
end

f()

vs.

f = function()
end

f()

When I say "coroutines", I mean LUA coroutines, which is native language feature. And they are indeed much heavier than plain function calls but enables function to yield more than one value at a time. Also, easy producer-consumer solving and other benefits that aren't quite fit to what is required for WC3.

Bribe · Jun 27, 2019

I've got a lot to learn! This makes me really look forward to a possible future where you can take an existing JASS-based map and convert it to a Lua-baaed on (preferrably with vJass/Lua coexistence like TriggerHappy enabled in his JNGP.

Trokkin · Jun 27, 2019

Updated.

I could have made similar benchmarks for JASS to compare their performance, but I honestly don't think it is worth the trouble. I don't even consider JASS worth of using anymore. I mean, prove me wrong, but I can't see anything that JASS does better than LUA, with the fact that a simple compiler could easily convert legacy JASS code and even vJASS into LUA.

Bribe · Jun 27, 2019

So a 10% loss in speed via passing function values - that's absolutely an jnsane jump in performance! Event systems could've never dreamed of such efficiency.

Trokkin · Jun 27, 2019

Updated again: fixed indirect calls to not waste time on local variables and added another coroutine benchmark, showing that .resume is twice as heavy as .create. Also rerun tests on 10x to increase precision.
Coroutines provide a unique way to run code, so although they are way less performant, they're still valuable feature.

Btw, Bribe, I guess it's more like 20%, since benchmarking 'test' gives a lot less than 'function () test() end'
Sad but still awesome compared to old ways to run code.

Bribe · Jun 27, 2019

Trokkin said:
Updated again: fixed indirect calls to not waste time on local variables and added another coroutine benchmark, showing that .resume is twice as heavy as .create. Also rerun tests on 10x to increase precision.
Coroutines provide a unique way to run code, so although they are way less performant, they're still valuable feature.

Btw, Bribe, I guess it's more like 20%, since benchmarking 'test' gives a lot less than 'function () test() end'
Sad but still awesome compared to old ways to run code.

So with coroutines are we actually delegating WarCraft 3 to run on other CPU cores? Makes me wonder if the game engine and user script are currently in sync and - if so - would the game engine continue running once the coroutine kicked in? Maybe this could be used instead of a second timer for certain behavior.

Unless of course the main thread is suspended while the user coroutine is deployed.

Dr Super Good · Jun 27, 2019

Trokkin said:
I've been asked to check several statements about LUA

You can firstly check how to spell it. It is Lua not LUA.

Lua (programming language) - Wikipedia

Trokkin said:
This test shows that coroutines are not multithreaded and could even be seen as direct function calls.

As specified by the Lua manual...

Lua supports coroutines, also called collaborative multithreading. A coroutine in Lua represents an independent thread of execution. Unlike threads in multithread systems, however, a coroutine only suspends its execution by explicitly calling a yield function.

From what I can tell collaborative multithreading refers to a thread based implementation of cooperative multitasking. By very definition this means only 1 thread will be executing at any time.

This is required since otherwise Lua has limited ability to handle race conditions.

Trokkin said:
Apparently, loadstring() leads to a silent thread crash or something where it is called.

Obviously as "loadstring" does not exist. That function is part of the C level API for Lua.

Any sort of I/O based functions do not work in Warcraft III's Lua. This is likely for security reasons.

Trokkin said:
Results (ms / sec per 1e6 calls):

test -> 0,0923

direct call -> 0,1164

indirect call -> 0,1159

indirect call #2 -> 0,1176

indirect call #3 -> 0,1190

indirect call #4 -> 0,1176

coroutines -> 3,636

coroutines #2 -> 2,143

ForForce -> 3,550

Thus direct function calls are way superior to any other function calling method. Btw, hail code arrays, lol.

No one really cares for the performance of Lua with respect to itself as that is well documented online. People care about the performance of Lua with respect to JASS and exactly how much faster it is than JASS.

Bribe said:
When you say "coroutines", do you mean like assigning variables to functions and then calling said function later on? I'm wondering if that has a performance impact VS simply calling said function directly.

No he is referring to the Lua Coroutine functionality. Their purpose is to implement basic asynchronous operation support where a calculation or operation takes a significant amount of time and may need to suspend execution multiple times to allow other threads to run. This is a very old and light weight form of multi tasking which only supports a single thread running at any given time and hence has no concern of race conditions due to all instructions executed having a well defined order.

An example in Warcraft III where one might use coroutines is with terrain generation. Terrain generation can take a significant amount of time, in the order of seconds. If the time taken is longer than 0.02 ms then there will be a visible performance problem due to a reduction in frame rate. To stop this one could run the generation algorithm as a coroutine which every 1,000 or 10,000 operations yields. Then a periodic timer could be used that every 0.02 ms resumes the coroutine. This prevents the Lua virtual machine from stalling the execution of Warcraft III for seconds while terrain is generated and hence allows the frame rate to remain high.

Another example is to implement proper, accurate Wait function calls. A function with accurate wait is started from the callback of a timer by resuming its coroutine. This resumed function then advances until it wants to wait in which case it then yields, passing an argument of the desired wait time. The timer callback then starts the timer with the returned wait time and then finishes. After the desired game time the timer will again run the callback which will resume the coroutine which will then continue execution from after the yield call. Of course this example is theoretical, it seems likes it would work but I have not gotten around to testing it.

Bribe said:
So with coroutines are we actually delegating WarCraft 3 to run on other CPU cores?

See above. Lua is specified to run only on a single execution unit with cooperative multitasking (collaborative multithreading) support. Same as JASS and Galaxy and pretty much all scripting languages. This is required to avoid race conditions which can occur with pre-emptive or simultaneous multithreading.

You can only feel lucky they do this since race conditions are some of the hardest errors to track down and would be responsible for immeasurable numbers of crashes and out of sync bugs.

Bribe said:
Makes me wonder if the game engine and user script are currently in sync and - if so - would the game engine continue running once the coroutine kicked in?

If all Coroutines yield and all executing Lua threads finish then the game engine will resume until it next invokes the Lua virtual machine. At this time new Lua threads will start to execute and coroutines can be resumed by those.

Cokemonkey11 · Jun 28, 2019

Dr Super Good said:
You can firstly check how to spell it. It is Lua not LUA.

Lua (programming language) - Wikipedia

Who cares?

Dr Super Good said:
As specified by the Lua manual...

From what I can tell collaborative multithreading refers to a thread based implementation of cooperative multitasking. By very definition this means only 1 thread will be executing at any time.

This is required since otherwise Lua has limited ability to handle race conditions.

Have you really never heard of GIL?

Dr Super Good said:
No one really cares for the performance of Lua with respect to itself as that is well documented online. People care about the performance of Lua with respect to JASS and exactly how much faster it is than JASS.

Of course we care. We're interested in knowing which from different suppoted LUA methods for solving the same problem would be best selected for new projects.

="Dr Super Good said:
No he is referring to the Lua Coroutine functionality.

This question was already answered.

="Dr Super Good said:
See above. Lua is specified to run only on a single execution unit with cooperative multitasking (collaborative multithreading) support. Same as JASS and Galaxy and pretty much all scripting languages. This is required to avoid race conditions which can occur with pre-emptive or simultaneous multithreading.

This doesn't necessitate running on a single CPU core. Sure, it almost definitely does that, but you explanation is both false, and boring.

="Dr Super Good said:
You can only feel lucky they do this since race conditions are some of the hardest errors to track down and would be responsible for immeasurable numbers of crashes and out of sync bugs.

Implying that running on multiple CPU cores necessarily yields race conditions. It does not.

Dr Super Good said:
If all Coroutines yield and all executing Lua threads finish then the game engine will resume until it next invokes the Lua virtual machine. At this time new Lua threads will start to execute and coroutines can be resumed by those.

Citation needed

Dr Super Good · Jun 28, 2019

Cokemonkey11 said:
This doesn't necessitate running on a single CPU core. Sure, it almost definitely does that, but you explanation is both false, and boring.

Only 1 Lua thread will be executing at any given time due to how Lua has specified that coroutines operate. As I explained above. There is no ambiguity in it, and any other behaviour would be a pointless deviation from the Lua standard.

Cokemonkey11 said:
Implying that running on multiple CPU cores necessarily yields race conditions. It does not.

Yes it will cause race conditions if given into the hands of the average map maker. These are the same map makers that thought that running every function call in its own thread in StarCraft II made stuff run faster. Fortunately that also uses a collaborative multithreading model so it only made stuff run slower instead of creating hundreds of help requests to diagnose race conditions. Ultimately people like myself would then have to help diagnose these race conditions, which should not and will not exist in the first place.

An example of a race condition would be if 2 simultaneously executing threads were to try and change a units stats, say by getting the current health of the unit and adding some amount to it. The result could be correct that both health additions applied. The result could also be wrong with only 1 of the health additions applying. Some garbage amount of health could occur due to how memory was read and cached. The application might out right crash because of something that happens under the hood becoming inconsistent such as a bitfield flag being set. Now one can fix this by Java style synchronization blocks around the entire code, but what that has just done is turn the simultaneously executing threads into cooperatively executing threads which is what we have already. Even if one guarded just each native individually that would not solve all the race conditions described. Let us not forget that all that synchronization adds overhead.

Most tasks that people want to do with triggers will not and mostly cannot benefit from simultaneous multithreading.

Cokemonkey11 said:
Citation needed

William has used coroutines to create an accurate wait function. As described in my example above. It worked. You can try it yourself, there is nothing secret or complicated about it.

Trokkin · Jun 30, 2019

Dr Super Good, is there a way to use Lua functions from jass? If there's any, then I'd be willing to benchmark jass alongside with Lua of course.

Dr Super Good · Jun 30, 2019

Trokkin said:
Dr Super Good, is there a way to use Lua functions from jass? If there's any, then I'd be willing to benchmark jass alongside with Lua of course.

That is not possible. Only 1 virtual machine can be used. As such a map will either use all JASS, or it will use all Lua. To help with porting there is a JASS To Lua transpiler, but the other direction does not exist.

The way I would recommend benchmarking is to do some common coding operations which can be implemented like for like in both JASS2 and Lua, loop them a lot so that the execution is non-trivial and compare the overall time that WC3 is frozen. Be aware that JASS has an oplimit which crashes the thread and so must not be run into for fair comparisons.

Trokkin · Jun 30, 2019

Dr Super Good said:
The way I would recommend benchmarking is to do some common coding operations which can be implemented like for like in both JASS2 and Lua, loop them a lot so that the execution is non-trivial and compare the overall time that WC3 is frozen.

That's exactly how I benchmarked Lua here, if you wonder. I was asking because Jass has zero functionality to measure time spent frozen, without dll injections with stopwatch natives. If someone with those injections is willing do this benchmarking, I'm willing to provide tests.

Trokkin · Jun 30, 2019

pcall seems to be very performant (~20% less than direct calls), especially compared to old ways of error handling.
Except for cases when an error is thrown - a pcall with an error throw is equivalent to both coroutine creation and exploit, other kinds of errors seems to take even more time.
xpcall, by the way, is performant equivalently to pcall, error handling function excluded.

Dr Super Good · Jun 30, 2019

Trokkin said:
That's exactly how I benchmarked Lua here, if you wonder. I was asking because Jass has zero functionality to measure time spent frozen, without dll injections with stopwatch natives. If someone with those injections is willing do this benchmarking, I'm willing to provide tests.

Well traditionally people counted how many seconds WC3 is frozen with a stop watch. And I mean manually starting and stopping it, not very scientific but good enough if the freeze time is long enough to minimize error.

For example if it is frozen ~35 seconds while Lua is frozen 33.3 then one can say they are pretty much as fast for that kind of operation. However if JASS takes ~40 seconds while Lua takes 11.2 then JASS is nearly 4 times slower for that sort of operation.

This sort of information has the potential to be very useful at persuading people to port their JASS maps to Lua. Especially if their map suffers from performance issues a large order of magnitude improvement in trigger performance could be a big deal. On the other hand if for common sorts of operations JASS is only a few percent slower then then there is very little to be gained from such a port and instead the developer will need to focus on general optimizations.

TriggerHappy · Jun 30, 2019

Trokkin said:
If someone with those injections is willing do this benchmarking, I'm willing to provide tests.

It should be pretty easy to test on your own

Custom Natives on 1.29.2
[Need Help] Back to 1.29

Trokkin · Jun 30, 2019

TriggerHappy said:
It should be pretty easy to test on your own

Custom Natives on 1.29.2
[Need Help] Back to 1.29

Last time I tried that out, my linux complained about something something achtung dll injection something something. Anyways, my interest in researching JASS fades away any second I write in Lua, so I'm definitely not going to do that.
I'm a bad guy, I know that :grin:

Dr Super Good said:
Well traditionally people counted how many seconds WC3 is frozen with a stop watch. And I mean manually starting and stopping it, not very scientific but good enough if the freeze time is long enough to minimize error.

For example if it is frozen ~35 seconds while Lua is frozen 33.3 then one can say they are pretty much as fast for that kind of operation. However if JASS takes ~40 seconds while Lua takes 11.2 then JASS is nearly 4 times slower for that sort of operation.

This sort of information has the potential to be very useful at persuading people to port their JASS maps to Lua. Especially if their map suffers from performance issues a large order of magnitude improvement in trigger performance could be a big deal. On the other hand if for common sorts of operations JASS is only a few percent slower then then there is very little to be gained from such a port and instead the developer will need to focus on general optimizations.

lol, that's hilarious but is kinda working way.

Those tests I've done already shows the superiority of Lua over JASS. ForForce is the fastest way to run function reference in JASS, and ForForce is 30+ times slower than function calls within Lua. Magnitudes.
Then, performance isn't the ultimate argument, and if JASS isn't lagging that hard then why bother about performance anyway. For those who care about performance usually also works an argument that Lua is also much more convenient for coding.

lishi1608 · Jul 1, 2019

Trokkin said:
Updated.

I could have made similar benchmarks for JASS to compare their performance, but I honestly don't think it is worth the trouble. I don't even consider JASS worth of using anymore. I mean, prove me wrong, but I can't see anything that JASS does better than LUA, with the fact that a simple compiler could easily convert legacy JASS code and even vJASS into LUA.

There are lots of good useful snippets/systems in vJASS, that's why I still choose it. If there are alternatives in lua I probably will choose lua.

Trokkin · Jul 1, 2019

lishi1608 said:
There are lots of good useful snippets/systems in vJASS, that's why I still choose it. If there are alternatives in lua I probably will choose lua.

It isn't hard to translate vjass to lua, at least without language specific optimizations. I've heard there is already a translator that does it automatically. Also I'm currently working on kind of a standard library for Lua, so I'd like to hear what exactly is holding you back on vJass, that might get a priority on my todo list.

lishi1608 · Jul 1, 2019

Trokkin said:
It isn't hard to translate vjass to lua, at least without language specific optimizations. I've heard there is already a translator that does it automatically. Also I'm currently working on kind of a standard library for Lua, so I'd like to hear what exactly is holding you back on vJass, that might get a priority on my todo list.

that will be awesome to have a compiler from vJASS to Lua, could you share a link?

I use a lot of libs like:
UnitDex by TriggerHappy
Table, SpellEffectEvent by Bribe
RegisterNativeEvent, RegisterPlayerUnitEvent by Bannar
TimerUtils
GroupUtils by Rising_Dusk
DamageEngine by Bribe (I edited it to fully in vJASS in my map)
xecast by Vexorian

and some other less commonly used systems

I suspect some vJASS system exist because people don't have useful language features/standard libs to make things. Maybe with lua, they will be less important.

And I suggest you look at the wurst project. If memory serves they have translated some useful libs to wurst which you may find useful for your todo list.

edit: it's here. These may be very commonly use lib: wurstscript/WurstStdlib2

Dr Super Good · Jul 1, 2019

One can currently convert from vJASS to JASS (JASSHelper) and then from JASS to Lua (the inbuilt JASS2ToLua transpiler). The issue is that there are some bugs with this and also the performance is sub optimal compared with using Lua directly. In theory one could write a vJASSToLua transpiler which could create very efficient code by taking advantage of Lua features, but seeing how recent 1.31 was released that may take many months before such a project is mature enough to be useful for production.

Cokemonkey11 · Jul 2, 2019

Dr Super Good said:
Only 1 Lua thread will be executing at any given time due to how Lua has specified that coroutines operate. As I explained above. There is no ambiguity in it, and any other behaviour would be a pointless deviation from the Lua standard.

Fine, still doesn't necessitate running on a single core.

Dr Super Good said:
Yes it will cause race conditions if given into the hands of the average map maker. These are the same map makers that thought that running every function call in its own thread in StarCraft II made stuff run faster. Fortunately that also uses a collaborative multithreading model so it only made stuff run slower instead of creating hundreds of help requests to diagnose race conditions. Ultimately people like myself would then have to help diagnose these race conditions, which should not and will not exist in the first place.

An example of a race condition would be if 2 simultaneously executing threads were to try and change a units stats, say by getting the current health of the unit and adding some amount to it. The result could be correct that both health additions applied. The result could also be wrong with only 1 of the health additions applying. Some garbage amount of health could occur due to how memory was read and cached. The application might out right crash because of something that happens under the hood becoming inconsistent such as a bitfield flag being set. Now one can fix this by Java style synchronization blocks around the entire code, but what that has just done is turn the simultaneously executing threads into cooperatively executing threads which is what we have already. Even if one guarded just each native individually that would not solve all the race conditions described. Let us not forget that all that synchronization adds overhead.

Most tasks that people want to do with triggers will not and mostly cannot benefit from simultaneous multithreading.

This has nothing to do with the quoted text. You still haven't given a concrete reason why multiple cores can't be used.

Dr Super Good said:
William has used coroutines to create an accurate wait function. As described in my example above. It worked. You can try it yourself, there is nothing secret or complicated about it.

Again, this has nothing to do with the quoted text. The citation is needed for your claims about how the interpreter and engine interoperate, not about what's possible/sound to write with LAU coroutines.

Trokkin said:
Those tests I've done already shows the superiority of Lua over JASS. ForForce is the fastest way to run function reference in JASS, and ForForce is 30+ times slower than function calls within Lua. Magnitudes.
Then, performance isn't the ultimate argument, and if JASS isn't lagging that hard then why bother about performance anyway. For those who care about performance usually also works an argument that Lua is also much more convenient for coding.

For use with a compiler (say, wurst), the cost of funtion/trigger calls is pretty inconsequential. I'm more interested in comparing potential dispatch methods.

What's the cost comparison between native lua maps and hashtable(int, int, int)?

Trokkin said:
It isn't hard to translate vjass to lua, at least without language specific optimizations. I've heard there is already a translator that does it automatically. Also I'm currently working on kind of a standard library for Lua, so I'd like to hear what exactly is holding you back on vJass, that might get a priority on my todo list.

If you manually convert your vjass to jurst (pretty easy), you can get the wurst compiler to emit lua.

lishi1608 said:
And I suggest you look at the wurst project. If memory serves they have translated some useful libs to wurst which you may find useful for your todo list.

edit: it's here. These may be very commonly use lib: wurstscript/WurstStdlib2

Wurst now also has a package manager, so you can easily add more dependencies to your project

Dr Super Good · Jul 2, 2019

Cokemonkey11 said:
Fine, still doesn't necessitate running on a single core.

Only 1 thread of Lua will be executing at any time on any core. Due to how physical thread scheduling works that thread might move between cores.

Cokemonkey11 said:
This has nothing to do with the quoted text. You still haven't given a concrete reason why multiple cores can't be used.

I think you are misunderstanding what I am trying to say. Multiple cores cannot be used simultaneously but any single core can be used. The actual limit on which cores the single physical Lua thread will run on depends on the application core affinity.

Internally Lua will be running off a single physical thread. The reason to choose this during implementation over many physical threads with locks to prevent simultaneous exeuction is that physical threads are OS handles so very expensive to create (require kernel calls) and that thread context switches or movements between cores also incur overhead. Since only 1 Lua thread will be executing at any time it makes the most sense from a performance perspective to have all Lua threads run off a single physical thread to minimize thread creation and destruction overhead and to try and encourage that thread to remain on the same core to minimize context switches.

Lua also is targeted at embedded systems where there might be no physical multi-threading API and so there is only 1 execution thread to run the Lua VM off.

Cokemonkey11 said:
Again, this has nothing to do with the quoted text. The citation is needed for your claims about how the interpreter and engine interoperate, not about what's possible/sound to write with LAU coroutines.

One can see it via practical testing. If it did not work as described then the approach I mentioned would not work as well. Since it does work it must work vaguely like how I mentioned.

Bribe · Jul 4, 2019

lishi1608 said:
I use a lot of libs like:
UnitDex by TriggerHappy
Table, SpellEffectEvent by Bribe
RegisterNativeEvent, RegisterPlayerUnitEvent by Bannar
TimerUtils
GroupUtils by Rising_Dusk
DamageEngine by Bribe (I edited it to fully in vJASS in my map)
xecast by Vexorian

Core systems should be manually coded to Lua for those who use it.

Table and, for the most part, Hashtables are redundant in Lua due to everything in Lua being a Table. NewTable has a lot of extended complexity due to needing to store various types of values. In Lua there are no types, so the Table syntax would be exponentially smaller.

TimerUtils should also be unnecessary due to the superior way handles are treated by the Lua VM, with Garbage Collection and all. The data attachment is also unnecessary because in Lua you can pass parameters to the callback function.

GroupUtils is only useful for GroupRefresh and including the IsUnitInRangeXY check in a filterfunc. GroupRefresh could be made redundant by using the new natives when iterating over lists instead of the old-style FirstOfGroup loops.

I already have a fully-coded DamageEngine 5.3 written in Lua, but I'm waiting for the issues to be fully ironed-out in the current vJass Damage Engine before I make the final release.

xecast is an absolutely massive library. There are surely many optimizations that could be made to XE in general, taking advantage of new natives. But I think whoever designs such a system first should really focus on which API they want to utilize, as there are a wide array of resources that compare to XE modules.

lishi1608 · Jul 4, 2019

Bribe said:
Core systems should be manually coded to Lua for those who use it.

Table and, for the most part, Hashtables are redundant in Lua due to everything in Lua being a Table. NewTable has a lot of extended complexity due to needing to store various types of values. In Lua there are no types, so the Table syntax would be exponentially smaller.

TimerUtils should also be unnecessary due to the superior way handles are treated by the Lua VM, with Garbage Collection and all. The data attachment is also unnecessary because in Lua you can pass parameters to the callback function.

GroupUtils is only useful for GroupRefresh and including the IsUnitInRangeXY check in a filterfunc. GroupRefresh could be made redundant by using the new natives when iterating over lists instead of the old-style FirstOfGroup loops.

I already have a fully-coded DamageEngine 5.3 written in Lua, but I'm waiting for the issues to be fully ironed-out in the current vJass Damage Engine before I make the final release.

xecast is an absolutely massive library. There are surely many optimizations that could be made to XE in general, taking advantage of new natives. But I think whoever designs such a system first should really focus on which API they want to utilize, as there are a wide array of resources that compare to XE modules.

Wow damage engine in Lua!

good information. like I said "some vJASS system exist because people don't have useful language features/standard libs to make things."

I guess if migrate to Lua, most work will be on systems which are highly bound to wc3 APIs(doesn't seem to be right English.. please forgive). Like xecast, missile systems, push back systems, shops, item receipts and so on. But I'm sure the implement will be shorter and cleaner and even more elegant.

Cokemonkey11 · Jul 4, 2019

Dr Super Good said:
Only 1 thread of Lua will be executing at any time on any core. Due to how physical thread scheduling works that thread might move between cores.

Tautologous - only 1 thread will ever be executed by 1 core at any one time, so by extension, the same is true for LUA threads. What's your point?

Dr Super Good said:
I think you are misunderstanding what I am trying to say.

I think you're not saying what you mean, or changing what you're saying

Dr Super Good said:
Multiple cores cannot be used simultaneously but any single core can be used. The actual limit on which cores the single physical Lua thread will run on depends on the application core affinity.

Citation needed intensifies

Dr Super Good said:
Internally Lua will be running off a single physical thread.

How do you know?

Dr Super Good said:
The reason to choose this during implementation over many physical threads with locks to prevent simultaneous exeuction is that physical threads are OS handles so very expensive to create (require kernel calls) and that thread context switches or movements between cores also incur overhead. Since only 1 Lua thread will be executing at any time it makes the most sense from a performance perspective to have all Lua threads run off a single physical thread to minimize thread creation and destruction overhead and to try and encourage that thread to remain on the same core to minimize context switches.

- Any asshat with google can read about the advantages and disadvantages of architecting systems with a single thread of multiple threads
- Locks aren't the only way to synchronise concurrent systems
- OS handles are not necessarily expensive to create; "kernel calls" (syscalls) aren't necessarily expensive to execute
- multi-threaded systems don't necessarily require context switching or movement between cores
- citation needed

Dr Super Good said:
Lua also is targeted at embedded systems where there might be no physical multi-threading API and so there is only 1 execution thread to run the Lua VM off.

An instance of LUA is always single-threaded - that doesn't prohibit system architects from spawning multiple lua instances, communicating between them, or running a backup instance for redundancy (as an example)

Dr Super Good said:
One can see it via practical testing. If it did not work as described then the approach I mentioned would not work as well. Since it does work it must work vaguely like how I mentioned.

One can *guess* how things work based on experience. Testing doesn't prove the absense of things, it's for exploring and reasoning about a system. See: pretty much every concurrency bug ever

Bribe said:
Core systems should be manually coded to Lua for those who use it.

"should"? Why?

Bribe said:
Table and, for the most part, Hashtables are redundant in Lua due to everything in Lua being a Table. NewTable has a lot of extended complexity due to needing to store various types of values. In Lua there are no types, so the Table syntax would be exponentially smaller.

TimerUtils should also be unnecessary due to the superior way handles are treated by the Lua VM, with Garbage Collection and all. The data attachment is also unnecessary because in Lua you can pass parameters to the callback function.

GroupUtils is only useful for GroupRefresh and including the IsUnitInRangeXY check in a filterfunc. GroupRefresh could be made redundant by using the new natives when iterating over lists instead of the old-style FirstOfGroup loops.

Nice, this is useful for those of us less familiar with lua - thanks

Dr Super Good · Jul 5, 2019

I would suggest looking at the Lua source code. Sure Blizzard's implementation will be slightly different in that it uses software floats and can save the VM state to allow save/load but generally it should be the same.

Cokemonkey11 · Jul 6, 2019

Dr Super Good said:
I would suggest looking at the Lua source code. Sure Blizzard's implementation will be slightly different in that it uses software floats and can save the VM state to allow save/load but generally it should be the same.

As you seem to consistently side-step the point of the conversation, and everything you say is entirely void of veracity, I'm going to stop discussing this.

KeepVary · Jul 10, 2019

Trokkin said:
Apparently, loadstring() leads to a silent thread crash or something where it is called.

Dr Super Good said:
Obviously as "loadstring" does not exist. That function is part of the C level API for Lua.

Any sort of I/O based functions do not work in Warcraft III's Lua. This is likely for security reasons.

Actuality "loadstring" has been changed a long time ago, this function called "load" after Lua 5.2 updated, and 1.31's Lua engine is based on Lua 5.3.

Here's a list of 1.31's available Lua functions:

Lua:

[basic]
print
load
type
tonumber
tostring
pairs
ipairs
select
next
pcall
xpcall
error
assert
setmetatable
getmetatable
rawset
rawget
rawlen
rawequal
collectgarbage

[math]
math.abs
math.min
math.max
math.floor
math.ceil
math.modf
math.fmod
math.sqrt
math.log
math.exp
math.ult
math.pi
math.deg
math.rad
math.sin
math.cos
math.tan
math.asin
math.acos
math.atan
math.type
math.tointeger
math.mininteger
math.maxinteger
math.huge
math.random
math.randomseed

[string]
string.len
string.lower
string.upper
string.reverse
string.rep
string.sub
string.gsub
string.find
string.match
string.gmatch
string.format
string.byte
string.char
string.pack
string.unpack
string.packsize

[table]
table.insert
table.move
table.remove
table.sort
table.concat
table.pack
table.unpack

[utf8]
utf8.char
utf8.charpattern
utf8.codes
utf8.codepoint
utf8.len
utf8.offset

[os]
os.clock
os.date
os.time
os.difftime

[coroutine]
coroutine.create
coroutine.wrap
coroutine.yield
coroutine.resume
coroutine.status
coroutine.running
coroutine.isyieldable

As you can see, function "load" is there.
Besides, any of those I/O related functions was gone as expected.

Dr Super Good · Jul 10, 2019

KeepVary said:
Actuality "loadstring" has been changed a long time ago, this function called "load" after Lua 5.2 updated, and 1.31's Lua engine is based on Lua 5.3.

"loadstring" is not "load". My original point still is correct...

If "load" cannot be used by users then it might be used internally, e.g. as part of the JASS API interop initialization. If "load" does not work for users the reason would likely be for security since some stupid map maker might allow players unrestricted access to it.

KeepVary · Jul 10, 2019

Dr Super Good said:
"loadstring" is not "load". My original point still is correct...

If "load" cannot be used by users then it might be used internally, e.g. as part of the JASS API interop initialization. If "load" does not work for users the reason would likely be for security since some stupid map maker might allow players unrestricted access to it.

They are the same, it's just a name change after Lua 5.2. I guess you are talking about "loadfile".

"loadstring" or "load" is a function that can convert specified string to function.
If you run this code:

Lua:

local str = load( "return 'load is loadstring'" )()
print(str)

It will print "load is loadstring" absolutely.

Dr Super Good · Jul 10, 2019

KeepVary said:
They are the same, it's just a name change after Lua 5.2. I guess you are talking about "loadfile".

"loadstring" or "load" is a function that can convert specified string to function.
If you run this code:
It will print "load is loadstring" absolutely.

Except "load" is not "loadstring"... As you posted above it is declared as "load" and not "loadstring". hence "loadstring" will attempt to execute nil and not "load".

The load function can be used to run binary Lua as well. There is a warning associated with it that such binary can be maliciously crafted to crash the Virtual Machine. Hence disabling for security reasons is a possibility.

Lizreu · Jul 11, 2019

Dr Super Good said:
Except "load" is not "loadstring"... As you posted above it is declared as "load" and not "loadstring". hence "loadstring" will attempt to execute nil and not "load".

The load function can be used to run binary Lua as well. There is a warning associated with it that such binary can be maliciously crafted to crash the Virtual Machine. Hence disabling for security reasons is a possibility.

loadstring's bincode loading is already disabled in WC3. It was one of the first things I tried :^)

KeepVary · Jul 11, 2019

Dr Super Good said:
The load function can be used to run binary Lua as well.

All right it's my negligence.
I'm not familiar with Lua 5.1 and forgot that "load" is a different function from "loadstring".

KeepVary said:
They are the same, it's just a name change after Lua 5.2.

Let me correct this to: "loadstring" and "load" were merged since Lua 5.2 released.

But in 1.31, "load" has almost no difference with "loadstring".
The point is that we are able to write dynamic code, this is actually what I want to say.

Trokkin · Jul 11, 2019

Benchmarked another two things. Natives seem to never be efficient.
And since we can substitute natives now, I suggest we slowly replace the entire common.j and blizzard.j for good, for those GUI users.

E: thanks to KeepVary, load() can indeed be used to load code in runtime.

Bribe · Jul 11, 2019

Trokkin said:
Benchmarked another two things. JASS seems to never be efficient.
And since we can substitute natives now, I suggest we slowly replace the entire common.j and blizzard.j for good, for those GUI users.

How do we declare common.ai natives now - just function Name(args) end?

Trokkin · Jul 11, 2019

Bribe said:
How do we declare common.ai natives now - just function Name(args) end?

You mean, replace? Yes, you can simply overwrite the native/function -- but you better store old version, especially if you're going to use it once.

Code:

--- It is generally not recommended to use old natives.
Native = {}

function replaceNative(name, new_f)
    if not Native[name] and _G[name] then
        Native[name] = _G[name]
    end
    _G[name] = new_f
end

But I didn't check if we can use common.ai from Lua.
If we can, we should be able to straight use it. Declaring function Name(...) end would overwrite it, so you might as well write Name = DoNothing.

Bribe · Jul 11, 2019

Trokkin said:
You mean, replace? Yes, you can simply overwrite the native/function -- but you better store old version, especially if you're going to use it once.

Code:

--- It is generally not recommended to use old natives. Native = {} function replaceNative(name, new_f) if not Native[name] and _G[name] then Native[name] = _G[name] end _G[name] = new_f end

But I didn't check if we can use common.ai from Lua.
If we can, we should be able to straight use it. Declaring function Name(...) end would overwrite it, so you might as well write Name = DoNothing.

I'll run a test later tonight to see if I can do print(UnitAlive(CreateUnit(Player(0), 'hfoo', 0, 0, 0))) without any setup and see if it returns true.

Trokkin · Jul 11, 2019

Bribe said:
I'll run a test later tonight to see if I can do print(UnitAlive(CreateUnit(Player(0), 'hfoo', 0, 0, 0))) without any setup and see if it returns true.

It exists and returns true, provided you use FourCC to convert 'hfoo' into integer.
Meaning yes, at least this function is available.

LUA tests and benchmarks

Similar threads