• 🏆 Texturing Contest #33 is OPEN! Contestants must re-texture a SD unit model found in-game (Warcraft 3 Classic), recreating the unit into a peaceful NPC version. 🔗Click here to enter!
  • It's time for the first HD Modeling Contest of 2024. Join the theme discussion for Hive's HD Modeling Contest #6! Click here to post your idea!

[lua] garbage collection issues

Status
Not open for further replies.
Level 6
Joined
Jun 18, 2004
Messages
119
Hello guys,

With the advent of lua I thought things would become easier, as destruction of tables was no longer required. However, while playing with it I found that things a bit less so... I've been trying to make sense of the garbage collection system, but am really having trouble to do so.

If we create a table as such:
Lua:
mt = {}
function mt.__gc(self)
    print(self)
end
A = {}
setmetatable(A,mt)
it takes a little over 15(!) seconds for it to be removed from memory (evident by its print when __gc is called). To make matters worse, it seems to depend on how 'busy' lua is. The more it is currently doing, the less it is concerned with freeing memory. This can lead to long exit-times in heavily triggered maps.

It seems to me that we have 0 control over memory management and therefore that repeated dynamic allocation of tables is a big nono.

Please correct me if I am missing something, or if there are other ways to ensure good memory management!

-inf
 

~El

Level 17
Joined
Jun 13, 2016
Messages
558
This can lead to long exit-times in heavily triggered maps.

Might be talking out of my ass, but I think that issue is related to how many jass handles (units, stuff, whatever) you end up having in your map. At least there was some kind of corellation to that.

Blizz had to go through some hoops to ensure that the GC is synchronized though iirc, so that might explain the weird delays. Though, a table sticking around in memory for 15 seconds is hardly a concern - Lua's GC is relatively simple and it doesn't run all the time, and it doesn't have different generations of objects for GC purposes like e.g. Java.

Lua will recover garbage reliably, though. It's not like JASS where even strings leak permanently.

Still, you should avoid creating new tables when you have the option of reusing an existing one (e.g. for vector operations if you use a table to store it), it's a good practice even in pure Lua environments where you have full control over the GC, since each table has a pretty significant overhead (iirc ~40 bytes) and incurs at least one memory allocation (more if it keeps resizing), and memory allocations are a major bottleneck of any application. I wouldn't worry about it too much though unless you're literally creating thousands of tables every frame.
 
Level 6
Joined
Jun 18, 2004
Messages
119
The GC is supposed to run 100x per second, but the objects don't get to the garbage collection for some internal lua reason.

I am indeed using tables for vectors and noticed significant exitlag after running around 100 moving "Particles" for 30 seconds...

I'm not quite sure if there's a nice way of including 'recycling' of Vectors, but I'll do some testing!
 

~El

Level 17
Joined
Jun 13, 2016
Messages
558
The GC is supposed to run 100x per second, but the objects don't get to the garbage collection for some internal lua reason.

I am indeed using tables for vectors and noticed significant exitlag after running around 100 moving "Particles" for 30 seconds...

I'm not quite sure if there's a nice way of including 'recycling' of Vectors, but I'll do some testing!

Why do you think the GC is supposed to run 100x times per second? Even in heavyweight Java applications it only runs once in a few seconds, and huge "stop-the-world" collections can be as infrequent as every 30 seconds or every 5 minutes. Running the GC 100 times per second would be incredibly wasteful, considering the game logic doesn't even run that often.

Sophisticated GC systems have different tiers of storage for objects, and they can reclaim objects a very short time after they are created, but Lua doesn't have such a sophisticated GC due to its minimalism.

I've just tested creating 15mil worth of just tables, and it had minimal impact on exit times. I do know though that creating lots (in the order of hundres, though) of units can significantly impact exit times (might also be locations, specialeffects, etc.)

I would suggest doubly making sure this issue is indeed caused by tables, and not something else, and that it is an issue you want to be solving.

If you want to minimize creating garbage tables though, make sure you reuse your vector instances. If you have methods that return a result of an operation of two vectors, it might be worth to modify one of those vectors in-place instead, as an example. Having a single static table for simple, temporary operations available as a global can also be useful, so long as you don't store this table elsewhere in your code.

Or just forego vector objects entirely and just have functions that return tuples instead. E.g.
Code:
function addVectors(x1, y1, x2, y2)
    return x1 + x2, y1 + y2
end

This will avoid the allocation of new tables, though obviously might be less convenient to use. Depending on how much vector math you do this may or may not be worth it.

EDIT: What version of the game are you running? I couldn't reproduce this at all even with creating/removing 10000 units on 1.32.2. I remember this issue happening in 1.31 though.
 
Last edited:

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,202
The GC is supposed to run 100x per second, but the objects don't get to the garbage collection for some internal lua reason.
The Lua garbage collector does not run 100 times per second. Instead it will periodically perform incremental clean-up, specifically incremental marking. Once an incremental marking cycle is complete it will finish performing a garbage collection cycle. The rate this incremental garbage collector progresses is tuned to complete 1 full cycle every time the Lua heap size is double the size it was after the last full garbage collection cycle. More specifically the rate the garbage collector runs is based on the rate at which objects are allocate. Of course there are functions to explicitly advance the garbage collector, or even tune how frequently it runs.

Warcraft III does not use the standard Lua garbage collector. Since allocations may occur locally outside the synchronized game state the standard one, used by 1.31, was prone to causing desync as a garbage collection cycle might finish on one client before any others resulting in all gc method calls running out of sync from other clients causing a deviation of synchronized game state. In 1.31 the solution was to explicitly disable the standard Lua garbage collector and periodically call for a full garbage collection in response to a timer, which is synchronized between all clients, expiring. The idea behind this solution was so effective that it was integrated natively in some form in 1.32. The garbage collection functions are now restricted functions that users cannot call.

The exact details of how Blizzard implemented it in 1.32 are not known to me. In any case it is implemented such that full garbage collection cycles complete on the same game frame for all clients. This will happen fairly frequently, but should also be rare enough to minimize the costs associated with full garbage collection sweeps.

I'm not quite sure if there's a nice way of including 'recycling' of Vectors, but I'll do some testing!
The idea is that one allocates a heap of vectors. When a vector is requested it gets removed from this heap, only allocating new ones if the heap is empty. If all references to the vector are lost then the garbage collection method will re-append this vector to the heap for re-use at a later time. Unlike many languages, Lua allows dead object to be resurrected during garbage collection. If the heap exceeds some size, then the object is left to be removed to avoid usage spikes wasting excessive memory.

This should improve performance with respect to creation of vectors since the tables will always be the correct size internally. Next to that it is unlikely to offer any other benefits.
Still, you should avoid creating new tables when you have the option of reusing an existing one (e.g. for vector operations if you use a table to store it), it's a good practice even in pure Lua environments where you have full control over the GC, since each table has a pretty significant overhead (iirc ~40 bytes) and incurs at least one memory allocation (more if it keeps resizing), and memory allocations are a major bottleneck of any application. I wouldn't worry about it too much though unless you're literally creating thousands of tables every frame.
Due to the nature of virtual machine languages, allocation is very fast. All it does is append the object to the end of the heap, much like a stack frame.

The main reason to recycle objects is not allocation, but rather garbage collection performance. Every time a garbage collection cycle is performed, all live objects get copied sequentially to a fresh heap to avoid having to deal with memory fragmentation. In virtual machines like Java which have multiple heaps for different ages of objects any object that is being recycled will end up falling into one of the infrequently collected long term heaps, hence reducing its garbage collection cost. As far as I am aware Lua does not use multiple age related heaps, and as such there is no such advantage to be had.

With Lua where it can benefit from this is with reducing table expansion overhead. When a table is expanded a new backing array must be allocated, initialized and content copied from the old backing array which can be quite expensive. By re-using a table with a backing array of the appropriate size one can avoid this overhead which can significantly improve performance.
 
Level 6
Joined
Jun 18, 2004
Messages
119
Warcraft III does not use the standard Lua garbage collector. Since allocations may occur locally outside the synchronized game state the standard one, used by 1.31, was prone to causing desync as a garbage collection cycle might finish on one client before any others resulting in all gc method calls running out of sync from other clients causing a deviation of synchronized game state. In 1.31 the solution was to explicitly disable the standard Lua garbage collector and periodically call for a full garbage collection in response to a timer, which is synchronized between all clients, expiring. The idea behind this solution was so effective that it was integrated natively in some form in 1.32. The garbage collection functions are now restricted functions that users cannot call.

The exact details of how Blizzard implemented it in 1.32 are not known to me. In any case it is implemented such that full garbage collection cycles complete on the same game frame for all clients. This will happen fairly frequently, but should also be rare enough to minimize the costs associated with full garbage collection sweeps.
Exactly, the only thing I know from the implementation is that it is running 'synced' 100x per second. I don't know exactly *what* is running at those times though, as blizz apparently isn't allowed to disclose.



The idea is that one allocates a heap of vectors. When a vector is requested it gets removed from this heap, only allocating new ones if the heap is empty. If all references to the vector are lost then the garbage collection method will re-append this vector to the heap for re-use at a later time. Unlike many languages, Lua allows dead object to be resurrected during garbage collection. If the heap exceeds some size, then the object is left to be removed to avoid usage spikes wasting excessive memory.

This should improve performance with respect to creation of vectors since the tables will always be the correct size internally. Next to that it is unlikely to offer any other benefits.
What about the benefit of the memory not being lost? At the moment I must say that when I'm just running my 0.01 periodic timer and creating Vectors, they never ever get removed (at least, __gc is not called.), until it's too late and the map crashes (see attached map!). This is the code I'm using to test:
Lua:
MAIN_BEFORE = ("main::before")
MAIN_AFTER = ("main::after")
CONFIG_BEFORE = ("config::before")
CONFIG_AFTER = ("config::after")
hooks = dict {[MAIN_BEFORE] = list {}, [MAIN_AFTER] = list {}, [CONFIG_BEFORE] = list {}, [CONFIG_AFTER] = list {}}
oldMain = main
oldConfig = config
function newmain()
    for func in hooks[MAIN_BEFORE] do
        xpcall(function()
            func()
        end, function(Error)
            print(Error)
        end)
        ::loop_label_1::
    end
    oldMain()
    for func in hooks[MAIN_AFTER] do
        xpcall(function()
            func()
        end, function(Error)
            print(Error)
        end)
        ::loop_label_2::
    end
end
function newconfig()
    for func in hooks[CONFIG_BEFORE] do
        xpcall(function()
            func()
        end, function(Error)
            print(Error)
        end)
        ::loop_label_3::
    end
    oldConfig()
    for func in hooks[CONFIG_AFTER] do
        xpcall(function()
            func()
        end, function(Error)
            print(Error)
        end)
        ::loop_label_4::
    end
end
main = newmain
config = newconfig
function AddScriptHook(func, place)
    place = place or MAIN_AFTER
    if (operator_in(place, hooks)) then
        hooks[place]:append(func)
    end
end
-- Main map code
function range(from, to, step)
    assert(from ~= nil)
  
    if to == nil then
        to = from
        from = 0      
    end

    step = step or 1

    local i = from
  
    return function()
        ret = i
        if (step > 0 and i >= to) or (step < 0 and i <= to) then
            return nil
        end
      
        i = i + step
        return ret
    end
end
function period()
    for j in range(100) do
      
        setmetatable({},{__gc = function(self) print(self) end})      
      
        ::loop_label_5::
    end
end
i = 0
function counter()
    i = (i + 1)
    print(i)
end
function test()
    TimerStart(CreateTimer(), 0.01, true, period)
    TimerStart(CreateTimer(), 1, true, counter)
end
AddScriptHook(test, MAIN_AFTER)

Due to the nature of virtual machine languages, allocation is very fast. All it does is append the object to the end of the heap, much like a stack frame.

The main reason to recycle objects is not allocation, but rather garbage collection performance. Every time a garbage collection cycle is performed, all live objects get copied sequentially to a fresh heap to avoid having to deal with memory fragmentation. In virtual machines like Java which have multiple heaps for different ages of objects any object that is being recycled will end up falling into one of the infrequently collected long term heaps, hence reducing its garbage collection cost. As far as I am aware Lua does not use multiple age related heaps, and as such there is no such advantage to be had.

With Lua where it can benefit from this is with reducing table expansion overhead. When a table is expanded a new backing array must be allocated, initialized and content copied from the old backing array which can be quite expensive. By re-using a table with a backing array of the appropriate size one can avoid this overhead which can significantly improve performance.
Right, I'm not having any issue with allocation, everything runs smoothly until I am out of memory, because nothing ever gets removed, ever though 0 of the tables are referenced. It seems that at that point, the map crashes *because* of the garbage collection though.
 

Attachments

  • test.w3x
    149.1 KB · Views: 48
Last edited:

~El

Level 17
Joined
Jun 13, 2016
Messages
558
Good info from DSG as usual.

I must say though, the code you linked creates a _lot_ of garbage for each object instance. I'm not really sure if a simple Vector object that doesn't care about any inheritance or other nonsense really needs all that.

Here's what each constructor does:
Create 2 tables (object, nmt),
Creates a new string with the object's hash (hashid),
Creates 3 unique closures for the object!

I've been able to repro the issue on my end with the test map you've provided, which I find curious. After trying a few things to reduce the amount of tables/closures your map generates per new object, I've been able to get rid of the issue (I think so, at least?). I've also upped the generated vectors to 1000 per timer tick instead of 100.

Here's the relevant snippet I've changed:
Lua:
-- Lua classes
function class(class_init, name, bases, mtmethods, properties)
    bases = bases or {}

    local c = {}
    c.properties = {}
    c.attrs = {}
    for _, base in ipairs(bases) do
        for k, v in pairs(base.attrs) do
            c.attrs[k] = v
        end
        for k, v in pairs(base.properties) do
            c.properties[k] = v
        end
    end
    c._bases = bases
    c.attrs = class_init(c.attrs)
  
    for k,v in pairs(properties) do
        c.properties[k] = v
    end
    local mt = getmetatable(c) or {}
    local nmt = {}
  
    mt.__call = function(_, ...)
        local object = {}
        setmetatable(object, nmt)
        if type(object.__init__) == "function" then
            object:__init__(...)
        end
        return object
    end
    c.mtmethods = mtmethods
    mt.__type = c
    mt.__index = function(self,key)
        if callable(c.attrs[key]) then
            return _stripself(c.attrs[key])
        else
            return c.attrs[key]
        end
    end
    mt.__newindex = function(self,key,value)
        c.attrs[key] = value
    end
    mt.__tostring = function(self)
        return name   -- perhaps it is better if the type table is not the main object, but instead a separate table? the __tostring of the main object might be confusing.
    end
    setmetatable(c, mt)
    cmt = {}
    cmt.__call = function(...)
        return mt.__call(...)
    end
    -- cmt.__index = function(self,key)
    --     rg = rawget(self,key)
    --     if rawtype(rg) == "function" then
    --         return _stripself(rg)
    --     else
    --         return rg
    --     end
    -- end
    setmetatable(c.attrs,cmt)

    nmt.__type = c
    nmt.__tostring = function(self)
        return "dummy"
    end
    for k,v in pairs(c.mtmethods) do
        nmt[k] = c.attrs[v]
    end
    nmt.__index = function(tbl, idx)
        local attr = c.attrs[idx]
        if (c.properties[idx]) then
            return attr.gfunc(tbl)
        end
        return attr
    end
    nmt.__newindex = function(tbl, idx, new)
        local attr = c.attrs[idx]
        if (c.properties[idx]) then
            attr.sfunc(tbl,new)
        else
            rawset(tbl,idx,new)
        end
    end

    return c
end
[/lua]

I've moved the creation of the "nmt" object outside of the object constructor, since it doesn't at all depend on the actual object's state (all the closures only reference "c"). I've also removed the dynamic typeid thing because it would generate 1 new string per an object which might never go used, which can be quite a lot of memory for Lua to clean up later if there's enough of them.
 
Level 6
Joined
Jun 18, 2004
Messages
119
Cheers, thanks for catching that. I'll update it in my code.

I realize the class thing creates a lot of garbage. It's just for keeping all python functionality available (which is quite a bit of a pain).

I edited the previous post to use just lua tables though. Creating this
Lua:
setmetatable({},{__gc = function(self) print(self) end})
lots of times creates the same issues.

On the plus side, I made vectors to reuse 'recycleable' vectors with a first-in-first-out buffer with a minimum size of 100 so that all math can easily be done after which vectors are automatically overwritten. The permanent vectors have to be told to be permanent, though... I guess this is a workable solution.[/code]
 
  • Like
Reactions: ~El

~El

Level 17
Joined
Jun 13, 2016
Messages
558
Cheers, thanks for catching that. I'll update it in my code.

I realize the class thing creates a lot of garbage. It's just for keeping all python functionality available (which is quite a bit of a pain).

I edited the previous post to use just lua tables though. Creating this
Lua:
setmetatable({},{__gc = function(self) print(self) end})
lots of times creates the same issues.

On the plus side, I made vectors to reuse 'recycleable' vectors with a first-in-first-out buffer with a minimum size of 100 so that all math can easily be done after which vectors are automatically overwritten. The permanent vectors have to be told to be permanent, though... I guess this is a workable solution.[/code]

Keep in mind that printing is expensive and printing a lot is going to hang the game pretty badly by itself.
 
Status
Not open for further replies.
Top