
Narrowing Down Desync Cause

Status
Not open for further replies.
Level 4
Joined
Mar 8, 2006
Messages
27
I finally narrowed down my desync to a fairly small bit of code. I can't figure out why this would be desyncing though, so I was hoping to get the eyes of some people smarter than me on it and see if anyone could spot something obvious I didn't see.

I'm 99% sure it's caused by the really simple projectile system I built. When I tighten the projectile loop, it happens faster. I can repro it 100% of the time on LAN with 5 wc3 instances open. With the loop interval at 0.00390625 sec and me spamming the projectile roughly every 0.5 sec, it happens at around 30 sec nearly every time. Slowing the loop down (a longer interval) increases the amount of time it takes.

Here's the lua for the projectile module that causes a desync:

Code:
local vector = require('src/vector.lua')

local projectiles = {}
local LOOP_SPEED = 0.00390625

local isCloseTo = function(val, expected, range)
    return val + range >= expected and val - range <= expected
end

local clearProjectiles = function()
    local elapsedTime = LOOP_SPEED

    for idx, projectile in pairs(projectiles) do
        local curProjectileX = GetUnitX(projectile.unit)
        local curProjectileY = GetUnitY(projectile.unit)

        if isCloseTo(curProjectileX, projectile.options.toV.x, 15) and
            isCloseTo(curProjectileY, projectile.options.toV.y, 15)
        then
            -- Already at destination, can finish
            projectile.toRemove = true
        else
            -- Linear projectile
            local totalDistX = projectile.options.toV.x - curProjectileX
            local totalDistY = projectile.options.toV.y - curProjectileY

            local distVector = vector.create(totalDistX, totalDistY)
            local v1 = vector.normalize(distVector)
            v1 = vector.multiply(v1, projectile.options.speed * elapsedTime)

            if vector.magnitude(v1) >= vector.magnitude(distVector) then
                v1 = distVector
            end

            v1 = vector.add(
                v1,
                vector.create(curProjectileX, curProjectileY))

            SetUnitPosition(projectile.unit, v1.x, v1.y)
        end
    end
    for idx, projectile in pairs(projectiles) do
        if projectile.toRemove then
            projectiles[idx] = nil
        end
    end
end

local init = function()
    local trig = CreateTrigger()
    TriggerRegisterTimerEvent(trig, LOOP_SPEED, true)
    TriggerAddAction(trig, clearProjectiles)
end

local createProjectile = function(options)
    if options.length ~= nil then
        local fromVector = vector.create(options.fromV.x, options.fromV.y)
        local toVector = vector.create(options.toV.x, options.toV.y)
        local totalVector = vector.subtract(toVector, fromVector)
        totalVector = vector.normalize(totalVector)
        totalVector = vector.multiply(totalVector, options.length)
        totalVector = vector.add(fromVector, totalVector)

        options.toV = totalVector
    end

    local proj = {
        unit = options.projectile,
        options = options,
        alreadyCollided = {},
    }
    table.insert(projectiles, proj)

    return proj
end

return {
    init = init,
    createProjectile = createProjectile,
}
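
For readers without the `vector` module at hand, the per-tick movement in `clearProjectiles` can be sketched in plain Lua. This is a minimal sketch, not the module's code: `stepToward` is a hypothetical helper, and it assumes `vector.normalize`, `vector.multiply` and `vector.magnitude` perform the usual 2D operations.

```lua
-- Hypothetical helper mirroring the linear-projectile branch above.
-- Moves (x, y) toward (tx, ty) by speed * dt, without overshooting.
local function stepToward(x, y, tx, ty, speed, dt)
    local dx, dy = tx - x, ty - y
    local dist = math.sqrt(dx * dx + dy * dy)
    local step = speed * dt

    -- Remaining distance is zero or no longer than one step: land on the target.
    if dist == 0 or step >= dist then
        return tx, ty
    end

    -- Otherwise advance along the unit direction vector.
    return x + dx / dist * step, y + dy / dist * step
end

-- One tick at the loop interval from the post: 800 * 0.00390625 = 3.125 units.
local nx, ny = stepToward(0, 0, 250, 0, 800, 0.00390625)
print(nx, ny)
```

At this interval a 250-unit projectile at speed 800 needs 80 ticks to arrive, which is why the loop runs so often during the repro.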


It's used like this:

Code:
    local hero = gg_unit_etst_0117
    local heroV = vector.create(GetUnitX(hero), GetUnitY(hero))
    local mouseV = vector.create(GetSpellTargetX(), GetSpellTargetY())

    projectile.createProjectile{
        playerId = playerId,
        projectile = hero,
        fromV = heroV,
        toV = mouseV,
        speed = 800,
        length = 250,
        radius = 75,
    }

The fact that tightening the loop causes the desync to occur faster is concerning to me. My instincts would guess that some of the instances aren't able to keep up and maintain the fast loop speed, and eventually fall behind and desync, but to my understanding that shouldn't be how wc3 works.

I've also uploaded a version of the map with a really simple repro. Just spam Q on red and it'll desync the other players in about 30 sec. At least on my machine it does.
 

Attachments

  • map.w3x
    440.6 KB · Views: 10
Level 12
Joined
Feb 22, 2010
Messages
1,115
I think your loop speed is unnecessarily fast; it could be 10 times slower and still be smooth.

The fact that tightening the loop causes the desync to occur faster is concerning to me. My instincts would guess that some of the instances aren't able to keep up and maintain the fast loop speed, and eventually fall behind and desync, but to my understanding that shouldn't be how wc3 works.

I had a similar thought when a friend of mine was also having desync problems on his map. The only things that had changed since the map was working properly were that the map got larger and the w3 version was updated. After some debugging we figured out that the desync happens when a large number of units die at the same time due to a special event (this was not a problem in prior versions, and we confirmed this was the cause with multiple tests); the cause was the large number of units dying and the triggers fired by those deaths. Instead of killing all of them at the same time, we moved them to a corner and killed them over a short period of time, and the problem was gone.
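
Spreading the work over a short period, as described above, could look roughly like the following. This is a sketch under assumptions, not the code from that map: `makeBatchProcessor` is a hypothetical helper, and the batch size is arbitrary.

```lua
-- Hypothetical helper: returns a function that handles at most `batchSize`
-- queued items per call and reports whether anything is left.
local function makeBatchProcessor(items, batchSize, action)
    local nextIndex = 1
    return function()
        local last = math.min(nextIndex + batchSize - 1, #items)
        for i = nextIndex, last do
            action(items[i])
        end
        nextIndex = last + 1
        return nextIndex <= #items -- true while work remains
    end
end

-- Plain-Lua demonstration; in a map, `action` could call KillUnit and the
-- returned function would be run from a periodic timer instead of a loop.
local handled = {}
local tick = makeBatchProcessor({ "a", "b", "c", "d", "e" }, 2, function(item)
    handled[#handled + 1] = item
end)

while tick() do end
print(#handled)
```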

Another frustrating thing is that up until version 1.26 (or 1.30), when a desync happened you would get a "disconnected" screen. When my friend told me about the problem and I checked it out, I couldn't tell at first that it was a desync, because all I saw was an instant score screen, so I suspected some kind of freezing problem first. Then I read some stuff here and learned that this is how desyncs happen these days. I honestly don't know whose brilliant idea it was to REMOVE THE DISCONNECTED MESSAGE WHEN YOU DISCONNECT and then send players instantly to the score screen, or why these problems started to happen in a game that ran perfectly fine for ~16 years. So I would say our understanding of wc3 is an understanding of old wc3.

It is also possible that there was a real, legit reason for the desync in the map and I am too much of an idiot to see it, but all my experience with w3 and the recent stuff I've seen tell me otherwise.
 
Level 4
Joined
Mar 8, 2006
Messages
27
I think your loop speed is unnecessarily fast; it could be 10 times slower and still be smooth.

Yeah, I just increased it to reproduce the desync. If I lower it, there's still a desync in theory, but it's just less likely to occur and so harder to reproduce.

I honestly don't know whose brilliant idea it was to REMOVE THE DISCONNECTED MESSAGE WHEN YOU DISCONNECT and then send players instantly to the score screen

They'll probably fix it for Reforged. On rare occasions I actually do get the disconnected message first, which is odd.



For now, I'm going to try to rewrite this snippet in JASS, since it's not much code, and see if that still desyncs.
 
Level 4
Joined
Mar 8, 2006
Messages
27
I rewrote it in JASS and the desync is gone. So I guess something is wrong with lua :(

Next step is to transpile the JASS version to lua and try it again. Maybe I'm using some lua features that aren't supported yet or don't work quite right yet.


UPDATE: I manually "transpiled" it to lua again from the JASS version and no desync again. So I guess I'm doing something wrong in the original lua somehow. Will continue to investigate.

UPDATE 2: Seems by default wc3 will use something called `__jarray(0.0)` when you create a global array of reals. If I change that to use `{}` the desync comes back. So I guess something is wrong with `{}` for whatever reason in lua. I don't really understand why that would matter, since `__jarray(0.0)` just makes a table anyway...
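
The remark that `__jarray(0.0)` "just makes a table" can be illustrated with a sketch. The helper below, `makeDefaultArray`, is hypothetical and only approximates what such a function might do (a table whose missing entries read as a default value); the actual `__jarray` implementation is not shown in this thread.

```lua
-- Hypothetical stand-in for a default-valued array helper.
-- Reads of missing keys fall through to __index and return `default`.
local function makeDefaultArray(default)
    return setmetatable({}, {
        __index = function() return default end,
    })
end

local reals = makeDefaultArray(0.0)
local plain = {}

-- A missing entry reads as the default in `reals` but as nil in `plain`.
print(reals[5], plain[5])

-- Writes store real entries; untouched slots still hold nothing underneath.
reals[5] = 1.5
print(reals[5], rawget(reals, 6))
```

Under this assumption the two kinds of table differ mainly in what a read of an unset index returns, so code written against `{}` may hit `nil` where the default-valued version would see `0.0`.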

Maybe it's related to the sizing of the lua tables. From what I understand whenever you hit the size limit it doubles it internally. So maybe `__jarray()` starts it with a large size already or something...

UPDATE 3: If anyone comes here from google or search or whatever, I found out it was caused by lua garbage collection. You can turn it off to verify, and then manually call it on a timer to sync it for each player and that fixes the issue!
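
A minimal sketch of that workaround, for anyone who wants a starting point. `CreateTimer` and `TimerStart` are Warcraft 3 natives and `collectgarbage` is standard Lua; the function name `startSyncedGarbageCollection` and the 30-second `GC_INTERVAL` are hypothetical choices, not values from the original map.

```lua
-- Sketch only: GC_INTERVAL and startSyncedGarbageCollection are
-- hypothetical names; the interval is an arbitrary choice.
local GC_INTERVAL = 30.0 -- seconds of game time

local function startSyncedGarbageCollection()
    -- Stop the automatic collector.
    collectgarbage("stop")

    -- Run a full collection from a periodic game timer instead.
    TimerStart(CreateTimer(), GC_INTERVAL, true, function()
        collectgarbage("collect")
    end)
end
```

Calling `startSyncedGarbageCollection()` once during map initialization would be enough. Leaving out the timer and keeping only `collectgarbage("stop")` is the quick way to verify that garbage collection is the cause, at the price of memory growing for the rest of the game.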
 