
It's Over 300_000!

Level 13
Joined
Nov 7, 2014
Messages
571


There are probably a lot of hardcoded constants in WC3. The two Jass-related ones that are, in my opinion, the most dreaded are 8192 and 300_000.

300_000, as explained by PipeDream, is the maximum number of bytecode instructions an instance of the Jass VM can execute continuously before "giving up". I would imagine the limit exists to guard against infinite loops. The choice was between proving that a loop always terminates, which is a hard problem (halting-problem hard), and counting instructions against some magic constant. Of course Blizzard decided to keep things simple and came up with 300_000.


300_000 in action

JASS:
globals
    integer g_a = 0
endglobals

function inf_loop takes nothing returns nothing
    loop
        set g_a = g_a + 1
    endloop
endfunction

function show_value takes nothing returns nothing
    set g_a = 0

    call ExecuteFunc("inf_loop")
    // or
    // call ForForce(bj_FORCE_PLAYER[15], function inf_loop)

    call BJDebugMsg("g_a: " + I2S(g_a)) // 42_857
endfunction

Why 42_857? To answer that question we need to look at the bytecode instructions that the function inf_loop gets compiled to.

You can learn about Jass' bytecode instructions from Lord of theDing's JASM - Let's dive into bytecode thread. Like a lot of other WC3-related things, leandrotp was the one to figure out the meaning of all of the Jass VM's instructions.

inf_loop gets compiled to:
JASS:
    28000000 00000001
    0E010400 00001234
    13010000 00000000
    0C020400 00000001
    14030000 00000000
    20030302 00000000
    11030000 00001234
    2B000000 00000001
    28000000 00000002

Not very readable for humans, indeed; good for CPUs, I guess. I think it would be nice to make this sort of gobbledygook more human-friendly, so... there's this program (see attachments) that tries to do that. Its output is:
JASS:
        loop
//      L001:
            set g_a = (g_a + 1)
//          @01.i := g_a
//          push @01
//          @02.i := 1
//          @03 := pop
//          @03 := @03 + @02
//          g_a := @03
//          jmp L001
        endloop
//  L002:

Well, the hex values are now gone, but there is other weird stuff like L<decimal-digits>: and @<hex-digits>. The @ sign designates a Jass VM register and the LXXX: thingies are jump labels for the jump instructions. The := operator stands for assignment.

Lua:
L001: -- label L001
@01.i := g_a -- sets register @01's value to that of variable g_a, and its type (registers keep track of the type of value they store) to the type of g_a (integer)
push @01 -- pushes register @01 to the stack
@02.i := 1 -- sets register @02's value to 1, and its type to integer
@03 := pop -- pops the stack and sets register @03 to that value
@03 := @03 + @02 -- adds the registers @03 and @02 and stores the result back in register @03
g_a := @03 -- sets the value of variable g_a to register @03's value
jmp L001 -- jumps to the first instruction after the label L001: (@01.i := g_a)
L002: -- label L002

We have 9 instructions; jump labels are counted as instructions, i.e. they add 1 to the number of executed instructions. Label L002: is never reached, and label L001: is executed only once, because jmp L001 lands on the first instruction after the label. The loop's body has 7 instructions.
So 1 + 7*42_857 = 300_000, this is where 42_857 comes from.
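
The count above can be reproduced outside the game with a small simulation. This is only a sketch in Python (chosen because it runs without WC3), assuming exactly what the post states: a budget of 300_000 instructions, one instruction for the L001: label, and a 7-instruction loop body. The names `simulate`, `BODY` and `LIMIT` are made up for this illustration.

```python
LIMIT = 300_000  # instruction budget discussed in this thread

# The 7 instructions of the loop body, in execution order (the L001: label is not part of it).
BODY = [
    "@01.i := g_a",
    "push @01",
    "@02.i := 1",
    "@03 := pop",
    "@03 := @03 + @02",
    "g_a := @03",
    "jmp L001",
]


def simulate(body, limit=LIMIT):
    """Return how many times each instruction in `body` runs before the budget is spent."""
    counts = [0] * len(body)
    used = 1  # the L001: label is counted once, on loop entry
    pc = 0
    while used < limit:
        counts[pc] += 1
        used += 1
        pc = (pc + 1) % len(body)  # jmp L001 wraps around to the first body instruction
    return dict(zip(body, counts))


counts = simulate(BODY)
print(counts["g_a := @03"])  # 42857
```

Every body instruction runs the same number of times here because 299_999 happens to be divisible by 7.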


What if we had this instead:
JASS:
globals
    integer g_a = 0
    integer g_b = 0
endglobals

function inf_loop takes nothing returns nothing
    loop
        set g_a = g_a + 1
        set g_b = g_b + 1
    endloop
endfunction

function show_values takes nothing returns nothing
    set g_a = 0
    set g_b = 0

    call ExecuteFunc("inf_loop")

    call BJDebugMsg("g_a: " + I2S(g_a)) // 23_077
    call BJDebugMsg("g_b: " + I2S(g_b)) // 23_076
endfunction

g_a = 23_077 but g_b = 23_076, how come? Again let's look at the instructions and do the counting:

JASS:
        loop
//      L001:
            set g_a = (g_a + 1)
//          @01.i := g_a
//          push @01
//          @02.i := 1
//          @03 := pop
//          @03 := @03 + @02
//          g_a := @03
            set g_b = (g_b + 1)
//          @04.i := g_b
//          push @04
//          @05.i := 1
//          @06 := pop
//          @06 := @06 + @05
//          g_b := @06
//          jmp L001
        endloop
//  L002:

The loop's body has 13 instructions.
1 + 13*23_076 = 299_989
In the 23_077th iteration of the loop only (300_000 - 299_989 = 11) instructions execute, but g_b := @06 is the 12th instruction, i.e. it does not execute. That's why g_b is 1 less than g_a.
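
The same arithmetic can be written in closed form. This is a minimal sketch in Python, assuming the model used above (one instruction for the loop label, then the body repeated until the 300_000 budget runs out); `executions` is a hypothetical helper name, and positions are 1-based within the loop body.

```python
def executions(position, body_len, limit=300_000):
    """How many times the instruction at 1-based `position` in the loop body runs."""
    # One instruction is spent on the loop label; the rest is spent on the body.
    full_iterations, leftover = divmod(limit - 1, body_len)
    # The last, partial iteration only reaches the first `leftover` instructions.
    return full_iterations + (1 if position <= leftover else 0)


print(executions(6, 13))   # g_a := @03 is the 6th instruction  -> 23077
print(executions(12, 13))  # g_b := @06 is the 12th instruction -> 23076
print(executions(6, 7))    # the single-variable loop from before -> 42857
```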

The point of doing this kind of instruction counting is of course to avoid the 300_000 limit, or to get as close as possible to it before using an "instruction reset" (ExecuteFunc, ForForce, TriggerEvaluate, TriggerExecute), i.e. to minimize the number of "instruction resets" necessary for completing a certain task.
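
As a rough planning aid, the number of separate runs needed for a task can be estimated from the per-iteration instruction count. The sketch below is in Python and rests on assumptions that are not from the post: the cost per iteration is constant and already known from counting the bytecode, and each run pays a fixed `overhead` (1 here, for the loop label). A real bounded loop also pays for its exit check, which has to be included in `cost_per_iteration`. The name `plan` is hypothetical.

```python
def plan(total_iterations, cost_per_iteration, overhead=1, limit=300_000):
    """Return (iterations that fit into one run, number of runs needed)."""
    per_run = (limit - overhead) // cost_per_iteration
    if per_run == 0:
        raise ValueError("a single iteration does not fit into the budget")
    runs = -(-total_iterations // per_run)  # ceiling division without floats
    return per_run, runs


per_run, runs = plan(1_000_000, 13)
print(per_run, runs)  # 23076 44
```

The first run needs no reset, so 44 runs correspond to 43 "instruction resets" under these assumptions.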


Some notes about Blizzard's generated bytecode.

The binary operators and array assignment use push and pop instructions.
Lua:
// 1 + 2
@01.i := 1
push @01
@02.i := 2
@03 := pop
@03 := @03 + @02

// set a[1] = 2
@01.i := 1
push @01
@02.i := 2
@03 := pop
a[@03] := @02

The reason for this, I think, is that Blizzard uses a kind of "cyclic register allocation": after computing the lhs (left-hand side) of the operator, they push it to the stack in order to avoid stomping on the value of the register when computing the rhs (right-hand side). If they had done this instead:
Lua:
// 1 + 2
@01.i := 1 -- compute lhs, store the result in register @01
@02.i := 2 -- compute rhs, store the result in register @02
@02 := @01 + @02 -- compute lhs + rhs, store the result in @02
The problem with that is that register @01 could get stomped while computing the rhs, because the rhs can be an arbitrary expression (any number of function calls). I think they could've used this simple optimization if they inspected the rhs and figured out it wasn't going to stomp on the previous register. I think this is safe to do when the rhs is a literal, for example.
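
To make the trade-off concrete, here is a toy code generator in Python that emits the listings above for nested + expressions. It is only a sketch and not Blizzard's compiler: expressions are integers or `("+", lhs, rhs)` tuples, registers are simply numbered upwards from @01, and the `optimize` flag applies the literal-rhs shortcut suggested above. All names are made up for this illustration.

```python
from itertools import count


def compile_expr(expr, out, regs, optimize=False):
    """Append pseudo-instructions for `expr` to `out`; return the result register."""
    if isinstance(expr, int):
        reg = next(regs)
        out.append(f"@{reg:02X}.i := {expr}")
        return reg
    op, lhs, rhs = expr
    left = compile_expr(lhs, out, regs, optimize)
    if optimize and isinstance(rhs, int):
        # A literal rhs cannot stomp on the lhs register, so no push/pop is needed.
        right = compile_expr(rhs, out, regs, optimize)
        out.append(f"@{right:02X} := @{left:02X} {op} @{right:02X}")
        return right
    out.append(f"push @{left:02X}")  # save the lhs before evaluating the rhs
    right = compile_expr(rhs, out, regs, optimize)
    dest = next(regs)
    out.append(f"@{dest:02X} := pop")
    out.append(f"@{dest:02X} := @{dest:02X} {op} @{right:02X}")
    return dest


def compile_to_list(expr, optimize=False):
    out = []
    compile_expr(expr, out, count(1), optimize)
    return out


print(compile_to_list(("+", 1, 2)))                 # 5 instructions, with push/pop
print(compile_to_list(("+", 1, 2), optimize=True))  # 3 instructions
```

For `("+", ("+", 1, 2), 3)` the sketch emits 9 instructions without the shortcut and 5 with it, which is the kind of saving that matters when counting towards 300_000.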

The binary 'and' and 'or' operators have short-circuit evaluation, i.e. they are really control structures ('or' has higher precedence than 'and' in Jass):
JASS:
// true and false
    @01.b := true
    jmp L001 if @01 = false
    @02.b := false
    jmp L002
L001:
    @02.b := false
L002:

// true or false
    @01.b := true
    jmp L001 if @01 = true
    @02.b := false
    jmp L002
L001:
    @02.b := true
L002:
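
Read as control flow, the two listings above behave like the following Python sketch. It only mirrors the jumps; `jass_and` and `jass_or` are hypothetical names, and the rhs is passed as a zero-argument callable so that it is visibly evaluated only on the path that falls through the conditional jump.

```python
def jass_and(lhs, rhs):
    """Mirror of the 'and' listing; `rhs` is a zero-argument callable."""
    r1 = lhs             # @01.b := lhs
    if r1 is False:      # jmp L001 if @01 = false
        r2 = False       # L001: @02.b := false
    else:
        r2 = rhs()       # @02.b := rhs, then jmp L002
    return r2            # L002:


def jass_or(lhs, rhs):
    """Mirror of the 'or' listing; `rhs` is a zero-argument callable."""
    r1 = lhs             # @01.b := lhs
    if r1 is True:       # jmp L001 if @01 = true
        r2 = True        # L001: @02.b := true
    else:
        r2 = rhs()       # @02.b := rhs, then jmp L002
    return r2            # L002:


# The rhs would raise if it were evaluated; it never is on these paths.
print(jass_and(False, lambda: 1 // 0))  # False
print(jass_or(True, lambda: 1 // 0))    # True
```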

Negative literals generate a unary - instruction.
JASS:
        local integer i = -1
//      local i integer
//      @01.i := 1
//      @01 := -@01
//      i := @01

// could've simply emitted a negated literal
@01.i := -1 (0xFFFF_FFFF)

Global blocks are compiled to a function called <init> that gets called "at some point".
JASS:
    globals
//  fn <init>
        integer global_variable = 0
//      global global_variable integer
//      @01.i := 0
//      global_variable := @01
    endglobals
//  endfn

endglobals/endfunction emit an instruction =)...

The empty loop compiles to 0 instructions:
JASS:
loop
endloop
call BJDebugMsg("this is not unreachable")

At some point (after patch 1.24b?) Blizzard started to emit a (0C000000 00000000, i.e. @00 := 0) instruction as part of their plan to fix the "return bug"/type casting.
JASS:
    function DoNothing takes nothing returns nothing
//  fn DoNothing
//      @00 := 0 <-- only after patch 1.24b?
//      ret
    endfunction
//  endfn


Differences between Blizzard's generated bytecode and the program's.

I tried to mimic Blizzard's generated bytecode instructions (modulo binary vs text) as much as possible. There are some differences though. For empty if/elseif/loop blocks, Blizzard doesn't generate jump instructions. And because the program only parses the files and doesn't do type checking, it doesn't emit an i2r instruction (OP_I2R=0x17):
JASS:
local real r = 1.0 + 2

05050000 00000254
0C4C0500 3F800000
134C0000 00000000
0C4D0400 00000002
174D0000 00000000 <-- @4D := i2r @4D
144E0000 00000000
204E4E4D 00000000
114E0000 00000254

local r real
@4C.r := 1.0
push @4C
@4D.i := 2
@4D := i2r @4D <-- you won't see this
@4E := pop
@4E := @4E + @4D
r := @4E

The i2r instructions are generated when integers are "mixed" with reals. So don't mix them? =)
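
The hex dumps and the readable listings in this post line up one to one, which is enough to sketch a tiny decoder. The Python below is an illustration built on guesses: apart from OP_I2R=0x17, every opcode value, the byte layout (opcode, then three one-byte fields, then a 32-bit argument) and the type codes 04/05 are inferred purely by comparing the listings above, not taken from any official description. Variable names are not part of these instruction words, so variables are shown as `var_<id>`.

```python
import struct

# Type codes as they appear in the listings above (inferred, not exhaustive).
TYPES = {0x04: ("i", "integer"), 0x05: ("r", "real")}


def decode(line):
    """Turn one 'XXXXXXXX XXXXXXXX' instruction into the readable notation."""
    head, arg = (int(word, 16) for word in line.split())
    op, a, b, c = head >> 24, (head >> 16) & 0xFF, (head >> 8) & 0xFF, head & 0xFF
    if op == 0x05:
        return f"local var_{arg:04X} {TYPES[a][1]}"
    if op == 0x0C:
        # Real literals are stored as IEEE 754 single-precision bit patterns.
        value = struct.unpack(">f", arg.to_bytes(4, "big"))[0] if b == 0x05 else arg
        return f"@{a:02X}.{TYPES[b][0]} := {value}"
    if op == 0x0E:
        return f"@{a:02X}.{TYPES[b][0]} := var_{arg:04X}"
    if op == 0x11:
        return f"var_{arg:04X} := @{a:02X}"
    if op == 0x13:
        return f"push @{a:02X}"
    if op == 0x14:
        return f"@{a:02X} := pop"
    if op == 0x17:
        return f"@{a:02X} := i2r @{a:02X}"
    if op == 0x20:
        return f"@{a:02X} := @{b:02X} + @{c:02X}"
    if op == 0x28:
        return f"L{arg:03d}:"
    if op == 0x2B:
        return f"jmp L{arg:03d}"
    return f"?? {line}"  # opcode not covered by the listings in this post


for word in ["0C4C0500 3F800000", "0C4D0400 00000002", "174D0000 00000000"]:
    print(decode(word))
```

Run on the `local real r = 1.0 + 2` dump, this shows the extra i2r line that costs one instruction each time an integer is mixed into a real expression.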


The program expects a list of files. The first n - 1 files are only parsed for global variable/native/function declarations; only for the last file do you get an output file, with the same name but with the '.bc-txt' suffix.
 

Attachments

  • jic.zip
    134 KB · Views: 67
Level 2
Joined
Nov 3, 2017
Messages
25
All control words are constantly looked up on each call, but there is nothing we can do about it. The game searches for them and returns the address to call; that's all there is to it.
Tbh there is no such thing as an instruction reset; it's rather a brand new launch of another pipe of the VM, with a 300k limit as well. Sadly there is no way to reset or modify the limit except for MH.
 
Level 3
Joined
May 19, 2010
Messages
35
Very nice. Thanks for the push/pop explanation, it did look odd to me.

How does stuff happen in the end? Or in other words, how do natives work? I would imagine they'd need some identifier (string, integer, ...) and not a jump.
There is a bytecode instruction to call a native function. Every native in the game has a unique numerical ID that is used with callnative. Parameters are pushed on the stack before the callnative instruction, and the return value is in register R0.
 
Level 5
Joined
Jun 16, 2004
Messages
108
Pretty cool tool. I ended up finding a bug and a case of non-conforming behavior in my own tool thanks to playing around with it and looking at the game again. Well, I probably still have a lot of such cases to fix. Anyway, thanks, and nice job.
 