
JASM - Let's dive into bytecode

Discussion in 'The Lab' started by Lord of theDing, Apr 26, 2017.

  1. Lord of theDing

    Now that we have nearly unlimited possibilities thanks to memhack, it's time to delve a bit deeper.
    You may have seen this thread by @leandrotp about compiling directly to bytecode.
    I'm really into that idea, because it allows for greatly increased performance and unlocks new features (like allocating memory).

    The Warcraft bytecode (I call it jasm, jass+asm) is not exactly complex or difficult; it's just that there is pretty much NO documentation whatsoever. That makes sense, since it's not an official feature of Jass.

    To get some insight I have written an assembler/disassembler for jasm. The tool can convert between readable jasm and executable bytecode. I also ported the disassembler code to jass itself, so you can now look at the bytecode of your functions without getting completely mindfucked.

    The code to call bytecode from jass and a documented example are in the attached map.
    You can find an overview of the jasm instructions further down in this post.

    Code (vJASS):

    scope CustomBytecode initializer init
        /*
       
        This example shows how to create a footman in bytecode
       
         To run custom bytecode two textmacros can be used. Both are defined in JasmExecuteUtils.
       
         The first textmacro sets up all necessary functions and globals to run custom bytecode.
         It creates 2 functions for us. The names of these functions depend on the parameter passed to the textmacro.
       
            function ExecuteBytecodeExample takes nothing returns nothing
            function GetBytecodeExampleAsCode takes nothing returns code
       
        ExecuteBytecodeExample() runs the bytecode.
        GetBytecodeExampleAsCode() returns the bytecode as a code variable that can be used with native functions like ForGroup
       
        This textmacro also defines an array l__BytecodeExample. That array contains the bytecode that should be executed.
        This array can only be used between the call to JasmSetupGlobals and JasmSetupExec
       
        */

        //! runtextmacro JasmSetupGlobals("BytecodeExample")
       
        globals
            private integer ID = 'hfoo'
            unit created_footman = null
        endglobals
       
        // DO NOT TOUCH THIS FUNCTION!
        // its generated bytecode is used to get the internal ids of
        // the global variables "ID" and "created_footman"
        private function GlobalTable takes nothing returns nothing
            set ID = 0
            set created_footman = null
        endfunction
       
        //! novjass
            We want to create a unit and save it to created_footman in bytecode. It should be the equivalent of
           
            set created_footman = CreateUnit(Player(0), ID, 0.0, 0.0, 270.0)
       
            First we need to get a handle to player 0, the call to Player(0)
           
            literal R0 int 0        // Put literal 0 into register R0
            push R0                 // Push that register on the stack
            callnative &Player      // Call the native function Player()
           
            // The result of the call to Player is in R0
           
            // Because the first parameter of CreateUnit is the player, we directly push R0 on the stack.
            // the stack now contains [player0]
            push R0                
           
            getvar R0 int &ID       // get the global variable ID and put its value into R0. (The variable id of ID is set in l__BytecodeExample[9])
            push R0                 // push the id on the stack. the stack is now [player0, 'hfoo']
           
           
            literal R0 real 0.0     // put the real 0.0 into R0
            push R0                 // push it twice on the stack for the x and y parameter of CreateUnit
            push R0                 // stack is now [player0, 'hfoo', 0.0, 0.0]
           
            literal R0 real 270.0   // put the real 270.0 into R0 and push it onto the stack
            push R0                 // stack is now [player0, 'hfoo', 0.0, 0.0, 270.0]
           
           
            callnative &CreateUnit  // Call CreateUnit. This native takes its 5 parameters from the stack and puts its return value into R0
           
            // stack is now []
           
            setvar R0 &created_footman  // Write R0 to created_footman
           
            ret                     // Return from this function. Will crash Warcraft if this is missing.
        //! endnovjass
       
        private function InitJasmArray_l__BytecodeExample takes nothing returns nothing
            set l__BytecodeExample[   0] = 0x0c000400 // literal      R0     int    -      0    
            set l__BytecodeExample[   1] = 0x00000000
            set l__BytecodeExample[   2] = 0x13000000 // push         R0     -      -      -    
            set l__BytecodeExample[   3] = 0x00000000
            set l__BytecodeExample[   4] = 0x15000000 // callnative   -      -      -      &Player
            set l__BytecodeExample[   5] = 0x00000538
            set l__BytecodeExample[   6] = 0x13000000 // push         R0     -      -      -    
            set l__BytecodeExample[   7] = 0x00000000
            set l__BytecodeExample[   8] = 0x0e000400 // getvar       R0     int    -      &ID
            set l__BytecodeExample[   9] = GetGlobalIdCode(function GlobalTable, 0) // get the first variable that is used in the function GlobalTable ("ID")
            set l__BytecodeExample[  10] = 0x13000000 // push         R0     -      -      -    
            set l__BytecodeExample[  11] = 0x00000000
            set l__BytecodeExample[  12] = 0x0c000500 // literal      R0     real   -      0    
            set l__BytecodeExample[  13] = 0x00000000
            set l__BytecodeExample[  14] = 0x13000000 // push         R0     -      -      -    
            set l__BytecodeExample[  15] = 0x00000000
            set l__BytecodeExample[  16] = 0x13000000 // push         R0     -      -      -    
            set l__BytecodeExample[  17] = 0x00000000
            set l__BytecodeExample[  18] = 0x0c000500 // literal      R0     real   -      270  
            set l__BytecodeExample[  19] = 0x43870000
            set l__BytecodeExample[  20] = 0x13000000 // push         R0     -      -      -    
            set l__BytecodeExample[  21] = 0x00000000
            set l__BytecodeExample[  22] = 0x15000000 // callnative   -      -      -      &CreateUnit
            set l__BytecodeExample[  23] = 0x00000415
            set l__BytecodeExample[  24] = 0x11000000 // setvar       R0     -      -      &created_footman
            set l__BytecodeExample[  25] = GetGlobalIdCode(function GlobalTable, 1)
            set l__BytecodeExample[  26] = 0x27000000 // ret          -      -      -      -    
            set l__BytecodeExample[  27] = 0x00000000
        endfunction
       
        /*
            The second textmacro finalizes the generation of the bytecode.
            It also creates one function:
           
                function JasmInitBytecodeExample takes nothing returns nothing
               
            This function must be called before ExecuteBytecodeExample() or GetBytecodeExampleAsCode() can be used
        */

        //! runtextmacro JasmSetupExec("BytecodeExample")
       
        private function init takes nothing returns nothing
            local trigger trg = CreateTrigger()
           
            // fill array with bytecode
            call InitJasmArray_l__BytecodeExample()
           
            // Setup necessary variables to call bytecode
            call JasmInitBytecodeExample()
           
            // when player(0) presses escape
            call TriggerRegisterPlayerEventEndCinematic(trg, Player(0))
           
            // then execute the bytecode
            call TriggerAddAction(trg, GetBytecodeExampleAsCode())
        endfunction
       
    endscope
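
    The hex words in InitJasmArray_l__BytecodeExample can be reproduced by hand. Below is a minimal Python sketch, assuming the packing that the listing itself suggests: opcode in the top byte, then Arg1 and Arg2, with the type ids int = 4 and real = 5 read off the two literal lines. The helper names encode and real_bits are made up for this illustration and are not part of jasm.exe.

```python
import struct

# Opcode ids taken from the array listing above.
LITERAL, GETVAR, SETVAR, PUSH, CALLNATIVE, RET = 0x0C, 0x0E, 0x11, 0x13, 0x15, 0x27
# Type ids as they appear in the listing (assumption: 4 = int, 5 = real).
TYPE_INT, TYPE_REAL = 4, 5


def encode(op, arg1=0, arg2=0, arg3=0, data=0):
    """Return the two 32-bit array slots for one instruction."""
    head = (op << 24) | (arg1 << 16) | (arg2 << 8) | arg3
    return head, data & 0xFFFFFFFF


def real_bits(value):
    """Bit pattern of a 32-bit float, as stored in the data word."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


program = [
    encode(LITERAL, 0, TYPE_INT, data=0),                  # literal R0 int 0
    encode(PUSH, 0),                                       # push R0
    encode(CALLNATIVE, data=0x538),                        # callnative &Player
    encode(LITERAL, 0, TYPE_REAL, data=real_bits(270.0)),  # literal R0 real 270.0
    encode(RET),                                           # ret
]

for head, data in program:
    print(f"0x{head:08x} 0x{data:08x}")
```

    The fourth line printed is 0x0c000500 0x43870000, which matches slots 18 and 19 of the array: the data word of a real literal is simply the IEEE 754 single-precision bit pattern.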
     

    To use my tool either call jasm.exe -h on the command line or drag and drop a file on the included *.bat files. My map dumps the bytecode of the last printed function into a file "CustomMapData/jasm_dump.txt". This file can be converted to jasm by dragging it onto "preload_to_jasm.bat" (there is also "preload dump example.txt" that you can use to try it out). Dragging a *.jasm file onto "jasm_to_array.bat" converts the jasm to a jass function that fills an integer array with the bytecode. You can then simply copy and paste that code into your map.

    The source code of jasm.exe is on GitHub.

    It looks like this:
    disassemble.jpg

    Displaying the code of your function works with a simple one-liner:
    Code (vJASS):

    private function WithIf takes nothing returns integer
        if true then
            return 1
        else
            return 2
        endif
    endfunction

    <snip>

    //! runtextmacro DumpFunction("WithIf")
     
    Jasm Instruction overview

    Arguments that contain only " - " are ignored by the instruction.
    Possible types are void, code, int, real, string, handle, bool, int[], real[], string[], hdl[], bool[].
    The assembler has a list of all natives and does name lookup. That means &Player is the same as writing F_538.

    This table is not complete and may contain errors!

    Name | Id | Arg 1 | Arg 2 | Arg 3 | Arg 4 | Description | Example
    endprogram | 0x01 | - | - | - | - | Used to signal end of parsing. Ignored by the VM. |
    jmp_deprecated | 0x02 | - | - | - | label | Behaves exactly like jmp. |
    func | 0x03 | type | - | - | function | Used by parser to signal start of a function. Ignored by the VM. | func void F_109f
    endf | 0x04 | - | - | - | - | Like func, just for the end. | endf
    local | 0x05 | type | - | - | variable | Declares a new local variable. | local int V_10a1
    global | 0x06 | type | - | - | variable | Declares a new global variable. | global real V_10a2
    const | 0x07 | type | - | - | variable | Like global. | const handle V_10a3
    poparg | 0x08 | type | integer | - | variable | Get a function parameter. Arg2 is the number of the parameter, with the rightmost parameter as number 1. | poparg boolean 2 V_10a4
    cleanstack | 0x0b | integer | - | - | - | Remove Arg1 many parameters from the stack. | cleanstack 3
    literal | 0x0c | register | type | - | integer | Set Arg1 to the value of Arg4. | literal R5 real 3.1415
    mov | 0x0d | register | register | - | - | Arg1 = Arg2 | mov R0 R5
    getvar | 0x0e | register | type | - | variable | Copy variable Arg4 to register Arg1. | getvar
    code | 0x0f | register | type | - | function | code | code
    getvar[] | 0x10 | register | register | type | variable | getvar[] | getvar[]
    setvar | 0x11 | register | - | - | variable | Copy register Arg1 to variable Arg4. | setvar R83 V_10a0
    setvar[] | 0x12 | register | register | - | variable | setvar[] | setvar[]
    push | 0x13 | register | - | - | - | Push Arg1 onto the stack. | push R87
    pop | 0x14 | register | - | - | - | Remove the topmost value of the stack and put it into Arg1. | pop R87
    callnative | 0x15 | - | - | - | function | Calls a native function. Parameters of that function first need to be pushed. | push R1; callnative &Player
    calljass | 0x16 | - | - | - | function | Like callnative but with jass functions. Stack must be cleared afterwards. | push R1; push R2; calljass F_109f; cleanstack 2
    i2r | 0x17 | register | - | - | - | Convert the value in the register to a float. | i2r R85
    and | 0x18 | register | register | register | - | Arg1 = Arg2 and Arg3. Boolean operation, not bitwise. | and R1 R1 R2
    or | 0x19 | register | register | register | - | Arg1 = Arg2 or Arg3. Boolean operation, not bitwise. | or R1 R1 R2
    eq | 0x1a | register | register | register | - | If Arg2 and Arg3 are equal then set Arg1 to 1 else to 0. | eq R1 R1 R2
    ne | 0x1b | register | register | register | - | Not equal, compare eq. | ne R1 R1 R2
    le | 0x1c | register | register | register | - | Less equal, compare eq. | le R1 R1 R2
    ge | 0x1d | register | register | register | - | Greater equal, compare eq. | ge R1 R1 R2
    lt | 0x1e | register | register | register | - | Less than, compare eq. | lt R1 R1 R2
    gt | 0x1f | register | register | register | - | Greater than, compare eq. | gt R1 R1 R2
    add | 0x20 | register | register | register | - | Arg1 = Arg2 + Arg3 | add R89 R89 R88
    sub | 0x21 | register | register | register | - | Arg1 = Arg2 - Arg3 | sub R89 R89 R88
    mul | 0x22 | register | register | register | - | Arg1 = Arg2 * Arg3 | mul R89 R89 R88
    div | 0x23 | register | register | register | - | Arg1 = Arg2 / Arg3 | div R89 R89 R88
    mod | 0x24 | register | register | register | - | Arg1 = Arg2 modulo Arg3 | mod R89 R89 R88
    neg | 0x25 | register | - | - | - | Arg1 = -Arg1 | neg R39
    not | 0x26 | register | - | - | - | If Arg1 == 0: Arg1 = 1, else Arg1 = 0. | not R22
    ret | 0x27 | - | - | - | - | Return from the function. Return value is in R0. | ret
    label | 0x28 | - | - | - | label | Used as a marker for a jump instruction. Ignored by VM. | label L_588
    jmpt | 0x29 | register | - | - | label | Jump if true. Continue evaluation at label Arg4 if Arg1 is true. | jmpt L_588
    jmpf | 0x2a | register | - | - | label | Jump if false, compare jmpt. | jmpf L_588
    jmp | 0x2b | - | - | - | label | Unconditional jump. Continue execution at label Arg4. | jmp L_588
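
    To make the register and stack semantics in the table concrete, here is a toy interpreter for a handful of the instructions. It is only a model of the descriptions above, not the real VM: the tuple-based instruction format and the run function are invented for this sketch, type tags are ignored, and treating any non-zero value as true in "and" is an assumption.

```python
def run(program):
    """Interpret a small subset of jasm; returns the value left in R0."""
    regs = {}    # register name -> value
    stack = []   # operand stack used by push/pop
    for instr in program:
        op, args = instr[0], instr[1:]
        if op == "literal":      # literal Rn value
            regs[args[0]] = args[1]
        elif op == "mov":        # Arg1 = Arg2
            regs[args[0]] = regs[args[1]]
        elif op == "push":
            stack.append(regs[args[0]])
        elif op == "pop":
            regs[args[0]] = stack.pop()
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "and":        # boolean, not bitwise: result is 0 or 1
            regs[args[0]] = int(bool(regs[args[1]]) and bool(regs[args[2]]))
        elif op == "eq":
            regs[args[0]] = int(regs[args[1]] == regs[args[2]])
        elif op == "not":
            regs[args[0]] = 1 if regs[args[0]] == 0 else 0
        elif op == "ret":
            return regs.get("R0")
        else:
            raise ValueError(f"unsupported instruction: {op}")
    raise RuntimeError("program ran off the end without ret")


# Computes (2 + 3) == 5, saving the left operand on the stack in between.
result = run([
    ("literal", "R0", 2),
    ("push", "R0"),
    ("literal", "R0", 3),
    ("pop", "R1"),
    ("add", "R0", "R1", "R0"),
    ("literal", "R1", 5),
    ("eq", "R0", "R0", "R1"),
    ("ret",),
])
print(result)  # prints 1
```

    The sketch raises an error when a program has no ret, which loosely mirrors the remark in the example map that a missing ret crashes Warcraft.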




    And as a little bonus, dynamic dispatch:
    Code (vJASS):

    function PoorMansJumptable takes nothing returns nothing
        call FunA() // 0
        return
        call FunB() // 1
        return
        call FunC() // 2
        return
        call FunD() // 3
        return
        call FunE() // 4
        return
    endfunction

    function EvalJumptable takes code table, integer num returns nothing
        call ForForce(bj_FORCE_PLAYER[0], I2C(C2I(table) + 2 * 8 * num))
    endfunction

    // 5 functions so i is between 0 and 4 inclusive
    function Test takes integer i returns nothing
        call EvalJumptable(function PoorMansJumptable, i)
    endfunction
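
    The address arithmetic in EvalJumptable can be checked on paper. The Python sketch below is purely illustrative and assumes what the 2 * 8 * num expression implies: every instruction occupies 8 bytes (two 32-bit words, as in the array listing earlier), and each table entry compiles to exactly two instructions, a call followed by a return. The name entry_address and the base address are hypothetical.

```python
INSTRUCTION_SIZE = 8         # bytes: one 32-bit opcode word + one 32-bit data word
INSTRUCTIONS_PER_ENTRY = 2   # assumed: "call FunX()" + "return"


def entry_address(table_address, num):
    """Address of the num-th entry, mirroring C2I(table) + 2 * 8 * num."""
    if num < 0:
        raise ValueError("entry index must not be negative")
    return table_address + INSTRUCTIONS_PER_ENTRY * INSTRUCTION_SIZE * num


base = 0x1000  # made-up address standing in for C2I(function PoorMansJumptable)
for i in range(5):
    print(i, hex(entry_address(base, i)))
```

    Under these assumptions the fixed stride only works while every entry has the same instruction count. A call that takes arguments would need extra push instructions, so its entry would be longer and the following entries would no longer sit on a 16-byte boundary.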
     
     

    Attached Files:

    Last edited: Apr 29, 2017
  2. leandrotp

    Please let me be the first here to thank you!

    I wanted so much to make this (and much more) by myself, but you did everyone a favor and brought my ideas to the real world.

    I've been promising to write the JASS VM documentation for a long time. It's all in my head, but it's so much text that I don't really know where to begin.

    So I must thank you for not waiting for me to share it. You figured it out just by yourself, and you probably did much better than I ever would.

    And until I get the documentation fully written, I guess I can help you with some things.

    First, I need to write something about the string table. The information found here is not complete: what most people don't know is that string variables are actually HANDLES! There are basically 2 tables: the string table itself and the string handle table.

    Also, the string table holds much more than you think. Every function and variable name used in the map script is also present in the string table. But normally those strings won't get a handle assigned, so we can't access them using this.

    Finally, the string table plays a key role in jass bytecode. You will see that every JASS opcode that deals with functions or variables takes an id. And this id is actually the string table id of that name! The string table is static and created at compilation time (when the map script is translated to bytecode). All strings used in the map (all function and variable names, as well as string literals) are placed into the string table at this time, then they get an id assigned, and this id is then used in JASS bytecode when that name is referenced.

    Obviously the string table can grow at runtime, when new strings are generated (with things like SubString or GetObjectName). But as I said, all strings used directly in the map script are inserted into the string table at the compilation stage. Then, when the function referencing those strings is actually executed, only a string handle is generated.

    If you typecast a string into an integer, what you get is a handle, not the actual string id. There's no easy way to obtain the id for a generic string, but if you want to know the id of a specific name (to run bytecode from array, for example), you can read it directly from a bytecode instruction.
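
    A toy model of the two tables may help here. This is only an illustration of the description above, not the VM's real data structures; the class name, the method names, and the id and handle numbering are all invented.

```python
class ToyStringTable:
    """Names and literals get an id once; handles are created separately."""

    def __init__(self):
        self._ids = {}        # text -> string table id
        self._strings = []    # string table id -> text
        self._handles = []    # handle index -> string table id

    def intern(self, text):
        """Return the id for text, adding it to the table if needed."""
        if text not in self._ids:
            self._ids[text] = len(self._strings)
            self._strings.append(text)
        return self._ids[text]

    def make_handle(self, string_id):
        """Create a handle that refers to an existing string table id."""
        self._handles.append(string_id)
        return len(self._handles) - 1

    def resolve(self, handle):
        """Follow a handle back to its text."""
        return self._strings[self._handles[handle]]


table = ToyStringTable()
# "Compilation": every name and literal in the script receives an id.
for name in ("CreateUnit", "created_footman", "hello"):
    table.intern(name)

# "Runtime": only the string that is actually used gets a handle.
handle = table.make_handle(table.intern("hello"))
print(handle, table.intern("hello"), table.resolve(handle))  # prints: 0 2 hello
```

    In this model the handle (0) and the string table id (2) are different numbers, which is the point made above: a typecast string variable yields the handle, while the bytecode needs the id.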

    Second, the information on Grimoire's source code is not fully accurate. Here is the true list of all JASS opcodes:
    Code (vJASS):

    struct opcode {
       char arg3;
       char arg2;
       char arg1;
       char OP;
       int data;
    };

    enum OPCODES {
       OP_ENDPROGRAM=0x1,       //Signal to the parser that code has ended. Ignored by the VM
     
       OP_JUMP_DEPRECATED=0x2,  //Might have been used in past, now has the same behaviour as OP_JUMP
     
       OP_FUNCTION=0x3,         //Function declaration. Arg1 = return type, Data = function name. Used only
                                //at parsing stage.
     
       OP_ENDFUNCTION=0x4,      //Denotes end of function. Used only at parsing stage, ignored by VM
     
       OP_LOCAL=0x5,            //Create local variable. Arg1 = variable type, Data = Variable name
     
       OP_GLOBAL=0x6,OP_CONSTANT=0x7,   //These two are actually the same! There's no difference between
                                        //globals and constants from VM's point of view. Parameters are
                                        //the same as above.
     
       OP_POPFUNCARG=0x8,       //Create a local var and assign a value to it directly from the caller's
                                //stack. Arg1 = Type, Arg2 = #FuncArg, Data = Variable Name
     
       OP_TYPE=0x9,OP_EXTENDS=0xA,    //Used only by parser, ignored by VM
     
       OP_CLEANSTACK=0xB,       //Pops <Arg1> values from the stack. Used after calling Jass functions,
                                //not needed when calling natives.
     
       OP_LITERAL=0xC,          //Set value of register to a literal. Arg1 = DestReg, Arg2 = DestType,
                                //Data = Literal. If type is J_STRING, a string handle is immediately
                                //created, and that's what the register actually gets.
                             
       OP_MOV=0xD,              //Moves 1 register to another. Curiously it's only used when returning
                                //a value (by moving to R0). Arg1 = DestReg, Arg2 = SrcReg
     
       OP_GETVAR=0xE,           //Read variable. Arg1 = DestReg, Arg2 = DestType, Data = Variable name.
                                //DestType doesn't need to match the variable type.
     
       OP_CODE=0xF,             //Get function address. Arg1 = DestReg, Data = Function name. Value is
                                //stored with type J_CODE.
     
       OP_GETARRAY=0x10,        //Read from array. Arg1 = DestReg, Arg2 = IndexReg, Arg3 = DestType,
                                //Data = Array name. DestType doesn't need to match array type.
     
       OP_SETVAR=0x11,          //Write variable. Arg1 = SrcReg, Data = Variable name. Type of SrcReg MUST
                                //MATCH the type of variable, unless it's J_NULL.
     
       OP_SETARRAY=0x12,        //Write to array. Arg1 = IndexReg, Arg2 = SrcReg, Data = Array name. Type
                                //of SrcReg must match the array type.
     
       OP_PUSH=0x13,            //Pushes register <Arg1> onto the stack
     
       OP_POP=0x14,             //Pops the value from the top of the stack to register <Arg1>. God knows
                                //why Blizzard uses PUSH/POP on math operations.
     
       OP_NATIVE=0x15,          //Call a native. Data = Native name. No arguments.
     
       OP_JASSCALL=0x16,        //Call a JASS function. Same as above.
     
       OP_I2R=0x17,             //Read register <Arg1>, convert it to a float, and store the result in the
                                //same register, with type J_REAL. This is faster than native I2R

       //Boolean operations, Arg1 = DestReg, Arg2 and 3 are source operands.
     
       //These are not bitwise operations, but just boolean, they return either 0 or 1.
       OP_AND = 0x18,
       OP_OR = 0x19,
     
       //Comparison operations, return either 0 or 1
       OP_EQUAL=0x1A,
       OP_NOTEQUAL=0x1B,       // check
       OP_LESSEREQUAL=0x1C,OP_GREATEREQUAL=0x1D,
       OP_LESSER=0x1E,OP_GREATER=0x1F,
     
       //End boolean operations

       OP_ADD=0x20,OP_SUB,OP_MUL,OP_DIV, //Math operations, Arg1 = DestReg, Arg2 and 3 are source operands.
     
       OP_MODULO = 0x24, //Same as above, but this one is deprecated and never emitted by compiler (WHY?)
     
       OP_NEGATE=0x25,   //Negate the value of register <Arg1> and store in the same register. Notice that
                         //writing negative literals in the script will always produce this operation, when
                         //it could just emit a literal already negated.
                             
       OP_NOT = 0x26,    //Boolean NOT. Returns 1 if register <Arg1> is 0, and 0 in all other cases. Result
                         //is stored in the same register.
     
       OP_RETURN=0x27,   //No arguments
     
       OP_LABEL=0x28,    //Defines a jump label. Data = Label Id. Used at parsing stage, ignored by VM
       
       OP_JUMPIFTRUE=0x29,OP_JUMPIFFALSE=0x2A, //Jump to label <Data> depending on contents of register <Arg1>
     
       OP_JUMP=0x2B       //Unconditional jump. Data = Label Id
    };
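
    Because arg3 is the first byte of the struct and OP the fourth, reading the first four bytes as one little-endian 32-bit integer puts the opcode in the top byte. Here is a short Python sketch of that decoding, assuming a little-endian layout with no padding; the decode function and the small name table are illustrative only, and the sample words are taken from the array listing in the first post.

```python
import struct

# A few opcode names from the enum above; not the full list.
NAMES = {0x0C: "literal", 0x0E: "getvar", 0x11: "setvar",
         0x13: "push", 0x15: "callnative", 0x27: "ret"}


def decode(head, data):
    """Split the two 32-bit words of one instruction into its fields."""
    raw = struct.pack("<II", head, data)
    # Byte order in memory follows the struct: arg3, arg2, arg1, OP, then data.
    arg3, arg2, arg1, op, value = struct.unpack("<BBBBI", raw)
    return {"op": NAMES.get(op, hex(op)), "arg1": arg1, "arg2": arg2,
            "arg3": arg3, "data": value}


print(decode(0x0E000400, 0x00000000))  # the getvar line
print(decode(0x15000000, 0x00000538))  # callnative &Player
```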
     
     
    Last edited: Apr 26, 2017
  3. TriggerHappy

    There is also a list of opcodes here: YDWE/opcode.h (actboy168/YDWE on GitHub, at commit e4f3b8390acc6779b984d59878b0473d5c0489f3).

    There are many things related to the JASS VM there as well.

    Code (C++):

        void jass_get_global_variable(lua_State* L, jass::OPCODE_VARIABLE_TYPE opt, uint32_t value)
        {
            switch (opt)
            {
            case jass::OPCODE_VARIABLE_NOTHING:
            case jass::OPCODE_VARIABLE_UNKNOWN:
            case jass::OPCODE_VARIABLE_NULL:
                lua_pushnil(L);
                break;
            case jass::OPCODE_VARIABLE_CODE:
                jassbind::push_code(L, value);
                break;
            case jass::OPCODE_VARIABLE_INTEGER:
                jassbind::push_integer(L, value);
                break;
            case jass::OPCODE_VARIABLE_REAL:
                jassbind::push_real(L, value);
                break;
            case jass::OPCODE_VARIABLE_STRING:
                jassbind::push_string(L, get_jass_vm()->string_table->get(value));
                break;
            case jass::OPCODE_VARIABLE_HANDLE:
                jassbind::push_handle(L, value);
                break;
            case jass::OPCODE_VARIABLE_BOOLEAN:
                jassbind::push_boolean(L, value);
                break;
            default:
                lua_pushnil(L);
                break;
            }
        }
     
    Last edited: Apr 26, 2017
  4. Hotwer

    Would love a more friendly introduction to that.
     
  5. Trigger.edge

    Okay....

    Why don't you put your energy into making the compiler instead of this? :D

    By the way, this is very useful, thanks...
     
    Last edited: Apr 26, 2017
  6. Lord of theDing

    That is exactly the plan. :grin: But writing a compiler is hard, so it's always a multi-step process. The assembler I'm writing is the lowest-level step: if that doesn't work, a compiler doesn't stand a chance of functioning.
     
  7. Halo7568

    They probably could deal with it differently, or optimize out a lot of those types of pointless push/pop operations if they did an optimization pass, but I think those are to deal with cases like this:
    Code (vJASS):

    function bigmath takes nothing returns integer
        local integer big = 1
        set big = big * big + big / big - big + big + big + big + big - big - big - big - big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big + big
        set big = 1
        return big
    endfunction

    function watchout takes nothing returns nothing
        local integer whoa = 55 + bigmath()
        call BJDebugMsg("bigmath was: " + I2S(whoa))
    endfunction
     


    Though not quite as contrived as this. Really all it takes is some random function using the same register. If everything else was left the same, but you took out the push/pops, then the 55 would be lost in this assignment.
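
    The effect can be modelled in a few lines. The Python sketch below is a toy with an invented register assignment, not actual compiler output: it only assumes that all functions share one register file and that a return value arrives in R0.

```python
def bigmath(regs):
    """Stand-in for the JASS function: clobbers registers, leaves 1 in R0."""
    regs["R1"] = 12345   # scratch work overwrites other registers
    regs["R0"] = 1       # return value


def with_push_pop():
    regs, stack = {}, []
    regs["R0"] = 55
    stack.append(regs["R0"])              # push R0: save the left operand
    bigmath(regs)                         # the call overwrites R0 and R1
    regs["R1"] = stack.pop()              # pop R1: the 55 is back
    regs["R0"] = regs["R1"] + regs["R0"]  # add R0 R1 R0
    return regs["R0"]


def without_push_pop():
    regs = {}
    regs["R1"] = 55                       # left operand kept in a register only
    bigmath(regs)                         # ...which the callee overwrites
    regs["R0"] = regs["R1"] + regs["R0"]
    return regs["R0"]


print(with_push_pop())     # prints 56
print(without_push_pop())  # prints 12346, the 55 is gone
```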
     
  8. Trigger.edge

    GOLD Parser Builder could be helpful.
     
  9. Lord of theDing

    I attached my jasm (dis)assembler tool to the first post and also added a bit documentation about the bytecode.

    Also @leandrotp can you explain in a bit more detail how you got access to the memory? And especially how you managed to execute the bytecode you wrote in your Memory lib. I already read your post here but it's still arcane magic for me.

    Also is it possible to add new labels or functions to the stringtable? Because if we can't we are at a dead end with the compiling plans.

    And do you know if it's possible to change the bytecode of an existing function? That would allow for a very simple interface between bytecode and (v)jass, because we could just create a stub function and replace its code with our bytecode.
     
    Last edited: Apr 28, 2017
  10. leandrotp

    Of course! Well, I still have plans to write a vJass library to ease the process of executing bytecode from arrays, but until I do, I'll explain how to do it manually.

    Step 1 - Get address of Array Struct


    Basically JASS arrays are regular global variables that hold a pointer to a special struct:
    Code (vJASS):

    struct JassArray<T>
    {
        void * vTable
        unsigned int maxSize
        unsigned int currentSize
        T * pData
    }
    The memory address of the array data is stored in the pData member. But to read it, we first need to get the address of the JassArray struct from the array variable. To do that, we just need to typecast the array variable into an integer!
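
    As a quick sanity check on the offsets, here is a small Python sketch (not JASS) of that layout. It assumes a 32-bit process where all four fields are 4 bytes wide and little-endian; the sample values are invented.

```python
import struct

# Field order as in the JassArray struct above; all four fields are assumed
# to be 4 bytes wide (32-bit process), little-endian.
LAYOUT = "<IIII"  # vTable, maxSize, currentSize, pData
FIELDS = ("vTable", "maxSize", "currentSize", "pData")

def field_offset(name):
    """Byte offset of a field inside the struct."""
    return 4 * FIELDS.index(name)

def parse_jass_array(raw):
    """Split 16 raw bytes into a dict of named fields."""
    return dict(zip(FIELDS, struct.unpack(LAYOUT, raw)))

# Made-up values purely for demonstration.
raw = struct.pack(LAYOUT, 0x6F000000, 8192, 6, 0x12345678)
print(field_offset("pData"))                 # 12
print(hex(parse_jass_array(raw)["pData"]))   # 0x12345678
```

    Under that assumption pData sits 12 bytes into the struct.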

    There are 2 ways to do that. The permanent and the temporary typecast.

    Code (vJASS):

    globals
        integer Bytecode //Used just to fool Jasshelper
        integer array l__Bytecode
        integer StructAddress
    endglobals

    function InitBytecode takes nothing returns nothing
        set l__Bytecode[0] = <my>
        set l__Bytecode[1] = <bytecode>
        set l__Bytecode[2] = <here>
        ...
    endfunction

    function Typecast takes nothing returns nothing
       local integer Bytecode //Jasshelper will implicitly rename this to l__Bytecode
    endfunction

    //# +nosemanticerror
    function GetStructAddress takes nothing returns nothing
       set StructAddress = l__Bytecode
       return
    endfunction

    function init takes nothing returns nothing
       call InitBytecode()
       call GetStructAddress()
    endfunction
     

    As I have previously explained, typecasting is based on the fact that you can declare a local variable with the same name as a global, and it will change the type of the global. So we can declare a local integer named l__Bytecode, and it will cause the global array l__Bytecode to become an integer. Reading that integer will then return the address of the JassArray struct for that array.

    This method is called permanent because all code that comes after the Typecast function will treat the array as an integer! This means that it can no longer be used as an array, and because of that, you must initialize the array in a function that comes before the Typecast function.

    This is fine for most cases, but if for some reason you want to access the array from other places in your code, there's an alternative below.
    Code (vJASS):

    globals
        integer Bytecode //Used to fool Jasshelper
        integer array l__Bytecode
        integer StructAddress
    endglobals

    function InitBytecode takes nothing returns nothing
        set l__Bytecode[0] = <my>
        set l__Bytecode[1] = <bytecode>
        set l__Bytecode[2] = <here>
        ...
    endfunction

    //Jasshelper will rename the function argument to l__Bytecode
    function GetStructAddress takes integer Bytecode returns nothing
       set StructAddress = Bytecode //l__Bytecode
    endfunction

    //# +nosemanticerror
    function init takes nothing returns nothing
       call InitBytecode()
       call ForForce(bj_FORCE_PLAYER[0], I2C(8+C2I(function GetStructAddress)))
    endfunction
     

    This method makes use of a function argument instead of a local variable to make the typecast. Because of that, the typecast only works within the scope of that function - all other code will still treat variable l__Bytecode as an integer array.

    However, we normally can't use functions that take arguments as code. This is because the first instruction of those functions is always a poparg instruction, which crashes the game if there are no arguments. So we need to use I2C and C2I to jump over that instruction, by skipping 8 bytes from the beginning of the function.

    Doing that, the function will store the address of the JassArray struct into variable StructAddress, and the array will still be usable from the rest of your code.

    Step 2 - Get the address of the data and execute it


    Once we have the address of the struct we just need to read the pData member:

    set ArrayAddress = ReadMemory(StructAddress + 12)
    or
    set ArrayAddress = Memory[StructAddress/4 + 3]


    Notice that when you use the Memory array, all addresses must be divided by 4. So the function ReadMemory is provided for convenience, if you don't want to worry about that. I like to use the array because it's faster, but if the script optimization of Jasshelper is turned on, the function call will be inlined, so there's no difference.
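
    The divide-by-4 rule is just index arithmetic on 4-byte cells. A minimal Python sketch (not JASS) with a hypothetical struct address shows that both forms hit the same cell:

```python
def memory_index(address):
    """Index into the Memory[] array for a 4-byte-aligned byte address."""
    if address % 4 != 0:
        raise ValueError("address must be 4-byte aligned")
    return address // 4

struct_address = 0x10002000          # hypothetical, must be aligned
via_function = memory_index(struct_address + 12)
via_array = struct_address // 4 + 3  # what Memory[StructAddress/4 + 3] indexes
print(via_function == via_array)     # True
```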

    After getting the array address we can easily execute it:
    Code (vJASS):

    set BytecodeTrigger = CreateTrigger()
    call TriggerAddCondition(BytecodeTrigger, Condition(I2C(ArrayAddress)))
    call TriggerEvaluate(BytecodeTrigger)
     

    Or you can just use ForForce(bj_FORCE_PLAYER[0], I2C(ArrayAddress)) if you want; it's better if you're going to run it only once.

    Step 3 - Dealing with saved games


    When a saved game is loaded, pretty much everything is located in different places than before. This is not a problem for normal Jass programming, but when we use bytecode we are working directly with memory, and this means that our bytecode array will certainly be located in a different memory address now.

    To deal with this we need to register a trigger for EVENT_GAME_LOADED to retrieve the location of Bytecode and update our trigger.

    Code (vJASS):

    function OnGameLoaded takes nothing returns boolean
       call GetStructAddress()
       set ArrayAddress = ReadMemory(StructAddress + 12)
       call TriggerClearConditions(BytecodeTrigger)
       call TriggerAddCondition(BytecodeTrigger, Condition(I2C(ArrayAddress)))
       return false
    endfunction

    function init takes nothing returns nothing
       local trigger t = CreateTrigger()
       call TriggerRegisterGameEvent(t, EVENT_GAME_LOADED)
       call TriggerAddCondition(t, Condition(function OnGameLoaded))
    endfunction
     

    Since Memory hack was fixed, there's no way to modify those tables. Also it's not possible to modify the bytecode of an existing function after 1.27b, because now we have just read-only access to memory.

    However, it's possible to produce a fake jump table in a jass array, and create new jump labels to use with the jump instructions. This could possibly allow us to emulate function calls using JMP instead of a regular call, and we could also pass parameters in registers for better performance.

    But still there's no way for normal JASS code to call bytecode other than using natives. A good practice for using bytecode is to pass it directly to Trigger conditions and timers, since it can call other JASS code normally, but not the opposite way.
     
  11. Aniki

    How would a "Bytecode compiler" setup/create/modify the string table (not the string handle table) such that the instructions (OP_GETARRAY, OP_SETARRAY, etc.) that require a variable/function name (i.e an id from the string table) would have something to work with?
     
  12. Lord of theDing

    For local variables we can just use already existing variables in the string table. I haven't done extensive testing but I did not notice any problems. For new global variables it should be possible to create a new entry by calling a string function, e.g. I2S(some_id_here). If it's used only in bytecode the name doesn't matter. For functions the plan is to use only jumps.

    Can you explain a bit how that is possible?

    Also I found a way to read the string table:
    Code (vJASS):

    library StringFromId initializer init uses JasmCore
       globals
           integer jasm  // Not used, it's here just to fool Jasshelper
           integer array l__jasm
            integer jasm_address
            private integer offset = 1
            private string ret_string
            private code bytecode
       endglobals
         
        private function GetGlobalIds takes nothing returns nothing
            set ret_string = "" // start + 2 instr
        endfunction

       function InitJasmArray_l__jasm takes integer ret_id returns nothing
            set l__jasm[   0] = 0x0c010600 // literal      R1     string -    
            set l__jasm[   1] = ret_id
            set l__jasm[   2] = 0x11010000 // setvar       R1     -      -    
            set l__jasm[   3] = ret_id
            set l__jasm[   4] = 0x27000000 // ret          -      -      -
            set l__jasm[   5] = 0x00000000
        endfunction
     
        private function GetBytecodeAddress takes nothing returns integer
            return Memory[jasm_address/4+3]
        endfunction
     
        function ReadStringFromId takes integer id returns string
            set l__jasm[offset] = id
            call ForForce(bj_FORCE_PLAYER[0], bytecode)
            return ret_string
        endfunction
     
        function GetFunctionName takes code c returns string
            return ReadStringFromId(GetJasmInstrB(GetJasmCodeAddr(c),0))
        endfunction
     
       private function Typecast takes nothing returns nothing
           local integer jasm
       endfunction

        //# +nosemanticerror
        private function GetJasmAddress takes nothing returns nothing
            set jasm_address = l__jasm
            return
        endfunction

       private function init takes nothing returns nothing
            local integer globidfun = GetJasmCodeAddr(function GetGlobalIds)
            local integer ret_str_id = GetJasmInstrB(globidfun, 2)
            call InitJasmArray_l__jasm(ret_str_id)
            call GetJasmAddress()
            set bytecode = I2C(GetBytecodeAddress())
       endfunction

    endlibrary
     


    And the JasmCore lib:
    Code (vJASS):

    library JasmCore uses Memory
       function GetJasmCodeAddr takes code c returns integer
           // function address is first bytecode instruction of function
           // we start at the previous instruction to get the function declaration
           return C2I(c) - 8
       endfunction

       function GetJasmInstrA takes integer addr, integer instruction returns integer
           return Memory[addr/4 + 2*instruction]
       endfunction

       function GetJasmInstrB takes integer addr, integer instruction returns integer
           return Memory[addr/4 + 2*instruction + 1]
       endfunction
    endlibrary
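
    For reference, here is a rough Python sketch (not JASS) of how the instruction words used in this thread split apart. It assumes the high byte is the opcode and that every instruction is two 32-bit words (8 bytes); the opcode names are only the ones that appear in the snippets here, and RET_ID is a made-up string-table id.

```python
OPCODES = {0x0B: "cleanstack", 0x0C: "literal", 0x0E: "getvar",
           0x11: "setvar", 0x13: "push", 0x16: "calljass", 0x27: "ret"}

def split_word(word):
    """Split a 32-bit instruction word into its four bytes, high byte first."""
    return [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]

def disassemble(words):
    """Turn a flat list of 32-bit words into (mnemonic, b1, b2, b3, argument) tuples."""
    out = []
    for i in range(0, len(words) - 1, 2):   # one instruction = two words
        op, b1, b2, b3 = split_word(words[i])
        out.append((OPCODES.get(op, hex(op)), b1, b2, b3, words[i + 1]))
    return out

RET_ID = 4242  # hypothetical string-table id of the target variable
program = [0x0C010600, RET_ID, 0x11010000, RET_ID, 0x27000000, 0]
for line in disassemble(program):
    print(line)
```

    In the literal instruction the second byte (01) is the register and the third (06) is the type, matching the "literal R1 string" comment above.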
     
     
  13. Aniki

    I2S(<some-integer>) would add a new entry (unless the string was already in the table?), but how would you obtain the string's id? (Maybe it's obvious, I just don't understand much about bytecode.)

    "Now you're thinking with bytecode" =)...
     
  14. Lord of theDing

    I only know a very hacky way to get the id. These ids are all sequential, so if we know the biggest id and then call I2S(<something>) (where the resulting string is not yet in the table), the new id is just <biggest id> + 1. Maybe @leandrotp knows a better way, he is the memory expert.
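
    A toy Python model (not JASS) of what I mean. It only assumes that ids are handed out in order and that an already known string keeps its old id; the class, the sample strings and the starting id of 0 are all invented for the example.

```python
class StringTable:
    """Toy model: ids are handed out sequentially, duplicates reuse their id."""
    def __init__(self, initial):
        self.strings = []
        self.ids = {}
        for s in initial:
            self.intern(s)

    def intern(self, s):
        if s not in self.ids:
            self.ids[s] = len(self.strings)
            self.strings.append(s)
        return self.ids[s]

    def biggest_id(self):
        return len(self.strings) - 1

table = StringTable(["init", "bytecode", "ret_string"])
before = table.biggest_id()
new_id = table.intern("12345")          # stands in for I2S(12345)
print(new_id == before + 1)             # True
print(table.intern("12345") == new_id)  # True: already present, no new id
```

    The second call is the catch: if the string already exists, nothing new is added and the "+ 1" guess is wrong.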

    I also updated the attached map and my first post with a documented bytecode example.
     
  15. Aniki

    Not sure why but the "Temporary Typecast" crashes for me with patch 1.28.1.
    It seems to crash while trying to execute the instructions from set StructAddress = Bytecode //l__Bytecode, after jumping/skipping the popfuncarg instruction.

    Using your StringFromId I dumped the string table (attached .txt file), and it seems the entries in there are whatever names (function, parameter, global/local variables) and string literals the parser has encountered while parsing (in parsing order).
    So I suppose a "Bytecode compiler" would have to use custom common.j and blizzard.j files (as leandrotp has described here) in order to know the id of a newly created string which could be used for allocating a new global array.


    PS: My "hello-bytecode" script (which doesn't print "hello-world" =)...)
    Code (vJASS):

    library Foo initializer init uses Typecast, Memory

    globals
        integer result_integer
        integer result_integer_id
    endglobals
    function result_integer_get_id takes nothing returns nothing
        set result_integer = 0
        set result_integer_id = Memory[C2I(function result_integer_get_id)/4 + 3]
    endfunction

    globals
        // it seems we can't declare these to be private nor public
        // because the variable names won't match with that of the
        // "local integer instructions"
        //
        // i.e we have to come up with different variable names each time
        // we want to execute different instructions =)?
        //
        integer instructions
        integer array l__instructions
    endglobals

    private function instructions_init takes nothing returns nothing
        call result_integer_get_id()
        set l__instructions[0] = 0x0C010400
        set l__instructions[1] = 1234
        set l__instructions[2] = 0x11010000
        set l__instructions[3] = result_integer_id
        set l__instructions[4] = 0x27000000
        set l__instructions[5] = 0x00000000
    endfunction

    private function typecast_instructions takes nothing returns nothing
        local integer instructions
    endfunction

    //# +nosemanticerror
    private function instructions_execute takes nothing returns nothing
        call instructions_init()
        call ForForce(bj_FORCE_PLAYER[0], I2C(Memory[l__instructions/4 + 3]))
    endfunction

    private function init takes nothing returns nothing
        call instructions_execute()
        call BJDebugMsg("result_integer: " + I2S(result_integer))
    endfunction

    endlibrary
     
     


  16. Lord of theDing

    Yeah I also didn't get it to work.

    I have not tested it, but I think the names of new globals can be arbitrary, as long as we don't have any collisions with existing global/local variables. So I think we could get a "good enough" start index in the string table if we insert a function at the end of war3map.j which contains a unique identifier. Then the start in the stringtable would be the id of that string +X.

    This is completely untested, but something like this may work:
    Code (vJASS):

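    globals
        integer free_id
    endglobals

    // Wrapper, renamed here so that the two functions don't share a name.
    // ExecuteFunc looks the target up by name at runtime, so it can be declared later.
    function get_free_string_id takes nothing returns integer
        call ExecuteFunc("calculate_free_string_id")
        return free_id
    endfunction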

    // at the end of war3map.j
    function <some_unique_id> takes nothing returns nothing
    endfunction

    function calculate_free_string_id takes nothing returns nothing
        set free_id = Memory[C2I(function <some_unique_id>)/4 -1] // free_id now contains the string table id of <some_unique_id>
        set free_id = free_id + 2
    endfunction
     


    Nice. Now we just need to find some applications for bytecode that are actually useful.

    Edit: Found something more or less useful.
    It's just a proof of concept, but with bytecode we can emulate closures: functions that carry a bit of state with them. This can be used to attach data to a timer, a ForGroup call, a TriggerCondition, etc.

    Code (vJASS):

        //# +nosemanticerror
        private function DoSomething takes integer i, unit u returns nothing
            local code closure = create_closure(ModuloInteger(i*i+1, 255), u, function DoSomething)
            call SetUnitVertexColor(u, i, i, i, 255)
            call TimerStart(GetExpiredTimer(), 0.5, false, closure)
        endfunction
       
        //# +nosemanticerror
        private function Example takes nothing returns nothing
            local unit u = CreateUnit(Player(0), 'hfoo', 0, 0, 270)
            call TimerStart(CreateTimer(), 5, false, create_closure(42, u, function DoSomething))
        endfunction
     


    Code (vJASS):

    scope JumpTest initializer init

    //! runtextmacro JasmSetupGlobals("JasmTimerAttach")

        globals
            private integer array free_stack
            private integer free_stack_ptr = 0
            private integer bytecode_free_ptr = 0
        endglobals

       
        private function add_handle_parameter takes integer offset, handle h returns integer
            set l__JasmTimerAttach[offset  ] = 0x0C010700 // literal R1 handle
            set l__JasmTimerAttach[offset+1] = GetHandleId(h)
            set l__JasmTimerAttach[offset+2] = 0x13010000 // push R1
            set l__JasmTimerAttach[offset+3] = 0x00000000
            return offset + 4
        endfunction
       
        private function add_integer_parameter takes integer offset, integer i returns integer
            set l__JasmTimerAttach[offset] = 0x0C010400 // literal R1 int
            set l__JasmTimerAttach[offset+1] = i
            set l__JasmTimerAttach[offset+2] = 0x13010000 // push R1
            set l__JasmTimerAttach[offset+3] = 0x00000000
            return offset + 4
        endfunction
       
        private function add_calljass takes integer offset, code fun returns integer
            set l__JasmTimerAttach[offset  ] = 0x16000000 // calljass
            set l__JasmTimerAttach[offset+1] = GetJasmFunctionId(fun)
            return offset + 2
        endfunction
       
        private function add_ret takes integer offset returns integer
            set l__JasmTimerAttach[offset  ] = 0x27000000 // ret
            set l__JasmTimerAttach[offset+1] = 0x00000000
            return offset + 2
        endfunction  
       
        private function add_cleanstack takes integer offset, integer size returns integer      
            set size = Bitwise.shiftl(size, 16)
           
            set l__JasmTimerAttach[offset  ] = Bitwise.OR32(0x0b000000, size) // cleanstack
            set l__JasmTimerAttach[offset+1] = 0x00000000
            return offset + 2
        endfunction
       
        private function cleanup takes integer offset returns nothing
            set free_stack[free_stack_ptr] = offset
            set free_stack_ptr = free_stack_ptr + 1
        endfunction
       
        //# +nosemanticerror
        private function create_closure takes integer i, handle h, code fun returns code
            local integer offset
            local integer start
            local boolean update_ptr = false
            local integer size
           
            if free_stack_ptr > 0 then
                set free_stack_ptr = free_stack_ptr - 1
                set offset = free_stack[free_stack_ptr]
            else
                set offset = bytecode_free_ptr
                set update_ptr = true
            endif
           
            set start = offset
           
            // register parameters
            set offset = add_integer_parameter(offset, i)
            set offset = add_handle_parameter(offset, h)
           
            // call original function
            set offset = add_calljass(offset,fun)
            set offset = add_cleanstack(offset, 2)
           
            // add cleanup code
            set offset = add_integer_parameter(offset, start)
            set offset = add_calljass(offset,function cleanup)
            set offset = add_cleanstack(offset, 1)
           
            set offset = add_ret(offset)
           
            if update_ptr then
                set bytecode_free_ptr = offset
            endif
           
            set size = offset - start
           
            return I2C(GetJasmTimerAttachBytecodeAddress() + start*4)
        endfunction
       
        private function InitJasmArray_l__JasmTimerAttach takes nothing returns nothing
            set l__JasmTimerAttach[  0] = 0x27000000 // ret          -      -      -      -    
            // Jass arrays are not allocated all at once
            // we access the last index to force it to allocate the full memory
            set l__JasmTimerAttach[8191] = 0x00000000
        endfunction

    //! runtextmacro JasmSetupExec("JasmTimerAttach")

        //# +nosemanticerror
        private function DoSomething takes integer i, unit u returns nothing
            local code closure = create_closure(ModuloInteger(i*i+1, 255), u, function DoSomething)
            call SetUnitVertexColor(u, i, i, i, 255)
            call TimerStart(GetExpiredTimer(), 0.5, false, closure)
        endfunction
       
        //# +nosemanticerror
        private function Example takes nothing returns nothing
            local unit u = CreateUnit(Player(0), 'hfoo', 0, 0, 270)
            call TimerStart(CreateTimer(), 5, false, create_closure(42, u, function DoSomething))
        endfunction
       
        private function init takes nothing returns nothing
            call InitJasmArray_l__JasmTimerAttach()
            call JasmInitJasmTimerAttach()
            call Example()
        endfunction

    endscope
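
    To see what one closure looks like in memory, here is a Python sketch (not JASS) that emits the same words as the add_* helpers above. The function ids and the handle id are made up; only the instruction encodings are taken from the code.

```python
def literal_push(type_byte, value):
    # literal R1 <type>, <value>; push R1
    return [0x0C010000 | (type_byte << 8), value, 0x13010000, 0]

def calljass(function_id):
    return [0x16000000, function_id]

def cleanstack(count):
    return [0x0B000000 | (count << 16), 0]

def ret():
    return [0x27000000, 0]

def build_closure(i, handle_id, fun_id, cleanup_id, start):
    words = []
    words += literal_push(0x04, i)          # integer argument
    words += literal_push(0x07, handle_id)  # handle argument
    words += calljass(fun_id) + cleanstack(2)
    words += literal_push(0x04, start)      # argument for cleanup
    words += calljass(cleanup_id) + cleanstack(1)
    words += ret()
    return words

# All ids below are made up for demonstration.
closure = build_closure(42, 0x100001, fun_id=5000, cleanup_id=5001, start=0)
print(len(closure))          # 22 words = 11 instructions = 88 bytes
print(hex(closure[0]))       # 0xc010400
```

    Every closure is the same 22 words, which is why a plain stack of free offsets is enough for recycling; a single 8192-entry array fits at most 372 of them.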
     
     
    Last edited: Apr 30, 2017
  17. Aniki

    Found a silly way to reference strings in bytecode and wrote the bytecode version of "hello world" =):
    Code (vJASS):

    library Foo initializer bc_0001_execute uses Typecast, Memory, stringtableidfromhandle

    globals
        integer bc_0001
        integer array l__bc_0001
        integer bc_0001_addr
        integer bc_0001_offset = -1
    endglobals

    private function X takes nothing returns integer
        set bc_0001_offset = bc_0001_offset + 1
        return bc_0001_offset
    endfunction

    globals
        integer func_print_stid
    endglobals
    private function print takes string s returns nothing
        call BJDebugMsg(s)
    endfunction

    private function bc_0001_init takes nothing returns nothing
        set l__bc_0001[X()] = 0x0C010600
        set l__bc_0001[X()] = stid_from_handle("hello world =)")
        set l__bc_0001[X()] = 0x13010000
        set l__bc_0001[X()] = 0x00000000
        set l__bc_0001[X()] = 0x16000000
        set l__bc_0001[X()] = func_print_stid
        set l__bc_0001[X()] = 0x0B010000
        set l__bc_0001[X()] = 0x00000000
        set l__bc_0001[X()] = 0x27000000
        set l__bc_0001[X()] = 0x00000000
    endfunction

    private function bc_0001_allocate takes nothing returns nothing
        set l__bc_0001[8190] = 0
    endfunction
    private function bc_0001_typecast takes nothing returns nothing
        local integer bc_0001
    endfunction

    //# +nosemanticerror
    private function bc_0001_init_vars takes nothing returns nothing
        call bc_0001_allocate()
        set func_print_stid = Memory[C2I(function print)/4 - 1]
        set bc_0001_addr = Memory[l__bc_0001/4 + 3]
    endfunction

    private function bc_0001_execute takes nothing returns nothing
        call bc_0001_init_vars()
        call bc_0001_init()
        call ForForce(bj_FORCE_PLAYER[0], I2C(bc_0001_addr))
    endfunction

    endlibrary
     


    Code (vJASS):

    library stringtableidfromhandle initializer init uses Typecast, StringFromId

    globals
        private hashtable cache = InitHashtable()
        private integer offset // offset in the string table from which we start the search
        private string ef_s
        private integer ef_result
    endglobals
    private function init takes nothing returns nothing
        local string s
        local integer i

        set s = I2SH(1)
        if s == "" then
            set s = I2SH(2)
        endif

        set i = 1
        loop
            exitwhen ReadStringFromId(i) == s
            set i = i + 1
        endloop

        set offset = i
    endfunction

    private function stid_from_handle_exec takes nothing returns nothing
        local string s = ef_s
        local integer i

        set i = offset
        loop
            if ReadStringFromId(i) == s then
                set ef_result = i
                return
            endif
            set i = i - 1
            if i == 0 then
                exitwhen true
            endif
        endloop

        set i = offset + 1
        loop
            if ReadStringFromId(i) == s then
                set ef_result = i
                return
            endif
            set i = i + 1
        endloop

        // unreachable
    endfunction

    function stid_from_handle takes string s returns integer
        local integer sh = StringHash(s)
        local integer i = LoadInteger(cache, 0, sh)
        if i != 0 then
            return i
        endif

        set ef_s = s
        call ExecuteFunc(SCOPE_PRIVATE + "stid_from_handle_exec")
        call SaveInteger(cache, 0, sh, ef_result)
        return ef_result
    endfunction

    endlibrary
     
     
  18. Lord of theDing

    That looks like a linear search in the table, right?

    What's up with this part?
    It would be nice to know how the warcraft engine handles the lookup. Then we could mimic that, it's faster than searching.
     
  19. Aniki

    Yes, it is a linear search, starting from the string-table-id of the string-handle-table's first non-empty
    entry, which should be ~= 4K (assuming default common.j and blizzard.j).
    Starting from there it goes backwards in the string-table until it reaches 0, and then it searches
    forward. When it finds the string's string-table-id it caches it. It doesn't need to be fast, I use it
    for testing stuff out. It's kind of the opposite of the ReadStringFromId function =).
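
    The same search order as a small Python sketch (not JASS), with a plain list standing in for the string table. The table contents are invented, and the cache is keyed by the string itself rather than by StringHash.

```python
def find_string_id(table, target, start, cache):
    """Search backwards from start down to id 1, then forwards from start + 1."""
    if target in cache:
        return cache[target]
    for i in range(start, 0, -1):            # start, start-1, ..., 1
        if table[i] == target:
            cache[target] = i
            return i
    for i in range(start + 1, len(table)):   # the JASS version has no upper bound
        if table[i] == target:
            cache[target] = i
            return i
    return None

# Index 0 is unused in this toy table, mirroring the loop that stops at 0.
table = [None, "init", "print", "hello world =)", "result_integer"]
cache = {}
print(find_string_id(table, "print", start=3, cache=cache))           # 2
print(find_string_id(table, "result_integer", start=3, cache=cache))  # 4
```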

    Yes, I guess.

    After tinkering around, it seems that registers can be of type 0x02 ("any"):
    Code (vJASS):

        set l__bc_0001[X()] = 0x0C010200 // R[1].any = 1
        set l__bc_0001[X()] = 0x00000001

        set l__bc_0001[X()] = 0x11010000 // var[result_integer] = R[1]
        set l__bc_0001[X()] = stid_from_handle("result_integer")

        set l__bc_0001[X()] = 0x11010000  // var[result_real] = R[1]
        set l__bc_0001[X()] = stid_from_handle("result_real")

        set l__bc_0001[X()] = 0x11010000  // var[result_string] = R[1]
        set l__bc_0001[X()] = stid_from_handle("result_string")

        set l__bc_0001[X()] = 0x11010000  // var[result_bool] = R[1]
        set l__bc_0001[X()] = stid_from_handle("result_bool")

        set l__bc_0001[X()] = 0x27000000
        set l__bc_0001[X()] = 0x00000000

    ...

    call BJDebugMsg(I2S(result_integer)) // 1
    call BJDebugMsg(R2S(result_real)) // 0.00000000... (0x00000001 is a very tiny real)
    call BJDebugMsg(result_string) // "" == StringHandleTable[1]
    call BJDebugMsg(bool2str(result_bool)) // "true"
     


    It doesn't seem that useful, because one can typecast a register by assigning its value to
    a global variable of the wanted type and then reading from it with the 0x0E (getvar) instruction,
    which doesn't type check:
    Code (vJASS):

        local integer ri = stid_from_handle("result_integer")
        set l__bc_0001[X()] = 0x0C010400 // R[1].integer = 0x3F800000
        set l__bc_0001[X()] = 0x3F800000
        set l__bc_0001[X()] = 0x11010000 // var[result_integer] = R[1]
        set l__bc_0001[X()] = ri
        set l__bc_0001[X()] = 0x0E010500 // R[1].real = var[result_integer]; R[1] changed its type from 0x04 (integer) to 0x05 (real)
        set l__bc_0001[X()] = ri
        set l__bc_0001[X()] = 0x11010000 // var[result_real] = R[1] = 1.0
        set l__bc_0001[X()] = stid_from_handle("result_real")
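
    The bit patterns can be checked outside the game. A short Python sketch (not JASS), assuming reals are IEEE 754 single-precision floats, which is what the 0x3F800000 to 1.0 result suggests:

```python
import struct

def bits_to_real(bits):
    """Reinterpret a 32-bit integer pattern as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

print(bits_to_real(0x3F800000))  # 1.0
print(bits_to_real(0x00000001))  # about 1.4e-45, the smallest positive denormal
```

    That second value is why the "any" test above printed 0.00000000 for result_real.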
     



    I see what you mean now, and I agree.

    The jump instructions (0x02, 0x29, 0x2A, 0x2B) work with label ids which seem to live in a table
    but the table can't be modified (unless one has write access I suppose).
     
  20. leandrotp

    The game doesn't do any lookup. It doesn't need to do it. String Ids are only used by bytecode, and bytecode is only generated at the compilation stage. As the map script is read by the compiler, strings are inserted into the table, and the returned id goes to bytecode. There's no mechanism to retrieve the id of a specific string except by linear search.

    But as you know, these ids are sequential, so in theory you can determine the id of all strings in the script even before the game runs. You can also obtain the ids by reading the bytecode of a function, as you have all been doing now. But for strings generated at runtime, there's no way other than searching.

    Maybe someday we could have a tool like Jasshelper with a macro that returns the string id of a name at compile time. This could help a lot to work with bytecode.

    It's strange, I swear I remember having it working in the past, but now it's crashing for me too. I'll remove that part from the post for now.

    So if you really need an alternative for "Permanent Typecast" the only way I can think of is to use bytecode itself to read an array as integer. Obviously this requires you to first have a working bytecode array using the permanent method, then you can use that array to run some code and obtain the struct addresses of other arrays, through the GETVAR instruction.

    Also, just a tip on your "hello world" snippet: you don't need to make a wrapper for BJDebugMsg, you can call it directly from bytecode. It's even easier because you already know its id, you dumped the string table yourself, and the strings from common.j and blizzard.j will always have those ids, so it's easier to use them when possible.

    Another tip: you can use bj_forLoopAIndex and bj_forLoopBIndex to transfer data between the JASS world and bytecode. You even have SetForLoopIndexA and SetForLoopIndexB as wrappers, which can also be very useful. Since function arguments are not typesafe, you can PUSH a value of any type into these functions.

    Yes, the variable table, just like the function and native tables, is a hashtable internally. The names can be arbitrary, they can contain spaces and special characters, and can even conflict with already existing variables! But if you do that, the already existing variable will be lost forever. It will be like a leak: it's still in memory, but there will be no way to access it.

    My memory library takes advantage of this to unlock mem-reading: basically my bytecode declares a new global called "Memory", which overrides the already existing global array with this name. This way I can point it to any address, and when regular jass code tries to use the Memory[] array, it will actually read the new global and use that address as an Array Struct. In that struct the size of the array is a very big number, and pData is 0 - this allows access to the entire address space of the process.

    But you can't write to that array because the newly declared global is not a true array, but just an integer. So the type-checking fails when you try to write. Before patch 1.27b, I didn't need to declare a new global, as I could tamper with the already existing array directly. But this has been fixed for good.


    Yes, that's one of my ideas, though it would require a good system to allocate memory for the instructions and recycle it later when the closure is no longer needed. But it's definitely a possibility.

    This type is only used when storing a null value into a handle-type variable. It's used to signal the VM that the value is not reference-counted. When you write a value of type 0x07 to a variable, the VM increases its ref count, and if the previously stored value also had type 0x07, its ref count is decreased too. But none of this happens for values of type 0x02.
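
    A toy Python model (not JASS, and certainly not the VM's real code) of that bookkeeping. It assumes that overwriting a variable which currently holds a 0x07 value releases that value even when the new value is a 0x02 null; the class and the handle id are invented.

```python
class ToyVariable:
    """Toy model of a handle variable; refcounts live in a shared dict."""
    ANY, HANDLE = 0x02, 0x07

    def __init__(self, refcounts):
        self.refcounts = refcounts
        self.type = self.ANY
        self.value = 0

    def store(self, type_byte, value):
        if self.type == self.HANDLE:            # release the old handle
            self.refcounts[self.value] -= 1
        if type_byte == self.HANDLE:            # retain the new handle
            self.refcounts[value] = self.refcounts.get(value, 0) + 1
        self.type, self.value = type_byte, value

refs = {}
u = ToyVariable(refs)
u.store(ToyVariable.HANDLE, 0x100001)   # set u = <some unit>
print(refs[0x100001])                   # 1
u.store(ToyVariable.ANY, 0)             # set u = null
print(refs[0x100001])                   # 0
```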

    You can't modify it but you can overflow it. This table is linear and the VM has no bounds checking when accessing it. So we can produce a table of our own, using a normal Jass array, then calculate the difference between the array address and the start of the original table, and use that offset as a label. I'm currently writing the API for this.
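
    The offset calculation would look roughly like this in Python (not JASS). The 4-byte entry size and both addresses are assumptions for the sake of the example; the real values have to be read from memory at runtime.

```python
ENTRY_SIZE = 4  # assumption: one 4-byte entry per label

def fake_label(table_start, array_data_address, index=0):
    """Label id that makes an unchecked table lookup land inside our array."""
    delta = array_data_address + index * ENTRY_SIZE - table_start
    if delta % ENTRY_SIZE != 0:
        raise ValueError("array is not aligned with the table entries")
    return delta // ENTRY_SIZE

# Hypothetical addresses; real ones would be read from memory at runtime.
table_start = 0x05000000
array_data = 0x05004000
label = fake_label(table_start, array_data, index=2)
print(label)                                               # 4098
print(table_start + label * ENTRY_SIZE == array_data + 8)  # True
```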
     
    Last edited: May 3, 2017