While people are discussing whether WC3 is going to die, I, on the other hand, want to give it a new life.
The Bytecode compiler - compiling the map script directly to VM instructions
When I first discovered how to typecast in the latest WC3 patch, the first thing that came to my mind was to use I2C and run JASS bytecode from arrays. And I was very happy to find that it works, but then I went a step further and revived the old Memory exploit from 1.23b.
After that, I just gave up on the dream of running a full map script entirely from VM bytecode, because Blizzard would certainly remove typecasting once again, and all the ideas I had at that time would simply not work in future patches.
But to everyone's surprise, Blizzard fixed only the real vulnerability in the VM, without removing the ability to typecast values. This means that typecasting and bytecode execution are now officially supported by them, and are expected to work in all future patches! So my dream is more alive than ever!
Why bytecode?
The JASS language is a very basic and simple scripting language. It doesn't have many features: it lacks OOP and many other things that are essential for advanced development.
To address this gap, many tools have been developed over the years to implement new features and extend the functionality of the language. JASS has been extended to vJASS, and new languages like Zinc and Wurst have also been developed.
But all these tools have one thing in common: internally they all compile your code to basic JASS script, which means that all of their extended capabilities must be implemented somehow with the existing features of basic JASS. Structs are implemented with arrays, and dynamic code execution is implemented with triggers. There isn't much more they can do, as they are limited by the underlying JASS engine.
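To make the "structs are implemented with arrays" point concrete, here is a minimal Python sketch of the general technique: every field becomes a parallel array, and an "instance" is nothing more than an integer index into those arrays. The class and method names (PointPool, allocate, destroy) and the capacity are made up for illustration; this is not the code that vJASS or any other tool actually emits.

```python
class PointPool:
    """Hypothetical lowering of a struct with fields x and y to parallel arrays."""

    def __init__(self, capacity=16):
        # One flat array per field; index 0 is reserved as the "null" instance.
        self.x = [0.0] * capacity
        self.y = [0.0] * capacity
        self.capacity = capacity
        self.free = []      # handles that were destroyed and can be reused
        self.next_id = 1    # next handle that has never been used

    def allocate(self):
        """Return an integer handle that stands in for a struct instance."""
        if self.free:
            return self.free.pop()
        if self.next_id >= self.capacity:
            return 0        # no slots left: hand back the null instance
        handle = self.next_id
        self.next_id += 1
        return handle

    def destroy(self, handle):
        """Give the slot back so a later allocate() can reuse it."""
        self.free.append(handle)


pool = PointPool()
p = pool.allocate()     # p is just an integer
pool.x[p] = 3.0         # "p.x = 3.0" in the high-level language
pool.y[p] = 4.0         # "p.y = 4.0"
```

Every field access goes through an array lookup, which is exactly the kind of indirection a direct-to-bytecode compiler could hope to avoid.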
Now what if we could remove all those limitations? Compile our code directly to the VM that runs behind the scenes? Working with pointers and memory, accessing the VM registers and even calling code variables directly, without the overhead of creating a new VM instance every time...
Just as C/C++ code is compiled to x86 machine instructions, we can have a development tool that compiles directly to VM bytecode, unlocking the full potential of the JASS VM in a way never seen before.
Benefits of executing bytecode
Ultimate Code Protection
Many WC3 maps use some form of code protection to prevent people from seeing/modifying their code. Be it to prevent cheating, or to prevent someone's work from being stolen, code protection tools have been developed and widely used by map makers.
But there is no protection tool that can hide the source code of a map completely. They all rely on obfuscation methods, such as renaming variables and functions, which make the code difficult, but not impossible, to read and understand.
With bytecode, this is different. When we compile directly to VM instructions, the original source code is destroyed. All that's available to the end user is the compiled bytecode. It's not readable by humans, and even if someone developed a "JASS Disassembler", a tool that could translate the bytecode into a human-readable representation, the result would still be low-level code. There's no way to recover the original source from it.
Dramatic Speed Increase
The WC3 internal JASS compiler, which translates the map script into bytecode at runtime, is very inefficient. It produces too much overhead, generates many unnecessary instructions, and doesn't expose the full power of the VM.
If we can create the compiled bytecode ourselves, we will be able to unleash the full power of the engine. Not only will there be no more overhead, but we can also make optimizations like assigning variables directly to VM registers, and inlining constants directly in the code.
You may think that speed is not a real concern these days, as PCs are much more powerful than before, but this has the potential to dramatically reduce the loading times of maps, as well as save a lot of processing power (and consequently, battery power too).
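As a rough illustration of constant inlining, the Python sketch below rewrites a toy instruction list so that loads of known constants become immediate loads. The instruction names (LOAD_GLOBAL, LOAD_CONST, ADD), the register names and the constant MAX_HEROES are all hypothetical; they are not the real JASS VM opcodes, which this post does not describe.

```python
def inline_constants(program, constants):
    """Replace loads of known constant globals with immediate loads.

    program   -- list of (opcode, destination, operand) tuples (toy format)
    constants -- dict mapping constant names to their compile-time values
    """
    result = []
    for op, dest, operand in program:
        if op == "LOAD_GLOBAL" and operand in constants:
            # The value is known at compile time, so no global lookup is needed.
            result.append(("LOAD_CONST", dest, constants[operand]))
        else:
            result.append((op, dest, operand))
    return result


constants = {"MAX_HEROES": 5}                # hypothetical constant
program = [
    ("LOAD_GLOBAL", "r0", "MAX_HEROES"),     # constant: can be inlined
    ("LOAD_GLOBAL", "r1", "udg_counter"),    # real variable: must stay
    ("ADD", "r0", "r1"),
]
optimized = inline_constants(program, constants)
```

After this pass the constant no longer needs a declaration or an initialization step at all, since its value lives inside the instruction stream.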
No More LEAKS!!!
Sounds like a dream come true, doesn't it? The allocation, assignment, and destruction of local variables are costly operations: the game needs to compute a hash of the variable's name every time you use it, and you even need to manually clean up all your variables at the end of functions if you don't want to produce leaks.
But if we run our entire map from bytecode, we won't use JASS local variables anymore! We will do what every machine compiler already does: assign VM registers to all local variables used in the code. Not only will this result in an incredible speed increase, but there will also be no need to null those variables anymore, since registers are always cleaned up at the end of execution.
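The idea of giving each local variable a register can be sketched in a few lines of Python. This is a deliberately naive version that hands out one register per distinct local and assumes there are always enough registers; a real allocator would need liveness analysis, and how many registers the JASS VM actually offers is not stated in this post. The instruction format and the function names are hypothetical.

```python
def assign_registers(local_names, first_register=0):
    """Map each distinct local variable name to its own register number."""
    mapping = {}
    for name in local_names:
        if name not in mapping:
            mapping[name] = first_register + len(mapping)
    return mapping


def rewrite(instructions, mapping):
    """Replace variable names in operands with register references like 'r1'."""
    out = []
    for op, *operands in instructions:
        new_operands = [f"r{mapping[o]}" if o in mapping else o for o in operands]
        out.append((op, *new_operands))
    return out


# Locals declared by a hypothetical function, in declaration order.
mapping = assign_registers(["u", "x", "u"])
code = rewrite([("MOV", "x", 5), ("ADD", "x", "u")], mapping)
```

Because the names disappear at compile time, nothing is hashed by name at runtime and there is no named variable left to null.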
Direct memory access
Through the use of type-casting, it's possible to achieve read-only memory access from the regular JASS script. But with bytecode, we can also allocate new blocks of memory, and write data to them (we can only write to memory that we allocated ourselves). This unlocks the potential of unlimited data storage, which could have some new applications not yet researched.
Dynamic code generation/execution
As demonstrated here, it's possible to generate new chunks of bytecode at runtime, write them to an array and execute them. If we run the entire map from bytecode, this process becomes much easier, as we can dynamically allocate memory for these new code blocks, and even call them DIRECTLY, without the use of triggers, as well as being able to JUMP or CALL any part of the code with a single VM instruction.
The implementation
Basically this tool will work as an external compiler, which could possibly be integrated into Jass NewGen Pack, and maybe even replace Jasshelper completely. It would work similarly to WurstScript, which has a compiler and a Standard Library that provides some utility functions. This library would include things like APIs for memory allocation and bytecode execution, plus some other useful stuff.
The resulting map will also have custom common.j and Blizzard.j scripts imported. Common.j will contain only the native declarations; all constants will be removed from the script. Because the JASS VM wastes time initializing constants, we will remove all the constant declarations and inline their values directly in the bytecode.
Blizzard.j will also be nearly empty. Constants will be removed, and all those unnecessary BJ functions will also be stripped. Only the APIs that are actually referenced by the map script will be kept, but not in that file. Instead they will be reimplemented in pure bytecode, and compiled together with the main map script.
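Keeping "only the APIs that are actually referenced" boils down to a reachability walk over the call graph. Here is a minimal Python sketch of that step; the function names in the example graph are invented, and building the call graph from real JASS source is assumed to have happened already.

```python
def reachable_functions(call_graph, entry_points):
    """Return the set of functions reachable from the given entry points.

    call_graph   -- dict mapping a function name to the names it calls
    entry_points -- names that must always be kept (e.g. the map's main)
    """
    seen = set()
    stack = list(entry_points)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        # Functions missing from the graph are treated as having no callees.
        stack.extend(call_graph.get(name, ()))
    return seen


call_graph = {                       # hypothetical map script
    "main": ["InitHeroes", "HelperA"],
    "InitHeroes": ["HelperB"],
    "HelperA": [],
    "HelperB": ["HelperA"],
    "UnusedHelper": ["HelperA"],     # never called, so it can be stripped
}
kept = reachable_functions(call_graph, ["main"])
```

One thing to watch out for: functions that are only referenced as code values (callbacks passed to triggers, timers and so on) have to be counted as edges too, or they would be stripped by mistake.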
Then the map script file (war3map.j) just needs to have an initialization stub. All this code does is initialize the bytecode array and execute it. It may also contain string literals, since strings must be initialized before bytecode can use them. But it's also possible to turn all strings into GetObjectName calls, like some map protectors already do, and then GetObjectName can be called directly from bytecode.
Notice that the compiled bytecode itself doesn't need to be inserted into the map script. It can also be loaded from a custom file inside the map MPQ! In that case it needs to be pre-processed to generate a Jump Table and patch CALL/JMP instructions, and after that it will be copied to a normal JASS array. All this work will be done by the initialization stub.
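The pre-processing step (generate a Jump Table, then patch CALL/JMP instructions) is the classic two-pass approach used by assemblers. The Python sketch below shows it on a toy program; the opcode names other than CALL and JMP, the tuple format and the use of instruction indices as addresses are assumptions for illustration, not the real bytecode layout.

```python
def build_jump_table(program):
    """Pass 1: record the index of the instruction that follows each label."""
    table = {}
    index = 0
    for op, arg in program:
        if op == "LABEL":
            table[arg] = index      # labels occupy no space in the output
        else:
            index += 1
    return table


def patch_jumps(program, table):
    """Pass 2: drop labels and replace symbolic targets with real indices."""
    patched = []
    for op, arg in program:
        if op == "LABEL":
            continue
        if op in ("JMP", "CALL"):
            patched.append((op, table[arg]))
        else:
            patched.append((op, arg))
    return patched


program = [                         # toy program with symbolic targets
    ("CALL", "init"),
    ("JMP", "end"),
    ("LABEL", "init"),
    ("PUSH", 1),
    ("RET", None),
    ("LABEL", "end"),
    ("HALT", None),
]
table = build_jump_table(program)
patched = patch_jumps(program, table)
```

Two passes are needed because a jump can point forward to a label that hasn't been seen yet, as the JMP to "end" does here.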
Conclusion
This thread's intention is to present the idea and concept of running an entire map from pre-compiled bytecode. I am posting it because this is a big project, and it's gonna take a really long time to see the light of day, if ever.
Since I lack the time to develop all these things just by myself, I'm making it public so that people can have their own ideas. Soon I will be posting internal details about how the JASS VM works, as well as a detailed explanation of every bytecode instruction of the VM.
I'd like very much to know your opinion about this. I have never written a compiler before, so I'd like to hear some ideas. From which language should I be generating code? vJass, Wurst, maybe even Lua? Would you like some new language features like a switch case, or inline functions? If you have any questions or suggestions, feel free to post here.