So first, some function updates and 2 new functions. I did this to drastically increase performance and add some new functionality ^)^.
Functions
JASS:
function Bits2String32 takes integer num returns string
local string str0 = "00000000000000000000000000000000"
local string str = ""
local integer bit0 = 0
if (num < 0) then
set bit0 = 1
set num = -2147483648 + num
endif
loop
exitwhen 0 == num
set str = I2S(num - num/2*2) + str
set num = num/2
endloop
return I2S(bit0) + SubString(str0, 0, 31 - StringLength(str)) + str
endfunction
function Bits2String takes integer num returns string
local string str0 = "00000000000000000000000000000000"
local string str = ""
local integer bit0 = 0
if (num < 0) then
set bit0 = 1
set num = -2147483648 + num
endif
loop
exitwhen 0 == num
set str = I2S(num - num/2*2) + str
set num = num/2
endloop
if (0 == bit0) then
return str
endif
return I2S(bit0) + SubString(str0, 0, 31 - StringLength(str)) + str
endfunction
function GetBitSize takes integer bits returns integer
local integer low = 0
local integer high = 31
local integer mid = 15
if (0 == bits) then
return 0
elseif (0 > bits) then
return 32
endif
loop
if (bits < udg_powArr[mid - 1]) then
set high = mid - 1
else
set low = mid + 1
endif
set mid = (high + low)/2
exitwhen high < low
endloop
return mid
endfunction
function GetFreeBits takes integer bits returns integer
return 32 - GetBitSize(bits)
endfunction
function WriteBits takes integer bits, integer bitsToWrite, integer bitsToOccupy returns integer
return bits*udg_powArr[bitsToOccupy] + bitsToWrite
endfunction
function ReadBitsBack takes integer bits, integer bitCount returns integer
if (bits < 0) then
return (bits - 2147483648)/udg_powArr[32 - bitCount] + udg_powArr[bitCount - 1]
else
return bits/udg_powArr[32 - bitCount]
endif
endfunction
function ReadBitsFront takes integer bits, integer bitCount returns integer
if (bits < 0) then
if (bitCount == 32) then
return bits
endif
set bits = bits - 2147483648
endif
return bits - bits/udg_powArr[bitCount]*udg_powArr[bitCount]
endfunction
function PopBitsFront takes integer bits, integer bitsToPop returns integer
if (bits < 0) then
return (bits - 2147483648)/udg_powArr[bitsToPop] + udg_powArr[31 - bitsToPop]
else
return bits/udg_powArr[bitsToPop]
endif
endfunction
function PopBitsBack takes integer bits, integer bitsToPop returns integer
if (bits < 0) then
if (bitsToPop == 0) then
return bits
endif
set bits = bits - 2147483648
return bits - bits/udg_powArr[32 - bitsToPop]*udg_powArr[32 - bitsToPop]
else
return bits - bits/udg_powArr[31 - bitsToPop]*udg_powArr[31 - bitsToPop]
endif
endfunction
function ReadBits takes integer bits, integer bitStart, integer bitEnd returns integer
return ReadBitsFront(PopBitsFront(bits, bitStart), bitEnd - bitStart + 1)
endfunction
function GetBitNumber takes integer bits returns integer
return udg_powArr[bits - 1]
endfunction
function ShiftLeft takes integer bits, integer shift returns integer
return bits*udg_powArr[shift]
endfunction
function ShiftRight takes integer bits, integer shift returns integer
return PopBitsFront(bits, shift)
endfunction
function RotateLeft takes integer bits, integer shift returns integer
return ShiftLeft(bits, shift) + ReadBits(bits, 0, shift - 1)
endfunction
function RotateRight takes integer bits, integer shift returns integer
return ShiftRight(bits, shift) + ReadBits(bits, 32 - shift, 31)*udg_powArr[32 - shift]
endfunction
Init
JASS:
function InitTrig_Bitmanip_library takes nothing returns nothing
local integer i = 1
set udg_powArr[0] = 1
loop
set udg_powArr[i] = udg_powArr[i - 1]*2
set i = i + 1
exitwhen i == 32
endloop
set udg_powArr[32] = -2147483648
endfunction
I also changed a couple of functions around. The front is always the right side of the number. The back is always the left side. You read it from right to left, always.
The new functions are
JASS:
function ReadBitsFront takes integer bits, integer bitCount returns integer
function ReadBitsBack takes integer bits, integer bitCount returns integer
The fast functions are
JASS:
function WriteBits takes integer bits, integer bitsToWrite, integer bitsToOccupy returns integer
function ReadBitsBack takes integer bits, integer bitCount returns integer
function PopBitsFront takes integer bits, integer bitsToPop returns integer
ReadBitsFront just reads the front x bits. ReadBitsBack reads the back x bits.
Here is a fully working example that supports a single 32-bit integer.
If you want something that's really usable, you're going to have to have an array of 32-bit integers that you write to. If you only have 2 bits left or something and want to write a value that takes 6 bits, write 2 bits to your first integer, then get a new integer and write the remaining 4 bits to it. When you read back, read out the 4 bits and then read out the next 2 bits and put them together.
JASS:
function Trig_Untitled_Trigger_001_Actions takes nothing returns nothing
local integer maxGold = GetBitSize(75000)
local integer maxLevel = GetBitSize(60)
local integer maxAr = GetBitSize(95)
local integer currentGold = 72000
local integer currentLevel = 55
local integer currentAr = 80
local integer data = 0
local integer v1
local integer v2
local integer v3
local string ALPHABET = "123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+="
local string array char
local string s
local hashtable table = InitHashtable()
local string encoded
local integer i
local integer i2
local integer stringHash
call DestroyTimer(GetExpiredTimer())
/*
* initialize alphabet
*/
set i = StringLength(ALPHABET)
loop
set i = i - 1
set s = SubString(ALPHABET, i, i + 1)
set char[i] = s
set stringHash = StringHash(s)
set i2 = 0
loop
exitwhen not HaveSavedInteger(table, stringHash, i2)
set i2 = i2 + 1
endloop
call SaveInteger(table, stringHash, i2, i)
exitwhen 0 == i
endloop
/*
* write values to 32-bit integer
*/
set data = WriteBits(data, currentGold, maxGold)
set data = WriteBits(data, currentLevel, maxLevel)
set data = WriteBits(data, currentAr, maxAr)
/*
* display raw data
*/
call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 60000, "Raw: " + Bits2String32(data))
call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 60000, "Data: " + I2S(data))
/*
* store bits into characters, 6 bits per character
*/
set encoded = ""
loop
exitwhen 0 == data
set encoded = encoded + char[ReadBitsFront(data, 6)]
set data = PopBitsFront(data, 6)
endloop
/*
* output characters holding bits
*/
call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 60000, "Code: " + encoded)
/*
* put bits back into 32-bit integer
*/
set i = StringLength(encoded)
loop
exitwhen 0 == i
set i = i - 1
set s = SubString(encoded, i, i + 1)
set stringHash = StringHash(s)
set i2 = 0
loop
exitwhen char[LoadInteger(table, stringHash, i2)] == s
set i2 = i2 + 1
endloop
set data = WriteBits(data, LoadInteger(table, stringHash, i2), 6)
endloop
/*
* output data
*/
call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 60000, "Raw: " + Bits2String32(data))
call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 60000, "Data: " + I2S(data))
set v3 = ReadBitsFront(data, maxAr)
set data = PopBitsFront(data, maxAr)
set v2 = ReadBitsFront(data, maxLevel)
set data = PopBitsFront(data, maxLevel)
set v1 = ReadBitsFront(data, maxGold)
set data = PopBitsFront(data, maxGold)
/*
* output written values
*/
call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 60000, I2S(v1) + " == " + I2S(currentGold))
call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 60000, I2S(v2) + " == " + I2S(currentLevel))
call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 60000, I2S(v3) + " == " + I2S(currentAr))
endfunction
//===========================================================================
function InitTrig_Test takes nothing returns nothing
call TimerStart(CreateTimer(), 0, false, function Trig_Untitled_Trigger_001_Actions)
endfunction
Reading values back out of the number may seem confusing to you. Let me explain what's going on.
Imagine that this is a 32 bit number
[0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0]
When we write something, like 1001 to it, it does it like this
[0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [0] [0] [1]
If we were to write 111 to the above
[0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [0] [0] [1] [1] [1] [1]
etc, it's a stack (if you know what that is)
The actual array indices are the following
31-30--29-28-27--26-25--24-23-22--21-20--19-18-17--16-15-14-13--12--11-10-09--08-07-06--05-04--03-02-01--00
[0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0]
When you write to the array, it pushes all of the elements currently in it over to the left to make room for what you're writing, then it puts in the new data.
Writing 111 again
[0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [0] [0] [1]
[0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [0] [0] [1] [0] [0] [0] <-
[0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [0] [0] [1] [1] [1] [1]
This means that when you read your values out, you have to read it in reverse order from how you wrote it.
If you wrote 1, 2, 3, then you must read 3, 2, 1.
When you shift something too far left or too far right, it gets dropped (these are the pops)
Let's say that we want to read the 111 and 1001 back
[0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [0] [0] [1] [1] [1] [1]
Well, 111 is at the front of the array, so that's easy
Now we shift right
[0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [0] [0] [1] | 1 1 1
Notice that the 1 1 1 fell off the edge. It gets dropped. This is all that's left.
[0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [0] [0] [1]
The BitManip functions let you read the last x bits and pop the last x bits too. If you were to have this
[1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1]
If you popped the left 3 bits
[0] [0] [0] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1]
This is different from a left shift. A left shift would do this
[1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [0] [0] [0]
The top 3 bits were dropped like before, but everything moved to the left 3. In the pop version, the last 3 bits were just dropped.
The pop function in BitManip pushes everything as right as it can go. If you pop at the right, it does right shift. If you pop at the left, it does nothing.
Integers in warcaft 3 are signed, meaning that reading from left to write is much more expensive than from right to left. You want that last bit to be filled with a 0 as much as possible. When it's filled with a 1, it's hell to get the one out. Here's what happens in a right shift with the following assuming that the highest bit (leftmost) is signed.
[1] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0]
Let's shift right 3
[1] [1] [1] [1] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0]
That's right, it fills the entire number up with 1's. The only way to get rid of the dern'd thing is to specifically remove it.
Anyways, hopefully you understand a bit more about numbers ^)^. This is true for all bases btw, not just binary. However, a change of base is some pretty ugly math. When you have several digits in your number from several different bases, you have to do that ugly math. As long as you only work in one base, the math is simple. Binary is the smallest base that exists, so you'll minimize the fluff in your code by using it. Now, of course, you can go with a set of arbitrary bases, but then we get into ugly math ^)^.
One cool thing I did in the past was a library called Scrambler. It just changed the base of a number over and over again, mixing up the digits each time. Resulted in seemingly random numbers that varied in even their lengths o-o. The reason numbers can vary in lengths as a result of this is of course the fractions.
Remember values like 1001? They are stored in fractions of bits.
Suppose you have 10011001
If you rearrange them just right, you can get
00001111
Which is just 1111. That is half the size of the original.
That is the number 153 turned into 15
If you keep changing bases, going from like base 2 to base 3 to base 5, etc, you can end up with some really interesting results.
1111 is 120 in ternary. Let's rearrange again to give us 012. This gives us 101 in binary, rearranged to 011 we get 10 in ternary, and so forth. We can eventually go down to just 1.
As we change between bases, we get more and more fractions. It's possible to reduce a number from 153 to 1. However, we should never drop the leading place, or we lose information (the placeholders). Just showing that it's possible.
This is more of a reality
10011001 -> 10000111 -> 12000 -> 10002 -> 1010011 -> 1000111 -> 2122 -> 241 (base 5) -> 214 -> 111011 -> 101111 -> 1202 -> 1022 -> 35
That is as packed as the number can get while losing 0 information. If we know our sequence to get back, we can then unscramble it and get all of our information back.
Once again, 153 -> 35.
Of course, numbers can also get bigger. If we scramble in the other direction to maximize the size of the number, we can figure out the range that 153, when scrambled, can become.
In binary, our number is 100011
Compare
10011001
100011
Cut off 2 full bits. This is still still has fractions of a bit in it, but notice that the fraction is very, very small. We know this because it is extremely close to a power of 2
100011
100000
The fraction is higher in ternay
1022
But there's nothing we can do to eliminate it. The number is too small to go to a larger prime base like 5, 7, or 11 to try and find more stuff that can be dropped out. The bigger the number, the bigger the range it has.
Oh well, that's just Scrambler ^)^.
If you want to see the fractions for yourself
ln(number - 1)/ln(base)
This will give you how many digits the number in base 10 will take up in a target base.
If we look back at 153, it takes up 7.257 digits in binary. 0b10000111 takes up 7.077 digits.
The closer you are to a whole number (not rounding down, rounding up), the better. If you are at like 7.99, you've practically maxed out capacity.
Now, this of course refers to bases. For example, 8 returns 3, but it actually takes up 4 binary digits. What this actually means is that a base 8 digit in binary will take up exactly 3 digits. Recall that base 8 goes from 0 to 7.
You can use this to determine how well your max values fit into binary (max value + 1), or any base for that matter. For example, a max value of 75,000 (75,001) will take up 16.19 binary digits, which is pretty bad. In base 84 (you were using this earlier right?), it would take up 2.53 digits.That seems like less wasted space, but let's convert that .47 in base 84 to binary to see how much extra space is wasted. .47*ln(84)/ln(2) = 3.004389 digits, which is
way more than .81 digits from binary. Now keep in mind that a binary digit is small, these digits add up and generate a lot of code bloat. Certainly, fractions will still add up, but you may only be adding 3-5 binary digits over all instead of 20-40.
Let's see what would happen with arbitrary bases now.
75000
300
200
100
let's assume all at max
This gives us a value of 458301185600 when all packed together. In base 84, this is 6.06 digits.
Now using only binary, we get 1258301056100
458301185600
1258301056100
already we can see a bit of bloat as compared to the arbitrary base method. In base 84, it's 6.28795656 digits
6.06
6.29
So our code won't have gone up at all, but we do have some more empty space in there. Now for the final method, the method you were using.
75000 = 2.533 digits = 3 digits (must round up)
300 = 1.288 = 2 digits
200 = 1.1969 = 2 digits
100 = 1.04 = 2 digits
3 + 2 + 2 + 2 = 9 digits
So
Arbitrary = 6.06 = 7 digits
Binary = 6.29 = 7 digits
Naive = 9.00
So the naive method made the code 2 characters longer.
I hope you can see why it might be worth it to go for, at the bare minimum, binary ; P. Binary also normally outputs in base 64, so the reality is...
Binary = 6.699 = 7 digits
Binary will probably be 1-2 characters higher than arbitrary as the code grows in size. Naive will be
way higher than both of them.
The formula, once again, is Ln(max value + 1)/Ln(base), or Ln(base1)/Ln(base2). You can use this equation to calculate the efficiency of your codes. There is an Ln function in the JASS section.
Also, if using arbitrary bases has 0 space in between the values, why was it 6.06 instead of 6? The space is at the very end of the number, when converting to base 84. Arbitrary bases won't necessarily fit nicely into the output base. That's where your space comes from. Arbitrary will never go above 1 extra character. They have the capability of going 0 extra characters, but that's terribly rare. It'll pretty much always be 1 extra character that has mostly space in it.
The space in binary and naive are spread between each of the values. This space grows as you add more values. Arbitrary has 0 space between the values, so the space won't grow, the space is only a result of the final output not being a power of the target base. This is why I personally use arbitrary. However, binary does have minimal space, so if you are going to stick with only 1 base, go with binary for sure.
Btw, base 128 will only drop binary to 5.74 digits. The benefit of increasing the size of your base goes down as the base grows in size. Base 256 is 5.02 digits. Base 1024 is 4.02 digits. Arbitrary in base 1024 is 3.87 digits. Naive in base 1024 is 5 digits, most of which is empty space. Why did the gap between naive and arbitrary lower?
In base 301
Arbitrary = 4.71 digits = 5 digits
Binary = 4.88 digits = 5 digits
Naive = 6 digits
The naive approach with this particular set of values is pretty much maximized at base 301. As it goes farther away from base 301, it will diverge from the other methods. Let's say that the base is now 75,001.
Arbitrary = 2.39 digits = 3 digits
Binary = 2.48 digits = 3 digits
Naive = 4 digits
The very best naive is ever going to get to is 4 digits. The others will continue to drop.
There is another form of naive that tries to pack as much data as possible into a single value (Acehart's does 1,000,000) and then converts that value into a character. This reduces space a bit, but is still way worse than binary.
what about base 1,000,000?
Arbitrary = 1.94 digits = 2 digits
Binary = 2.01 digits = 3 digits
Naive = 4 digits
If you would like to use the arbitrary base method instead of binary, lemme know ;D.