• 🏆 Texturing Contest #33 is OPEN! Contestants must re-texture a SD unit model found in-game (Warcraft 3 Classic), recreating the unit into a peaceful NPC version. 🔗Click here to enter!

Any good small checksum algorithm?

Status
Not open for further replies.
Level 31
Joined
Jul 10, 2007
Messages
6,306
Anyone know of a pretty secure checksum algorithm that has a small size?

This is the one I came up with: (showing an example)

123456
Code:
Split into rows

1
2
3
4
5
6

Add the front digit (front of 6 is 1)

1=3
2=5
3=7
4=9
5=11
6=7

Perform some math (*2, then add digit, *2, then add, etc. Last digit is *). Using the sums, not the digits!

((((3*2+5)*2+7)*2+9)*2+11)*2*7=2030

123465
Code:
1=3
2=5
3=7
4=10
6=11
5=6

Perform some math (*2, then add digit, *2, then add, etc. Last digit is *). Using the sums, not the digits!

((((3*2+5)*2+7)*2+10)*2+11)*2*6=1764

123455
Code:
1=3
2=5
3=7
4=9
5=10
5=6

Perform some math (*2, then add digit, *2, then add, etc. Last digit is *). Using the sums, not the digits!

((((3*2+5)*2+7)*2+9)*2+10)*2*6=1728

123454
Code:
1=3
2=5
3=7
4=9
5=9
4=5

Perform some math (*2, then add digit, *2, then add, etc. Last digit is *). Using the sums, not the digits!

((((3*2+5)*2+7)*2+9)*2+9)*2*5=1430


Seems pretty good, but it has quite a lot of collisions. If you constrict the base more, the collisions go down but the checksum goes up ;D. Binary would be 0 collisions and would result in a checksum a bit larger than the original number ^_^. Make the base bigger and collisions go up, but checksum value goes down ;P. If you add 1 to smaller sums (to avoid 0s), collisions go down. Add 2 (to avoid both 0s and 1s), and they go way down ;o (but requires base 8).

This was the test code I ran with it using BigInt (see sig)
JASS:
struct tester extends array
    private static BigInt i = 0
    private static hashtable t = InitHashtable()
    private static integer array c
    private static integer m = 0
    private static string s = ""
    
    private static method run takes nothing returns nothing
        local integer u = -1
        local integer j = 0
        local integer h = 0
        loop
            set i = i.next
            exitwhen i.end
            set u = u + 1
            set c[u] = i.digit+1
        endloop
        set j = u-1
        set h = (c[u]+c[u-1])
        loop
            exitwhen j == 0
            set j = j - 1
            set h = h*2+(c[j]+c[j+1])
        endloop
        set h = h*2*(c[0]+c[u])
        set j = LoadInteger(t, h, -1)+1
        call SaveInteger(t, h, -1, j)
        if (j > 999) then
            call PauseTimer(GetExpiredTimer())
            call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 120, I2S(h) + ": " + i.toString() + " collided " + I2S(j) + " times")
        endif
        call i.add(1,0)
    endmethod
    
    private static method onInit takes nothing returns nothing
        set i = BigInt.convertString("100", Base["0123456789"])
        call TimerStart(CreateTimer(), 0.0001, true, function thistype.run)
    endmethod
endstruct


So any suggestions on a better checksum, or is this about as good as it'll get? Yea, quite a few collisions, but it's pretty decent protection. To get 100 collisions had to go to 5360. Still a lot compared to main stream, but can ehh.

Anyone know of a more secure checksum that takes around the same space?


This is base 8 test (47061 (sum 2520) hit with 100 collisions vs 5360)

JASS:
struct tester extends array
    private static BigInt i = 0
    private static hashtable t = InitHashtable()
    private static integer array c
    private static integer m = 0
    private static string s = ""
    
    private static method run takes nothing returns nothing
        local integer u = -1
        local integer j = 0
        local integer h = 0
        local boolean flag = false
        loop
            set i = i.next
            exitwhen i.end
            set u = u + 1
            if flag then
                set c[u] = i.digit+2
                set flag = false
            else
                set c[u] = i.digit+1
                set flag = true
            endif
        endloop
        set j = u-1
        set h = (c[u]+c[u-1])
        loop
            exitwhen j == 0
            set j = j - 1
            set h = h*2+(c[j]+c[j+1])
        endloop
        set h = h*2*(c[0]+c[u])
        //call PauseTimer(GetExpiredTimer())
        //call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 60, I2S(h))
        set j = LoadInteger(t, h, -1)+1
        call SaveInteger(t, h, -1, j)
        if (j > 99) then
            call PauseTimer(GetExpiredTimer())
            call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 120, I2S(h) + ": " + i.toString() + " collided " + I2S(j) + " times")
        endif
        call i.add(1,0)
    endmethod
    
    private static method onInit takes nothing returns nothing
        set i = BigInt.convertString("100", Base["0123456789"])
        set i.base = Base["01234567"]
        call TimerStart(CreateTimer(), 0.0001, true, function thistype.run)
    endmethod
endstruct


Large value tests on second code
Code:
99227128 (sum)
123456789123456789 (value)
1N2S54 (sum in b36)
XRLS1YK9RF9 (value in b36)


12345678912345678912345 (value)
1776843290 (sum)
20DGOI10OKS0YMX (value in b36)
TDVX22 (sum in b36)

Notice that the change in rate goes down as the value grows larger. Do the collisions increase? No, because there are more combinations of checksums. As the value actually goes up, the checksum range goes up (as do the maximal collisions for any given checksum). Keep in mind that that collision test showed that in the range of 100 to 47061, only 100 values worked for the checksum of 2520.

I was wondering, again, if anyone knew a better checksum algorithm?


edit
Just made a modification.. collided 100 times at 105345 (sum 5040) (woops)

JASS:
struct tester extends array
    private static BigInt i = 0
    private static hashtable t = InitHashtable()
    private static integer array c
    private static integer m = 0
    private static string s = ""
    
    private static method run takes nothing returns nothing
        local integer u = -1
        local integer j = 0
        local integer h = 0
        local integer q = 0
        loop
            set i = i.next
            exitwhen i.end
            set u = u + 1
            set c[u] = i.digit
        endloop
        set j = u-1
        set h = (c[u]+c[u-1])
        loop
            exitwhen j == 0
            set j = j - 1
            set h = h*2+(c[j]+c[j+1])
        endloop
        set h = h*(c[0]+c[u])
        set j = 1
        set q = (c[0]+c[1])
        loop
            exitwhen j == u
            set j = j + 1
            set q = q*2+(c[j]+c[j-1])
        endloop
        set q = q*3*(c[u]+c[0])
        set h = h+q
        
        //call PauseTimer(GetExpiredTimer())
        //call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 60, I2S(h))
        set j = LoadInteger(t, h, -1)+1
        call SaveInteger(t, h, -1, j)
        if (j > 99) then
            call PauseTimer(GetExpiredTimer())
            call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 120, I2S(h) + ": " + i.toString() + " collided " + I2S(j) + " times")
        endif
        call i.add(1,0)
    endmethod
    
    private static method onInit takes nothing returns nothing
        set i = BigInt.convertString("100", Base["0123456789"])
        call TimerStart(CreateTimer(), 0.0001, true, function thistype.run)
    endmethod
endstruct

123456 sum is 7084 (much higher)

123456789123456789 sum is 65674060, actually smaller!

12345678912345678912345 -> 716807076 (smaller !!)

123-200
321-184
124-285
122-129
132-165

47061-2250
 
Last edited:
Level 31
Joined
Jul 10, 2007
Messages
6,306
Prevent tampering of save/load codes by a user, essentially prevents cheating.

If a test showed 100 collisions after hitting the number 903662, this means that for that specific checksum (the most common in that range), only 100 values worked for it out of 903662 possible values.

edit
Here is a version you guys can play with

Code:
12345678912345678912345 sum
1603902372 (7,4) (3,1)
1585028004 (7,3) (3,1)
1943640996 (9,4) (3,1)
1905892260 (9,2) (3,1)

123456789123456789
366880510 (7,4) (7,4)
151198540 (7,4) (3,1)
249483370 (7,4) (4,7)

Number after collisions >99
270341 (7,3) (3,1)
422722 (7,4) (3,1)
492524 (9,4) (3,1)
312824 (9,2) (3,1)
1140246 (7,4) (7,4)
903662 (7,4) (4,7)

Code:
(7,4) (4,7)
Checksum 11256 (first 25)

Randomness: .21/3.14159 (I'd like to jack this value up to make it closer to pi)

686
757
80510
102015
102782
104733
107182
109133
112053
159702
162622
167022
253531
294560
300004
344440
432900
437300
612000
1015171
1030371
1115702
1127931
1130209

Most common checksum (first 100) (55440)
63299
74288
80997
85277
90687
91986
96266
143459
168448
171368
180977
185377
208989
211859
216259
234368
254957
259357
262277
268966
271886
276286
294208
296695
302768
302939
307168
307339
310088
310259
317957
320877
325277
345866
353186
359875
362608
362795
367008
367195
380717
385117
401168
408866
411786
416186
436775
444095
453517
461008
468706
471626
476026
496615
502695
507095
527684
544426
559615
562535
587524
590444
618593
630935
635335
635506
653444
678433
681353
690962
695362
718924
721844
726244
726415
744353
764942
769342
772262
778951
781871
786271
809833
812753
812924
817153
817324
820073
820244
827942
830862
835262
855851
863171
869860
872780
877180
890702
895102
903662

A checksum A is more efficient than checksum B if
(collisionsA/collisionsB) > (sumA/sumB)
or
sumA < sumB and collisionA >= collisionB
or
sumA <= sumB and collisionA > collisionB

ex
(1140246/422722) > (366880510/151198540)
2.69 > 2.42

1140246/903662 < 366880510/249483370
1.26 < 1.47

Please play with this algorithm or work on your own and try to get the best possible checksum w/ the lowest possible value, in this case (7,4) (4,7) (any higher is too high).

Currently set to what I found to be the best weights for the smallest values.
JASS:
struct tester extends array
    //done before summing
    //7,4 seem to be the best combo
    private static constant integer RIGHT_WEIGHT = 7
    private static constant integer LEFT_WEIGHT = 4
    
    //done after summing
    //3,1 seem to be the best combo
    private static constant integer ABS_RIGHT_WEIGHT = 4
    private static constant integer ABS_LEFT_WEIGHT = 7
    
    private static BigInt i = 0
    private static hashtable t = InitHashtable()
    private static integer array c
    private static integer m = 0
    private static string s = ""
    
    private static method run takes nothing returns nothing
        local integer u = -1
        local integer j = 0
        local integer h = 0
        local integer q = 0
        loop
            set i = i.next
            exitwhen i.end
            set u = u + 1
            set c[u] = i.digit
        endloop
        set j = u-1
        set h = (c[u]+c[u-1])*LEFT_WEIGHT
        loop
            exitwhen j == 0
            set j = j - 1
            if (j < u-2) then
                set h = h*2+(c[j]+c[j+1])
            else
                set h = h+(c[j]+c[j+1])
            endif
        endloop
        set h = h*ABS_LEFT_WEIGHT*(c[0]+c[u])
        set j = 1
        set q = (c[0]+c[1])*RIGHT_WEIGHT
        loop
            exitwhen j == u
            set j = j + 1
            if (j > 2) then
                set q = q*2+(c[j]+c[j-1])
            else
                set q = q+(c[j]+c[j-1])
            endif
        endloop
        set q = q*ABS_RIGHT_WEIGHT*(c[u]+c[0])
        set h = h+q
        
        //call PauseTimer(GetExpiredTimer())
        //call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 60, I2S(h))
        set j = LoadInteger(t, h, -1)+1
        call SaveInteger(t, h, -1, j)
        if (j > 99) then
            call PauseTimer(GetExpiredTimer())
            call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 120, I2S(h) + ": " + i.toString() + " collided " + I2S(j) + " times")
        endif
        call i.add(1,0)
    endmethod
    
    private static method onInit takes nothing returns nothing
        set i = BigInt.convertString("100", Base["0123456789"])
        call TimerStart(CreateTimer(), 0.00001, true, function thistype.run)
    endmethod
endstruct
 
Last edited:

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,202
A good one I found was to abuse the I2S and StringHash natives. Inorder to prevent hashtable collisions it has a rather large spread from the algerthim and even a small change will cause huge value changes as far as I can tell. Thus you throw into the checksum the StringHash of player name, and every few steps you add to the checksum its StringHash of itself I2S. This ends up with basically an int of any value, which you then wrap to the required range via modulo or division (or a combination of both).

The good thing about it is the mechanics behind the StringHash native are obscured to WC3 players so one can easilly compute such a hash. Additionally the large values when converted to string and hashed make other large values which allow you to abuse integer overflow.

I am aware of the string leaks this will cause, but saving/loading should not occur regually enough for them to ever cause a problem.

Basically, the idea is to make the hash so confusing and complicated that there are too many variables for a human to ever guess it.
 
Level 31
Joined
Jul 10, 2007
Messages
6,306
Can you give me a sample range and show what number it takes for it to reach 100 collisions? Tx.

Can you also perform this test to show which is better?

A checksum A is more efficient than checksum B if
(collisionsA/collisionsB) > (sumA/sumB)
or
sumA < sumB and collisionA >= collisionB
or
sumA <= sumB and collisionA > collisionB

Tx ;D.

Can you also post some demo code?
 

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,202
Well heres the save system I made (and a few people suposidly used). I used it as it seemed a rather good for its simplicity (others are hard without bitshifts and bitwise comparators).

Do not expect anything too complicated, I made this within a few hours a while ago lol.
 

Attachments

  • savesystem.w3x
    19.3 KB · Views: 122
Level 31
Joined
Jul 10, 2007
Messages
6,306
Can you please perform the test cases though to show which checksum has a better output?

Also please be sure to test for randomness of output (how hard it is to guess the next checksum or to guess a number with the same checksum).

http://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/Randomness.htm

Using the 11256, I tested 0.26444 (the higher the better).

To test for randomness, don't test your actual data but test the differences between their values. The data should pretty much be growing as the number grows, so that won't be regarded as random, but the differences will be all over the place (hopefully).

Code:
686
71
79753
21505
767
1951
2449
1951
2920
47649
2920
4400
86509
41029
5444
49880
88460
4400
174700
403171
15200
85331
12229
2278

Looking at that, I can't really discern any pattern ;P. It's all over the place ;D.


edit
Testing your thingie. Even with the mass string leaks, I am determined. I saw the first collision at around 35k, which seems really awesome, but that might just be a shifted range. The only way to be sure is to test many collisions.

edit
All the string leaks are killing me, I can't test.

edit
Think I'll stick with mine for now cuz the string leaks kill me on yours.

And here's a new version with middle protection (weakness in old version was the middle)

It has a way less checksum collisions!

Collision: 5176761 (>99) 738331 (>24)
Sum: 249483379 (123456789123456789)

Compare to standard 7,4,4,7
249483379 (new)
249483370 (old)

5176761 (new)
903662 (old)

JASS:
globals
    //done before summing
    //7,4 seem to be the best combo
    constant integer RIGHT_WEIGHT = 7
    constant integer LEFT_WEIGHT = 4
    
    //done after summing
    //4,7 seem to be the best combo
    constant integer ABS_RIGHT_WEIGHT = 4
    constant integer ABS_LEFT_WEIGHT = 7
    
    //FLAGGING typically isn't worth it, but it can
    //add a bit more security more efficiently than modifying
    //the weights
    constant boolean FLAGGING = false
    
    //MID PROTECTION should always be enabled!
    constant boolean MID_PROTECTION = true
endglobals

struct tester extends array
    private static BigInt i = 0
    private static hashtable t = InitHashtable()
    private static integer array c
    private static integer m = 0
    private static string s = ""
    
    private static method run takes nothing returns nothing
        local integer u = -1
        local integer j = 0
        local integer h = 0
        local integer q = 0
        static if MID_PROTECTION then
            local integer v
            local boolean e
        endif
        static if FLAGGING then
            local boolean flag = true
        endif
        loop
            set i = i.next
            exitwhen i.end
            set u = u + 1
            static if FLAGGING then
                if (flag) then
                    set flag = false
                    set c[u] = i.digit + 1
                else
                    set flag = true
                    set c[u] = i.digit + 2
                endif
            else
                set c[u] = i.digit
            endif
        endloop
        static if MID_PROTECTION then
            set v = R2I(u/2.+.5)
            set e = (v != u/2)
        endif
        set j = u-1
        set h = (c[u]+c[u-1])*LEFT_WEIGHT
        loop
            exitwhen j == 0
            set j = j - 1
            if (j < u-2) then
                set h = h*2+(c[j]+c[j+1])
            else
                set h = h+(c[j]+c[j+1])
            endif
        endloop
        set h = h*ABS_LEFT_WEIGHT*(c[0]+c[u])
        set j = 1
        set q = (c[0]+c[1])*RIGHT_WEIGHT
        loop
            exitwhen j == u
            set j = j + 1
            static if MID_PROTECTION then
                if (e or j != v) then
                    if (j > 2) then
                        set q = q*2+(c[j]+c[j-1])
                    else
                        set q = q+(c[j]+c[j-1])
                    endif
                endif
            else
                if (j > 2) then
                    set q = q*2+(c[j]+c[j-1])
                else
                    set q = q+(c[j]+c[j-1])
                endif
            endif
        endloop
        set q = q*ABS_RIGHT_WEIGHT*(c[u]+c[0])
        static if MID_PROTECTION then
            set h = h+q+c[v]
        else
            set h = h+q
        endif
        
        //call PauseTimer(GetExpiredTimer())
        //call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 60, I2S(h))
        set j = LoadInteger(t, h, -1)+1
        call SaveInteger(t, h, -1, j)
        if (j > 99) then
            call PauseTimer(GetExpiredTimer())
            call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 120, I2S(h) + ": " + i.toString() + " collided " + I2S(j) + " times")
        endif
        
        /*if (h == 43128) then
            set m = m + 1
            if (m == 100) then
                call PauseTimer(GetExpiredTimer())
            endif
            call DisplayTimedTextToPlayer(GetLocalPlayer(), 0, 0, 120, i.toString())
        endif*/
        call i.add(1,0)
    endmethod
    
    private static method onInit takes nothing returns nothing
        set i = BigInt.convertString("100", Base["0123456789"])
        static if FLAGGING then
            set i.base = Base["01234567"]
        endif
        call TimerStart(CreateTimer(), 0.00001, true, function thistype.run)
    endmethod
endstruct
 
Last edited:

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,202
What you running? A Pentium 2? The string leaks my system generated for even 100 saves is less than the average save system generates.

Like I said, the string hash algerthim is designed to have low collision probability as it is used for hash based structures. Your small test sample you did should be enough to see that no futher tests are needed on it. You only need to use it once or twice to add some complexity to the checksum (as simple maths is easy to crack at times).
 
Level 31
Joined
Jul 10, 2007
Messages
6,306
Eh, I ran more than 35k runs on it (had to shut it off at prob around 300k? Not sure because it was so laggy). I know I ran 35k cuz I hit a collisions at that point ;D.

Really, unsure of how many collisions occur (the number could skyrocket). It'd be best to juts run it 1000000 times and see how many collisions occur : ).

I'd rather just run my own algorithm though to avoid string leaks (yea, I'm an efficiency freak).

Now, if the hashing could be done w/ BigInts and not strings, that'd be wonderful ^_^.

And again, collision rate on mine is now 51767.61 codes per 1 collision, and that is testing small numbers. If I were to start at even 1000000 (much smaller than any save/load code prob), it'd prob go into the billions or trillions.



edit
This checksum is bad.

Generated checksums started at 100 and adding 100
Code:
32
128 (96)
288 (160) (64)
512 (224) (64)
800 (288) (64)
1152 (352) (64)
1568 (416) (64)

where x is number in 100s from 0 - 900
32x^2

Figuring out the rest ;o, give me a few... I'll prob eventually have a single equation out. Finally calculus is coming in handy : D.

Code:
(1000s changes)
1-83	82
2-165	82
3-247	82
4-329	82
5-411	82
6-493	82

So now were talking triple polynomial for solving any of these. Let's see if all of the numbers follow the same pattern.

(1001+100+100,ex)
Code:
1-165
2-247	82
3-329	82

Sure enough, the exact same pattern. An equation can easily be derived from this data to solve any number's checksum, meaning that any code running on this checksum could be easily modified if the user knew the key (making the checksum worthless). Plug in the new number you want and use the equation to get its checksum. Really, we're talking n polynomial depending on the digits, and the patterns change depending on that, but anywho. I guess this checksum perhaps doesn't suck as you'd have to derive the equation

So no let's assume that the person can't figure out the algorithm and they are just playing with the code ;D. It's likely that plugging in random values will not give a working code (1 in 51767 chance). Given that they are not deviating from the code very much and are working on the right side, it turns into more like a 1 in 20000 chance that they get a working code. Not too shabby, I guess this works, but does anyone know a better method? /cry

It makes me sad that I can generate a polynomial to calculate this checksum : (. I can just make an equation to like the 500th degree and that'll be able to calculate any checksum : |. I guess I could improve the algorithm to work off of the polynomial (would sure be faster, lol).

/sad face
 
Last edited:
Status
Not open for further replies.
Top