
WC3 Encoding

Status
Not open for further replies.
Level 19
Joined
Aug 8, 2007
Messages
2,765
What encoding does Warcraft 3 use?

I tried using UTF-8 to copy and paste Unicode symbols into Warcraft 3, for use in other-language games where commands are needed, but they aren't working.

There's a program called UTFizer: if you click its button, it converts the text to some encoding and you can paste it into WC3 fine, but I don't know what encoding it uses.

Edit: UnicodeBig doesn't work either.
 
Level 19
Joined
Aug 8, 2007
Messages
2,765
US wc3 uses Ascii 256, but it depends on the region ;).

Okay, but that doesn't work :\

What I'm trying to figure out is how to convert the ASCII into a UTF-8 type that non-English Warcraft 3s use.

I know it's possible because UTFizer does it: it makes the letters appear even when the charset is ASCII.

(java)
Code:
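            import java.awt.Toolkit;
            import java.awt.datatransfer.StringSelection;
            String c = "UTF-8";
            byte[] b = s.getBytes(c); // encode s as UTF-8 bytes
            // new String(b, c) decodes the same bytes again, so the clipboard receives s unchanged
            Toolkit.getDefaultToolkit().getSystemClipboard().setContents(new StringSelection(new String(b, c)), null);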

This just puts ?s in place of the other characters.
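One thing worth checking about the snippet above: encoding a string to bytes and decoding those bytes with the same charset hands back the original string, so nothing is actually converted before it reaches the clipboard. A minimal sketch of the difference, assuming ISO-8859-1 purely as a stand-in for "some 8-bit code page" (the thread does not establish which code page WC3 or UTFizer really uses; the class and method names are made up for illustration):

```java
import java.nio.charset.StandardCharsets;

final class RoundTripDemo {
    // Encode and decode with the SAME charset: the result equals the input.
    static String roundTrip(String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        return new String(b, StandardCharsets.UTF_8);
    }

    // Encode as UTF-8, then decode each byte as one 8-bit character.
    // ISO-8859-1 is only an assumed stand-in for the real code page.
    static String reinterpret(String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        return new String(b, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String s = "\uC2DC\uC57C"; // the Korean sample text from this thread
        System.out.println(roundTrip(s).equals(s));   // true: no conversion happened
        System.out.println(reinterpret(s).length());  // 6: one char per UTF-8 byte
    }
}
```

Whether a reinterpreted string like this pastes correctly into WC3 would still need testing; the point is only that the same-charset round trip cannot change anything.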
 

Dr Super Good

Spell Reviewer
Level 63
Joined
Jan 18, 2005
Messages
27,192
What I'm trying to figure out is how to convert the ASCII into a UTF-8 type that non-English Warcraft 3s use.
This cannot be done. WC3 uses a character map system to convert 8 bit characters into localization specific text. This is why when you play Asian/Russian maps the text appears as garbage in English mode since the characters are mapping to "unused" characters or nonsense characters.

This means that text localization is mutually exclusive, you cannot have a single piece of text containing all localizations without the others appearing as garbage.

However, Warcraft III was designed with localization in mind. The Strings file that holds virtually every tooltip and GUI trigger text piece is localized via an extension. By importing multiple Strings files into a map, you should be able to provide every region with targeted localization. The big problems are the extra map size and the slower load time (in-lining these strings speeds up load time).

The simplest test you can do is to paste in all 256 characters and see how WC3 interprets them. Obviously, remove the control characters first, as they make WC3 act strangely.
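A rough sketch of how such a test string could be generated in Java. It assumes ISO-8859-1 as the decoding (so each byte value maps to the character with the same code point) and therefore also skips 0x80 to 0x9F, which are control characters in that charset; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class ByteTable {
    // Build a string holding every 8-bit value except control characters.
    // Skipped: 0x00-0x1F (C0 controls), 0x7F (DEL), 0x80-0x9F (C1 controls
    // under the assumed ISO-8859-1 decoding).
    static String printableBytes() {
        byte[] buf = new byte[256];
        int n = 0;
        for (int v = 0x20; v <= 0xFF; v++) {
            if (v == 0x7F || (v >= 0x80 && v <= 0x9F)) {
                continue;
            }
            buf[n++] = (byte) v;
        }
        return new String(buf, 0, n, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String all = printableBytes();
        // Print in rows of 16 so a garbled cell is easy to locate after pasting.
        for (int i = 0; i < all.length(); i += 16) {
            System.out.println(all.substring(i, Math.min(i + 16, all.length())));
        }
    }
}
```

Pasting each row into WC3 and noting which cells change would show which byte values the game remaps.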

Also, maybe provide a sample of a localized tooltip containing the characters you want, as the byte values could then be extracted and tested to see which character set is used. I am guessing some custom character set may be used for characters with bit 7 set (values 128 to 255), which would require special mapping (and thus how the tool gets its claim to fame).
 
Level 19
Joined
Aug 8, 2007
Messages
2,765
That's the funny thing though... Here's what I decoded one of the strings into:

Code:
                            63,
                            -20,
                            -110,
                            -106,
                            -21,
                            -71,
                            -98

If I try to assemble a string with those same bytes, .getBytes() gives me the same bytes back, but when I paste it into WC3, it just pastes a question mark.

The text of the bytes above is 시야. When encoded in UTFizer and pasted here, it looks like ?쒖빞 in Windows but works fine in game. When re-assembled byte by byte in Java, it looks like ?쒖빞 outside of Windows but ? inside of WC3.
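For anyone wanting to check those bytes, here is a small sketch that decodes them as UTF-8 and compares them with the UTF-8 encoding of 시야 itself. The class and helper names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

final class ByteInspect {
    // The bytes posted above, as signed Java bytes.
    static final byte[] POSTED = {63, -20, -110, -106, -21, -71, -98};

    // Render bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String decoded = new String(POSTED, StandardCharsets.UTF_8);
        System.out.println(hex(POSTED));                          // 3F EC 92 96 EB B9 9E
        System.out.println(decoded.equals("?\uC496\uBE5E"));      // true
        byte[] original = "\uC2DC\uC57C".getBytes(StandardCharsets.UTF_8);
        System.out.println(hex(original));                        // EC 8B 9C EC 95 BC
    }
}
```

So the posted bytes are the UTF-8 encoding of the garbled form ?쒖빞 (a literal question mark followed by two Hangul syllables), not of 시야, whose UTF-8 bytes are EC 8B 9C EC 95 BC. That would be consistent with the text having passed through a different code page somewhere before the bytes were captured, though that part is a guess rather than something established in this thread.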
 