
Improve automated extraction of (listfile)?

Status
Not open for further replies.
Level 19
Joined
Jan 3, 2022
Messages
320
The topic recently came up again: people cannot retrieve all of the file names from a broken MPQ map archive when using MPQEditor. Why is the situation so bad, though?

Explanation

(If you know how listfiles and MPQEditor work, you can skip this.)
An MPQ archive normally stores all of its file names in a special (listfile) file. The archive's own index holds only hashes of the names, so once the (listfile) is gone the names are not recorded anywhere else. Map protectors/optimizers usually delete it because the map does not need it to run. At the same time, this makes modifying/working on the map in WE nearly impossible: you need each file to have its proper name. MPQEditor has an automatic name detection feature under "Tools" -> "W3X Name Scanner". In the first mode, it injects into the game, intercepts file loading calls, and saves the requested names to a listfile (this only worked in old WC3 versions). In the second mode, it scans the map's files for text strings and checks whether any of them is a valid file name; it also has a bundled "default" listfile for Warcraft's own MPQ archives.

The injection method obviously no longer works and the second search method is imperfect.
I tried MPQEditor's W3X Name Scanner on a legit optimized map: 129 files remained unknown. I wrote my own tiny tool but it was barely better: 127 files remained unknown. More on that later.

Other methods

RMPQEx Map Extractor (by Riv) also has a scanner for unknown names (called "Auto Search"). Tested: 117 files remain unknown.
MPQEditor also ran the "Name Breaker" project in the past; the goal was to bruteforce the names by trying all combinations. That was useful for Blizzard's original MPQs without listfiles, where it was possible to make a lot of assumptions about the name structure. It is not practical for maps, where file names can be very long...
...unless someone creates a GPU-accelerated bruteforcer similar to what hashcat does :) With that much power you'd get either the original name or a hash collision (a random name that would work the same).

RMPQEx surprised me: it recognized 12 more files than MPQEditor and 10 more than my own tool.

Do It Yourself

Nonetheless, the (listfile) must be fully recoverable for all the files the map actually uses. WC3 maps consist of two groups of files: 1) those needed only by World Editor and 2) those needed for the map to run.
Anything that you see in-game must be referenced somewhere in the second group's files. A custom unit loads its custom texture and model by file name, so these strings will be inside the map somewhere. Basically, the name detection methods of RMPQEx and MPQEditor must be improved upon to recover all unknown files.
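As a rough illustration of the string-scanning idea (not how MPQEditor or RMPQEx actually implement it), the sketch below walks the raw bytes of an extracted file and keeps every run of path-like characters that ends in a known extension. The `KNOWN_EXT` list and the function name `scanForPaths` are assumptions made for this example.

```lua
-- Assumed, incomplete set of asset extensions worth keeping.
local KNOWN_EXT = {
    blp = true, tga = true, mdx = true, mdl = true,
    wav = true, mp3 = true, slk = true, txt = true,
}

-- Returns a set (table keyed by name) of path-like strings found in `data`.
local function scanForPaths(data)
    local found = {}
    -- Any byte outside this class (NUL, quotes, control bytes...) ends a run.
    for run in data:gmatch("[%w_%-%.\\/ ]+") do
        -- JASS source doubles its backslashes; collapse them to single ones.
        local name = run:gsub("\\\\", "\\")
        -- Trim spaces that surrounded the candidate.
        name = name:match("^%s*(.-)%s*$")
        local ext = name:match("%.(%w+)$")
        if ext and KNOWN_EXT[ext:lower()] then
            found[name] = true
        end
    end
    return found
end

-- Demo on an in-memory buffer that mimics binary object data.
local sample = "\0\1Units\\Custom\\Hero.mdx\0\255junk\0war3mapImported\\skin.blp\0"
for name in pairs(scanForPaths(sample)) do
    print(name)
end
```

A scanner like this will produce false candidates (a binary byte that happens to be a letter can glue itself to the front of a path), which is harmless as long as the result is only used as a list of names to try.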

Lua:
#!/usr/bin/env lua

-- takes ... any amount of files, extracts all unique strings that are
-- encased in double-quotes:
-- "extracted_string"
-- RUN AS:
-- lua extract-strings.lua ability.ini war3map.j someother.ini > random-strings-for-listfile.txt

-- v1.0 (just a little test version)
-- Results:
-- MPQEditor: 129 unrecognized files
-- This script using w3xlni export: 127 files
-- RMPQEx: 117 files
function processFile(filePath, intoTable)
    local file = assert(io.open(filePath, "r"))
    for line in file:lines() do
        -- strings can only be single-line, always inside "quotes"
        local raw = line:match('%b""')
      
        -- exclude empty strings
        -- and |r color codes (which would also be invalid file names on WIN)
        if raw and raw ~= '""' and not raw:find("|", 1, true) then
            -- cut the surrounding double-quotes and collapse doubled (escaped) backslashes into single ones
            local str = raw:sub(2,-2):gsub("\\\\", "\\")
            if not intoTable[str] then
                io.stderr:write(str .. "\n")
                intoTable[str] = true
              
                -- some values are comma-separated, just add them without lookup:
                for csv in str:gmatch("[^,]+") do
                    io.stderr:write(csv .. "\n")
                    intoTable[csv] = true
                end
            end
        end
    end
end

function main(args)
    local stringTable = {}
    for i = 1, #args do
        processFile(args[i], stringTable)
    end
    -- print stringTable to STDOUT
    for str, _ in pairs(stringTable) do
        print(str)
    end
end

-- `...` holds the command-line arguments when the file is run as a script
main({...})
This is what I tried:
  1. Use w3x2lni to extract the map data to .ini files (actually only the w2l CLI version worked, and it reported 2 unspecified errors)
  2. Use my tiny script above to extract every quoted string from /table/*.ini (the w3x2lni output files) and war3map.j, to be used later as candidate file names
  3. Take the output file from the previous step and add it as a Listfile in MPQEditor
  4. ???
  5. PROFIT
This method worked only as a Proof-of-Concept: it extracted just 2 more names than MPQEditor, and 10 fewer than Riv's extractor. There are three possible problems with it:
  1. w3x2lni doesn't export everything as .ini files, and some file names are inside other files
  2. The w3x2lni errors I encountered were real failures that left the output incomplete
  3. JASS code (paths that are assembled in code rather than written out in full)
That said, w3x2lni worked very well, though not perfectly, on an unoptimized map.
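A possible fourth factor, independent of w3x2lni: `line:match('%b""')` in the script returns only the first quoted string on each line, so lines that carry several strings lose all but one. A minimal sketch of a variant that uses `gmatch` instead is shown below. The helper name `extractQuoted` is made up for this example, and the variant has not been run against the map discussed here, so how many extra names it would recover is unknown.

```lua
-- Collect every double-quoted string on a line, not only the first one.
-- Same filtering as the script above: skip empty strings and anything with "|".
local function extractQuoted(line, into)
    for raw in line:gmatch('%b""') do
        if raw ~= '""' and not raw:find("|", 1, true) then
            -- cut the quotes and collapse doubled backslashes
            local str = raw:sub(2, -2):gsub("\\\\", "\\")
            into[str] = true
            -- comma-separated values are also added piece by piece
            for csv in str:gmatch("[^,]+") do
                into[csv] = true
            end
        end
    end
end

-- Demo: two paths on one line, as they would appear in war3map.j.
local seen = {}
extractQuoted('call SetArt("war3mapImported\\\\a.blp", "war3mapImported\\\\b.blp")', seen)
for str in pairs(seen) do
    print(str)
end
```

Like the original, this does not understand escaped quotes (`\"`) inside a string, so such lines can still be split in the wrong place.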

Further, based on my previous research, Reforged uses more "secret" paths where files could be hidden (the localization and _hd.w3mod folders): see Multilanguage map prototype (Translation Tutorial). This kind of trick was previously used to hide ./war3map.j as ./scripts/war3map.j, and it took some time until deprotectors figured it out.

I'm eager to try it out and improve the script above when I get time. Until then, I'd like to hear some input from people who know this stuff better than me :wwink:
  1. Are RMPQEx and MPQEditor just bad at scanning?
    1. Of course, in the case of JASS code the process cannot be fully automatic: string textureName = "path/to/texture." + variableExtension - here the tool should print all suspicious lines for a human to figure out manually
  2. If the map has imported but unused files, does any information about them remain after an optimization/protection process?
  3. How many of the 117-127 unrecognized files are actually used by the map (in my case)?
  4. (I don't want to summon the Devil but): Novel map protection techniques that'd rename files automatically to use names not allowed on Windows (e.g. | character) - THIS IS A PATENTED TECHNOLOGY, YOU OWE ME $9000 IF YOU USE IT IN YOUR PROTECTOR. Please don't hamper creativity. - Something like this would force you to also modify all file references to rename the invalid file name.
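For the JASS case in point 1.1, a tool could at least flag the lines for manual review. The sketch below is one naive way to do that: it reports every line where a string literal sits directly next to a `+`. The function name `findSuspiciousLines` is hypothetical, and the check is purely textual (it does not parse JASS), so it will miss paths built from variables only and will also flag harmless concatenations.

```lua
-- Returns a list of { line = <number>, text = <source line> } entries for
-- lines where a quoted string is concatenated with something else.
local function findSuspiciousLines(source)
    local hits = {}
    local n = 0
    -- the appended "\n" makes sure a last line without a newline is seen too
    for line in (source .. "\n"):gmatch("(.-)\r?\n") do
        n = n + 1
        -- closing quote followed by "+", or "+" followed by an opening quote
        if line:find('"%s*%+') or line:find('%+%s*"') then
            hits[#hits + 1] = { line = n, text = line }
        end
    end
    return hits
end

-- Demo on a small made-up JASS fragment.
local jass = 'function f takes nothing returns nothing\n' ..
             '    local string ext = ".blp"\n' ..
             '    local string tex = "war3mapImported\\\\skin" + ext\n' ..
             '    call Preload("Units\\\\a.mdx")\n' ..
             'endfunction'
for _, hit in ipairs(findSuspiciousLines(jass)) do
    print(hit.line .. ": " .. hit.text)
end
```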
 
Last edited:
Years ago, while chatting with someone on Discord, I created something akin to a theoretical solution to what I think you are discussing, although I never published it widely. I seem to have created the necessary libraries purely by accident: I wanted a GUI where I could choose a WC3 unit and then open the corresponding model in a model editor.

Once I had unit data parsers like that, it was fairly trivial to make a GUI where the user could choose any unit (or whatever) and the system would then look up its:
  • Icon
    • Disabled Icon
  • Model
    • All used textures of model (this was using Retera Model Studio libraries, etc)
  • Portrait Model
    • All used textures of model (this was using Retera Model Studio libraries, etc)
  • Missile art Model
    • All used textures of model (this was using Retera Model Studio libraries, etc)
  • Caster Upgrade Art Icon
  • Hero Scorescreen Icon
  • (maybe some other similar ones that I forget)
The GUI prompted the user for a folder, so whenever a unit was clicked from the list of units in the map, all assets per that unit were dumped accordingly. This required mounting the game MPQs as well as the map, so that it could check and optionally skip ingame files (but include map-specific overrides of ingame files at the same path).
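Part of that list can be approximated without any model parsing, because some of the related files follow naming conventions. The sketch below is not Retera's code; it only assumes the usual WC3 conventions that a `.mdl` reference is loaded as `.mdx`, that a portrait model is named `<model>_Portrait.mdx`, and that a disabled icon lives at `ReplaceableTextures\CommandButtonsDisabled\DIS<icon file name>`. The function name `deriveNames` is made up, and textures referenced inside a model would still need a real MDX parser.

```lua
-- Given one path found in object data, return it plus conventional
-- companion names that the game may also try to load.
local function deriveNames(path)
    local out = {}
    -- Object data often says .mdl while the archive stores the .mdx file.
    if path:lower():match("%.mdl$") then
        path = path:sub(1, -5) .. ".mdx"
    end
    out[#out + 1] = path

    local lower = path:lower()
    if lower:match("%.mdx$") then
        -- assumed portrait convention: Foo.mdx -> Foo_Portrait.mdx
        out[#out + 1] = path:sub(1, -5) .. "_Portrait.mdx"
    elseif lower:match("%.blp$") then
        -- assumed disabled-icon convention, based on the file name only
        local base = path:match("([^\\/]+)$")
        out[#out + 1] = "ReplaceableTextures\\CommandButtonsDisabled\\DIS" .. base
    end
    return out
end

for _, name in ipairs(deriveNames("war3mapImported\\BTNHero.blp")) do
    print(name)
end
```

Applying the icon rule to every `.blp` also generates names for plain model textures that have no disabled variant; those extra candidates simply match nothing in the archive.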

Later, when I was doing something that I probably shouldn't talk about and wanted to pitch the idea to the Ultimate Battle authors that their map could be a game-wide mod MPQ instead, I wrapped the behavior of that GUI program (the one I originally made for someone on Discord) in an automated loop to collect all custom assets in the same manner. To be honest, I think it may have been somewhat offensive to the UB team that I was doing things like that; maybe I was just taking technology and running with it. They're pretty cool guys and I should have treated them better.

Anyway, today we can go a step beyond that. You could actually just run Warsmash, which is based on the same code, but fork it and change the model-loading subroutine so that whenever an asset loads in-game, the asset is also saved to disk, under its MPQ-based path, in some designated folder. This is also a convenient way to capture all assets used in a particular map experience, such as for building a demo: by doing that, and locking Warsmash to only play a single map when launched, you can make an EXE-wrapper-based demo of a map that is basically self-contained and share it with a friend using Warsmash.

That is not really a matter of map protection. The concept of map protection does not really exist, or if it does then it is not exposed to the user on Warcraft 3 map editor. Only the maps with the "Blizz" tag are cryptographically signed and protected, losing their signature when modified. For all other maps, there is not a way to cryptographically ensure that they are unmodified. In addition, if you want to add cheats to a map you can do so quite easily by modifying the game itself using local file overrides and adding your triggers to the set of all triggers loaded in all maps. So, even before Warsmash, these concepts people have are something that actually was never based on sound logic and was never going to work. I'm not advocating against map protection -- it simply doesn't exist for World Editor users. So, the people who use my modding technologies for "that" are just getting better modding technologies from my standpoint.

But, because I am in favor of an open source reference implementation of the Warcraft III game, by extension if we achieve what I am asking for then there will be a future where the consistency of a map and the game installation upon which it is played is a decision of the user, and not a piece of malware/DRM/whatever that commands their computer to function in a particular way regardless of the decision of the computer's owner. This is a consequence of better modding tools. If you really wanted to, you could get one of the ACE/memhack users to go and find the Blizzard Warcraft III subroutine for loading an asset and tell it to write the asset to disk while loading it in the very same way as what you can do with Warsmash. It would just probably be much harder.


Edit:

If I have a map using an in-game model, but with one of the ingame textures of that model replaced via an override file of the same name, would your solution catch it? I guess if you fed the search all of the listfile contents of the original WC3 install maybe that would work, but of course you would want to include files that weren't listed like the ones from the old War3Patch.mpq and such.
 
Last edited:
Level 19
Joined
Jan 3, 2022
Messages
320
Yes @Retera, @Drake53 is right about the "Live game scanner" which hooked into the game and intercepted the load calls. This is one possible way, but it's not automatic and very, very slow because you have to play for a long time. And yet it doesn't provide a 100% success rate, especially for big maps where you may never load some resources.
Retera wrote: "If I have a map using an in-game model, but with one of the ingame textures of that model replaced via an override file of the same name, would your solution catch it? I guess if you fed the search all of the listfile contents of the original WC3 install maybe that would work, but of course you would want to include files that weren't listed like the ones from the old War3Patch.mpq and such."
Yes, that is the idea: include all old and known paths, add Reforged's new folders as prefixes to search in, and highlight the functions in code that are responsible for loading resources (sounds, textures, etc.; they take the path of the file to load). This would be pretty good automation if achieved.
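The prefix part can be sketched as a simple cross product of candidate names and folder prefixes. The function name `withPrefixes` is hypothetical, and the prefixes in the demo (`scripts\`, `_hd.w3mod\`, `_locales\enus.w3mod\`) are assumed examples; the real list would have to come from testing which folders the game actually searches. Duplicates are removed case-insensitively because MPQ name lookup ignores case.

```lua
-- Returns `names` plus every prefix .. name combination, without duplicates.
local function withPrefixes(names, prefixes)
    local out, seen = {}, {}
    local function add(name)
        local key = name:lower()
        if not seen[key] then
            seen[key] = true
            out[#out + 1] = name
        end
    end
    for _, name in ipairs(names) do
        add(name)
        for _, prefix in ipairs(prefixes) do
            add(prefix .. name)
        end
    end
    return out
end

-- Demo with assumed prefixes.
local prefixes = { "scripts\\", "_hd.w3mod\\", "_locales\\enus.w3mod\\" }
for _, name in ipairs(withPrefixes({ "war3map.j", "Units\\Custom\\Hero.mdx" }, prefixes)) do
    print(name)
end
```

Feeding the combined output to MPQEditor as an extra listfile is then the same step as with any other candidate list.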

@MPQEditor I will send you a PoC map when I'm ready. I'm currently torn between various places I work on around WC3.

PPS: Data mine maps from epicwar.com to generate the biggestest (listfile) ever.
 