
[Documentation] String Type

Discussion in 'The Lab' started by PurgeandFire, Sep 2, 2013.

  1. DracoL1ch

    There's no way for the string table to be synced, and no way for it to cause any kind of desync on modern patches (26+). Locally defined strings are a thing.
     
  2. PurgeandFire

    Opening by request.
     
  3. IcemanBo

    Thanks.

    Some questions regarding the test scenario about the string table (assuming wc3 is written in C++):
    • If all used strings exist permanently in the string table, in which part of memory are dynamically created entries located (the entries themselves, not their references)? It can't be static, read-only memory, where C++ usually stores only string literals at compile time. It must be their own string library implementation, which also allocates on the heap but... never frees it? ;o
    • Might there be some periodic collector going through the string table, freeing entries that match a condition?
    • If not, is small string optimization even a thing then? If the string table was tested for wc3, which strings was it tested with? Short ones, long ones, both? This optimization would be very good for damage amount texts, string iterations, and the like. So are even small strings always required to allocate memory on the heap?
    Has anyone who experimented with the string table got a closer description of the test scenario?
     
    Last edited: Oct 6, 2018
  4. DracoL1ch

    AFAIK there is nothing like a GC, for the same reason big databases never clear out their data but just mark it as "removed" on the backend: to avoid fragmentation and improve search/memory consumption.
    I don't believe there are any optimizations like the ones you've mentioned either, because each string is stored as a raw string with a 0-byte terminator inside common memory regions. The string table only provides shortcuts and improves comparison speed. Like a simple linked list, it doesn't contain any extra logic. If it did, it would be noticeable when working with memhack, but so far I've never seen a string being re-allocated, which means it's rather a big heap with alloc() when needed.

    When you close the game (map/campaign), it clears everything out of memory, including the string table. Corrupting the string table with fake links (i.e. replacing a string address directly inside the table) will cause memory errors later; I've had this happen before.
     
    Last edited: Oct 6, 2018
  5. IcemanBo

    String libs should take care of cleanup when the string object on the stack goes out of scope. That's the way you normally use strings in C++: memory gets allocated on the heap (for longer strings), and after the function ends you don't call something like a string destructor yourself. But in Warcraft, if a string exists permanently, this means they use their own permanent allocation and never care whether references to the string still exist. (That would explain why nulling strings makes no sense.)
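    To make the standard C++ behaviour described above concrete, here is a minimal sketch. It says nothing about wc3 itself: `stored_inline` and `scoped_string_demo` are hypothetical helpers made up for illustration, and whether a short string is kept inline (small string optimization) is up to the standard library implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: returns true when the character buffer lives inside
// the std::string object itself (small string optimization) instead of in a
// separate heap block. The result for short strings is implementation-defined.
bool stored_inline(const std::string& s) {
    const auto obj_begin = reinterpret_cast<std::uintptr_t>(&s);
    const auto obj_end   = obj_begin + sizeof(std::string);
    const auto buf       = reinterpret_cast<std::uintptr_t>(s.data());
    return buf >= obj_begin && buf < obj_end;
}

// Hypothetical demo: a long string owns a heap buffer that is released
// automatically when the object goes out of scope.
bool scoped_string_demo() {
    bool long_was_inline = false;
    {
        std::string damage_text = "42";    // short, e.g. a damage number
        std::string long_text(1024, 'x');  // far too long to fit inline
        (void)stored_inline(damage_text);  // typically true, but not guaranteed
        long_was_inline = stored_inline(long_text);
        assert(!long_was_inline);
    }  // both destructors run here; no manual cleanup is written anywhere
    return long_was_inline;
}
```

    A permanent string table is the opposite design: an entry outlives every variable that refers to it, so nothing happens when a reference goes away.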

    I'm not sure how this relates to big databases, where entries are marked internally as removed without being technically removed. That usually makes a usability difference, for example the customer can't see entries that aren't meant to exist anymore, for privacy or whatever reason. But what would be the relation of the string table to wc3 coders? JASSers can always use any string they want and no masking is needed; the question is about its allocation or new referencing.

    When the program ends, I guess it's normal that the memory for constants and literals is cleared too, just like everything else the program used. What's interesting is how dynamically created strings behave at runtime.

    But if the table doesn't include any extra string logic, that confuses me a bit. The string table seems like a weird concept to me. There needs to be some mapping to the string in the end; how else would I find the correct entry through the reference dynamically without looping through the table?

    If you never saw any re-allocation with memhack, then maybe it really is just that they never free the memory again once a new string is allocated on the heap.
     
  6. DracoL1ch

    They don't care about cleanup because there is no real case where you'll have issues with too many unique strings, unless that's your intention. I managed to slow down the game by generating ~200k unique strings. That's way too many for any normal map. Nulling strings, like nulling "code" references, won't make any sense since the object is never destroyed.
    The string table provides faster comparison, and wc3 is all about comparing. A variable ID or function ID from JASS bytecode is turned into a string, and the table is then searched for the string with the same hash. There are no ops on raw strings, it's always StringHash, and the string table stores that hash next to the string's address. I found out about the purpose of string tables on Stack Overflow back then and haven't cared since. It's just better for the engine to work this way.
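    The idea of a table that keeps a hash next to each string's address and never frees anything could be sketched roughly like this. Everything here is an assumption for illustration: the `StringTable` class, its method names, the linear scan, and the FNV-1a hash are not taken from the engine, and how the real table is laid out is unknown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical append-only string table (illustrative names only).
class StringTable {
public:
    // Returns the id of `text`, adding a new entry only if no equal string exists.
    std::size_t intern(const std::string& text) {
        const std::uint32_t h = hash(text);
        for (std::size_t id = 0; id < entries_.size(); ++id) {
            // Cheap integer check first; characters are read only on a hash match.
            if (entries_[id].hash == h &&
                std::strcmp(entries_[id].chars.get(), text.c_str()) == 0) {
                return id;
            }
        }
        // make_unique<char[]> zero-initializes, so the copy is 0-terminated.
        auto buf = std::make_unique<char[]>(text.size() + 1);
        std::memcpy(buf.get(), text.data(), text.size());
        entries_.push_back(Entry{h, std::move(buf)});
        return entries_.size() - 1;
    }

    // The returned address stays valid for the lifetime of the table,
    // because entries are never removed and buffers are never re-allocated.
    const char* c_str(std::size_t id) const { return entries_[id].chars.get(); }
    std::size_t size() const { return entries_.size(); }

private:
    struct Entry {
        std::uint32_t hash;             // stored next to the string's address
        std::unique_ptr<char[]> chars;  // raw 0-terminated string on the heap
    };

    // FNV-1a, chosen only for the sketch.
    static std::uint32_t hash(const std::string& s) {
        std::uint32_t h = 2166136261u;
        for (unsigned char c : s) { h ^= c; h *= 16777619u; }
        return h;
    }

    std::vector<Entry> entries_;
};
```

    In this sketch, interning "hello" twice yields the same id, while every new unique string makes the table one entry longer until the table itself is destroyed. That matches the "only cleared when the map ends" behaviour described above, under the stated assumptions.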
     
  7. IcemanBo

    200k entries sounds like a lot before the engine slows down, but meh, a few thousand sounds plausible, and I wouldn't understand the argument for doing it on purpose only because of a feeling that it's maybe acceptable.

    Mind elaborating on what exactly is compared faster, and compared to what? What I mean is: why is it required to literally never free a string's memory until the application's end forces a free? It would make sense to remove entries from the table from time to time that are not being used and/or match another condition (being dynamically created from stack refs), since, as per the point above, keeping all data in memory internally doesn't seem reasonable.

    Because of this, people are spreading the advice to keep unique strings to a minimum, and honestly that's bullshit to me. No one should have to care about such low-level management, trying not to create unique strings when doing string iterations and comparisons or string2real conversions. But technically those people would have a point.

    I'll try to find it, but you could also share the exact article about string tables in C++ if you remember it.
     
    Last edited: Oct 6, 2018
  8. DracoL1ch

    Nobody cares about strings, idk where you got that from. Thank god they know about leaks in the first place.
    Optimized C++
    A table also reduces load by collapsing duplicate strings, for instance. You can google "c++ string table management" for more articles; most of the time reinventing the wheel is recommended, and people advise against working with raw strings in any case. So I assume Blizzard, being non-newbies, went with their own implementation as well.
    Maybe a GC was planned but never finished because, once again, you don't do a job you don't need to. No map has ever suffered from string overflow.

    Comparing a string's hash is faster than comparing the string itself with another string. "a"=="A" is kinda simple, but the table allows comparing strings of 1024+ symbols faster, just because it won't pull the string into the low-level CPU cache (risking evicting data that is probably needed much more). There is a lot of benefit in keeping the cache intact for future ops. Basically a low-level optimization. Plus, don't forget the year the engine was programmed.
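    As a rough illustration of the hash-first comparison argued for above, the sketch below counts how many characters each approach has to read. `StringRef`, `make_ref`, `chars_read_by_plain_compare`, and `refs_equal` are hypothetical names, and FNV-1a again stands in for whatever hash the engine really uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical handle: a cached hash stored next to the string's address.
struct StringRef {
    std::uint32_t hash;
    const std::string* text;
};

// FNV-1a, used only as a stand-in hash for this sketch.
std::uint32_t fnv1a(const std::string& s) {
    std::uint32_t h = 2166136261u;
    for (unsigned char c : s) { h ^= c; h *= 16777619u; }
    return h;
}

// The hash is computed once, when the handle is created.
StringRef make_ref(const std::string& s) { return {fnv1a(s), &s}; }

// Counts how many character positions a plain comparison reads before it can decide.
std::size_t chars_read_by_plain_compare(const std::string& a, const std::string& b) {
    std::size_t i = 0;
    while (i < a.size() && i < b.size()) {
        ++i;
        if (a[i - 1] != b[i - 1]) break;
    }
    return i;
}

// Hash-first comparison: different hashes prove inequality without
// touching the character data at all.
bool refs_equal(const StringRef& a, const StringRef& b, std::size_t& chars_read) {
    chars_read = 0;
    if (a.hash != b.hash) return false;
    chars_read = chars_read_by_plain_compare(*a.text, *b.text);
    return *a.text == *b.text;  // equal hashes still need a full check
}
```

    For two 1025-character strings that differ only in the last character, the plain comparison reads all 1025 positions, while the hash-first version rejects them after a single integer comparison. A table that also guarantees one entry per unique string could skip even the full check for equal strings by comparing entry ids, though whether wc3 does that is not established in this thread.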
     
  9. IcemanBo

    Strings and leaks - string concatenations
    [JASS] - [possible leak] strings used in functions - damage texttags
    What and How do Strings Leak? - gametime
    [General] - String contains - substrings

    ^Sure, some are older, but posts like these show up from time to time; people spread it, and it's something people care about.

    edit:

    But thanks for your thoughts. : ) My takeaways:
    • fast hash comparisons vs. long string comparisons
    • might reduce string copies and re-allocations, for better performance
    • permanent allocation on the heap is an accepted "leak risk", or a GC to remove potentially removable entries simply was never created back then
     
    Last edited: Oct 7, 2018