
JMPQ-v3 a pure java MPQ library

Status
Not open for further replies.
Level 8
Joined
Nov 20, 2011
Messages
202
Hello,
to be honest, I am too lazy to write a long presentation, but I still want to share this library with you. So here is just a quick overview:

JMPQ-v3 is a pure Java implementation of the MPQ format that aims for low RAM usage and high performance.
The following operations are possible:
[Image 5bITk.jpg: list of supported operations]


The Javadoc may be incomplete :S

Downloads and bug reports can be found here: https://github.com/Crigges/JMPQ-v3

Edit: The library can now be found on Maven Central: http://mvnrepository.com/artifact/systems.crigges/jmpq3
 

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,198
Code:
File tempMpq = File.createTempFile("work", "around"); 
Files.copy(mpqW.toPath(), tempMpq.toPath(), StandardCopyOption.REPLACE_EXISTING); 
fc = FileChannel.open(tempMpq.toPath(), StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE);

In order to open a MPQ archive it has to copy the entire file?!

Code:
MappedByteBuffer headerBuffer = fc.map(MapMode.READ_ONLY, headerOffset + 8, headerSize); 
headerBuffer.order(ByteOrder.LITTLE_ENDIAN); 
readHeader(headerBuffer);
Although one might think memory mapping is a good way to do file I/O, it is not in most cases. Specifically it can break when memory mapping is not possible (archive in archive, other URI resources) and for small chunks has quite a large overhead due to the address space mapping.
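
For comparison, here is a minimal sketch of reading a small header without memory mapping, using a positional read into an ordinary heap buffer. The class and method names (HeaderReadSketch, readAt, demo) are made up for illustration and are not part of JMPQ-v3; the offsets in the demo are arbitrary.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of JMPQ-v3.
final class HeaderReadSketch {

    /** Reads exactly size bytes starting at offset into a little-endian heap buffer. */
    static ByteBuffer readAt(FileChannel fc, long offset, int size) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(size).order(ByteOrder.LITTLE_ENDIAN);
        long pos = offset;
        while (buf.hasRemaining()) {
            // Positional read: does not move the channel's own position.
            int n = fc.read(buf, pos);
            if (n < 0) {
                throw new EOFException("file ends before offset " + (offset + size));
            }
            pos += n;
        }
        buf.flip();
        return buf;
    }

    /** Writes 4 padding bytes plus one little-endian int to a temp file and reads the int back. */
    static int demo() {
        try {
            Path tmp = Files.createTempFile("header", ".bin");
            try {
                ByteBuffer out = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
                out.putInt(0).putInt(0x12345678);
                Files.write(tmp, out.array());
                try (FileChannel fc = FileChannel.open(tmp, StandardOpenOption.READ)) {
                    return readAt(fc, 4, 4).getInt();
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(demo()));
    }
}
```

Since nothing stays mapped, the file can be truncated or deleted as soon as the channel is closed.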

Also offers no support for MPQ V1,2 and 3. Although MPQ V1 and 2 are no longer used outside of old install discs due to WoW migrating to CASC, MPQ V3 is still used by user made StarCraft II maps and mods (not core mods anymore as they are also CASC now).
 
Level 8
Joined
Nov 20, 2011
Messages
202
In order to open a MPQ archive it has to copy the entire file?!
Yes, this is a bad workaround for a Java bug on Windows with mapped files.

Although one might think memory mapping is a good way to do file I/O, it is not in most cases. Specifically it can break when memory mapping is not possible (archive in archive, other URI resources) and for small chunks has quite a large overhead due to the address space mapping.
MPQs inside MPQs are stored as a single unit since they are not compressed, so it is possible to do memory mapping there.
Anyway, Java NIO is better than Java IO in memory usage and performance.

Also offers no support for MPQ V1,2 and 3. Although MPQ V1 and 2 are no longer used outside of old install discs due to WoW migrating to CASC, MPQ V3 is still used by user made StarCraft II maps and mods (not core mods anymore as they are also CASC now).
Yes, I never planned to support any other MPQ version.
 

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,198
Yes, this is a bad workaround for a Java bug on Windows with mapped files.
Never encountered such a problem. I also used memory mapping at some stage before I knew what I was doing and it worked without having to copy a 500-2000 MB file depending on content.

MPQs inside MPQs are stored as a single unit since they are not compressed, so it is possible to do memory mapping there.
Anyway, Java NIO is better than Java IO in memory usage and performance.
No it is not better in performance or memory usage. It has a completely different use case so cannot even be compared. Yes it can perform better, but it can often perform worse all depending on what you are doing. It also completely destroys readability of your code since it is intended for bulk transfer of data and not file I/O. You should still use streams for file I/O, even if they end up wrapping into channels for bulk data movement.

Stream calls are only slow if there is no buffering since then each call invokes a kernel level I/O function as part of a driver or the OS file system. If they are buffered then they run very fast and can even be faster than Channel I/O calls since they recycle the same buffer as opposed to having to allocate new ones constantly or invoking memory mapping kernel level operations.
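
A rough sketch of the buffering effect, assuming a counting wrapper stands in for the underlying kernel-level call. The names BufferedReadSketch, CountingStream and underlyingCalls are invented for this example, and an in-memory array replaces a real file.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class; the counter is a stand-in for expensive OS-level reads.
final class BufferedReadSketch {

    /** Counts how often the wrapped source is asked for data. */
    static final class CountingStream extends FilterInputStream {
        int calls;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads 4096 bytes one at a time and reports how many calls reached the source. */
    static int underlyingCalls(boolean buffered) {
        CountingStream source = new CountingStream(new ByteArrayInputStream(new byte[4096]));
        InputStream in = buffered ? new BufferedInputStream(source, 8192) : source;
        try {
            while (in.read() != -1) {
                // Consume byte by byte, as naive parsing code would.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered: " + underlyingCalls(false));
        System.out.println("buffered:   " + underlyingCalls(true));
    }
}
```

Without the buffer every single-byte read reaches the source; with it, the source is only asked for whole blocks, and the same internal buffer is reused for each refill.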

Memory mapping gives you no advantage in this case as most of the time you will still have to buffer the data else where due to decryption or decompression.

The interface used is non-standard. I cannot install this as a plugin and suddenly work with MPQ files. You really should be mounting MPQ files as a FileSystem so that anything that deals with Path objects can transparently use this.
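
To illustrate what mounting an archive as a FileSystem looks like from the caller's side, here is a sketch that uses the JDK's built-in zip provider as a stand-in, since no MPQ provider is assumed to exist. MountedArchiveSketch and the file name units.txt are made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo: the zip provider stands in for an MPQ FileSystemProvider.
final class MountedArchiveSketch {

    /** Creates a zip archive, writes one entry through Path/Files, then reads it back. */
    static String roundTrip() {
        try {
            Path zip = Files.createTempFile("archive", ".zip");
            Files.delete(zip); // the provider creates the archive itself when "create" is true
            URI uri = URI.create("jar:" + zip.toUri());
            Map<String, String> env = new HashMap<>();
            env.put("create", "true");
            try {
                // Mount, write through the ordinary Files API, unmount (flushes the archive).
                try (FileSystem fs = FileSystems.newFileSystem(uri, env)) {
                    Files.write(fs.getPath("/units.txt"),
                            "peasant".getBytes(StandardCharsets.UTF_8));
                }
                // Mount again and read with code that only knows about Path objects.
                try (FileSystem fs = FileSystems.newFileSystem(uri, new HashMap<String, String>())) {
                    byte[] data = Files.readAllBytes(fs.getPath("/units.txt"));
                    return new String(data, StandardCharsets.UTF_8);
                }
            } finally {
                Files.deleteIfExists(zip);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```

Any code written against Path and Files works unchanged on the mounted archive, which is the point of exposing an archive format through a provider rather than a custom interface.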

In fact you should be able to stream from the archive directly, something not exposed by your MPQ interface.

To elaborate. Channel is for bulk data movement usually to or from an I/O source at a low level. Stream is for processing data at a high level. NIO did not obsolete IO, it expanded it with new I/O capabilities. Both are interchangeable natively with some static adapters.
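
The static adapters mentioned here live in java.nio.channels.Channels. A small sketch of going from a stream to a channel and back, with AdapterSketch and viaAdapters being illustrative names only:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing the io <-> nio adapters.
final class AdapterSketch {

    /** Wraps a stream as a channel, wraps that channel as a stream again, and reads a line. */
    static String viaAdapters(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try (ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(bytes));
             InputStream stream = Channels.newInputStream(channel);
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaAdapters("war3.mpq"));
    }
}
```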
 
Level 8
Joined
Nov 20, 2011
Messages
202
Never encountered such a problem. I also used memory mapping at some stage before I knew what I was doing and it worked without having to copy a 500-2000 MB file depending on content.
The size does not matter; the problem is that Windows does not allow truncating or deleting a file while it is still mapped. See also:
http://bugs.java.com/view_bug.do?bug_id=4715154
http://stackoverflow.com/questions/...-from-memory-mapped-using-filechannel-in-java

No it is not better in performance or memory usage. It has a completely different use case so cannot even be compared. Yes it can perform better, but it can often perform worse all depending on what you are doing. It also completely destroys readability of your code since it is intended for bulk transfer of data and not file I/O. You should still use streams for file I/O, even if they end up wrapping into channels for bulk data movement.

Stream calls are only slow if there is no buffering since then each call invokes a kernel level I/O function as part of a driver or the OS file system. If they are buffered then they run very fast and can even be faster than Channel I/O calls since they recycle the same buffer as opposed to having to allocate new ones constantly or invoking memory mapping kernel level operations.

Memory mapping gives you no advantage in this case as most of the time you will still have to buffer the data else where due to decryption or decompression.

The interface used is non-standard. I cannot install this as a plugin and suddenly work with MPQ files. You really should be mounting MPQ files as a FileSystem so that anything that deals with Path objects can transparently use this.

In fact you should be able to stream from the archive directly, something not exposed by your MPQ interface.

To elaborate. Channel is for bulk data movement usually to or from an I/O source at a low level. Stream is for processing data at a high level. NIO did not obsolete IO, it expanded it with new I/O capabilities. Both are interchangeable natively with some static adapters.

JMPQ v2 was using IO and performed much worse since I had to load the whole MPQ into RAM. I need random file access and Java IO does not support it.
 

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,198
The size does not matter; the problem is that Windows does not allow truncating or deleting a file while it is still mapped. See also:
Hence why you should not be using memory mapping. Memory mapping looks fantastic from the surface but ultimately it has some very specific and unusual use cases. Specifically the guarantees of the Java API are not very good.

JMPQ v2 was using IO and performed much worse since I had to load the whole MPQ into RAM. I need random file access and Java IO does not support it.
Java IO does support random access.
RandomAccessFile
FileChannel
FileSystemProvider (underlying implementation of FileSystems)
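
As a minimal sketch of the first option, here is a seek-and-read with RandomAccessFile. The class and method names are hypothetical. RandomAccessFile reads integers big-endian, so the bytes are swapped to get the little-endian values that MPQ-style formats use.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical demo class, not taken from any MPQ library.
final class RandomAccessSketch {

    /** Seeks to offset and reads one little-endian 32-bit integer. */
    static int readIntLittleEndianAt(File file, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(offset);
            // readInt() is big-endian, so reverse the byte order.
            return Integer.reverseBytes(raf.readInt());
        }
    }

    /** Writes the values 1, 2, 3 as little-endian ints and reads the third one directly. */
    static int demo() {
        try {
            File tmp = File.createTempFile("random", ".bin");
            try {
                try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
                    for (int v = 1; v <= 3; v++) {
                        raf.writeInt(Integer.reverseBytes(v));
                    }
                }
                return readIntLittleEndianAt(tmp, 8);
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Only the four requested bytes are read; the rest of the file is never loaded into memory.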
 

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,198
RandomAccessFile is using FileChannel at the backend so no.
It never did back in 1.0. They added nio to expand the API with generic file system support and other useful I/O features. The matter remains that nio is not a replacement for io and they are meant to be used together.

However we have deviated from my original point...
I need random file access and Java IO does not support it.
Java IO system does support it. Be it through RandomAccessFile or through SeekableByteChannel. There is no reason to use memory mapping in this situation especially when, as you highlighted so clearly, there are issues with the implementation.
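
The SeekableByteChannel route can be sketched like this, assuming a plain file on disk; SeekableChannelSketch and readRange are illustrative names and the file contents are invented for the demo.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical demo class for random access without memory mapping.
final class SeekableChannelSketch {

    /** Reads up to length bytes starting at offset; returns fewer if the file ends early. */
    static byte[] readRange(Path file, long offset, int length) throws IOException {
        try (SeekableByteChannel ch = Files.newByteChannel(file, StandardOpenOption.READ)) {
            ch.position(offset);
            ByteBuffer buf = ByteBuffer.allocate(length);
            while (buf.hasRemaining() && ch.read(buf) >= 0) {
                // Keep reading until the buffer is full or end of file is reached.
            }
            return Arrays.copyOf(buf.array(), buf.position());
        }
    }

    /** Writes a small file and extracts only the middle section. */
    static String demo() {
        try {
            Path tmp = Files.createTempFile("seekable", ".bin");
            try {
                Files.write(tmp, "HEADERpayloadTRAILER".getBytes(StandardCharsets.US_ASCII));
                return new String(readRange(tmp, 6, 7), StandardCharsets.US_ASCII);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```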
 

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,198
RandomAccessFile seems to allow random file access without using nio.
As mentioned previously, nio and io go hand in hand. Neither was designed to be better with nio being added purely to expand the capabilities of io.

Maybe it will be faster but the ram usage will be higher than now.
Please explain how memory usage will be higher?
 
Level 8
Joined
Nov 20, 2011
Messages
202
Except for it not supporting MPQ1-3, creating a lot of temporary files and not being a FileSystemProvider.

As I mentioned, JMPQ was NOT intended to support non-WC3 MPQs, and without these temporary files the RAM usage would be higher.

However, this JMPQ version is the fastest I ever wrote and tries to use as little RAM as possible. Take a look at the MPQ Texture Browser, which uses JMPQ-v3. It uses one thread for the GUI, one thread for file extraction and all the other threads for resizing the images. Nevertheless, the browser's bottleneck is the image resizing. So maybe you are right and JMPQ could perform better by using RandomAccessFile. But I don't care, because there is no need for further performance improvement. If you want to read huge amounts of data out of MPQs you will need to process it further, and in most cases that is where the bottleneck will be.

But prove me wrong: fork my code on GitHub and write JMPQ-v4, which does all the stuff you want. Compare the RAM usage and performance with v3 and show me the results.
Currently you are just bashing v3 with the unconfirmed argument that IO would be better than NIO in this use case. There is no proof you are right, and even if you are right the improvement would be so tiny that I wouldn't care; otherwise prove me wrong.
 

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,198
I already wrote one several years ago. I was still learning about Java so it is not very neat.

It did not support HET and BET tables but it could open SC2 map and mod archives since they still used HashTable and BlockTable mechanics in parallel. It also did not support all types of decompression but did support the common ones. Main flaw was a lack of writing support, however the way MPQ is constructed makes this difficult (why every patch had to rebuild the archives when patching). I could argue the lack of file writing support is similar to your lack of MPQ1+ support in that it was designed with other priorities at hand and never finished.

My Java MPQ solution provided you with SeekableByteChannel to the file contents when reading. The chunk mechanics of MPQ files meant that the channel was equivalently buffered so even small reads would be reasonably fast. No temporary files were created at any time when opening a MPQ archive and reading files from it. Double buffering was used to minimize chunk buffer allocation when decrypting/decompressing to at most 2 per channel opened.

Performance measurements showed that it was entirely I/O bound. Full extracting (reading and writing out to file, each part done on separate disks, files extracted in order presented by listfile) a main MPQ archive (war3.mpq) would take <60 seconds when not inside the file cache. Extracting the same MPQ archive when inside the OS file cache took <3 seconds. When not inside the file cache the slow performance is the result of random access I/O on mechanical drives and mirrors the "initial loading performance" Warcraft III has.

After I have finished some more work on my BLP ImageReader I am considering revisiting the MPQ library and updating it to be a full FileSystem.

with the unconfirmed argument that IO would be better than NIO in this use case
As I have said multiple times already both io and nio are meant to work together. Nowhere did I say one is better than the other, just what you are doing with them does not make much sense. Usual file input/output in Java is still done using the standard io package and there is no reason to migrate from that as it fulfils its purpose perfectly. File systems and other low-level input/output providers should be using the nio package and channels, even if ultimately the channel they provide ends up wrapped in an io package stream to be used.

Edit:
Because you copy the file when opening an archive you probably force a very fast sequential read of the entire archive. This means that both source and destination files are now in the OS file cache so no underlying I/O will occur. This will give you performance similar to my library after forcing the entire archive into the file cache. Forcing the entire archive into the file cache requires 500 MB of memory odd for a main archive. If such memory is not available or the OS is forced to purge the file cache then performance should degrade closer to the random access performance my library presents. Also your argument about memory usage is not entirely true since your process virtual address space could spike by 500 MB when opening a MPQ archive due to the memory mapping of the entire archive file.
 

Attachments

  • MPQ1Final.zip
    70.5 KB · Views: 132
Level 8
Joined
Nov 20, 2011
Messages
202
Because you copy the file when opening an archive you probably force a very fast sequential read of the entire archive. This means that both source and destination files are now in the OS file cache so no underlying I/O will occur. This will give you performance similar to my library after forcing the entire archive into the file cache. Forcing the entire archive into the file cache requires 500 MB of memory odd for a main archive. If such memory is not available or the OS is forced to purge the file cache then performance should degrade closer to the random access performance my library presents. Also your argument about memory usage is not entirely true since your process virtual address space could spike by 500 MB when opening a MPQ archive due to the memory mapping of the entire archive file.

How about no?
You are simply saying: "Hey, your library performs as well as mine, but only due to a bad workaround." You can delete the file copy part as long as you don't rebuild the MPQ. The performance will be even better, I guess. I am going to add a read-only mode soon to avoid copying the entire file.

However, bring me facts and do some benchmarks instead of telling me what could be better.
 
Level 8
Joined
Nov 20, 2011
Messages
202
hi criggs, thank you so much for creating jmpq, it is really useful. could you please add maven support for this version (i'm using v2)? thanks!

v2 is completely outdated and has several bugs. Additionally, I am not familiar with Maven, so I don't know how to add support for it. But you can just fork my repo and add the support yourself. If it is fine I will merge it.
 