
June 2013 News Batch

Status
Not open for further replies.
Level 14
Joined
Nov 18, 2007
Messages
816
It's deprecated because it only supports 2GB files [...]
Turns out that's bullshit.

Since it is so heavily used, they could not remove or change it, so instead they added nio, short for New Input/Output.
They never wanted to remove it, which is why nothing I used has an @Deprecated annotation.

New Java programs should avoid the old io in favour of nio. Adapters exist to convert between the two stream formats, however these are only for compatibility. In reality Java does file IO only using nio classes. The ones you see in the io package are wrappers around nio classes.
Turns out this is bullshit as well. All of this. java.io is an ALTERNATIVE to java.nio. The two packages have different designs (stream oriented vs. buffer oriented), and while java.nio is certainly a lot faster (I have heard claims of a >250% speedup), it doesn't matter if the bottleneck is not I/O.

java.io also doesn't use java.nio internally, which you could have verified by looking at the source.
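To illustrate the "stream oriented vs. buffer oriented" distinction, here is a minimal sketch (class and method names are mine, not from any code in this thread): java.io pulls bytes from a stream one call at a time, while java.nio has a channel fill a ByteBuffer which is then drained.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class IoVsNio {
    // Stream oriented (java.io): bytes are pulled sequentially from the stream.
    static int sumViaStream(Path p) throws IOException {
        int total = 0;
        try (InputStream in = new FileInputStream(p.toFile())) {
            int b;
            while ((b = in.read()) != -1) total += b;
        }
        return total;
    }

    // Buffer oriented (java.nio): the channel fills a buffer, then the buffer is drained.
    static int sumViaChannel(Path p) throws IOException {
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) != -1) {}
            buf.flip();
            int total = 0;
            while (buf.hasRemaining()) total += buf.get();
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".bin");
        Files.write(tmp, new byte[] {1, 2, 3, 4});
        // Both read the same bytes; only the access model differs.
        System.out.println(sumViaStream(tmp) + " " + sumViaChannel(tmp));
        Files.delete(tmp);
    }
}
```

Both paths produce the same data, which is the point: the choice between them is about design and throughput, not capability.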
 

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,202
Turns out that's bullshit.
Um, maybe you did not understand me. io has no seek method for very large files; nio does. io requires you to skip forward (or backwards?) by an int amount, while nio uses a long absolute position. Although in this case, where you are reading a stream file, it makes no difference since everything has to be read sequentially, it does make a difference for an MPQ reader, as there you are reading chunks at random.

They never wanted to remove it, which is why nothing I used has an @Deprecated annotation.
They never can remove it; it is too heavily in use to ever be removed. Only a Java 2.0 could consider removing it, but that is not currently planned.

All of this. java.io is an ALTERNATIVE to java.nio.
Java io came first and has been with Java pretty much since the start; nio was made as an official package and then added into the standard edition due to a combination of popular demand and requirement.

it doesn't matter if the bottleneck is not I/O.
Which in this case it likely will be, since you are parsing a file.
 
Level 14
Joined
Nov 18, 2007
Messages
816
Um, maybe you did not understand me. io has no seek method for very large files; nio does. io requires you to skip forward (or backwards?) by an int amount, while nio uses a long absolute position. Although in this case, where you are reading a stream file, it makes no difference since everything has to be read sequentially, it does make a difference for an MPQ reader, as there you are reading chunks at random.
FileInputStream.skip()
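For what it's worth, the actual signatures support this rebuttal: `InputStream.skip(long n)` does take a long, though it is relative and may skip fewer bytes than requested, whereas `SeekableByteChannel.position(long)` jumps straight to an absolute offset, which is what random chunk access wants. A minimal sketch of the difference (names are mine):

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SeekDemo {
    // java.io: relative skip; skip() may skip fewer bytes than asked,
    // so correct code has to loop until the offset is consumed.
    static int readByteAfterSkip(Path p, long offset) throws IOException {
        try (InputStream in = new FileInputStream(p.toFile())) {
            long remaining = offset;
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped <= 0) break;
                remaining -= skipped;
            }
            return in.read();
        }
    }

    // java.nio: absolute positioning; position() moves directly to the offset.
    static int readByteAt(Path p, long offset) throws IOException {
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
            ch.position(offset);
            ByteBuffer one = ByteBuffer.allocate(1);
            ch.read(one);
            return one.get(0) & 0xFF;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("seek", ".bin");
        Files.write(tmp, new byte[] {10, 20, 30, 40});
        System.out.println(readByteAfterSkip(tmp, 2) + " " + readByteAt(tmp, 2));
        Files.delete(tmp);
    }
}
```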


They never can remove it; it is too heavily in use to ever be removed. Only a Java 2.0 could consider removing it, but that is not currently planned.
They can still officially deprecate it, which they have not done. So you pick the tool that's best for the job, and that's not necessarily java.nio.


Which in this case it likely will be, since you are parsing a file.
You do realize this is a simple proof-of-concept parser? I hope you also realize that the files to be parsed are on the order of hundreds of KiB in size, and that this parser is useless without something around it. I am taking a guess here and saying that anything useful built around it is going to have bottlenecks other than I/O (I might very well be wrong, but unless you plan on parsing hundreds of MiB, you won't notice when a file is loaded once).
 

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,202
Well here is my solution.

DOOIO is used to produce 2 different data container objects, one for Units and Items and the other for Doodads and Terrain Objects.

ioformats.doo.io contains all the structure classes. These are used for platform-independent interpretation of the data structures.

ioformats.doo contains all the data types and the deserialization class DOOIO. The data classes themselves could be organized into packages with separate deserialization classes for each type.

The reason I do not de-serialize each type within its own class is to avoid excessive importing of io classes. This improves cohesion, as all I/O activity is performed in a single place separate from the data manipulation. It also allows for different I/O types for the data (e.g. transferring across a network or storing in a more efficient file format). You can view this as a sort of I/O plugin model, where the data classes have no concrete I/O implementation; instead, a number of I/O classes that manipulate them live in the same package.
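A minimal sketch of that plugin-style split (the class and field names here are mine, not taken from the actual DOOIO code): the data class imports no I/O types, and a sibling class in the same package does all the buffer work.

```java
import java.nio.ByteBuffer;

// Plain data class: no I/O imports, no knowledge of any file format.
class Doodad {
    final int typeId;
    final float x, y;

    Doodad(int typeId, float x, float y) {
        this.typeId = typeId;
        this.x = x;
        this.y = y;
    }
}

// Separate I/O class in the same package: the only place that touches buffers.
// A network reader or an alternate on-disk format would be another sibling
// class here, requiring no change to Doodad itself.
public class DoodadIO {
    static Doodad deserialize(ByteBuffer buf) {
        return new Doodad(buf.getInt(), buf.getFloat(), buf.getFloat());
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(12).putInt(7).putFloat(1.5f).putFloat(2.5f);
        buf.flip();
        Doodad d = deserialize(buf);
        System.out.println(d.typeId + " " + d.x + " " + d.y);
    }
}
```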

SeekableByteChannel is used for the source, as that is the simplest type of channel that declares its size. Since I am reading the entire file into a buffer (the files are small), it is important to know how large the buffer must be.
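The "size the buffer from the channel" idea can be sketched like this (my own helper, assuming the whole file fits in an int-sized buffer, which holds for small files like .doo):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WholeFileRead {
    // size() tells us exactly how big the buffer must be, so the whole
    // (small) file can be read with a single allocation.
    static ByteBuffer readAll(SeekableByteChannel ch) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
        while (buf.hasRemaining() && ch.read(buf) != -1) {}
        buf.flip();
        return buf;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("doo", ".bin");
        Files.write(tmp, "W3do".getBytes(StandardCharsets.US_ASCII));
        try (SeekableByteChannel ch = Files.newByteChannel(tmp, StandardOpenOption.READ)) {
            System.out.println(readAll(ch).remaining());
        }
        Files.delete(tmp);
    }
}
```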

The problem with .doo files is their complex stream-oriented format. This prevents one from using a super-efficient buffering scheme without extra intermediate data copies. For comparison, the format MPQ uses for archived files is buffer oriented, and all chunks of a file are guaranteed to be equal to (uncompressed) or less than (compressed) the defined buffer size. Since .doo files are intended to be relatively small, it should make no difference to load the entire file into memory and then process it.

By manipulating a FileChannel directly as a source, memory mapping could be used to eliminate internal buffer allocation and reading altogether. This would be considerably faster, as no extra copying of file data is required after the OS pages it in. However, implementing such a method seems pointless, since .doo files are located within .mpq archives and are usually compressed, so the data requires at least one buffer swap in any case.
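For reference, the memory-mapping route mentioned above looks like this in java.nio (a minimal sketch, my own names; `FileChannel.map` returns a MappedByteBuffer backed by the OS page cache):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MapDemo {
    // The OS pages the file in on demand: no explicit read() calls and
    // no intermediate heap buffer are needed.
    static int sumMapped(Path p) throws IOException {
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            int total = 0;
            while (map.hasRemaining()) total += map.get();
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("map", ".bin");
        Files.write(tmp, new byte[] {1, 2, 3, 4});
        System.out.println(sumMapped(tmp));
        // Deleting a still-mapped file can fail on some platforms (e.g. Windows),
        // so failure to delete the temp file is tolerated here.
        try { Files.delete(tmp); } catch (IOException ignored) {}
    }
}
```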

Thus the overall flow of reading a .doo file from a standard WorldEdit-made .mpq would go...
compressed chunks -> raw chunks -> single file in buffer -> native objects

Since .mpq reading is already buffered, a lighter, more streaming version could be used that only uses a buffer of the largest structure size, filled with the required bytes at each stage. Whether this would be faster is debatable, as it trades a possibly cache-inefficient large buffer with a single filling call for a smaller, cache-friendly buffer with a huge number of filling calls. It would also be slower when reading files directly, as it would require buffering and extra memory copies.
 

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,202
It would be better if someone used all this in a project. People would then write the code with the intention of it actually being used.

Obviously you would need to define the code specification better, preferably as an interface with a language restriction, as those are easiest to use in other components.
 
Level 31
Joined
Jul 10, 2007
Messages
6,306
I want to make a third-party editor, in C++, with plugin functionality in Lua, I think; I am still undecided on plugins.

So I'm slowly writing a parser for every map file.

So vJASS and Galaxy?

What are you using for parsing?

I'm working on something that's integrated into Notepad++, but the tool chain isn't specifically tied to it, so we can combine ;D.

I'm already doing a makefile framework for vJASS using Lua and C++. I also have some C++ for embedding Lua into vJASS.
 

Dr Super Good

Spell Reviewer
Level 64
Joined
Jan 18, 2005
Messages
27,202
Trial and error. I found out about it when I was trying to create an efficient save/load system that could resolve each character in O(1) time instead of the usual O(n) time which most save/load systems used. To do this I used the hash of a string and started to notice strange and wrong results. After some research I found that both 'A' and 'a' give the same result when passed to the hash function. This meant I needed to use an O(n) linked list to resolve between cases.

Since the hash produced is not case sensitive, any strings with the same content in a different case would hash to the same result.
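The effect described can be sketched in Java with a hypothetical stand-in for that hash function (this mirrors the described behaviour of WC3's StringHash, it is not the real implementation): hashing the lowercased string makes all case variants collide, so a per-bucket list scan is needed to tell them apart.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class CaseInsensitiveHash {
    // Hypothetical stand-in for the case-insensitive hash: "A" and "a"
    // produce the same value because the string is lowercased first.
    static int hash(String s) {
        return s.toLowerCase(Locale.ROOT).hashCode();
    }

    public static void main(String[] args) {
        // Case variants land in the same bucket, so an O(n) list scan
        // (comparing with equals) is required to resolve between cases.
        Map<Integer, List<String>> buckets = new HashMap<>();
        for (String s : new String[] {"A", "a", "Ab", "aB"}) {
            buckets.computeIfAbsent(hash(s), k -> new ArrayList<>()).add(s);
        }
        System.out.println(buckets.size()); // 4 strings, only 2 distinct hashes
    }
}
```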

I am pretty sure I was not the first or last person to notice this. People like Nes also wrote save/load systems, and I am guessing he somehow came across it as well.
 