
C# FileStream.CopyToAsync vs File.Copy

Hello again. I am currently using the File.Copy method, but I was wondering about using FileStream.CopyToAsync instead.
Speed is essential, so I am trying to work out which is faster and better to use (less error-prone). The files range in size from as small as 1 byte to as large as a few GB (the maximum is around 10 GB at the moment, but that can easily change).
I have tried this and it is marginally faster.
C#:
using (FileStream sourceStream = File.Open(fileName, FileMode.Open, FileAccess.Read))
using (FileStream destinationStream = File.Create(fileDestination))
{
    await sourceStream.CopyToAsync(destinationStream);
}
I was wondering if there is a better, more efficient way to copy files from one location to another. Also, what are the advantages of using an async copy versus File.Copy? What other methods of copying are there, and which is preferred? Any help is greatly appreciated.
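For reference, one variant I have seen suggested is to open both streams with asynchronous file handles and a larger buffer (a sketch only; the 1 MiB buffer size here is an arbitrary example value, not a measured optimum):
C#:
const int bufferSize = 1 << 20; // 1 MiB; arbitrary example value

using (var sourceStream = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.Read,
    bufferSize, FileOptions.Asynchronous | FileOptions.SequentialScan))
using (var destinationStream = new FileStream(fileDestination, FileMode.Create, FileAccess.Write, FileShare.None,
    bufferSize, FileOptions.Asynchronous))
{
    // With FileOptions.Asynchronous the copy uses真 async I/O instead of
    // blocking thread-pool reads; SequentialScan hints the OS cache.
    await sourceStream.CopyToAsync(destinationStream, bufferSize);
}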
 
A tip: call stream.Flush(); and stream.Close(); at the end of the using statements. Even though the stream is supposed to dispose of itself, it leaves the resources in memory if you don't close it.

I did several tests, and the above does not seem to be the case. Adding stream.Flush(); and stream.Close(); does nothing more than the using statements already do, and it slows the file copy down by almost 3 times.
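Roughly, this is the kind of timing test I ran (a minimal sketch; the paths are placeholders):
C#:
using System;
using System.Diagnostics;
using System.IO;

// Placeholder paths; point these at a large local test file.
var sw = Stopwatch.StartNew();
using (var source = File.OpenRead(@"C:\temp\source.bin"))
using (var destination = File.Create(@"C:\temp\dest.bin"))
{
    source.CopyTo(destination);
    //destination.Flush();   // uncomment both lines to time the
    //destination.Close();   // explicit Flush/Close variant
}
sw.Stop();
Console.WriteLine("Copy took {0} ms", sw.ElapsedMilliseconds);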
 
Hmmm... For me, not using those statements caused a huge number of memory leaks ^^ I used 2 memory streams and a file stream for downloading, encoding and saving a file, and ended up with 500 MB of RAM taken...

That is very odd. I will test on my other PCs and see if it occurs. What OS and .NET Framework version are you using? Maybe that is the difference.
 
That's interesting... Here is part of my code (had to delete parts, though ^^)
C#:
class Encrypter
{
	public MemoryStream Encrypt(Stream streamToEncrypt)
	{
		var outStream = new MemoryStream((int)streamToEncrypt.Length);
		var positionInPassword = 0;

		//Encryption algorithm deleted from here and there :P

		var b = new byte[streamToEncrypt.Length];
		streamToEncrypt.Read(b, 0, (int)streamToEncrypt.Length);
		outStream.Write(b, 0, b.Length);
		outStream.Position = 0;
		return outStream;
	}
}

C#:
var encrypter = new Encrypter();
using (var stream = e.Result.Stream)
{
	using (var encrypted = encrypter.Encrypt(stream))
	{
		//stream.Flush();
		//stream.Close();
		using (var fstream = new FileStream(Path.Combine(MainViewModel.DownloadPath, fileName), FileMode.Create))
		{
			var bytes = new byte[encrypted.Length];
			encrypted.Read(bytes, 0, (int)encrypted.Length);
			//encrypted.Flush();
			//encrypted.Close();
			fstream.Write(bytes, 0, bytes.Length);
			//fstream.Flush();
			//fstream.Close();

			//Yea, I didn't use CopyTo, like a real caveman :P
			
			SaveDownloads();
		}
	}
}
 
Your Encrypt method returns a MemoryStream. I think that stops the using statement from releasing that instance of the MemoryStream. I am not entirely sure, but I think that is the cause, since our methods are the same apart from the way you use the MemoryStream. That is very odd, though. I'll try to do more testing when I can.
 

Dr Super Good

Hmmm... For me, not using those statements caused a huge number of memory leaks ^^ I used 2 memory streams and a file stream for downloading, encoding and saving a file, and ended up with 500 MB of RAM taken...
Windows uses a paging memory system; even file I/O is done using paging unless specifically requested otherwise. What flush does is force pages out to backing storage instead of letting them reside in memory, and in the process it converts dirty pages (ones with changes) into pages that can be freed, making more memory available.
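In .NET terms, the relevant call is the FileStream.Flush(bool) overload: passing true also asks the OS to write its file buffers through to the device, while the parameterless Flush() only empties the stream's own buffer. A minimal sketch (the path and payload are placeholders):
C#:
using System.IO;

byte[] data = new byte[4096]; // example payload

using (var fs = new FileStream(@"C:\temp\out.bin", FileMode.Create)) // placeholder path
{
    fs.Write(data, 0, data.Length);
    fs.Flush(true); // true = also flush the OS file buffers to the device
}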

Hello again. I am currently using the File.Copy method, but I was wondering about using FileStream.CopyToAsync instead.
Speed is essential, so I am trying to work out which is faster and better to use (less error-prone). The files range in size from as small as 1 byte to as large as a few GB (the maximum is around 10 GB at the moment, but that can easily change).
What are you trying to do in terms of I/O? I do not know exactly what the streams you are using do.

You could try memory-mapping the files. For 10 GB support this will logically require a 64-bit OS. The main advantage is that it completely eliminates streams, converts everything into memory manipulation, and lets the OS handle the I/O. It is especially good for random access. It is also memory-efficient if large amounts of data from a file are used often, since the pages from the file are used directly instead of being copied into other pages for the process to use (avoiding extra data copy operations).
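A minimal sketch of a memory-mapped copy (this reuses your fileName/fileDestination variables; the single whole-file view is just for illustration, and for multi-GB files you would likely map a window at a time):
C#:
using System.IO;
using System.IO.MemoryMappedFiles;

long length = new FileInfo(fileName).Length;

using (var source = MemoryMappedFile.CreateFromFile(fileName, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
using (var destination = MemoryMappedFile.CreateFromFile(fileDestination, FileMode.Create, null, length))
using (var sourceView = source.CreateViewStream(0, length, MemoryMappedFileAccess.Read))
using (var destinationView = destination.CreateViewStream(0, length, MemoryMappedFileAccess.Write))
{
    sourceView.CopyTo(destinationView); // byte-for-byte copy through the mappings
}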
 
Windows uses a paging memory system; even file I/O is done using paging unless specifically requested otherwise. What flush does is force pages out to backing storage instead of letting them reside in memory, and in the process it converts dirty pages (ones with changes) into pages that can be freed, making more memory available.

Still, it doesn't really justify what happens without it... IMHO the GC should clean up the remainders of the stream (since it has been disposed of at the end of the using statement), but it stays in memory even after that...

Also, as a sidenote, as far as I know you need unsafe code for memory mapping, so no Windows Phone / Windows Store apps there :)
 

Dr Super Good

However, since it uses a safe context, it might be limited (not entirely sure, since I haven't used it yet)
Only some methods of it are unmanaged, and those generally rely on low-level OS context (which is why they are marked as unmanaged).

I have no idea about mobile devices, but unless they live under a rock they should also support it, since it would be silly for such devices not to use virtual memory systems.

The main advantage of memory mapping is the reduction in OS calls.

Stream-based I/O:
1. Seek
2. Read (1 or more times)
3. Repeat

Memory-mapped I/O:
1. Map file section
2. Use mapped section natively (no OS calls)
3. Unmap file section
4. Repeat

It also removes the need for a read buffer (unless you want to copy from the mapping, manipulate, and then possibly copy back), since you can read from the mapping directly; it acts as a buffer itself. Since calls to methods such as read or seek require expensive OS calls, they have long execution times. Memory mapping only incurs expensive OS calls when you initialize and flush the mappings. Although these calls are more expensive than a normal read/seek, they only need to be carried out once. With memory mapping the actual I/O is done as a page fault, which happens via an interrupt, so no expensive OS calls are required from the process.
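To illustrate (a sketch; "data.bin" and the offsets are placeholder values), random access through a view accessor is plain memory access once the pages are resident:
C#:
using System.IO;
using System.IO.MemoryMappedFiles;

using (var mmf = MemoryMappedFile.CreateFromFile("data.bin", FileMode.Open))
using (var accessor = mmf.CreateViewAccessor())
{
    // Each read is a memory access; at worst it triggers a page fault,
    // not an explicit seek + read system call from the process.
    int first = accessor.ReadInt32(0);
    int distant = accessor.ReadInt32(1024 * 1024);
}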

Is it faster? Depends.
If you are issuing a lot of seek and read calls then possibly.
If you are issuing a lot of read calls (small read buffer) then possibly.
If you are reading sequentially section by section then probably not.
 

Dr Super Good

What I am trying to do is to copy files from one location to another in the fastest way possible.
Surely File.Copy would be the fastest way? It is also supported by XNA, so it is fully portable. If it blocks until complete, you can always create multiple threads to copy multiple files at once (be aware that I/O performance will be practically identical and may even slow down with multiple threads using the same backing storage).
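A rough sketch of that multi-threaded variant (sourceDir and targetDir are placeholder paths), using Parallel.ForEach so each blocking File.Copy runs on its own worker thread:
C#:
using System.IO;
using System.Threading.Tasks;

string sourceDir = @"C:\source"; // placeholder paths
string targetDir = @"C:\target";

Parallel.ForEach(Directory.EnumerateFiles(sourceDir), sourcePath =>
{
    // Each file copy blocks its own worker; the shared disk is still
    // the bottleneck, so more threads rarely means more throughput.
    string targetPath = Path.Combine(targetDir, Path.GetFileName(sourcePath));
    File.Copy(sourcePath, targetPath, overwrite: true);
});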

Some notes about file copying...
1. CPU and memory are unlikely to be bottlenecks. The disk I/O rate will determine the maximum speed of the program.
2. The OS may support file copying natively depending on the file manager specification; there should be no need for your program to handle the data processing, and doing so will likely be less efficient.
3. It is virtually impossible to obtain the theoretical maximum performance from backing storage. Mechanical drives suffer fragmentation (seeking losses) and location-based I/O speed (fastest near the edge of the platter, slowest in the middle). SSDs can suffer management overhead (zeroing, moving of over-used blocks, etc.), although they are much more likely to perform close to their maximum specified speed.

If you want to copy and immediately use the copied data, then memory mapping might actually be better. It allows you to start using the written data before all writing is complete. However, in such a case you could probably process the data from the source file before copying and only write out the results (plus whatever of the source file is unchanged). Memory mapping lets you use the virtual memory system as a form of I/O buffer. Especially if you make a lot of little changes to a single file page, this gives the best performance, since the writing is only actually performed once the page has stabilized (or when you unmap). Obviously this introduces new failure conditions in the case of crashes.
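As a rough sketch of that "many small changes, one lazy write-back" pattern ("records.dat" is a placeholder, and the code assumes the file is at least one 4 KiB page long):
C#:
using System.IO;
using System.IO.MemoryMappedFiles;

using (var mmf = MemoryMappedFile.CreateFromFile("records.dat", FileMode.Open))
using (var accessor = mmf.CreateViewAccessor())
{
    // Many small edits land on the same mapped page in memory;
    // the OS writes the dirty page back lazily, not once per edit.
    for (long offset = 0; offset + sizeof(long) <= 4096; offset += sizeof(long))
        accessor.Write(offset, 0L);

    accessor.Flush(); // optional: force the dirty pages out now
}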
 