
C# FileStream.CopyToAsync vs File.Copy

Hello again. I am currently using the File.Copy method, but I was wondering about using FileStream.CopyToAsync instead.
Speed is essential, so I am trying to work out which is faster and better to use (less error-prone). The files range in size from as small as 1 byte to as large as a few GB (the maximum is around 10 GB at the moment, but that can easily change).
I have tried this and it is marginally faster.
C#:
using (FileStream sourceStream = File.Open(fileName, FileMode.Open, FileAccess.Read))
using (FileStream destinationStream = File.Create(fileDestination))
{
    await sourceStream.CopyToAsync(destinationStream);
}
I was wondering if there is a better, more efficient way to copy files from one location to another. Also, what are the advantages of using an async copy versus File.Copy? What other methods of copying are there, and which is preferred? Any help is greatly appreciated.
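For reference, one variant I have seen suggested is to open both streams with asynchronous file handles and a larger buffer (a sketch only; the 1 MiB buffer size here is an arbitrary example value, not a measured optimum):
C#:
const int bufferSize = 1 << 20; // 1 MiB; arbitrary example value

using (var sourceStream = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.Read,
    bufferSize, FileOptions.Asynchronous | FileOptions.SequentialScan))
using (var destinationStream = new FileStream(fileDestination, FileMode.Create, FileAccess.Write, FileShare.None,
    bufferSize, FileOptions.Asynchronous))
{
    // With FileOptions.Asynchronous the copy uses真 async I/O instead of
    // blocking thread-pool reads; SequentialScan hints the OS cache.
    await sourceStream.CopyToAsync(destinationStream, bufferSize);
}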
 
A tip: call stream.Flush(); and stream.Close(); at the end of the using statements. Even though the stream is supposed to dispose of itself, it leaves the resources in memory if you don't close it.

I did several tests, and the above does not seem to be the case. Adding stream.Flush(); and stream.Close(); does nothing more than the using statements already do, and it slows the file copy down by almost 3 times.
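Roughly, this is the kind of timing test I ran (a minimal sketch; the paths are placeholders):
C#:
using System;
using System.Diagnostics;
using System.IO;

// Placeholder paths; point these at a large local test file.
var sw = Stopwatch.StartNew();
using (var source = File.OpenRead(@"C:\temp\source.bin"))
using (var destination = File.Create(@"C:\temp\dest.bin"))
{
    source.CopyTo(destination);
    //destination.Flush();   // uncomment both lines to time the
    //destination.Close();   // explicit Flush/Close variant
}
sw.Stop();
Console.WriteLine("Copy took {0} ms", sw.ElapsedMilliseconds);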
 
Hmmm... For me, not using those statements caused a huge number of memory leaks ^^ I used 2 memory streams and a file stream for downloading, encoding and saving a file, and ended up with 500 MB of RAM taken...

That is very odd. I will test on my other PCs and see if it occurs. What OS and .NET Framework version are you using? Maybe that is the difference.
 
That's interesting... Here is part of my code (had to delete parts, though ^^)
C#:
class Encrypter
{
	public MemoryStream Encrypt(Stream streamToEncrypt)
	{
		var outStream = new MemoryStream((int)streamToEncrypt.Length);
		var positionInPassword = 0;

		//Encryption algorithm deleted from here and there :P

		var b = new byte[streamToEncrypt.Length];
		streamToEncrypt.Read(b, 0, (int)streamToEncrypt.Length);
		outStream.Write(b, 0, b.Length);
		outStream.Position = 0;
		return outStream;
	}
}

C#:
var encrypter = new Encrypter();
using (var stream = e.Result.Stream)
{
	using (var encrypted = encrypter.Encrypt(stream))
	{
		//stream.Flush();
		//stream.Close();
		using (var fstream = new FileStream(Path.Combine(MainViewModel.DownloadPath, fileName), FileMode.Create))
		{
			var bytes = new byte[encrypted.Length];
			encrypted.Read(bytes, 0, (int)encrypted.Length);
			//encrypted.Flush();
			//encrypted.Close();
			fstream.Write(bytes, 0, bytes.Length);
			//fstream.Flush();
			//fstream.Close();

			//Yea, I didn't use CopyTo, like a real caveman :P
			
			SaveDownloads();
		}
	}
}
 
Your Encrypt method returns a MemoryStream. I think that stops the using statement from releasing that instance of the MemoryStream. I am not entirely sure, but I think that is the cause, since our methods are the same apart from the way you use the MemoryStream. That is very odd, though. I'll try to do more testing when I can.
 

Dr Super Good

Hmmm... For me, not using those statements caused a huge number of memory leaks ^^ I used 2 memory streams and a file stream for downloading, encoding and saving a file, and ended up with 500 MB of RAM taken...
Windows uses a paging memory system; even file I/O is done using paging unless specifically requested otherwise. What flush does is force pages out to backing storage instead of letting them reside in memory, and in the process it converts dirty pages (ones with changes) into pages that can be freed, making more memory available.
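In .NET terms, the relevant call is the FileStream.Flush(bool) overload: passing true also asks the OS to write its file buffers through to the device, while the parameterless Flush() only empties the stream's own buffer. A minimal sketch (the path and payload are placeholders):
C#:
using System.IO;

byte[] data = new byte[4096]; // example payload

using (var fs = new FileStream(@"C:\temp\out.bin", FileMode.Create)) // placeholder path
{
    fs.Write(data, 0, data.Length);
    fs.Flush(true); // true = also flush the OS file buffers to the device
}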

Hello again. I am currently using the File.Copy method, but I was wondering about using FileStream.CopyToAsync instead.
Speed is essential, so I am trying to work out which is faster and better to use (less error-prone). The files range in size from as small as 1 byte to as large as a few GB (the maximum is around 10 GB at the moment, but that can easily change).
What are you trying to do in terms of I/O? I do not know exactly what the streams you are using do.

You could try memory-mapping the files. For 10 GB support this will logically require a 64-bit OS. The main advantage is that it completely eliminates streams, converts everything into memory manipulation, and lets the OS handle the I/O. It is especially good for random access. It is also memory-efficient if large amounts of data from a file are used often, since the pages from the file are used directly instead of being copied into other pages for the process to use (avoiding extra data copy operations).
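A minimal sketch of a memory-mapped copy (this reuses your fileName/fileDestination variables; the single whole-file view is just for illustration, and for multi-GB files you would likely map a window at a time):
C#:
using System.IO;
using System.IO.MemoryMappedFiles;

long length = new FileInfo(fileName).Length;

using (var source = MemoryMappedFile.CreateFromFile(fileName, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
using (var destination = MemoryMappedFile.CreateFromFile(fileDestination, FileMode.Create, null, length))
using (var sourceView = source.CreateViewStream(0, length, MemoryMappedFileAccess.Read))
using (var destinationView = destination.CreateViewStream(0, length, MemoryMappedFileAccess.Write))
{
    sourceView.CopyTo(destinationView); // byte-for-byte copy through the mappings
}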
 
Windows uses a paging memory system; even file I/O is done using paging unless specifically requested otherwise. What flush does is force pages out to backing storage instead of letting them reside in memory, and in the process it converts dirty pages (ones with changes) into pages that can be freed, making more memory available.

Still, it doesn't really justify what happens without it... IMHO the GC should clean up the remainders of the stream (since it has been disposed of at the end of the using statement), but it stays in memory even after that...

Also, as a sidenote, as far as I know you need unsafe code for memory mapping, so no Windows Phone / Windows Store apps there :)
 

Dr Super Good

However, since it uses a safe context, it might be limited (not entirely sure, since I haven't used it yet)
Only some methods of it are unmanaged, and those generally rely on low-level OS context (which is why they are marked as unmanaged).

I have no idea about mobile devices, but unless they live under a rock they should also support it, since it would be silly for such devices not to use virtual memory systems.

The main advantage of memory mapping is the reduction in OS calls.

Stream-based I/O:
1. Seek
2. Read (1 or more times)
3. Repeat

Memory-mapped I/O:
1. Map file section
2. Use mapped section natively (no OS calls)
3. Unmap file section
4. Repeat

It also removes the need for a read buffer (unless you want to copy from the mapping, manipulate, and then possibly copy back), since you can read from the mapping directly; it acts as a buffer itself. Since calls to methods such as read or seek require expensive OS calls, they have long execution times. Memory mapping only incurs expensive OS calls when you initialize and flush the mappings. Although these calls are more expensive than a normal read/seek, they only need to be carried out once. With memory mapping the actual I/O is done as a page fault, which happens via an interrupt, so no expensive OS calls are required from the process.
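To illustrate (a sketch; "data.bin" and the offsets are placeholder values), random access through a view accessor is plain memory access once the pages are resident:
C#:
using System.IO;
using System.IO.MemoryMappedFiles;

using (var mmf = MemoryMappedFile.CreateFromFile("data.bin", FileMode.Open))
using (var accessor = mmf.CreateViewAccessor())
{
    // Each read is a memory access; at worst it triggers a page fault,
    // not an explicit seek + read system call from the process.
    int first = accessor.ReadInt32(0);
    int distant = accessor.ReadInt32(1024 * 1024);
}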

Is it faster? Depends.
If you are issuing a lot of seek and read calls then possibly.
If you are issuing a lot of read calls (small read buffer) then possibly.
If you are reading sequentially section by section then probably not.
 

Dr Super Good

What I am trying to do is to copy files from one location to another in the fastest way possible.
Surely File.Copy would be the fastest way? It is also supported by XNA, so it is fully portable. If it blocks until complete, you can always create multiple threads to copy multiple files at once (be aware that I/O performance will be practically identical and may even slow down with multiple threads using the same backing storage).
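A rough sketch of that multi-threaded variant (sourceDir and targetDir are placeholder paths), using Parallel.ForEach so each blocking File.Copy runs on its own worker thread:
C#:
using System.IO;
using System.Threading.Tasks;

string sourceDir = @"C:\source"; // placeholder paths
string targetDir = @"C:\target";

Parallel.ForEach(Directory.EnumerateFiles(sourceDir), sourcePath =>
{
    // Each file copy blocks its own worker; the shared disk is still
    // the bottleneck, so more threads rarely means more throughput.
    string targetPath = Path.Combine(targetDir, Path.GetFileName(sourcePath));
    File.Copy(sourcePath, targetPath, overwrite: true);
});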

Some notes about file copying...
1. CPU and memory are unlikely to be bottlenecks. The disk I/O rate will determine the maximum speed of the program.
2. The OS may support file copying natively depending on the file manager specification; there should be no need for your program to handle the data processing, and doing so will likely be less efficient.
3. It is virtually impossible to obtain the theoretical maximum performance from backing storage. Mechanical drives suffer fragmentation (seeking losses) and location-based I/O speed (fastest near the edge of the platter, slowest in the middle). SSDs can suffer management overhead (zeroing, moving of over-used blocks, etc.), although they are much more likely to perform close to their maximum specified speed.

If you want to copy and immediately use the copied data, then memory mapping might actually be better. It allows you to start using the written data before all writing is complete. However, in such a case you could probably process the data from the source file before copying and only write out the results (plus whatever of the source file is unchanged). Memory mapping lets you use the virtual memory system as a form of I/O buffer. Especially if you make a lot of little changes to a single file page, this gives the best performance, since the writing is only actually performed once the page has stabilized (or when you unmap). Obviously this introduces new failure conditions in the case of crashes.
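As a rough sketch of that "many small changes, one lazy write-back" pattern ("records.dat" is a placeholder, and the code assumes the file is at least one 4 KiB page long):
C#:
using System.IO;
using System.IO.MemoryMappedFiles;

using (var mmf = MemoryMappedFile.CreateFromFile("records.dat", FileMode.Open))
using (var accessor = mmf.CreateViewAccessor())
{
    // Many small edits land on the same mapped page in memory;
    // the OS writes the dirty page back lazily, not once per edit.
    for (long offset = 0; offset + sizeof(long) <= 4096; offset += sizeof(long))
        accessor.Write(offset, 0L);

    accessor.Flush(); // optional: force the dirty pages out now
}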
 