C# combination formula optimization

deathismyfriend · Nov 6, 2013

I am having a problem with the combination formula hitting the max ulong value to fast.
I was wondering if anyone could help me to improve the formula.

This is the factorial class i am using to get the factorial value from the numbers.

Code:

    class Factorial
    {
        public static ulong Get(int factorial)
        {
            ulong count = 1;
            ulong end = Convert.ToUInt64(factorial) + 1;
            for (ulong i = 1; i < end; i++)
            {
                count = count * i;
            }
            return count;
        }
    }

Here is the formula i am using.

Code:

        public static ulong Total(int size, int groupSize)
        {
            return Factorial.Get(size) / (Factorial.Get(groupSize) * Factorial.Get((size - groupSize)));
        }

Thanks for any and all help.

TriggerHappy · Nov 6, 2013

Does it have to be an unsigned long?

You can try using BigInteger.

PurgeandFire · Nov 6, 2013

The nice thing about combinations is that lots of stuff cancel out.

e.g.
10 C 6 = (10!) / (4! * 6!)

You can rewrite that as:
(10 * 9 * 8 * 7) / (4!)

I haven't really given it much thought, but off the top of my head I'd just resort to just canceling terms:

Java:

    // Calculates n * (n - 1) * (n - 2) * ... k
	private static long partialFactorial(int n, int k) {
		long result = n;
		for (int i = (n-1); i >= k; i--) {
			result = result * i;
		}
		return result;
	}
	
    // Calculates the full factorial
    // mine is a little different, calculates backward
	private static long factorial(int n) {
		long result = n;
		for (int i = (n-1); i > 1; i--) {
			result = result * i;
		}
		return result;
	}
	
    // n choose r
	private static long nCr2(int n, int r) {
		long numerator;
		long denominator;
		if ((n - r) > r) {
			numerator = partialFactorial(n, n - r + 1);
			denominator = factorial(r);
		} else {
			numerator = partialFactorial(n, r + 1);
			denominator = factorial(n - r);
		}
		return numerator / denominator;
	}

You can cut the nCr2 down, but the variables are there for explicitness. Basically, it just cancels out terms. Let's say you have 10C4. If you were doing it on paper, you'd simplify it as so:

10! / ((10 - 4)! * 4!)
10! / (6! * 4!)
10 * 9 * 8 * 7 * 6 ... * 1 / (6 * 5 * 4... * 1 * 4!)
10 * 9 * 8 * 7 * ~~6 ... * 1~~ / (~~6 * 5 * 4... * 1~~ * 4!)
10 * 9 * 8 * 7 / 4!

To get the most out of it, you'll want to cancel the denominator factorial that is largest. In the case above, it is (n - r)! = (10 - 4)! = 6!. However, there are cases where r is greater than (n - r), so you'll want to cancel out r! instead. That is what the function determines.

If you want to support even larger numbers, you can perform even more divisions by first figuring out what else can cancel. For example, in this:
(10 * 9 * 8 * 7) / (4 * 3 * 2 * 1)
You can cancel out the 4, 3, 2:
(5 * 3 * 2 * 7) = 210
That would require some extra work though. I didn't include that. :grin:

Good luck! Btw, my code is in Java. You might have to rewrite a bit of it to fit C#, but the syntax of our codes are pretty similar so it shouldn't be a problem. I tested it versus the regular nCr (in your first post), and it indeed allowed me to calculate some larger factorials e.g. 22 choose 15.

P.S. There may be other ways of going about it. I just went with what came to mind.

Dr Super Good · Nov 6, 2013

deathismyfriend, as they teach you in mathematics, you do not need to compute the factorials to get the number of combinations you want. Since computing factorials involve Pi (capital Pi, not the constant) operation, division can cancel out parts of the range of the Pi operation.

The end result (as PurgeandFire has shown) is the computation of 2 Pi operations over small (comparatively) ranges. This is a huge reduction by 2 full range Pi operations.

There is also a further optimization which again relates to range cancelation (as PurgeandFire has shown I think). When the combination size goes above half the number of elements it starts to decrease again (like in a Pascal Triangle) which represents the Pi operation ranges starting to cancel each other out. As such the number of combinations of length 5 in a set of length 5 is the same as the number of combinations of length 1 in the same set.

Ultimately you would want to represent factorial as a special Pi operation object. This would represent a numeric range which a Pi operation is to apply to. The computer can then simplify the ranges before trying to compute a value from Pi and thus it would solve the optimum number of multiplications necessary all the time without the need for special hard coded cases. However the method used by PurgeandFire is more than good enough unless you plan to use factorial or other Pi operation involving functions often.

To PurgeandFire, your factorial method should be nothing more than an adapter to the partial factorial method called with some arguments since the code is mostly the same between them (next to an extra parameter and the value test). The JVM should automatically inline the code at runtime while your produced source code will have less procedural coupling (more readable) and the byte code files may be smaller (unless it compiler inlines the code which is still good).

Menag · Nov 6, 2013

Dr Super Good said:
Since computing factorials involve Pi (capital Pi, not the constant) operation, division can cancel out parts of the range of the Pi operation

Wow, what's that mysterious Pi operation? Must be fucking complex since you mention it like 5 times in your post and do like it's mega evil shit.

@ topic here's some link:
http://en.wikipedia.org/wiki/Binomial_coefficient#Binomial_coefficient_in_programming_languages

PurgeandFire meant this i think, you can't do it much faster, only if you want to use stirling.

Dr Super Good · Nov 6, 2013

Wow, what's that mysterious Pi operation? Must be fucking complex since you mention it like 5 times in your post and do like it's mega evil shit.

It is like Sigma except that produces the sum of all terms. Pi produces the product of all terms and is how one defines the factorial function.

Sigma works for addition and subtraction. So the ranges can be reduced based upon a sequence of sigma additions and subtractions. Pi works for multiplications and divisions and eventually will work out a product of powers of terms.

If one defines combinations in Pi, one can clearly see there is range overlap and cancelation, something not immediately apparent when one defines it as factorials.

muzzel · Nov 6, 2013

wow dude, you really dont get sarcasm ...

deathismyfriend · Nov 6, 2013

Thanks for all the answers. I have already changed it to BigInteger but i will look into what you suggested purge. Thanks a lot everyone.

Dr Super Good · Nov 6, 2013

wow dude, you really dont get sarcasm ...

What is sarcasm doing on a serious board? Please go back to off topic.

I have already changed it to BigInteger

Will work but might cause performance problems for very large combinations. You really have little choice but to use the other method.

deathismyfriend · Nov 6, 2013

PurgeandFire said:
Good luck! Btw, my code is in Java. You might have to rewrite a bit of it to fit C#, but the syntax of our codes are pretty similar so it shouldn't be a problem. I tested it versus the regular nCr (in your first post), and it indeed allowed me to calculate some larger factorials e.g. 22 choose 15.

Thanks a lot. Only thing i will have to rewrite is changing it from long to BigIntegers as i want the ability to calculate huge numbers. BigIntegers with the code i originally posted allowed up to 1,000,000 choose 5. (took a few minutes to calculate lol.) Your way should be a bit faster.

Thanks also DSG. explanation helped a lot to.

Menag · Nov 6, 2013

deathismyfriend said:
1,000,000 choose 5

Try 1,000,000 choose 500,000 instead D:

edit: Well for that high numbers you could use stirling's approximation:

n! ~ sqrt(2pi*n) * (n/e)^n
to be exact:
n! = sqrt(2pi*n) * (n/e)^n * e^lambda_n with 1/(12n+1) <= lambda_n <= 1/(12n)

C# combination formula optimization

deathismyfriend

deathismyfriend

Resources

TriggerHappy

TriggerHappy

Resources

PurgeandFire

PurgeandFire

Resources

Dr Super Good

Dr Super Good

Resources

Menag

Menag

Resources

Dr Super Good

Dr Super Good

Resources

muzzel

muzzel

Resources

deathismyfriend

deathismyfriend

Resources

Dr Super Good

Dr Super Good

Resources

deathismyfriend

deathismyfriend

Resources

Menag

Menag

Resources

Similar threads