What's the difference? Remainder vs Modulus

[アーティクル]
12/05/2011

Today, another episode of my ongoing series "What's the difference?" Today, what's the difference between a remainder and a modulus, and which, if either, does the % operator represent in C#?

A powerful idea that you see come up in mathematics and computer programming over and over again is the idea of an equivalence relation.

First off, let's define (again) what a "relation" is; a relation is a function that takes two things, say, integers, and returns a Boolean that says whether the two integers are "related" or not. For example, "is greater than" is a relation. It takes two integers, say 4 and 2, and returns a Boolean that indicates whether 4 "is greater than" 2 or not; in this case, yes, it is.

An "equivalence relation" is a relation which is reflexive, symmetric, and transitive. That is, if ∾ is an equivalence relation, then:

Reflexive: X∾X is true for any X
Symmetric: X∾Y = Y∾X for any X and Y
Transitive: if X∾Y and Y∾Z are both true then X∾Z is also true.

Clearly "is greater than" is not an equivalence relation; though it is transitive, it is neither symmetric nor reflexive. "Have the same sign" on the other hand, is an equivalence relation; all negative numbers are equivalent to each other, all positive numbers are equivalent to each other, and zero is equivalent to itself.

An equivalence relation partitions the set of integers into (possibly infinitely many) equivalence classes. For example, let's consider the equivalence relation A∾B if and only if A-B is evenly divisible by 4. So 0∾4 is true and -86∾2 is true, and so on. We can make four "equivalence classes" where every integer is in exactly one class, and every integer in each class is related to every other integer in that class. The classes are {0, 4, -4, 8, -8, 12, -12, ... }, {1, -3, 5, -7, 9, -11, ... }, {2, -2, 6, -6, 10, -10} and {3, -1, 7, -5, 11, -9, ... }.

These classes are classes of "equivalent" integers, where "equivalence" means "is equally good in terms of determining whether the relation holds". Equivalence classes are often identified by a "canonical" member; in the case of the equivalence classes of "integers that are congruent modulo four", the canonical members are usually taken to be 0, 1, 2 and 3.

One can of course generalize this; given any positive integer n you can create the equivalence relation "A∾B if and only if A-B is evenly divisible by n". This defines n equivalence classes, which are usually identified by the canonical elements 0, 1, ..., n-1. The relation is usually written (somewhat clumsily, in my opinion) as A ≡ B mod n and is read as "A is congruent to B modulo n".

It is very tempting to think of the A % M operator in C# as meaning "partition the integers into M equivalence classes and give me the canonical element associated with the class that contains A" operator. That is, if you say 123 % 4, the answer you get is 3 because 123 ≡ 3 mod 4, and 3 is the "canonical" element associated with the equivalence class that contains 123.

However, that is not at all what the % operator actually does in C#. The % operator is not the canonical modulus operator, it is the remainder operator. The A % B operator actually answer the question "If I divided A by B using integer arithmetic, what would the remainder be?"

In order to work out what the remainder is, we need to first clearly define our terms. The divisor, dividend, remainder and quotient must obey the following identity:

dividend = quotient * divisor + remainder

Now, that is not enough to uniquely determine an answer to the question of what, say, 123 / 4 is. 123 = 30 * 4 + 3 is the desired solution, but 123 = 6000 * 4 - 23877 is also a possible solution. We need to put restrictions on the quotient and remainder.

Naively we might say that the lesser quotient is the better quotient. But of course that does not help either, because 123 = 1 * 4 + 119 is also perfectly good, and 1 is less than 30, but 1 is not a better quotient. We need some better rules; the rules we've come up with to determine the quotient and remainder (assuming that the divisor and dividend are valid, that is, we're not dividing by zero and so on) are:

* If there is any quotient that makes the remainder zero then that's the quotient and the remainder is zero, and we're done.
* Otherwise, do the division in all non-negative integers. Take the largest quotient that keeps the remainder positive.
* If the divisor and dividend have opposite signs then make the quotient negative.
* The remainder can now be worked out so as to obey the identity.

The net result here is that integer division rounds towards zero. That should make sense; we want 123/4 to be 30 with a remainder of 3, not 31 with a remainder of -1. Similarly, we want -123/4 to be -30, not -31! It would be bizarre to say that 4 goes into 123 30 times but goes into -123 -31 times! One expects that changing the sign of a term changes the sign of the result; it does not change the magnitude of the result.

If -123/4 is -30, then what is the remainder? It must obey the identity, so the remainder is -3. That is not the canonical item associated with the equivalence class that contains -123; that canonical item is 1.

It's important to remember this fact when doing things like balancing a hash table:

int bucketIndex = item.GetHashCode() % bucketCount;

or determining if a number is odd:

var oddNumbers = someSequence.Where(x => x % 2 == 1);

The first is wrong because the array index can be negative if the hash code is negative; the second does not classify -123 as an odd number. The % operator does not give the canonical modulus, it gives the remainder.

Comments

Anonymous
December 04, 2011
Wow. After so many years of C#, it's impressive to realize that I was wrong on the behaviour of such a simple operator Oo
Anonymous
December 05, 2011
What is the motivation for % to be remainder operator instead of modulus? Or, in other words, in what situations remainder is more useful?
Anonymous
December 05, 2011
almost every instance of 60 in your arithmetic examples should be replaced with 30. 4 * 60 is 240. :-) Whoops; that's what I get for editing the example after it was written. (I was originally going to talk about parity, so the multiplicands were twos.) You've got to apply the edits consistently it turns out. Thanks for the note. -- Eric
Anonymous
December 05, 2011
Nitpick: By the given definition of Reflexiveness (X∾X is true for any X), "Are both greater than zero" would not be an equivalence relation, as it would not hold for any X <= 0. I'm not a mathematician, so I'm not sure if there is actually a slightly different true definition of reflexiveness, or if this is an incorrect example. Looking at Wikipedia I believe it's the latter? en.wikipedia.org/.../Equivalence_relation But please don't get me wrong - this was a great post, Eric. :)
Anonymous
December 05, 2011
The comment has been removed
Anonymous
December 05, 2011
Some may find this paper, Division and Modulus for Computer Scientists, helpful: legacy.cs.uu.nl/.../divmodnote.pdf
Anonymous
December 05, 2011
@Vlad - there's no good reason for it to be that way except that early hardware where C first ran defined the operation that way. The remainder function on negative numbers is pretty useless and one in practice invariably winds up fixing the result if negative numbers are a possibility. For all the reasons "it's important to remember this fact," it's clearly the wrong decision to make when designing a programming language's semantics.
Anonymous
December 05, 2011
@Matt The reflexive property isn't dependent on just one comparison, but on the comparison of two comparisons. For example, It doesn't matter if a > 0 or b > 0, but that (a > 0) = (b > 0). Here is a pseudo-implementation (in something that looks a lot like javascript) that might help: function bothGreaterThanZero(a, b) { return a > 0 && b > 0; } function isReflexive(a, b, func) { x = func(a, b); y = func(b, a); return x === y; } var a = -1, b = -5; var result = isReflexive(a, b, bothGreaterThanZero); Since first and second are equal, the relation is reflexive.
Anonymous
December 05, 2011
@Darren: the PDPs for which C was originally cooked up had a very limited instruction set. Both kinds of remainders were available as common library functions though. Insofar as C#'s % definition is historically inspired, it is more likely due to the influence of the x86 idiv instruction. Also note that the first major .Net languages were C# and VB.Net which had a C++ and VB heritage. Since C++'s % was implementation defined it may have made sense to choose VB's Mod behaviour.
Anonymous
December 05, 2011
DOH! I need a rewind on that example. That is what I get for posting without running. function greaterThanZero(num) { return num > 0; } function isReflexive(a, b, func) { x = func(a); y = func(b); return x === y; } var a = -1, b = -5; var result = isReflexive(a, b, greaterThanZero);
Anonymous
December 05, 2011
Surely, the first rule on the reminder you want is abs(reminder) < abs(divisor). > Similarly, we want -123/4 to be -30, not -31! We actually want it to be -30.75, and if you round that down (which is what you want in most cases), you get -31. This is what you get in languages inspired by math, rather than CPU instruction sets. For a good treatment of the subject see Daan's classical paper: legacy.cs.uu.nl/.../divmodnote.pdf Or even Wikipedia: en.wikipedia.org/.../Modulo_operation
Anonymous
December 05, 2011
good to know! for some reason i thought that % operator always yields the canonical number ( a positive number). It turns out I was wrong and negative numbers are possible too.
Anonymous
December 05, 2011
@Eugene > Surely, the first rule on the reminder you want is abs(reminder) < abs(divisor). So? This requirement is observed by both the canonical modulus operation and by the x86-based remainder operation. >> Similarly, we want -123/4 to be -30, not -31! > We actually want it to be -30.75, and if you round that down > (which is what you want in most cases) Why? Why is "round toward negative infinity" a more useful operation than "round toward zero"? Alternatively, why is it good to have a system where (-x)/y != -(x/y)?
Anonymous
December 05, 2011
This is one of my bugbears. I don't think I have EVER found the remainder operation useful, except as a means to calculate the modulus. I've had to write a modulus function for pretty much every language I've ever used, and this isn't always entirely straightforward; handling edge cases (like INT_MIN) can be tricky. Modulus really ought to be a member of the Math class.
Anonymous
December 05, 2011
This is an interesting article. I am like many of the other posters--in 25 years of practical programming, I have never used any language's "remainder" operator for anything other than modulo on positive integers. Mandatory nitpick: not all relations are functions, although all functions are relations. And even more importantly, not all relations are binary.
Anonymous
December 05, 2011
The comment has been removed
Anonymous
December 05, 2011
> Transitive: if X∾Y and Y∾Z are both true then X∾Z is also true; otherwise it is false It seems there's an inaccuracy here, namely in the 'otherwise it is false' part. For equivalence there's no difference, but it is not part of 'transitive' definition. Assuming it is correct, we have that !(X∾Y && Y∾Z) -> !X?Z. Contraposition of implication is equivalent transition, so X∾Z -> X∾Y && Y∾Z. So, put simply, we have that if X∾Z is true, then for any Y should also be true X∾Y && Y∾Z which is definitely incorrect for any transitive relation. For simple example let's take > relation which is transitive as we know. Let X=3, Y=3, Z=1. 3>3 && 3>1 (X>Y && Y>Z) is not true, so according to your definition 3>1 (X>Z) should be false. Bah.
Anonymous
December 05, 2011
Dale: > Alternatively, why is it good to have a system where (-x)/y != -(x/y)? Suppose you're implementing a game, simulation or renderer where you have objects positioned in a 2D or 3D space. There is nothing special about the 0 coordinate, and you don't want a discontinuity as objects travel between negative and positive coordinates. Say you measure position in units and split space up into 3 unit square cells. If you use division and round-towards-zero integer division to determine cell, you get a discontinuity as an object moves across zero: Position = ... -11, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7... Cell = ... -3, -3, -3, -2, -2, -2, -1, -1, -1, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2... So your object spends longer in cell 0 and there's a discontinuity between negative and positive space. Here's a similar bug where rounding direction errors have caused unintentionally different behaviour in a game for negative coordinate positions than positive ones: gaming.stackexchange.com/.../do-positive-coordinate-locations-give-more-ore-in-minecraft I'd say it's not that the property "For all x, y: if y!=0 then (-x)/y == -(x/y)" is undesirable, but that in practice it's not as useful a property as "For all x, y, k: if y!=0 then x / y + k == (x + ky) / y" I would concur with the others and say that in 6 years as a professional programmer and maybe 20 years as an amateur I cannot think of a single instance of wanting the remainder operation, but I can think of a great many instances of wanting the modulus operator. (However, I also don't think I can remember an instance where I cared about the behaviour for a negative divisor, so I'm equally happy with floored division or Euclidian division, as described in the Wikipedia article.)
Anonymous
December 05, 2011
As an introductory reference on the subject, your article is missing one detail: an example telling how to get Modulus in C# for those cases where that IS what you want. What is the "best" way to do this? (Assuming you want a formula that works even in corner cases like INT_MIN).
Anonymous
December 06, 2011
public int Modulo (int x, int y) { if ( y == 0) return 0; int reduced = (x / y); int absReduced = reduced < 0 ? -1 * reduced : reduced; return absReduced % y; }
Anonymous
December 06, 2011
>> Transitive: if X∾Y and Y∾Z are both true then X∾Z is also true; otherwise it is false Ivan> It seems there's an inaccuracy here, namely in the 'otherwise it is false' part. For equivalence there's no difference, [...] For simple example let's take > relation which is transitive as we know. Let X=3, Y=3, Z=1. 3>3 && 3>1 (X>Y && Y>Z) is not true, so according to your definition 3>1 (X>Z) should be false. Bah. Actually, it fails for equivalences too. Take integer equality, X=1, Y=2, Z=1. X=Y is false, therefore X=Z, 1=1, is false. More formally, the author wrote "(X~Y)&&(Y~Z)=(X~Z)" when he only meant to say "(X~Y)&&(Y~Z)->(X~Z)". Overall though, well writen. It's interesting to contrast this further with other languages, as Anonymous Coward has. I was under the false impression for some time that C/C++ held to the remainder-quotient identity while rounding toward negative infinity, but this was just an artifact of the compiler/architecture. I have not performed % on a signed number since I learned the truth.
Anonymous
December 06, 2011
An easy way to get the Conical Modulus is to convert to uint. Using Eric's example: int bucketIndex = (int)((uint)item.GetHashCode() % (uint)bucketCount);
Anonymous
December 06, 2011
@Aaron: yeah, good catch, thanks. :)
Anonymous
December 06, 2011
@DRBlaise: This doesn't work. For example, (-1 mod 4) is 3, but your code gives 1.
Anonymous
December 06, 2011
To calculate the modulus "x mod n" I tend to use one of the following methods:

general: y = x%n; if y<0 { y = y+n; }
when n is a power of 2 use a bit mask: y = x & (n-1)
in situations I know that x and n are not negative: y = x%n; (remainder and modulus are the same in this case) And like many other respondents, I also never had any need for the "remainder" operation in my 20+ years of programming (and very frequent need of the modulus). The existing '%' operation cannot be changed in C# anymore, but isn't it time to add a proper modulus operator to the language? For example '%%'.

Anonymous
December 06, 2011
I tend to be really lazy:

int mod = (x % n + n) % n;
int mod = (x + bigConstant) % n; (if I know that x will never be smaller than -bigConstant, chosen to be a multiple of n)

Anonymous
December 06, 2011
> (Note that if a relation is an equivalence relation then the converse also holds: if X∾Y and Y∾Z are not both true the X∾Z is false.) If X = 1, Y = 2, Z = 1, and the equivalence relation is "=", does this mean that if 1 = 2 and 2 = 1 are not both true (which they aren't), then 1 = 1 is false? I think you meant if exactly one of X∾Y and Y∾Z is true, then X∾Z is false. Or I could be missing something.
Anonymous
December 06, 2011
@carlos: Converting to uint DOES work-Did you run your example through code (-1 mod 4)? int x = -1; int y = 4; Console.WriteLine((uint)x % (uint)y); This prints 3.
Anonymous
December 06, 2011
"(Note that if a relation is an equivalence relation then the converse also holds: if X∾Y and Y∾Z are not both true the X∾Z is false.)" For example, 1=2 and 2=1 are not both true, so 1=1 is false.
Anonymous
December 07, 2011
@DRBlaise: Sorry, I had a brain fart. I was thinking (uint) is the same as Abs(), when I know very well it isn't. However, your solution still fails unless 2 divides the dividend. e.g. for -1 mod 5 you get 0.
Anonymous
December 07, 2011
So I guess this has the same problem as everyone else as to what this operator does: msdn.microsoft.com/.../ydwa9zh0.aspx
Anonymous
December 09, 2011
+1 to weeble0's comment. I have also been a professional programmer for 6-7 years and an amateur for 20 years, and I am also not aware of a single instance of wanting the "remainder" operation involving negative numbers, but I can think of several instances of wanting the modulus operator (i.e. where I want 'mod' to produce a positive result). Normalizing angles comes to mind immediately. It would sure be SO nice if I could just write "angle % 360.0".
Anonymous
December 11, 2011
The comment has been removed
Anonymous
December 22, 2011
For an easy modulus that includes negatives, I use the below (in the call to WriteLine): int n = 5; for (int i = -10; i < 11; i++) { Console.WriteLine(i % n < 0 ? n+(i % n) : i % n); }
Anonymous
December 27, 2011
It's wonderful to see mathematics still being talked about as it relates to programming. Too often it's just a matter of trying to see what give the answer that's useful to you in that situation and not bothering to look at the "real" story. Great article.
Anonymous
January 28, 2012
Just one month ago you recommended "Java Puzzlers: Traps, Pitfalls, And Corner Cases" on your C# reading list. The unintuitive behaviour of '%' is the very first "puzzler" in the book. You are proving your own point that it is a useful book for C# devs to read as well.
Anonymous
January 31, 2012
Eric, why use x % 2 == 1 instead of just x % 2 != 0 ?
Anonymous
November 21, 2012
Surely thi is a classic example of Principle of Greatest Surprise: How to make sure to surprise the greatest number of programmers in the worst possible way at the worst possible time. Haivng been bitten by this in C, but understanding why C++ had to remain backwards compatible, I would have bet real money that C# would make a more sensible decision. Wrong! DO we have to wait another 30 year for a programming language to get it right?

次の方法で共有

What's the difference? Remainder vs Modulus

Comments

その他のリソース