0.0 The natural numbers

We start by discussing the natural numbers, which we define in the following way:
 
Definition 0.1 (Natural numbers)
A natural number is a member of the sequence \(0,1,2,3,\dots\), obtained by starting from 0 and adding 1 successively.
 
Notation
We write \(\mathbb{N} = \left\{  0,1,2,3,\ldots\right\} \) for the set of all natural numbers.
 
The curly bracket notation here indicates that the objects are grouped together as a set.  A set is simply a collection of objects; we will discuss more about sets later in chapter 1.
 
Not everyone agrees that 0 should be included as a natural number. When counting (for example, when numbering sections of a document) most people tend to start at 1. Definitions are important in mathematics, and if you're reading material by two different authors then you should always check the definitions that they have each chosen to use.
 
Natural numbers can be added and multiplied. That is, if \(m\) and \(n\) are natural numbers, then we can construct \(m+n\) and \(m\times n\), which are also natural numbers. Addition and multiplication are examples of binary operations: they take a pair of elements from \(\mathbb{N}\) and produce an element of \(\mathbb{N}\). You've done so much adding and multiplying before in your life that you probably don't think that these need a definition. Nevertheless, we will attempt to define them below!
 
Two important natural numbers are 0 and 1, which are the additive and multiplicative identities, meaning they have the properties
 
\(n+0 = n \quad \textrm{and} \quad n\times1 = n \quad \textrm{for all} \ n \in \mathbb{N}.\)
 
The symbol \(\in\) is shorthand to mean "is an element of", and we read this as "for all \(n\) in the natural numbers", or "for all natural numbers \(n\)".
 
Another important property of the natural numbers is that they have an ordering, so we can write things like \(m\le n\). We can carefully define this less-than-or-equal-to symbol:
 
Definition 0.2 (Less than)
Let \(m\) and \(n\) be natural numbers.  We write \(m\le n\) to mean that there exists a natural number \(k\) such that \(m + k= n\).
 
This is an example of a relation, which we'll discuss in chapter 2. This definition includes the case where \(m=n\), because our definition of the natural numbers included 0, and because \(m+0=m\).
 
Notice that our description of the natural numbers is a little unsatisfactory because it relies on the idea of "adding 1", and we still haven't defined addition yet. For the purpose of this course we will rely on our basic intuition for these things, but it is possible to be more careful, building on an axiomatic description of \(\mathbb{N}\) (that is, laying out some clear axioms - statements that we assume to be true as a starting point - and then deducing all other properties from those). A popular way to do this is with the Peano axioms, one of which involves a "successor function". Functions will be discussed in chapter 3.
 

0.1 Mathematical induction

We now move on to talk about induction. The following principle is sometimes quoted as a theorem, although it follows directly from our definition of the natural numbers. In fact it can be used as an axiom when defining \(\mathbb{N}\) in a more rigorous manner.
 
Theorem 0.3 (Principle of Induction)
Let \(P(n)\) be a family of statements indexed by the natural numbers.  Suppose that
   (i) \(P(0)\) is true, and
   (ii) for any \(n\), if \(P(n)\) is true then \(P(n+1)\) is also true. 
Then \(P(n)\) is true for all natural numbers \(n\).
 
Induction is often visualised like toppling dominoes.  The inductive step (ii) corresponds to placing each domino sufficiently close that it will be hit when the previous one falls over, and the initial step, (i) - often called the base case - corresponds to knocking over the first one.  
 
To use induction to prove a family of statements, we simply have to demonstrate (i) and (ii). Before I show you an example, I would like to show you some notation for sums.
 
Notation
The expression
\(\displaystyle \sum_{k=a}^b f(k)\)
where \(f\) is a function means \(f(a)+f(a+1)+f(a+2)+\dots+f(b-1)+f(b)\). We take the sum of the values of \(f\) over all integer \(k\) such that \(a\leq k \leq b\).
 
Note that the sum is not itself a function and doesn't involve \(k\).
 
When written in-line, these sums are sometimes written like this; \(\sum_{k=0}^n f(k)\).
 
By convention, empty sums like \(\sum_{k=0}^{-3} f(k)\) are understood to be zero.
 
Notation
If we want the sum over something other than consecutive values of \(k\), then we could write
\(\displaystyle \sum_{k\in S} f(k)\)
to indicate a sum over values in the set \(S\), or we could even use words like
\(\displaystyle \sum_{p \text{ prime}} f(p),\)
but I'm getting ahead of myself.
 
 
We're ready for an example of proof by induction now.
 
Proposition 0.4
For any \(n\in\mathbb{N}\) 
\(\displaystyle\sum_{k = 0}^{n} k = \frac{n(n+1)}{2}.\)
 
Proof. In this case \(P(n)\) is the statement that the given equality holds for that particular \(n\).
Clearly \(P(0)\) holds because for \(n=0\) the sum on the left-hand side is 0 and the expression on the right-hand side is also 0.
Now suppose \(P(n)\) holds. Then
\( \begin{align*} \sum_{k = 0}^{n+1} k &= \sum_{k = 0}^{n} k + (n+1) \\ & = \frac{n(n+1)}{2} + (n+1) \qquad \qquad \textrm{(by the inductive hypothesis)} \\ & = \frac{(n+1)(n+2)}{2}, \end{align*} \)
which is exactly the statement \(P(n+1)\).  So by induction, \(P(n)\) is true for all \(n\).
\(\square\)
 
The small square on the right here is used to signify that we've reached the end of the proof. Historically the letters QED were used for this purpose, but that has largely gone out of fashion. Other symbols, including a filled square \(\blacksquare\), are sometimes used.
 
The abbreviations LHS and RHS are commonly used to refer to the "left-hand side" and "right-hand side" of an equality. When using them in your mathematical writing, make sure it is clear which equality you are referring to. I've written a note at the side where I've used the inductive hypothesis (the assumption that \(P(n)\) holds) to help communicate my argument. You should do the same; a well-written proof should always explain any steps that are not just routine algebra. You should always write in full sentences.
 
A straightforward extension of induction is if the family of statements holds for \(n\ge N\), rather than necessarily \(n\ge 0\).
 
Corollary 0.5
Let \(N\) be an integer and let \(P(n)\) be a family of statements indexed by integers \(n\ge N\).  Suppose that
   (i) \(P(N)\) is true, and
   (ii) for any \(n \ge N\), if \(P(n)\) is true then \(P(n+1)\) is also true.
Then \(P(n)\) is true for all \(n \ge N\).
 
Proof. This follows directly by applying the principle of induction to the statements \(Q(n) = P(n+N)\) for \(n\in \mathbb{N}\).
\(\square\)
 
We use the word "corollary" to mean a result that is an extension of, or a consequence of, a theorem or proposition; a corollary is generally not such a major result as the theorem or proposition itself.  
 
The words "theorem" and "proposition" are used somewhat interchangeably to mean a result that one has proved (unlike a "conjecture", which is something that has not yet been proven). Theorem is typically used for more significant results, and theorems are often given a specific name.  
 
We also use the word "lemma", to mean a result that is going to be useful in proving a later theorem or proposition. Lemmas are typically not such exciting or major results in themselves.
 
 
Another variant on induction is when the inductive step relies on some earlier case(s) but not necessarily just the immediately previous case.  This is sometimes called strong induction:
 
Theorem 0.6 (Strong Form of Induction)
Let \(P(n)\) be a family of statements indexed by the natural numbers.  Suppose that
    (i) \(P(0)\) is true, and
    (ii) for any \(n\), if \(P(0)\), \(P(1)\), \(\ldots\), \(P(n)\) are true then \(P(n+1)\) is also true.
Then \(P(n)\) is true for all natural numbers \(n\).
 
Proof. Like above, we define a related family of statements \(Q(n)\). To do this, let \(Q(n)\) be the statement "\(P(k)\) holds for \(k = 0,1,\ldots n\)".  Then the conditions for the strong form of induction are equivalent to (i) \(Q(0)\) holds and (ii) for any \(n\), if \(Q(n)\) is true then \(Q(n+1)\) is also true.  It follows by the (normal) principle of induction that \(Q(n)\) holds for all \(n\), and hence \(P(n)\) holds for all \(n\).
\(\square\)
 
For an example to illustrate how the strong form of induction can be useful, I need to define a couple of terms that are hopefully familiar to you.
 
Definition 0.7 (Divides)
Given natural numbers \(m\) and \(n\), we say that \(m\) divides \(n\) if there exists a natural number \(k\) such that \(m\times k=n\). We write \(m|n\) if \(m\) divides \(n\).
 
 
Definition 0.8 (Prime number)
A natural number \(p>1\) is prime if no natural numbers except 1 and \(p\) divide \(p\).
 
 
Proposition 0.9
Every natural number greater than 1 may be expressed as a product of one or more prime numbers.
 
Proof. Let \(P(n)\) be the statement that \(n\) may be expressed as a product of prime numbers.
 
Clearly \(P(2)\) holds, since 2 is itself prime.
 
Let \(n \ge 2\) be a natural number and suppose that \(P(m)\) holds for all \(m<n\). If \(n\) is prime then it is the product of one prime number, \(n\). If \(n\) is not prime, then there must exist some \(r,s >1\) such that \(n = rs\). By the inductive hypothesis, each of \(r\) and \(s\) can be written as a product of primes, and therefore \(n = rs\) is also a product of primes. Thus, whether \(n\) is prime or not, we have have that \(P(n)\) holds.
 
By strong induction, \(P(n)\) is true for all natural numbers.  That is, every natural number greater than 1 may be expressed as a product of one or more primes.
\(\square\)
 
Related to induction is the idea of recursion as a method of definition.  For example, if we suppose that we are happy with what it means to "add 1" (noting the discussion above), then we can recursively define more general addition on the natural numbers.
 
Definition 0.10 (Addition of natural numbers)
Define addition on \(\mathbb{N}\) by the rules that for all \(m \in \mathbb{N}\),
    (i) \(m + 0 = m\), and
    (ii) for any \(n \in \mathbb{N}\), \(m + (n+1) = (m+n) + 1\).
 
We can combine this with induction to prove some useful properties.  For example,
 
Proposition 0.11 (Associativity)
Addition on \(\mathbb{N}\) is associative.  That is, for all \(x,y,z \in \mathbb{N}\),
\(x+(y+z) = (x+y)+z.\)
 
Proof. We induct on \(z\), so first suppose \(z = 0\).  Then, for any \(x,y \in \mathbb{N}\)
\(\text{LHS} = x+(y+0) = x + y = (x+y) + 0 = \text{RHS},\)
where we have twice used rule (i) from our definition of addition.
 
Now for the inductive step, suppose the proposition is true for \(z = n\), and consider the case \(z = n+1\).  Then, for any \(x,y \in \mathbb{N}\),
\( \begin{align*} \text{LHS} &= x+\left(  y+\left(  n+1\right)  \right) \\   & =x+\left(  (y+n)+1\right)\qquad\text{(rule (ii) from the definition)}\\ & =(x+(y+n))+1\qquad\text{(rule (ii) from the definition)}\\ & =((x+y)+n)+1\qquad\text{(inductive hypothesis)}\\ & =(x+y)+(n+1) \qquad\text{(rule (ii) from the definition)}\\ & = \text{RHS}. \end{align*} \)
So, by induction, the expression holds for any \(z \in \mathbb{N}\).  Thus addition is associative.
\(\square\)
 
We can use a similar approach to define multiplication and factorial.
 
Definition 0.12 (Multiplication of natural numbers)
Define multiplication on \(\mathbb{N}\) by the rules that for all \(m \in \mathbb{N}\),
   (i) \(m \times 0 = 0\), and 
   (ii) for any \(n \in \mathbb{N}\), \(m \times (n+1) = (m\times n) + m\).
 
Definition 0.13 (Factorial)
Define factorial \(n!\) on \(\mathbb{N}\) by the rules that
   (i) \(0! = 1\), and
   (ii) for any \(n \in \mathbb{N}\), \((n+1)! = n! \times (n+1)\)
 
Here is another important property of the natural numbers that we can prove using induction.
 
Theorem 0.14 (Well-ordering property of the natural numbers)
Every non-empty subset of \(\mathbb{N}\) has a least element.
 
We have not yet defined a subset, but maybe you can guess what it means. \(S\) is a subset of \(\mathbb{N}\) if every element of \(S\) is also an element of \(\mathbb{N}\). Non-empty means it contains one or more elements.
 
Proof. We prove this by contradiction. Suppose, for a contradiction, that there is a non-empty subset \(S\) that does not have a least element. We define \(S^*\) to be the set of natural numbers that are not in \(S\), and aim to show by induction that in fact \(S^*= \mathbb{N}\)
 
Let \(P(n)\) be the statement that \(S^*\) contains \(n\).
 
For the initial step, note that 0 is not in \(S\), since if it were, then \(S\) would have a least element (namely 0). So \(0 \in S^*\) and therefore \(P(0)\) holds.
 
Now suppose \(P(0),\ldots, P(n)\) hold. Then \(n+1\) cannot be in \(S\), because if it were then it would be the least element of \(S\) (since by the inductive hypothesis all the smaller elements of \(\mathbb{N}\) are not in \(S\)). Hence \(n+1 \in S^*\), and therefore \(P(n+1)\) holds.
 
By strong induction, \(n \in S^*\) for all \(n \in \mathbb{N}\), and therefore \(S\) is empty. This contradicts our initial assumption and therefore proves the result.
\(\square\)
 
Here we have laid out the proof by carefully defining the statements \(P(n)\) involved in the inductive argument.
 
The well-ordering property of the natural numbers is one that you may well think is "obvious" (though note that the same property is not true of the real numbers, for example, so it is not an entirely trivial property). We used the principle of induction to prove it. In fact, the well-ordering property is equivalent to the principle of induction; it is also possible to work the other way and use the well-ordering principle to prove the principle of induction. Here is a proof of Theorem \ref{induction} (Principle of Induction) based only on the well-ordering property. As a reminder, in Theorem \ref{induction} we have a family of statements \(P(n)\) and we suppose that (i) \(P(0)\) is true, and (ii) for any \(n\), if \(P(n)\) is true then \(P(n+1)\) is also true.
 
Proof of Theorem 0.3. Let \(S\)be the set of natural numbers \(n\) such that \(P(n)\) is false. We aim to show that \(S\) is empty. 
 
Suppose, for a contradiction, that \(S\) were not empty. Then the well-ordering property means that \(S\) has a least element. That least element cannot be 0 since \(P(0)\) is true by the initial step (i). Therefore we can write the least element as \(n+1\) for some \(n \in \mathbb{N}\). Since \(n+1\) is the least element in \(S\) it must be the case that \(n\) is not in \(S\), and so \(P(n)\) holds.   But then the inductive step (ii) implies that \(P(n+1)\) also holds, which contradicts \(n+1\) being in \(S\).
 
Thus \(S\) must be empty, and therefore \(P(n)\) holds for all \(n \in \mathbb{N}\).
\(\square\)
 
 

0.2 The binomial theorem

We next aim to prove the binomial theorem, which provides the rule for how to expand a product of the form \((x+y)^{n}\).  First, we define some notation:
 
Definition 0.15 (Binomial coefficient)
For natural numbers \(n\) and \(k\), we define the binomial coefficient
\(\displaystyle\binom{n}{k}=\frac{n!}{k!(n-k)!} \qquad \text{for} \ 0\leqslant k \leqslant n.\)
This is read as `\(n\) choose \(k\)' and is also sometimes denoted by \(^{n}C_{k}.\) By convention \(\binom{n}{k}=0\) if \(k>n\).
 
 
The binomial coefficients appear in many areas of mathematics.  They represent the number of ways of choosing \(k\) elements from a set of size \(n\).  They can famously be laid out as an array called Pascal's triangle (or Khayyam's triangle, or Yang Hui's triangle, or Tartaglia's triangle), in which the \(n\)th row contains each of the non-zero \(\binom{n}{k}\):
 
\(n=0\)           1          
\(n=1\)         1   1        
\(n=2\)       1   2   1      
\(n=3\)     1   3   3   1    
\(n=4\)   1   4   6   4   1  
\(n=5\) 1   5   10   10   5   1
 
The following result is an algebraic expression of the defining feature of Pascal's triangle, that each entry is the sum of the two entries most immediately above it.
 
Lemma 0.16 (Pascal's Triangle)
Let \(n\) and \(k\) be natural numbers with \(1\le k \le n\).  Then
\(\displaystyle\binom{n}{k-1}+\binom{n}{k}=\binom{n+1}{k}.\)
Proof. This is simply a question of putting in the definitions and playing with the algebra.  Putting the left hand side over a common denominator we obtain
\( \begin{align*} \frac{n!}{(k-1)!(n-k+1)!}+\frac{n!}{k!(n-k)!}  & =\frac{n!\left( k+(n-k+1)\right)  }{k!(n-k+1)!}\\ & =\frac{n!\times(n+1)}{k!(n-k+1)!}\\         & =\frac{(n+1)!}{k!(n+1-k)!}, \end{align*} \)
which is equal to the right-hand side.
\(\square\)
 
This lemma can be used to show by induction that the binomial coefficients are integers rather than just rational numbers (a fact that is perhaps not immediately obvious from the definition).  
 
We are now in a position to prove the binomial theorem.
 
Theorem 0.17 (Binomial Theorem)
Let \(x\) and \(y\) be real (or complex) numbers, and \(n\) be any natural number. Then
\(\displaystyle (x+y)^{n} = \sum_{k=0}^{n} \binom{n}{k} x^{k} y^{n-k}.\)
 
Proof. We use induction on \(n\).  First we check that the expression holds for \(n = 0\).  This is true, since the left hand side is 1 in that case, and the right hand side is also 1 (because \(\binom{0}{0} = 1\) and any number raised to the power 0 is 1).  Now assume the expression holds for \(n\) and consider the case for \(n+1\),
\( \begin{align*} (x+y)^{n+1}&=(x+y)(x+y)^{n}\\ &=(x+y)\left(  \sum_{k=0}^{n}\binom{n}{k}x^{k}y^{n-k}\right)\qquad \textrm{(by the inductive hypothesis).} \end{align*} \)
Continuing to expand the brackets gives
\(\displaystyle \sum_{k=0}^{n}\binom{n}{k}x^{k+1}y^{n-k}+\sum_{k=0}^{n}\binom{n}{k} x^{k}y^{n+1-k}=x^{n+1}+\sum_{k=0}^{n-1}\binom{n}{k}x^{k+1}y^{n-k}+\sum_{k=1}^{n}\binom{n}{k}x^{k}y^{n+1-k}+y^{n+1},\)
where we have taken out the last term from the first sum and the first term from the
second sum. In the first sum we now make a change of indexing variable; we set \(k=l-1,\) noting that as \(k\) ranges over \(0,1,...,n-1\) then \(l\) ranges over \(1,2,...,n.\) So the above equals
\( \begin{align*} & x^{n+1}+\sum_{l=1}^{n}\binom{n}{l-1}x^{l}y^{n+1-l}+\sum_{k=1}^{n}\binom{n}{k}x^{k}y^{n+1-k}+y^{n+1}\\ & =x^{n+1}+\sum_{k=1}^{n}\left\{ \binom{n}{k-1}+\binom{n}{k}\right\} x^{k}y^{n+1-k}+y^{n+1} \qquad & \text{[relabeling $l$ as $k$]} \\ & =x^{n+1}+\sum_{k=1}^{n}\binom{n+1}{k}x^{k}y^{n+1-k}+y^{n+1} \qquad & \text{[using Lemma 0.16]} \\ & =\sum_{k=0}^{n+1}\binom{n+1}{k}x^{k}y^{n+1-k}, \end{align*} \)
which shows that the expression holds for \(n+1\).  Thus, by induction, the expression holds for all \(n\).
\(\square\)
 
Last modified: Wednesday, 1 October 2025, 1:35 PM