4 Logic and Proof
4.0 Logical statements and notation
We have already seen that we must deal with a lot of precise statements and assertions, ranging from very simple ones like "\(n = 2\)" to slightly more involved ones like
for all \(x\in \mathbb{R},\, x^2\ge 0\),
or
there exist \(x,y,z\in \mathbb{N}\backslash\{0\}\) such that \(x^{2025}+y^{2025} = z^{2025}\).
These statements can be true or false (perhaps you recognise the final example just given, which is famously false by Fermat's Last Theorem). Often during a mathematical argument we might not (yet) know if a given statement is true or false. Nevertheless we can work with it; following through the logical consequences of a statement may eventually lead us to the conclusion that it is false, for example.
To discuss such logic it is helpful to denote the statements by symbols; we have already seen examples of this.
If \(P\) is the statement `\(x\ge 2\)' and \(Q\) is `\(x^2\ge 4\)' (in the context of \(\mathbb{R}\)), we can then say things like `\(P\) implies \(Q\)' (which is, itself, another logical statement - one that we know to be true in this case).
We can combine logical statements using connecting words like `and', and `or', and we can negate a statement \(P\) by writing `not \(P\)'.
For the avoidance of doubt, we define these with truth tables.
Definition 4.1 (And, Or, Not)
If \(P\) and \(Q\) are statements, then the statements `\(P\) and \(Q\)' and `\(P\) or \(Q\)' are true or false as per the following truth table.
\(P\) | \(Q\) | \(P\) and \(Q\) | \(P\) or \(Q\) |
F | F | F | F |
F | T | F | T |
T | F | F | T |
T | T | T | T |
If \(P\) is a statement, then the statement `not \(P\)' is true or false as per the following truth table.
\(P\) | not \(P\) |
F | T |
T | F |
In regular English, the word `or' is often interpreted as an exclusive or; that is, it may carry an implicit meaning of `one but not the other' (as in `you can have a piece of cake or an ice cream'). This is not the case in mathematical usage, where `\(P\) or \(Q\)' should be interpreted to mean that \(P\) holds or \(Q\) holds or both do.
Symbols are sometimes used for these.
Notation.
\(P\wedge Q\) means `\(P\) and \(Q\)'. \(P \vee Q\) means `\(P\) or \(Q\)'. \(\lnot P\) means `not \(P\)'.
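As an aside (not part of the original notes), these truth tables can be generated mechanically; Python's built-in `and`, `or` and `not` operators agree with Definition 4.1, as this short sketch shows:

```python
from itertools import product

# Tabulate 'P and Q' and 'P or Q' over all four truth assignments,
# mirroring the truth table in Definition 4.1.
table = {(P, Q): (P and Q, P or Q) for P, Q in product([False, True], repeat=2)}
for (P, Q), (conj, disj) in sorted(table.items()):
    print(P, Q, conj, disj)
```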
Example 4.2
If \(P\) is `\(n = 2\)' and \(Q\) is `\(n\) is even' (in the context of \(\mathbb{N}\)), then `\(P \vee Q\)' is equivalent to `\(n\) is even', and `\(P \wedge Q\)' means `\(n = 2\)', and \(\lnot P\) is the same as `\(n\neq 2\)'.
There is a direct analogy between the symbols \(\vee\), \(\wedge\) and \(\lnot\), and the symbols \(\cup\), \(\cap\), and \({}^{c}\) for set union, intersection and complement. To make this analogy clear, suppose \(A,B\subseteq S\) are sets and let \(P\) and \(Q\) be the statements `\(x\in A\)' and `\(x\in B\)'. Then \(x \in A\cup B\) is clearly equivalent to \(P\vee Q\), \(x \in A\cap B\) is equivalent to \(P \wedge Q\), and \(x\in A^{c}\) is equivalent to \(\lnot P\).
These logical symbols therefore obey the same distributive laws as for sets (Proposition 1.19) and also De Morgan's laws (Proposition 1.20), which in this context are
Proposition 4.3 (De Morgan's laws for logic)
(i) \(\lnot (P\vee Q)\) if and only if \((\lnot P) \wedge (\lnot Q)\)
(ii) \(\lnot (P\wedge Q)\) if and only if \((\lnot P) \vee (\lnot Q).\)
These rules for how to negate `\(P\) or \(Q\)' or `\(P\) and \(Q\)' are hopefully quite intuitive (if we don't have \(P\) or \(Q\) holding then that means we don't have \(P\) holding and we don't have \(Q\) holding). But when it comes to more complicated statements it is easy to confuse ourselves, so having these clear rules to fall back on may be useful.
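Since each side of De Morgan's laws depends only on the truth values of \(P\) and \(Q\), the laws can be confirmed by checking all four cases. A minimal Python sketch of this exhaustive check:

```python
from itertools import product

cases = list(product([False, True], repeat=2))

# De Morgan's law (i): not (P or Q)  <=>  (not P) and (not Q)
law_i = all((not (P or Q)) == ((not P) and (not Q)) for P, Q in cases)
# De Morgan's law (ii): not (P and Q)  <=>  (not P) or (not Q)
law_ii = all((not (P and Q)) == ((not P) or (not Q)) for P, Q in cases)
print(law_i, law_ii)
```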
4.1 Implies
Definition 4.4 (Implies)
The statement `\(P\) implies \(Q\)' means \(Q \vee (\lnot P)\) and is written \(P \Rightarrow Q\). With a truth table we can write
\(P\) | \(Q\) | \(P\Rightarrow Q\) |
F | F | T |
F | T | T |
T | F | F |
T | T | T |
In regular English usage, `implies' or `if ... then ...' tends to be understood to indicate a degree of causation; that \(P\) has something to do with \(Q\). In mathematical usage this does not need to be the case. So if we say `\(P \Rightarrow Q\)' or `If \(P\) then \(Q\)', we simply mean that whenever \(P\) is true, then \(Q\) is also true.
So `If Paris is the capital of France then the Thames flows through London' is a true statement, despite the fact that there is obviously no connection between these two facts.
Similarly, `If Oxford is on Mars then Cambridge is on Venus' is also a true statement. A statement like this, where the `\(P\)' is never true, is said to be vacuously true. A similarly useless statement might begin `for all \(x \in \varnothing\), ...'.
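Definition 4.4 can likewise be checked case by case. A Python sketch (the helper name `implies` is mine, not standard notation) also illustrates vacuous truth: `implies(False, Q)` is `True` whatever `Q` is.

```python
def implies(P, Q):
    # Definition 4.4: 'P implies Q' means Q or (not P).
    return Q or (not P)

# The only false case is a true hypothesis with a false conclusion.
for P in (False, True):
    for Q in (False, True):
        print(P, Q, implies(P, Q))

# Vacuously true statements: the hypothesis is never satisfied.
print(implies(False, False), implies(False, True))
```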
Note that, unlike \(P\wedge Q\) and \(P\vee Q\), there is an asymmetry here; `\(P\) implies \(Q\)' is not the same statement as `\(Q\) implies \(P\)'.
Example 4.5
If \(P\) is `\(n = 2\)' and \(Q\) is `\(n\) is even', then \(P \Rightarrow Q\), but \(Q \not\Rightarrow P\).
Statements of the form \(P\Rightarrow Q\) are very common, and there are lots of ways to say the same thing. The following are equivalent.
(i) if \(P\) then \(Q\);
(ii) \(P\) implies \(Q\);
(iii) \(P \Rightarrow Q\);
(iv) \(P\) only if \(Q\);
(v) \(P\) is a sufficient condition for \(Q\);
(vi) \(Q\) is a necessary condition for \(P\);
(vii) if \(Q\) does not hold then \(P\) does not hold;
(viii) not \(Q\) implies not \(P\);
(ix) \((\lnot Q) \Rightarrow (\lnot P)\).
The last three of these are known as the contrapositive.
Definition 4.6 (Contrapositive, converse)
(i) For a statement of the form `\(P \Rightarrow Q\)', the contrapositive is \(\lnot Q \Rightarrow \lnot P\). It is always true when the original statement is true (indeed, the two are logically equivalent).
(ii) For a statement of the form \(P \Rightarrow Q\), the converse is \(Q \Rightarrow P\). It is sometimes but not always true when the original statement is true.
(iii) For a statement of the form \(P \Rightarrow Q\), the negation is \(\lnot(P \Rightarrow Q)\), which is never true when the original statement is true.
Using De Morgan's laws for logic (Proposition 4.3), the negation of \(P\Rightarrow Q\) can also be written as \(P \wedge (\lnot Q)\), that is `\(P\) is true and \(Q\) is false'.
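All three parts of Definition 4.6, together with the \(P \wedge (\lnot Q)\) form of the negation, can be verified by exhausting the four truth assignments. A Python sketch (with a hypothetical helper `implies` standing for \(\Rightarrow\)):

```python
from itertools import product

def implies(P, Q):
    return Q or (not P)  # Definition 4.4

cases = list(product([False, True], repeat=2))

# (i) The contrapositive has the same truth value as the original statement.
contrapositive_equiv = all(implies(P, Q) == implies(not Q, not P) for P, Q in cases)
# (ii) The converse is not equivalent: it differs when exactly one of P, Q holds.
converse_differs = any(implies(P, Q) != implies(Q, P) for P, Q in cases)
# (iii) The negation of P => Q is exactly 'P and not Q'.
negation_form = all((not implies(P, Q)) == (P and (not Q)) for P, Q in cases)
print(contrapositive_equiv, converse_differs, negation_form)
```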
In order to prove a statement of this form, we typically start by assuming that \(P\) holds and try to deduce through some logical steps that \(Q\) holds too. Alternatively, we can start by assuming that \(Q\) does not hold and show that \(P\) does not hold (that is, we prove the contrapositive).
If using \(\Rightarrow\), note that the symbol stands for both the `if' and the `then'. We shouldn't use it to stand for just the `then' as, for example, in `if \(x = -1\) \(\Rightarrow\) \(x^2 =1\)'. This would mean `if \(x = -1\) implies that \(x^2 = 1\)' and would need to be followed by `then ...' (to match the `if').
As noted above, even if we know that \(P \Rightarrow Q\), it might or might not be the case that \(Q \Rightarrow P\).
Notation.
We write \(P \Leftrightarrow Q\) to mean `\(P \Rightarrow Q\) and \(Q \Rightarrow P\)'.
You can use the definition of \(P\Rightarrow Q\) and De Morgan's laws for logic (Proposition 4.3) to deduce that \(P\Leftrightarrow Q\) means \((P \wedge Q) \vee (\lnot P \wedge \lnot Q)\).
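That deduction can also be made by brute force over the four truth assignments; a quick Python check of the claimed form (again using a hypothetical `implies` helper):

```python
from itertools import product

def implies(P, Q):
    return Q or (not P)  # Definition 4.4

# P <=> Q means (P => Q) and (Q => P); compare with (P and Q) or (not P and not Q).
iff_matches = all(
    (implies(P, Q) and implies(Q, P)) == ((P and Q) or ((not P) and (not Q)))
    for P, Q in product([False, True], repeat=2)
)
print(iff_matches)
```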
We can read this as `\(P\) if and only if \(Q\)', or `\(P\) is necessary and sufficient for \(Q\)', or `\(P\) is equivalent to \(Q\)'.
The use of the \(\Leftrightarrow\) symbol in a proof needs some caution, as we'll discuss later (it has a logical implication both forwards and backwards, whereas our thought process tends to work in one direction most of the time!). The letters `iff' are also commonly used to stand for `if and only if'.
These statements are usually best thought of separately as `if' and `only if' statements. So to prove `\(P\) if and only if \(Q\)', we should first prove `if \(P\) then \(Q\)', and then separately prove `if \(Q\) then \(P\)' (or vice versa). Sometimes we may find that essentially the same argument used for the first direction also works in reverse, but sometimes quite a different method of argument may be required. One thing to be wary of is that having assumed \(P\) to deduce \(Q\), and then having changed to assuming \(Q\) with a view to deducing \(P\), it is all too easy to keep making use of \(P\) (or parts of \(P\)), forgetting that that is no longer assumed. It is a good idea to make very clear, both to yourself and in your written proof, which direction you are doing.
4.2 Formulation of mathematical statements
We've seen the phrases `for all' and `there exists' many times already in these lectures. It's time to introduce some notation for them.
Notation.
The symbol \(\forall\) denotes `for all' or `for every', and can simply be used as shorthand for those words.
The symbol \(\exists\) denotes `there exists', and can similarly replace those words.
The symbols \(\forall\) and \(\exists\) are known as quantifiers.
Typically a phrase like `there exists \(x \in S\)' is followed by a statement \(P(x)\) about what specific property \(x\) has, and it is quite common for \(\exists\) to stand for the following `such that' as well as the `there exists'. For example, `\(\exists x \in S \quad P(x)\)' can be read as `there exists \(x\) in \(S\) such that \(P(x)\) holds'. Personally I prefer to include the letters `s.t.' (or a colon `\(:\)') to stand in place of the `such that', so I write `\(\exists x \in S \,\text{s.t.}\, P(x)\)'.
Notation.
The symbol \(\exists !\) means `there exists unique', implying that there is one, and only one, element with the given property.
It is helpful to practise `translating' such statements into English sentences, and vice versa.
Example 4.7
Using these symbols we can write things like
\(\displaystyle \forall x \in \mathbb{R}\quad x^2 \ge 0 \quad \text{but}\quad \exists x \in \mathbb{C} \text{ s.t. } x^2 < 0,\)
which you should read as `for all \(x\) in the real numbers, \(x^2\) is greater than or equal to zero, but there exists \(x\) in the complex numbers such that \(x^2\) is less than zero'.
Example 4.8
The definitions of what it means for \(f\colon X \rightarrow Y\) to be injective and surjective can be written as (respectively)
\(\displaystyle \forall x_1,x_2 \in X, \ f(x_1) = f(x_2) \Rightarrow x_1 = x_2,\)
and
\(\displaystyle \forall y \in Y, \ \exists x \in X \,\text{s.t.}\, f(x) = y.\)
Example 4.9
If \(\mathbb{P}\) is the set of prime numbers, then
\(\displaystyle \forall p \in \mathbb{P},\quad p>2 \Rightarrow \exists n\in\mathbb{N} \,\text{s.t.}\, p=2n+1,\)
is a way of stating `every prime number greater than \(2\) is odd', and
\(\displaystyle \forall x \in \mathbb{R}, \quad \left(\ x<0 \quad \text{or}\quad \exists y \in \mathbb{R} \,\text{s.t.}\, y^2 = x\ \right),\)
is a way of stating `every non-negative real number has a real square root'.
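Statements quantified over infinite sets cannot be checked exhaustively, but spot-checking small cases is a useful habit. Here is a Python sketch testing the first statement of Example 4.9 for primes below 100 (the helper `is_prime` is my own, for illustration):

```python
def is_prime(p):
    # Trial division; fine for small p.
    return p >= 2 and all(p % d != 0 for d in range(2, int(p ** 0.5) + 1))

primes = [p for p in range(100) if is_prime(p)]

# For every prime p > 2, there exists n with p = 2n + 1, i.e. p is odd.
all_odd = all(any(p == 2 * n + 1 for n in range(p)) for p in primes if p > 2)
print(all_odd)
```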
The quantifiers should include a specification of the set over which they range (\(\mathbb{P}\) or \(\mathbb{R}\) in these examples). However, there are situations when this is so obvious from the context that it becomes cumbersome to keep writing this. In particular, you'll often see `\(\forall \varepsilon > 0\)', in which it is understood that \(\varepsilon\) is a real number.
To prove a statement of the form `\(\forall x \in X\ P(x)\)', it is always a good idea to start the proof with `Let \(x \in X\).' or `Suppose \(x \in X\) is given.'. This `addresses' the quantifier with an arbitrary \(x\), which should then be treated as fixed for the rest of the proof.
It is important to note that the order of quantifiers matters. Returning to an earlier example, if \(S\) is the set of students at Oxford, and \(C\) is the set of colleges, we can say
\(\displaystyle \forall s \in S, \ \exists c \in C \,\text{s.t.}\, s \in c,\)
which says that for every student there is a college of which they are a member; this is true.
If we changed the order of the quantifiers and wrote
\(\displaystyle \exists c \in C \,\text{s.t.}\, \forall s \in S, \ s \in c,\)
this says that there is one particular college of which every student at Oxford is a member; that is a completely different statement, and it is not true.
Importantly, we must read from left to right, and as new elements or statements are introduced they are allowed to depend on previously introduced elements but can't depend on things that are yet to be mentioned. So in the first of the statements above, the \(c\in C\) that exists according to the second quantifier can (and does) depend on the specific \(s\in S\) from the first quantifier. In the second statement, the specific \(c\) identified by the first quantifier must work for all \(s\) in the second one.
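The student/college example can be modelled on a computer, where `all` plays the role of \(\forall\) and `any` the role of \(\exists\). In this sketch the names and the membership table are invented purely for illustration:

```python
# A toy membership table: every student has a college, but no college
# contains every student. (All names here are made up.)
membership = {"alice": "balliol", "bob": "merton", "carol": "balliol"}
students = list(membership)
colleges = set(membership.values())

# forall s in S, exists c in C s.t. s in c  -- true
stmt1 = all(any(membership[s] == c for c in colleges) for s in students)
# exists c in C s.t. forall s in S, s in c  -- false: no single college works
stmt2 = any(all(membership[s] == c for s in students) for c in colleges)
print(stmt1, stmt2)
```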
Proposition 4.10 (Negation)
(i) The negation of a statement of the form `\(\forall x\in X, P(x)\)' is `\(\exists x\in X \,\text{s.t.}\, \lnot P(x)\)'.
(ii) The negation of a statement of the form `\(\exists x\in X \,\text{s.t.}\, P(x)\)' is `\(\forall x\in X, \lnot P(x)\)'.
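Over a finite set these negation rules correspond to a familiar fact about `all` and `any`; a minimal Python sketch with a sample predicate:

```python
X = range(10)

def P(x):
    # A sample predicate: 'x is even'. True for some but not all x in X.
    return x % 2 == 0

# (i) not (forall x, P(x))  <=>  exists x s.t. not P(x)
rule_i = (not all(P(x) for x in X)) == any(not P(x) for x in X)
# (ii) not (exists x s.t. P(x))  <=>  forall x, not P(x)
rule_ii = (not any(P(x) for x in X)) == all(not P(x) for x in X)
print(rule_i, rule_ii)
```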
We can therefore negate a complicated statement like `\(\forall s \in S, \exists c\in C \,\text{s.t.}\, s \in c\)' by working from the outside in. Each line in the following is equivalent.
\( \begin{align*} &\lnot\left(\forall s \in S, \ \exists c\in C \,\text{s.t.}\, s \in c\right)\\ &\exists s \in S \,\text{s.t.}\, \lnot\left(\exists c\in C \,\text{s.t.}\, s \in c\right)\\ &\exists s \in S \,\text{s.t.}\, \forall c\in C, \ \lnot\left(s \in c\right)\\ &\exists s \in S \,\text{s.t.}\, \forall c\in C, \ s\notin c \end{align*} \)
If we try to express these in words, we might have something roughly like
It's not true that every student has a college.
There's a student for whom it's not true that they have a college.
There's a student, and for every college, it's not the case that they're a member.
There's a student and for every college, they're not a member.
But one reason why logical notation is so helpful is that English itself is sometimes ambiguous. For example, the statement
`For all natural numbers \(x\), \(x<y\) for some natural number \(y\)'
could mean `for all \(x\in \mathbb{N}\), there exists \(y \in \mathbb{N}\) such that \(x<y\)', or `there exists \(y \in \mathbb{N}\) such that for all \(x\in\mathbb{N}\), \(x<y\)'. The English wording could justifiably be interpreted either way, but clearly only the first of these is true. In symbolic notation, we should write
\(\displaystyle \forall x\in \mathbb{N}, \ \exists y \in \mathbb{N} \,\text{s.t.}\, x < y,\)
and it is unambiguous that \(y\) is allowed to depend on \(x\).
To avoid confusion, it is a good idea to keep to the convention that the quantifiers come first, before any statement to which they relate. However, many authors (including this one!) don't stick rigidly to this when the final quantifier is a `for all', which is often moved to the end of the statement. For example, if \(f:\mathbb{R} \rightarrow \mathbb{R}\) is a bounded function, you may see something like
\(\displaystyle \exists M \in \mathbb{R} \,\text{s.t.}\, |f(x)|<M \ \forall{x}\in \mathbb{R}.\)
Example 4.11
As an example of how the notation introduced in this section can make a proof more concise, we'll prove the double inclusion principle again.
Proof of Proposition 1.18
We argue, via a sequence of equivalent statements, that \(A = B\) is the same as \((A\subseteq B \text{ and } B\subseteq A)\):
\( \begin{align*} A = B \quad & \Leftrightarrow \quad \forall x\in S \quad \left( x \in A \Leftrightarrow x \in B \right)\\ & \Leftrightarrow \quad \forall x\in S \quad \left( x \in A \Rightarrow x \in B \quad\text{and}\quad x\in B \Rightarrow x \in A \right) \\ & \Leftrightarrow \quad \forall x\in S \quad \left( x \in A \Rightarrow x \in B \right)\quad \text{and}\quad \forall x\in S \quad \left( x \in B \Rightarrow x \in A \right) \\ & \Leftrightarrow \quad A \subseteq B \quad \text{and}\quad B \subseteq A. \end{align*} \)
\(\square\)
Notice that bracketing is sometimes necessary in order to make clear what the statements are to which the quantifiers refer. Intelligent use of spacing on the page can also be helpful, especially when writing by hand.
To use \(\Leftrightarrow\) like this we need to make sure that the logic works both ways, both forwards and backwards. When constructing such a proof, it can be helpful to first work forwards, with each statement implying the next one, and then separately check whether the argument works backwards (i.e. first write it with the arrows as \(\Rightarrow\), before checking if each one can be converted to \(\Leftrightarrow\)).
It is to some extent a matter of personal taste how much to use symbolic notation rather than writing things out in words. Regardless of how much you use them in your own writing, it is important to understand and be fluent in interpreting these symbols in other people's writing.
In order to prove or use a theorem it is important to correctly understand its logical form. Most theorems are ultimately of the form `if \(P\) then \(Q\)', although the \(P\) and the \(Q\) may themselves be quite complicated statements that are combinations of other statements. In this context, the `\(P\)' is the hypothesis and the `\(Q\)' is the conclusion.
Example 4.12
Consider the statement
If \(n\) is a non-zero natural number, then \(n\) has a unique prime factorisation.
Here, the hypothesis is `\(n\) is a non-zero natural number', and the conclusion is `\(n\) has a unique prime factorisation'.
In this example, the theorem is explicitly stated in the form `if \(P\) then \(Q\)', so understanding the hypothesis and conclusions is very easy. Sometimes a theorem may be stated in a way that makes this less obvious.
Example 4.13
Consider the statement
Every prime number greater than \(2\) is odd.
Faced with such a statement, it may be helpful to think through carefully what the hypothesis and conclusion are, and to re-state it in a way that makes this more transparent.
Thus, another way of saying the same thing is:
Let \(p\) be a prime number greater than \(2\). Then \(p\) is odd.
The first sentence is the hypothesis, and the second is the conclusion.
Example 4.14
As a more involved example, here is a rather poor statement of the intermediate value theorem (IVT), which you will come across in Analysis.
Whenever \(f\) is a continuous function on \(\mathbb{R}\), \(a, b\) are real numbers such that \(a<b\), \(f(a)<0\) and \(f(b)>0\), \(f(c) = 0\) for some \(c \in (a,b)\).
Although all the correct ingredients of the theorem are here, it is not at all clear how to split the statement into hypothesis and conclusion. A better version is:
Let \(f\colon \mathbb{R} \rightarrow \mathbb{R}\) be a continuous function, and suppose \(a,b\in\mathbb{R}\) are such that \(a<b\), \(f(a)<0\), and \(f(b)>0\). Then there exists a real number \(c \in (a,b)\) such that \(f(c) = 0\).
Splitting the theorem into shorter sentences has helped to clearly separate the hypothesis (which is a combination of various sub-statements in this case) from the conclusion. As a general rule, using words like `Let' and `Suppose' is a good way of `setting up' the hypotheses, and `Then' is a good way of signalling that what follows is the conclusion.
The intermediate value theorem is stating the intuitively obvious result that if the graph of a continuous function is below the axis somewhere and above the axis somewhere else, then it must cross the axis somewhere in between.
4.3 Examples of proof
We have already seen various methods for proving and refuting mathematical statements. We now make some comments on the general classes of methods. To illustrate the discussion, we'll consider the following simple result about the arithmetic and geometric means of non-negative real numbers:
Theorem 4.15 (AM-GM Inequality)
Let \(x,y\) be non-negative real numbers. Then
\(\displaystyle \frac{x+y}{2} \geq \sqrt{xy},\)
with equality if and only if \(x = y\).
The left hand side of this inequality is called the arithmetic mean (AM) and the right hand side is called the geometric mean (GM).
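Before proving Theorem 4.15, a numerical spot-check can build some confidence in it. This Python sketch samples random non-negative pairs (the small tolerance guards against floating-point rounding):

```python
import math
import random

random.seed(0)  # reproducible sampling
for _ in range(1000):
    x = random.uniform(0, 100)
    y = random.uniform(0, 100)
    am = (x + y) / 2          # arithmetic mean
    gm = math.sqrt(x * y)     # geometric mean
    assert am >= gm - 1e-9    # Theorem 4.15, up to rounding

# Equality in the case x == y:
print(math.isclose((3.0 + 3.0) / 2, math.sqrt(3.0 * 3.0)))
```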
Direct proof
To prove a statement of the form `if \(P\) then \(Q\)' directly, we make use of \(P\) to arrive at \(Q\) through a sequence of logical reasoning. It may be that we can start from \(P\) and work directly to \(Q\), or it may be that we make use of \(P\) along the way (as in the example below). A direct proof of Theorem 4.15 could look like this:
Proof of Theorem 4.15.
Since \(x,y\in \mathbb{R}\), we know that \((x-y)^2 \ge 0\), with equality if and only if \(x = y\). Expanding the brackets, and then adding \(4xy\) yields
\( \begin{align*} x^2 - 2xy + y^2 & \ge 0,\\ x^2 + 2xy + y^2 & \ge 4xy,\\ \frac{1}{4} (x+y)^2 & \ge xy. \end{align*} \)
Taking the square root (noting that \(x,y \ge 0\)) gives the required result.
\(\square\)
In this case, the algebraic steps are straightforward, but the starting point and the `adding \(4xy\)' are perhaps not entirely obvious. You will see many direct proofs like this, and you must understand that the arguments are not necessarily presented in the order that the author thought of them. To illustrate this, let's see another proof where the ideas are presented in a different order.
Proof by contradiction
To prove a statement `if \(P\) then \(Q\)' by contradiction, we suppose that \(Q\) is not true and show through some logical reasoning (making use of the hypotheses \(P\)) that this leads to a contradiction or inconsistency. We may arrive at something that contradicts the hypotheses \(P\), or something that contradicts the initial supposition that \(Q\) is not true, or we may arrive at something that we know to be universally false.
A proof by contradiction of Theorem 4.15 could look like:
Proof of Theorem 4.15
Suppose for contradiction that \(\exists x,y \in \mathbb{R}\) with \(x,y\geq 0\) and \(\sqrt{xy} > \frac{1}{2} (x+y)\).
Then squaring both sides (noting that both sides are non-negative) and rearranging yields
\( \begin{align*} 4 xy & > x^2 + 2xy + y^2,\\ 0 & > x^2 - 2xy + y^2,\\ 0 & > (x-y)^2, \end{align*} \)
which is a contradiction, since the square of a real number is non-negative. Hence \(\sqrt{xy} \le \frac{1}{2} (x+y)\).
The same steps with \(>\) replaced by \(=\) show that equality holds if and only if \(x=y\).
\(\square\)
One of the useful aspects of proving things by contradiction is that by negating the statement \(Q\), we immediately give ourselves something extra to work with. So if we cannot see a way to make any progress by starting directly from \(P\), then supposing a contradiction may be a good thing to try instead. In the proof above, for example, we simply started with the negation of the result we were aiming for and manipulated the statement to find the contradiction.
Proof by induction
Induction is useful for proving results that can be indexed by the natural numbers. So it would not be useful as a method to prove Theorem 4.15, but it could be used to prove the generalisation:
Theorem 4.16 (AM-GM Inequality)
Let \(n\ge 2\). If \(x_1,x_2,\ldots,x_n\) are non-negative real numbers, then
\(\displaystyle \frac{x_1 + x_2 + \ldots + x_n}{n} \geq (x_1x_2\ldots x_n)^{1/n}.\)
We saw the Principle of Proof by Induction in Proposition 0.3 and we also saw two variants in Corollary 0.5 and Proposition 0.6 (strong induction). I'd like to use this proof to demonstrate another variant.
Proof. First consider the case where \(n=2^m\) for \(m\in \mathbb{N}\). We'll prove this by strong induction on \(m\). The case \(m=1\) is Theorem 4.15 which we've already proved (twice!).
Now suppose that the statement is true for \(n\) and consider the statement for \(2n\). We will rewrite the arithmetic mean so that we can use our inductive hypothesis on the first half of the numbers (\(x_i\) with \(1\leq i \leq n\)), and separately on the second half of the numbers (\(x_i\) with \((n+1)\leq i \leq 2n\)), writing
\(\displaystyle \frac{1}{2n}\sum_{i=1}^{2n} x_i = \frac{1}{2} \left(\frac{1}{n}\sum_{i=1}^{n} x_i+\frac{1}{n}\sum_{i=n+1}^{2n} x_i\right).\)
Then we have
\( \begin{align*} \frac{1}{2} \left(\frac{1}{n}\sum_{i=1}^{n} x_i+\frac{1}{n}\sum_{i=n+1}^{2n} x_i\right) & \geq \frac{1}{2} \left( \left(x_1x_2\dots x_{n}\right)^{1/n}+\left(x_{n+1}\dots x_{2n}\right)^{1/n} \right)\quad \text{by the inductive hypothesis}\\ & \geq \sqrt{\left(x_1x_2\dots x_{n}\right)^{1/n}\times\left(x_{n+1}\dots x_{2n}\right)^{1/n} }\quad \text{using the case }m=1\\ & =\left(x_1x_2\dots x_{n}x_{n+1}\dots x_{2n}\right)^{1/(2n)}. \end{align*} \)
This completes the proof for powers of 2. For all other numbers, we proceed by something resembling induction, but in the opposite direction.
Suppose the statement is true for \(n\). We wish to prove the statement for \(n-1\).
Given \(n-1\) numbers with \(n\geq 3\), let \(x_n=\frac{1}{n-1}\left(x_1+x_2+\dots +x_{n-1}\right)\) i.e. the arithmetic mean of the others. Let's call this number \(M\). Now since the statement is true for \(n\), we have
\(\displaystyle \frac{x_1+x_2+\dots +x_{n-1}+M}{n}\geq \left(x_1 x_2 \dots x_{n-1} M\right)^{1/n}\)
The left-hand side simplifies to \(M\) and we have
\(\displaystyle M\geq \left(x_1 x_2 \dots x_{n-1}M\right)^{1/n}.\)
Dividing both sides by \(M^{1/n}\) (if \(M = 0\) then every \(x_i\) is zero and the result is immediate), we have
\(\displaystyle M^{(n-1)/n}\geq \left(x_1 x_2 \dots x_{n-1}\right)^{1/n},\)
and taking each side to the power of \(n/(n-1)\) gives
\(\displaystyle M\geq \left(x_1 x_2 \dots x_{n-1}\right)^{1/(n-1)}.\)
This completes the proof; the statement is true for \(n-1\).
Since every number is either a power of 2, or can be reached by starting from a power of 2 and repeatedly subtracting one, the statement is true for all \(n\geq 2\).
\(\square\)
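As with the two-variable case, Theorem 4.16 can be spot-checked numerically; here is a Python sketch over random tuples of various lengths:

```python
import math
import random

random.seed(1)  # reproducible sampling
for _ in range(500):
    n = random.randint(2, 8)
    xs = [random.uniform(0, 10) for _ in range(n)]
    am = sum(xs) / n                 # arithmetic mean
    gm = math.prod(xs) ** (1 / n)    # geometric mean
    assert am >= gm - 1e-9           # Theorem 4.16, up to rounding
print("Theorem 4.16 held on all sampled tuples")
```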
Counterexamples
Providing a counterexample is the best method for refuting, or disproving, a conjecture. In seeking counterexamples, it is a good idea to keep the cases you consider simple, rather than searching randomly. It is often helpful to consider `extreme' cases, in which something is zero, a set is empty, or a function is constant, for example. If you are relaxing one of the hypotheses of a theorem and contemplating whether the conclusion still holds, make sure to consider cases that contravene the relaxed hypothesis (else you already know that they won't provide the desired counterexample!).
For example, suppose it were claimed that the requirement for \(x\) and \(y\) to be non-negative could be removed from Theorem 4.15 if we simply put a modulus sign inside the square root:
Claim. Let \(x,y\) be real numbers. Then \(\frac{1}{2} (x+y) \ge \sqrt{|xy|}\), with equality if and only if \(x = y\).
Refutation. This claim is not true. A counterexample is \(x = 1\), \( y = -1\).
There is no need to expand with additional arguments about why the counterexample exists; providing a single counterexample is sufficient to disprove the claim.
4.4 Examples of problem-solving
In this section we discuss some example problems to illustrate aspects of problem-solving.
The following example explores how the images and preimages of set intersections behave. It is presented as a simple true/false question, to which it should be understood that we need to provide reasoning for our answer. So we first need to decide whether we think they are true or false (some experimenting and thought may be required); then if true, we should prove it, and if false, we should provide a counterexample. A counterexample may well have been found anyway as part of our initial investigation to decide whether it is true or false, or conversely we may have gained insight into how a proof could work.
Example 4.17 (Images and pre-images)
Let \(f\colon X \rightarrow Y\) be a function and let \(A,B \subseteq X\) and \(C,D\subseteq Y\). Are the following statements true or false?
(i) \(f(A\cap B) = f(A) \cap f(B)\),
(ii) \(f^{-1}(C\cap D) = f^{-1}(C) \cap f^{-1}(D)\).
Solution and commentary.
For both of these it might be reasonable to consider some extreme cases in case we stumble across an immediate counterexample. In this situation, such an extreme case might be if \(A\) and \(B\), or \(C\) and \(D\), are disjoint. In that case the sets on the left hand sides are both the empty set (the image and the pre-image of the empty set are the empty set). In the first case this immediately suggests the possibility of a counterexample, since \(f(A)\) and \(f(B)\) could easily intersect. If the function were constant, for example (that is, if the function assigns the same output to every input), then \(f(A) = f(B)\), so provided \(A\) and \(B\) are not themselves empty their intersection will not be the empty set.
So we claim (i) is false. A counterexample is \(f\colon \mathbb{R} \rightarrow \mathbb{R}\) defined by \(f(x) = 3\) for all \(x \in \mathbb{R}\). If \(A = \{0\}\) and \(B = \{1\}\), then \(f(A\cap B) = \varnothing\), but \(f(A)\cap f(B) = \{3\}\).
We might reflect on whether anything can be salvaged (this is not part of the question, but is good practice). In particular, based on our experience finding the counterexample, we might suspect that \(f(A\cap B)\subseteq f(A)\cap f(B)\). This is in fact true, and you might like to prove it.
For (ii), the same thinking does not yield a counterexample, since if \(C\) and \(D\) are disjoint then their pre-images are also disjoint. So we claim that (ii) is true and aim to prove it:
Proof. If \(x \in\) LHS, then \(f(x) \in C\cap D\), so \(f(x)\in C\) and \(f(x)\in D\). Thus \(x \in f^{-1}(C)\) and \(x\in f^{-1}(D)\), and therefore \(x \in\) RHS. Conversely, suppose \(x \in \) RHS. Then \(x \in f^{-1}(C)\) and \(x\in f^{-1}(D)\), so \(f(x)\in C\) and \(f(x)\in D\), and therefore \(x \in\) LHS. So each side is a subset of the other, and the sets are therefore equal.
\(\square\)
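The counterexample for (i) and the conclusion of (ii) can both be illustrated with finite sets; here is a Python sketch modelling the constant function from the solution (the helpers `image` and `preimage` are my own names):

```python
X = {0, 1, 2}
f = {x: 3 for x in X}  # the constant function f(x) = 3 on a small domain

def image(S):
    return {f[x] for x in S}

def preimage(T):
    return {x for x in X if f[x] in T}

A, B = {0}, {1}
# (i) fails: the left side is empty but the right side is {3}.
print(image(A & B), image(A) & image(B))

C, D = {3}, {4}
# (ii) holds: both sides are empty here (and equal in general).
print(preimage(C & D), preimage(C) & preimage(D))
```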
The next example relates to ideas that are covered in the Groups and Group Action course.
This example is phrased in a way that is similar to a problem-sheet or exam-style question, with related sub-parts. The way the two questions in (ii) are phrased strongly suggests that there must be a difference between the case when \(n\) is prime and when \(n\) is arbitrary. A `hint' is given at the end, and we should think about how we could relate that to the question that is asked.
Example 4.18 (Modular arithmetic)
Let \(n\ge 2\) be an integer and let \(\mathbb{Z}_n\) be the set of equivalence classes \(\{ \overline{0}, \overline{1}, \ldots, \overline{n-1} \}\) defined by congruence modulo \(n\) on \(\mathbb{Z}\).
(i) Show that the operation \(\otimes\) on \(\mathbb{Z}_n\) defined by
\(\displaystyle \overline{x} \otimes \overline{y} = \overline{ x \times y },\)
is well-defined, where \(x \times y\) denotes standard multiplication on \(\mathbb{Z}\).
(ii) If \(\overline{x} \neq \overline{0}\), a multiplicative inverse \(\overline{y}\) has the property that \(\overline{x} \otimes \overline{y} = \overline{1}\). Is there a multiplicative inverse for every \(\overline{x} \neq \overline{0}\)? What if \(n\) is prime?
[You may assume Bézout's lemma, which says that if integers \(a\) and \(b\) are coprime, there exist integers \(k\) and \(l\) such that \(a\times k + b\times l = 1\).]
Solution and commentary
Firstly, it is helpful to recall the precise definition of the equivalence relation `congruence modulo \(n\)'; that is \(x \sim y \Leftrightarrow y-x\) is a multiple of \(n\).
For (i), we should consider why the given definition might not be well-defined. The concern must be that the same equivalence class can be represented in terms of different \(x\) and \(y\) (e.g. \(\overline{0} = \overline{n}\), etc.). Since the given definition depends on \(x\) and \(y\) themselves, it might give a different answer if we represent the elements \(\overline{x}\) and \(\overline{y}\) using different values of \(x\) and \(y\). So, supposing \(\overline{x_1} = \overline{x_2}\) and \(\overline{y_1} = \overline{y_2}\), we need to show that \(\overline{x_1}\otimes \overline{y_1} = \overline{x_2}\otimes\overline{y_2}\). By definition of the equivalence classes \(x_2-x_1 = kn\) and \(y_2-y_1 = ln\) for some integers \(k\) and \(l\). So
\(\displaystyle x_2 \times y_2 = (x_1+kn)\times (y_1+ln) = x_1 \times y_1 + (x_1 l+y_1k+kln) n,\)
(we have omitted some of the \(\times\) symbols to save space).
Since the final term is a multiple of \(n\) this means that \(\overline{x_2 \times y_2} = \overline{x_1 \times y_1}\), so indeed we have \(\overline{x_1}\otimes \overline{y_1} = \overline{x_2}\otimes\overline{y_2}\). Hence, \(\otimes\) is well-defined.
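To complement the proof, well-definedness can also be sanity-checked computationally. The sketch below (an illustration, not part of the proof, with the helper name `otimes` chosen for this example) represents each class by a reduced residue and confirms that shifting the representatives by multiples of \(n\) never changes the product's class.

```python
# Empirical check (not a proof) that the operation ⊗ is well-defined:
# the class of x*y should not depend on which representatives we pick.
def otimes(x, y, n):
    """Multiply two representatives and reduce modulo n."""
    return (x * y) % n

n = 6
for x in range(n):
    for y in range(n):
        expected = otimes(x, y, n)
        # shift each representative by various multiples of n
        for k in range(-2, 3):
            for l in range(-2, 3):
                assert otimes(x + k * n, y + l * n, n) == expected
print("otimes is representative-independent for n =", n)
```

Of course, such a check only covers finitely many cases; the algebraic argument above is what establishes the result for all representatives and all \(n\).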
For (ii), the question seems to suggest that things may be different depending on whether \(n\) is prime or not. So consider first a case where \(n\) is not prime and experiment a little to see if there are any simple counterexamples. If \(n = 4\), there are only four equivalence classes, \(\overline{0}\), \(\overline{1}\), \(\overline{2}\), and \(\overline{3}\). We observe that \(\overline{1}\) and \(\overline{3}\) are their own inverses, but \(\overline{2}\) does not have one: \(\overline{2}\otimes\overline{y} = \overline{2\times y}\) is always the class of an even number, and no even number is congruent to \(1\) modulo \(4\). So we have found a counterexample: \(n=4\) and \(\overline{x} = \overline{2}\). So the answer is No, there is not necessarily a multiplicative inverse for every \(\overline{x}\neq \overline{0}\).
If \(n\) is prime, we might try to use the hint. If \(n\) is prime then for any \(0< x<n\), \(x\) and \(n\) will be coprime, so the lemma implies that there are integers \(k\) and \(l\) such that \(x\times k + n\times l = 1\). But this means that \(\overline{x\times k} = \overline{1}\) so, following the definition, \(\overline{x} \otimes \overline{k} = \overline{1}\). So this \(\overline{k}\), which we know exists according to the lemma, is the multiplicative inverse of \(\overline{x}\). So if \(n\) is prime, the answer becomes Yes, every \(\overline{x}\neq \overline{0}\) does have a multiplicative inverse.
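The argument via Bezout's lemma is also constructive: the extended Euclidean algorithm actually produces the integers \(k\) and \(l\). A small sketch (function names are our own, chosen for this illustration):

```python
# Sketch: computing multiplicative inverses in Z_n via the extended
# Euclidean algorithm, which yields the k and l of Bezout's lemma.
def extended_gcd(a, b):
    """Return (g, k, l) with g = gcd(a, b) and a*k + b*l == g."""
    if b == 0:
        return a, 1, 0
    g, k, l = extended_gcd(b, a % b)
    return g, l, k - (a // b) * l

def inverse_mod(x, n):
    """Return y with (x * y) % n == 1, or None if no inverse exists."""
    g, k, _ = extended_gcd(x, n)
    if g != 1:
        return None  # x and n share a common factor, so no inverse
    return k % n

print(inverse_mod(2, 4))  # None: 2 has no inverse modulo 4
print(inverse_mod(3, 7))  # 5, since 3 * 5 = 15 is congruent to 1 mod 7
```

Running `inverse_mod(x, n)` for every non-zero \(x\) confirms the dichotomy: inverses always exist when \(n\) is prime, but fail for any \(x\) sharing a factor with a composite \(n\).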
This example has shown that if \(p\) is prime, then every non-zero element of \(\mathbb{Z}_p\) has a multiplicative inverse. This goes part way to showing that \(\mathbb{Z}_p\) is a finite field (meaning that it behaves in many respects like \(\mathbb{R}\)).
The final example below relates to things you will see in the Analysis II course.
We are given a definition built from a complicated-looking string of quantifiers, which describes rigorously what it means for a function to tend to zero at infinity.
We are being asked to apply this definition to two particular functions.
Example 4.19 (Limits)
A continuous function \(f\colon \mathbb{R} \rightarrow \mathbb{R}\) tends to zero as \(x \to \infty\) if
\(\displaystyle \forall \varepsilon > 0, \ \exists X \in \mathbb{R} \,\text{s.t.}\, \forall x \in \mathbb{R}, \text{ if } x>X\text{ then } |f(x)| < \varepsilon.\)
For each of the following functions, prove or disprove that it tends to zero as \(x \to \infty\):
(i) \(f(x) = e^{-x}\);
(ii) \(f(x) = \cos x\).
Solution and commentary
Hopefully we have some immediate intuition (from the shape of their graphs, for example) that the first function does tend to zero and the second one doesn't.
For (i), we aim to show that the definition holds. Since the statement starts with \(\forall \varepsilon\), we should start our proof by letting an arbitrary \(\varepsilon>0\) be given. Then we need to show that there exists an \(X\) such that for all \(x>X\), \(|f(x)|<\varepsilon\). Since the function \(f(x) = e^{-x}\) is decreasing, and is always positive, this can be achieved by taking \(X = -\ln \varepsilon\). Then for \(x>X\), \(|f(x)| = e^{-x}< e^{-X} = \varepsilon\). So we have shown that the definition holds, and \(f(x) = e^{-x}\) does tend to zero as \(x \to \infty\).
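The choice \(X = -\ln\varepsilon\) can be illustrated numerically. The snippet below (a finite spot-check of sampled points, not a substitute for the proof) confirms that \(e^{-x} < \varepsilon\) at several points beyond the chosen \(X\):

```python
import math

# Numerical illustration (not a proof): for each epsilon, the witness
# X = -ln(epsilon) satisfies e^{-x} < epsilon for sampled x > X.
for eps in (1.0, 0.1, 1e-6):
    X = -math.log(eps)
    assert all(math.exp(-(X + t)) < eps for t in (0.01, 1.0, 10.0))
print("sampled points beyond X all satisfy |f(x)| < epsilon")
```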
For (ii), we aim to show that the definition does not hold, so we need to prove its negation. Following the rules for negating quantifiers, the negation is
\(\displaystyle \exists \varepsilon > 0 \,\text{s.t.}\, \forall X \in \mathbb{R}, \ \exists x \in \mathbb{R} \,\text{s.t.}\, x>X \text{ and } |f(x)|\ge \varepsilon.\)
(The original statement here is of the form `\(\forall\ \exists\ \forall\ P(x)\)', where \(P\) is itself of the form \(Q \Rightarrow R\), in which \(Q\) is `\(x > X\)' and \(R\) is `\(|f(x)|<\varepsilon\)'. So the negated statement is of the form `\(\exists\ \forall\ \exists\ \text{not}\,P(x)\)', and \(\text{not}\,P(x)\) has been expressed as `\(Q\) and \(\text{not}\,R\)'.)
To see that this negated statement is true, we can observe that if \(\varepsilon = \frac{1}{2}\) then for any \(X\in \mathbb{R}\) there is a multiple of \(2\pi\), say \(2\pi n\), that is larger than \(X\), and for which \(\cos 2\pi n = 1 \ge \varepsilon\). Hence \(f(x) = \cos x\) does not tend to zero as \(x \to \infty\).
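The witnesses in this argument can also be exhibited concretely. For a few sample values of \(X\), the sketch below (an illustration of the negated statement at sampled points, not a proof) finds a multiple of \(2\pi\) beyond \(X\) at which \(|\cos x| \ge \tfrac{1}{2}\):

```python
import math

# Illustration: with epsilon = 1/2, past every sampled X there is a
# multiple of 2*pi at which |cos x| >= epsilon.
eps = 0.5
for X in (0.0, 100.0, 1e6):
    n = math.floor(X / (2 * math.pi)) + 1   # first multiple of 2*pi past X
    x = 2 * math.pi * n
    assert x > X and abs(math.cos(x)) >= eps
print("cos fails the definition at a point beyond every sampled X")
```

(Floating-point evaluation gives \(\cos(2\pi n)\) only approximately equal to \(1\), but comfortably above \(\tfrac{1}{2}\).)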
4.5 General advice
When seeking to prove or disprove a result, the following suggestions may be helpful:
- Make sure you are clear about the hypotheses and conclusions.
- `Unpack' any definitions and re-state what exactly it is you know and what it is that you need to show (either in your head or, if it's helpful, write it down).
- If you need to show something `for all \(\varepsilon > 0\)', start with `Let \(\varepsilon>0\) be given.'
- If you can't see a way to start, consider `seeking a contradiction' and suppose the result is not true to give yourself more to work with.
- If you need to show uniqueness, suppose there are two of whatever it is, and try to show that they are equal.
- Look for extreme/simple cases as counterexamples.
- Don't be afraid to experiment, but have in mind what you're aiming for. If you're not making progress, try a different approach.
- Use sketches and diagrams to help gain intuition.
- If you get stuck, take a break. Look at it again with fresh eyes.
- Re-read your final proof. Be critical, and check that you are convinced by what you've written. (This is probably the most important of all these suggestions!)
Last modified: Wednesday, 1 October 2025, 1:37 PM