We start by outlining how mathematics is built and introducing the mathematical language, in particular we show some conventions that need to be understood. Then we look at proofs.

Mathematics is a study of artificial worlds of thought that have one important feature: Everything is black or white there, therefore formal logic can be applied; in particular, the results one gets are totally reliable, since their truthfulness is proved. To see how one gets there we have to start at the beginning. Mathematics studies objects, but not real ones (stones, frogs, stars) but rather ideal objects, or better yet, categories of ideal objects. One such category may be real numbers, another category can be even numbers, another one may be real functions etc.

If we want to have reliable answers, we need these categories to be clear, without any grays. For instance, when we take some number, then it must be clear whether it is an even number or not. In other words, there must be some test that gives definite answers. In this case the test is simple: you try to express the given number as a double of an integer. If it can be done, you've got yourself an even number, otherwise not. The great thing about it is that everybody, anywhere in the world, can make the same test and arrive at exactly the same answer.
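The evenness test just described is so mechanical that a computer can run it. Here is a small Python sketch added for illustration (the function name `is_even` is ours, not part of the text):

```python
def is_even(n: int) -> bool:
    """Test from the definition: n is even if n = 2*k for some integer k."""
    # For an integer n, such a k exists exactly when division by 2
    # leaves no remainder.
    return n % 2 == 0
```

Everybody who runs this test on the same number gets the same answer, which is exactly the clear-cut, black-or-white situation described above.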

Hopefully you appreciate by now how crucial such a clear-cut situation is. This is a big difference compared to art (no test whether something is beautiful or not), psychology (no definite test whether somebody is normal or not), philosophy etc. This lack of precision is what makes reliable results impossible in these fields.

Back to mathematics. Obviously before we start investigating something, we
have to first make clear what we actually talk about, that is, define
precisely what objects we investigate. Thus mathematical texts traditionally
start with definitions. A **definition** is just a piece of text where we
introduce some new category (a name for an object or its property) and a
test that allows us to decide whether an arbitrary object we pick does
belong to this new category or not.

Writing such a definition (or any other mathematical statement) using purely logical symbols would be possible, but also very hard to decipher. Therefore you rarely find those beautiful logical connectives in mathematical books and papers; they are hard on the eyes, and mathematicians prefer to see human words when possible. Thus mathematics developed a language of its own which makes logical statements more similar to the way normal humans talk, while preserving precision of thought. Along the way it developed some quirks that one should know about in order to understand things properly.

For example, a definition of an even number may look like this.

Let *n* be an integer. We say that it is **even** if there is an integer *k* such that *n* = 2⋅*k*.

The first sentence is a preamble, an introduction that focuses our attention
on a specific type of object that is already known (defined). In the second
sentence we single out a special kind of integers (even ones) using a
condition that decides whether an integer has this new property or not. Note
that the preamble is not really necessary and we may also use a different
category there, for instance we may let *n* belong to the set of real
numbers or even try for something more general. Allowing more objects into
play does not spoil anything, since numbers other than integers have no
chance of passing the test of being even anyway. But it might make it harder
for the reader, so we usually use the first sentence to focus our attention
on objects where the new notion makes sense.

In general, a definition typically has the following form.

We say that an object *x* satisfies property *P* (or we call this
object *P*) if it satisfies the following condition *C*.

The idea is simple. When you are given an object, you apply the specified
test and depending on the outcome, this particular object either is or is
not called *P*. Note that there is a two-way relationship, which means
logical equivalence. However, the definition above is written in the form
of implication! This is because of tradition; for some reason
mathematicians decided a long time ago to write definitions as implications
and it stuck.
Although it is not technically correct, all mathematicians and their dogs
know that in fact definitions are equivalences, so they feel no need to
change this tradition, after all, if something has been happening for over a
hundred years in most languages, it is really hard to change. You can think
of it as a trade secret, definitions are equivalences although they are
not written as such.

This is essentially the only open breach between mathematical language and
formal logic. Now we pass on. Having defined some types of objects, we start
investigating them and discover some facts about them, that is, some
**statements** that are always true in the world of mathematics,
we often call them theorems. Again, we prefer to use a more
human language when stating them.

To show this, we will consider one typical mathematical statement (a theorem in the form of an implication) and look at various ways in which mathematicians can express it. Since there are no strict rules, the opinion on what is proper varies according to how picky a particular mathematician is. Some are really into formal correctness (in my misspent youth I even went as far as really writing definitions as equivalences, but I came to my senses since), some are more interested in how well things read. A typical theorem in a book may look like this.

Theorem.

Let *f* be a function defined on some open interval *I*. If *f* is differentiable on *I*, then it is also continuous on *I*.

Again we start with a preamble; it states what objects we will talk about in this theorem. Again, while this form is used perhaps most often, it is not completely correct from a logical point of view as written. Namely, the word "arbitrary" is missing at two crucial places. We want this theorem to work for all functions on all intervals. A better version may therefore be the following one.

Theorem.

Let *f* be an arbitrary function defined on an arbitrary open interval *I*. If *f* is differentiable on *I*, then it is also continuous on *I*.

However, very few mathematicians are picky enough to actually write it like this. Thus we have another convention to remember: The word "let" in preambles has "for every" hidden in it, unless specified otherwise (say, a theorem may also start "Let there exist something", then there is an existential quantifier there). The correct translation of our theorem into logical language would therefore be

Theorem.

For every open interval *I* and for every function *f* defined on *I* the following is true: If *f* is differentiable on *I*, then it is also continuous on *I*.

Now it is written exactly in the proper logical way, so one can even rewrite it using quantifiers:

Theorem.

∀ *a* < *b* ∈ ℝ^{*} ∀ *f* a function on *I* = (*a*, *b*): (*f* is differentiable on *I* => *f* is continuous on *I*).

With a bit of practice one can readily translate statements as written by mathematicians into proper logical expressions with quantifiers. Mostly it is obvious.

We started with the usual mathematical language and moved towards a more formal shape. Sometimes we move in the opposite way, especially when we just talk about mathematics. A very concise formulation is this:

Theorem.

Differentiable functions are also continuous.

This is a borderline case, used mostly informally, and quite a few mathematicians would raise an eyebrow in a significant way if they saw this in print. Above all, this statement does not say what functions are meant, which is - from the formal point of view - quite a serious breach of good manners (and most mathematicians are rather picky about such things). However, an experienced mathematician can readily fill in the missing parts (preamble) and appreciates how this short version captures the essence, which is useful for instance in lectures or when you just want to refer to this well-known fact.

One thing that mathematicians are proud of is that every statement is
100% reliable, since everything is proved (well, everything apart from
axioms, see below). Given that most statements are implications, we will
focus here mostly (but not only) on various ways of proving implications. So
consider one particular implication
*p* => *q*.

Every implication is satisfied when its assumption is false, so this case is irrelevant when deciding whether a particular implication is true or not. The key situation is when the assumption is true. Then we have to ask whether also the conclusion is true. If it is always so, then the implication is satisfied in all cases. If there is a case when the assumption is true but the conclusion is not, then the implication fails. In other words, one has to show that the case "assumption true—conclusion false" can never happen.
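The rule "an implication fails only when the assumption is true and the conclusion is false" can be tabulated in a few lines of Python (a sketch we add here for illustration):

```python
def implies(p: bool, q: bool) -> bool:
    # p => q is false only in the single case p true, q false;
    # logically it is equivalent to (not p) or q.
    return (not p) or q

# Build the full truth table over the four possible cases.
table = {(p, q): implies(p, q) for p in (False, True) for q in (False, True)}
```

Only the entry for (True, False) comes out false; in particular, a false assumption makes the implication true no matter what the conclusion is.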

This brings us to one interesting point. When facing a statement, there are
two things we may attempt: Prove it, or prove that it is false, which we
call "disproving" that statement. Thus to disprove an implication,
it is enough to show just one instance when its assumption is true but the
conclusion is not. A typical example: The implication "If a number is a
prime, then it is odd" is false, which we prove by exhibiting a
**counterexample**, namely number 2. On the other hand, the implication
"If a number larger than 2 is a prime, then it is odd" is true. One
way to prove it would be to go through all primes larger than 2, but there
are infinitely many of them and most of us do not have that much time on our
hands. Thus one has to apply different methods, methods that can handle
infinitely many numbers at the same time. This situation is fairly typical.
Most mathematical statements start with a general quantifier, which in most
cases means that we are supposed to show that something works in infinitely
many cases. However, to disprove a general quantifier, it is enough to find
just one counterexample.
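Disproving a general statement by a counterexample is something a computer can help with: we just search. The following Python sketch (with a simple trial-division helper `is_prime` that we add for illustration) finds the counterexample from the text:

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; fine for small numbers."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

# Search for a counterexample to "if a number is a prime, then it is odd":
# that is, a prime that is not odd. The search stops at 2, the only even prime.
counterexample = next(n for n in range(2, 100) if is_prime(n) and n % 2 == 0)
```

One such number is enough to kill the implication; on the other hand, no search, however long, could ever prove the corrected statement about primes larger than 2, since there are infinitely many of them.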

This duality is in a sense universal. Some mathematical statements to be proved involve the existential quantifier, for instance, "there is a solution to such and such equation." Then the roles are reversed. To prove such a statement, it is enough to exhibit just one solution that works. On the other hand, to disprove such a statement, one has to prove that no conceivable candidate works, which means showing something about infinitely many objects.

But now let us return to the problem of proving an implication.

A direct proof works as follows. Since we are only interested in what happens
when the assumption is true, we simply assume that it is so and then we use
this assumption when building some argument showing that the conclusion is
also true. Since the step from the assumption to the conclusion is usually
far from simple (easy things are not called theorems), we often try to break
it into very small steps that are obvious. That is,
instead of trying to somehow justify the leap
*p* => *q*,

we prove a chain of small implications

*p* => *p*_{1} => *p*_{2} => ... => *p*_{n} => *q*.

Each of those little implications must be either something so simple that its validity is obvious, or something that was already proved before; one can also expect that the assumption *p* should be used in justifying one of those little steps.

Many proofs fall into the direct proof box; however, such a direct path is
not always possible. Sometimes the proof goes
through more twisted paths that may even fork and merge, but as long as we
start with *p* and after a while end up with *q*, it is a direct
proof.

**Example:** Prove the following statement:

Let *x* be a real number. If it is positive, then also *x*⋅(*x* + 1) is positive.

We can use the definition of a positive number to rewrite this into a logical statement:

For every real number *x*: [if *x* > 0, then also *x*⋅(*x* + 1) > 0].

In order to prove this statement, one has to show validity of that implication for all real numbers; it is definitely not enough to just choose one concrete number (say, 13) and show that the implication works for it. We have to prove that implication for all real numbers, which is best done by choosing an arbitrary real number *x*.

For this particular real number *x* (whichever it is) we need to show
that the stated implication is true. We will try the direct proof, that is,
we will take the assumption for granted and see what can be deduced from it.
Thus we now have a number *x* about which we know that it is real and
also that it is positive. As the first intermediate step we will show that
then also *x* + 1 > 0. Indeed, from *x* > 0 we get, by adding 1 to both
sides, that *x* + 1 > 1. Since *x* + 1 > 1 > 0, we conclude that
*x* + 1 > 0.

Now we will complete the proof by doing a second step. We know that every
inequality can be multiplied by a positive number and still stay true. Thus
we can multiply *x* + 1 > 0 by *x* (we assume
and therefore take as true that it is positive) and obtain

*x*⋅(*x* + 1) > 0,

which is exactly the conclusion we wanted. The proof is complete.
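A numeric spot-check is not a proof (it covers only finitely many of the infinitely many cases), but it is a useful sanity check of the statement we just proved. A Python illustration, with sample values of our choosing:

```python
# Check x > 0  =>  x*(x + 1) > 0 on a few sample positive reals.
# This is evidence, not a proof; the argument above covers all cases.
samples = [0.001, 0.5, 1.0, 13.0, 1e6]
checks = [x * (x + 1) > 0 for x in samples]
```

Every entry of `checks` comes out true, as the proof guarantees.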

A somewhat more involved, but also more educational, direct proof is here. We suggest that you look at it.

An indirect proof works as follows. Instead of proving the desired implication, we prove its contrapositive. This makes sense, since we know that an implication and its contrapositive have exactly the same validity, so whatever we prove about the contrapositive (true, false) must also apply to the original implication. Since this contrapositive as a statement is also an implication, we can prove it for instance directly as explained above.

**Example:** We will prove that every prime larger than 2 is odd. One way
to express it formally is this:

∀ *n* ∈ {*x* ∈ ℕ; *x* > 2}: [if *n* is a prime, then *n* is odd].

This statement starts with a general quantifier again, so we start by taking
an arbitrary natural number *n* that is more than 2. This means that
when we start our proof, we will be able to use everything that is known
about natural numbers and also the fact that *n* > 2. We need to show that
for this *n* the following implication is satisfied:

*n* is a prime => *n* is odd.

However, a direct proof is somewhat awkward here. We should start by
assuming that *n* is a prime, but the definition of being a prime is
not very convenient (it says that *n* cannot be written in some way,
which is a negative statement) and it is not clear how one can go on from
there. We therefore decide to prove the contrapositive instead:

*n* is not odd => *n* is not a prime.

That is,

*n* is even => *n* is not a prime.

Proving this implication seems much easier, since it starts with a definite
piece of information: *n* is even. Thus we now assume that we have a natural
number *n* greater than 2 and moreover, it is even. This means that
there is an integer *k* such that *n* = 2*k*. What do we know about this
*k*? Since *n* > 2, we must have *k* > 1. Thus we have expressed *n*
as a product of two natural numbers different from 1, which means that
*n* is not a prime and the contrapositive is proved. Consequently, also
the original implication is true.
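The contrapositive we just proved, "*n* even => *n* not a prime" for *n* > 2, can again be spot-checked on a finite range. A Python sketch with a simple trial-division helper of our own; it merely illustrates the claim, while the proof above is what establishes it:

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; fine for small numbers."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

# Every even number greater than 2 should fail the primality test,
# since it factors as 2*k with k > 1.
all_composite = all(not is_prime(n) for n in range(4, 1000, 2))
```

The check passes on the whole range, as it must.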

This example shows why it may be useful to pass to a contrapositive: The properties used in the original implication may become much nicer when negated. A typical example would be properties that involve the relation "not equal to"; the negation changes it into equality, a much better relation from a practical point of view. This is the case of the definition of a 1-1 function. Another good reason for trying an indirect proof may be that negation changes quantifiers from general to existential and vice versa, which may also sometimes help. Finally, sometimes we simply prefer to start working at the "other end" of an implication.

**Remark:** The statement about primes can be formally expressed in other
ways, for instance like this:

∀ *n* ∈ ℕ: [if (*n* is a prime and *n* > 2), then *n* is odd].

This is an equally good way to express our fact, but when passing to the contrapositive we would have to do a negation of the conjunction on the left, which seems a bit more work than what we had in our proof above. This is in a sense typical. Often there are several ways to express a certain idea formally, and depending on which way we choose, the proof may range from relatively simple to quite long.

Consider some statement *r* (not necessarily an implication). One
possibility to prove its validity is to proceed by contradiction. We assume
that *r* fails and then show that this is somehow in conflict with
known facts. This then shows that *r* must be true. It is worthwhile to
look at this reasoning closer. Formally, a proof by contradiction means that
we will prove validity of the following implication:

¬ *r* => *F*.

What does it give us? The conclusion of this implication is a false
statement and the only way such an implication can be true is if also the
assumption is false. But ¬ *r* false means that *r* is true, exactly what we
wanted.

Proof by contradiction is useful when proving that something cannot happen. This is a "negative" information, which is generally hard to prove. A proof by contradiction starts with negation, that is, with assumption that something can happen, which gives us something definite to start from.

Now we apply this to the case of an implication. We need to negate it, so
a proof by contradiction of an implication
*p* => *q*
amounts to proving

(*p* and not *q*) => *F*.

Thus we assume that *p* is true while *q* is not and then show
that these two assumptions together lead to some nonsense.

**Example:** We will try to prove the above statement about primes again,
this time by contradiction. So take an arbitrary natural number *n*
larger than 2. We need to show that if it is a prime, then it is odd.

Since we want to go by contradiction, we will assume that this number is a
prime but it is not odd. But then this number must be even, so we can write
it as *n* = 2*k* for some integer *k*. Since we assume that *n* is a prime,
we must have *k* = 1, that is, *n* = 2. But we assumed *n* > 2, a
contradiction, so the statement is proved.

In practice we rarely go all the way to some explicit false statement; we
usually stop a bit earlier, the moment we arrive at a conflicting
situation. In the above proof we would go like this:
"...and therefore *n* = 2, which contradicts the assumption that *n* > 2."

**Example:** We will prove that there is no smallest positive rational
number, which in particular means that rational numbers cannot be arranged
according to size. We will express it precisely.

The set of positive rational numbers does not have its smallest element (minimum).

Negative statements are often close to impossible to prove directly, so
negation — that is, proof by contradiction — is an obvious start.
So assume that the statement above is false, that is, assume that the set of
positive rational numbers has its least element, its minimum
(see Topology of real
numbers in Functions - Theory - Real functions), call it *r*.
Since it is a rational number, then also the number *r*/2 is a rational
number, and it is clearly positive. But *r*/2 < *r*, which contradicts the
assumption that *r* is a minimum of that set. The proof is complete.

Note that we did not really arrive at a statement that is false, but stopped
at the first conflicting situation, just like we advertised above. If
somebody really wanted to see this done all the way, here it comes: Since
*r*/2 belongs to the set of positive rational numbers and *r* is a
minimal element of this set, we must have *r*/2 ≥ *r*. Dividing this
inequality by *r* (a positive number, so dividing by it does not change
the inequality) and rearranging it a bit we arrive at the inequality
1/2 ≥ 1, which is clearly false.

We offer another proof here. It is a direct proof, but somewhat more difficult, so it may be a challenge for you to go through it and try to understand it. This concludes the section on proofs.

You might have noticed that in proofs we rely on things that are already known, but these had to be proved as well, and those proofs had to use something reliable, and so on; where does it all end? Mathematical theories are like trees: from known statements and properties we branch up to more and more statements, each proved using things that are lower in that tree. But what are the roots, what is at the beginning? Actually, nothing. There are no basic facts in mathematics that are somehow true by themselves.

This may seem unsettling, so let's look at it somewhat closer.
Mathematicians spent a lot of time tracking down proofs and came up with a
relatively short list of things that are needed in order to build all the
rest. It was decided to take those key things for granted, these basic facts
are called **axioms**. They define basic rules and all the rest of
mathematics stems from them. Thus when a mathematician says that something
is true, it actually means that it is true assuming that all the standard
axioms are accepted as true. And where do those axioms come from? Since we
want math to be useful, axioms are chosen in such a way that they agree with
the way the world appears to us.

However, this does not mean that mathematicians actually need to believe in these axioms. As a matter of fact, many mathematicians like to play and just for fun they try to see what would happen if they tried to take a different set of axioms. They obtain alternative mathematical theories, describing worlds where things work differently from ours, even a small change in basic axioms may have far reaching consequences. Some of these worlds are fascinating and deserve further attention, some are funny and some weird. Probably the most amazing thing is that some of these different worlds turn out to be useful. For instance, in the 19th century, mathematicians tried to see how things would work in a world where geometry (lines, angles, parallels etc.) is almost but not quite the same as the one we are used to (they changed one axiom). To everyone's surprise, these results were exactly what later Einstein needed to describe the universe. Similarly, mathematicians were curious about what would happen with spaces if the dimension was increased to infinity. Such infinite-dimensional spaces then became a key tool for physicists when they got to playing with atoms (quantum theory).

To sum it up, mathematics can also be viewed as follows: We create a world by specifying its objects and some basic rules governing them (definitions), and mathematics then deduces reliable information about such a world. Investigating worlds whose basic rules are not exactly the same as those of our own world is hard. We cannot use our experience with how things work, we cannot make experiments, we cannot use our senses. There is a constant danger: the temptation to automatically use something that we consider obvious, because we know it from our world. The only defence against such a mistake is logic; it makes sure that whatever we claim about a certain world follows only from its basic rules. Its role as a guardian angel of correctness is crucial also when investigating our world using the standard set of axioms, where it protects us from mistaken assumptions and other mistakes of reasoning. This concludes this remark and this section.