Math Foundations from Scratch


This is part IB of the AI Tripos to learn the basics of mathematical reasoning so deduction and strategies for writing proofs. No background is required. We start with some typical linear progression then it turns into a game learning through solving problems. There is a seperate linear algebra workshop here.

Do 1% effort for 100 days - Prof Ryan O'Donnell.

What is math

As per Poh-Shen Loh it is one of the easiest subject you will ever take because there is little to memorize, you can deduce whatever it is you don't know using mathematical reasoning. The closest analogy would be lawyers, they deduce from statutes and case law new arguments. We will be doing the same thing using methods of reasoning. Once you start to get familiar with this logic then you can use it not only to manipulate models you build in math of real life things but in everyday life your thinking will change to become more rigorous. In math the model studies you while you study the model.

Terence Tao believes mathematics is complex, high-dimensional (as in moving in any direction, like the depiction of an ancient deity with many arms) and evolving in unexpected and adaptive ways.

Start here

We start from the very beginning counting sticks and work our way up to the CMU Putnam Seminar.

The natural numbers

Numbers are objects you build to act like the concept of a number. Here we begin to construct the natural number object.

  • YouTube - Arithmetic and Geometry Math Foundations 1 *note these videos eventually get better in quality

Arithmetic with natural numbers

  • YouTube - Arithmetic and Geometry Math Foundations 2

An interesting explanation of multiplication.

  • Laws for addition
    • n + m = m + n (Commutative)
      • Changing the order of the elements.
    • (k + n) + m = k + (m + n) (Associative)
      • Changing the grouping of the elements.
    • n + 1 = s(n) (Successor)
  • Laws for multiplication
    • nm = mn (Commutative)
    • (kn)m = k(nm) (Associative)
    • n \(\cdot\) 1 = n (Identity)
  • Note the different notation for multiplication:
    • nm
    • n \(\cdot\) m
    • n \(\times\) m
  • Distributive laws
    • k(n + m) = kn + km
    • (k + n)m = km + nm

Exercise: Prove why the multiplication and distributive laws work. Since we have no idea what a proof is yet, we can instead try going through some of our own examples.

Identity law: n \(\cdot\) 1 = n.

  • We are creating n groups of 1. According to our definition of multiplication |||| \(\cdot\) | would be 4 groups of | or ||||. This is still 4.

Commutative law: nm = mn.

  • ||| \(\cdot\) || = || || || or 6. Rearranging this, || \(\cdot\) ||| is ||| ||| which is again 6.

Associative law: (kn)m = k(nm).

  • If k = |, and n = ||, and m = ||| and parens expressions are evaluated first:
    • Left side: (kn)m. We first evaluate (kn) as || (2 groups of 1, the identity law), and 2m is ||| ||| (2 groups of 3) for a total of 6.
    • Right side: k(nm): In the parens nm is 2 groups of 3, so ||| |||. k(6) is the identity law, one group of 6. Thus (kn)m = k(nm).

Distributive law: (k + n)m = km + nm.

  • If k = ||||, and n = ||, and m = |:
    • Left side: (|||| + ||) is 6, and (6) x 1 is six groups of 1 or 6 according to the identity law.
    • Right side: km + nm: 4 groups of one is 4, plus two groups of 1 is 2 (identity law). Answer is again 6.

Laws of arithmetic 2

  • YouTube - Arithmetic and Geometry Math Foundations 3

Wildberger shows another interpretation of multiplication, and walks through the proofs of the laws by rearranging according to the commutative law.

Subtraction and division

  • YouTube - Arithmetic and Geometry Math Foundations 4

Subtraction is the inverse of addition

  • If k + m = n then k = n - m
  • Note: n - m is only defined if n > m as we are still in the realm of the Naturals and negative numbers/integers aren't yet defined.

+, -, are binary operations and can only do 2 operations at a time

  • Laws of Subtraction:
    • n - (m + k) = (n - m) - k
    • n + (m - k) = (n + m) - k
    • n - (m - k) = (n - m) + k
  • Distributive laws of subtraction:
    • n(m - k) = nm - nk
    • (n - m)k = nk - mk

Note to remove parens: we can drop the grouping enforcement when using addition (associative law) but must distribute the negative sign, as subtraction is not associative.

  • n - (m + k) = (n - m) - k
    • removing grouping: n - m - k = n - m - k

The Hindu-Arabic number system

  • YouTube - Arithmetic and Geometry Math Foundations 6

A simplified Roman numeral system is introduced to abstract our previous notation of one's so we can manipulate bigger numbers.

  • Uses powers of 10
    • 327 = 3(100) + 2(10) + 7(1)

Arithmetic with Hindu-Arabic notation

  • YouTube - Arithmetic and Geometry Math Foundations 7

Multiplication with the Hindu-Arabic notation:

  • Same laws as before:
    • (ab)c = a(bc)
    • a(b + c) = ab + ac
    • (a + b)(c + d) = ac + ad + bc + bd

A different way of hand multiplication is shown, taking advantage of the fact the notation is using the distributive law.


  • YouTube - Arithmetic and Geometry Math Foundations 8

Division is repeated subtraction. Laws of Division:

  • 1. \(\frac{k}{m}\) + \(\frac{n}{m}\) = \(\frac{k + n}{m}\)
  • 2. \(\frac{k}{m}\) \(\cdot\) \(\frac{n}{l}\) = \(\frac{kn}{ml}\)
  • 3. \(\frac{k/m}{n}\) = \(\frac{k}{mn}\)
  • 4. \(\frac{k}{m/n}\) = \(\frac{kn}{m}\)


  • YouTube - Arithmetic and Geometry Math Foundations 9

We're still using only natural numbers.

  • Def: A fraction, is an ordered pair (m,n) of natural numbers also written m/n.
    • Fractions m/n and k/l are equal precisely when ml=nk.

Arithmetic with fractions

  • YouTube - Arithmetic and Geometry Math Foundations 10

We're still in the natural numbers, which he defined as beginning with 1 so we aren't working with zeros yet (no division by zero corner cases). The prime notation (the ' character so c', a', b') used in the proof represents another variable. What we are trying to prove is ab' = ba' and cd' = dc'. He introduces some extra notation multiplying both sides by (dd') to help prove this. This is our first experience with a proof strategy: multiply both sides of something linear by a scalar.

Laws of arithmetic for fractions

  • YouTube - Arithmetic and Geometry Math Foundations 11

Wildberger proves the distributive and commutative laws using fractions while still in the natural numbers.


Now we will redo what we just learned rigorously in Terence Tao's Analysis I. The author explains here why we need this rigorous stage. This book was originally supplementary notes for an honors analysis class, and in the preface he mentions we will need some lectures to go with the material to help us understand it. We will read only some of this text at first, then return to it later when we're doing problem challenges, by that time you will be much more sophisticated through practice.

Nature of mathematical logic

Figure 1: What is a proof from 15-251

Axioms are laws without a proof, as there are no earlier laws from which we can deduce their proofs from and they should be self-evident and simple enough that they can be fully understood such as Euclid's parallel postulate that it is obvious there exists a concept of parallelism.

Deduction can be described as if P then Q, assume P and prove Q which is part of the rules of inference. There is a series of lectures given by Per Martin-Lof in 1983, which in clear language anybody can understand, explains the meaning of logical constants and the justifications of the rules of inference, and their history such as hypothesis a Greek word translated into Latin as supposition, meaning an assumption. In math the contrapositive of statements is often used for proofs such as 'if P then Q so not Q then not P' as these are logically equivalent statements.

Variables are used to express general laws, for example how do you express that every natural number is equal to itself ie: 2 = 2 and 3 = 3. You introduce a free variable x = x and it's meaning remains fixed inside this context. Often the prime symbol is used to denote two different variables with the same letter such as b and b' called b and b prime.

The operation of replacing all occurences of one variable for a value is known as substitution such as a + 0 = a and if a = 1, then substitute 1 + 0 = 1. There is also a substitution rule, where equivalent terms or formulas can be substituted for one another and this becomes highly abstract interchanging entire complex expressions. In books you will often see a succession of equal signs: (thing) = (bigger thing) = (even more complex thing) meaning all 3 can be substituted for each other.

Math has data structures just like programming languages, and you can create new ones if you want. There are sets like {3, 1, 5} in which order does not matter, and lists like (a1, a2, …., an) which are ordered data. These are often interpreted as vectors or n-tuples like (x, y) coordinates. A list (a, a, b) is different from (a, b) as order matters but set {a, a, b} is the same as {a, b}. Math is object-oriented so theories describe classes of structures and the instance of those structures models the theory/axioms.

That's it really: rules of inference, logical deduction, building models and manipulation of data structures. Much of the above was taken from texts you can find on library genesis about mathematical logic like Introduction to Metamathematics by Kleene if you want to see how reduction is used to collect a bunch of theories and reduce them to axioms, and how similar it all is to computer science with relational models, decidability, complexity theory and determining semantics of a natural language.

Lectures on logic

This lecture gives examples of an initial object, and deduction rules. You really just need to watch up to 5:30, then skip to 13:35 to see his ATM example of deduction and how he deduces the proof that the ATM can provide any postitive natural number denomination except for 1 and 3. The rest of the lecture is optional where he models deduction using a boolean circuit: given input X, Y or Z these are passed through gates such as NOT, AND, OR to arrive at an output which is your deduced formula.

  • Logic - Lecture from 15-251 CMU

Around 32:00 is the model for 'pretty much all mathematical reasoning' and logical quantifiers like 'for all' or 'there exists' and propositional connectives such as 'implies' or 'iff' are introduced. Most of this is in Tao's appendix.

You only need to know these connectives:

  • AND \(\land\) (conjunction)
  • OR \(\lor\) (disjunction)
  • NOT \(\neg\)
  • Implies \(\implies\) is only false when A is true and B is false
  • If and only if \(\iff\)

These connectives can of course be used to define each other:

  • A \(\implies\) B is \(\neg\) A \(\lor\) B
    • implication is only true when either A is false OR B is true
  • Iff is (A \(\implies\) B) \(\land\) (B \(\implies\) A)

Implication seems confusing at first, it's modern language use relating to casuality is different from the logician's use which is explained here and those Per Martin-Lof lectures above gives a complete justification of the logical rules of implication.

Some more optional lectures on the history of logic from Aristotle to the present state:

  • Aristotle and deduction - Brief history of logic MF 251
    • Origins of modern logic
  • Stoics and other thinkers - Brief history of logic MF 252
    • Implication, disjunction and conjunction is discussed here and highly relevant to all further chapters
  • Islamic/Medieval logic - Brief history of logic MF 253
    • Modal logic origins
  • Leibniz to Boole - Brief history of logic MF 254
    • From deduction to induction
    • Notation for relations
    • Boole's book Analysis of Logic
    • De Morgan's laws

Materials needed

  • The 3rd edition of Analysis I by Terence Tao Buy it cheap from Amazon India/Abebooks, or use library genesis (see Wikipedia page for their latest domain) or WorldCat to get it.

2.1 The Peano Axioms

Start with Analysis I chapter 2. We begin again with basic counting but rigorously defined.

  • The five Axioms
    • Axiom 2.1: 0 is a natural number
    • Axiom 2.2: If n is a natural number, then n++ is also a natural number
    • Axiom 2.3: 0 is not the successor of any natural number
    • Axiom 2.4: Different natural numbers must have different successors
    • Axiom 2.5: Let P(n) be any property pertaining to a natural number n. Suppose that P(0) is true, and suppose that whenever P(n) is true, P(n++) is also true. Then P(n) is true for every natural number n.
      • We make an assumption about a property P(n) being true meaning you do not prove P(n) it is already assumed to be true, you prove P(0) and P(n++) are true, and all three form the logical chain that also proves n+2, n+3, etc.

You can think of n++ being a unary successor operation to generate the natural numbers, while n + 1 will be defined later as a binary operation. The idea is no matter how many natural numbers you have generated, n++ will generate one more.

Remark 2.1.10 the example where no natural number can be 'a half number', because of the increment step n++. The base case of 0 (as per Axiom 2.5, and Axiom 2.1) fails if we are using 'half numbers' instead of natural numbers because if zero is a natural number, and any n++ is another natural number, the only next successive number is 1 it can't be 0.5 or -3 or 3/4. The way you should view these axioms about the natural numbers is:

  • 1. 0 is an element of N
  • 2. If x is an element of N, then so is x++
  • 3. Nothing else is an element of N

This is called an inductive definition: clause 1 is an application, clause 2 a succession of applications, clause 3 is the extremal clause stating only steps 1 and 2 are instances of what N is.

P(n) is not a function it represents "n has the property P". Think about how the natural numbers are generated with successors, those generated natural numbers all inherit properties of previous generated numbers such as P(n): "any natural number n + 0 = n". If you prove that for 0, n and the next successor n+1 that these all have property P(n), any new successor will inherit this property too since they are derived in a chain from P(0) as per Axiom 2.1. You have already encountered mathematical induction in elementary school when presented with a summing progression such as '1+2+3+4…' the inductive step is 'and so on'.

2.2 Addition

The example for addition: 3 + 5 should be the same as (((5++)++)++) and adding 0 to 5 should result in 5.

Definition 2.2.1 we can add n++ to m by defining (n++) + m to be (n + m)++. You can read this as (thing)++ and anything in brackets is just a value to be determined, so you can consider it as a single value to be incremented. Example (1 + 2 + 3)++ + 5 is (n++) + 5.

Try writing your own examples:

  • 0 + m = m
  • 1 + m = ((0++) + m) or m++
  • 2 + m = ((1++) + m) or ((m++)++)
  • 3 + m = ((2++) + m) or (((m++)++)++)
  • 4 + m = ((3++) + m) or ((((m++)++)++)++)
  • 2 + 3 = ((3++)++) or (4++) or 5.
  • 3 + 2 = (((2++)++)++) or ((3++)++) or (4++) or 5.

First proof

Let's go through his proof in Lemma 2.2.2. Notice first it is a written essay, and that's how all your proofs should look. Second, he is using the exact same induction template from 2.1.11. Third he's using a different variable to not confuse it with the already established definition of 0 + m = m reminder these variables are just placeholders for value and if you see m twice or n twice it means the same value.

Induction proofs are a logical chain:

  • P(basis First link contains the property and is true, you prove this
  • P(n) Assume this second link contains the property, you do not prove this because you assumed it's already true
  • P(n++) Prove successor is true, with ability to use the previous 2 like they are already established definitions

How does the base case for n + 0 = n become 0 + 0 = 0, from the previous definition of 0 + m = m? We are inducting on n so once we substitute in 0 for n we now have 0 + (value) = (value) which is from the definition of addition: 0 + m = m. Since (value) is 0 it is 0 + 0 = 0.

Suppose that n + 0 = n (assumption). Then the successor (n++) + 0 = n++. Pointing to the definition of addition where it defines (n++) + m = (n + m)++ we see it is in the same form as (n++) + 0, so if we say something is equal to something else, we can replace them for each other: (n++) + 0 becomes (n + 0)++ as they are equal so interchangable.

Finally we are left with (n + 0)++ = n++. The value in the brackets (n + 0) is in the form of P(n) the inductive hypothesis or assumption, since we have assumed n + 0 = n is true, we can again replace n + 0 for n and arrive at the proof: (n)++ = n++. Remember to consider your assumption 'already true' and a definition you can just point to while in the inductive proof. Why is induction used? Because we are building up a structure. He defined natural numbers as being the successor of another, which means an incremental layer upon layer that you can recursively take apart back to the base case just like you can in computer programming. Property P(n) exists in base case 0, plus 1, plus another, so this property holds for all natural numbers derived from the Peano axioms.

The most important takeaway: no arithmetic was done here at all. We pointed at definitions, and reasoned about them using replacement to exchange their forms until they were the same on both sides of the equal sign. You are a logician now pattern matching forms and moving them around. If thing = thing2, then thing2 = thing. At this early stage you could literally cut out the definitions and assemble proofs by swapping out definitions for each other.

How can you check this is correct? Reverse the logic, if n + 0 does not equal n, then 0 + 0 does not equal 0.

Second proof

Lemma 2.2.3 Here we are proving n + (m++) = (n + m)++, inducting on n. Since we are building a logical chain of deductions we start at the first link, which is 0 and induct on variable n meaning we replace n with 0 to get the form: 0 + (m++) = (0 + m)++. Looking at the right side, get out your scissors and cut apart the definition of addition and replace (0 + m)++ with (m)++. So now we have 0 + (m++) = m++. Since we know from the definition of addition that 0 + (value) = (value), you can replace 0 + (m++) with m++ and now the base case is equal on both sides: m++ = m++

Next, we assume n + (m++) = (n + m)++.

We prove the successor: n++ + (m++) = ((n++) + m)++. Looking at the left side, refer to the definition of addition and notice (n++) + m = (n + m)++ or (successor value) + (something) = (value + something)successor, the left hand side is now (n + (m++))++. This form is similar to the hypothesis of n + (m++) = (n + m)++ so replace the left side and now we have ((n + m)++)++. The right side follows the same logic as the left. ((n++) + m)++ can be seen abstractly as (value)++ or (successor + value)++, and the second is the definition of addition again so ((value + value)successor)++

You can always insert concrete examples to see what is going on, ie for n+(m++) = (n+m)++ let n=2, m=0 then 2+(0+1) = (2+0)+1.

Third proof

Proposition 2.2.4 (Addition is commutative) the base case is point-to definition of addition and Lemma 2.2.2. We use the definition of addition, and lemma 2.2.2 to rewrite the successor of n to be (n + m)++ = (m + n)++ and now it is in the form of the inductive hypothesis: (hypothesis form)successor = (hypothesis form)successor so we can use our assumption that n + m = m + n to commute m and n around.

Fourth proof & Implication

Proposition 2.2.6 we want to prove b = c. In the inductive step, he uses Axiom 2.4 to drop down from the successor to the form: a + b = a + c which is the form of the inductive hypothesis, where we claimed 'a + b = a + c implies b = c' so we can now imply b = c and have 'virtual cancellation without subtraction'. Tao writes about implication in the appendix which you should read then return to many times over as it comes up in later chapters.

Exercise 2.2.1

Prove 2.2.5 (Addition is associative) that (a + b) + c = a + (b + c)

We want to show that if a natural number is inside or outside brackets, it will have the same result when added together. Induct on any variable (a, b or c) you want. Starting our logical chain at the first link 0:

  • Base case:
  • (a + b) + 0 = a + (b + 0) (substitute 0 for c)
  • looking at left side: (a + b) + 0
    • since b + 0 = b (lemma 2.2.2) then (value) + 0 = value
    • (a + b) + 0 = a + b (lemma 2.2.2)
  • looking at right side: a + (b + 0)
    • a + (b + 0) is a + (b) for the same lemma 2.2.2 reasons
  • now we have a + b = a + b and base case is done.

We selected c for the base case, but what if you choose a or b? Try it while looking at the definition of addition. The next step in our logical chain is our inductive hypothesis(IH) that addition is associative, and that (a + b) + c = a + (b + c). The last part of the logical chain is prove the next link, the successor of our hypothesis, is true. There's a few ways to show this, I'll move stuff around randomly and you can see where the proof is:

Inducting on c: (a + b) + (c++) = a + (b + (c++))

  • right side: a + (b + (c++))
    • a + (b + c)++ (lemma 2.2.3)
    • ((a + b) + c)++ (by the IH)
    • ((a + b)++) + c (def of addition)
    • ((b + a)++) + c (commutative)
    • ((b++)+ a) + c (def of addition)
    • c + ((b++) + a) (commutative)
    • c + (a + (b++)) (commutative)
  • left side: (a + b) + (c++)
    • (c++) + (a + b) (commutative)
    • (c + (a + b))++ (def of addition.. consider (a + b) to be (value)): Recall (n++) + m = (n + m)++ so this means (c++) + (a + b) is (c + (a + b))++ because (n++) + (value) = (n + value)++
    • (c + (a + b))++ is the same as ((a + b) + c)++ (commutative around the (a + b) bracket)
    • (a + (b + c))++ (by the IH)
    • ((b + c) + a)++ (commutative)
    • etc., can keep moving these around

Try moving everything around until you understand exactly how commutativity and associativity work. See if you can get it to a form where you can use the inductive hypothesis, replace and permute them around some more until this starts to make sense.

2.2. cont.

Lemma 2.2.10 has an exercise where induction is suggested. Always read the comments on Tao's blog if you're stuck but this is the Peano Axiom 2.4 so not sure why it's even here, it's just restating Axiom 2.4 and 2.3. Assume for contradiction that b++ = a and c++ = a so a positive number is the successor of two different natural numbers. But by the Peano axioms this is not possible so c++ must equal b++. That is my good enough proof so we can later use this lemma to replace 'a positive number' for b++.

Definition 2.2.11 n is greater or equal to m if n = m + some other natural number. If that number is 0, then the equality holds, if that number is 0++ then the inequality holds.

Another way to think about order of the natural numbers, is because they are inductively defined and generated, they already have an order. You can think of m < n if m is generated before n.


Try a few yourself, write a convincing argument to yourself like how Tao writes proofs. Click on details button to see my scratchwork you do not have to do all the exercises.

Proposition 2.2.12 / Exercise 2.2.3: Prove the basic properties of order for natural numbers. None of this is a properly written proof only notes on moving through steps to understand the problem then after write a proof in the style Tao has been teaching us. See the appendix what reflexive, transitive and anti-symmetric properties are.

  • Order is reflexive
  • Reflexive property is that any natural number is equal to itself.
  • We want to show order is reflexive: a \(\ge\) a
  • a = a + n (def 2.2.11 of ordering of natural numbers)
  • From lemma 2.2.2 we know that a + 0 = a and so a = a and is reflexive
  • By definition 2.2.11 we also have a \(\ge\) a
  • Transitive order property
  • a \(\ge\) b and b \(\ge\) c, then a \(\ge\) c
  • If a = b + 0, b = c + 0, and a = c + 0, then we have lemma 2.2.2 and transitivity (equality) holds with a = b, b = c and a = c.

Try some more equivalency replacement to prove a \(\ge\) b and b \(\ge\) c, then a \(\ge\) c

  • a = b + n
  • b = c + n
    • b is equal to (c + n) so replace for b in a = b + n:
  • a = (c + n) + n
  • a = c + (n++)
    • By lemma 2.2.10 we have a = c + (value) since n++ will always equal some nonzero natural number.
  • By definition 2.2.11 a = c + (value) is a \(\ge\) c + (value)

Possible proof strategy: By the definition of ordering of the natural numbers, if a = (b) + m and b = c + n, then a = (c + n) + m. By associativity a = c + (m + n) and since if a = b when m or n are 0, then a = c when m and n are 0. By replacement d = m + n, and since m and n are natural numbers, d is a natural number by the Peano axioms and 2.1.3 (m + n)++ = d++, and by Axiom 2.4 (m + n) = d so we have a = c + d as desired by the definition of ordering of the natural numbers.

  • Anti-symmetric order
  • If a \(\ge\) b and b \(\ge\) a then a = b
  • See the entry for Antisymmetric relation
  • If these two inequalites hold then they must be equal
  • We need to show a = b:
    • a = b + m by 2.2.11
    • b = a + n by 2.2.11
  • Replace b for (a + n):
    • a = (a + n) + m
    • a = a + (n + m)
    • a + 0 = a + (n + m) (lemma 2.2.2, if a = a + 0 then if you see a single variable, adding 0 to it is fine)
    • 0 = n + m (cancel a by cancellation law)
    • 0 = 0 + 0 (corollary 2.2.9)
  • If n = 0, and m = 0 by above, then a = b + 0 which is a = b

Scratchwork for (d) preserving order exercise:

  • d) Addition preserves order
  • a \(\ge\) b iff a + c \(\ge\) b + c
    • Use def 2.2.11 to convert a \(\ge\) b to the form a = b + d, add c to both sides:
    • a + c = b + d + c by 2.2.11
    • a = b + d by cancellation law and a \(\ge\) b by 2.2.11
  • Another method: if a + c \(\ge\) b + c then:
    • a + c = b + n + c by 2.2.11
    • a + c = b + c + n (commutative law)
    • (a + c) = (b + c) + n (associative law)
    • This is (nat number) = (nat number) + n or by 2.2.11 a \(\ge\) b
      • Again don't get hung up on what is inside the brackets, think of larger picture of def 2.2.11 that (value) = (value) + number

Scratchwork for exercise (e):

  • e) a < b iff a++ \(\le\) b
    • if a is less than b, then b = a + n and n does not equal zero (2.2.11)
    • Since n must be a positive natural number by 2.2.7 and 2.2.11 then b = a + (n++) by 2.2.10
    • b = a + (n + 1) (corollary from 2.2.2/2.2.3)
    • b = a + (1 + n ) (commutative law)
    • b = (a + 1) + n (associative law)
    • b = (a++) + n by previous corollary of n++ is n + 1
  • We have a++ \(\le\) b by 2.2.11

Scratchwork for exercise (f):

  • f) a < b iff b = a + d for some positive number d
  • For the sake of contradiction, assume a \(\le\) b in the a = b case for some positive number d. Then a + 0 = a + d by lemma 2.2.2, and 0 = d by cancellation law. This is a contradiction that d must be a positive number so a cannot equal b and a \(\lt\) b

Proposition 2.2.13 fill in the gaps:.

  • When a = 0, we have 0 \(\le\) b for all b (why?)
    • 2.2.1 Addition of natural numbers, 0 + m = m so 0 + b = b and by 2.2.11 0 \(\le\) b
  • If a > b then a++ > b (why?)
    • a = b + d (as per 2.2.12, d must be positive)
    • c++ = d (lemma 2.2.10)
    • a++ = b + (c++) (Replacing c++ for d)
    • a++ > b by def of ordering
  • Or another way:
    • a++ = (b + c)++ (def of addition, (b + c)++ must be positive
    • a = b + c (Axiom 2.4)
    • a > b (2.2.12 d is positive)
  • If a = b, then a++ > b (why?)
    • If a = b, then a++ = b++ by Axiom 2.4
    • a++ = b + 1 (corollary 2.2.2/2.2.3)
    • this is a++ > b by 2.2.11

Proposition 2.2.14 (Strong induction) Intuitively, if the natural numbers are generated by an inductive definition, then you can assume all numbers generated up to m that are less than n, meaning n is no longer just the next step after the base case, it can be anywhere in the chain of generation and you assume an entire sequence of generated naturals up to n have property P(n).

First look up vacuous truth or see the appendix on mathematical reasoning where Tao defines it.

  • Let Q(n) be the property that P(m) is true for all m\(_0\) \(\le\) m < n for n \(\ge\) m\(_0\).
  • Q(0) is true since:
    • m\(_0\) \(\le\) m < 0 is vacuously true as stated by Tao in the exercise hint
    • 2.2.14 proposition of for all n \(\ge\) m\(_0\) is true 0 \(\ge\) 0\(_0\)
  • Show that Q(n++) is true:
  • P(m) m\(_0\) \(\le\) m \(\lt\) n \(\lt\) n++ is true (2.2.11)
  • P(m) m\(_0\) \(\le\) m \(\lt\) n++ is true for all n \(\ge\) m\(_0\)

Try an example.

  • P(n) is the property that n + 0 = n
  • Q(0) is 0 + 0 = 0 and P(0)
  • Q(1) is 1 + 0 = 1 and P(1), P(0)
  • Q(2) is 2 + 0 = 2 and P(2), P(1), P(0)

2.3 Multiplication

Proposition 2.3.4 follow along with definition 2.3.1 of multiplication where (n++) x m = (n x m) + m also recall that n++ is n + 1: (n + 1)m so nm + m

  • left side:
    • a(b + (c++)) is a(b + c) + a because (m(n++)) = m(n) + m from def 2.3.1
    • a(b + (c++)) is a(value)++ or a(value) + a
  • right side:
    • ab + a(c++) is ab + ac + a
    • ab + ac + a is in the form of the inductive hypothesis so change it for a(b + c) + a

Remark 2.3.8 the contradiction is that ac = bc by corollary 2.3.7 when fed into the trichotomy of order can only result in a = b. This is a scheme to produce a temporary notion of division/cancellation.


Scratchwork and notes

Exercise 2.3.1 From def 2.3.1 we have 0m = 0. So we need m0 = 0. Tao states to mimic 2.2.2 but for multiplication:

Lemma X: For any natural number n, n0 = 0.

  • base case: 0(0) is 0 x 0 = 0 by 2.3.1 so 0 = 0
  • IH: n0 = 0
  • Inductive step: (n++)0 = 0
    • (n++) x m = (n x m) + m by def 2.3.1 so (n0) + 0 = 0
    • Remove the + 0 (lemma 2.2.2)
    • Now it is in the form of the IH: n0 = 0 so we can replace 0 for n0
    • 0 = 0 and induction is closed.

From 2.3.1 we have (n++) x m = (n x m) + m but not n x (m++). We need a Lemma Y: n x (m++) is (n x m) + n rewriting Lemma 2.2.3 but for multiplication. Plug in values to make sure it's correct: 1 x (2++) = (1 x 2) + 1 or 2 x (2++) = (2 x 2) + 2.

Lemma Y: n(m++) is (nm) + m:

  • base case: Our lemma X and def 2.3.1
  • IH: n x (m++) = (n x m) + n
  • Inductive step on n: (n++ x m++) = (n++ x m) + n++
  • Left: (n++ x m++)
    • (n++) x (thing) is (n x thing) + thing by def 2.3.1
    • (n x m++) + m++
    • It's in the form of IH now, we can replace (n x (m++)) for (n x m) + n
    • ((n x m) + n)) + m++
      • (n x m) + n + m + 1 since m++ is m + 1
  • Right: (n++ x m) + n++
    • (n++ x m) can be rewritten to (n x m) + m as per def 2.3.1
    • ((n x m) + m)) + n++
      • (n x m) + m + n + 1 since n++ is n + 1

Now we have to prove mn = nm

  • Base case: def 2.3.1 and lemma X
  • Hypothesis: mn = nm
  • Inductive step: m(n++) = (n++)m
    • Lemma Y = def 2.3.1

Exercise 2.3.2

The hint is to prove the second statement first, if n and m are both positive, then nm is also positive. You could use Proposition 2.3.6 as a template for this:

  • If n > 0, and m > 0, then nm > 0
    • We have nm = 0 + d for some positive d
    • Since d is positive, n and m must be positive (or else 0 = 0 + d, a contradiction)

Or you could use Lemma 2.2.10, n x m = (b++) x m and then mimic 2.3.6 to determine n x m is positive if n and m are positive. Note that (b++) x m expands to the form (b x m) + b, and since you know a x b + a is the successor form of multiplication it can't be zero.

Lemma 2.3.3: n x m = 0 iff n = 0 or m = 0. You could try a contradiction using the nm > 0 proof we already wrote, so it must be n = 0 or m = 0 if nm = 0 or a contrapositive of Tao's statement: see the appendix on mathematical reasoning. If Tao states something, and then proves it, then there is a contrapositive statement we can use.

Exercise 2.3.3 For any natural numbers a, b c, we have (ab)c = a(bc). Tao suggests using the distributive law:

  • Inductive Step: (a(b++))c = a((b++)c)
    • (a(b + 1))c by corollary n++ = n + 1
    • (ab + a)c by distributive law
    • (ab + a) is also a(b++)
    • a(b++)c is also ac(b++) by commutative law
      • you can see where this is going, keep distributing and moving things around like we did before to prove associativity for addition

Exercise 2.3.4

Prove the identity.

  • (a + b)(a + b) = (a + b)a + (a + b)b by distributive law/exponentiation
    • aa + ba + ab + bb by distributive law
    • a\(^2\) + ba + ab + b\(^2\) by def 2.3.11/Examples 2.3.12
    • ba + ab is ba + ba by commutative law
    • ba + ba is 2ba by def 2.3.1 where 2 x m = 0 + m + m
      • (a + b)\(^2\) = a\(^2\) + 2ab + b\(^2\)

Set Theory

History of set theory

There is a lecture on the history of sets/functions, explaining ordinals, cardinals, various schools of logic, and a second lecture about the history of set theory with Godel and Turing.

Appendix A: the basics of mathematical logic

This chapter on set theory in Analysis I relies on the fact that you've already read the appendix on logic. Tao makes a point that this chapter is meant to be read and understood not studied and memorized. Proposition A.2.6, proof by contradiction, we will learn later that when pi/2 (90deg) the sine function is 1.

Exercise A.5.1

  • a) For every x++, and every y++, y*y = x or x is the square root of y. Is this true? Square root of two is the obvious candidate for claiming this is a false statement since what y * y = sqrt(2).

Lectures on deduction

Optional lecture, you've already done some deduction from writing proofs, however this lecture shows what is and what isn't deducible. The end of the lecture has overlap with the appendix and the first logic lecture in this workshop.

Quiz: Logic

These multiple choice quizzes are from Tao's Analysis class, try the logic quiz. If you struggle search YouTube for proofs by implication, or search 'discrete math inference rules' there are dozens of tutorials.

3.1 Fundamentals (Set Theory)

You may want to watch this lecture by Wildberger he visualizes sets.

Example 3.1.2 is similar to a nested list such as [list: 1, [list: 1, 2], 100] is length 3, even though there is a nested list overall it still only contains 3 elements. Remark 3.1.3 is in the above historical lectures on set theory. We are referred to Appendix A.7 Equality to read about the substitution axiom which you should have already read.

Lemma 3.1.6 if set A is equal to set (empty), then if x is in A it must also be in set empty, a contradiction, so A must be a non-empty set is how I interpret the lemma, which is a way to say that if an empty set exists, there must also be a possibility of a non-empty set. Remark 3.1.8 Cardinality if you watched the Wildberger History of Mathematics lecture linked above assigns a value.

Note the errata for Definition 3.1.4 from Tao's site: "Definition 3.1.4 has to be given the status of an axiom (the axiom of extensionality) rather than a definition, changing all references to this definition accordingly. This requires some changes to the text discussing this definition. In the preceding paragraph, “define the notion of equality” will now be “seek to capture the notion of equality”, and “formalize this as a definition” should be “formalize this as an axiom”. For the paragraph after Example 3.1.5, delete the first two sentences, and remove the word “Thus” from the third sentence. Exercise 3.1.1 is now trivial and can be deleted." So this is now Axiom 3.1.4, but the definition remains the same.

Exercise 3.1.2 prove emptyset, {emptyset}, {{emptyset}}, and {emptyset{emptyset}} are all not equal.

  • If \(\emptyset\) = {\(\emptyset\)} then by Def 3.1.4 \(\emptyset\) \(\in \emptyset\) a contradiction to Axiom 3.2
  • Same reasoning as above for \(\emptyset\) is not equal to {{\(\emptyset\)}} or {\(\emptyset\){\(\emptyset\)}}
  • {\(\emptyset\)} is not equal to {{\(\emptyset\)}} because [list: 1] is not equal to [list: [list: 1]]. The first has element 1 in it, the second has element [list: 1].

Remark 3.1.21 clarifies the above exercise how {2} and 2 are distinct objects. is not in {1,2,3} but 2 is in {1,2,3}. The rest of this chapter is straight forward definitions of set operations.

Example 3.1.32 looks a lot like programming syntax. There seems to be an order of evaluation, applying the axiom of specification first to create a new set, then applying the axiom of replacement so working right to left in {n++ : n \(\in\) {3,5,9); n<6}

Quiz Sets

Try Tao's quiz. There are many Youtube lectures on basic theory.


You can skip these, even in the preface of this book Tao advocates to just read this chapter so I've only done a few.

  • 3.1.3 prove the remaining claims in 3.1.13:

If a and b are objects, then {a, b} = {a} U {b}: By (new axiom) 3.1.4 we need to show every element x in {a, b} is in {a} U {b}. Suppose that x is an element of {a, b} then by axiom 3.3 x = a or x = b. By Axiom 3.4 this means at least one of the following statements is true: x \(\in\) {a} or x \(\in\) {b}. We divide into cases. If x is in a, then by axiom 3.4 x \(\in\) {a} U {b}, similarly if x is in b, it is also in {a} U {b}. Thus in both cases we see every element of {a, b} is in {a} U {b}. Note the proof works by showing x is in both a AND b which is the definition of union.

  • The next claim, prove the union operation is commutative so A \(\cup\) B = B \(\cup\) A

Follows the same proof he did for the union operation is associative. Suppose x is an element of A \(\cup\) B. By def 3.1.4 and axiom 3.4 pairwise union then x \(\in\) A or x \(\in\) B. By disjuntion this is also x \(\in\) B or x \(\in\) A. So x is in B \(\cup\) A and x is in A \(\cup\) B.

  • 3.1.4 prove claims in Proposition 3.1.18

You can write both these proofs exactly like his 3.1.18 proof for transitivity. Suppose A \(\subset\) B and B \(\subset\) A. To prove that A = B we have to prove that every element A is an element of B, and every element B is an element of A by def 3.1.4. Let us pick an arbitrary element x \(\in\) A. Then, since A \(\subset\) B, x must also be an element of B. But since B \(\subset\) A, then every element of A is an element of B and every element B is an element of A. By 3.1.4 they are equal.

  • 3.1.15 Show how union, intersection and subset are logically equivalent:

Show that any two set operations implies another. If x is in A \(\cup\) B, then by definition of union x \(\in\) A or B, so by disjunction x is in B because logically we can pick either one. If x is in A, since A is a subset of B then x is again \(\in\) B. Thus x \(\in\) A \(\cup\) B if and only if x \(\in\) B and by definition of union, and axiom of equality 3.1.4, A \(\cup\) B = B. Since x is in B, if x is in A \(\cap\) B it must also be in A, so x is in A and B, which is the definition of intersection, and if A \(\cap\) B = A, and if A \(\cup\) B = B, then every element of A is in B so A \(\subset\) B.

3.2 Russell's Paradox

Wildberger has a lecture in his history of math series about this if you watch the lecture on set theory. 3.9 Axiom of Regularity is worth reading to see some more examples of what is and what isn't an element of a set otherwise you can skip this chapter.

3.3 Functions

Example 3.3.3 if you forgot range notation which Tao will teach us in later chapters. Remark 3.3.5 clears up some notation for functions notably a sequence subscript also denotes a function. Example 3.3.9 asks why 3.3.7 asserts that all functions from {empty} set to X are equal, because they have the same domain (nothing) and range.

Our first proof in this chapter is in Lemma 3.3.12, and if we can go by all the previous chapters, this will be our template for the exercises. It also doesn't make sense, so I checked the errata page and sure enough there is errata for this lemma depending what version of the book you're using.

Def 3.3.10 composition, you can think of that composition symbol as being a binary operator (function) that accepts two functions as input and outputs a single function.

3.3.14 See this tutorial and notice Tao has defined this function by it's contrapositive, which you can do as it is logically equivalent. Write your own examples for this definition. If two distinct elements of A map to the same element of B, this is not injective. However if those two A elements are equal to each other then it is injective:

  • A = {a, b, a} (function input)
  • B = {1, 2} (output)
    • f(a) = 1
    • f(b) = 2
    • f(a) = 1
      • but {a} = {a}, so f(a) = f(a) and this function is injective.
  • A = {b, c}
  • B = {2}
    • f(c) = 2
    • f(b) = 2
      • but {c} doesn't equal {b} so this not injective

3.3.17 Onto functions, another tutorial and for Remark 3.3.24 here is a tutorial for bijective inverse functions

Quiz Functions

Another multiple choice quiz from Tao's analysis class.


All optional, do you get what a bijective/surjective/injective function is? Then you're good we will do all these definitions again in a calculus course.

All of these proofs have a youtube tutorial or stackexchange answer, these are common proofs first year students do.

  • Ex 3.3.1 was deleted in errata (See Tao's site)
  • Ex 3.3.2 The proof for this is in the book
  • Ex 3.3.3 The empty function is always injective since it's 'vacuously true' that f(x) = f(x'). It is never surjective since there are no elements in the empty set that fit the surjective definition of every element in Y is a value of f(x) or y = f(x), and thus never bijective, since a function must be both injective and surjective to be bijective.
  • Ex 3.3.4 The tilde hat thinghy is just notation for two different fuctions, like f and f'. Cancellation law means we're trying to prove two different functions are the same so g = g' and f = f'. If f and g are injective, then g(f(x)) = g(f'(x)) and f(x) = f'(x), and by one-to-one definition of injective functions f = f' (since f is g's input). Surjective is similar, g(y) = g'f(x)) = g(f(x)) = g'(y) so g = g'. Does g need to be injective for this proof to always hold? Try corner case examples from the text, like if g = n2, then g(f(-1)) and g(f'(1)) have the same output, so two different inputs to g with same output means f(x) can't equal f(x') as per definition of one-to-one functions so we can't always derive f = f'.
  • Ex 3.3.5 If f is injective in g(f(x)), then g doesn't need to be injective. Consider an example A = {1}, B = {2, 3}, and C = {4}. f(1) = g(2) = 4 so f is still injective. The difference in 3.3.4 is the wording 'show that these two input functions are always equal if g in injective' and the second question 'Is this still true if g is not injective?' which means a single case where if g is not injective, and we get two non distinct inputs that equal the same outputg, the whole proof is false whereas here we are asked 'must g always be injective' so just needa single case of not needing to be injective.
  • Ex 3.3.6 Is described here see Remark 3.3.24 if you forgot what an inverse function is. Note how the inverse of f and f cancel each other out.
  • Ex 3.3.7 Similar to above and his association proof

Proof strategies

Watch this starting @ 36:00, from 41:00+ he shows how he solves problems step by step, trying small cases to understand the problem better and poking at the problem enough to finally figure out a proof.

3.4 Images and inverse images

This is only casual reading, skip the exercises and return to do them when these topics come up again.

Chapter 3.4 of Tao's Analysis I. Example 3.4.5 makes more sense if you read Example 3.4.6 too, f-1({0,1,4}) is the set of everything that could result in {0,1,4} if f(x) = x2, so {-2,-1,0,1,2}. The rest of this chapter, if it doesn't make sense on first read search youtube ie: indexed sets.

3.5 Cartesian products

Again more casual reading, come back to do the exercises as needed.

Another topic with plenty of youtube tutorials. Definition 3.5.7 notation for product is introduced, it looks complicated but all it's doing is specifying an index (i is greater or equal to 1, and less than or equal to n) and repeated cross product. The exercises some of the solutions are walked through on youtube. The cartesian plane itself is a cartesian product of R x R, you have tuples (x, y) or (x, y, z) so R x R x R.

3.6 Cardinality of sets

Skip the exercises like the last two chapters, we just want to be familiar with the material. Reminder in the preface of this book, Tao himself advocates to skip the exercises as 'set theory is not essential for analysis' but it will be essential for us to understand numerous other math topics, at which time we can come back here.

A good tutorial. Tao is using #(A) notation which means finite set cardinality, while |A| cardinality notation means infinite. There exists errata for the exercises specifically 3.6.8. The last exercise the pigeonhole principle if set A has greater cardinality than B, then f(x) A -> B can't be injective, since there would be two elements in A that point to a single element in B.

Integers and rational numbers

Introducing the integers

  • YouTube - Arithmetic and Geometry Math Foundations 12

Wildberger's definition is similar to Tao's, he walks through a proof of a-(b-c)=(a-b)+c

4.1 The integers

Chapter 4 of Analysis I by Tao. Definition 4.1.1 is exactly the definition from the above Wildberger foundations video. Lemma 4.1.3 proof you can finish most of it yourself without reading the whole proof. Lemma 4.1.5 Tao notes the same concerns Wildberger expressed why they both used the 'a less b' style of definition instead of just using Lemma 4.1.5 as the definition of integers, because of the many different cases that would be needed. Tao mentions this definition of the integers forms a commutative ring because xy = yx.


You don't have to do the exercises, though proving equality as per A.7 for every object he creates is worth trying. This is the poor scratchwork I did

4.1.1 Proving A.7 Equality again. From Def 4.1.1 equality is defined as: a–b = c–d iff a + d = c + b and the variables are natural numbers. Does a–b = a–b? In this defition: a + b = a + b so yes, since we already proved natural numbers obey the reflexive axiom. Does x–y = y–x or x + x = y + y and yes, x and y are natural number objects. If a–b = c–d, and c–d = x–y, does a–b = x–y? Converting everything to their respective natural number definition, and arranging on each side: a + d + c + y + a + y = c + b + x + d + x + b cancel and you're left with 2a + 2y = 2b + 2x, divide each by 2: a + y = x + b, converting this back into an integer this is a–b = x–y similar to Tao's proof in 4.1.1 of transitivity.

4.1.2 If -(a–b) = -(a'–b') are defined as b–a = b'–a' then b + a' = b' + a. This is the integer b–a = b'–a'. Now convert a–b = a'–b' to a + b' = a' + b and notice they are the same, and since natural number construction -(a–b) = -(a'–b') if a–b = a'–b'.

4.1.3 (-1) x a = -a to solve use Tao's advice in remark 4.1.7 to assign a = (a–b) or some other integer notation. In lemma 4.1.3 Tao writes 1 is equal to 1–0, so if want to negate 1, this is 0–1, and in full integer notation: (0–1)(a–b) = ((0a + 1b)–(0b + 1a)) which is (b–a), which is what (a–b) would be if -(a–b).

4.1.4 Prove the rest of 4.1.6, we already did x1 = 1x = x kind of in exercise 4.1.3, convert 1 to (1–0) and pick any arbitrary variables to represent x as (a–b) and this is easy to prove. Same with 0, use lemma 4.1.3 to convert to (0–0) so (x–y) + (0–0) = (x + 0)–(y + 0) or (x–y).

4.1.5 ab = 0 is (x–y)(r–s) = (0–0) or xr + ys = 0 which is true only if xr = 0 and ys = 0. So (0–0)(r–s) or (x–y)(0–0) or both are 0 since we are dealing with natural number constructions as per lemma 2.3.3 nm = 0 only if n=0 or m=0.

4.1.6 if a, b, c, are integers such that ac = bc and c is non-zero, then a = b. (a–a')(c–c') = (b–b')(c–c') and c is non-zero, then (a–a') = (b–b') or a + b' = b + a'. (ac + a'c')–(ac' + a'c) = (bc + b'c')–(bc'+b'c). This is ac + a'c' + bc' + b'c = b'c' + bc + a'c + ac' or factored to c(a + b') + c'(a' + b) = c(b + a') + c'(b' + a). Now we are the exact same place as Tao's proof in Lemma 4.1.3 where he wrote since a + b' = b + a' two sides are equal. If (c–c') was (0–0') then this would be 0 = 0 not (a–a') = (b–b'). Not a proof, but explains why the cancellation law works. If a + b' = b + a' we can also replace, so c(b + a') + c'(b + a') = c(b + a') + c'(b + a'). Subtract one of the c(value) or c'(value) from both sides so c'(b + a') = c'(b + a') and now we can divide by c' to get b + a' = b + a', and since a + b' = b + a' then replace back b + a' = a + b' which won't work if c' = 0. Not really the proof Tao wanted but it kind of works.

4.1.7 You can also do these in integer notation. If a,b,c are integers, then a > b iff a - b is a positive natural number. Ordering is defined as a = b + n and a does not equal b in strictly greater than a > b. Taking a = b + n subtract b from both sides, we get a - b = n. If n = 0, then that means a - b = 0 which is a contradiction to Tao's definition that a and b are not equal and thus a - b = some positive n iff a > b is how I read that exercise, the rest are similar.

4.1.8 find an example property, where Axiom 2.5 as stated in the book doesn't work for integers. The axiom relies on the fact that P(n++) is true so negative n can break many properties since n++ will be another negative number unless you state like Tao does 'for all positive integers'.

Rational numbers

  • YouTube - Arithmetic and Geometry Math Foundations 13

This definition is similar to Tao's in Analysis I. The inverse law, write your own example, the rational number 4/1 inverse is 1/4 and 4/1 x 1/4 is 4/4 or 1. A rational number is (a-b)/(c-d) according to it's definition but we write them as p/q and in some cases just p if q = 1.

Visualizing the rational numbers

  • YouTube - Arithmetic and Geometry Math Foundations 14

The rational number line sits on [0, 1], so you don't end up with a zero denominator which isn't defined. A rational number is a line through the origin [0,0], and an integer lying on point [m,n]. Ford circles are an example of the massive complexity of the rational numbers, which are expanded upon in the next lecture.

The deep structure of the rational numbers

  • YouTube - Real numbers and limits Math Foundations 95

This is about 35m long but worth the watch if you want to see what a convex combination is, learn about rational intervals, and a more complete description of Ford Circles set up with Farey sequences. This will help us understand what a sequence is later in Tao's book. The overall takeaway of this lecture is that the smaller the interval you take on the rational line, the more complex the numbers become with enormous denominators meaning it has a very deep structure.

The next videos, #96 and #97 about Stern-Brocot trees and wedges are highly recommended and can also be found in the book Concrete Mathematics. In Lecture #97, he takes a visualization on the plane of the Stern-Brocot tree and shows it also can be used as a geometric interpretation for the rational numbers, it's kind of crazy when you see it. The 'fraction' 1/0 is also explained.

4.2 The rationals

Tao's Analysis I chapter 4.2, definition 4.2.1: 3//4 = 6//8 = -3//-4 we know from Wildberger's visualization of the rational numbers, these numbers all lay on the same line. Lemma 4.2.3, in the proof of addition getting rid of +bb'cd is a simple case of subtracting it from both sides, cancelling it out leaving ab'd2 = a'bd2 and he can divide both sides by d because d is known to be non-zero, since it's in the denominator of a/b + c/d.

To recap we defined the natural number object, embedded them in the integers (a - b) to define integers, then embedded integers inside the rational numbers (another constructed object) where the arithmetic of the integers is consistent with the arithmetic of the rationals. The rationals are a field because of the last identity in 4.2.4 the inverse algebra law. Tao's example proof is similar to the proofs we've seen Wildberger do, convert the letters into integer // integer form, apply law, commute the results together so that they are the same.


These are pretty easy, try them. 4.2.6 in my copy must be a mistake 'show that if x, y, z are real numbers…' that must be 'rational numbers' and sure enough checking Tao's errata page it is a mistake. There's a mistake in exercise 4.2.1 as well depending which version you're using.

  • Ex 4.2.1

We have the usual exercise, show that the definition of equality for rationals is reflexive, symmetric and transitive. We go into the Appendix A.7 Equality and see that the Reflexive axiom is given any object x, we have x = x. In Definition 4.2.1 equality of two rational numbers is defined as a/b = c/d iff ad = cb. If a/b = a/b then ab = ab thus reflexive. Symmetry axiom, if x = y then y = x. If a/b = c/d, symmetric equality is c/d = a/b. By definition 4.2.1 a/b = c/d is ad = cb, and c/d = a/b is cb = ad so Symmetry axiom satisfied.

The hint for proving transitivity is to see Corollary 4.1.9 the cancellation law for integers: If ac = bc, and c is non-zero, then a = b. In A.7 the Transitive axiom states if x = y and y = z then x = z. If a/b = c/d, and c/d = e/f, then a/b = e/f or if ad = cb, cf = ed, then af = eb. We can also rewrite this as adcf = cbed and simplify to af = eb. Staring at this we isolate af(cd) = eb(cd) so the rest of the proof involves logical deducation to determine that c and d aren't zero so we can cancel them from both sides, or through this deduction infer another equality of ab = ef, there is a few ways you could prove transitivity. For example we know that d can't be zero since c/d wouldn't be a rational number. If ad = 0 since we know d is positive, a must be 0. If c is zero, then cf = ed is 0 = ed and again d must be positive so e = 0. So now we have a and e as possible 0, so af = eb is 0 = 0 or: 0/b = 0/f and transitivity is proven though in a convoluted sort of way using deduction and making a lot of implications.

The transitivity proof is a great example of just naively writing your proofs until you get better eventually. We used Tao's hint of finding numbers than can or can't be zeros and are getting better using mathematical logic to deduce answers.

  • Ex. 4.2.2

Prove whatever remains to be proved in Lemma 4.2.3, let's try the negation since the multiplication is simple. If a/b = a'/b' then ab' = a'b is the setup Tao gives us. If -(a/b) = (-a')/b' then by def 4.2.1 we get (-a)/b = (-a')/b' which is -ab' = -a'b and multiplying each side by -1: ab' = a'b holds. Lemma 4.2.3 multiplication: If (a/b * c/d) = ac/bd and (a'/b' * c/d) = a'c/b'd we need to prove ac(b'd) = a'c(bd) by manipulating it into the form ab' = a'b since a/b = a'/b'. We can just do what Tao did, cancel c and d from both sides, leaving ab' = a'b however we should make sure they aren't zero, using the same kind of deduction we used to prove transitivity. We know d can't be zero since c/d would mean it's not a rational number if zero. Another strategy of course is to just add a scalar to the original ab' = a'b of (cd) on each side, which is fine, and solves the proof, providing you point to the definition in the book where a = a is the same as ab = ab.

  • Ex. 4.2.3

Prove the rest of Proposition 4.2.4. The first is the same as Tao's demo proof. Scratch work, you should try all these yourself they are simple:

  • x + 0 = 0 + x = x
    • a/b + 0/1 is (a*1) + (0*b)/b*1 or a/b
    • 0/1 + a/b is (0*b) + (a*1)/b*1 or a/b
  • x + (-x) = (-x) + x = 0
    • a/b + (-a)/b is (ab + -ab)/bb = 0/bb or 0
    • (-a)/b + a/b is (-ab + ab)/bb = 0/bb or 0
  • xy = yx
    • a/b*c/d is ac/bd
    • c/d*a/b is ac/bd
  • (xy)z = x(yz)
    • (a/b*c/d)e/f is ac/bd*e/f or ace/bdf
    • e/f(a/b*c/d) is e/f*ac/bd or ace/bdf
  • x1 = 1x = x
    • a/b*1/1 is a/b and also 1x as per xy = yx which we just proved
  • x(y + z) = xy + xz
    • simple task to do on paper a/b(c/d + e/f) = (cfa + aed)/bdf
    • a/b*c/d + a/b*e/f = ac/bd + ae/bf or (acbf + aebd)/bbdf and cancel one of the b's which we know can't be negative since it's in the denominator
  • rest same as above
  • Ex 4.2.4

Prove Lemma 4.2.7 is exactly the same as Tao's proof for Lemma 4.1.5 of the integers just replacing rational numbers and their definitions.

  • Ex 4.2.5

We've already proven propositions of order in naturals and integers, this is similar but with a/b < c/d

  • Ex 4.2.6

If you multiply -1 to both sides negating them, then 1 < 2 will end up -1 > -2 which is what should happen since -1 is not as small a qty as -2 if our interpretation is a number line. Proof is somewhere in Def 4.2.8, if x < y then x - y is a negative rational number so a/b - c/d = (-e)/f. Then come along with negative rational z: (-g)/f and multiply both sides, we get a positive number so x > y, at least in scratch work I did until I ran out of time.

Decimal numbers

The next chapter in Tao's Analysis I uses decimals to represent tiny positive quantities (epsilon) so we should probably know what decimals are.

Wildberger has a basic introduction lecture using natural numbers that assumes you know nothing about exponents and reviews the hindu-arabic notation, shows how to convert a decimal to a fraction. The second lecture introduces rational number decimals and their visualization.

His definition is a decimal number is a rational number a/b, where b is a power of 10 (hence deca/decimal), then shows various examples how you can manipulate rational numbers like 1/2 to be 5/10, or 0.50, we also find out why 1/3 is not a decimal number. In binary, which is a base-2 system, the only prime factor is 2. If you load up any language interpreter and try and add 0.1 + 0.2 you will get surprising results, typically this number: because of the way floating point is used to represent decimals.

At 36:00 arithmetic w/decimals is covered, showing how the rational number and decimal number arithmetic methods are the same so you can just stay w/rational numbers if you want converting back to decimals at anytime.

Appendix B: the decimal system

From Tao's book Appendix B.1 The decimal representation of natural numbers his opinion is that the decimal system is not essential to mathematics, computers have replaced it with binary/floating point and 'we rarely use any numbers other than one-digit fractions in modern mathematical work'. Of course to understand scientific notation we still need to know how decimals work and we just did with those Wilberger lectures so this chapter is optional (we can't do it anyway before learning the the real numbers). The single exercise in B.1 is interesting, 'the purpose is to demonstrate that the procedure of long addition taught to you in elementary school is actually valid'.

4.3 Absolute value and exponentiation

Def 4.3.1 you apply that def to the rational number, so if x = -3, the absolute value is -x so -(-3) which of course makes it a positive rational number, as seen in the example of 4.3.2 where the absolute value of 3 - 5 = 2.

Def 4.3.4 that symbol is epsilon, meant to be a tiny quantity he defines as greater than 0. The example 4.3.6 if two numbers are equal, and their difference is zero, then they are 'epsilon close' for every epsilon since their difference is not bigger than any positive epsilon.

Proposition 4.3.7 recall that all these definitions are for d(x, y) or d(z, w) so x - y or z - w. His proof he moves xz to one side, so it becomes negated (subtract from both sides). He then manipulates |az + bx + ab| according to previous definitions to get |a||z| etc., and then uses the various propositions in 4.3.7 to rewrite them as eps|z| + delta|x| + eps(delta). This will make sense after we attempt to prove the rest of the propositions.

Rest of this chapter covers exponents and their properties.


Exercise 4.3.5 Prove that \(2^n \ge n\) for all positive integers n using induction.

  • base case: n = 0
    • 20 = 1 and since n = 0, 1 > 0.
      • but Tao specifically wrote 'positive integers' so base case must be 1
    • 21 = 2 and since n = 1, 2 > 1
  • assumption: \(2^n \ge n\) for all positive integers n
  • successor: \(2^{n + 1} \ge n + 1\)
    • left side:
    • \(2^{n + 1}\) is 2n*2 by def 4.3.9
    • 2*2n is also 2n + 2n by prop 4.3.10 (a)
    • now we have 2n + 2n \(\ge\) n + 1
  • since 2n \(\ge\) n (pointing to our assumption), then 2n + 2n \(\ge\) n + n
    • n is only positive, so n + n is \(\ge\) n + 1
    • 2n + 2n \(\ge\) n + n \(\ge\) n + 1
      • here we have standard transitivity proving \(2^{n + 1} \ge n + 1\)

This of course falls apart if Tao changed it to 'for all integers' instead of positive integers, or if we allowed our assumption 2n > n to have n = 0, then we could not assume n + n is \(\ge\) n + 1. Search around for proving inequalities on YouTube there are dozens of demonstrations of various strategies.

4.4 Gaps in the rational numbers

The idea here is that certain things don't exist in the rational numbers object like the square root of 2, so there are 'gaps' where the square root of 2 should be.

Proposition 4.4.1 there is an integer n, and a rational number x, such that n <= x < n + 1 so sandwiched between integers 2 and 3 are rational numbers such as 25/10 or 100/40 etc.

Proposition 4.4.3 there is always some rational you can squeeze in between two other rationals as we've already seen in Wildberger lectures. 1/800 < 1/500 < 1/100 etc. If you want put in examples for his generalized proof to better understand it like x = 2, y = 3 and z = (2 + 3)/2.

Proposition 4.4.4 famous proof by contradiction there does not exist a rational number x for which x2 = 2 though Tao's is significantly longer than most and as per the preface this is for the reader's benefit. He rewrites (p/q)2 = 2 into p2 = 2q2 by multiplying each side by the denominator, which is q2 as p/q squared is p/q * p/q or p*p/q*q.

Proposition 4.4.5 is basically saying for any epsilon bigger than 0, there exists a rational number that when squared it's less than two, and less than x + epsilon. Remember there is an infinite amount of rational numbers between 0 and 2 like trillion/1000 billion = 1. 12 is 1, if epsilon is 1 so that x + epsilon = 2, then (x + eps)2 is (1 + 1)(1 + 1) and bigger than 2. He asks why ne (e = epsilon) is non-negative for every natural number n in (ne)2 and the parens grouping ne and squaring it obviously make it positive ie (-2)2.

Tao mentions Dedekind cuts will be avoided, and the theory of infinite decimals has problems so also will be avoided. Wildberger has numerous lectures on both theories if you're interested the most detailed ones are in his Famous Math Problems playlist. Wildberger makes the point that you will only ever find the cherry picked example of the square root of 2 for Dedekind cuts in any undergrad material because of the mind boggling complexity involved trying to partition a set for pi2/6 and other irrational numbers.

The infinite descent exercise Tao has basically handed the solution to us, if you fix any sequence of natural numbers then by their definition there are finitely many numbers from 0 to your fixed point, so you can't choose infinitely many distinct natural numbers less than your fixed point so infinite descent does not exist for the naturals because you'll run out of them. Integers you can infinitely descend to negative infinity and of course rationals we know you can infinitely descend with bigger denominators until you approach zero or descend to negative infinity.

Real Numbers

Real numbers are basically a convenient abstraction for doing pure math analysis that every popular math text uses however our computers all use floating point arithmetic which is basically a stream of data we grab from to take finer approximations. You could view the following chapters as Cauchy sequences are just proving the real numbers exists instead of actually proving the real number system.

What exactly is a limit

  • YouTube - Real numbers and limits Math Foundations 106

Watch the first 20 minutes of this lecture and you'll see what the notion of an epsilon displacement is. Note that Tao will define all these things for us later, but at least you'll have an idea of what epsilon of a sequence is. The rest of the lecture is excellent covering a different definition of limits but optional. 3Blue1Brown has an epsilon-delta definition of a limit here.

2D geometric interpretation of Cauchy sequences

  • YouTube - Real numbers and limits Math Foundations 94

Watch from 2:20 to 16:00, he shows the 2D geometric interpretation of Cauchy sequences, and how they are interpreted as 1D on the real line, the idea of 'converging sequences'.

Real numbers and Cauchy sequences I

  • YouTube - Real numbers and limits Math Foundations 111

He explains the triangle inequality around 8mins, we just read about this in chapter 4.3 of Analysis I (and is one of the exercises).

Real numbers and Cauchy sequences II

  • YouTube - Real numbers and limits Math Foundations 112

An excellent lecture why we even need these complex limit and real definitions in the first place, showing how limits make many subjects in analysis much easier. Archimedes bracketed measurement is covered when he tried to figure out the approximation of pi. We also get an introduction to transcendental functions as a geometric series, something we will see in Analysis II by Tao in detail. The next lecture in the series is optional but Wildberger presents an alternative definition of real numbers, using Archimedean style of nested intervals where equality, algebra laws, order etc are all trivial to prove.

5.1 Cauchy sequences

Wildberger has good lecture introducing what a sequence is here and more precisely here with index notation.

Def 5.1.1 the notation outside the bracket is n = starting index, and the infinity symbol is how far the index should go which is forever because we're now doing continuous mathematics.

Example 5.1.2 reminds us of Def 4.3.4 d(x,y) = |x - y| \(\le \epsilon\) so if x and y are 2, then 2 - 2 = 0 and every single epsilon is valid since x - y will always be less than epsilon.

The definition 5.1.3 of epsilon-closeness means two adjacent indexes like 1 and 2 in our above array when subtracted are less than or equal to the epsilon we declare as being epsilon-close. 1.4 - 1.41 (absolute value). The next set of adjacent sequence indexes is 2 and 3, so 1.41 - 1.414. Everytime we subtract two of the indexes their displacement or 'absolute value distance' should be closer and closer meaning a smaller epsilon.

Let's do an example:

  • a4 = 1.41 1.412 1.4143 1.41424
    • index 1 - index 2 = 0.1 (absolute value)
    • index 2 - index 3 = .004
    • index 3 - index 4 = .0002

This sequence is 0.1-close, after index 2 it's 0.004-close, after index 4 it's 0.0002-close.

Def 5.1.6 if you watched the Wildberger lectures on What Exactly is a Limit then you can see what Tao is setting up here with epsilon-steadiness, there is some index in the sequence where the entries past that index are epsilon-steady. Example 5.1.7 the sequence 10,0,0,0.. d(10,0) is 10, so this sequence is not eps-steady for any epsilon less than 10, but if we choose N past 10, then all the remaining sequence is 0,0,0.. and d(0,0) is 0 - 0 so as per Def 4.3.4 will always be less than any epsilon so 'eventually eps-steady for every eps > 0'.

Def 5.1.8 if you saw the above Wildberger lectures on Cauchy sequences this is defining a convergence so any sequence that eventually becomes epsilon-steady, is a Cauchy sequence. Example 5.1.10 depending on your edition there's errata, in mine (3rd edition Springer from libgen) the example makes no sense.

Prop 5.1.11 proof he cites the fact that there always exists some rational number you can squeeze in between numbers and chooses 1/N noting that no beginner would come up with this proof and it is merely staging in preparation of learning what a limit is. Later in Example 5.1.13 he notes Prop 4.4.1 can again be used to prove the infinite sequence 1, -2, 3, -4.. has no bound, I would guess because of the 4.4.1 statement 'there exists a natural number N and rational x such that N > x ie: no such thing as a rational that bounds all the natural numbers so an infinitely increasing and decreasing sequence can't be bounded.

Exercise 5.1.1 since we declared it a Cauchy sequence, you can point to the definition of Cauchy sequences being an object that eventually becomes epsilon-steady, so Lemma 5.1.14 proves the finite part of Cauchy sequences up to epsilon-steady are bounded, and obviously epsilon-steady means it's bounded since you can subtract two indices and they will always be less than epsilon if this is a Cauchy sequence and epsilon-steady, or 1-steady, 2-steady, 1billion-steady. So both are bounded are ergo all Cauchy sequences are bounded using his definition of episilon-steadiness.

There is a lecture here if you're interested more about the proofs used in this chapter.

5.2 Equivalent Cauchy sequences

Prop 5.2.8 is the proof that 0.9999… = 1, and again prop 4.4.1 is brought out to claim we can always choose any rational number we want to make 2/N <= epsilon. If you follow this proof:

  • (1 + 10-n ) - (1 - 10-n )
  • 10-n + 10-n = 2 x 10-n
  • 10-N is \(\le\) 1/N because of the chapter on exponents, if N = 3 then 10-3 is 1/3x3x3 which is smaller than 1/3
  • Now choose an N that satisfies the requirement that 2/N \(\le\) \(\epsilon\) and it will automatically be true that it also satisfies 2 x 10-N \(\le\) \(\epsilon\)
    • Prop 4.4.1: you can always choose some rational to shove in there that will satsify the req

Only two exercises here, simple point to definition pretend you are a lawyer writing an argument and convince an invisible jury through explanation why it's true an equivalent sequence of rationals is a Cauchy sequence if and only if one of them is a Cauchy sequence.

5.3 The construction of the real numbers

Tao has one of the few books anywhere that actually defines the real numbers. Just like understanding the constructions of rationals and integers helped us understand their operations better, understanding the construction of the reals should help us better understand things like xe or square roots.

Def 5.3.1 the formal limit of a Cauchy sequence, meaning the number a sequence is converging towards, is a real number. Every rational number is also a real number, for example if we choose 5, then a Cauchy sequence converging to 5 like 4.99999… is bounded by 5 and is a real number. The notation is: x is a real number if x = the limit of a Cauchy sequence an, as the sequence goes from index n to infinity so a1, a2, a3

The rest of the chapter covers the same definitions we've already seen except now we are working with a different object, but since it's construction is entirely made up of rational numbers in a sequence almost all the laws are the same. Division by zero not being allowed for real numbers is explained using the sequence 0.1, 0.01, 0.001.. is actually converging towards 0, and it's inverse 1/0.1, 1/0.01, 1/0.001 is 10, 100, 1000… an infinite unbounded sequence therefore not a Cauchy sequence, therefore not a real number, so division by zero is undefined.

The #1 takeaway for this chapter is the real numbers are all a limit. I am just casually reading the proofs in this chapter, Tao has simplified it for us with his epsilon-close 4.3.7 propositions and his definition of equivalent Cauchy sequences. The proof of Lemma 5.3.6 was already explained by Wildberger here if you're wondering what epsilon/2 means.

Looking at the exercises, I'm guessing to prove 5.3.5 we show the reciprocal of 1/n is unbounded so we must be working with a sequence not bounded away from zero. In prop 5.1.11 Tao proved the 1/n sequence is a Cauchy sequence, and since all Cauchy sequences are bounded you can also show it's bound is 0. There is worked out solutions for this chapter here but you will never need to know how to prove any of these definitions.

5.4 Ordering the reals

For me it was interesting to go back to prior definitions and see how they work when cast into this Cauchy series abstraction. Prop 5.4.8 if 5 > 4 then 1/5 < 1/4.

Prop 5.4.9 'non-negative reals' means 0 or positive, while positive reals do not include 0. Closed I take to mean a set is closed iff the operation on any two elements of that set produces another element of the same set, in other words adding two non-negative real numbers produces another non-negative real. The explanation of the proof is in the comments here but Tao tells us we'll see this more clearly in future chapters on closures.

Prop 5.4.14 we already know from previous proofs and Wildberger lectures that there is an infinite amount of rational numbers you can squeeze in between two other numbers.

5.5 The least upper bound property

Example 5.5.7 the empty set doesn't have a least upper bound because of prior definitions where you can always find a rational number (which are also contained in the real numbers) to act as an upper bound to nothing like 1/trillion, 1/septillion, etc.

The proof for Theorem 5.5.9 is huge requiring going back to look at the Archimedean property again and some exercises. Wildberger has broken down this proof showing the gist of what is going on but the last part of the proof that the upper bound S is less than or equal to every upper bound of E is easy to follow.

Remark 5.5.11 we get the answer to 'can you add negative infinity with positive infinity and get zero' which is no.

The proof of prop 5.5.12 makes sense until you get to that inequality, wondering where he got x2 + 5e, which is the Archemedian property. This is basically saying x is the least upper bound, x plus some tiny displacement (epsilon) squared is less than or equal to the Archimedian property of x2 + 5(displacement) since if epsilon is really small, squaring it means it will be even smaller. He derives a double contradiction that x2 is bigger than 2, and x2 is less than 2, so x2 must equal 2 and we have a number that solves x2 = 2 which is not rational.

Exercise 5.5.1 you can type into youtube 'proof of infinum' and get a dozen tutorials or you can go through Prop 5.5.8 and Theorem 5.5.9 to change everything to prove a least lower bound which gives some more experience with the Archmedian property which all these proofs seem to rely on. The rest of the exercises 'draw a picture' think of Wildberger with his within epsilon/2 picture in earlier lectures.

5.6 Real number exponents I

Finally we learn square roots. Depending on your copy of Tao's book there's some minor errata here to look up. Note in the proof of Lemma 5.6.8 he's using equivalency replacement, if \(y^a = x^{1/b'}\) then we can replace them both for any occurences, such as \((y^{a})^{a'}\) = \((x^{1/b'})^{a'}\). Anytime in the future when you see some complicated nested square roots you can go back to this chapter and understand them.

Exercises 5.6.2 and 5.6.3 are simple and worth doing, using replacement to manipulate exponent laws. Exercise 5.6.3 that's the square root of x2, so x or using Lemma 5.6.6(g) or Lemma 5.6.9(b): \((x^{2/1})^{1/2}\) is \(x^{2/1 \cdot 1/2}\) which is x1 or x.


7.1 Finite series

Wildberger has a lecture on summation of arithmetic and geometric series to go with this chapter.

We are just manipulating indices here. Make an example series for yourself: ai = 11, 32, 53. \(\sum_{i=2}^3 a_i\) means a2 + a3 or 32 + 53 as per Definition 7.1.1.

In Lemma 7.1.4 the triangle inequality proof is here note how they get the base case of 0 by setting the variable on top of the sigma less than the initial starting variable so all indexed summations are zero (let b < a) like Tao's beginning of chapter examples with m - 2, m - 1.

Check the errata for this chapter, in my version (third edition (hardback)) there's a few problems. Remark 7.1.9 the proof is straight forward when you flip back to Lemma 7.1.4 and note that because he already assumed x to be g(n + 1) he could not also assign x to h(n + 1), which is why he needed to introduce h(j).

7.1.13 why is P(0) true? Because the assertion P(n) is true for any set X with n elements, so if it has P(0) elements, it's sum is 0. Reminder to also re-read what a cartesian product is, the cartesian plane itself is a cartesian product of every value of x with every value of y.

Proving all these finite sum identities are easy once you get one of them, all you need is 7.1.1 and the recursive definition he uses to pull out values, manipulate and reindex the sum.

  • Try 7.1.4(d) without induction, expand the series, factor out the c, done:
    • \(\sum_{i=1}^n (ca_i)\) = ca1 + ca2 + … + can
    • c(a1 + a2 … + an) by distributive law
    • rewrite as c(summation notation)
  • 7.1.4(c) expand ai + bi, group them together:
    • (a1 + b1 +.. an + bn) = (a1 +.. an) + (b1 +.. bn)
    • rewrite in summation notation
  • 7.1.4(a) let m = 1:
    • (a1 +.. an) + (an+1 +.. ap) = (a1 +.. ap)

7.3 Sums of non-negative numbers

7.3.7, do you get how he rewrote that sum to the geometric series?:

  • \(2^{k}\frac{1}{(2^{k})^q}\)
  • \(2^k(2^{-kq})\) by Lemma 5.6.9(d)
  • \((2^{k +(-kq)})\) by exp laws xk * xk = xk+k
  • \((2^{1 - q})^k\) which is \(2^{k - qk}\) by Lemma 5.6.9(b) (xq)r = xqr
    • the geometric series is the sum xn
    • \((2^{1 - q})^k\) is (thing)k or xn

Just flip through the remainder of this chapter noting things you can always come back to if needed to understand better, such as rearranging infinite sums.

Trigonometry crash course

Let's take a very short crash course to remember trig:

In case you forgot what \(\pi\) (pi) is (for us just an applied constant of 3.14… or 22/7 or 355/113) Wildberger talks about the history of pi such as Archimedes interval for pi, Newton's formula, a hexadecimal version of pi, his own formula.

Let's start with the crash course lecture.

  • If you forgot highschool this video @ time index 6:00 shows how sine(x) is the y-axis, and cosine(x) is the x-axis, the whole thing is worth watching
  • Radians you could even make up your own notation and define it as DoublePI = 6.283… and then 360deg is DouplePI, 90deg is 1/4DoublePI, 180deg is DoublePI/2, etc.
  • We really only need to know how sine/cosine functions work, all others are derived from them

In the scaling of triangles example where \(\frac{1}{\sqrt 2}\) becomes \(\frac{\sqrt 2}{2}\) they are rationalizing the denominator:

  • \(\frac{1}{\sqrt 2}\cdot\frac{\sqrt 2}{\sqrt 2}\) is \(\frac{\sqrt 2}{\sqrt 2 \cdot \sqrt 2}\) which is \(\frac{\sqrt 2}{\sqrt 4}\) or \(\frac{\sqrt 2}{2}\)

Summary: There exists 3 triangles, which we should know their measurements, we scale them down since this is a unit circle with maximum x and y values of 1. Then we superimpose the triangles in the pi/2 (90deg) angle space on the unit circle in such a way that we have known values of cos()/sin() and other trig functions, because we can derive those other functions from cosine and sine.

Try the worksheet you can actually draw these out and count rotations if you want or use this radian to angle picture from Wikipedia. The equations if one side is squared, then you need to take the square root of both sides so it's (not-squared = value) instead of (squared thing = value-when-squared). If you're stuck watch the video that walks through all the solutions. How I solved these: subtract 1 pi if it's bigger than 2pi, or a 1/2pi if smaller and see what you get, look at the 3 triangles if they have a corresponding value for sin/cos. If not subtract pi/3, pi/6 etc.

If you hate this trigonomentry this guy reinvented it here with rational trigonometry and has a book on library genesis where angles are removed and replaced with vectors, he even respecified hyperbolic geometry to be completely rational it's crazy when you see it. If you want/need to practice Trigonometry, we will do so as this workshop progresses or search library genesis for Titu Andreescu's 103 Trigonometry Problem a book for training the USA IMO team.

Math circle

Now we completely change from linear progression through textbooks to a solving olympiad style problems. This is similar to a math circle essentially an informal group of all ages from professors to student TAs to middle school students get together to group solve a problem and repeat.

Let's start with Poh-Shen Loh's combinatorics problem solving seminars and a Russian olympiad-style book on basic algebra skills because you don't need a background more than what we've already done for either.


  • Mathematics via problems: Part 1: Algebra by Arkadiy Skopenkov.
    • A masterclass in basic algebra everybody should know
    • "This book attempts to build a bridge (by showing there is no gap) between ordinary high school exercises and the more sophisticated, intricate, and abstract concepts in mathematics"
    • The author is a professor at MIPT and uses this book for an olympiad circle
  • 21-228 Discrete Mathematics course (Sp 2021)
    • Lectures on his YouTube channel
    • Prof Loh is the current national coach of the US olympiad team who has won gold 4x (and even wins the math competitions for former medalists) so surely his content will be worth viewing.

Discrete math lecture 1

The first lecture begins ~14:30 mins. We start here so you can see how to investigate problems in math. He's breaking down some elementary combinatorics problems to show the logic of solving them. @37:40 why does the last digit have to be even? Because it's supposed to be an even 4-digit number so ending in 0, 2, 4, 5, 8. The general form is crazy but he did it by writing down a table of one digit numbers, two digits .. n + 1 digits to find the general pattern.

Assignment 1

Let's try the assignment. This discrete math course is unusual in that there is no hundreds of drill problems to do in the book, just the challenges he hands out. I will attempt to work through them like he does, writing down little examples and ideas and seeing what happens:

1.) The first is a symmetric difference assignment which is defined here. Look under properties, the answer is there 'from the property of the inverses in a Boolean group, it follows…'. We can eliminate B and are left with:

(A △ C) △ (C △ A) and from that same article, we can eliminate C so are left with A △ A which is just the empty set.

2.) We have a set bigger or equal to 3, and we need to know how many subsets contain exactly 2 permutations of 1, 2, 3 so {1, 2, 3, 5} or {1, 2, 3} does not count as it contains 3 permutations of 1, 2, 3 and the assignment wants only 2. This seems like a powerset problem, generating all possible subsets. Google any free powerset online calculator and try the base case, set {1, 2, 3} it contains 3 sets with exactly 2 numbers from 1, 2, 3. Try the next bigger set {1, 2, 3, 4}, I count 6 occurences. The set {1, 2, 3, 4, 5} has 12 occurences. I'm going to guess {1, 2, 3, 4, 5, 6} is 24 occurences, and that {1, 2, 3, 4, 5, 6, 7} will have 48 occurences. The pattern is 3x2x2x2.. now how to figure out what the n set is? If n = 4, it's 3x2, if n = 5, it's 3x2x2, if n = 6, it's 3x2x2x2. Writing out more examples I see it's: Length 4 choose 1 of the 2's, length 5 choose 2, length 6 choose 3, 7 choose 4, the choose n is the amount of 2's in 3x2x2x2.. and they differ by 3 each time. The solution seems to be any set with length n, subtract 3 and that's how many 2s to multiply to get the answer. A set {1, 2, 3…. n} where n = 10, it's 3 times seven 2's or 3x2x2x2x2x2x2x2 or 384 subsets.

3.) This is exampled by Wildberger here where he shows how to count the number of paths on a grid, it's Pascal's array or n choose k as detailed in his Famous Math Problems 5 lecture. There is also a writeup here how to avoid paths on a grid and is the answer to this problem. First we can visualize this with free online graphing software, either plotting y = 59 streets and x = 8 avenues or reduce the scope, since we're only interested in the (59 - 34) and (8 - 5) avenues. Zoom to see all the streets. If King Kong goes (59 - 34) streets and (8 - 5) avenues he has 25 streets + 3 avenues or 28 total moves in any direct path to the Empire state building then we have to figure out how to subtract the paths that involve 42nd and 7th ave.

Algebra chapter 1

Reading Mathematics via problems book by Skopenkov. The contents of this book are similar to undergrad courses like Math 135: Algebra for Honours Mathematics and if you search for the course description they removed the sage on the stage approach of lecturing and students solely do problems together in class, like a math circle. In the preface of Skopenkov's book we are informed that you don't read this book linearly, you take sections that interest you solving the level 1 to 2 problems, then going back later and trying the level 3 and 4 problems as you will be 'at a new level'. Prof K in the software workshop recommends the same method that you iterate over material you've already done multiple times each time returning to it at a higher level so you can see how concepts there transfer over to concepts you're currently learning. The preface also says we are doing this to learn both how a mathematician reasons their way through problems, and because 'the lack of skill (manipulating algebraic expressions) amoung olypians often leads to ridiculous and annoying mistakes'.

1.1.1(a) What are the rules of divisibility by 2, 4, 5, 10, 3, 9 and 11. This is an example of a great problem you could figure out yourself during spare time using just a phone calculator. Investigate first seeing if you can discover the rule, then look at author's proofs.

Page 2 are the proofs, he's using generalized hindu-arabic notation we already learned. First proof, subtract from n the last digit, ie: 112 - 2 = 110. What does he mean by 'divides each term of the sum' he means in hindu-arabic notation form, so 112 = 100 + 10 + 2 or 336 = 300 + 30 + 6 and 2 divides each term and the sum. The proof of divisibility by 4, any number n if you subtract the 10's and the last digit, it will be divisible by 4 like 177 is 177 - 70 - 7 or 100. The 4|n notation is in the preface, it means 4 divides n.

  • Exercise, prove divisibility of 5: The number n - a0 is divisible by 5. Suppose a0 is 5, then rewrite his other example proofs.
  • Exercise, prove divisibility of 10: The number 10a1 is divisible by 10, could use the fact that anything divides zero: 'suppose a0 is 0, or by definition a0 is < 10 so must be zero, then if a number divides every term of the sum, it divides the sum'.

The rest of the proofs put in concrete values if you don't understand them like the proof of divisibility by 3: 234 is (200 - 2) + (30 - 3) + (4 - 4) or 225 which is divisible by 3. 1224 is (1000 - 1) + (200 - 2) + (20 - 2) + (4 - 4) or 1215 which is also divisible by 3, and by 9. He inserts a function f(n) to represent the algorithm of summing even/odd indexes of numbers divisible by 11.

1.1.1(b) Trying some examples, 111,111|(106m - 1). It will also divide repeating number of 1's that are 106m digits long, so it's digit sum needs to be divisible by 6 ie 1012, 1018 etc. A huge number consisting of all 1's who's digits sum to 1993 (a prime number, so sum of digits not divisible by anything except itself) is not divisible by 111,111.

1.1.1(c) prove that a number consisting of 2001 1's is divisible by 37. We are only asked to prove for 11111… not a general number or rule of division by 37. Trying examples, any repeating number (111, 222, 333) who's digit sum is divisible by 3 is divisible by 37. Since 3|2001, it should be divisible by 37. Further investigation reveals that 37|1110 and 37|11,100 and 37|111,000 etc. 3|2001 is 667 so there are 667 'groups of 3' of the digits 111. If n = 111,111 then 37|n - (10a2 + 10a1 + a0) and 37|(10a2 + 10a1 + a0).

Finish the rest of chapter 1, so 1.1.3, 1.1.4 and 1.1.5. We will return again later to the same concepts over and over so you don't have to memorize anything.

  • 1.1.3(a): 2|(a2 - a) input small examples, like 32 - 3 is 6, whatever you substitute for a ends up even and is divisible by 2. A lot of Tao's rules are here, like not being able to cancel each side if the cancelled number is 0.
  • 1.1.3(f) if a = kb then ac = k(bc) which is still in the definition of division form: (thing) = k(thing).
  • 1.1.4 if a is divisible by 2 and 3, then a is obviously even, and since it is divisible by 3 as well, 3 x (even) will be divisible by 6. He uses an interesting proof for these but if 17|a and 19|a then 17x19 = 323 and 323|a because 17|323 and 19|323 is how I read it.

A good first chapter covering division which Wildberger in his beginning math foundations lecture claims is the one thing most students don't understand. Poh-Shen Loh also claims students usually don't understand ratios/fractions/division, that's why we are doing this.

Discrete Math Lecture 2

CMU Discrete Mathematics lecture 2/3. This class covers some linear algebra, which I do in the first assignment of 10-301 in the AI workshop and also here which also covers Eigenvalues/vectors. You can also watch Wildberger's lectures on linear algebra to see how a 2x2 matrix is multiplied. Around 43m if you're curious how to invert a matrix you can either type it into wolfram alpha and see instant results or watch this other Wildberger lecture you take the determinant, factor it out and multiply it by the 'adjoint' which is a matrix where every entry is it's determinant. Since this is all 1's we simply invert the minus/plus signs. This was a crazy lecture.

Last 10 minutes are about paths, which is the first homework we already did. Around 53mins notice LRRLLRRL no matter how you permute those instructions, you get to the destination. If you did LLLLRRRR you walk the edges of the square. If you did RRRLLLLR you end up in destination again. End of lecture is the homework, total amount of paths minus top path multiplied by bottom path. We figured this out ourselves. We will learn more linear algebra in the Putnam seminar, which is next up.

CMU Putnam Seminar

Another Prof Loh seminar we can do at the same time as the algebra and combinatorics problems, this one follows the book Putnam and Beyond second edition you can find on library genesis what has completely worked out solutions for all problems. Lectures are archived on YouTube and the class page is here.

According to Stanford "The competition emphasizes ingenuity rather than knowledge, so freshmen are not at much of a disadvantage compared to seniors".

Lecture 1 Introduction

Let's watch the Putnam Seminar 9/4 introduction.

At 13:31: the top 500 scores are only ~23 points out of 120 max possible points, the exam is that hard, meaning you solved the 2 easiest problems and maybe had insight into a third but didn't solve it. He explains having just 2 insights to a problem requires a mental search tree of 100 x 100 options repeatedly claiming this not an exaggeration, the Putnam is actually this hard since no 3rd party materials are allowed during the exam though in 2021 it's not proctored and entirely online, so all results will be unofficial.

@30:00 the first seminar starts, the pdf he pulls up is here. The very first problem, the 8x8 block pattern he already has a talk about this that gives away the answer, you need 6 bits of memory. Third problem @31:45 'integral coefficients' means they are integers. If you forgot what a polynomial is Wildberger has a brief crash course describing their properties and graphing in this linear algebra lecture. A root means an input to the polynomial so it evaluates to 0. We will of course learn polynomials doing problems about polynomials as we go.

Now we go through ideas, poking at the problem to solve it like p(1) to sum all the coefficients, a strategy to show all the integers are neither even nor odd so can't be rational. Notice how he's just abstracting away using boxes ignoring all the values. If you forgot what modular arithmetic is Wildberger also has a lecture for this in typical Wildberger style where he introduces the problem it's trying to solve, the history and then at around 11:43 the rules of k congruent to l mod m. A conceptual crash course is: consider an abstract system of the type (D, 0, successor) where D has two distinct objects 0 and 1, and let 0++ = 1 and 1++ = 0. This is called residues modulo 2. The natural numbers become this mod2 system when each natural number is replaced by it's remainder after division by 2:

  • 0/2 is remainder 0
  • 1/2 is remainder 1
  • 2/2 is remainder 0
  • 3/2 is remainder 1
  • 4/2 is remainder 0

He cleared the fractions by multiplying the polynomial by cn which you can see here (rational root theorem), which makes sense if (b/c)n then it's (bn/cn) ie (3/4)2 then 3/4 x 3/4 or 9/16. This introduction is worth rewatching a few times to learn how this style of problem solving works.

His other site Expii Solve has similar kinds of challenges they start easy and get harder with links at the bottom of each question where to find background material, you may want to try those too they help you learn all kinds of topics such as spherical geometry, probability and even calculus. It's a community written site so kind of like math overflow where anybody can write a tutorial for a subject and these are upvoted if other people find them useful, which could be good or bad, a well written Tao-like fundamentals post would likely not be upvoted compared to some bag of tricks with a clickbait youtube embed.

Lecture 2 Polynomials

The only prereq for this lecture is Wildberger's crash course in calculus here. Note how one function tells you how the other is changing and you can switch from velocity to acceleration to position. You now know basic calculus.

Watching the second lecture on Polynomials. If you don't know what the constant e is, watch this. The problem we are doing, try expanding the left hand side n = 3: $a1x1 + a2x2 + a3x3 is equal to 0 (has a real root, meaning not a complex number root) if the sum or the constants divided by the exponents + 1, equals 0. Recall from Wildberger's calc crash course that a polynomial constant like 1 in: 3x2 + 1, the 1 is really 1x0 so our problem could have constant/0+1 as well. Once again notice how he is brainstorming ideas and writing them all down.

What is an 'antiderivative'? First let's learn what an integral is.

An optional lecture, Wildberger breaks down exactly what an integral for a polynomial is. The derivative for xn is \(nx^{n-1}\) and the integral for xn is \(a^{n+1}/n + 1\) or x4 = x5/5. @6:48 he begins concrete examples starting at n = 0. If you watch part 2 he'll show different interpretations of the integral like Lebesgue integrals.

More integration and what an antiderivative is, which is f' = f meaning a function who's derivative is equal to the original function f. If f = 3x2 then the antiderivative is x3.

Back to the Poh-Shen Loh polynomial seminar. Do you now get why the antiderivative of a generic polynomial is how he wrote it? Take the antiderivative's derivative and the n+1 both cancel, you're left with \(a_{n}x^n + ... + a_{0}\). Idea #4, if you watch 3Blue1Brown's lecture on epsilon-delta definition of Limits, then this proof that all polynomials are continuous functions will make sense. A continuous function just means it has no abrupt changes to it's output/input ratio, ie: it it's geometric form is a continuous line with no breaks in the graph. MVT solution you can google, MIT's 18.01 has lectures and slides showing the geometric interpretation but Poh-Shen Loh gives a clear enough explanation. He explains he tried to get advanced placement when he was a student by just guessing his way through a differential calc class by using polynomials/power series to solve everything which actually worked.

He moves on to another question, a classic result for derivatives. "We're doing these problems to do a very fast review of all concepts in math" which is exactly what we want. Here we even see the fundamental theorem of algebra. He mentions complex number derivatives, Wildberger has a whole series on this with his dihedron algebra infinitesimal theory.

Very end of this lecture… it's the Russian algebra book chapter on divisibility we did! 'If I have a 'not multiple of 5', plus a multiple of 5, it's still not a multiple of 5'.

Algebra 2 Polynomials

I was shaky on the Putnam polynomial roots seminar, so let's see what the Russian olympiad book can teach us about finding roots for polynomials. Reading Chapter 3 Polynomials and complex numbers of the book Mathematics via Problems by Skopenkov.

3.1.1 and 3.1.2 we already know, since a rational number is two integers that produce a quotient. 3.1.3 is interesting in that I've never seen it before, that if a(p) = 0 then p divides a0. Example: A(x) = 2x + 4 and A(-2) = 0, since it divides a0 which is 4. This was taught to me a different, more confusing way. Make an example for (c): A(x) = 2x - 1 then A(1/2) = 1 - 1 = 0. Pick arbitrary k: 13. A(13) = 2(13) - 1 = 25 and this should be divisible by (1 - 13(2)) or -25, which it is.

3.1.4 because this is a Russian text, we are saved from cringe pi memes being used here. You don't have to hand calculate these, search for a trig unit circle diagram on google images. Remember cos() is the x value, sin() is the y value. 3.1.4(a) is 1/2 so rational, b) is square root of 3 over 2 so obviously not rational, the rest you can type into some online trig calculator and see they are infinite decimals, or see their phi definitions and phi if you search contains the square root of 5, an irrational number.

3.1.5 you can enter all of these into desmos and compare the graphs. Many of these were covered in the 3blue1brown lockdown math lectures on trig which I already went through in the calculus workshop. Just enter them and think about how you would prove this, that's good enough.

3.1.6 these are Chebyshev Polynomials and looking at section 3 Some properties of the polynomials Tn, and the identities in 3.1.5, you can guess the answer for example b), if you input cos(x), and get back cos(nx) then the polynomial is one of the Chebyshev polynomials but what is it for any n? The hint is 'use induction' or n + 1, and of course there is a helpful tutorial from on YouTube completely solving this problem if you're interested it's typical 'this is now in the form of an identity, so we will swap it out' type substitution we are already used to, and solving recurrence relations. The whole video is worth watching, seeing him substitute recurrence relations if you didn't get it from the compsci workshop. This guy is pretty good, if you search for his name Prof Sankaran Viswanath you get a lot of good lectures in other topics. Prof Viswanath runs an abstract algebra course using Artin/Dummit & Foote texts, and the Algebra II course is full of category theory like functors.

3.1.7 we didn't do the section on mod yet, but you can also guess and check using an online trig calculator. None of this is what we want though, we wanted to solve polynomials which is in chapter 3.2. TODO