What is the size of an angle in a triangle whose side lengths are equal to each other?
You might laugh at this question and reply: it is obviously $60^\circ$.
And your final answer is right—but how did you know that all three angles are equal to each other?
To answer this question properly, we need to discuss congruent triangles, from which we can discuss isosceles triangles, of which equilateral triangles are a special case.
Definition 1. Consider the two triangles in the diagram below.
We say that is congruent to , denoted , if their corresponding side lengths and angles equal each other:
In ordered tuple notation:
The topic of congruent triangles is commonly presented as a collection of “cookbook recipes” to determine if two triangles are effectively the same.
But how do we know that these recipes actually work? It turns out that we have the necessary “meta-recipes” to construct these recipes, and that’s what we shall do.
Theorem 1 (RHS Criterion). Let be the following two right-angled triangles with :
Then if and only if
Proof. In the direction $(\Rightarrow)$, we obtain all of the equalities by Definition 1.
For the direction $(\Leftarrow)$, suppose
We will deal with the case later. Observe that exactly one of the following holds:
We claim that the remaining two cases are impossible.
Suppose . Then we can overlay the triangles on their edges as follows:
Hence, so that . However, by Pythagoras’ theorem,
a contradiction, ruling out the case .
The case can also be ruled out by relabelling with and vice versa, and using the same reasoning as above (i.e. a symmetric argument).
Since all corresponding sides and angles equal each other, .
For the case , we could use a symmetric argument, or take the following even shorter approach: by Pythagoras’ theorem,
so that and implies , therefore
which, as shown just now, implies .
The RHS Criterion only describes congruence between right-angled triangles. Yet, more is true. Since right angles open our discussion on general angles, right-angled triangles also open our exploration of general triangles.
Theorem 2 (SAS Criterion). Let be the following two triangles:
Then if and only if
Proof. The direction $(\Rightarrow)$ is trivial.
For the direction $(\Leftarrow)$, the hypothesis tells us that
Construct the perpendicular heights (i.e. the altitudes) below.
Using rotation, we can assume . By construction,
are all right-angled triangles. Our line of attack is as follows:
Show that .
Deduce that using the RHS Criterion.
Similarly, .
Conclude that .
The first point is the most challenging.
Since and and , we must have . We already have by hypothesis, so it remains to prove .
If , then aligning with , we can draw the diagram below.
By Pythagoras’ theorem,
Since and , we have , a contradiction. Similarly, we can rule out the case . Therefore, .
By the RHS Criterion, . In particular, .
We next establish . By hypothesis, . In particular,
By Pythagoras’ theorem,
Since and , we have . By the RHS Criterion again, . In particular,
Finally,
Remark 1. This proof holds true in all three cases (i.e. and can be acute, right, or obtuse), so long as both of the other two angles are acute.
Once we have a congruence criterion that involves both sides and angles, we can derive many other useful congruence criteria.
Theorem 3 (SSS Criterion). Let be the following two triangles:
Then if and only if
Proof. The direction $(\Rightarrow)$ is trivial.
For the direction $(\Leftarrow)$, by hypothesis. Construct the altitudes once again.
If and , then by Pythagoras’ theorem, implies
so that , a contradiction. Therefore, by a symmetric argument, we can conclude that .
By the RHS Criterion, , so that
By hypothesis, and . Hence, by the SAS Criterion,
as required.
Example 1. Consider the triangle below.
Prove that if and only if . The direction is called the converse of Pythagoras’ theorem.
Solution. The direction is the already-proven vanilla Pythagoras’ theorem.
For the direction , suppose . Construct a right-angled triangle with base and height as follows:
By the vanilla Pythagoras’ theorem, . By hypothesis, .
Since and , we have .
By the SSS Criterion, . In particular,
Theorem 4 (ASA Criterion). Let be the following two triangles:
Then if and only if
Proof. It suffices to prove . By hypothesis, since angles in a triangle sum to $180^\circ$,
Similar to the proof of Theorem 2, we claim that . Superimpose onto such that and .
Since the corresponding angles and , we have . Since , passes through . By Playfair’s axiom, must lie on . Using a symmetric argument, must lie on .
Therefore, lies on the intersection between the line and the line . Since there exists one and only one intersection point between the two lines, namely , we must have , so that .
By hypothesis, and . Hence, by the SAS Criterion, , as required.
Theorem 5 (AAS Criterion). Let be the following two triangles:
Then if and only if
Proof. Since the angles in a triangle sum to $180^\circ$,
Therefore, by the ASA Criterion, , as required.
Definition 2. A triangle is said to be isosceles if it has two sides with equal length.
Example 2. Consider the triangle below with base angles and .
Prove that if and only if . In this case, we say that the base angles of an isosceles triangle are equal.
Solution. Our proof boils down to the analysis of with its “rearranged” self . Clearly , since they denote the same angle.
We prove in two directions. For the direction , suppose . Almost trivially, and . By the SAS Criterion, . In particular, .
The direction is proven similarly. Suppose . Then by relabelling, . Furthermore, , since they denote the same side. By the ASA Criterion, . In particular, , as required.
Definition 3. A triangle is said to be equilateral if all three sides have the same length.
Example 3. Determine the size of any interior angle in an equilateral triangle.
Proof. Denote the equilateral triangle by , and denote .
Since , by Example 2,
Since , by Example 2 again,
Since the angles in a triangle sum to $180^\circ$,
Therefore,
That is, each angle in an equilateral triangle has a size of $60^\circ$.
If the geometry on a triangle is fascinating, the geometry on a circle is even more astounding! We will visit the geometry of circles in the next post.
Calculus, in the 21st century, continues to be the sorrow of most students required to learn it against their will. It doesn’t need to be this way, though.
Consider the graph of below.
Define the point on the graph by . Using algebra, we can evaluate the gradient of the tangent at to .
Now consider the graph of .
Define the point on the graph by . Using algebra, we can evaluate the gradient of the tangent at to .
We are going to generalise this observation.
Definition 1. Let be the graph of a function. The derivative of at , denoted , is defined to be the gradient of the tangent at .
Define the derivative of by
Since , we can also write
We describe this process as differentiating with respect to .
Remark 1. Strictly speaking, the derivative is defined as a limit, and the gradient of the tangent is defined to be the derivative. However, we adopt the convention in Definition 1 for the sake of visual intuition.
Using more mathematical tools to establish a common pattern, we will define the derivative of $x^n$, where $n$ is any rational number.
Theorem 1. The derivative of $x^n$ with respect to $x$ is given by
$$\frac{d}{dx}\left(x^n\right) = n x^{n-1}.$$
Proof. See this post for the more formal perspective, and see this exercise for the algebraic calculation. We see that
and these results agree with our investigation in earlier examples.
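As a quick numerical sanity check of Theorem 1 (not a proof), the sketch below compares the power rule with the gradient of a very short secant line. The helper names `power_rule` and `central_gradient`, the step size `h`, and the sample exponents are illustrative assumptions, not part of the original post.

```python
def power_rule(n, x):
    """Gradient predicted by the power rule: n * x**(n - 1)."""
    return n * x ** (n - 1)


def central_gradient(f, x, h=1e-6):
    """Gradient of the secant through (x - h, f(x - h)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x - h)) / (2 * h)


if __name__ == "__main__":
    x = 2.0
    # Sample rational exponents, chosen only for illustration.
    for n in (2, 3, 0.5, -1):
        approx = central_gradient(lambda t: t ** n, x)
        print(n, power_rule(n, x), round(approx, 6))
```

The two printed columns should agree to several decimal places, which is the numerical counterpart of the claim that the tangent gradient is the limiting secant gradient.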
Example 3. Evaluate the following derivatives:
Solution. The first expression is immediate using Theorem 1:
The second expression requires the two laws of exponents and :
The third expression requires :
The fourth expression requires and more generally, :
The fifth expression requires and :
Is it possible to evaluate derivatives of combinations of these functions? Yes!
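Theorem 2. Given functions $f, g$ and constants $a, b$,
$$\frac{d}{dx}\big(a f(x) + b g(x)\big) = a \frac{d}{dx} f(x) + b \frac{d}{dx} g(x).$$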
Proof. See this post from a calculus perspective and this post from a linear-algebra perspective. This result is known as the linearity of the derivative.
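To see linearity at work on polynomials, here is a minimal sketch that stores a polynomial as a list of coefficients and differentiates it term by term with the power rule. The representation and the helper names `differentiate` and `scale_add` are assumptions made for this illustration only.

```python
def differentiate(coeffs):
    """coeffs[k] is the coefficient of x**k; return the derivative's coefficients."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def scale_add(a, p, b, q):
    """Return the coefficients of a*p + b*q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a * u + b * v for u, v in zip(p, q)]


p = [0, 0, 1]       # x^2
q = [1, 0, 0, 1]    # 1 + x^3
a, b = 3, -2        # sample constants

left = differentiate(scale_add(a, p, b, q))                   # derivative of a*p + b*q
right = scale_add(a, differentiate(p), b, differentiate(q))   # a*p' + b*q'
print(left == right)
```

Differentiating the combination and combining the derivatives give the same coefficient list, which is exactly what the theorem asserts.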
Example 4. Given functions , show that
Solution. Using the linearity of the derivative,
Example 5. Evaluate the following derivatives:
Solution. For the first expression, we use linearity as per Example 4:
Using the results in Theorem 1 and Example 3, we have
For the second expression, we expand , then use the previous answer:
For the third expression, we first expand :
We then use linearity and established results as per Theorem 1 and Example 3:
To shorten notation, we write .
Example 6. Given functions , show that
Solution. Since , we use linearity to derive
Example 7. Let be positive constants. In economics, the revenue earned from selling units of a good is given by
Define the two quantities related to the revenue:
The average revenue is defined by .
The marginal revenue is defined by .
Show that the -intercept of is half of the -intercept of .
Solution. By algebra, and
Using linearity,
Let denote the -intercept of and denote the -intercept of .
We first solve :
We next solve :
Hence, , so that .
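The conclusion of Example 7 can be checked with exact arithmetic. The sketch below assumes the revenue has the common textbook form $R(q) = aq - bq^2$; the formula in the original post did not survive extraction, so this form, the variable name `q`, and the sample constants are all assumptions.

```python
from fractions import Fraction


def revenue(q, a, b):
    """Assumed revenue from selling q units: R(q) = a*q - b*q**2."""
    return a * q - b * q ** 2


def average_revenue(q, a, b):
    """AR(q) = R(q) / q = a - b*q, for q != 0."""
    return revenue(q, a, b) / q


def marginal_revenue(q, a, b):
    """MR(q) = dR/dq = a - 2*b*q, by linearity and the power rule."""
    return a - 2 * b * q


a, b = Fraction(12), Fraction(3)   # sample positive constants
q_ar = a / b                       # solves AR(q) = 0
q_mr = a / (2 * b)                 # solves MR(q) = 0
print(q_ar, q_mr, q_mr == q_ar / 2)
```

Under this assumed form, the marginal revenue line is twice as steep as the average revenue line, which is why it reaches zero at half the quantity.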
What other combinations can we differentiate? Given functions , we can differentiate the following:
Tragically, they don’t follow the neat rules that we think they do.
Example 8. Which of the following equations are true?
Justify your answer.
Solution. Sadly, none of them are true.
We will use the counter-example and . Using Theorem 1, and .
For the first equation,
however,
Therefore,
For the second result, we leave it as an exercise to check that
For the third result, we leave it as an exercise to check that
To deal with these latter three results, we will need to look at the three musketeers of differentiation techniques: the chain rule, the product rule, and the quotient rule. For more complete proofs, see this post on the chain rule and this post for the product and quotient rules.
We will explore this differentiation trio next time.
This answer agrees with Problem 1, since setting gives
Problem 3. Given and , evaluate the gradient of .
Verify your answer in Problem 2.
Solution. Since and , so that the gradient is given by
This answer agrees with Problem 2 since setting gives
Problem 4. Let denote the line passing through . Show that is tangent to if and only if its gradient is .
Solution. Denote the gradient of by . Then
When intersects the curve ,
so that solving yields
Then or . For to be a tangent to the curve at , we must have both roots equal , so that
as required.
Therefore, by setting in Problem 3, we obtain the answer in Problem 4: the expression . Intuitively, this expression describes the gradient of the tangent at .
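The passage from a secant gradient to a tangent gradient can be sketched with exact fractions. The code assumes the curve is $y = x^2$ with $P = (a, a^2)$ and $Q = (a + h, (a + h)^2)$; the formulas were lost in extraction, so treat this setup and the function name `secant_gradient` as assumptions.

```python
from fractions import Fraction


def secant_gradient(a, h):
    """Gradient of the line through (a, a^2) and (a + h, (a + h)^2), for h != 0."""
    if h == 0:
        raise ValueError("h must be nonzero for a secant line")
    a, h = Fraction(a), Fraction(h)
    return ((a + h) ** 2 - a ** 2) / h


# As h shrinks, the secant gradient 2a + h approaches the tangent gradient 2a.
for h in (Fraction(1), Fraction(1, 10), Fraction(1, 1000)):
    print(h, secant_gradient(3, h))
```

Setting `h = 0` directly is not allowed in the quotient, which is why the algebraic simplification has to come first.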
Problem 5. Now consider the graph of .
Define the point on the graph by .
For any and , evaluate the gradient of in terms of .
Solution. Using the same strategy as per Example 1, since , the gradient is given by
We first expand :
Therefore,
If we set in the final result, we obtain the expression . Intuitively, this expression describes the gradient of the tangent at .
Problem 6. Given that lies on , evaluate the gradient of the tangent to . You may freely use the factorisation
without proof.
Solution. Denote and
so that for brevity (and possible generality).
We follow the approach in Problem 4. Denote the gradient of the tangent to by . Then
When intersects the curve ,
Solving yields
Hence, or . For to be tangent to , we must have and vice versa. In particular,
Remark 1. Problem 6 generalises to other kinds of functions using Carathéodory’s theorem.
Pythagoras’ theorem is the most important idea in geometry, all of mathematics even. But before discussing it, let’s ask a simple question.
Given a unit length , and a positive number , how can we construct ?
Recall that means that . We can achieve this construction goal using Pythagoras’ theorem.
(A proper construction, as with many other geometric ideas, relies on real analysis.)
Theorem 1 (Pythagoras’ Theorem). Given a right-angled triangle below,
$$a^2 + b^2 = c^2.$$
We call the longest side with length $c$ the hypotenuse of the triangle.
If $a, b, c$ are positive integers, we call $(a, b, c)$ a Pythagorean triple. The most famous such example would be the $3$–$4$–$5$ triple:
$$3^2 + 4^2 = 5^2.$$
Proof. Draw three extra identical triangles, so that we get a large square with side length $a + b$.
We claim that the slanted four-sided shape (i.e. a quadrilateral) in the middle is indeed a square.
Since the right-angled triangles are identical, we take two adjacent triangles and label their angles as follows.
We want to prove that . Since the angles in the centre are non-overlapping adjacent angles on a straight line,
On the other hand, since the angles in a triangle sum to $180^\circ$,
Therefore, using usual calculations,
Since the hypotenuses are all equal, we know that the quadrilateral has four equal sides. Hence, we have a square in the middle whose area is $c^2$, so that
$$(a + b)^2 = 4 \cdot \tfrac{1}{2}ab + c^2 \quad\Longrightarrow\quad a^2 + b^2 = c^2.$$
There are many applications of Pythagoras’ theorem that, in my humble opinion, are not worth discussing outside a tuition class. Yet, Pythagoras’ theorem can resolve for us a simple yet non-trivial question: what is a circle?
You might say, “a circle is a non-straight bendy or curvy line that closes in on itself”. This formulation is valid when formalised in the language of algebraic topology and homology theory.
Let’s keep things simple for now.
Definition 1. A circle with centre and radius is the unique set of points whose distance from is . In particular, we define the unit circle to be a circle with centre and radius .
There is one problem with this definition: how do we calculate the distance between two points without requiring any additional measurement?
Lemma 1. The distance between two points $(x_1, y_1)$ and $(x_2, y_2)$ is given by
$$\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}.$$
Proof Sketch. Suppose the special case and .
Construct the right-angled triangle below.
Then the hypotenuse of this right-angled triangle has length . Applying Pythagoras’ theorem,
Taking square roots gives the desired result.
The other cases involving or are left as an exercise to the reader.
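Lemma 1 translates directly into a few lines of code. This is a minimal sketch; the function name `distance` and the choice to pass points as `(x, y)` tuples are assumptions made for illustration.

```python
import math


def distance(p, q):
    """Distance between points p = (x1, y1) and q = (x2, y2) via Pythagoras' theorem."""
    (x1, y1), (x2, y2) = p, q
    horizontal = x2 - x1   # base of the right-angled triangle (may be negative)
    vertical = y2 - y1     # height of the right-angled triangle (may be negative)
    return math.sqrt(horizontal ** 2 + vertical ** 2)


print(distance((0, 0), (3, 4)))
print(distance((3, 4), (0, 0)))
```

Because the differences are squared, the order of the two points does not matter, which covers the cases the proof sketch leaves as an exercise.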
Call any collection of points a graph.
Theorem 2. The graph is a circle with centre $(a, b)$ and radius $r$ if and only if it is defined by the equation
$$(x - a)^2 + (y - b)^2 = r^2.$$
In particular, the unit circle has equation $x^2 + y^2 = 1$.
Proof. By definition, a point belongs to the circle if and only if its distance to is . By Lemma 1, this condition holds if and only if
Squaring on both sides, since , we obtain the equation
as required.
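The circle equation gives a simple membership test for points. The sketch below is hedged: the function name `on_circle` and the tolerance `tol` (needed because floating-point squares are rarely exact) are assumptions introduced for this illustration.

```python
import math


def on_circle(point, centre, radius, tol=1e-9):
    """Check whether point satisfies (x - a)^2 + (y - b)^2 = r^2 up to a small tolerance."""
    (x, y), (a, b) = point, centre
    return math.isclose((x - a) ** 2 + (y - b) ** 2, radius ** 2, abs_tol=tol)


print(on_circle((1, 0), (0, 0), 1))                              # on the unit circle
print(on_circle((math.sqrt(0.5), math.sqrt(0.5)), (0, 0), 1))    # also on the unit circle
print(on_circle((1, 1), (0, 0), 1))                              # too far from the centre
```

Comparing squared quantities avoids taking a square root, mirroring the squaring step in the proof.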
Once we can properly discuss circles, all sorts of interesting mathematics arise. But we will revisit circles later on.
For now, let’s answer the question we opened with: constructing geometrically.
Theorem 3. Fix . Define . Define the line by and the circle to be a circle with centre and radius .
Then .
Proof. By Theorem 2, has equation
To obtain the coordinates of , we need to solve the equations
and require . We will use the substitution method as follows—substitute the equation into the circle equation:
Since , by definition, . Since and , by Lemma 1,
There is a lot more to circles than this one example, but for now, we return to triangles in the next post. In particular, the isosceles triangle will be a central tool in our geometric expedition.
Here’s our question today: what is the sum of angles in a triangle?
Previously, we have seen how linear equations can represent lines. If , then
so that the line has gradient and -intercept .
If , then . In this case, if , then , and the line is vertical with -intercept .
We can make a few more basic observations about our lines below, properly constructed here, which uses many tools from linear algebra.
Firstly, we use the word “angle” to describe the amount of separation between two lines. The simplest case occurs when the two lines are at right angles (i.e. ortho-gonal) to each other. We use the symbol to denote angles that are right, and in this case write :
By convention, we declare one right angle to equal $90^\circ$ (i.e. $90$ degrees), so that adjacent non-overlapping angles on a straight line add up to $180^\circ$.
Definition 1. An angle $\theta$ is said to be
right if $\theta = 90^\circ$,
acute if $0^\circ < \theta < 90^\circ$ (i.e. smaller than a right angle),
obtuse if $90^\circ < \theta < 180^\circ$ (i.e. larger than a right angle).
For example, the angle in the diagram below is acute, while the angle is obtuse.
Eventually, we will consider angles that are reflex (i.e. if $180^\circ < \theta < 360^\circ$).
Remark 1. Usually, the Greek letter ‘theta’ $\theta$ denotes an angle. When there are multiple angles, we use the Greek letters ‘alpha’ $\alpha$, ‘beta’ $\beta$, ‘gamma’ $\gamma$, and so on.
Remark 2. We can formalise the idea of angles using calculus (i.e. the study of rates of change), discussed elsewhere. The choice of the number has a rich human history, and its use is strengthened by its factors .
Consider two lines :
We call two non-identical lines parallel if they don’t intersect and write .
In this case, we will use the decorations , , etc to denote parallel lines.
We call the pair of angles a pair of interior angles if .
We make the key observation that, assuming a flat screen, the two lines are parallel if and only if any pair of interior angles sums to $180^\circ$. This result is known as the parallel postulate in Euclidean geometry.
We can also use our construction to conclude Playfair’s axiom:
Lemma 1 (Playfair’s Axiom). Let be a line and be a point that does not lie on . Then there exists a unique line that passes through and is parallel to .
Going forward, we will take advantage of these basic observations to derive several foundational tools for geometric inference, and resolve the sum-of-angles question that we started with.
Theorem 1. Given two non-parallel lines, the vertically opposite angles are equal to each other.
Proof. Construct the angle below.
Since adjacent non-overlapping angles on a straight line sum to $180^\circ$,
Therefore, since we can cancel the term on both sides (this is called the cancellation law),
Theorem 2. The two lines below are parallel if and only if the alternate angles are equal to each other.
Proof. Construct the angle below.
Since adjacent non-overlapping angles on a straight line sum to $180^\circ$,
By the parallel postulate,
Using the cancellation law in Theorem 1,
Theorem 3. The two lines below are parallel if and only if the corresponding angles are equal to each other.
Proof. Construct the angle below.
Then are vertically opposite. By Theorem 1,
Furthermore, are alternate angles. By Theorem 2,
Remark 2. In most school problems, you would be hand-given the parallel lines , and then, if needed, prove the angle equality result (e.g. ). You can use the following presentation for school assessments (e.g. homework, tests, examinations, etc):
At least, this working got me full credit when I was a secondary school student.
Theorem 4. The angles in a triangle sum to $180^\circ$.
Proof. Consider the triangle below without loss of generality.
We need to prove that .
Use Playfair’s axiom to construct a line parallel to the base of the triangle that passes through the opposite point (i.e. vertex) of the triangle:
Construct the angles and .
Since are non-overlapping adjacent angles on a straight line,
We also observe that and are pairs of alternate angles. Since , by Theorem 2,
Therefore, using the usual rules of addition,
While there are many results that can and should be proven using these basic results, we will delay them to connect their usefulness with other geometric objects. For example:
The fact that a triangle is isosceles if and only if its base angles are equal is a consequence of our discussion on congruent triangles.
Two angles in the same segment in a circle must equal each other, and we will explore this idea when considering circles in more depth.
Rather than explore these results all in one post, we will smatter them throughout our discussion in connection with other parts of geometry.
For now, we will use Theorem 4 to prove one of the most important results in secondary school (and hence higher-level) mathematics—Pythagoras’ theorem.
Previously, we tried to earn some money selling fidget spinners. To do that, we will sell fidget spinners, where is given by the -coordinate of the intersection point of the two graphs below.
Remark 1. In economics terms,
the graph of is called the supply curve,
the graph of is called the demand curve.
In more mathy terms, we need to find a pair of numbers that satisfy both equations:
We call this process solving a system of simultaneous linear equations.
Example 1. Solve the system of simultaneous linear equations:
Solution. If satisfies both equations, then the terms on the left-hand side are equal. Subtracting the second equation from the first:
Therefore, we just need to solve for :
Hence, . What is its matching -value?
Since this -value must satisfy both equations, we can back-substitute into either equation:
Supply curve: .
Demand curve: .
Therefore, our solution is and .
In ordered-pair notation, the solution is .
Whether we calculated the intersection point , or read it off from the graph, we obtained the same answer. This observation is massive—it connects pure symbols with familiar pictures!
This connection is the simplest building block behind linear algebra—which uses calculations to represent “straight-ish” objects like lines and planes (see Theorem 1 below). Its humble beginnings arise from what we just did—solving linear equations simultaneously.
In higher-level mathematics, the elimination method we used is given the fancier title Gaussian elimination.
There is an alternate (and more general) strategy to solve simultaneous equations, called the substitution method. We will explore this method when investigating quadratic curves.
Example 2. Let denote the line given by the equation . Determine the intersection point between and the -axis, called the -intercept of .
Solution. Let denote the -axis. Since is horizontal and passes through the point , it has equation . Therefore we need to solve the system of simultaneous linear equations
Multiplying the second equation by :
Subtracting the second equation from the first:
Back-substituting into the second equation, since there are no terms, we just have . Therefore, has an -intercept .
Example 3. Using the definition of in Example 2, determine the -intercept of .
Solution. Let denote the -axis. Since is vertical and passes through the point , it has equation . Therefore we need to solve the system of simultaneous linear equations
Subtracting the second equation from the first:
Back-substituting into the second equation, since there are no terms, we just have . Therefore, has a -intercept .
Our examples aren’t special; here’s the more general theory:
Theorem 1. If , then there exists a unique pair of real numbers that solves the system of simultaneous linear equations
In fact, , .
Proof Sketch. Multiply the first equation by and the second equation by :
Subtract the second equation from the first:
Assuming , we can obtain a unique value for , then back-substitute to obtain a unique value for .
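The elimination argument in the proof sketch can be turned into a short solver. The coefficient names below are hypothetical, since the theorem's own symbols were lost in extraction: the sketch assumes the system is written as `a*x + b*y = e` and `c*x + d*y = f`, with the uniqueness condition `a*d - b*c != 0`.

```python
from fractions import Fraction


def solve_2x2(a, b, c, d, e, f):
    """Solve a*x + b*y = e and c*x + d*y = f for integer coefficients.

    Eliminating y (multiply the equations by d and b, then subtract) gives
    (a*d - b*c) * x = e*d - b*f, and eliminating x similarly gives
    (a*d - b*c) * y = a*f - c*e.
    """
    det = a * d - b * c
    if det == 0:
        raise ValueError("no unique solution: a*d - b*c is zero")
    x = Fraction(e * d - b * f, det)
    y = Fraction(a * f - c * e, det)
    return x, y


# Sample system (illustrative only): x + y = 3 and x - y = 1.
print(solve_2x2(1, 1, 1, -1, 3, 1))
```

Using `Fraction` keeps the answers exact, so non-whole solutions are reported as fractions rather than rounded decimals.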
Corollary 1. If and are both nonzero, then the line has -intercept and -intercept .
Proof. Use Theorem 1:
Set to obtain the -intercept.
Set to obtain the -intercept.
Theorem 1 gives us a taste of the theory of linear algebra, and our discussion on supply and demand curves illustrates one (over-simplified) real-world use case of said theory. We can summarise our findings with a simple cliche:
X marks the spot!
Next time, we will discuss some simple ideas about lines and angles, which can be described using lines, and be precisely formulated using (you guessed it) linear algebra. 90% of my blog was dedicated to ensuring that what we do at the O-Level really is legitimate, and thankfully, I can confidently say that it is!
You want to earn money by selling fidget spinners, but two questions arise:
How much revenue can you earn by selling the fidget spinners?
How much cost would you incur by obtaining the fidget spinners in the first place?
Your profit is then defined by
Let’s first investigate your potential costs.
Example 1. Suppose you buy fidget spinners at a unit cost of per fidget spinner. Make the following definitions:
represents the number of fidget spinners you obtain.
represents the total cost of buying fidget spinners.
What would be the relationship between and ?
Solution. Using basic counting,
Hence, . For clarity, we suppress the symbol and write .
Example 2. In Example 1, if you can spend a maximum of , what is the maximum number of fidget spinners you can buy?
Solution. Since the maximum cost is capped at , the maximum number of fidget spinners is captured by setting :
We need to determine the number that represents (i.e. the value of ). Recalling that and ,
Therefore, is the maximum possible number of fidget spinners.
Remark 1. Suppose you had instead. Then the equation
produces the value .
We may be tempted to reject the answer , since of a fidget spinner doesn’t arise in real life.
However, if refers to the number of fidget spinners measured in groups of , then the answer corresponds to fidget spinners—a perfectly reasonable solution!
Therefore, in general, we can accept non-whole number values of .
Example 3. If you need to pay a fixed delivery fee of for your fidget spinners, what would your total cost, in terms of , now be?
Solution. By including the delivery fee,
For example, if , then . Here, we write
using the usual order of operations.
We can picture this relationship using a graph.
The horizontal right arrow, called the -axis, represents the different possible -values that we can substitute into the expression.
The vertical up arrow, called the -axis, represents the different possible -values that we can obtain.
If we allow to take non-integer values, we recover a richer picture. For example, if , then . Repeating this process, we get a striking picture:
We recover a straight line!
Example 4. Calculate the change in cost incurred by obtaining one additional fidget spinner. This change is called the gradient of the graph.
Solution. Let be any fixed number of fidget spinners bought. The cost of buying this number of fidget spinners is
By incrementing the number of fidget spinners, we want to buy fidget spinners and incur a new cost given by
Hence, the required change in cost is
Example 5. Using , calculate the fixed cost that you would incur.
Solution. Setting , the fixed cost is given by
Definition 1. A graph is called a (non-vertical) straight line with:
gradient,
-intercept,
if it is continuously drawn using the equation
The value describes the steepness of the line:
if , then the line is upward-sloping,
if , then the line is horizontal,
if , then the line is downward-sloping.
Furthermore,
the more positive the gradient, the steeper the increase,
the more negative the gradient, the steeper the decrease.
The value fixes the position of the line:
if is positive, then the -intercept of the line is positive.
if is negative, then the -intercept of the line is negative.
if is zero, then the line passes through the origin $(0, 0)$.
If is positive (resp. zero or negative), then the -intercept of the line is positive (resp. zero or negative).
In short,
describes the steepness of the line,
fixes the position of the line.
Definition 2. Define a point as a position in space described by a pair of real numbers, denoted , called the coordinates of the point.
Lemma 1. If two points and with lie on a non-vertical straight line with gradient , then
Proof Sketch. Let the non-vertical straight line have equation . Substituting the two pairs of values gives us the equations
We leave it as an exercise to check that
Lemma 2. The equation of a non-vertical straight line passing through the point with gradient is given by
Proof Sketch. Using Lemma 1, any point that lies on the straight line must satisfy the equation
Hence, .
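Lemma 1 and Lemma 2 combine into a recipe for finding the line through two points: compute the gradient, then fix the position using one of the points. The sketch below assumes the line is written as `y = m*x + c`; the function names and the sample points are illustrative assumptions.

```python
from fractions import Fraction


def gradient(p, q):
    """Gradient of the non-vertical line through p = (x1, y1) and q = (x2, y2)."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("vertical line: gradient is undefined")
    return Fraction(y2 - y1, x2 - x1)


def line_through(point, m):
    """Return (m, c) such that y = m*x + c passes through point with gradient m."""
    x1, y1 = point
    # Rearranging y - y1 = m*(x - x1) gives y = m*x + (y1 - m*x1).
    return m, y1 - m * x1


m = gradient((1, 2), (3, 8))
print(line_through((1, 2), m))
```

Either of the two points can be passed to `line_through`; both give the same intercept because both lie on the line.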
Example 6. You discover that:
people are willing to pay per fidget spinner,
people are willing to pay per fidget spinner.
Let represent the price that people are willing to pay (of course, one fidget spinner per person).
Stating your assumptions, determine a reasonable relationship between and .
Solution. We can picture the two situations in the graph below.
By assuming a straight-line (i.e. linear) graph, we will first apply Lemma 1 using the points and to calculate its gradient:
Since the straight line passes through , by Lemma 2, it has the equation
Hence, .
Definition 3. A graph is a vertical line with -intercept if it is continuously drawn using the equation .
Example 7. The line corresponds to the -axis and the line corresponds to the -axis.
Definition 4. A graph is called a straight line if it can be continuously drawn using the linear equation, where are not both zero.
Theorem 1. A graph is a straight line if and only if it is either a vertical line or a non-vertical straight line.
Proof Sketch. We notice either or :
Using Definition 3, holds if and only if the graph is a vertical line.
Using Definition 1, holds if and only if the graph is a non-vertical straight line.
The complete proof using the definitions is left as a good exercise in algebraic manipulation.
Remark 2. This blog post is inspired by ideas in business and economics:
Example 3 pictures the supply curve: how much you are willing to pay in order to sell fidget spinners.
Example 6 pictures the demand curve: how much people are willing to pay in order to buy fidget spinners.
Let’s now draw Example 3 and Example 6 on the same graph.
What is the maximum number of fidget spinners that would be worth selling?
Obviously, when the two graphs intersect!
By using our eyeballs (i.e. inspection), the point of intersection has coordinates . However, we will explore this idea more systematically next time without inspection by solving simultaneous equations.
The number line helps us picture positive numbers, negative numbers, and even any point in between: we call these points real numbers, and represent them using the symbol $\mathbb{R}$:
All familiar numbers belong on this line:
Natural numbers:
Integers:
Fractions:
Negative fractions:
A number is called a rational number if it is either $0$, a fraction, or a negative fraction. Other numbers like
or
are also real numbers that are not rational numbers:
calculates the length of the circumference of a circle with unit radius.
calculates the length of the diagonal of a unit square.
Using this point of view (pun intended) we can accept these rules for adding real numbers:
When positive, real numbers can also be used to describe areas.
Each positive real number length matches a rectangle with base and height . If , the rectangle becomes a line, which has zero area, giving the trivial equations
We can therefore use different rectangles with the same area to recover many correct equations.
In particular, we obtain the following area properties for positive real numbers:
Commutativity:
Multiplicative identity:
Furthermore, we can use different rectangles with the same area to construct different real numbers. For example, we can use a square with area to construct the real number .
Likewise, we can use a square with area to construct the real number approximately equal to , which we represent using the symbol .
Remark 1. We will revisit this example when discussing geometry.
For this reason, we call this number the “square root” of : it is the root number, that when used to construct a square, produces a square with area . Replacing with any positive number ,
Write to mean the area of the rectangle with height and base :
Since the two smaller rectangles combine into a larger rectangle with height and base , we recover the “rainbow method” to expand expressions (i.e. distributivity):
where the second identity follows from the first:
Example 1. Evaluate . In particular, evaluate .
Solution. By using the “rainbow method”,
Setting and ,
Remark 2. Many students will sadly make the rookie mistake .
Example 2. Explain why it is not true in general that .
Solution. Consider the case and . Then
We call this violation of a proposed general pattern a counterexample.
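Hunting for a counterexample is easy to automate. The sketch below assumes, for illustration only, that the proposed pattern is the "rookie mistake" $(a + b)^2 = a^2 + b^2$ (the formula in the original post was lost in extraction), and contrasts it with the expansion obtained from the "rainbow method". All function names are hypothetical.

```python
def find_counterexample(claim, candidates):
    """Return the first pair (a, b) for which claim(a, b) is False, or None if all pass."""
    for a, b in candidates:
        if not claim(a, b):
            return a, b
    return None


def rookie_claim(a, b):
    """Assumed false pattern: (a + b)^2 = a^2 + b^2."""
    return (a + b) ** 2 == a ** 2 + b ** 2


def rainbow_claim(a, b):
    """Expansion from distributivity: (a + b)^2 = a^2 + 2ab + b^2."""
    return (a + b) ** 2 == a ** 2 + 2 * a * b + b ** 2


pairs = [(a, b) for a in range(4) for b in range(4)]
print(find_counterexample(rookie_claim, pairs))
print(find_counterexample(rainbow_claim, pairs))
```

A single failing pair is enough to reject a proposed general pattern, while passing every tested pair is only evidence, not a proof.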
We can do better. Consider the rectangle below with base , height , and area .
Since there is only one rectangle with base and area , there is exactly one positive number such that
We define , and inverse property:
Hence, we define real number division by
Remark 3. If the base is $0$, then no matter what height we have, the area of the corresponding rectangle will be $0$. Hence, the quantity $\frac{1}{0}$ is undefined. In other words: division by $0$ is illegal.
Example 3. Given , show that .
Proof. Using real number division,
If we want to multiply many times, we can bump things up by one dimension. Consider the volume of a cuboid with base , width , and height .
Since both cuboids have the same volume, we can multiply in either order (i.e. associativity):
Hence, we can define three-way multiplication by
without confusion. For products of even more numbers, though we would find it difficult to imagine objects beyond volume, we don’t need to worry:
Example 4. Given , show that . In particular,
Proof. Since we can multiply in any order of “brackets”,
Corollary 1. Given ,
Proof. Using Example 4,
To add two fractions, use Example 3:
Remark 4. By the principle in Remark 3, you might think that the number is also undefined. After all, how can a rectangle have negative base length? It cannot; we are merely using areas as an analogy to describe real number calculations. Nevertheless, the quantity is still well defined; indeed, using Corollary 1,
and is clearly a well-defined positive fraction, so that is a perfectly well-defined rational number. However, trying to define will create problems:
Numbers mean nothing unless we can perform meaningful calculations with them. To add two fractions, we first find their (lowest) common denominator, then add them normally.
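The common-denominator recipe can be sketched in Python as follows. The function name `add_fractions` and the sample fractions $\frac{1}{6}$ and $\frac{1}{4}$ are illustrative assumptions, not taken from the examples in this post, and the sketch assumes positive denominators.

```python
from math import gcd

def add_fractions(a, b, c, d):
    """Add a/b + c/d via the lowest common denominator.

    Assumes b and d are positive whole numbers.
    Returns the simplified result as (numerator, denominator).
    """
    if b == 0 or d == 0:
        raise ZeroDivisionError("denominators must be non-zero")
    common = b * d // gcd(b, d)                  # lowest common denominator
    numerator = a * (common // b) + c * (common // d)
    shrink = gcd(numerator, common)              # cancel common factors
    return numerator // shrink, common // shrink

print(add_fractions(1, 6, 1, 4))
```

Here the lowest common denominator of $6$ and $4$ is $12$, so the sum is computed as $\frac{2}{12} + \frac{3}{12}$.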
Example 5. Evaluate without a calculator.
Solution. By using the common denominator and Corollary 1,
Example 6. Evaluate without a calculator.
Solution. Following Example 5 but instead using and Corollary 1,
Example 7. Explain why it is not always true that .
Solution. Consider the case and . Then
Oddly enough, we have now derived the main ideas for calculating with real numbers, so we can extend them to basic algebraic manipulations in the next post.
What remains are some technical remarks about our discussions.
Remark 5. It turns out that for any non-zero real number ,
is never true (i.e. always false)! Similarly, if and , then the statement
is always false.
Remark 6. We have built the basics of real number calculations, and since we will use the letters and soon, it gets very messy to write . To simplify our work, we write , so that the “rainbow method” looks like
In particular, Example 1 looks like the following:
Sometimes, we use a dot to clarify that we are doing multiplication:
Hence, we have the identities
Without brackets, we will always multiply before adding:
leading to the usual PEMDAS rule:
parentheses (or brackets),
exponents (e.g. ),
multiplication / division,
and finally addition / subtraction.
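Python happens to follow the same order of operations, so the rule can be checked directly. The numbers below are illustrative choices, and `**` is Python's exponent operator.

```python
# Multiplication before addition: 2 + 3 * 4 is read as 2 + 12.
no_brackets = 2 + 3 * 4

# Brackets first: (2 + 3) * 4 is read as 5 * 4.
with_brackets = (2 + 3) * 4

# Exponents before multiplication: 2 * 3 ** 2 is read as 2 * 9.
exponent_first = 2 * 3 ** 2

print(no_brackets, with_brackets, exponent_first)
```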
In fact, division is defined using multiplication:
Likewise, subtraction is defined using addition:
Remark 7. Properly constructing the real numbers, beyond the rectangle-area picture, takes a lot more effort. We first need to define the rational numbers, then the real numbers, and finally, for completeness, the complex numbers.
Let’s say that you started investing . You bought an asset for , and the price changes.
If the price increases to , then you have gained.
If, however, the price decreases to , then you have lost.
A positive number, therefore, represents our idea of an increase in a certain amount. A negative number, in turn, represents our idea of a decrease in a certain amount, i.e. a reverse of an increase.
A useful way to visually think of these ideas is by considering the number line below.
Here, we fix the “no-change” point $0$:
We represent a positive change using a rightward arrow whose length describes the extent of said change.
Likewise, we represent a negative change by flipping the rightward arrow into a leftward arrow whose length describes the extent of said change. We denote points to the left of $0$ with the negative sign prefix $-$.
It is clear that , since , and so we can “take away” a smaller number from a larger number. This idea, however, fails when trying to make sense of .
However, the number line can help us develop the ideas we need.
Example 1. Use the number line above to evaluate the expression .
Solution. We re-draw the number line below.
We can read the expression as follows:
start at , then
move units rightward , and finally,
move leftward .
The resulting position is . Therefore,
Since there is nothing special about the numbers and in this example, we obtain the following result:
Lemma 1. Given whole numbers , we can define subtraction by
Proof. Use the number line representation for negative numbers.
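The number-line reading of subtraction can be sketched in Python. The helper names `walk` and `subtract` and the sample numbers are hypothetical, chosen only to illustrate the "start at zero, move right, then move left" procedure.

```python
def walk(start, moves):
    """Follow a list of (direction, units) moves along the number line."""
    position = start
    for direction, units in moves:
        if direction == "right":
            position += units      # rightward arrow: an increase
        elif direction == "left":
            position -= units      # leftward arrow: a decrease
        else:
            raise ValueError(f"unknown direction: {direction}")
    return position

def subtract(a, b):
    """Read a - b as: start at 0, move a units right, then b units left."""
    return walk(0, [("right", a), ("left", b)])

print(subtract(5, 3))   # lands to the right of 0
print(subtract(3, 5))   # lands to the left of 0
```

The second call shows why the number line is needed: taking a larger number away from a smaller one lands to the left of zero.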
Definition 1. A negative number is a number of the form , where is a positive whole number.
The positive whole numbers, the negative numbers, and zero are together called the integers.
We define addition of integers according to the following rules, where are whole numbers and :
Furthermore, we define subtraction of integers by .
For a more complete construction, see this post that uses higher-level mathematics.
Using the number line above, we see that flipping a unit rightward direction gives the same result as directly moving units leftward :
If we flip a leftward direction , say , then the negative sign flips the leftward direction into a rightward direction :
Definition 2. We define multiplication by to mean
whenever is an integer.
Example 2. Evaluate .
Solution. By Definition 2, and our prior discussion,
Consolidating, we have the following sign conventions.
Lemma 2. The following rules for multiplying negative signs hold:
Example 3. Evaluate the expressions , , and .
Solution. Using Definition 2,
Similarly,
Finally, using Example 2,
To shorten our working, we use the double negative rule:
Since there is nothing special about the numbers and , we obtain the following multiplication rules involving negative signs:
Theorem 1. Given integers , the following multiplication rules hold:
Proof. Use Definition 2 and Lemma 2 and adapt the solution in Example 3.
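As a sanity check rather than a proof, the sketch below tests the usual sign rules, assumed here to be $(-a) \times b = -(a \times b)$, $a \times (-b) = -(a \times b)$, and $(-a) \times (-b) = a \times b$, over a small range of integers. The function name `sign_rules_hold` is hypothetical.

```python
def sign_rules_hold(a, b):
    """Check the three assumed sign rules for one pair of integers."""
    return (
        (-a) * b == -(a * b)         # negative times positive
        and a * (-b) == -(a * b)     # positive times negative
        and (-a) * (-b) == a * b     # double negative
    )

# Test every pair of integers from -5 to 5, including zero.
all_hold = all(
    sign_rules_hold(a, b)
    for a in range(-5, 6)
    for b in range(-5, 6)
)
print(all_hold)
```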
At last, expressions of the form make sense, whether are positive numbers, negative numbers, or even just zero.
What we have done works for positive and negative fractions as well. We call them rational numbers, since fractions are ratios of numbers. Filling in the holes between the rational numbers gives us the real numbers, which we picture using the number line. More on that in the next post!