Section 1.2 Logic and set theory
Unfortunately mathematics in school is often taught as a collection of rules and skills that allow students to solve problems, such as equations, graphing problems, simplification of algebraic expressions, or complex calculations. Discussions of the meaning of the concepts involved and the reason why the rules and techniques work are often neglected. This course will be based on definitions, that precisely introduce and describe a new concept, statements (also called theorems, or lemmas, or propositions) based on these definitions, and proofs of the statements. We will now discuss a little each of these notions.
Subsection 1.2.1 Statements
A statement is any sentence that must be either true or false. So “5 is a natural number” is a true statement and “New York is the capital of the United States” is a false statement. But “Who are you?” is not a statement because it makes no sense to ask if it is true or false, and “onions taste good” is not a statement because whether it is true or false depends on who we ask. We will often use symbols with special font, such as \(\cal P \) or \(\cal Q \text{,}\) followed by a colon, to denote statements. So we could write “\(\cal P\) : 5 is a natural number” or “\(\cal Q\) : New York is the capital of the United States”. If we write \({\cal P}(x)\text{,}\) it means that the statement \(\cal P\) depends on whatever \(x\) may be. So for example “\({\cal P}(x)\text{:}\) \(x\) is a natural number less than 5” is a statement that is true if \(x\) is 1,2,3 or 4, but it is false otherwise. In this situation we call \(x\) a variable. Of course a variable need not represent a number. We could consider the statement “\({\cal P}(x)\text{:}\) \(x\) is a student in this class”, and then \(x\) represents a person.
Subsection 1.2.2 Logic
We will now review some logic. If \(\cal P\) is a statement, we denote by \(\sim \cal P\) its negation. So \(\sim \cal P\) is true when \(\cal P\) is false, and \(\sim \cal P\) is false when \(\cal P\) is true. For example, if \({\cal P} (x)\text{:}\) “\(x\) is an integer greater than 1” then \(\sim {\cal P} (x)\text{:}\) “\(x\) is an integer less than or equal to 1”.
If \(\cal P\) and \(\cal Q\) are statements, we can construct another statement, called an implication denoted by \({\cal P} \Rightarrow {\cal Q}\) that is false only when \(\cal P\) is true and \(\cal Q\) is false. In all other cases it is true. We read this as "\(\cal P\) implies \(\cal Q\)", or "If \(\cal P\text{,}\) then \(\cal Q\text{.}\)" The logic is that we are thinking that if \(\cal P\) is true, then \(\cal Q\) must also be true, or that \(\cal Q\) follows from \(\cal P\text{.}\)
To understand why we say that the statement \({\cal P} \Rightarrow {\cal Q}\) is always true except in the case that \(\cal P\) is true and \(\cal Q\) is false, think of it as a promise. So suppose \(\cal P\) is the statement “the New Orleans Saints win the game” and \(\cal Q\) is the statement “Dustin will take Beverly out to dinner”. Then \({\cal P} \Rightarrow {\cal Q}\) is the promise that if the Saints win the game, Dustin will take Beverly out to dinner. How can this promise be broken? Only if the Saints win, and yet Dustin does not take Beverly out to dinner. In all other cases, the promise will not be broken. So if the Saints lose, the statement \({\cal P} \Rightarrow {\cal Q}\) remains true for sure, whether or not Dustin takes Beverly out to dinner, because there is no broken promise.
This logical property of implications is sometimes referred to by saying that “a false statement implies anything”. To show that it can play a role in understanding what a definition really says, we consider an example from Linear Algebra. One of the conditions for a matrix to be in Row Echelon Form is the following: “If the matrix has any zero rows, they must all be at the bottom.” This is an implication \({\cal P}\Rightarrow {\cal Q}\text{,}\) with “\({\cal P}\text{:}\) The matrix has some zero rows” and “\({\cal Q}\text{:}\) All zero rows are at the bottom”. It is easy to understand that the matrix
\begin{equation*}
\begin{pmatrix} 1 \amp 2 \amp 3 \\ 0 \amp 1 \amp 5 \\ 0 \amp 0 \amp 0 \end{pmatrix}
\end{equation*}
satisfies this condition, and the matrix
\begin{equation*}
\begin{pmatrix} 1 \amp 2 \amp 3 \\ 0 \amp 0 \amp 0 \\ 0 \amp 1 \amp 5 \end{pmatrix}
\end{equation*}
does not. But what about the matrix
\begin{equation*}
\begin{pmatrix} 1 \amp 2 \amp 3 \\ 0 \amp 1 \amp 5 \\ 0 \amp 0 \amp 1 \end{pmatrix} ?
\end{equation*}
It is not unusual for students to say that this matrix is not in Row Echelon Form, because there is no zero row at the bottom . But in fact the implication “\({\cal P}\Rightarrow {\cal Q}\)” is true, because \({\cal P}\) is false (there are no zero rows). So the condition is satisfied.
\({\cal P} \Longleftrightarrow {\cal Q}\) means that \({\cal P} \Rightarrow {\cal Q}\) and \({\cal Q}\Rightarrow {\cal P}\text{.}\) In this case we say that \(\cal P\) and \(\cal Q\) are equivalent. Here is an example from elementary Algebra. Suppose that \(x\) is a rational number. Then
\begin{equation*}
3x=4 \Longleftrightarrow x=\frac{4}{3}.
\end{equation*}
In fact, all the steps involved in solving Algebra or Calculus problems are mostly sequences of equivalent statements of this type.
The negation of \({\cal P} \Rightarrow {\cal Q}\) is often useful when writing proofs. It must be a statement that is true precisely when \({\cal P} \Rightarrow {\cal Q}\) is false, and we know that this happens only when \({\cal P}\) is true and \({\cal Q}\) is false. So we see that
\begin{equation*}
\sim({\cal P} \Rightarrow {\cal Q}) \Longleftrightarrow ({\cal P} \mbox{ and } \sim {\cal Q}).
\end{equation*}
Another useful logical equivalence is :
\begin{equation*}
({\cal P} \Rightarrow {\cal Q }) \Longleftrightarrow (\sim {\cal Q} \Rightarrow \sim {\cal P}).
\end{equation*}
The implication \(\sim {\cal Q} \Rightarrow \sim {\cal P}\) is called the contrapositive of the implication \({\cal P} \Rightarrow {\cal Q }\text{.}\) To understand this using the same example as before, saying “If the Saints win the game, then Dustin takes Beverly to dinner” is the same as saying “If Dustin doesn’t take Beverly to dinner, then the Saints have lost the game”.
Subsection 1.2.3 Quantifiers
When a statement \({\cal P}(x)\) depends on some variable, there are two especially important situations that we need to consider:
The two symbols (called quantifiers) \(\forall\) (read as: “for all”) and \(\exists\) (read as “there exists”) are useful to write such statements. So for example, suppose \(x\) represents people and \({\cal P}(x)\text{:}\) “\(x\) likes chocolate”. Then the satement “Everybody likes chocolate” can be written as \((\forall x){\cal P}(x)\text{,}\) while the statement “There are some people who don’t like chocolate” is \((\exists x)\sim {\cal P}(x)\text{.}\)
Note that the statement “There are some people who don’t like chocolate” is the precise negation of the statement “Everybody likes chocolate”. This is reflected by the fact that \(\forall \) turns to \(\exists \) and \({\cal P}\) turns to \(\sim {\cal P}\text{,}\) and provides a useful rule of thumb to write the negation of complex logical statements: change all \(\forall \) to \(\exists \) and all \(\exists \) to \(\forall \text{,}\) and change every statement to its negation.
Example 1.2.1.
Suppose that we work with rational numbers, so all the variables represent rational numbers. Consider the statement “Every number has a square root.” We can write the statement in symbols as
\begin{equation*}
(\forall x)(\exists y)(y^2=x).
\end{equation*}
According to our rule, the negation is
\begin{equation*}
(\exists x)(\forall y)(y^2\neq x).
\end{equation*}
In fact, the last statement says: “There is some rational number \(x\) with the property that whatever rational number \(y\) we use, the square of \(y\) will not be \(x\text{.}\)” This is the same thing as saying that there is some \(x\) that does not have a square root.
Subsection 1.2.4 Sets
We will review some of the basic notions of set theory. A more complete treatment of the theory can be found in many freely avaialble online sources such as
this textbook on Discrete Mathematics 1 .
As we already mentioned, we need to be precise when making definitions, and we should only use concepts that have already been previously defined. But we need to start somewhere. There are notions that must be treated as understood, without appealing to any other previously defined concepts. Basic examples of these notions are those of
set 2 and
membership. A set is a collection of elements. An element is a member of a set (or it belongs to it, or is contained in it) if it is one of the elements that make up the set. These notions are primitive, in the sense that they cannot be explained or defined in terms of other notions. We need to start from here. We will use curly braces
\(\{ \ldots \}\) to represent what a set contains, as we have already done in the notation for
\(\mathbb N\text{,}\) \(\mathbb Z\) and
\(\mathbb Q\) in the previous section.
So \(\mathbb N\text{,}\) \(\mathbb Z\) and \(\mathbb Q\) are basic examples of sets. If \(A\) is a set, we will use the notation \(x\in A\) to mean that \(x\) is an element of \(A\text{,}\) or that \(x\) belongs to \(A\text{,}\) and \(x\not \in A\) to mean that \(x\) is not an element of \(A\text{.}\) So for example \(3\in {\mathbb N}\) and \(\frac{1}{2}\not \in {\mathbb Z}\text{.}\)
But how do we make other sets? Here logic, and the primitive notion of membership, come into play. A common and useful way to describe sets is the set builder notation. Suppose \({\cal P}(x)\) is a statement depending on some variable \(x\text{.}\) We can then consider the set of all possible \(x\) that will make \({\cal P}(x)\) true, and we denote it by \(\{x| {\cal P}(x)\}\text{,}\) or \(\{x: {\cal P}(x)\}\text{.}\) So if
\begin{equation*}
{\cal P}(x): x \mbox{ is a natural number less than 5},
\end{equation*}
then
\begin{equation}
\{x| {\cal P}(x)\}=\{1,2,3,4\}.\tag{1.2.1}
\end{equation}
The vertical bar
\(|\) (or the colon) in this notation is read “such that”. So we would read the expression
(1.2.1) as “The set of all
\(x\) such that
\({\cal P}(x)\) is true.” Sometimes we may also include some information about
\(x\) before the vertical bar (or the colon), and then we need less information in the statement
\({\cal P}(x)\text{.}\) So we could write the above example as
\begin{equation*}
\{x\in {\mathbb N}| x\lt 5\}.
\end{equation*}
It is still the same set, \(\{1,2,3,4\}\text{.}\)
Subsection 1.2.5 Set operations
Once we feel comfortable with the notion of a set and what it means for something to belong to it, we can define the basic set operations such as union and intersection. This gives us the opportunity to illustrate our first definition.
Definition 1.2.2.
Let \(A\) and \(B\) be sets. The union of \(A\) and \(B\) is the set consisting of all elements that are either in \(A\) or in \(B\text{,}\) and it is denoted by \(A\cup B\text{.}\) The intersection of \(A\) and \(B\) is the set consisting of all elements that are both in \(A\) and in \(B\text{,}\) and it is denoted by \(A\cap B\text{.}\)
Note that the word “or” in math is used in a different way than it is used in ordinary language. In math, \(\cal P\) or \(\cal Q\) means either \(\cal P\text{,}\) or \(\cal Q\text{,}\) or both. So if \(x\) is in \(A\cup B\text{,}\) then \(x\) could be in \(A\text{,}\) or in \(B\text{,}\) or in both \(A\) and \(B\text{.}\) Compare with the ordinay use of "or" in the sentence “Would you like tea or coffee?”. In this sentence it is evident that the choice is understood to be between one or the other, not both.
Let’s pause a moment to examine what the above definition of union and intersection does. First of all, it uses plain English. Then, it does two different things: it explains what the new concepts are, and it assigns a name to them. The assigned names are union and intersection. Then, it also introduces some notation in order to efficiently use the new concepts in writing. The notation introduced consists of the two symbols \(\cup\) and \(\cap\) that represent union and intersection, respectively. In summary, a definition often does three things: it explains the meaning of a new concept; it reserves a word or phrase to be used for it; and (sometimes) it also reserves a symbol to be used for it.
We stress the fact that plain English was used. This could have been done in a different way. We could have written the definition as follows:
Definition 1.2.3.
The union of \(A\) and \(B\) is the set \(A\cup B\) defined by
\begin{equation*}
A\cup B =\{x|x\in A \mbox{ or } x\in B\},
\end{equation*}
and the intersection of \(A\) and \(B\) is the set \(A\cap B\) defined by
\begin{equation*}
A\cap B =\{x|x\in A \mbox{ and } x\in B\}.
\end{equation*}
This second way to define union and intersection is perfectly good and correct. But, in this course we will prefer to use plain English language whenever possible, in definitions, in statements of theorems, and in proofs. This does not mean that we will not use symbols. On the contrary, symbols are very often essential to carry out complex proofs. But they should not be a substitute for words when the concept is more clearly explained or understood without them. Most people who have never heard of union and intersection will more easily understand their definition in the one with the words rather than in the one with the symbols. It is often useful, as an exercise, to express in words concepts presented using only symbols, and likewise to write in compact symbolic form statements that were made using only words.
We will now give a few more especially important definitions.
Definition 1.2.4.
If \(A\) and \(B\) are sets, we say that \(A\) is a subset of \(B\) if every element of \(A\) is also an element of \(B\text{,}\) and in this case we write \(A\subset B\text{.}\)
Note again how this definition provides three pieces of information: the explanation of a new concept, its name in English (subset) and the symbol used
\(\subset\text{.}\) We can make the same definition using only symbols:
\begin{equation*}
(A\subset B) \Longleftrightarrow (x\in A \Rightarrow x\in B).
\end{equation*}
Using the \(\subset\) symbol, we can write the statement that every natural number is an integer, and every integer is a rational number (made at the beginning of section 1.2) as:
\begin{equation*}
{\mathbb N}\subset {\mathbb Z} \subset {\mathbb Q}.
\end{equation*}
Subsection 1.2.6 Cardinality
In our discussion of sets so far we have not addressed how large a set can be. Of course we already know that there are infinitely many numbers, so the sets \(\mathbb N, Z, Q\) are all infinite. But sets such as \(\{n\in {\mathbb N}: n\leq k\}\text{,}\) where \(k\) is a positive integer, are clearly not infinite, in fact this set has exactly \(k\) elements. A set that is not infinite is said to be finite. For such a set, we define its cardinality to be the number of its elements, and denote the cardinality of the set \(A\) by \(|A|\text{.}\) So for example if \(A=\{1,3,5,7,9\}\) then \(|A|=5\text{,}\) and \(|\{n\in {\mathbb N}: n\leq k\}|=k\text{.}\)
Subsection 1.2.7 De Morgan’s laws
Definition 1.2.5.
If \(A\) and \(B\) are sets, the set difference \(A\setminus B\) is defined to be the set of all elements in \(A\) that are not in \(B\text{.}\) In symbols,
\begin{equation*}
A\setminus B = \{x\in A| x\not \in B\}.
\end{equation*}
For example, \(\mathbb Z \setminus \mathbb N\) is the set of all negative integers, together with 0, and \(\mathbb Q \setminus \mathbb Z\) is the set of rational numbers that, when written in lowest terms, have a denominator larger than 1.
Sometimes we consider all our sets as being subsets of some fixed set, that we call the universal set. So for example, if we work with positive integers only, the universal set is \(\mathbb N\text{.}\)
Definition 1.2.6. Complement of a set.
Suppose we use \(X\) as universal set. If \(A\subset X\text{,}\) the complement of \(A\) is the \(A^c=X\setminus A\text{.}\)We can now state some basic relationships between the complement and the set operations of union and intersection.
Theorem 1.2.7. De Morgan’s laws.
Suppose \(A,B\) are subsets of some universal set \(X\text{.}\) Then
\begin{equation*}
(A\cap B)^c = A^c\cup B^c
\end{equation*}
and
\begin{equation*}
(A\cup B)^c = A^c\cap B^c.
\end{equation*}
Subsection 1.2.8 Cartesian product
Definition 1.2.8.
Let \(A\) and \(B\) be sets. The Cartesian product of \(A\) and \(B\) is the set of al pairs \((a,b)\text{,}\) where \(a\in A\) and \(b\in B\text{,}\) and it is denoted by \(A\times B\text{.}\) So in symbols,
\begin{equation*}
A\times B=\{(a,b): a\in A, b\in B\}.
\end{equation*}
In particular, the Cartesian product of a set \(A\) with itself is the set \(A\times A\) of all pairs \((a,b)\) where \(a,b \in A\text{.}\) Two important notions are associated with \(A\times A\text{.}\)
Definition 1.2.9.
A Binary relation on a set \(A\) is any subset \(R\) of \(A\times A\text{.}\)Assuming we already know what \(a\lt b\) means, an important example of a binary relation is \(R=\{(a,b): a\lt b\}\text{.}\) In fact, in the next chapter, we will use the notion of binary relation to define what \(a\lt b\) means.
Definition 1.2.10.
A Binary operation on a set \(A\) is any function with domain \(A\times A\) and co-domain \(A\text{.}\)The usual operations of addition and multiplication are basic examples of binary operations. Instead of using the notation \(f(x,y)\) we use \(x+y\text{,}\) or \(xy\text{,}\) but that is the only difference: both addition and multiplication are just functions with domain \(A\times A\) and co-domain \(A\text{,}\) where \(A\) can be \(\mathbb N\text{,}\) or \(\mathbb Z\text{,}\) or \(\mathbb Q\text{.}\)
discrete.openmathbooks.org/dmoi3/sec_intro-sets.html
Actually, in a more advanced treatment, the primitive, undefined notion is called a
class and a set is defined to be a class with some desirable properties. See
this discussion if you are interestd in the difference between classes and sets. But we will consider a set a primitive notion.