
Lecture 6: Mathematical Structures and Functions - CSCI 381 Goldberg

Cost of Exhaustive Searching (Brute-Force Method)


Here, the Time Complexity equals the size of the search space.

Initially, mathematical structures are stored in un/ordered sets.

Structure 1: Subsets of a Set $S$ - $2^n$

How many subsets are there of a given set?

Suppose a set $S = \{E_0, \ldots, E_{n-1}\}$, with $E$ for elements; in C/C++-based languages the first index is 0.

$n = |S| \quad \text{(cardinality)}$

A method to encode such a set with its subsets is by using a one-dimensional array/vector:

| $E_0$ | $E_1$ | $\cdots$ | $E_{n-1}$ |
|-------|-------|----------|-----------|
| 1     | 1     | 0 $\cdots$ 0 | 1    |

Subset $S_i = \{E_0, E_1, E_{n-1}\}$

Any subset $S_i$ (of a set of size $n$) can simply be encoded by the boolean values stored in the above vector.

For example, subset $S_i = \{E_0, E_1, E_{n-1}\}$:

$110001_2 = 49 \qquad 000\ldots000_2 = 0 \qquad 111\ldots111_2 = 2^n - 1$

The subscript 2 after the digits denotes base 2.

So counting the number of subsets is equivalent to determining how many bitstrings (arrays of only 0/1 values) of size $n$ there are. As mentioned many times before, a finite string of length $n$ can be encoded by a natural number ranging from $0$ (when the vector contains all 0's) to $2^n - 1$ (when the vector contains all 1's). So a simple for loop enumerating the natural numbers from $0$ to $2^n - 1$ can decode each such number into its binary representation, which automatically represents a unique subset of an initial set $S$ of size $n$.

In summary, there are $2^n$ such subsets, which is also $|PS(S)|$ where $PS$ = powerset.
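The decoding loop described above can be sketched as follows (a minimal Python illustration with a hypothetical 3-element set):

```python
# Enumerate all subsets of a small set S by decoding each natural number
# 0 .. 2^n - 1 as an n-bit inclusion vector (bit i set => E_i is in the subset).
S = ["E0", "E1", "E2"]          # n = 3, so we expect 2^3 = 8 subsets
n = len(S)

subsets = []
for code in range(2 ** n):      # 0 (empty set) .. 2^n - 1 (full set)
    subset = [S[i] for i in range(n) if (code >> i) & 1]
    subsets.append(subset)

print(len(subsets))             # 8 subsets, i.e. |PS(S)| = 2^n
```

The loop visits every bitstring exactly once, so the subset count falls out for free.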

Structure 2: Relations(hips) from a Set $S_1$ to a Set $S_2$ - $2^{mn}$


Conventionally, $m = |S_1|$ and $n = |S_2|$.

S1 S2
A1 B1
A2 --> B2
Ai .
. .
Am Bn

A graph is a “graph”ical representation of a relation. Standard (default) relations have two sets, whereas standard (default) graphs are from one set to itself. But this is only a default convention - relations can be over one set and graphs can be over two sets. A graph from set $A$ to a set $B$ is called bipartite.

Note: For diagramming purposes, the two sets appear to be the same length (size), but mathematically they can of course be of any sizes. This is indicated by $|S_1| = m$, $|S_2| = n$, with most probably $m \neq n$.

The Cartesian Product is ONE such relationship/mapping but contains the maximum number of connections FROM/TO:

$S_1 \times S_2 = \{\langle A_1, B_1 \rangle, \ldots, \langle A_i, B_j \rangle, \ldots, \langle A_m, B_n \rangle\}$

$|S_1 \times S_2| = |S_1| \cdot |S_2| = mn$

Now, an arbitrary relation $R$ from $S_1$ to $S_2$ will be a subset of $S_1 \times S_2$. $S_1 \times S_2$ has ALL possible connections from $S_1$ to $S_2$; a random subset has SOME of the possible connections. SOME is a subset of ALL. Therefore:

$R \subseteq S_1 \times S_2$

Since $R$ is one subset of $S_1 \times S_2$, then ALL relations are ALL subsets of $S_1 \times S_2$, which is $PS(S_1 \times S_2)$. Based on Structure 1:

$|PS(S_1 \times S_2)| = 2^{|S_1 \times S_2|} = 2^{mn}$
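A brute-force sanity check of the $2^{mn}$ count (a Python sketch; `S1` and `S2` are illustrative small sets):

```python
from itertools import product

# Every relation is a subset of the Cartesian product: each pair is either
# in or out, so enumerate all in/out assignments over the mn pairs.
S1 = ["A1", "A2"]                          # m = 2
S2 = ["B1", "B2", "B3"]                    # n = 3
pairs = [(a, b) for a in S1 for b in S2]   # the Cartesian product, mn pairs

relations = list(product([False, True], repeat=len(pairs)))

print(len(relations))                      # 2^(2*3) = 64
```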

In sets, the relation with the most connections is the Cartesian Product $A \times B$; in its corresponding graph, the graph with all possible edges is called Complete, noted by the letter $K$ for Komplete (Complete, but not to be confused with other usages of $C$): $K_{m,n}$ if over two sets and $K_n$ if over one set.

Structure 3: Directed Graphs over a Set $S$ - $2^{n^2}$


Graph $G = (V, E)$ where $V$ is the vertex set (aka nodes) and $E$ is the edge set (aka arcs).

$n = |V|, \qquad E \subseteq V \times V$

(Compare with Structure 2 with $R$ as $E$; here $S_1 = S_2 = V$.) In fact, a graph is a “graph”ical representation of a relation.

The conventional defaults for relations and directed graphs are slightly different, but they both are “two sides” of the same coin. In the case of relations (Structure 2), the default is a mapping from a set $S_1$ to a set $S_2$ (i.e. two sets involved), but the default for directed graphs is a mapping over a single set $V$. Now the two are the same because there is no restriction on $S_1$ and $S_2$, and in fact we can have $S_1 = S_2$, in which case relations are identically directed graphs. Not just that $m = n$ but that $A_i = B_i$ - the elements are the same.

Therefore, we can use Structure 2 to count the number of directed graphs over $V$, $n = |V|$. Namely, set $m = n$ and $S_1 = S_2$. Then, the number of directed graphs over a single set $V$:

$|PS(S_1 \times S_2)| = 2^{|S_1 \times S_2|} = 2^{mn} \quad \text{where } S_1 = S_2 = V$

$|PS(V \times V)| = 2^{|V \times V|} = 2^{n^2} \quad \text{so } m = n$

Consider a relatively “small” vertex set $V$ with $n = 10$ nodes/vertices. The number of different directed graphs over this set is $2^{n^2} = 2^{10^2} = 2^{100}$ (WOW!!!!), about $1.27 \times 10^{30}$ - an astronomically large number.

(The number of undirected graphs is left as a homework exercise.)
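The same brute-force check works for directed graphs over one set, assuming self-loops are allowed as in $E \subseteq V \times V$ (Python sketch with an illustrative $n = 3$):

```python
from itertools import product

# Enumerate every directed graph over V by deciding in/out for each of the
# n^2 candidate edges (self-loops included, matching E ⊆ V × V).
V = [0, 1, 2]                                      # n = 3
possible_edges = [(u, v) for u in V for v in V]    # n^2 = 9 candidate edges

graphs = list(product([False, True], repeat=len(possible_edges)))
print(len(graphs))       # 2^(3^2) = 512
print(2 ** 100)          # the count for n = 10 -- already astronomical
```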

Structure 4: Partial Mapping from a Set $S_1$ to a Set $S_2$ - $(n+1)^m$

S1 S2
A1 B1
A2 --> B2
Ai Bj
. .
Am Bn

In a partial mapping, each element $A_i$ independently has $n$ possible $B_j$ elements to choose from to map into (though, at most one), but in addition, each element $A_i$ has the extra right to not go at all (this is what causes the mapping to be “partial” or undefined). The partial mapping can only make one of these choices for each element $A_i$.

Each $A_i$ therefore has $n + 1$ choices:

$\underbrace{(n+1)}_{A_1} \times \underbrace{(n+1)}_{A_2} \times \cdots \times \underbrace{(n+1)}_{A_i} \times \cdots \times \underbrace{(n+1)}_{A_{m-1}} \times \underbrace{(n+1)}_{A_m}$

In basic probability theory (and basic analysis of algorithms), the independent choices are multiplied together. The number of partial mappings from a set $S_1$ of size $m$ to a set $S_2$ of size $n$ equals $(n+1)^m$.
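A brute-force check of the $(n+1)^m$ count, modeling the "don't go at all" option as `None` (Python sketch with illustrative sets):

```python
from itertools import product

# Each A_i independently picks one of the n targets or None ("don't go"),
# giving (n + 1)^m partial mappings in total.
S1 = ["A1", "A2", "A3"]          # m = 3
S2 = ["B1", "B2"]                # n = 2

choices = S2 + [None]            # n + 1 choices per element of S1
partial_mappings = list(product(choices, repeat=len(S1)))

print(len(partial_mappings))     # (2 + 1)^3 = 27
```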

Structure 5: Total Mapping (Function) from a Set $S_1$ to a Set $S_2$ - $n^m$

S1 S2
A1 B1
A2 --> B2
Ai Bj
. .
Am Bn

In a total mapping, each element $A_i$ independently has $n$ possible $B_j$ elements to choose from to map into, but each element $A_i$ must go to exactly one element $B_j$. Note that all $A_i$ can go to the same $B_j$.

Exactly one is not the same as uniquely one (next structure). Exactly one is purely a quantitative issue. Uniquely one is also qualitative.

Total functions that all map to the same element are called Constant Functions. For example, $F(x) = 5$.

Each $A_i$ has $n$ choices:

$\underbrace{n}_{A_1} \times \underbrace{n}_{A_2} \times \cdots \times \underbrace{n}_{A_i} \times \cdots \times \underbrace{n}_{A_{m-1}} \times \underbrace{n}_{A_m}$

In basic probability theory (and basic analysis of algorithms), the independent choices are multiplied together. The number of total mappings from a set $S_1$ of size $m$ to a set $S_2$ of size $n$ equals $n^m$.
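A brute-force check of the $n^m$ count, plus a peek at the constant functions (Python sketch with illustrative sets):

```python
from itertools import product

# Each A_i must pick exactly one B_j, with repeats allowed, giving n^m mappings.
S1 = ["A1", "A2", "A3"]          # m = 3
S2 = ["B1", "B2"]                # n = 2

total_mappings = list(product(S2, repeat=len(S1)))
print(len(total_mappings))       # 2^3 = 8

# A constant function is the special case where every A_i maps to the same B_j:
constant = [f for f in total_mappings if len(set(f)) == 1]
print(len(constant))             # one constant function per element of S2, so 2
```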

Structure 6: Injective Mapping (1-1 Function) from a Set $S_1$ to a Set $S_2$ - $\frac{n!}{(n-m)!}$


aka Permutations

S1 S2
A1 B1
A2 --> B2
. .
. .
. .
Am Bn

Injective mappings are also Total. Based on the pigeonhole principle, the uniqueness property of 1-1 mappings therefore requires that $m \leq n$.

In a 1-1 mapping, each element $A_i$ must uniquely choose one of the remaining possible $B_j$ to map into, and each element $A_i$ must go to some $B_j$ (since 1-1 is also Total).

To see why the number of choices decreases, imagine a waiter at a party carrying a tray of 100 differently shaped cakes - each one unique. The first person the waiter approaches has 100 choices and takes one. The second person now only has 99 choices, the third has 98, and so on. Because of the uniqueness requirement, whatever cake a person takes is no longer available to the next person. The same logic applies here: $A_1$ makes its choice first (wlog), which removes one $B_j$ from availability, leaving $A_2$ with $n-1$ choices, and so on.

Each $A_i$ gets fewer choices as prior elements have already claimed a $B_j$:

$\underbrace{n}_{A_1} \times \underbrace{(n-1)}_{A_2} \times \cdots \times \underbrace{(n-i+1)}_{A_i} \times \cdots \times \underbrace{(n-m+2)}_{A_{m-1}} \times \underbrace{(n-m+1)}_{A_m}$

In basic probability theory (and basic analysis of algorithms), the independent choices are multiplied together. The number of injective mappings from a set $S_1$ of size $m$ to a set $S_2$ of size $n$ equals:

$n \cdot (n-1) \cdot (n-2) \cdots (n-i+1) \cdots (n-m+1)$

This sort of looks like a defective factorial function. So, the literature uses an algebraic trick to count the number of such 1-1 mappings - multiply by 1:

$\bigl[n \cdot (n-1) \cdot (n-2) \cdots (n-i+1) \cdots (n-m+1)\bigr] \times \frac{(n-m)!}{(n-m)!} = \frac{n!}{(n-m)!}$

This is ALSO the number of ways to permute a subset of a set. As above, $m \leq n$. Thus, we can view the 1-1 mapping as a two-step process:

  • (Select or) Choose $m$ of the $n$ elements in set $S_2$. This is $C(n,m)$, aka $\binom{n}{m}$, aka $nCm$. Note: the convention for combinatorics is that the LARGER size is written first, here $n$ (since $m \leq n$), even though in terms of its cousin the 1-1 mapping, the set of size $m$ is visited/used first. Likewise for $\text{Permute}(n,m)$ even though we select (“deal with”) $m$ first. $C(n,m)$ = Combinations.
  • Count the number of permutations of a set of size $m$. How many permutations of size $m$? $m!$

In summary, what we have here is the following:

$\text{Permute}(n,m) = C(n,m) \times m! = \frac{n!}{(n-m)! \cdot m!} \times m! = \frac{n!}{(n-m)!}$

Thus, a 1-1 mapping is identical to permuting a subset of elements from a larger set.
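The formula can be checked directly: `itertools.permutations(S2, m)` enumerates exactly these injective mappings, the $i$-th slot being $A_i$'s target (Python sketch with illustrative $m = 3$, $n = 5$):

```python
from itertools import permutations
from math import factorial

# permutations(S2, m) yields every ordered selection of m distinct targets,
# i.e. every injective mapping from an m-element S1 into S2.
m, n = 3, 5
S2 = list(range(n))

injective = list(permutations(S2, m))
print(len(injective))                    # 5 * 4 * 3 = 60
print(factorial(n) // factorial(n - m))  # n!/(n-m)! = 60, matching the formula
```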

Structure 7: Surjective Mapping (Onto Function) from a Set $S_1$ to a Set $S_2$ - see below


Every element in $S_2$ is reached.

S1 S2
A1 B1
A2 --> B2
. .
. .
. .
Am Bn

Onto mappings are also Total, so every $A_i$ goes to exactly one $B_j$. All elements of $S_2$ are connected to (“reached”) by the elements of set $S_1$. Therefore, $m \geq n$.

No one knows how to directly calculate the number of surjective mappings. Instead, the approach utilized is to count the number of total functions that are NOT onto:

$\text{Total}(m,n) = \text{ONTO}(m,n) + \text{NON\_ONTO}(m,n)$

$\text{ONTO}(m,n) = \text{Total}(m,n) - \text{NON\_ONTO}(m,n) = n^m - \text{NON\_ONTO}(m,n)$

(cf. Structure 5 above.)

The complication that occurs is due to the Inclusion-Exclusion (I-E) principle, which deals with offsetting the possibility of counting the same elements more than once.

For example, suppose the math students club and the computer science society decide to have an end-of-year party (only students allowed). $|\text{MATH}| = 50$ and $|\text{CS}| = 50$. Every student attends the party. How many students in total attended the party?

The first-glance approach would be $|\text{MATH}| + |\text{CS}| = 50 + 50 = 100$ students. But this is not necessarily correct, because it ignores the possibility that some students are double majors. Thus, the correct answer is $50 \leq x \leq 100$.

I-E applied:

$|A \cup B| = |A| + |B| - |A \cap B|$

$|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|$
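The party example can be sketched with hypothetical rosters (the student names and overlap are invented for illustration):

```python
# Some students are double majors, so naive addition overcounts;
# inclusion-exclusion subtracts the overlap exactly once.
math_club = {"ann", "bob", "cal", "dee"}
cs_society = {"cal", "dee", "eve"}

naive = len(math_club) + len(cs_society)                       # counts cal, dee twice
correct = len(math_club) + len(cs_society) - len(math_club & cs_society)

print(naive)                        # 7
print(correct)                      # 5
print(len(math_club | cs_society))  # Python's set union agrees: 5
```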

Pascal’s triangle has an interesting connection to I-E - the combinatorics involved in the I-E principle are the coefficients present in Pascal’s triangle:

$1 \quad 2 \quad 1$

$1 \quad 3 \quad 3 \quad 1$

These are rows of Pascal’s triangle.

Applying I-E to Count (Non-)Surjective Mappings

S1 S2
A1 B1
A2 --> B2
. .
. .
. .
. Bn
.
Am

(Note that $m \geq n$.)

How many ways can we prevent the general total mapping from becoming surjective? (wlog means “without loss of generality”: the theorem does not depend on which element(s) we use; we choose the elements simply for the ease of discussing them.)

  • We will not allow any element to reach (wlog) $B_1$ - (one $B_i$ element)
  • We will not allow any element to reach (wlog) $B_1$ or $B_2$ - (two $B_i$ elements)
  • $\vdots$
  • We will not allow any element to reach (wlog) $B_1, B_2, \ldots, B_{n-1}$ - ($n-1$ of the $B_i$ elements)

Once we have stopped the total mapping from becoming surjective, the total mapping is then free to map to whichever remaining BjB_j it wants.

Putting this all together:

  • How many ways can we prevent the total mapping from reaching $i$ elements of $S_2$? $C(n,i)$
  • How many $B_j$ remain after preventing the total mapping from reaching those $i$ elements? $n - i$
  • How many total functions are there from $S_1$ (all of the $A$‘s) to the remaining $n-i$ free $B$ elements? $(n-i)^m$

How does the I-E principle apply here? When you prevented the mapping from reaching one element $B_j$ of $S_2$, this also includes the situation of preventing it from reaching two elements of $S_2$, as follows:

  • (forced) The first of the two $B$ elements is prevented because you explicitly prevented it.
  • (free) The second of the two $B$ elements can also be prevented because the remaining total function had no obligation to go to this second element.

This latter situation is then numerically equivalent to one of the cases where you explicitly prevented the mapping from reaching two of the $B_j$. Thus, the I-E principle applies and we need to put in the $\pm$ correction factor.

Putting this all together:

$\text{ONTO}(m,n) = n^m - \sum_{i=1}^{n-1} (-1)^{i+1} \cdot C(n,i) \cdot (n-i)^m$

where the sum accounts for: the possible NON_ONTO mappings, the I-E correction, choosing the $B_i$‘s, and $\text{TOTAL}(m, n-i)$.

Therefore:

$\text{NON\_ONTO}(m,n) = \sum_{i=1}^{n-1} (-1)^{i+1} C(n,i) (n-i)^m$

$\text{ONTO}(m,n) = \text{TOTAL}(m,n) - \text{NON\_ONTO}(m,n) = n^m - \sum_{i=1}^{n-1} (-1)^{i+1} C(n,i) (n-i)^m$
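The I-E formula can be checked against brute force over all $n^m$ total mappings (Python sketch; `onto_formula` and `onto_brute_force` are names invented here):

```python
from itertools import product
from math import comb

# ONTO(m, n) via the inclusion-exclusion formula derived above.
def onto_formula(m, n):
    return n ** m - sum((-1) ** (i + 1) * comb(n, i) * (n - i) ** m
                        for i in range(1, n))

# ONTO(m, n) by brute force: keep only mappings where every B_j is reached.
def onto_brute_force(m, n):
    targets = range(n)
    return sum(1 for f in product(targets, repeat=m)
               if set(f) == set(targets))

for m, n in [(3, 2), (4, 3), (5, 3)]:
    assert onto_formula(m, n) == onto_brute_force(m, n)

print(onto_formula(3, 2))   # 2^3 total mappings minus the 2 constants = 6
```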

Structure 8: Partitions and How They Relate to Surjective Mappings - $\frac{\text{ONTO}(m,n)}{n!}$


A partition of a set $S = \{e_1, \ldots, e_n\}$ is a subdivision into $k$ subsets $S_i$ such that $\text{Partition}(S) = \{S_1, S_2, \ldots, S_k\}$ with the following conditions:

  • Each $S_i$ is nonempty - (every box purchased is being used)
  • $S_i$ must be a subset of $S$ - (no extraneous element crept in) (*)
  • $\bigcup_i S_i = S$ - (all original elements have been packed)
  • $S_i \cap S_j = \emptyset$ unless $i = j$ - (mutually disjoint)

(*) Assumes $S$ is unordered and hence has only unique elements and no duplicates.

The above assumed that a) the owner of the original set decided ahead of time how many boxes (or subsets) to use (here, $k$). A different scenario is that it is up to b) the packers to decide on their own how many boxes to use. These scenarios get slightly different names in discrete mathematics:

  • a) k-partition: $\text{PARTITION}(S, k)$ - $k$, the number of pieces (subsets) in the partition, is fixed ahead of time.
  • b) partition: $\text{PARTITION}(S)$ - no initial restriction is placed on the number of boxes.

Connection Between k-Partitions and ONTO Mappings


There is a fascinating connection between k-partitions and ONTO mappings. Consider:

S1 S2
a1 b1
a2 b2
a3 --> b3
a4 b4
a5
a6
a7
a8
a9
a10

Here, $k = |S_2| = 4$.

For example, one surjective mapping vs. a second surjective mapping:

Surjective mapping 1:

  • $B_1$ reached by $\{a_1, a_2, a_3, a_4\}$
  • $B_2$ reached by $\{a_5, a_6, a_7\}$
  • $B_3$ reached by $\{a_8, a_9\}$
  • $B_4$ reached by $\{a_{10}\}$

Surjective mapping 2:

  • $B_4$ reached by $\{a_1, a_2, a_3, a_4\}$
  • $B_3$ reached by $\{a_5, a_6, a_7\}$
  • $B_2$ reached by $\{a_8, a_9\}$
  • $B_1$ reached by $\{a_{10}\}$

In terms of ONTO, these are different ONTO mappings. But in terms of partitions, they are the same - since the ORDER of the boxes does not matter, only which elements are packed with which elements. In both cases, the same elements are with the same elements as before. This then becomes the same difference between Combination and Permutation, so that the counting differs by n!n!.

Conclusion:

$\text{PARTITION}(m, n) = \frac{\text{ONTO}(m, n)}{n!}$

Note: $n!$ is the number of ways of ordering $n$ elements (here the $S_2$ boxes that the $S_1$ elements are packed into). Dividing removes the consideration of order, whereas multiplying by $n!$ would add in the consideration of order.

The Bell number is the number of partitions for a set of a given size. The above is for k-partitions. So, the Bell number is the summation over all possible $k$:

$\text{Bell}(m) = \sum_{n=1}^{m} \text{Partition}(m, n)$

($\text{Partition}(m,n)$ are generically called “k-partitions” even though here $k = n$.)
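Both $\text{Partition}(m,n)$ (the Stirling numbers of the second kind) and the Bell number can be sketched directly from the formulas above (Python; the function names are invented here):

```python
from math import comb, factorial

# ONTO(m, n) written as the full alternating inclusion-exclusion sum.
def onto(m, n):
    return sum((-1) ** i * comb(n, i) * (n - i) ** m for i in range(n + 1))

# PARTITION(m, n) = ONTO(m, n) / n!  -- dividing out the ordering of the boxes.
def partition(m, n):
    return onto(m, n) // factorial(n)

# Bell(m) sums the k-partitions over all possible numbers of boxes.
def bell(m):
    return sum(partition(m, n) for n in range(1, m + 1))

print([partition(4, n) for n in range(1, 5)])   # [1, 7, 6, 1]
print(bell(4))                                  # 1 + 7 + 6 + 1 = 15
```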

Structure 9: Bijections - Also Known as 1-1 Correspondence - $n!$


A bijection (aka 1-1 correspondence) is a (total) function that is both 1-1 (injective) and onto (surjective). Its importance is that it allows for the definition of an inverse - inverse means that the relation holds true in the other direction as well.

S1 S2
a1 b1
a2 --> b2
. .
Ai Bj
. .
am bn
m = |S1| n = |S2|

Recall that both injective and surjective have constraints on the sizes of the sets:

  • Injective: $m \leq n$
  • Surjective: $m \geq n$
  • Therefore, Bijective: $m = n$

Discrete Mathematics then has a theorem that proves that an alternative (but equivalent) definition for bijective can be: 1-1 (injective) and equal sized sets ($m = n$).

So, recall that the number of injective mappings equals $\frac{n!}{(n-m)!}$. But here $m = n$, so the number of bijections equals:

$\frac{n!}{(n-n)!} = \frac{n!}{0!} = n! \quad \text{since } 0! = 1$
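A quick enumeration confirms the $n!$ count and the invertibility (Python sketch with illustrative 3-element sets):

```python
from itertools import permutations

# With m = n, every injective mapping is a bijection, and each one has a
# well-defined inverse obtained by flipping the (a, b) pairs.
S1 = ["a1", "a2", "a3"]
S2 = ["b1", "b2", "b3"]          # m = n = 3

bijections = [dict(zip(S1, p)) for p in permutations(S2)]
print(len(bijections))           # 3! = 6

f = bijections[0]
inverse = {b: a for a, b in f.items()}
assert all(inverse[f[a]] == a for a in S1)   # the inverse undoes f
```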

Why Can’t an Injective Mapping Alone Already Have an Inverse?

S1 S2
a1 b1 |
a2 --> b2 | (reached by S1)
. . |
am bm |
bm+1
.
.
bn
m = |S1| n = |S2|, m <= n

Simply, can’t we argue that the manner in which we mapped from an element in $S_1$ to an element in $S_2$ should provide the exact manner of getting back? Imagine for simplicity $A_i \to B_j$. Couldn’t the inverse simply be $B_j \to A_i$?

The official answer is that since $S_2$ can be bigger in size, there will possibly be some extra $B_j$ elements that are not involved in this mapping. This makes the complete mapping of set $S_2$ back to $S_1$ a partial function.

A counter-argument is that the extra elements of $S_2$ were never reached by the elements from $S_1$ and therefore should not be part of our discussion of going back (“inverse”). In numerical analysis, this argument is presented, and based on this, the mapping back (the inverse) uses only the subset of elements from $S_2$ that were initially reached by the elements of $S_1$. This type of inverse is termed a pseudo-inverse.

The faceplate of a scientific calculator illustrates partial functions in practice. Several buttons yield undefined (invalid) results for certain inputs:

| Button | Invalid Input | Windows Response | Note |
|--------|---------------|------------------|------|
| (a) $1/x$ | $0$ | Cannot divide by zero | |
| (b) $\sqrt{\phantom{x}}$ | $-1$ | Invalid input | Square root function on any negative value |
| (c) $\log$ | $0$ | Invalid input | Or any negative value - typically natural number base |
| (d) $\ln$ | $0$ | Invalid input | Or any negative value - base $e$ |
| (e) $/$ | $0$ | Cannot divide by zero | When the denominator is zero ($\div$) |
| (f) $x^y$ | $x = y = 0$ | $1$ | *This is technically wrong (see below) |
| (g) $\arcsin$ | $2$ | Invalid input | The “inverse sin” function. Not defined for any $x$ where $\lvert x \rvert > 1$ |
| (h) $\arccos$ | $2$ | Invalid input | The “inverse cos” function. Not defined for any $x$ where $\lvert x \rvert > 1$ |
| (i) $\tan$ | $90$ or $270$ | Invalid input | Value is in degrees |
| (j) $!$ | $-1$ | Invalid input | Or any negative value |

*Regarding $x^y$ where $x = y = 0$, there are two different rules that clash:

  • $0$ raised to any power equals $0$.
  • Any number raised to the power $0$ equals $1$.

Technically, $0^0$ is undefined; it is so undefined, due to the contradiction, that it is called indeterminate. Computer theory needed to define $0^0$ as equal to $1$ in order for certain mathematical recursions to work, so Martin Davis (the modern founder of computer theory) chose rule 2 and assigned $0^0 = 1$.