Lecture 1: Introduction - CSCI 381 Goldberg

Administrative Updates

Email Communication

Please include the following email in your Qmail contact list: rgoldberg@qc.cuny.edu (to make sure they don’t end up in your spam/junk folders)
Class designated email: CS722ProjectsQC@gmail.com (for undergraduate and graduate students alike)
When I email you, it will be from my Qmail to your Qmail.
When you email me, please send from your Qmail to the class designated email.
- The Subject Line should always start with Subject: TC
- Never include any personal information. Your name is automatically inserted by Qmail and every professor via CunyFirst Rosters has your Student ID#.
For any content not already in the Brightspace Content tab folder, I will email it to the class.
If I mention in lecture that I will be emailing you a file, and the time passed and you have not yet received the file, then FIRST check your spam/clutter/junk/trash folders for the email. If not found, then email me.

Attendance

First few weeks: Attendance is mandatory (university reimbursement requirement)
After first few weeks: Attendance not required, except for exams
Exams: Mandatory attendance (details in syllabus)

Recording Policy

NO audio or video recording allowed in this course
Alternative: Professor will email comprehensive notes daily (afternoon)
All board content and shared materials provided via email

Brightspace System

Access: Content folder
Will be updated this week with current files
Older files currently there (will be replaced)

Notes & Asynchronous Materials

Syllabus document (video + notes) posted this week
Optional viewing (available asynchronously if interested)
Full notes emailed daily after class

Technical Notes

Brightspace 60-minute session limit: Class is 75 min
May need to switch to “Part 2” session around 45-60 min mark
Both part sessions listed on Brightspace calendar

Next Steps

Watch for email with notes later today
Check all folders for incoming course emails
Review syllabus when posted

Introduction to Theory of Computation

The field of material that this course covers is called many names in the literature. Three popular names are:

Theory of Computation
Computability Theory
Recursion Theory

The first two are similar. The third is intriguing and we will spend a number of lectures exploring that rubric.

Computer Theory was developed in the 1920s (Alonzo Church) and 1930s (Alan Turing). Church developed the Lambda Calculus (which influenced modern programming languages like Java). Church was abstract in thinking, while Turing provided the concrete architecture (Turing Machine) to implement Church’s concepts. They did not collaborate, though Church set the stage and Turing solidified the work.

Later, formal language theory emerged in mid 1950s-mid 1960s (Stephen Kleene for regular expressions, Noam Chomsky for unifying the overall system).

Some of the fundamental notions were based on set theory developed in the late 1800s (Georg Cantor, 1845-1918). This means that the fundamentals of computer theory were developed PRIOR to the advent of a computer. (Not till 1940s was there a stable electronic digital computer.)

Church-Turing Thesis

Computability is based on Countability (Natural number system)

Scope of Computability

An important nuance: the theory concerns whether a machine can follow instructions, not whether the code produces the desired solution. The focus is on the machine’s ability to compute, not on correctness or values.

Why should the ability for a machine to compute depend on the natural number system?

Natural numbers $N = \{0, 1, 2, 3, 4, \ldots\}$ are countable—you can list them in order. Real numbers $R$ include decimal points and fractions. They are fundamentally different: natural numbers require finite digits to describe any specific number, while real numbers require infinite digits (e.g., $\pi, e$ ).

Why Machines Can Only Count

Any real machine has finite memory. A machine with finite memory can only store and process numbers requiring finite representation—i.e., natural numbers. Arbitrary real numbers require infinite storage, making them impossible for practical machines to handle exactly.

In condensed form: The Church-Turing thesis states that a computing machine can only have a FINITE memory model and hence can only compute (completely) over the natural numbers.

Real Data Types Are Approximations

Programming languages provide a “real” data type, but these are approximations stored as natural numbers (via finite-precision floating-point representation). They are not true real numbers.

Turing’s Infinite Tape: An Apparent Paradox

Design choice: Turing provided his abstract machine with infinite tape memory, seemingly contradicting the Church-Turing thesis (which assumes finite memory).

Why? Turing couldn’t determine ahead of time how much memory a program needs. Rather than worry about allocating finite memory, he provided infinite tape so the machine would never run out. In practice,:

Only finite portions of the tape are used (if computation terminates normally)
Infiniteness only manifests in infinite loops (non-termination), which aren’t valid computations
This abstraction freed Turing from reasoning about memory as a constrained resource

Number Systems: Discrete vs. Continuous

Discrete vs. Continuous

Discrete are Natural Numbers and Counting Numbers. Every number has a well-defined successor and there are gaps between consecutive numbers.

Continuous are Real Numbers. There is no official successor (next number) to any given real number. For example, between 1.5 and 1.6 is 1.55, etc. There are no gaps between numbers in the sense that there is always another real number between any two real numbers; simply compute the average of the two real numbers.

The successor property enables one to describe the elements of a set in order using indices/positions. For example, “This is the 3rd element and this is the 4th element,” etc.

Zeno’s Paradox

Zeno’s paradox illustrates the fundamental difference between discrete and continuous mathematics.

The paradox (geometric form):

If one iteratively walks half the remaining distance to reach the opposite end of a room, they theoretically never arrive—since halving a distance infinitely produces points between any two locations. Yet physically, people walk across rooms.

The Resolution

Between any two points A and B, there are infinitely many geometric points. However, machines can only handle finite representations. A ruler measuring 5 inches between A and B counts a finite value, but this finite measurement actually summarizes infinitely many continuous points.

The resolution: Geometry is continuous (there’s always a point between any two points), but measurement by machines is discrete/digital. Machines cannot compute all infinite geometric points; they approximate using a finite, digital measurement.

This connects directly to Church-Turing: machines, operating with finite memory, cannot handle the continuous (infinite) realm. They work in the discrete realm of countable natural numbers.

Historical Note

Why is Church-Turing called a thesis (not a theorem)? Because we cannot prove that machines cannot have infinite memory. Science typically assumes unprovable things don’t exist, but computability theory is more modest: it’s an assumption, not a proven theorem. (Quantum computers may challenge this if they exhibit infinite memory models, though currently they don’t.)

Finite State Machines

Definition and Purpose

Finite state machines process strings over an alphabet. They must accept strings that belong to a given language and reject those that don’t.

Graph Representation

A finite state machine is a weighted graph $G_w = (V, E, w)$ where:

$V$ = vertices (called states)
$E$ = edges (called transitions)
$w: E \to \text{Alphabet}$ = weight function mapping edges to alphabet letters

Each path through the graph represents a string formed by concatenating the alphabet letters along the edges.

Zeno’s Paradox in Finite State Machines

The Problem: A machine with only a finite number of states must process an infinite number of possible strings over an alphabet. How is this possible?

This is a variation of Zeno’s Paradox: a finite structure handling infinity.

The Solution (Myhill-Nerode Theorem): Each state corresponds to an equivalence class—an infinite set of strings that behave identically. A finite number of such equivalence classes can distinguish all strings in a language. In other words, infinitely many strings collapse into finite patterns, allowing finite state machines to process infinite languages.