Lecture 23 (05/04/2026) - List Update Problem (3 Algorithms); Introduce Multiplicative Weight Updates | CSCI 328

Scribes: Noman Saleemi and Ahmed Malik

Review of Online Algorithms

Online algorithms must make decisions on the fly, without knowing future requests. To evaluate an online algorithm, we compare its cost to that of an optimal offline algorithm (often referred to as “God’s algorithm”), which knows the entire request sequence in advance. The competitive ratio quantifies this comparison as the worst-case ratio of the online cost to the optimal offline cost. Examples from previous classes include the ski rental problem, which has a competitive ratio of two, and the pizza finding problem, which has a competitive ratio of nine.

The List Update Problem

Problem Statement

In the list update problem, an algorithm must maintain a linked list of $n$ keys, denoted $\{1, \dots, n\}$. An online sequence of $m$ requests arrives: $r_1, r_2, \dots, r_m$, where we assume $m > n$. When a request $r_i$ arrives, the algorithm pays a cost equal to the current position of $r_i$ in the linked list (counting from 1), because it must walk from the front to find it.

After the key is accessed, the algorithm may move $r_i$ to any earlier position in the linked list (closer to the front) for free, to make it cheaper to reach on future requests. The objective is to minimize the total access cost over the entire sequence.

Approach 0: Do Nothing

The most trivial algorithm simply leaves the list unchanged, never moving the requested elements.

Analysis: Let the initial linked list be $1 \rightarrow 2 \rightarrow 3 \rightarrow \cdots \rightarrow n$. A bad access sequence for this approach requests the last element of the list $m$ times, i.e., $S = \{n, n, \dots, n\}$. Each request costs $n$, leading to a total cost of $mn$.

An offline algorithm can do much better by moving the last element $n$ to the front of the list after its first access. It pays $n$ for the first access and $1$ for each of the remaining $m-1$ accesses, so $\text{OPT}(S) \le n + (m-1) \le 2m$, using $n < m$. Comparing the two costs gives a lower bound on the competitive ratio:

\frac{mn}{\text{OPT}(S)} \ge \frac{mn}{2m} = \frac{n}{2} = \Omega(n)

Because the ratio grows linearly with $n$ instead of staying constant, this is a poor competitive ratio.
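The arithmetic above can be checked with a small sketch. The function name `static_list_cost` and the sizes `n = 10`, `m = 50` are illustrative assumptions, not part of the lecture.

```python
def static_list_cost(initial_list, requests):
    """Total access cost when the list is never reordered (positions count from 1)."""
    position = {key: index + 1 for index, key in enumerate(initial_list)}
    return sum(position[key] for key in requests)


n, m = 10, 50                        # illustrative sizes with m > n
requests = [n] * m                   # bad sequence: always request the last key
do_nothing_cost = static_list_cost(range(1, n + 1), requests)
offline_cost = n + (m - 1)           # pay n once, then 1 per request after moving n to the front
ratio = do_nothing_cost / offline_cost
print(do_nothing_cost, offline_cost, ratio)
```

Here `do_nothing_cost` equals $mn$ and `offline_cost` stays below $2m$, so `ratio` is at least $n/2$.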

Approach 1: Order by Frequency

Another approach is to keep the list ordered by the access frequency of the keys seen so far, so that more frequently requested keys sit closer to the front. Whenever a key is requested, its frequency is incremented, and if its new frequency surpasses that of the preceding key, it is moved ahead of that key.

Analysis: A bad sequence for this algorithm accesses the first key $n$ times, then the second key $n$ times, and so on, up to the $n$-th key:

S = \{\underbrace{1, \dots, 1}_{n \text{ times}},\ \underbrace{2, \dots, 2}_{n \text{ times}},\ \dots,\ \underbrace{n, \dots, n}_{n \text{ times}}\}

After half of sequence $S$ has gone by, keys $1, 2, \dots, n/2$ occupy the first half of the linked list, while keys $n/2+1, \dots, n$ are in the second half. For the remaining half of the sequence, each request targets a key in the second half of the list. Such a key never reaches a frequency above $n$, so it never overtakes the keys in the first half, and every access costs at least $n/2$. Since there are $n^2/2$ requests in the second half of $S$, the cost is at least:

\frac{n^2}{2} \cdot \frac{n}{2} = \frac{n^3}{4} = \Omega(n^3)

Is there a better algorithm for this sequence? An algorithm that simply moves each requested key to the front of the list incurs a cost of at most $n$ for the first access of a key, and $1$ for each of the remaining $n-1$ accesses of that key. Over all $n$ keys, the total cost is at most $n\bigl(n + (n-1)\bigr) \le 2n^2$. Therefore, $\text{OPT}(S) \le 2n^2$. The competitive ratio of the “Order by Frequency” algorithm is therefore at least:

\frac{n^3/4}{2n^2} = \frac{n}{8} = \Omega(n)
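The gap on this sequence can be reproduced with a short simulation. This is a sketch under the rule stated above (a key moves forward only when its count strictly exceeds its predecessor's); the function names and `n = 20` are illustrative assumptions.

```python
def frequency_count_cost(n, requests):
    """Order-by-frequency: total access cost starting from the list 1..n."""
    lst = list(range(1, n + 1))
    count = {key: 0 for key in lst}
    total = 0
    for r in requests:
        i = lst.index(r)
        total += i + 1                      # cost = 1-based position
        count[r] += 1
        # move forward while the new count strictly exceeds the predecessor's
        while i > 0 and count[lst[i - 1]] < count[r]:
            lst[i - 1], lst[i] = lst[i], lst[i - 1]
            i -= 1
    return total


def move_to_front_cost(n, requests):
    """Move-to-front: total access cost starting from the list 1..n."""
    lst = list(range(1, n + 1))
    total = 0
    for r in requests:
        i = lst.index(r)
        total += i + 1
        lst.insert(0, lst.pop(i))           # free move of the accessed key to the front
    return total


n = 20
bad_sequence = [key for key in range(1, n + 1) for _ in range(n)]
fc_cost = frequency_count_cost(n, bad_sequence)
mtf_cost_on_bad = move_to_front_cost(n, bad_sequence)
print(fc_cost, mtf_cost_on_bad)
```

On this input the frequency-based list never changes, because no key ever strictly overtakes its predecessor, so its cost is cubic in $n$ while the move-to-front cost stays quadratic.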

Approach 2: Move to Front (MTF)

Under the “Move-to-Front” algorithm, after walking to the requested element $r_i$, we immediately move it to the front of the list. Despite its simplicity, this heuristic is highly competitive.

Theorem: The Move-to-Front algorithm is 2-competitive. For every request sequence $S$ of length $m$, its cost is bounded by:

\text{cost}_{\text{MTF}}(S) \le 2 \cdot \text{OPT}(S) - m + \binom{n}{2}

If the sequence length $m$ exceeds $\binom{n}{2}$, the term $\left(m - \binom{n}{2}\right)$ is positive, leading to:

\text{cost}_{\text{MTF}}(S) \le 2 \cdot \text{OPT}(S) - \left(m - \binom{n}{2}\right) \le 2 \cdot \text{OPT}(S)

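So whenever $m > \binom{n}{2}$, the cost of MTF is strictly less than $2 \cdot \text{OPT}(S)$. For shorter sequences, the cost exceeds $2 \cdot \text{OPT}(S)$ by at most the additive term $\binom{n}{2}$, which does not depend on $m$.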
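The bound can be checked exhaustively on tiny instances. The sketch below assumes the cost model stated in these notes, in which the only allowed rearrangement is a free forward move of the accessed key; `offline_opt_cost` is a hypothetical brute-force helper that is exponential in $n$ and only meant for very small lists.

```python
from itertools import product
from math import comb


def mtf_cost(initial, requests):
    """Total cost of Move-to-Front starting from the list `initial`."""
    lst = list(initial)
    total = 0
    for r in requests:
        i = lst.index(r)
        total += i + 1
        lst.insert(0, lst.pop(i))
    return total


def offline_opt_cost(initial, requests):
    """Minimum total cost over all strategies that only move the accessed key forward."""
    best = {tuple(initial): 0}              # list state -> cheapest cost to reach it
    for r in requests:
        nxt = {}
        for state, cost in best.items():
            i = state.index(r)
            new_cost = cost + i + 1
            rest = state[:i] + state[i + 1:]
            for j in range(i + 1):          # reinsert r at index 0..i (i = stay put)
                s = rest[:j] + (r,) + rest[j:]
                if new_cost < nxt.get(s, float("inf")):
                    nxt[s] = new_cost
        best = nxt
    return min(best.values())


n, m = 3, 6                                 # small enough to try every sequence
start = tuple(range(1, n + 1))
violations = [
    s for s in product(start, repeat=m)
    if mtf_cost(start, s) > 2 * offline_opt_cost(start, s) - m + comb(n, 2)
]
print(len(violations))
```

An empty `violations` list means that no sequence of length 6 over 3 keys breaks the inequality in the theorem, under the assumptions above.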

Proof Concept: Proving this theorem requires the potential function method, which acts like a piggy bank that stores value when the algorithm is faster than $2 \cdot \text{OPT}$ and borrows from the bank when it is slower. An obvious per-request approach (attempting to prove $\text{cost}_{\text{MTF}}(r_i) \le 2 \cdot \text{OPT}(r_i)$) fails because MTF may occasionally be much slower than OPT on individual requests.
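One standard choice of potential, given here as a sketch since the notes above do not fix one, is the number of inversions: let $\Phi_i$ be the number of pairs of keys that appear in opposite order in MTF's list and OPT's list after request $r_i$. Assume OPT only uses free forward moves, as in the problem statement. Suppose $r_i$ is at position $j$ in MTF's list and position $k$ in OPT's list, and let $v$ be the number of keys that are ahead of $r_i$ in MTF's list but behind it in OPT's list. The other $j - 1 - v$ keys ahead of $r_i$ in MTF's list are also ahead of it in OPT's list, so $j - 1 - v \le k - 1$, i.e.:

j \le k + v

Moving $r_i$ to the front removes those $v$ inversions and creates at most $k - 1$ new ones, and OPT's own forward move of $r_i$ can only remove inversions. Hence:

\Phi_i - \Phi_{i-1} \le (k - 1) - v

\text{cost}_{\text{MTF}}(r_i) + \Phi_i - \Phi_{i-1} \le (k + v) + (k - 1 - v) = 2k - 1 = 2 \cdot \text{OPT}(r_i) - 1

Summing over all $m$ requests and using $0 \le \Phi \le \binom{n}{2}$:

\text{cost}_{\text{MTF}}(S) \le 2 \cdot \text{OPT}(S) - m + \Phi_0 - \Phi_m \le 2 \cdot \text{OPT}(S) - m + \binom{n}{2}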

Multiplicative Weight Updates

Problem Statement and Experts’ Theorem

The Multiplicative Weight Updates (MWU) algorithm is highly relevant in machine learning and statistical modeling. Suppose we must make a binary decision every day, such as deciding whether to sell or buy a stock. We have $n$ experts to assist us with their advice. At the end of the day, we observe the true outcome and identify which experts’ suggestions were correct and which were mistakes. The objective is to devise a strategy that performs almost as well as the best expert in hindsight.

The Weighted Majority Algorithm

The Weighted Majority algorithm dictates the following procedure:

Initially, all $n$ experts are assigned a weight of 1, meaning $w_i^1 = 1$ for all $i \in [n]$.
At every step $t$, we hold a weight $w_i^t$ for each expert $i$. Our decision at time $t$ is to “sell” if the experts who say sell hold more than half of the total weight $\sum_i w_i^t$, and to “buy” if the experts who say buy hold more than half. If the two sides have exactly equal weight, we flip a coin.
At step $t+1$, the weights are updated based on the actual outcome at time $t$:

w_i^{t+1} = w_i^t \quad \text{if expert } i \text{ was correct}

w_i^{t+1} = (1-\epsilon)w_i^t \quad \text{if expert } i \text{ was incorrect}

Here, $\epsilon$ acts as a penalty parameter that reduces our confidence in experts who make mistakes.
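The procedure above can be sketched as follows. The encoding (`True` for sell, `False` for buy), the function name `weighted_majority`, the default `eps=0.25`, and the seeded coin are assumptions made for illustration.

```python
import random


def weighted_majority(advice, outcomes, eps=0.25, rng=None):
    """Run Weighted Majority on binary advice.

    advice[t][i] is expert i's call at step t (True = sell, False = buy);
    outcomes[t] is the correct call. Returns (algorithm mistakes,
    per-expert mistakes, final weights).
    """
    rng = rng or random.Random(0)           # fixed seed so tie-breaking is repeatable
    n = len(advice[0])
    weights = [1.0] * n                     # w_i^1 = 1 for every expert
    alg_mistakes = 0
    expert_mistakes = [0] * n
    for calls, truth in zip(advice, outcomes):
        sell = sum(w for w, c in zip(weights, calls) if c)
        buy = sum(w for w, c in zip(weights, calls) if not c)
        if sell == buy:
            decision = rng.random() < 0.5   # exact tie: flip a coin
        else:
            decision = sell > buy           # follow the heavier side
        if decision != truth:
            alg_mistakes += 1
        for i, c in enumerate(calls):
            if c != truth:                  # penalize only the experts who were wrong
                weights[i] *= 1 - eps
                expert_mistakes[i] += 1
    return alg_mistakes, expert_mistakes, weights


outcomes = [True, True, True]
advice = [[True, False, False],
          [True, False, True],
          [True, False, True]]
mistakes, per_expert, weights = weighted_majority(advice, outcomes, eps=0.5)
print(mistakes, per_expert, weights)
```

In this toy run the algorithm follows the wrong majority once, after which the two mistaken experts have lost enough weight that the always-correct expert decides the remaining steps.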

A mistake of the algorithm is a step on which its decision disagrees with the true outcome. The goal is a guarantee of the following form, for some small constant $\alpha$:

\text{\# of mistakes made by the WM Algorithm} \le \alpha \cdot (\text{\# of mistakes made by the best expert})

Performance Guarantee

Theorem: After $t$ steps, let $m^t$ be the total number of mistakes made by the Weighted Majority algorithm so far, and let $m_i^t$ be the number of mistakes made by expert $i$ so far. For $0 < \epsilon \le 1/2$, the total number of mistakes made by the algorithm satisfies the bound:

m^t \le 2(1+\epsilon)\,m_i^t + \frac{2 \ln n}{\epsilon} \quad \forall\, i \in [n]

Because this guarantee holds for all $i \in [n]$, it holds in particular when $i$ is the single best expert in hindsight. If we ignore the additive $\frac{2 \ln n}{\epsilon}$ term (which is negligible once the best expert’s mistake count is large compared with $\frac{\ln n}{\epsilon}$), the algorithm guarantees that our total errors are at most $2(1+\epsilon)$ times, or roughly twice, the errors of the best expert.
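To get a feel for the two terms in the bound, it can be evaluated for a few values of $\epsilon$. The helper name `wm_mistake_bound`, the sample numbers (100 mistakes by the best expert, 10 experts), and the check that $\epsilon \le 1/2$ (the range assumed by the standard analysis) are illustrative assumptions.

```python
import math


def wm_mistake_bound(best_expert_mistakes, n_experts, eps):
    """Right-hand side of the bound: 2(1 + eps) * m_i + 2 ln(n) / eps."""
    if not 0 < eps <= 0.5:
        raise ValueError("this sketch assumes 0 < eps <= 1/2")
    return 2 * (1 + eps) * best_expert_mistakes + 2 * math.log(n_experts) / eps


for eps in (0.5, 0.1, 0.01):
    print(eps, wm_mistake_bound(100, 10, eps))
```

A smaller $\epsilon$ brings the multiplicative factor closer to 2 but inflates the additive term, so the best choice depends on how many mistakes the best expert is expected to make.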