Lecture 23 (05/04/2026) - List Update Problem (3 Algorithms); Introduce Multiplicative Weight Updates

Scribes: Noman Saleemi and Ahmed Malik

Online algorithms require making decisions on the fly without knowing future update sequences in advance. To evaluate an online algorithm, its performance is compared to an optimal offline algorithm (often referred to as “God’s algorithm”) that has full knowledge of the entire sequence of updates well in advance. This comparison is quantified using a competitive ratio. Examples from previous classes include the ski rental problem, which has a competitive ratio of two, and the pizza finding problem, which has a competitive ratio of nine.

In the list update problem, an algorithm must maintain a linked list of $n$ keys, denoted as $\{1, \dots, n\}$. An online sequence of $m$ requests arrives: $r_1, r_2, \dots, r_m$, where we assume $m > n$. When a request $r_i$ arrives, the algorithm pays a cost equal to the current position of $r_i$ in the linked list because it must walk from the front to find it.

After the key is accessed, the algorithm is allowed to move $r_i$ to any earlier position in the linked list (closer to the front) for free, so that it is cheaper to reach on future requests. The overall objective is to minimize the total access cost over the entire sequence.

The most trivial algorithm simply leaves the list unchanged, never moving the requested elements.

Analysis: Let the initial linked list be $1 \rightarrow 2 \rightarrow 3 \rightarrow \cdots \rightarrow n$. A bad access sequence for this approach is repeatedly requesting the last element of the list $m$ times, i.e., $S = \{n, n, \dots, n\}$. Each request costs $n$, leading to a total cost of $mn$.

An optimal algorithm (OPT) would move the last element $n$ to the front of the list after its first access. OPT pays $n$ for the first access and $1$ for each of the remaining $m-1$ accesses, yielding a cost bounded by $n + (m-1) \le 2m$, where the inequality uses $m > n$. Comparing the two costs shows that the competitive ratio is at least:

$$\frac{mn}{2m} = \frac{n}{2} = \Omega(n)$$

Because the ratio grows linearly with $n$ instead of staying constant, this is a poor competitive ratio.
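The arithmetic above can be checked with a small simulation. This is a minimal sketch, not part of the lecture: the function names `static_list_cost` and `move_once_cost` and the values `n = 10`, `m = 1000` are chosen here purely for illustration.

```python
def static_list_cost(initial, requests):
    """Total access cost when the list is never reordered."""
    # Position of each key, counted from 1 at the front of the list.
    position = {key: i + 1 for i, key in enumerate(initial)}
    return sum(position[r] for r in requests)


def move_once_cost(n, m):
    """Cost of the offline strategy in the text: pay n once, then 1 per request."""
    return n + (m - 1)


# Illustrative values (assumptions, not from the lecture); note m > n.
n, m = 10, 1000
initial = list(range(1, n + 1))
requests = [n] * m  # the bad sequence: always ask for the last key

static_cost = static_list_cost(initial, requests)
offline_cost = move_once_cost(n, m)
ratio = static_cost / offline_cost

print(static_cost, offline_cost, round(ratio, 2))
```

With these values the static list pays $mn$ while the offline strategy pays $n + m - 1$, so the measured ratio sits above the $n/2$ lower bound derived above.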

Another approach is to maintain the list based on the current access frequency of the keys, moving more frequently requested keys toward the front of the list. Whenever a key is requested, its frequency is updated, and if its new frequency surpasses that of the preceding item, it is moved forward.

Analysis: A bad sequence for this algorithm involves accessing the first key $n$ times, the second key $n$ times, and so on, up to the $n$-th key:

$$S = \{\underbrace{1, \dots, 1}_{n \text{ times}},\ \underbrace{2, \dots, 2}_{n \text{ times}},\ \dots,\ \underbrace{n, \dots, n}_{n \text{ times}}\}$$

After half of sequence $S$ has gone by, keys $1, 2, \dots, n/2$ occupy the first half of the linked list, while keys $n/2+1, \dots, n$ are in the second half. For the remaining half of the sequence, each request targets a key in the second half of the list, costing at least $n/2$ per access. Since there are $n^2/2$ requests in the second half of $S$, the cost is at least:

$$\frac{n^2}{2} \cdot \frac{n}{2} = \frac{n^3}{4} = \Omega(n^3)$$

Is there a better algorithm for this sequence? An algorithm that simply moves a new element to the front of the list would incur a cost of at most $n$ for the first access of a key, and $1$ for each of the remaining $n-1$ accesses. The total cost would be at most $n\bigl(n + 1 \cdot (n-1)\bigr) \le 2n^2$. Therefore, $\text{OPT}(S) \le 2n^2$. The competitive ratio of the “Order by Frequency” algorithm is therefore at least:

$$\frac{n^3/4}{2n^2} = \frac{n}{8} = \Omega(n)$$
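The bad sequence can be replayed against a direct simulation of the frequency rule. This is a sketch under assumptions: the function name `frequency_count_cost` and the size `n = 20` are illustrative, and a key is moved forward only while its count is strictly larger than its predecessor's, matching the "surpasses" rule described above.

```python
def frequency_count_cost(initial, requests):
    """Total access cost of the order-by-frequency rule on a request sequence."""
    lst = list(initial)
    freq = {key: 0 for key in lst}
    total = 0
    for r in requests:
        i = lst.index(r)   # 0-indexed position of the requested key
        total += i + 1     # cost is the 1-indexed position
        freq[r] += 1
        # Move forward only while strictly more frequent than the predecessor.
        while i > 0 and freq[lst[i]] > freq[lst[i - 1]]:
            lst[i - 1], lst[i] = lst[i], lst[i - 1]
            i -= 1
    return total


# Illustrative size (an assumption, not from the lecture).
n = 20
initial = list(range(1, n + 1))
# Key 1 requested n times, then key 2 requested n times, and so on.
bad_sequence = [key for key in range(1, n + 1) for _ in range(n)]

fc_cost = frequency_count_cost(initial, bad_sequence)
lower_bound = n ** 3 / 4   # the Omega(n^3) bound from the analysis
opt_upper = 2 * n ** 2     # the upper bound on OPT(S) from the analysis

print(fc_cost, lower_bound, fc_cost / opt_upper)
```

On this sequence no key ever strictly overtakes its predecessor, so the list never changes and key $k$ costs $k$ on each of its $n$ requests.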

The “Move-to-Front” algorithm dictates that after walking to the requested element $r_i$, it is immediately moved to the front of the list. Despite its simplicity, this heuristic is highly competitive.
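A minimal sketch of the rule on a Python list follows. The function name `move_to_front` and the four-key example are illustrative choices, not taken from the lecture.

```python
def move_to_front(initial, requests):
    """Serve requests with Move-to-Front; return (total cost, final list)."""
    lst = list(initial)
    total = 0
    for r in requests:
        i = lst.index(r)           # 0-indexed position of the requested key
        total += i + 1             # pay the 1-indexed position
        lst.insert(0, lst.pop(i))  # move the accessed key to the front
    return total, lst


# Illustrative example: the costs paid are 4, 1, 3, 2.
cost, final_list = move_to_front([1, 2, 3, 4], [4, 4, 2, 4])
print(cost, final_list)  # 10 [4, 2, 1, 3]
```

Using `list.index` makes each access take time proportional to the position, which mirrors the walk from the front that the cost model charges for.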

Theorem: The Move-to-Front algorithm is 2-competitive. For all sequences $S$, the cost is bounded by:

$$\text{cost}_{\text{MTF}}(S) \le 2 \cdot \text{OPT}(S) - m + \binom{n}{2}$$

If the sequence length $m$ exceeds $\binom{n}{2}$, the term $\left(m - \binom{n}{2}\right)$ is positive, leading to:

$$\text{cost}_{\text{MTF}}(S) \le 2 \cdot \text{OPT}(S) - \left(m - \binom{n}{2}\right) < 2 \cdot \text{OPT}(S)$$

In other words, for long enough sequences the total cost of MTF is strictly less than twice the optimal cost.

Proof Concept: Proving this theorem requires the potential function method, which acts like a piggy bank: the algorithm deposits value on requests where it pays less than $2 \cdot \text{OPT}$ and withdraws from the bank on requests where it pays more. The obvious per-request approach (attempting to prove $\text{cost}_{\text{MTF}}(r_i) \le 2 \cdot \text{OPT}(r_i)$) fails because MTF may occasionally pay much more than OPT on an individual request.
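The bookkeeping behind the piggy bank can be sketched as follows. The notes do not spell out the potential, so this is an assumption: a standard choice takes $\Phi_i$ to be the number of pairs of keys that appear in opposite orders in MTF's list and OPT's list after request $r_i$, so that $0 \le \Phi_i \le \binom{n}{2}$. The symbols $\Phi_i$, $c_i$, and $a_i$ are introduced here for illustration, with $c_i = \text{cost}_{\text{MTF}}(r_i)$.

```latex
\begin{align*}
a_i &= c_i + \Phi_i - \Phi_{i-1}
  && \text{(amortized cost of request } r_i\text{)} \\
\sum_{i=1}^{m} c_i &= \sum_{i=1}^{m} a_i + \Phi_0 - \Phi_m
  && \text{(the potential terms telescope)} \\
\text{cost}_{\text{MTF}}(S) &\le \sum_{i=1}^{m} \bigl(2 \cdot \text{OPT}(r_i) - 1\bigr) + \Phi_0
  \le 2 \cdot \text{OPT}(S) - m + \binom{n}{2}
\end{align*}
```

The last line holds provided one can show the per-request amortized bound $a_i \le 2 \cdot \text{OPT}(r_i) - 1$; it then uses $\Phi_m \ge 0$ and $\Phi_0 \le \binom{n}{2}$.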

The Multiplicative Weight Updates (MWU) algorithm is highly relevant in machine learning and statistical modeling. Suppose we must make a binary decision every day, such as deciding whether to sell or buy a stock. We have $n$ experts to assist us with their advice. At the end of the day, we observe the true outcome and identify which experts’ suggestions were correct and which were mistakes. The objective is to devise a strategy that performs almost as well as the best expert in hindsight.

The Weighted Majority algorithm dictates the following procedure:

  1. Initially, all $n$ experts are assigned a weight of 1, meaning $w_i^1 = 1$ for all $i \in [n]$.
  2. At every step $t$, we possess a weight $w_i^t$ for each expert $i$. Our decision at time $t$ is to “sell” if the experts who say sell hold at least half of the total weight (i.e., if experts with weight $\ge \frac{\sum_i w_i^t}{2}$ say sell). Otherwise, we buy. If the two sides hold exactly equal weight, we break the tie by flipping a coin.
  3. At step $t+1$, the weights are updated based on the actual outcome at time $t$:

$$w_i^{t+1} = w_i^t \quad \text{if expert } i \text{ was correct}$$

$$w_i^{t+1} = (1-\epsilon)\,w_i^t \quad \text{if expert } i \text{ was incorrect}$$

Here, $\epsilon$ acts as a penalty parameter that reduces our confidence in experts who make mistakes.
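The procedure above can be sketched in a few lines of Python. Several details are assumptions made for illustration: the function name `weighted_majority`, the encoding `True` = sell and `False` = buy, the choice `eps = 0.5`, and a seeded random generator standing in for the coin flip on ties.

```python
import random


def weighted_majority(advice, outcomes, eps=0.5, seed=0):
    """Run Weighted Majority.

    advice[t][i] is expert i's prediction at step t (True = sell, an
    assumed encoding); outcomes[t] is the true outcome at step t.
    Returns (algorithm mistakes, per-expert mistakes, final weights).
    """
    rng = random.Random(seed)  # coin for exact ties (assumed tie-break)
    n = len(advice[0])
    weights = [1.0] * n        # every expert starts with weight 1
    alg_mistakes = 0
    expert_mistakes = [0] * n
    for preds, truth in zip(advice, outcomes):
        sell_weight = sum(w for w, p in zip(weights, preds) if p)
        buy_weight = sum(w for w, p in zip(weights, preds) if not p)
        if sell_weight == buy_weight:
            decision = rng.random() < 0.5  # exact tie: flip a coin
        else:
            decision = sell_weight > buy_weight
        if decision != truth:
            alg_mistakes += 1
        # Penalize every expert that was wrong at this step.
        for i, p in enumerate(preds):
            if p != truth:
                expert_mistakes[i] += 1
                weights[i] *= 1 - eps
    return alg_mistakes, expert_mistakes, weights


# Illustrative data: expert 0 is always right, expert 1 is always wrong,
# expert 2 is right only on even-numbered steps.
outcomes = [True, False, True, True, False, True]
advice = [[o, not o, o if t % 2 == 0 else not o] for t, o in enumerate(outcomes)]

alg_mistakes, expert_mistakes, weights = weighted_majority(advice, outcomes)
print(alg_mistakes, expert_mistakes, weights)
```

In this run the algorithm errs once, at the second step, when the two unreliable experts still jointly outweigh the reliable one; after that their weights have decayed enough that the reliable expert decides every vote.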

We measure the algorithm by its number of mistakes, i.e., the steps on which the decision it chose turned out to be incorrect. The goal is a guarantee of the following form, for some factor $\alpha$:

$$\text{\# of mistakes made by the WM Algorithm} \le \alpha \cdot (\text{\# of mistakes made by the best expert})$$

Theorem: After $t$ steps, let $m^t$ be the total number of mistakes made by the Weighted Majority algorithm so far, and let $m_i^t$ be the number of mistakes made by expert $i$ so far. The total number of mistakes made by the algorithm satisfies the bound:

$$m^t \le 2(1+\epsilon)\,m_i^t + \frac{2 \ln n}{\epsilon} \quad \forall\, i \in [n]$$

Because this guarantee holds for all $i \in [n]$, it holds in particular when $i$ is the single best expert. If we ignore the additive $\frac{2 \ln n}{\epsilon}$ term (which does not grow with $t$ and so becomes negligible once the best expert's mistake count is large), the algorithm guarantees that our total errors are at most $2(1+\epsilon)$ times the errors of the best expert, i.e., roughly twice for small $\epsilon$.
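To get a feel for the two terms, the right-hand side of the theorem can be evaluated numerically. This is a small sketch: the helper name `wm_mistake_bound` and the values $n = 16$, $\epsilon = 0.1$ are illustrative assumptions.

```python
import math


def wm_mistake_bound(best_expert_mistakes, n, eps):
    """Right-hand side of the theorem: 2(1 + eps) * m_i + 2 ln(n) / eps."""
    return 2 * (1 + eps) * best_expert_mistakes + 2 * math.log(n) / eps


# Illustrative values (assumptions, not from the lecture).
n, eps = 16, 0.1
additive_term = 2 * math.log(n) / eps  # independent of the number of steps

for best in (50, 5000):
    bound = wm_mistake_bound(best, n, eps)
    share = additive_term / bound      # fraction of the bound due to ln(n)/eps
    print(best, round(bound, 2), round(share, 4))
```

The additive term stays fixed while the multiplicative term grows with the best expert's mistake count, so its share of the bound shrinks. Choosing a smaller $\epsilon$ tightens the factor $2(1+\epsilon)$ but inflates $\frac{2 \ln n}{\epsilon}$, which is the trade-off the parameter controls.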