---
Video Title: Common Misuses of Asymptotic Notation
Description: Theory of Computation https://uvatoc.github.io/week5
11.4: Common Misuses of Asymptotic Notation - Characterizing the Running Time of a Java Program
David Evans and Nathan Brunelle
University of Virginia
---

Now, hopefully, what big O means is clear. Here's how it's often used when talking about the efficiency of programs. You might have a slide that says: here's some Java code, it computes the sum of all the numbers up to n, where n is some int that got a value somewhere, and that segment of code has running time in big O of n. Are we happy with that? You've seen things kind of like this before, maybe exactly like this.

What's the actual running time of this code? Certainly the actual clock time is going to depend on a lot of things. What this claim is trying to do is avoid all those details and capture the idea that the running time scales linearly with n, which is probably true to some degree. But how consistent is that with the mathematical notion of big O of n?

What's the longest this code could actually run for, for real, if we're talking about a concrete Java program? If you had written this code and started running it on your laptop when I put the slide up, would it already be done, for any value of n you could pick? Yeah. The int type has a maximum, and it's actually a small maximum: it's only 32 bits, so about two billion. So the real running time of this code, based on that maximum integer, is constant. It does not scale asymptotically. The maximum total running time is some constant.

The other thing I want to point out, which I forgot to say earlier, is that any time we have actual code, we could use big O, but we should be able to get a tight bound. The places where we need the loose bounds like big O and big Omega are when we're talking about problems. When we're talking about a problem, we don't necessarily know the most efficient circuit or the most efficient code for it, so bounds are the best we can do. When we actually have the code, we can say big O, but we should always be able to get a tight bound, because we've got the code: we know exactly what's going on and exactly what we're counting. There's no reason not to get a tight bound. So that's often a misuse of big O. It's not incorrect; it's just not telling us very much. It would certainly also be correct to say this code has running time in big O of n to the n to the n to the n. I guess if you gave that answer in CS 2110, you might not get full credit on the asymptotic analysis questions, but it would always be correct. So for code, you should always be able to get a tight bound.

If you think of this as actual Java code, int is 32 bits, it has some maximum value, so it's constant time: we know the maximum time it can run, and it can never run longer than that, because n is not unbounded. Well, maybe that's not quite fair. Of course that's not what someone means when they look at this code; they're thinking of what it would mean in some more abstract, mythical version of Java that doesn't have this bounded int type. Then are we okay with it? If we forget the int type and assume some unbounded integer type, are we okay with saying it has running time in theta of n?

So how long does each iteration take? Each iteration does one add of 1 and one add of i. How long does it take to add two numbers? Probably, when you analyze a loop like the one sketched below, you say: add is simple, it's constant time.
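The exact snippet from the slide isn't reproduced in this transcript, but it was presumably something like the following sketch (the class and method names here are just illustrative): a loop that adds 1 to the counter and adds i to the running sum on each iteration.

```java
// A plausible reconstruction of the snippet on the slide (illustrative only):
// sum the integers 1..n, typically labeled as running in O(n).
class SumExample {
    static int sumUpTo(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {  // one "add 1" per iteration
            sum = sum + i;              // one "add i" per iteration
        }
        return sum;
    }
}
```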
Can addition really be constant time? What would it mean if you could do addition in constant time? Instead of Java ints, suppose we have an add function that takes in two numbers and outputs their sum. If that were constant time, could you even look at the inputs? If this is constant time, you would have to know the sum without even looking at every bit of the inputs, because just looking at every bit of the input is not constant time: the time it takes to look at all the bits depends on how big the numbers are. So this cannot be constant time. To be constant time, you have to be able to produce the output without looking at the whole input, and if the size of the input can increase, and with unbounded numbers it certainly can, that's impossible. A reasonable add is going to be linear in the length of its inputs.

If add is on 32-bit ints, then add is pretty close to constant; it's certainly bounded by some constant. For integer adds, it probably does take exactly the same amount of time for any two numbers. More complicated operations like floating point do take different amounts of time, which is another way timing can leak information about what you're computing.

So we've got a problem. Either we're talking about unbounded ints, and then it's probably n squared, because our natural way of doing add is linear in... actually, no, it's not going to be n squared. What is it going to be? Here n is the value of n, and each add is linear in the length of n, so the total is going to be more like n log n. Which is probably not the correct answer, or at least not what would be deemed the correct answer for this question in most contexts. If you're assuming add is constant, your integers have to be bounded, and then the whole thing is constant time. Or add is non-constant, and you've got to actually analyze those operations carefully, and each of them is going to cost something like the log of the magnitude of n.

So the O(n) claim is correct, but only because the running time is actually constant, and that's probably not the answer that's desired when people look at this code. At least with our definitions, and if we're careful about models of computing, those are the answers we should give.
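To make the unbounded case concrete, here is a minimal sketch of the same loop written with java.math.BigInteger as the unbounded integer type (the class and method names are illustrative, not from the lecture). Each BigInteger addition has to process every bit of its operands, so one add costs time roughly proportional to their bit length, about log n here, and over n iterations that gives on the order of n log n bit operations rather than theta of n.

```java
import java.math.BigInteger;

// Sketch only: the same summation loop with an unbounded integer type.
// BigInteger.add walks over all the bits of its operands, so each addition
// costs time proportional to their bit length (roughly log n), not O(1).
// Over n iterations, the total is on the order of n log n bit operations.
class UnboundedSumExample {
    static BigInteger sumUpTo(BigInteger n) {
        BigInteger sum = BigInteger.ZERO;
        for (BigInteger i = BigInteger.ONE; i.compareTo(n) <= 0; i = i.add(BigInteger.ONE)) {
            sum = sum.add(i);   // cost grows with the number of bits in sum and i
        }
        return sum;
    }
}
```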