Notes on Terminology Top Arrays Contents

An Introduction to Order Notation

Order notation is used to describe the run time of an algorithm.1 in a big-picture sort of way. We often use order notation to compare two algorithms or to compare a single algorithm to some benchmark. For example, we might say:

f = O(n2)

This compares algorithm f to a benchmark named n2 and can be read as "algorithm f runs no worse than the square of the input size (times some constant) for arbitrary input above a certain problem size n". From this reading, we can deduce that n refers to the problem size and that big Oh (actually Omicron) refers to the concept less than or equal to.

There are a number of letters from the Greek alphabet used to make these comparisons. They are: Θ (theta), which means EQUIVALENT TO, o (little omicron), which means LESS THAN, ω (little omega), which means GREATER THAN, O (big omicron), which means LESS THAN OR EQUIVALENT TO, and Ω (big omega), which means GREATER THAN OR EQUIVALENT TO. The adjectives less and greater are general terms whose meaning depends on the particular aspect of the program being investigated. If how fast a program runs is of interest, then less means "takes less time to complete (runs faster)". If how much space a program takes up when running is of interest, then less means "uses less space". For all these symbols, it is generally implied that the definitions hold only for the worst case and for sufficiently large input. Also, unless space or some other quantity is specified, time is assumed.

As an example, consider two functions f and g that perform the same task. Consider also data sets that cause the functions f and g to exhibit worst case behavior (i.e., take the most time or use the most space). Under these conditions, the phrase f = ω(g) can be read as f is ALWAYS GREATER THAN g in the worst case and for sufficiently large input. It can also be read as g is ALWAYS LESS THAN THAN f for sufficiently large input in the worst case. Suppose we are concerned with execution speed. If indeed f = ω(g), then we can say without a doubt that function f will run slower than function g on the same data set, provided the data set is large enough and the data set forces both f and g to exhibit their worst case behavior. That is, as the sizes of these worst case data sets grow, the ratio of g's running time to f's running time will become smaller and smaller. Eventually, the data set sizes becomes large enough that g will run faster than f in an absolute sense (on a worst case data set) and the ratio of running times will be less than 1. Moreover, this ratio will diminish towards zero as the data sets get ever larger.

Comparing algorithms against benchmarks

Here's another example. Let's say that by counting steps we see that function r takes exactly n2 steps in the worst case where n is representative of the input size. We would say that r = Θ(n2). If another function s compares to r this way: s = ω(r), then we can also say that s = ω(n2) as well. This means that the number of steps that s takes is larger (by a non-constant amount). How much larger? This is unspecified, but example step counts might be n3 or n2log n or 2n. All of these counts are ω(n2).

The common benchmarks against which programs are often compared are:


1 constant
log n logarithmic
n linear
n2 quadratic
n3 cubic
n log n log-linear
2n exponential
n! factorial

Constants and lower order terms

When using order notation, we ignore constants and lower order terms. Suppose an program f takes time 4n2 - 3n + 105. For an input size n = 4, then f takes 157 units of time to complete. Suppose a program g takes time 5n2 + 10n + 2. For an input size n = 4, g takes 122 units of time. Although g runs faster than f for small inputs and f runs faster than g for larger inputs, both f and g are Θ(n2). The reason is g is never slower than twice the run time of f. When n = 100, f takes 39,805 units of time, while g takes 51,002 units of time. When n = 10,000, f takes 3.99701 * 108 units of time, while g takes 5.00100 * 108 units of time. Regardless of how big n gets, g never takes more time than twice the time f takes. Since their run times are within a constant factor of each other (that constant is somewhat less than 2), that means that:

f = Θ(g)

We can see that this is true by ignoring constants and lower order terms. The program f starts out as:

f = Θ(4n2 -3n + 105)

Removing lower order terms yields:

f = Θ(4n2)

Ignoring constant factors yields:

f = Θ(n2)

By the same token

g = Θ(5n2 + 1n + 2)
g = Θ(5n2)
g = Θ(n2)

Since both f and g are Θ(n2), they are Θ of each other.

What order notation does not mean

Suppose we have two algorithms, f and g, and that:

f = o(g)

While f will run faster than g when the input size is large enough and the input causes both f and g to exhibit worst case behavior, it does not mean that f runs faster than g in all cases. There may be small worst case data sets for which g runs faster. There may be rather large data sets for which f exhibits worst case behavior and g does not, leading g to again have a faster running time. The function f runs faster than g ONLY in the worst case and ONLY for sufficiently large inputs. A common mistake is to make the assumption that order notation holds for all inputs; it does not.

Suppose instead that:

f = Θ(g)

Does this mean that f has exactly the same running time as g? Again, no, not really. What this does say is that f and g have running times within a constant factor of each other when worst case behavior is exhibited with sufficiently large inputs.

A pictorial view

With the following Venn diagram, the definitions of the symbols used for order notation will become more intuitive (hopefully):


f = Ω(g)

Start by placing g in the Θ region. Since f is Ω, f can be placed in either the Θ region or the ω region. Thus, f runs within a constant factor of g or slower, for sufficiently large input that forces worst case behavior.

Notes on Terminology Top Arrays Contents