When then-Senator Barack Obama visited Google in 2007, one of the questions he was asked was “What is the best algorithm to sort a million different numbers?” Obama, apparently up to speed on programming jargon, promptly replied, “The bubble sort would be the wrong way to go!” While the obviously staged exchange was meant in jest, the word “algorithm” is certainly one of the defining words of our times. So, what is an algorithm? And why are algorithms useful to us?
Very roughly speaking, an algorithm is a step-by-step procedure for solving a well-defined mathematical problem, with two key features: (a) every step is elementary, and (b) the procedure always terminates.
Let us start with an easy example of an algorithm: adding two numbers. Say we want to add 175 and 285. Even elementary school students know to start at the right end. First, add 5 and 5 to make 10: the in-place digit is 0, and we carry the 1. Next, we move to the middle digits, adding 7 and 8 and the carry digit 1 to make 16: the in-place digit is 6, and we carry the 1. Finally, we add the leftmost digits 1 and 2 along with the carry digit 1 to make 4. Thus, we get the final result: 460.
While you (and most 4th graders) can perform this algorithm by hand, we can also teach these rules to a computer. The algorithm your computer uses to add two numbers is essentially the same, except for two key advantages. First, computers can do every step faster. Much faster. Second, computers do not suffer from the lapses of attention, boredom, or fatigue that cause humans to make errors, especially on long or repetitive tasks.
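Those rules, taught to a computer, might look like the following sketch in Python (the function name and structure are our own; a real machine works on binary digits, but the carrying logic is the same):

```python
def add_digits(a, b):
    """Grade-school addition: work right to left, carrying as needed.

    a and b are non-negative integers; we process their decimal digits.
    """
    digits_a = [int(d) for d in str(a)][::-1]  # least-significant digit first
    digits_b = [int(d) for d in str(b)][::-1]
    result = []
    carry = 0
    for i in range(max(len(digits_a), len(digits_b))):
        da = digits_a[i] if i < len(digits_a) else 0
        db = digits_b[i] if i < len(digits_b) else 0
        total = da + db + carry
        result.append(total % 10)   # the in-place digit
        carry = total // 10         # carry the rest to the next column
    if carry:
        result.append(carry)
    return int("".join(str(d) for d in reversed(result)))

print(add_digits(175, 285))  # → 460
```

Running it on our example reproduces the hand computation exactly: 0 with a carry, then 6 with a carry, then 4.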
This also brings us to an important point. When a computer is performing a task for you – be it adding two numbers or finding the quickest route from your home to the airport on Google Maps – there are two parts to solving the problem.
The first is a logic/math problem: given the road-map data, how does one find the quickest route? The second is how this logic, this algorithm, should be implemented on a computer. Theoretical computer science studies the first part, namely finding the best algorithm or logic to solve the problem. (The second part concerns the design of computer architecture and systems.)
From the perspective of theoretical computer science, adding two numbers or finding the quickest route on a map is easy. This is because the underlying algorithm requires relatively few steps. However, this is not the case for all problems. Take the famous traveling salesperson problem.
Suppose that our salesperson Sally lives in Washington, DC. She is given a budget of $50,000 and asked to visit the capitals of all 196 nations in the world before returning to Washington, DC. Could she do this within the budget she is allotted?
Sally has myriad potential routes. She could go east to west (maybe she should go to London then Paris and then through Europe). She could go west to east (Tokyo and then Seoul through Asia). Perhaps she should go circularly by continent, or take the cheapest flights even if it means backtracking.
Mathematically, there are more than 10^100 possible routes through 196 capital cities. That is more than a googol, the number written as 1 followed by exactly 100 zeros. Written out, 10^100 looks like this:

10,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000
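One standard way to count the routes: with the starting city fixed in Washington, DC, the remaining 195 capitals can be visited in 195! (195 factorial) different orders, and each round trip gets counted twice, once per direction. A quick check in Python confirms that this blows past a googol (the counting convention is our assumption):

```python
from math import factorial

# Distinct round trips through 196 cities with a fixed home city:
# the other 195 can be ordered in 195! ways, and reversing a route
# gives the same trip, so we divide by 2.
n = 196
tours = factorial(n - 1) // 2

googol = 10 ** 100
print(tours > googol)   # → True
print(len(str(tours)))  # number of decimal digits in the count
```

The count has hundreds of digits, so "more than 10^100" is, if anything, a dramatic understatement.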
That’s more than the number of atoms in the observable universe. Even if we replaced every atom on Earth with a computer, and all of them had been running since the Big Bang, the computation to check all the possibilities would still be running!
Unable to figure out what to do, Sally goes to her travel consultant, Travis, for help. Travis agrees but says that Sally must pay him upon completion of the job. But how will Sally know whether he has completed it? While finding a solution may be difficult, verifying the correctness of a given solution is quite easy: given a suggested itinerary, Sally can simply total the cost of the proposed route, check the cities covered, and make sure it fits her criteria.
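Sally's check can be sketched in a few lines of Python. All the names and prices below are illustrative placeholders; the point is that the check takes time proportional to the length of the itinerary, nothing like the astronomical number of possible routes:

```python
def verify_itinerary(itinerary, costs, required_cities, budget, home):
    """Check a proposed round trip: it must start and end at home,
    cover every required city, and keep its total cost within budget.

    `costs` maps (city_a, city_b) pairs to ticket prices.
    """
    if itinerary[0] != home or itinerary[-1] != home:
        return False
    if not required_cities <= set(itinerary):
        return False  # some required city was skipped
    total = sum(costs[(a, b)] for a, b in zip(itinerary, itinerary[1:]))
    return total <= budget

# Tiny made-up example with three cities:
costs = {("DC", "London"): 800, ("London", "Tokyo"): 1200, ("Tokyo", "DC"): 900}
ok = verify_itinerary(["DC", "London", "Tokyo", "DC"], costs,
                      {"DC", "London", "Tokyo"}, budget=50_000, home="DC")
print(ok)  # → True
```

So Travis cannot cheat her: any itinerary he hands over can be checked in moments, even though finding a good one may take him a very long time.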
This is a key property of the traveling salesperson problem and of many mathematically similar problems. In the parlance of computer science, such problems, whose solutions are easy to verify, form the class known as NP; whether they can also be solved quickly is the famous P versus NP question. While we currently do not know of any fast algorithm to solve them outright, there is a very fast algorithm to verify the correctness of a given solution. In mathematics, finding a proof is often significantly more difficult than verifying its correctness. (And this is the case in more than just math – just think about how much more difficult it is to create a sonnet than to appreciate one!)
Computer scientists have compiled a list of thousands of unsolved problems like this, across fields as diverse as mathematics, operations research, economics, and biology. Why does that matter? Because these problems all share the same computational conundrum: if we can find a fast algorithm for any one problem on the list, we will have found a fast algorithm for all of them. Many problems, one algorithmic solution!
This is such a huge computational conundrum that the Clay Mathematics Institute lists it among its seven Millennium Prize Problems in mathematics. The Institute is offering a million dollars for its resolution.
However, apart from the meager million dollars, this problem has seismic ramifications for nearly every branch of modern science, from physics to economics and engineering to mathematics. While we are not even close to resolving it, exploring it has led to some of the most fascinating ideas in computer science. In fact, the presumed hardness of these problems is the very foundation of modern cryptography and computer security. But that is a story for another day…