
This book takes a rather difficult topic, “algorithmic complexity”, and explains it in a way that any reader with a bit of curiosity towards the algorithmic world can understand most of its contents. This is actually not a book in the traditional sense. ETH Zurich offered a public lecture series called “The Open Class – Seven Wonders of Informatics” in the fall of 2005, and this book was written based on those lectures.

To begin with, the title needs a bit of explanation. The phrase “From Knowledge to Magic” is meant to represent the transition from a deterministic algorithmic world to a randomized algorithmic world, a transition from a world of bits and bytes to a world of molecules and DNA, a transition from something that we know to be currently possible to something that appears magical. What are randomized algorithms and deterministic algorithms? In fact, what are algorithms? How does one compute using molecules? What is an NP-hard problem? These and many more such questions are answered in the book. Did you know that an instance of the Traveling Salesman Problem, which is considered NP-hard, has been solved by a DNA computer? I didn’t know this fact until I read this book. The book starts off with the basic definition of “algorithm” and takes the reader all the way to visualizing a few scenarios that could represent the world of computing 20-30 years from now, maybe even earlier.

A Short Story About the Development of Computer Science

OR

Why Computer Science Is Not a Computer Driving Licence

The chapter starts off with a question: “Why does public opinion equate the facility to use specific software packages to computer science, while most of us clearly distinguish basic research in physics from its technical application in electrical engineering?” As an answer to “What is Computer Science?”, the author writes

The main problem with answering this question is that computer science itself does not provide a clear picture to people outside the discipline. One cannot classify computer science unambiguously as a metascience such as mathematics, or as a natural science or engineering discipline. The situation would be similar to merging physics and electrical and mechanical engineering into one scientific discipline. From the point of view of software development, computer science is a form of engineering, with all features of the technical sciences, covering the design and development of computer systems and products. In contrast, the fundamentals of computer science are closely related to mathematics and the natural sciences. Computer science fundamentals play a similar role in software engineering as theoretical physics plays in electrical and mechanical engineering. It is exactly this misunderstanding of computer science fundamentals among the general public that is responsible for the bad image of computer science.

Any scientific discipline is built on certain notions. These notions are like self-evident truths and are called axioms. Take probability theory: it took more than two centuries of dabbling until it was described in a consistent mathematical language. Is probability theory watertight? Yes, in a way, because everything flows from the basic axioms; no, because we take the axioms for granted. For example, the world of quantum physics is vastly different, and maybe there is a need to relook at the entire theory. It might seem that giant scientific disciplines stand on these axioms and could topple at any time. Researchers are in fact always looking to take a crack at the axioms and other notions of a scientific discipline, because doing so might result in a better understanding of the entire subject.

It was David Hilbert who pushed the notion of cause and effect in mathematics and strove for the ideal of a watertight, perfect mathematical framework. In 1931, Kurt Gödel definitively destroyed all dreams of building such a perfect mathematics. Basically, Gödel’s work says that building mathematics as a formal language of science is an infinite process. The author writes,

The results of Gödel were responsible for the founding of computer science. Before Gödel nobody saw any reason to try and give an exact definition of the notion of a method. Such a definition was not needed, because people only presented methods for solving particular problems. The intuitive understanding of a method as an easily comprehensible description of a way of solving a problem was sufficient for this purpose. But when one wanted to prove the nonexistence of an algorithm (of a method) for solving a given problem, then one needed to know exactly (in the sense of a rigorous mathematical definition) what an algorithm is and what it is not. Proving the nonexistence of an object is impossible if the object has not been exactly specified.

Hence there was a need for formal definitions in computer science. The first formal definition of an algorithm was given by Alan Turing in 1936, and further definitions followed later. This definition drew a demarcation line between the problems that can and cannot be solved using algorithms/computers. Thanks to the exact definition of what an algorithm is, one was able to investigate the borderline between the automatically solvable and the unsolvable.

Algorithmics

OR

What Have Programming and Baking in Common?

This chapter explains the meaning of “algorithm” via a simple example: baking a cake. Since the book is based on public lectures, it is no surprise that the content goes into great detail, explaining in layman’s terms what exactly happens inside a computer when a program is executed.

Infinity Is Not Equal to Infinity,

OR

Why Infinity Is Infinitely Important in Computer Science

The chapter gives a background to the wonderful concept of infinity and gives good explanations for the following questions:

  • What does it mean for a set to be infinite?

  • How does one compare two infinite sets?

  • Are there different sizes of infinities?

  • How does one prove that the set of real numbers has a higher cardinality (is a “bigger” set) than the set of rational numbers? (see the sketch after this list)
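A minimal sketch of the kind of argument the chapter builds on, using the standard pairing between the natural numbers and the even natural numbers (the specific function is chosen here only for illustration): two infinite sets have the same size precisely when their elements can be matched one to one, and

$$
f : \mathbb{N} \to \{0, 2, 4, 6, \ldots\}, \qquad f(n) = 2n
$$

is exactly such a matching, so a proper subset of the natural numbers turns out to be just as large as the whole set. Showing that the reals outnumber the rationals then requires Cantor’s diagonal argument, a variant of which reappears in the chapter on computability.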

The point of this chapter is to acquaint the reader with the not-so-intuitive nature of infinity so that he/she can grasp the concepts around algorithmically solvable and unsolvable problems.

Limits of Computability,

OR

Why Do There Exist Tasks That Cannot Be Solved Automatically by Computers?

How many programs can ever be written? The author shows that this question can be answered by matching every program that can ever be written to a natural number. Hence the first basic principle to be grasped is that the cardinality of the set of all programs is the same as the cardinality of N. Algorithms are nothing but programs that terminate in finite time, so they form a proper subset of the set of programs.
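A minimal sketch of this counting argument, assuming we identify a program with its source text over some fixed alphabet (the two-letter alphabet below is just a stand-in): listing all texts by length, and alphabetically within each length, pairs every possible program with a unique natural number.

```python
from itertools import count, product

ALPHABET = "ab"  # toy alphabet; a real one would contain every keyboard symbol

def all_texts():
    """Yield every finite string over ALPHABET in length-lexicographic order."""
    yield ""
    for length in count(1):
        for chars in product(ALPHABET, repeat=length):
            yield "".join(chars)

# Pair each candidate program text with a natural number.
for index, text in zip(range(10), all_texts()):
    print(index, repr(text))
```

Only some of these texts are syntactically valid programs, but a subset of a countable set is still countable, which is all the argument needs.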

The chapter then introduces the first problem that no algorithm can solve. The task, called Problem(c), takes a natural number n and outputs the number c to n decimal digits after the decimal point. Since the cardinality of R is greater than that of N, and there are only |N| programs, there exists a real number c for which no such program can be written. Cantor’s diagonalization is used to exhibit a concrete number for which no algorithm exists.
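A sketch of the diagonal construction, with the particular digit rule chosen here only for illustration: enumerate all programs as P1, P2, P3, … (possible because they are countable) and define c digit by digit so that it disagrees with the output of every program,

$$
c = 0.c_1 c_2 c_3 \ldots, \qquad
c_i =
\begin{cases}
2 & \text{if the } i\text{-th decimal digit output by } P_i \text{ is } 1,\\
1 & \text{otherwise.}
\end{cases}
$$

Since c differs from the number computed by the i-th program in the i-th digit, no program solves Problem(c) for this particular c.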

Obviously, who would be interested in a task such as Problem(c)? So the author takes on a practical problem, a decision problem, that cannot be solved by any algorithm. Simply stated, a decision problem takes a natural number n as input and outputs YES if n belongs to a certain set M and NO otherwise. On the face of it, this seems a straightforward task that a simple algorithm can handle. For example, if M is the set of even numbers, it is easy to decide whether a given n falls into it; similarly, if M is the set of primes, a few lines of code are enough to check whether n is prime. The devious fact, however, is that for a suitably constructed set M no algorithm can solve the problem. The proof of this statement lies in a nifty twist on the diagonalization argument; you have to read through the proof to appreciate its magic. Thus the author exhibits the first practical problem, a decision problem, that cannot be algorithmically solved; it is denoted (N, M(DIAG)). The reduction method is then used to propagate this hardness to a number of other problems that cannot be solved algorithmically. By the end of the chapter, the reader gets a fair idea that all nontrivial semantic questions like the following are not algorithmically solvable:

  • What does a given program compute? Which problem does the program solve?

  • Does the developed program solve the given problem?

  • Does a given program halt on a given input, or does a program always halt? (see the sketch after this list)

  • Does a given program accept a given input?
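A minimal sketch of the classic self-reference argument behind the halting question, with the names halts and paradox being hypothetical and used purely for illustration: if a correct halts(program, input) existed, the program below, fed its own source code, would contradict its verdict.

```python
def halts(program_source: str, program_input: str) -> bool:
    """Hypothetical decider: True iff the program halts on the given input.
    No such function can exist; it appears here only to expose the contradiction."""
    raise NotImplementedError

def paradox(source: str) -> None:
    # Do the opposite of whatever the decider predicts about running
    # this very program on its own source code.
    if halts(source, source):
        while True:   # predicted to halt -> loop forever
            pass
    # predicted to loop forever -> halt immediately

# Feeding paradox its own source makes halts wrong either way,
# so no correct halts() can ever be written.
```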

Complexity Theory,

OR

What to Do When the Energy of the Universe Doesn’t Suffice for Performing a Computation?

As research on which problems are algorithmically solvable or not reached maturity, the next issue research scientists dealt with was: “How does one classify a problem as practically solvable or unsolvable?” First one must get some clarity on what “practically solvable” means. The chapter defines the time complexity and space complexity of an algorithm and uses these concepts to explain which kinds of algorithms are not practical. An extreme case would be to think in terms of the age of the universe, which is less than 10^18 seconds. Let’s say you have been running an algorithm on a laptop performing 10^9 instructions per second since the start of the universe. You would have let the computer do 10^18 times 10^9 instructions, i.e. 10^27 instructions. If the algorithm has time complexity n! or 2^n, then even in that entire time it can handle inputs of at most about a hundred items (far fewer in the factorial case). So any problem whose algorithms all have exponential or factorial time complexity can safely be assumed to be practically unsolvable. So, what kinds of problems can be deemed practically solvable? There is no formal proof for this, but scientists have agreed that if the time complexity of an algorithm solving a problem is of polynomial order, then the problem can be practically solved.
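A quick back-of-the-envelope check of that budget, assuming the 10^27-instruction figure above: the helper below finds the largest input size n for which 2^n or n! still fits within the budget.

```python
from math import factorial

BUDGET = 10**18 * 10**9   # seconds since the Big Bang times 10^9 instructions per second

def largest_feasible_n(cost) -> int:
    """Largest n whose instruction count cost(n) still fits in the budget."""
    n = 1
    while cost(n + 1) <= BUDGET:
        n += 1
    return n

print(largest_feasible_n(lambda n: 2**n))   # 89: 2^90 already exceeds 10^27
print(largest_feasible_n(factorial))        # 26: 27! already exceeds 10^27
```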

There are two reasons for accepting this definition. The first is practical: whenever a polynomial-running-time algorithm has been devised for a problem, people have usually gone on to find a better algorithm with a polynomial of lower degree. The second reason is theoretical: you cannot fix a specific degree of the polynomial and declare that the boundary line, since the same algorithm may run a bit faster in a different environment, a different programming language, and so on.

The main task of complexity theory is to classify concrete computing problems with respect to their degree of hardness, measured by computational complexity. So how does one go about classifying a problem as hard or not? Since one is unable to prove lower bounds on the complexity of concrete problems, an indirect method is used to argue that a problem is NP-hard. There are about 4000 problems that are considered hard in this sense, and none of them is known to be solvable efficiently by any algorithm.

Let’s say you are looking at a new problem and want to know whether it is NP-hard. The way to do it is to check whether the existence of a polynomial-time algorithm for your problem would imply a polynomial-time algorithm for one of the 4000 known hard problems. If it would, then your problem is at least as hard as that one, and hence it is NP-hard as well. It’s a convoluted argument, but a plausible one that has stood the test of time.
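A minimal sketch of one such link, chosen here as an illustration rather than taken from the book: a polynomial-time decider for the Traveling Salesman decision problem (the hypothetical tsp_decider below) would immediately yield one for the Hamiltonian Cycle problem, so TSP is at least as hard as Hamiltonian Cycle.

```python
from itertools import combinations

def hamiltonian_cycle_exists(n, edges, tsp_decider):
    """Decide Hamiltonian Cycle on an n-vertex graph with one call to a
    (hypothetical) decider for the TSP decision problem, where
    tsp_decider(n, weights, budget) answers whether some round trip
    through all n cities has total weight <= budget."""
    edge_set = {frozenset(e) for e in edges}
    # Give every graph edge weight 1 and every missing edge weight 2.
    weights = {frozenset(pair): (1 if frozenset(pair) in edge_set else 2)
               for pair in combinations(range(n), 2)}
    # A tour of total weight exactly n exists iff the graph has a Hamiltonian cycle.
    return tsp_decider(n, weights, n)
```

The transformation itself runs in polynomial time, which is what makes the argument go through.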

Now, it is conceivable that the 4000 problems classified as hard today will not stay practically out of reach forever. For example, the Traveling Salesman Problem is NP-hard, but with DNA computing it may become possible in the future to solve instances of TSP that no conventional computer could handle. So the set of problems we regard as hopeless might shrink, and some of the ones you thought were intractable may turn out to be manageable after all. Until then, the way to check whether a problem is hard is the indirect method above. This concept is not easy to grasp for a first-timer, and the author provides plenty of examples and visuals to drive the point home.

At the end of this chapter, the reader will have a fair idea of the kinds of tasks that are deemed practically unsolvable. For example, I can now understand and make sense of a statement I came across a few days ago in graph theory: “Finding a Hamiltonian cycle in a graph is an NP-hard problem.” This means that no algorithm with polynomial time complexity has been developed for it to date.

Randomness in Nature and as a Source of Efficiency in Algorithmics

The reader will feel that the word “magic” in the title is apt after reading this chapter, which introduces randomized algorithms. What are they? There are many tasks for which deterministic algorithms would take more than the age of the universe and are therefore deemed practically impossible. In all such cases, uncertainty comes to the rescue. By giving up absolute reliability of the output and allowing a small error probability, algorithms can be devised that have randomization as one of their steps. This randomization could range from merely selecting a number at random from a set to simulating a random process and using the result in the computation. It allows the task to be done in far less time than any deterministic algorithm. The chapter uses a communication protocol as its example of the power of randomized algorithms. The example makes it abundantly clear that practically any desired reliability can be obtained by repeating several randomized computations on the same problem instance. For many applications on the internet, randomized algorithms are the solution to problems that would take an insane amount of time or space with a deterministic algorithm.
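A minimal sketch in the spirit of the comparison protocol discussed in this chapter, with the prime bound and helper names chosen here for illustration: two parties check whether their huge bit strings are equal by exchanging only a short fingerprint modulo a random prime, and repeating with fresh primes drives the error probability down rapidly.

```python
import random

def is_prime(m: int) -> bool:
    """Trial-division primality test (fine for the small bound used here)."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def random_prime(upper: int) -> int:
    """Pick a prime uniformly at random from [2, upper]."""
    return random.choice([m for m in range(2, upper + 1) if is_prime(m)])

def probably_equal(x: int, y: int, n_bits: int, rounds: int = 10) -> bool:
    """Compare two n-bit numbers via residues modulo random primes <= n_bits**2.
    A NO answer is always correct; a YES answer is wrong only with small
    probability, and every extra round shrinks that probability further."""
    for _ in range(rounds):
        p = random_prime(n_bits ** 2)
        if x % p != y % p:
            return False
    return True

# Usage: the strings here are tiny, but the point is that only the short
# residues would need to be communicated, not the strings themselves.
a = int("1011010111010101", 2)
b = int("1011010111010101", 2)
print(probably_equal(a, b, n_bits=16))
```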

Cryptography,

OR

How to Transform Drawbacks into Advantages?

Cryptography is another topic that lives up to the subtitle of the book, “From Knowledge to Magic”. Secure communication between multiple parties is hard to imagine without randomized algorithms. The chapter gives the logic behind symmetric cryptosystems and then explains their limitations. It then goes on to describe the math behind public-key cryptography, the dominant mode of secure communication on the internet. At the heart of its success lies the existence of one-way functions, which are easy to compute but cannot be inverted easily. No wonder the best number theorists in the world have contributed a great deal to developments in cryptography.
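A toy illustration of the public-key idea, not taken from the book and using deliberately tiny textbook primes (real keys are hundreds of digits long): anyone can encrypt with the published pair (e, n), but decrypting without the private exponent d would require factoring n.

```python
# Toy RSA with small textbook parameters -- insecure, for illustration only.
p, q = 61, 53
n = p * q                  # 3233, published
phi = (p - 1) * (q - 1)    # 3120, kept secret
e = 17                     # public exponent
d = pow(e, -1, phi)        # 2753, private exponent: e*d = 1 (mod phi); Python 3.8+

message = 65
ciphertext = pow(message, e, n)    # encrypt with the public key  -> 2790
recovered = pow(ciphertext, d, n)  # decrypt with the private key -> 65
print(ciphertext, recovered)
```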

The last few chapters read like science fiction, but with a difference: the stories mentioned here are happening in various labs around the world. DNA computing, for instance, is the use of molecules for computation, and the book sketches a projection of how computing might be done years from now.


Massively parallel computing can be done via DNA computing. This might make all the 4000 or so hard problems known to date crumble into doable tasks. In fact, the scientist Leonard Adleman has solved a small instance of the Traveling Salesman Problem in his lab. Though many aspects of DNA computing are yet to be worked out, it would not be surprising (given how quickly the PC and internet revolutions happened, relatively speaking) to see a DNA computing device in every scientist’s lab in the near future.

Takeaway:

The book shows the beauty, depth and usefulness of the key ideas in computer science and makes a reader understand and appreciate the “science” part of “computer science”.