Contents |
---|

Introduction |

Perfect Antivirus |

Perfect Compressor |

# Introduction

This article is my attempt to collect and list interesting “proofs” that demonstrate why certain types of “perfect” software tools are impossible to create.

I plan to add more when I can recollect them or if I come across more such demonstrations.

For each of the tools, I first define what is meant by a perfect such tool and then demonstrate why it is not possible to create it. Wherever possible, I have tried to give credit to the author or forum where I first came across the proof.

# Perfect Antivirus

**Definition:** A perfect antivirus would be a program that can
reliably detect all computer virii and can remove them from
infected programs - or can at least quarantine infected programs.

**Notes:** We will show that it is not possible in the first
place to write a program that can detect all computer virii with
100% accuracy. I came across this proof in a book by Fred Cohen,
the pioneer researcher on computer virii.

**Proof:** Let us suppose that there indeed exists a program
that can tell us with 100% accuracy whether a given program is
a virus or not. It is easy to see that we can write this program
as a procedure (or write a wrapper procedure around the program)
that returns `true` if a given program is a virus, `false`
otherwise. Let us name this procedure `isVirus( )`.

Now we can write a program that uses this procedure *on itself*
in the following manner:

if( isVirus( self)) { /* Behave like a normal program */ } else { /* Behave like a virus */ }

It can easily be seen that our program always behaves exactly opposite
to what the virus detector says is its behaviour. So the virus
detector is *always wrong* about our program and is not 100%
accurate after all!

Thus there can never be a program that can always correctly determine if a given program is a virus or not.

# Perfect Compressor

**Definition:** A perfect compressor would be a program that can
always compress any given data file to a file with a smaller size,
such that a corresponding decompressor will always be able to
decompress the compressed file to exactly the same data that was
present in the original file *without using any extra information*.

**Notes:** We will use the “pigeon hole” principle to
demonstrate the impossibility of such a tool. I do not know who
came up with this proof first, but it is seen frequently on the
`comp.compression` newsgroup.

**Proof:** A compressor will have to compress a given
sequence of `n` bytes to a sequence of *at most*
`(n-1)` bytes to be called a “compressor” and
will have to always do it for *any* such sequence to be
even considered for being called “perfect”.

Let us suppose that such a tool exists.

Now each byte can store 256 different values, so there are
256^{n} possible input sequences of `n` bytes. In
other words, there are

^{n}+ 256

^{(n-1)}+ 256

^{(n-2)}+ ... + 256

^{1}

possible non-empty input sequences of `n` bytes or less for
our compressor.

Since the compressor can output sequences of only `(n-1)`
bytes or less, it can only output

^{(n-1)}+ 256

^{(n-2)}+ ... + 256

^{1}

possible non-empty sequences of bytes of compressed data.

Clearly there just isn't enough room to fit the compressed output of the possible set of input data - we have more pigeons than we have pigeon holes to put them in.

In other words, it is just not possible to build a perfect compressor as defined here.

Note that we didn't even have to consider the additional restriction
that `n`, `(n-1)`, etc. bytes have to be respectively
compressed to `(n-1)`, `(n-2)`, etc. or less bytes.