First published: 1st January 1995
By Allan G. Dyer M.Sc.(tech) B.Sc. AIDPM MIAP MHKCS, Head of F-PROT Technical Support, Yui Kee Co. Ltd.
One of the earliest techniques used to detect viruses was signature scanning: searching programs for fragments of known viruses. But what if the virus changes itself each time it infects a file? Choosing a search string then becomes impossible. It may seem incredible that a program could change itself entirely yet still work in the same way.
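Signature scanning as described above can be sketched in a few lines. This is a minimal illustration, not the method of any particular scanner: the signature names and byte values below are invented for the example, and a real product would hold thousands of carefully chosen strings.

```python
# Hypothetical signature database: name -> byte fragment taken from a known sample.
# Both the names and the byte values are invented for illustration only.
SIGNATURES = {
    "Example.A": bytes.fromhex("deadbeef0102"),
    "Example.B": b"HARMLESS-DEMO-MARKER",
}


def scan(data: bytes) -> list:
    """Return the names of all signatures found anywhere in data."""
    return sorted(name for name, sig in SIGNATURES.items() if sig in data)


clean = b"\x90" * 32
infected = clean + SIGNATURES["Example.A"] + clean

print(scan(clean))     # []
print(scan(infected))  # ['Example.A']
```

The scanner only succeeds if the chosen fragment appears unchanged in every copy, which is exactly the assumption a self-modifying virus breaks.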
In fact, many methods exist for changing a program without changing its function. Many instructions or groups of instructions have equivalents that do the same thing; for example, XOR AX,AX sets AX to zero just as MOV AX,0 does (the two differ only in their effect on the flags). Alternatively, the same operations could be performed using a different set of registers. A third method is to separate functional code by variable lengths of 'junk' code: instructions that do not affect the operations being performed.
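The three methods above can be checked on a toy model. The sketch below is a deliberately tiny, made-up register machine written for this illustration (it is not an 8086 emulator and ignores flags); it only shows that three differently written instruction sequences leave the same result.

```python
def run(program):
    """Execute (op, dest, src) tuples on a toy two-register machine."""
    regs = {"AX": 0xFFFF, "BX": 0xFFFF}  # arbitrary non-zero starting contents
    for op, dest, src in program:
        # src is either a register name or an immediate value
        value = regs[src] if src in regs else src
        if op == "MOV":
            regs[dest] = value
        elif op == "XOR":
            regs[dest] ^= value
        elif op == "ADD":
            regs[dest] = (regs[dest] + value) & 0xFFFF
        elif op == "NOP":
            pass  # junk: changes nothing
        else:
            raise ValueError(f"unknown op {op}")
    return regs


# Same task (produce the value 5) written three ways.
variant_a = [("MOV", "AX", 0), ("ADD", "AX", 5)]
variant_b = [("XOR", "AX", "AX"), ("NOP", None, None), ("ADD", "AX", 5)]  # equivalent instruction + junk
variant_c = [("XOR", "BX", "BX"), ("ADD", "BX", 5)]                       # different register

print(run(variant_a)["AX"], run(variant_b)["AX"], run(variant_c)["BX"])  # 5 5 5
```

No two of the three listings are identical, so a search string taken from one would not match the others, yet each computes the same value.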
The most important method of polymorphism is encryption. The main part of the code is encrypted using a variable key, and only the decryption routine is left unencrypted (Fig. 1). Naturally, the decryption routine could itself be used to detect the virus, so often one of several decryption routines is chosen, and the other polymorphic methods are used to further obscure it. Thus, two copies of the same virus may not have even a single byte in common.
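The effect of a variable key on a search string can be shown with ordinary data. The sketch below assumes the simplest possible scheme, a single-byte XOR, and uses a harmless text string as a stand-in for the constant body; the keys and the string are invented for illustration, and real engines use more varied schemes.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key; applying it twice restores data."""
    return bytes(b ^ key for b in data)


# Stand-in for the constant body: harmless text, not executable code.
BODY = b"HARMLESS-DEMO-BODY: the same content in every copy"
SIGNATURE = b"DEMO-BODY"  # a search string taken from the unencrypted body

copy_a = xor_bytes(BODY, 0x5A)
copy_b = xor_bytes(BODY, 0xC3)

# Count bytes that coincide at the same offset in the two encrypted copies.
same = sum(x == y for x, y in zip(copy_a, copy_b))

print(SIGNATURE in BODY)                          # True
print(SIGNATURE in copy_a, SIGNATURE in copy_b)   # False False
print(same)                                       # 0
print(xor_bytes(copy_a, 0x5A) == BODY)            # True
```

With different keys, the two copies differ at every offset, so the only constant material left for a scanner is the short routine that reverses the encryption.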
The first polymorphic viruses, the V2P? series, were written in the United States, sending a shockwave through the anti-virus industry as scanner manufacturers strove to match this new technical development. Writing a polymorphic virus, however, is a lot more complex than writing a simple virus, so they remained rare until 1991, when the Dark Avenger published the Mutation Engine, or MtE. Rather than being a virus itself, MtE is a polymorphic generator which can be linked to any virus to make it polymorphic. This immediately put the capability to produce polymorphic viruses into the hands of quite mediocre virus writers. At least 33 different viruses are known to use MtE.
In 1992 and 1993, the major virus-writing groups produced their own polymorphic generators. A Dutch member of the TridenT virus group wrote TPE (TridenT Polymorphic Engine), the American group NuKE wrote NED (NuKE Encryption Device), and the Canadian group Phalcon/SKISM produced DAME (Dark Angel's Multiple Encryptor). Towards the end of 1993, the first polymorphic generator from Taiwan was published: DSME (Dark Slayer Mutation Engine), with documentation in English and Chinese.
At the beginning of 1994, an American virus writer called MnemoniX presented a new generator called MutaGen. There are now four different versions of MutaGen. In March 1994, a second generator from Taiwan was released, GPE (Guns'n'Roses Polymorphic Engine). In the documentation, the author of GPE, Slash Wu, prohibits the use of the generator in viruses and other malicious software. He claims that he developed GPE solely for the purpose of protecting data and programs from unauthorised users. In April 1994, Dark Slayer produced a more complex version of his engine, calling it DSCE (Dark Slayer Confusion Engine).
1994 also saw the first polymorphic generators developed in Hong Kong. Some members of the Jerusalem.Vtech family use VTME (Vtech Mutation Engine). A Hong Kong virus writer who was producing very simple viruses at the beginning of the year has now produced several versions of a simple polymorphic generator, CLME (Crazy Lord Mutation Engine). Most recently, a polymorphic virus by another Hong Kong writer has appeared. It is based on the Jerusalem virus and will probably be called Jerusalem.J-virus.
Detecting a polymorphic virus is more difficult than detecting a simple virus (some anti-virus products still could not detect MtE viruses more than a year after MtE was published), but several methods exist. Algorithmic methods search for common structures in the decryption routine. Checksumming, which detects any change to a program rather than looking for a particular virus, is a very good method for detecting any virus. Decryption-based detection uses generic decryption methods, followed by a string-based search on the decrypted code.
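Two of these methods can be sketched on harmless data. The example below is a simplification under stated assumptions: the "generic decryption" is a brute-force trial of every single-byte XOR key, used here only as a stand-in and not as the technique of any particular product, and the checksum is a SHA-256 hash from Python's standard `hashlib`. The signature, host bytes and key are invented for illustration.

```python
import hashlib
from typing import Optional

SIGNATURE = b"DEMO-BODY"  # hypothetical fragment of the constant, decrypted body


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key."""
    return bytes(b ^ key for b in data)


def scan_after_decryption(data: bytes) -> Optional[int]:
    """Try every single-byte XOR key; return the first key exposing SIGNATURE."""
    for key in range(256):
        if SIGNATURE in xor_bytes(data, key):
            return key
    return None


def checksum(data: bytes) -> str:
    """Fingerprint of a program's contents, recorded while it is known to be clean."""
    return hashlib.sha256(data).hexdigest()


host = b"original program bytes"
baseline = checksum(host)

# Harmless stand-in for an encrypted body appended to the host.
modified = host + xor_bytes(b"HARMLESS-DEMO-BODY", 0x5A)

print(SIGNATURE in modified)            # False: a plain string search misses it
print(scan_after_decryption(modified))  # 90 (0x5A): found once decrypted
print(scan_after_decryption(host))      # None
print(checksum(modified) != baseline)   # True: the change itself is detected
```

The checksum comparison needs no knowledge of the virus at all, but it does require a baseline taken before the program was altered.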
When the Mutation Engine was first released, the possibility of even novice virus writers producing difficult-to-detect polymorphic viruses caused great concern. The major drawback of a polymorphic generator, however, is that once an anti-virus program is able to recognise a particular generator, it is usually able to detect all viruses hidden by it.