How Malware Works 2022

Malware needs a vulnerability in order for it to do its damage. Sometimes that vulnerability is at a human level, simply tricking people into running some software that they shouldn’t. However, much of the time, the vulnerability reflects some error in software. As such, to better understand malware, we must first take a look at how software works.

Executing Code


Programming languages are sometimes represented as a hierarchy, with the base being binary.


The hardware of computers works based on the presence or absence of electrical charges. In short, the hardware has an alphabet of only two “letters.” Commonly, this binary language is represented as 1 (there is a charge) or 0 (there is no charge). This is how the electrical components of RAM, a hard drive, and a CPU are able to communicate.

For example, to spell the word malware in binary:

m = 01101101
a = 01100001
l = 01101100
w = 01110111
a = 01100001
r = 01110010
e = 01100101
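
The table above can be reproduced with a few lines of code. The following is a minimal sketch in Python, assuming the letters are encoded as 8-bit ASCII values; the function name to_binary is made up for this illustration.

```python
def to_binary(text):
    """Return the 8-bit binary form of each character's ASCII code."""
    return [format(ord(ch), "08b") for ch in text]

# Print each letter next to its binary representation.
for ch, bits in zip("malware", to_binary("malware")):
    print(f"{ch} = {bits}")
```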

However, just because all computing devices (servers, laptops, tablets, smartphones, etc.) use the same alphabet, it doesn’t mean that binary code written for one type of device will work for another. The hardware of a device, largely the CPU, dictates specific rules for how data is processed. As a rough analogy, the Latin alphabet may be the backbone for how many cultures speak and write. However, just because someone from Boston uses the same letters as someone in Paris, it doesn’t mean that what one of them writes will be understood by the other.

Similarly, binary code of one architecture, like that of a smartphone, won’t work on a different architecture, like that of a typical laptop. Just as different cultures have different rules for how words are formed and placed into a sentence, different computing architectures have different rules for how binary data is processed. As such, software, including malware, rarely jumps between different types of devices.

Better Than Binary

Early programmers realized that, rather than trying to code in binary, it made sense to write in other languages and then have some other program translate that “source code” into binary for them. This gave rise to assembly language, which features human readable commands like “add,” “mov” (short for move) and “inc” (short for increment). A special program called an assembler converts code written in assembly language into a binary program. In simple terms, all an assembler does is read the text-based instructions and then convert them, line by line, into binary instructions.
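
To make the line-by-line idea concrete, here is a toy assembler sketched in Python. It is only an illustration: the opcode and register numbers are invented for this example and do not correspond to any real CPU's instruction encoding, and a real assembler also handles labels, addressing modes, and much more.

```python
# Hypothetical encoding tables: these byte values are invented for
# illustration and do not match any real CPU's instruction set.
OPCODES = {"mov": 0x01, "add": 0x02, "inc": 0x03}
REGISTERS = {"r0": 0x00, "r1": 0x01}

def assemble_line(line):
    """Translate one text instruction into a list of byte values."""
    mnemonic, *operands = line.replace(",", " ").split()
    encoded = [OPCODES[mnemonic]]
    for operand in operands:
        # An operand is either a known register name or an integer literal.
        if operand in REGISTERS:
            encoded.append(REGISTERS[operand])
        else:
            encoded.append(int(operand))
    return encoded

def assemble(source):
    """Assemble a program line by line, skipping blank lines."""
    program = []
    for line in source.splitlines():
        if line.strip():
            program.extend(assemble_line(line))
    return bytes(program)

source = """
mov r0, 5
inc r0
add r0, r1
"""
binary = assemble(source)
# Show the resulting program as the 1s and 0s the hardware would see.
print(" ".join(format(b, "08b") for b in binary))
```

Each text instruction maps directly to a fixed sequence of bytes, which is why an assembler can work one line at a time without needing to understand the program as a whole.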

The next evolution of programming was the advent of compiled languages, which today comprise much of the software (and malware) used on computers. Compiled languages like C and C++ use commands much closer to natural language and have various routines built in. For example, in assembly language, to print “Hello World” to the screen might take 15 lines of difficult-to-understand code. In a language like C, much of that work is summarized by printf("Hello World").

Compiled languages use a compiler to translate source code into a binary program that, in most cases, can be managed by a specific operating system. Some compilers translate source code into assembly code (and then an assembler turns it into binary). Others are able to compile binary without this intermediate step. Compilers are much more sophisticated than assemblers. Rather than translating line by line, compilers can apply logic, pull in other source files, or link to programs (often referred to as libraries) already built into an operating system. While on one hand, the features of compiled languages and their compilers have greatly expanded the complexity of computer programs, the process of compilation can also mask source code errors, giving rise to vulnerabilities in programs.

For a few reasons, most notably the use of an OS’s existing libraries, compiled programs are specific to not just an architecture but also typically an operating system. This explains why software (including malware) tends to be specific to not just a category of devices but also an operating system running on those devices. Malware written for a Windows computer generally does not cross over into a Linux one and so forth.

Of course, we used the word “tends” for a reason. There is software, such as a computer’s BIOS, that runs before an operating system loads, making such code architecture dependent but OS independent. We also have scripting, which can be both architecture and OS independent.

A script is a set of typically text-based instructions to be used by a special program (often called an interpreter). As such, a script is not a standalone program, and in the world of computer programming, lengthy debates can be launched over whether someone who writes scripted applications is in fact a “programmer” or a “coder.” What is important to know, especially when trying to understand malware, is that while an interpreter is written for a specific OS, the script itself can be universal. As such, a script-based piece of malware could be cross-platform. One of the advantages of scripting is that it takes some of the risk and difficulty out of programming. Generally, scripting is easier to learn than a compiled language, and from a security standpoint, the interpreter can prevent a serious error from crashing the machine or creating some vulnerability. Scripting is very popular for developing web applications since it is portable across different platforms. For example, a web application moving from a Windows server to a Linux one might need very little adjustment and can be readily edited.
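
As one illustration of the safety point, consider what happens when a script reads past the end of a list. The sketch below uses Python as the example interpreter; the function name read_item is made up for this illustration. Instead of silently reading whatever happens to sit in neighboring memory, which is possible in some compiled languages, the interpreter stops the operation and raises an error that the script can handle.

```python
def read_item(items, index):
    """Return items[index], or None if the index is out of range."""
    try:
        return items[index]
    except IndexError:
        # The interpreter checked the bounds and refused the access,
        # so the script can recover instead of reading stray memory.
        return None

values = [10, 20, 30]
print(read_item(values, 1))   # a valid position in the list
print(read_item(values, 10))  # far past the end of the list
```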

One language in a category by itself is Java. Like a scripted language, Java requires additional software to run. However, rather than an interpreter, Java’s additional program is a virtual machine. On desktops and laptops, this is typically delivered as part of the JRE (Java Runtime Environment). Java source code is first compiled into an intermediate form called bytecode, which the virtual machine can then compile into native instructions just before or while it runs. This boosts performance over scripted languages, yet the code remains very portable; as long as two different computers have compatible Java virtual machines, they can run the same code. While you can find Java in many places, including servers, desktops, and tablets, its dominant role these days is as the language for Android smartphone applications.

It’s important to recognize that, while in theory, scripted languages and Java are device independent, these kinds of applications use programs (an interpreter or virtual machine) that are specific to an operating system. For example, the Android virtual machine won’t run on a typical laptop or desktop, and this is one reason why Android apps, even though they are written in Java, only run on Android devices. Still, there are examples of cross-platform code contributing to malware attacks. For example, a malicious cross-platform script or Java app might communicate with an attacker’s server, identifying the OS of the victim’s computer, and then download an exploit and payload that are specific to that operating system.
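
The reason a single script can adapt to its host is that interpreters usually expose basic facts about the machine they run on. The following minimal sketch uses Python's standard platform module to show the idea; the function name describe_host is made up for this illustration, and the output will differ from one machine to the next.

```python
import platform

def describe_host():
    """Return the OS name and CPU architecture reported by the interpreter."""
    return platform.system(), platform.machine()

# The same unmodified script reports different values on different machines.
os_name, arch = describe_host()
print(f"Running on {os_name} ({arch})")
```

The script itself is identical everywhere; only the interpreter underneath it, and therefore the answer it gives, changes from one operating system to another.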

In many ways, the evolution of programming reflects the growth in computing hardware over the years. As CPUs and memory have increased in capacity, the pressure to keep code as short as possible has eased. As such, over the years we have seen different attempts to make programming easier. We have even seen what has been called “pseudo-code,” sometimes abbreviated as p-code. The premise behind p-code is to use “natural” language in place of coding shorthand. For example, rather than writing something like:

if ($x < 10) {$i++},

p-code would have you write:

if x is less than 10 increment i.
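
For comparison, here is how the same logic might look in Python, a scripting language whose syntax sits somewhere between the two styles above. This is only a sketch; the function name maybe_increment is made up for this illustration.

```python
def maybe_increment(x, i):
    """Return i + 1 when x is less than 10, otherwise return i unchanged."""
    if x < 10:
        i += 1
    return i

print(maybe_increment(3, 0))   # x is below 10, so i is incremented
print(maybe_increment(12, 0))  # x is not below 10, so i stays the same
```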

The further programming gets away from binary language, however, the more moving parts we see in turning the instructions of a program into an executable binary. The risk is that at each stage we are relying on the quality of the underlying software — not just the source code, but the compilers, interpreters, and so forth — to keep our data and devices secure.


Review Checkpoint

To test your understanding of the content presented in this assignment, choose the best response to each question.

1. Which of the following describes the purpose of a compiler? Choose only one answer below.

a. A compiler turns binary language into assembly language.

b. A compiler turns human readable code into executable binary.

Correct. Compilers create binaries from human readable code written in languages such as C or C++.

c. A compiler checks code for security vulnerabilities.

2. Higher-level programming languages are inherently more secure than lower-level ones. True or false? Choose only one answer below.

a. True

b. False

Correct. This is a false statement. The level of a programming language does not bear on its security.
