Expressions

Programs are essentially sequences of instructions that operate on data. To write a program, it is essential to represent data, define operations that transform the data, and allocate storage for intermediate data during execution.

In general programming, an expression is a non-empty sequence of literals, variables, operators, and function calls that calculates a value. The process of executing an expression is called evaluation, and the resulting value produced is called the result of the expression (also sometimes called the return value).

In C++, expressions represent data. An expression is a piece of code that has both a type and a value:

The type indicates the kind of data the expression represents.
The value is the actual data.

Expressions in C++ can range from simple entities like numbers, characters, or boolean values, to complex objects like students, cars, or files. Let’s examine an example from a “Hello, World!” program:

std::cout << "Hello, World!" << std::endl;

This line contains several expressions. Let’s focus on three key ones:

std::cout: an object provided by the C++ standard library, used to display information in the terminal (console).
- Type: A kind of output stream (specifically, std::ostream). For now, think of it as a tool for printing.
- Value: The stream pointing to the terminal, enabling output to the console.
"Hello, World!": a sequence of characters known as a string literal.
- Type: An array of characters (specifically, const char[14]: the 13 visible characters plus a terminating null character).
- Value: The specific string literal “Hello, World!”.
std::endl: a predefined manipulator that ends the current line of output and flushes the stream, so the terminal moves the cursor to the next line.
- Type: A function that operates on output streams. For now, think of it as a special end-of-line marker.
- Value: Represents the end of a line.

Note that these expressions are combined using the << operator to form a larger expression, which produces the desired output. We’ll discuss operators a little later.

Memory

All expressions in a program have both types and values. For a computer to work with these values, it must represent them internally. Modern digital computers rely on the binary numeric system to store and process data. In fact, all data in a computer program is ultimately stored in numeric form.

Binary and Decimal Systems

The binary system operates similarly to the decimal system (the system humans use for math and counting), but with one key difference: the base.

Decimal System (Base 10): Each symbol (or “digit”) can take one of 10 possible values: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. By combining these symbols into sequences (e.g., 123), we can represent arbitrarily large values.
Binary System (Base 2): Each symbol (called a “bit”) can take only one of two possible values: {0, 1}. Just like in decimal, combining these bits into sequences allows the representation of arbitrarily large values.

If you’re wondering how two symbols can represent arbitrarily large values, consider this.

A tally mark system (a unary numeric system with only one symbol) can represent any value by simply adding more marks. Similarly, having two symbols in binary provides even more flexibility to represent larger numbers with shorter sequences.

For example:

0 (binary) = 0 (decimal)
1 (binary) = 1 (decimal)
10 (binary) = 2 (decimal)
11 (binary) = 3 (decimal)
100 (binary) = 4 (decimal)
101 (binary) = 5 (decimal)
110 (binary) = 6 (decimal)
111 (binary) = 7 (decimal)
1000 (binary) = 8 (decimal)
1001 (binary) = 9 (decimal)

Numbers are represented in binary using place values:

Decimal places: 1’s, 10’s, 100’s, …
Binary places: 1’s, 2’s, 4’s, 8’s, …

To convert a binary number to a decimal number, multiply each bit by 2 raised to the power of its position, counting positions from 0 on the right. Add the results to get the decimal equivalent. For example, the binary number 10010 equals 18 in decimal (16 + 2) because it has:

A 1 in the 16’s place, which is position 4 (2 raised to the power of 4)
A 1 in the 2’s place, which is position 1 (2 raised to the power of 1)

Bytes and Storage

All data in a computer is stored as bytes, where a byte consists of 8 bits. Even complex data types like streams and strings are stored as sequences of bytes.

For example, the ASCII table maps characters to numeric values, allowing characters to be represented in binary. The letter ‘A’ is represented by the number 65 in ASCII, which is stored in binary as 01000001. This allows computers to encode and process characters as binary data.

Data can be stored in:

Disk Storage: Includes devices like hard drives, SSDs, USB drives, and floppy disks. This is where permanent files are stored. Disk storage is slower than RAM and has two key characteristics:
- Persistence: Data remains intact even when the computer is turned off.
- Large Capacity: Designed to store large amounts of data, such as files, videos, and programs.
Random Access Memory (RAM): Much faster than disk storage but is volatile, meaning data is lost when the computer is turned off. It is used for temporary storage of data and instructions that the CPU needs while running a program.

Expression Types

The type of an expression determines:

Storage requirements: how many bytes the data occupies in memory.
Interpretation rules: how the stored bytes are interpreted (e.g., as numbers, characters, etc.).

Primitive Data Types

The types of expressions we’ve encountered so far have been somewhat complex, such as entire streams or sequences of characters. In this lecture, we will focus on simpler types of expressions—those that are built directly into the C++ language. These are known as primitive data types.

Common Primitive Data Types in C++

C++ provides a variety of primitive data types, but for this course, we will concentrate on the most commonly used ones:

int: Represents whole numbers, also known as integers (e.g., 42, -7).
float: Stands for “floating-point number” and is used to represent non-whole numbers. A float can also represent whole numbers but is primarily designed for decimals (e.g., 3.14).
double: Short for “double-precision floating-point number,” it offers greater precision than float.
bool: Stands for “boolean” and represents truth values—true or false.
char: Represents a single character, such as ‘a’, ‘A’, ‘!’, or even a digit like ‘1’. It is important to note that the character ‘1’ is different from the integer 1.

There are other primitive types in C++ that we’ll discuss later in the course. Additionally, we care about strings, but strings are not technically primitive data types in C++.

Integral and Floating-Point Types

Integral types: These include any type that represents whole numbers, such as int and even char.
Floating-point types: These include types that represent numbers with a decimal point, such as float and double.

Memory Allocation for Primitive Types

Each primitive data type has a fixed size on a given system, but the C++ standard leaves most exact sizes to the compiler and platform, so they can differ between systems. There are common patterns:

int: Typically requires 4 bytes, which means it can represent values ranging from -2,147,483,648 (-2^31) to 2,147,483,647 (2^31 - 1). While this range is sufficient in most cases, larger types like long int and long long int are available for situations that require larger numbers.
char: Always requires 1 byte, as mandated by the C++ standard. Of the types listed here, it is the only one whose size the standard fixes exactly.
float: Typically requires 4 bytes. This allows for the representation of decimal values but with less precision compared to double.
double: Typically requires 8 bytes, which is twice the size of a float. This allows it to store larger values and more precise decimals.
bool: Usually requires 1 byte, but this can vary depending on the system.

C++ expressions of different types may share the same binary representation. For example, the integer 65 and the character ‘A’ are both stored as the bit pattern 01000001 (ignoring the additional zero bytes of the larger int), and the type determines whether that pattern is read as a number or as a character.

Literals

A literal is a hard-coded expression representing a value in a program. Examples include:

5: An int literal.
5.1: A double literal (numbers written with a decimal point default to double).
5.1f: A float literal (suffixing a number with f specifies float).
'A': A char literal (single characters must use single quotes).
true / false: The only valid bool literals.
"Hello": A string literal (strings use double quotes).

By default, C++ treats any numeric literal written with a decimal point (such as 5.1 or 5.0) as double, not float. This is because double provides higher precision, which is often desirable when working with decimal numbers.

Although strings are not primitive types in C++, they can still be constructed using literals.