Pointers | A Magic Wand to Allocate Memory in Code


Pointers are a fundamental concept in programming and a cornerstone of memory management. In this blog, we'll delve into the world of pointers, exploring their role in memory management and how they can enhance code efficiency.

What is a Pointer?

A pointer is a variable that stores the memory address of another variable. It acts as a reference to a location in memory where a value is stored. Pointers are essential in low-level programming languages like C and C++, where direct manipulation of memory is necessary.

Role in Memory Management

Pointers play a crucial role in memory management. They allow programmers to:

  1. Dynamically allocate memory at runtime.
  2. Access and manipulate data in memory directly.
  3. Implement data structures like linked lists, trees, and graphs.

Before Diving Deep Into Pointers, Let's Understand Memory Allocation

1- Static Memory Allocation

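With static memory allocation, the size and relative address of a variable are fixed at compile time, and the memory is set aside in the stack frame when the compiled code runs on the host machine. We can access the variable directly by name. (Strictly speaking, C calls a stack local like this "automatic" and reserves "static" for globals and static variables; this post uses "static" in the looser sense of "laid out at compile time".)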

All instructions for accessing and manipulating variables, including their relative (not absolute) memory addresses, are generated by the compiler during the compilation process. For each use of a variable like 'n', the compiler generates specific instructions to locate its memory address on the host machine and retrieve or modify its value. These instructions become part of the executable code and are executed during runtime. This means that every time n is used in the program, the corresponding machine instructions to access its memory location are already in place and ready to be executed.

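Nothing about where n sits within its stack frame has to be worked out at runtime, as those instructions are part of the executable code after compilation. Only the frame's base address, held in a register, can differ from one run to the next.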

Consider the following example in C:

#include <stdio.h>

int main(void){
    int n = 50;       // space for n is reserved in main's stack frame
    n = n + 10;
    printf("%i", n);  // prints 60
    return 0;
}

In this example, the offset of n within the stack frame of main is determined at compile time, and the frame itself is allocated when main is called. The instructions to access and manipulate n are generated at compile time and are ready to use.

When we compile code on one machine that will run on another, we can't always directly allocate absolute memory addresses during compilation. Here's how this typically works:

  1. During compilation, the compiler doesn't actually allocate real physical memory addresses. Instead, it works with relative addresses or offsets from certain reference points. These are often called "relocatable addresses."

  2. The actual memory allocation happens in several stages:

  • The compiler generates object files with symbolic references and relative offsets
  • The linker combines these objects and resolves symbols
  • The loader finally places the program in actual memory when it's run

For statically allocated variables, what's "static" isn't the absolute memory address, but rather:

  • The size of the allocation (known at compile time)
  • The relative position of variables in memory
  • The lifetime of the variables (the enclosing function call for a stack local like n; the entire program execution for globals and static variables)

If we take a closer look at the above code, an unoptimized build for 32-bit x86 might produce simplified assembly roughly like this:

main:
    ; Set up stack frame
    push ebp
    mov ebp, esp
    sub esp, 4           ; Allocate 4 bytes for n

    ; n = 50
    mov DWORD PTR [ebp-4], 50

    ; n = n + 10
    mov eax, DWORD PTR [ebp-4]
    add eax, 10
    mov DWORD PTR [ebp-4], eax

    ; printf("%i", n): arguments are pushed right to left
    mov eax, DWORD PTR [ebp-4]
    push eax
    push OFFSET fmt      ; fmt is an illustrative label for the "%i" string
    call printf
    add esp, 8           ; Remove both arguments from the stack

    ; Clean up and return
    mov esp, ebp
    pop ebp
    ret

In this assembly code:

  • [ebp-4] always refers to the location of n.
  • Each time n is used, the code accesses this same memory location.

Static memory allocation is faster than dynamic memory allocation: reserving the space is a single stack-pointer adjustment (the sub esp, 4 above), with no call into an allocator at runtime.

Figure: The Compilation Process: From Source Code to Machine Code

2- Dynamic Memory Allocation

Dynamic memory allocation allows programs to request memory at runtime, typically from the heap. This is crucial for managing memory efficiently, especially when the size of the data structure is unknown at compile time. The allocated memory has no variable name of its own, so we access it indirectly through pointers.

Consider the following example in C:

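#include <stdio.h>
#include <stdlib.h>

int main(void){
    int *ptr = malloc(sizeof(int));  // request space for one int on the heap
    if (ptr == NULL) return 1;       // malloc can fail
    *ptr = 10;
    printf("%i", *ptr);
    free(ptr);                       // release the heap memory
}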

In this example, ptr itself is an ordinary local variable on the stack, but the address it holds is determined at runtime and refers to memory in the heap. All instructions, including the call to malloc and the code that dereferences ptr, are still generated at compile time; what changes at runtime is the address they read from ptr.

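Here, malloc is a function that requests a block of memory (obtaining it from the operating system when needed) and returns a pointer to it, or NULL if the request fails. Memory obtained this way stays allocated until it is released with free. We'll see how pointers are used to access and manipulate this memory in the next section.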

Figure: Memory Segments: Global, Code, Stack, and Heap (drawn in the conventional way, with the heap growing downward and the stack growing upward)

How Pointers Work

Pointers are variables that store memory addresses. They are essential for dynamic memory allocation, data manipulation, and efficient program execution.

Declaration

Pointers are declared using the * symbol. For example:

int *ptr;

We can get the address of a variable using the & operator.

Consider the following example in C:

#include <stdio.h>

int main(void){
    int n = 50;
    int *ptr = &n;        // ptr holds the address of n
    printf("%i", *ptr);   // dereference: prints 50
}

In this example, ptr is a pointer to an integer variable n. The & operator is used to get the address of n, and *ptr is used to access the value stored at that address (dereference the address).

We used a pointer to an integer in the above example. However, pointers are more useful for complex data types like arrays, structures, and classes.

For example, a string in C:

#include <stdio.h>

int main(void){
    const char *s = "Hello";  // s points to the first character of the literal
    printf("%s", s);
}

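In the above code, s is a pointer to a character array. It stores the address of the first character in the string. String literals must not be modified through s, which is why declaring it as const char * is the safer habit.

The basic memory layout of the above code looks like this (the addresses are illustrative):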

Memory Address   Content
0x1000           0x2000     <-- This is where 's' is stored
...
0x2000           'H'
0x2001           'e'
0x2002           'l'
0x2003           'l'
0x2004           'o'
0x2005           '\0'

The pointer s holds only the address of the first character; functions like printf read onward from there until they reach the terminating '\0'.

Thanks to pointers, we can pass references to functions and manipulate data in place instead of copying an entire complex value, which can cause performance issues.

Let's declare a struct and see how it works:

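#include <stdio.h>

struct Person {
    const char *name;   // points to a string literal, so it is const
    int age;
};

void someFunction(struct Person *person){
    printf("%s", person->name);
}

int main(void){
    struct Person person1 = {"John", 20};

    someFunction(&person1);   // pass the address of person1, not a copy
}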

In the above code, person1 is a struct that contains a name and an age. The someFunction function takes a pointer to a Person struct as an argument. The -> operator is used to access the members of the struct.

If the function takes the whole struct as a parameter instead of a pointer, a copy of the struct is passed to the function. Within the function's scope the copy is used, and memory for it is allocated on the stack.

void someFunction(struct Person person){
    printf("%s", person.name);
}
 
someFunction(person1);

It's not good practice to pass large structs to functions by value; it's better to pass pointers to them.

That's why we pass by reference instead of by value for large and complex data types. (C always passes arguments by value; passing a pointer means the address is what gets copied.)

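Also, passing the address of a variable to a function only works if the parameter is a pointer, since we need a pointer to dereference the address and get the value. Pointers also support arithmetic: adding an integer to a pointer moves it forward by that many elements, not bytes.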

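#include <stdio.h>

int main(void){
    int arr[5] = {1, 2, 3, 4, 5};
    int *ptr = arr;       // arr decays to a pointer to its first element
    ptr = ptr + 2;        // advance by two ints, not two bytes
    printf("%d", *ptr); // Output: 3
}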

Languages Using Pointers: Explicit vs Implicit

Pointers are used in various programming languages, either explicitly or implicitly. Let's explore some examples:

Explicit Pointer Usage

Languages that allow direct manipulation of pointers include:

  1. C
  2. C++
  3. Go
  4. Rust (with unsafe blocks)

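In these languages, developers work with pointers explicitly. C and C++ also allow pointer arithmetic. Go exposes pointers but does not allow arithmetic on them outside its unsafe package, and Rust requires an unsafe block to dereference a raw pointer.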

Implicit Pointer Usage

Languages that use pointers behind the scenes without exposing them directly to the developer include:

  1. Java
  2. Python
  3. JavaScript
  4. C#
  5. Ruby

These languages handle memory management and references automatically, abstracting away the complexity of direct pointer manipulation.

While these languages don't expose pointers directly, they often use reference types or objects that behave similarly to pointers in terms of passing data by reference.

References