Lecture 6: Pointers and Arrays
How can we effectively manage all types of memory in our programs?
Pointers
Pointers are variables like any other, and on a 64-bit system they are 8 bytes long (the size of 8 chars). Those 8 bytes hold the address of another variable or unit of memory. Since pointers are variables themselves, they have a memory address of their own, but the important part is the address they store, which points to another unit of memory. This is useful because we can manipulate larger chunks of memory (such as strings) via pointers. To motivate pointers: C is strictly pass by value, meaning a copy of the value is made whenever a variable is passed as an argument. If the value is large, copying it is expensive. Instead, we can pass the location of the variable through a pointer, which is a (potentially) smaller variable containing the hexadecimal address of the variable we are working with. Pointers are also what make heap memory allocation and manipulation possible, which we will get into later in the course.
Declaring pointers
We declare pointers using the following template:
[type of value being pointed to] *[pointer variable name] = address;
As an example:
int *xPtr = &x;
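As a minimal sketch of the template above, the following declares a pointer, reads through it, and writes through it. The function name declare_and_dereference is hypothetical and only wraps the example so it can be called.

```c
#include <stdio.h>

/* Hypothetical helper: points xPtr at a local int, writes through the
   pointer, and returns the int's final value. */
int declare_and_dereference(void) {
    int x = 7;
    int *xPtr = &x;   /* xPtr holds the address of x */

    /* *xPtr reads the value stored at that address (7 at this point) */
    printf("x lives at %p and holds %d\n", (void *)xPtr, *xPtr);

    *xPtr = 42;       /* writing through the pointer changes x itself */
    return x;
}
```

Calling declare_and_dereference() returns 42, because the assignment through xPtr modified x directly.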
Motivations for pointers
- Allows us to manipulate memory and allocate memory at a lower level (i.e. heap allocation).
- Pass around large variables by reference to avoid expensively making a copy.
- Refer to memory generically (type-agnostic).
Operators
Note: you may notice that we use * when declaring pointers. The * actually has two uses. In an expression, * dereferences a pointer on the fly; in a declaration, * denotes that the variable is a pointer, i.e. that it holds an address. To shed light on this, recall that every variable must be declared with a type (e.g. int x = 7). In this sense, pointers require their own "type" as well, which is denoted by * (e.g. char *hello = "hello"). See Nick's answer here.
Memory
You should think of memory as one big array of bytes. Each "cell" has an address associated with it and (possibly) a value.
- A pointer takes up 8 of these cells (on a 64-bit system) and is used to store the hex address of another item.
Functions with pointers
When passing around pointers, we utilize several of the operators.
- Everything in C is passed by value, which means a copy of the argument is made. Thus, any changes made to a parameter inside the function do not persist in the caller.
- To have changes persist, you must pass in a pointer to the argument variable, obtained with the address-of operator &. That is, the address of the variable's value is sent in to the function. So, to get the effect of pass by reference, use the & operator (e.g. myfunc(&myvar)). A copy of the address is sent in.
- Because of this, the formal parameter of the function must have a pointer type (e.g. myfunc(char *ch)). Furthermore, remember that once inside the function, this parameter holds an address. Thus, to read or change the value, you must dereference it using *.
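The difference between passing a value and passing its address can be sketched as follows. All three function names are hypothetical and chosen only for illustration.

```c
/* Receives a copy of the caller's int, so the caller never sees the change. */
void add_one_by_value(int n) {
    n = n + 1;
}

/* Receives a copy of the address, so dereferencing reaches the caller's int. */
void add_one_by_pointer(int *n) {
    *n = *n + 1;
}

/* Hypothetical driver showing both calls side by side. */
int demo_pass_by_pointer(void) {
    int myvar = 5;
    add_one_by_value(myvar);     /* myvar is still 5 */
    add_one_by_pointer(&myvar);  /* myvar is now 6 */
    return myvar;
}
```

Only the call that passes &myvar has a lasting effect, which is why demo_pass_by_pointer() returns 6 rather than 7.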
Double Pointers
Now we introduce double pointers which are, well, pointers to pointers.
The motivation for this is that a function might need to modify a pointer itself (i.e. change where the caller's pointer points). Since arguments are passed by value, the function needs the address of that pointer, which is a pointer to a pointer.
A good mental model for double pointers is to think of pointers as ordinary variables (which is what they are) and then think of how we can use pointers to point to those variables.
See the skipSpaces example from lecture to get a better understanding of double pointers. You really just need to follow the memory diagram here.
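The lecture's skipSpaces code is not reproduced in these notes, so the version below is only an assumed sketch of what such a function could look like: it takes a char ** so that it can move the caller's pointer past any leading spaces.

```c
#include <string.h>

/* Assumed sketch, not the lecture's actual code.
   strPtr is the address of the caller's char *, so updating *strPtr
   changes where the caller's pointer points. */
void skipSpaces(char **strPtr) {
    size_t numSpaces = strspn(*strPtr, " ");  /* count leading spaces */
    *strPtr += numSpaces;                      /* advance the caller's pointer */
}
```

If skipSpaces took a plain char * instead, it would only advance its own copy of the pointer and the caller would see no change.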
Arrays
We will talk about arrays and how they are represented in memory. Before we jump into arrays, it is helpful to understand how variables are stored in memory. See this memory diagram:
Here x is stored at location 0x1f0 in memory and x contains the value 2. So when we refer to x, we refer to 2.
Arrays are stored in memory in the same way except that the variable name refers to a contiguous block of memory (multiple cells possibly) rather than just one cell.
- Size of a declared array cannot be changed.
- Cannot reassign arrays (unlike pointers).
Arrays as parameters
When passing in an array as an argument into a function, C makes a copy of the address of the first array element and sends that in instead of the whole array.
- Note: this matters for sizeof(). If you call sizeof() on an array that was passed into a function as an argument, remember that inside the function it is actually a pointer, so sizeof() just returns 8 (the size of a pointer on a 64-bit system). Thus, pass the size of the array as an extra argument if needed.
  - You might ask why we can't just dereference the pointer and get the size that way. Recall that the pointer points to the first element of the array, so that would give the size of that one element rather than the entire array.
- Another note is that these two pointers hold the same address:
// the following hold the same address
char *ptr = &str[0];
char *ptr = &str; // should avoid this though: the type is char (*)[N], not char *
This is because an array begins at its first element, so a pointer to the whole array (second line) and a pointer to the first element (first line) hold the same address. The types differ, however, so the second form typically draws a compiler warning; prefer str or &str[0].
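The sizeof() pitfall described above can be sketched like this. The function names are hypothetical; the point is that the callee only ever sees a pointer, so the element count has to travel as a separate argument.

```c
#include <stddef.h>

/* The parameter is a pointer, so sizeof reports the pointer's size,
   not the size of the caller's array. */
size_t size_inside_function(int *nums) {
    return sizeof(nums);
}

/* Passing the element count explicitly lets the callee walk the array. */
int sum(const int *nums, size_t count) {
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += nums[i];
    }
    return total;
}

/* Hypothetical driver: where the array is declared, sizeof still sees
   the whole array, so the count can be computed there. */
int demo_sum(void) {
    int nums[] = {1, 2, 3, 4};
    size_t count = sizeof(nums) / sizeof(nums[0]);  /* total bytes / bytes per element */
    return sum(nums, count);
}
```

Computing the count at the declaration site and passing it along is the usual workaround, since that is the only place the compiler still knows the array's full size.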
Arrays of Pointers
- We can have an array of pointers (e.g. an array of strings). For an array of strings, each element is a pointer to the first character of a string; in general, the elements can point to any type.
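A small sketch of an array of strings follows. The function names are hypothetical, and terminating the array with NULL is an assumption made here so the length can be found without a separate count.

```c
#include <stddef.h>

/* Returns the first character of the string stored at position index. */
char first_letter(const char *words[], size_t index) {
    return words[index][0];   /* words[index] is a char pointer; [0] dereferences it */
}

/* Counts the strings in an array whose last element is NULL (assumed convention). */
size_t count_words(const char *words[]) {
    size_t n = 0;
    while (words[n] != NULL) {
        n++;
    }
    return n;
}
```

For example, with const char *words[] = {"apple", "banana", "cherry", NULL}, count_words(words) gives 3 and first_letter(words, 1) gives 'b'.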
Pointer Arithmetic
- We saw this already with char * strings: adding to or subtracting from a pointer changes the memory address it holds.
- This generalizes to all pointer types. However, the arithmetic works in units of the pointed-to data type's size rather than in single bytes.
That is, say we have
int *nums = ... // e.g. 0xff0
int *nums1 = nums + 1; // e.g. 0xff4
Of course this works nicely for strings since chars are 1 byte.
Additionally, we can use bracket notation:
char someLetter = str[2];
This means: advance the pointer by 2 bytes (since chars are one byte long), dereference, and get the value. In other words, str[2] is equivalent to *(str + 2).
This can be kind of tricky so make sure to re-watch the video.
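int *nums = ... // e.g. 0xff0
int *nums3 = nums + 3; // e.g. 0xffc
int diff = nums3 - nums; // 3
A formula for the value of a pointer difference, measured in elements, is:
$$\text{diff} = \frac{\text{address}_2 - \text{address}_1}{\text{sizeof(type)}}$$
Thus, above, we have $(\text{0xffc} - \text{0xff0}) / 4 = 12 / 4 = 3$.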
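To see that pointer arithmetic counts elements rather than bytes, here is a short sketch with hypothetical function names. Casting to a char pointer is used to measure the same distance in raw bytes.

```c
#include <stddef.h>

/* Number of int elements between two pointers into the same array. */
ptrdiff_t element_distance(const int *from, const int *to) {
    return to - from;
}

/* Number of raw bytes between the same two pointers. */
ptrdiff_t byte_distance(const int *from, const int *to) {
    return (const char *)to - (const char *)from;
}

/* Bracket notation is shorthand for arithmetic plus a dereference. */
int third_element(const int *nums) {
    return *(nums + 2);   /* same as nums[2] */
}
```

For an int array nums, element_distance(nums, nums + 3) is 3 while byte_distance(nums, nums + 3) is 3 * sizeof(int), which is 12 on a system with 4-byte ints.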
Const, struct, and ternary
Const
- We can use the keyword const to declare global constants in a program.
  - In C, constants cannot be changed after they are declared.
Syntax:
const int someNum = 5;
Constants can also be defined at the top of a file with the #define preprocessor directive.
- Useful for making things read-only (unchangeable).
- With a pointer to const (e.g. const char *s), we can modify the pointer but not the values it points to (this applies to double pointers too).
- const is part of the variable's type.
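The pointer rule above can be illustrated with a sketch like the following, where count_char is a hypothetical function. The pointer s is free to move along the string, but any write through it is rejected by the compiler.

```c
#include <stddef.h>

/* const char *s: the characters are read-only, but s itself may change. */
size_t count_char(const char *s, char target) {
    size_t count = 0;
    while (*s != '\0') {
        if (*s == target) {
            count++;
        }
        s++;              /* allowed: modifies the pointer */
        /* *s = 'x';         would not compile: modifies the value pointed to */
    }
    return count;
}
```

Declaring the parameter as const char * also documents to callers that the function will not alter their string.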
Structs
Syntax:
struct date { // declaring a struct type
    int month;
    int day; // members of each date structure
};
struct date today; // construct structure instances
today.month = 1;
today.day = 28;
struct date new_years_eve = {12, 31}; // shorter initializer syntax
- Use typedef to define it as a type so you don't need to put struct before every declaration.
typedef struct date {
    int month;
    int day;
} date;
date today;
today.month = 1;
today.day = 28;
- Can use pointers to structs as well (useful because passing a struct into a function copies the whole struct first).
- The -> operator is shorthand for dereferencing a struct pointer and accessing a member. Without it, we'd write something like (*d).day++, with parentheses because . binds more tightly than *. With the shorthand we can just write d->day++.
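Here is a brief sketch contrasting a struct passed by value with one passed by pointer, reusing the date type from above. The function names are hypothetical, and the increment deliberately ignores month boundaries to keep the example short.

```c
typedef struct date {
    int month;
    int day;
} date;

/* Receives a copy of the whole struct, so the caller's date is unchanged. */
void advance_day_by_value(date d) {
    d.day++;
}

/* Receives the struct's address, so the change reaches the caller.
   Simplified: does not roll over at the end of a month. */
void advance_day(date *d) {
    d->day++;       /* same as (*d).day++ */
}
```

With date today = {1, 28}, calling advance_day_by_value(today) leaves today.day at 28, while advance_day(&today) changes it to 29.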
Ternary operator
- Shorthand for using if/else to evaluate to a single value.
condition ? expressionIfTrue : expressionIfFalse
int x;
if (argc > 1) {
    x = 50;
} else {
    x = 0;
}
// is equivalent to
int x = argc > 1 ? 50 : 0;