Lecture 15: Optimization
- The compiler can optimize code for us, for example to:
- Reduce the static and/or dynamic instruction count.
- The static count is the number of instructions in the executable, whereas the dynamic count is the number of instructions actually executed for a given input (e.g. the body of an if-statement may be skipped on some inputs).
- Reduce cycle count / execution time.
- We can also make targeted, intentional optimizations in our own code.
- Performance is measured in CPU cycles.
GCC Optimizations
- Constant folding
- Evaluates constant expressions at compile time so that they are not recomputed on every execution of the program.
- Common sub-expression elimination
- Compute the result of a repeated expression once and reuse it wherever it is needed.
- Strength reduction
- Replace more expensive operations, such as multiplication, with cheaper ones, such as addition.
- Extremely common: bitwise shifts in place of division and multiplication by powers of two.
- Code motion
- Moves loop-invariant calculations out of the loop so that they are not repeated on every iteration.
- Tail recursion
- Turn a recursive function whose recursive call is its last action into an iterative one. This reduces the number of stack frames.
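- Loop unrolling
- Repeat the loop body several times per iteration so that fewer iterations (and fewer loop-condition checks) are needed.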
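To make the list above concrete, here is a hedged sketch in C that writes each transformation out by hand as a "before" and "after" pair. All function names are made up for illustration, and the "after" versions only show what the compiler is free to produce. Whether GCC actually applies a given transformation depends on the optimization flags and the GCC version.

```c
#include <stddef.h>

/* Constant folding: 60 * 60 * 24 can be evaluated at compile time,
   so no multiplication needs to happen when the program runs. */
int seconds_per_day(void) {
    return 60 * 60 * 24;
}

/* Common sub-expression elimination: (a * b) appears twice. */
int cse_before(int a, int b, int c) {
    return (a * b) + c * (a * b);
}
int cse_after(int a, int b, int c) {
    int t = a * b;              /* computed once, reused below */
    return t + c * t;
}

/* Strength reduction: multiply/divide by a power of two becomes a shift. */
unsigned times8_before(unsigned x) { return x * 8; }
unsigned times8_after(unsigned x)  { return x << 3; }
unsigned div4_before(unsigned x)   { return x / 4; }
unsigned div4_after(unsigned x)    { return x >> 2; }

/* Code motion: scale * offset does not change inside the loop. */
void fill_before(int *out, size_t n, int scale, int offset) {
    for (size_t i = 0; i < n; i++)
        out[i] = (int)i + scale * offset;
}
void fill_after(int *out, size_t n, int scale, int offset) {
    int k = scale * offset;     /* hoisted out of the loop */
    for (size_t i = 0; i < n; i++)
        out[i] = (int)i + k;
}

/* Loop unrolling: handle four elements per iteration, then the leftovers. */
int sum_before(const int *a, size_t n) {
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}
int sum_after(const int *a, size_t n) {
    int s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)  /* unrolled by a factor of 4 */
        s += a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    for (; i < n; i++)          /* remaining 0 to 3 elements */
        s += a[i];
    return s;
}
```

Each pair computes the same result. Comparing the assembly produced with and without optimization (for example with gcc -S) is one way to see which of these the compiler performed on its own.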
Knowing that GCC performs these optimizations, we can intentionally design our code so that they have a large impact.
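As one possible illustration of designing code with an optimization in mind, the sketch below writes factorial three ways. The names fact_rec, fact_acc, fact_tail, and fact_iter are hypothetical. In the accumulator version the recursive call is the last thing the function does, which is the shape tail-recursion optimization looks for. Whether the compiler actually turns it into a loop depends on the flags used.

```c
/* Plain recursion: the multiplication happens after the recursive call
   returns, so every call needs its own stack frame. */
unsigned long fact_rec(unsigned n) {
    if (n <= 1)
        return 1;
    return n * fact_rec(n - 1);
}

/* Tail-recursive helper: the running product is carried in acc, and the
   recursive call is the final action of the function. */
static unsigned long fact_acc(unsigned n, unsigned long acc) {
    if (n <= 1)
        return acc;
    return fact_acc(n - 1, acc * n);
}

unsigned long fact_tail(unsigned n) {
    return fact_acc(n, 1);
}

/* The loop that the tail-recursive version is equivalent to:
   one stack frame, regardless of n. */
unsigned long fact_iter(unsigned n) {
    unsigned long acc = 1;
    while (n > 1) {
        acc *= n;
        n--;
    }
    return acc;
}
```

All three return the same values. The difference is only in how many stack frames are needed when the tail-recursive form is compiled as a loop.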
Limitations
Why not always optimize? Optimized code can be hard to debug, because the generated assembly no longer maps cleanly onto the source.
Caching
Despite the above things we can do to optimize our code, the main bottleneck for performance is actually accessing memory. Caching helps improve this.
- We are familiar with three levels of memory: registers, RAM, and hard disk (storage). In between registers and RAM, we have auxiliary memory segments (caches).
- Caching moves data that we need frequently into these cache-designated segments.
- When will the memory system cache something into a closer memory segment?
- Depends on locality:
- Temporal locality: we are accessing the same data repeatedly. Thus, store it near the CPU in a cache-designated segment.
- Spatial locality: data located near recently accessed data will likely be accessed in the near future. An example is array indexing: if we index array[i], there is a good chance that we will index array[i+1] next.
- The memory system does this automatically.
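The effect of spatial locality can be sketched with two ways of summing the same matrix. This is a minimal illustration, and the function names are hypothetical. The matrix is assumed to be stored row by row in one flat array, which is how C lays out 2D arrays. The first function visits neighbouring addresses one after another. The second jumps by a whole row between consecutive accesses, so it makes less use of data that was cached alongside the previous element.

```c
#include <stddef.h>

/* Row-by-row traversal: consecutive accesses touch adjacent memory,
   which matches the array[i], array[i+1] pattern described above. */
long sum_row_order(const int *m, size_t rows, size_t cols) {
    long s = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            s += m[r * cols + c];
    return s;
}

/* Column-by-column traversal: consecutive accesses are cols elements
   apart, so nearby cached data is less likely to be reused right away. */
long sum_col_order(const int *m, size_t rows, size_t cols) {
    long s = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            s += m[r * cols + c];
    return s;
}
```

Both functions return the same sum. On large matrices the row-order version is usually the faster one, although the size of the gap depends on the machine and its cache sizes. The accumulator s is also a small example of temporal locality, since it is touched on every iteration.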