Introduction
Because of today's high processor speeds, and the large amount of available memory, many programmers seem to ignore the limitations on these resources in applications. This can often lead to scalability and performance problems that can often be solved with very little difficulty. This paper discusses common pitfalls and problems when creating applications in .NET. We will cover some nomenclature first, then present several common issues that we have come across in a wide variety of applications. We will then describe some common ways of solving these problems.
Beware of optimizing needlessly. You are better off writing readable code, then focusing your efforts on optimizing only those parts that need it... this is just as true with Just In Time (JIT) compiled platforms, such as .NET as with native code.
If we overoptimize our code, we risk limiting the ability of the compiler to perform compiler optimizations, such as loop interchange, 1 thereby leading to a decrease in performance, as the compiler may select different optimizations depending on platform.
In my experience, it is best to ensure you have a scalable architecture, and leave optimization for a subsequent stage in which you can identify, measure, and fix those performance hotspots.
A big Oh
You may have come across big O notation before. It is used to describe how an algorithm employs resources (time and memory) depending on the size of its input. If a function has a time complexity of O(n 2 ), then we imply that after n is sufficiently large, the amount of time spent in the function grows in proportion to the square of the size of the input... but when n is small, this is not necessarily the case.
Let's have a look at the following function:
1: static void SomeFunction(int[] array)
2: {
3: Thread.Sleep(100000)_
4:
5: for (int i = 0_ i < array.Length_ i++)
6: {
7: for (int j = 0_ j < array.Length_ j++)
8: {
9: Thread.Sleep(10)_
10: }
11: }
12: }
This function has a time complexity of O(n 2 ). If we pass in an array of 1 element, then the function will spend longest in the sleep on line 3 of our function and this is still true if we pass a 100 element array. However, if we pass in a 1,000 element array, the same amount of time is spent in line 9 as line 3.
As the number of elements in the array grows beyond 1,000, the sleep on line 9 becomes more and more dominant and the sleep on line 3 loses its influence on the run time of the function.
The table below outlines several common function complexities, illustrating how they grow with the size of the input.
I was going to draw a graph of this table, but you would only have been able to see the lines of O(n 3 ) or O(n 2 ), as they grow so much more rapidly than the others. So far, we have been talking about time complexities, but it is important to remember that the same can also be true of memory usage.
You will often find that, to decrease a function's timecomplexity, you need to increase its memory complexity. Can you come up with a function with an O(1) time complexity (i.e. the function's execution time does not vary with the size of the input), and a function with an O(1) memory complexity, to tell if a number under 10,000 is prime or not? What are the respective memory and time complexities?
Get real
So all this complexity malarkey sounds good... but how do I use it in practice? My function is 5,000 lines of code, calls other functions and has 20 parameters – there is no way I can figure out its time/memory complexity!
The truth is, you don't need to figure it out by hand. The way that I do it is to use a profiler and run the application with one set of inputs, then rank the functions in order of their execution times. ANTS Profiler™ is my personal favorite, as it is simple to use, and it supports linelevel timings.