After putting together this HOWTO I began to understand why the words "pitfalls" and "caveats" are so often associated with benchmarking...
Comparing apples and oranges, or should I say Apples and PCs? This dispute is so old and so obvious that I won't go into any details. I doubt that the time it takes to load Word on a Mac, compared to an average Pentium, is a real measure of anything. The same goes for comparing the boot times of Linux and Windows NT, and so on. As far as possible, compare identical machines that differ by a single modification.
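The "single modification" rule can be checked mechanically before two results are compared. The following is only a sketch: the function name `differing_keys` and the two setup descriptions are hypothetical, invented for illustration, and the field values are placeholders rather than real measurements.

```python
_MISSING = object()  # sentinel for "this setup does not record that field"


def differing_keys(a, b):
    """Return the sorted field names whose values differ between two setups."""
    keys = set(a) | set(b)
    return sorted(k for k in keys if a.get(k, _MISSING) != b.get(k, _MISSING))


# Hypothetical setup descriptions, for illustration only.
before = {"cpu": "XYZ-old", "ram": "32 MB", "kernel": "2.0.0", "gcc": "2.7.2"}
after = {"cpu": "XYZ-new", "ram": "32 MB", "kernel": "2.0.0", "gcc": "2.7.2"}

changed = differing_keys(before, after)
if len(changed) == 1:
    print("fair comparison, only this changed:", changed[0])
else:
    print("not a single modification, differences:", changed)
```

If the list contains more than one entry, the two results probably should not be quoted side by side.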
A single example will illustrate the very common mistake of giving incomplete information. One often reads in comp.os.linux.hardware the following or a similar statement: "I just plugged in processor XYZ running at nnn MHz and now compiling the Linux kernel only takes i minutes" (adjust XYZ, nnn and i as required). This is irritating, because no other information is given: we don't even know the amount of RAM, the size of swap, the other tasks running simultaneously, the kernel version, the modules selected, the hard disk type, the gcc version, etc. I recommend you use the LBT Report Form, which at least provides a standard information framework.
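Some of the missing details listed above can be collected automatically and pasted next to the timing. The script below is a minimal sketch, not the LBT Report Form: the function names and the output layout are my own assumptions, and it only covers the kernel version, RAM, swap, gcc version and current load. It reads `/proc/meminfo`, so it assumes a Linux system, and it falls back to "unknown" for anything it cannot find. The modules selected and the hard disk type still have to be filled in by hand.

```python
import os
import platform
import shutil
import subprocess


def read_meminfo(path="/proc/meminfo"):
    """Return the MemTotal and SwapTotal entries as reported by the kernel."""
    wanted = {"MemTotal", "SwapTotal"}
    found = {}
    try:
        with open(path) as f:
            for line in f:
                key, _, value = line.partition(":")
                if key in wanted:
                    found[key] = value.strip()
    except OSError:
        pass  # not Linux, or /proc is not mounted
    return found


def gcc_version():
    """Return the first line of `gcc --version`, or None if gcc is not found."""
    gcc = shutil.which("gcc")
    if gcc is None:
        return None
    try:
        out = subprocess.run([gcc, "--version"], capture_output=True,
                             text=True, timeout=10)
    except (OSError, subprocess.SubprocessError):
        return None
    lines = out.stdout.splitlines()
    return lines[0] if lines else None


def collect_context():
    """Gather the basic facts that should accompany any reported timing."""
    mem = read_meminfo()
    info = {
        "kernel": platform.release(),
        "machine": platform.machine(),
        "gcc": gcc_version() or "unknown",
        "ram": mem.get("MemTotal", "unknown"),
        "swap": mem.get("SwapTotal", "unknown"),
    }
    try:
        # Load averages hint at other tasks running simultaneously.
        info["load_average"] = "%.2f %.2f %.2f" % os.getloadavg()
    except (OSError, AttributeError):
        info["load_average"] = "unknown"
    return info


if __name__ == "__main__":
    for key, value in collect_context().items():
        print(f"{key:>13}: {value}")
```

Run it immediately before the benchmark, so that the load figures describe the machine as it was during the measurement.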
A well-known processor manufacturer once published results of benchmarks produced by a special, customized version of gcc. Ethical considerations apart, those results were meaningless, since 100% of the Linux community would go on using the standard version of gcc. The same goes for proprietary hardware. Benchmarking is much more useful when it deals with off-the-shelf hardware and free (in the GNU/GPL sense) software.
We are talking Linux, right? So we should forget about benchmarks produced on other operating systems (this is a special case of the "Comparing apples and oranges" pitfall above). Likewise, if you are going to benchmark Web server performance, do not quote FPU performance and other irrelevant information. In such cases, less is more. Nor do you need to mention the age of your cat, your mood while benchmarking, etc.