We have a program that performs a comparison of every pair of elements in a database, and then compares the result with the median value of all values. It was suggested that the program should instead make pairwise comparisons within every possible combination of subsets. How many comparisons would we end up making?
For this example, each element in our database will be represented by an integer. We call these elements the population which is then partitioned into all possible distinct pairwise disjoint subsets . Each is called a block of and is distinct; the program should not enumerate the same block multiple times. As an extra complication, each pairwise comparison will need to be also made against the block median. So for each block there are two operations performed:
- Take the median value of each
- Take the euclidean distance between the median and every element .
The first operation need only be performed once per block and the second will need to be performed exactly times, as each block is also a subset and cannot contain duplicates.
The number of ways can the population be divided into disjoint non-empty subsets is known as a Stirling number of the second kind, or simply a Stirling partition number. (see [1]) This is defined as:
So when is divided with partitions where is the cardinality of the block it follows that the program will need to obtain median values and then perform pairwise comparisons:[sterling]
which is the number of ways a set with n elements can be partitioned into disjoint, non-empty subsets. the number of ways a set with elements can be partitioned into disjoint, non-empty subsets. As the partition changes, all of needs to be recalculated as does the score of the partition . For every different set of partitions we perform operations plus operations as it is possible to have anywhere from 0 to partitions in the set.
Then the total number of operations over all possible partitions is: Where is the nth Bell number. We can now use this formula to calculate the exact number of comparisons made during the execution of the program
Further Reading
- Stirling numbers https://oeis.org/A008277
- Bell numbers https://oeis.org/A008277