
Tenerife Skunkworks

Boldly going where few have gone before


The perils of benchmarking Q

Suppose we had a list of 10000000 phone numbers and wanted to take just the first 8 digits of each. 

We could create the list like this: 

q)l:10000000 10#99?"0123456789" 

Here, 99?... creates a vector of 99 random characters from the set [0-9]. 

10000000 10#... then reshapes that vector into 10000000 rows of 10 characters each, cycling through the 99 characters as often as needed, so that each row is one generated "phone number". 
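
To make the reshape easier to see, here is a small sketch of the same construction at a size that fits on screen. The name small is hypothetical and only used for illustration; the first line uses a fixed string so that the cycling is visible.

```q
q)3 2#"abc"                    / 3 rows of 2 characters, cycling through "abc"
"ab"
"ca"
"bc"
q)small:5 10#99?"0123456789"   / hypothetical: same recipe as l, but only 5 rows
q)count small                  / number of "phone numbers"
5
q)count each small             / every row holds 10 characters
10 10 10 10 10
```

Since only 99 random characters are generated, the rows of the full list repeat after a while; that does not matter for timing purposes.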

A simple 8#l will just give us the first 8 elements of the list, that is, the first 8 phone numbers, which is not what we want. 

q)f1:{x@\:til 8} 
q)f2:{8#/:x} 
q)f3:{8#'x} 

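The above 3 solutions will give us the first 8 digits of every phone number in the list: f1 indexes each number with til 8 (each-left), f2 takes 8 from each number (each-right) and f3 does the same with each-both. 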
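
As a quick sanity check, the three functions can be compared on a tiny list before timing them. This is only a sketch: the sample list s is made up for illustration, and the function definitions are repeated so the snippet stands on its own.

```q
q)f1:{x@\:til 8}
q)f2:{8#/:x}
q)f3:{8#'x}
q)s:("0123456789";"9876543210";"5551234567")   / hypothetical sample numbers
q)2#s                / plain take returns whole numbers, not digits
"0123456789"
"9876543210"
q)f1 s               / first 8 digits of every number
"01234567"
"98765432"
"55512345"
q)f1[s]~f2 s         / f1 and f2 agree
1b
q)f2[s]~f3 s         / f2 and f3 agree
1b
```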

Which solution is the fastest one, though? 

The list is deliberately very long so that the running times are large enough to measure. 

q)\t f1 l 
906 

\t here gives us execution time for f1 in milliseconds. We proceed to time f2 and f3:

q)\t f2 l 
738 
q)\t f3 l 
738 

Looks like f1 is much slower, but is it really? Let's run the benchmark several times... 

q)\t f1 l 
624 
q)\t f2 l 
735
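
The two runs above disagree about which function is faster, so a single \t reading is not a reliable ranking. One way to reduce the noise, sketched here as a suggestion rather than as the method used in this post, is to time several iterations in one go with do and compare the totals. The iteration count of 10 is an arbitrary assumption.

```q
q)/ each \t below reports the total milliseconds for 10 runs (10 is arbitrary)
q)\t do[10;f1 l]
q)\t do[10;f2 l]
q)\t do[10;f3 l]
```

Dividing each total by 10 gives an average per run. Repeating the whole set a few times, in a different order, helps show whether any remaining difference is real.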

See this thread in the non-commercial q/kdb forum for an additional note from Stevan Apter.

Posted April 6, 2008
 