Fast Random Strings Generation in Haskell

Posted in: fp, haskell.

Yesterday morning there was an interesting question on Haskell Cafe:

Hello anyone, I’ve written a snippet which generates a file full of random strings. When compiled with -O2 on ghc-7.6, the generation speed is about 2Mb per second which is on par with interpreted php. That’s the fact I find rather disappointing. Maybe I’ve missed something trivial? Any suggestions and explanations are welcome.

It seemed an interesting exercise and we can’t put up being on par with PHP, so I decided to dig in. The code was the following:

The first thing I thought was to switch to ByteString: since we don’t need to deal with UTF-8 encoding (the strings to generate were plain ASCII strings in the range a-z) we can avoid paying the overhead Data.Text introduces. After that, the code was a slightly faster, but nothing thrilling (about 7 seconds on my machine).

The crucial hint was gave from Gregory Collins, who suggested that System.Random was very slow. He proposed, as better alternative, the mwc-random package, and the fact Bos was the author was a guarantee. After a bit of struggling I ended up with the following code:

The are a couple of interesting points:

Not only is our code cleaner, but is also a great deal faster! On my machine generating 10000 words takes more or less half a second!

Well, a lot faster than PHP, as expected.

Loved this post? Stay update!