Wednesday, 19 April 2017

Bacons and Pickles!

For my last adventure in the parallel world, I decided to do something special - get all of you hungry, and also curious about The Oracle of Bacon. I happened to find this interesting git repository, which calculates the Bacon number of actors picked from the IMDb database. Yes, I know; I could not resist the temptation to mess with the code.

As the author has very well stated, this problem can be reduced to a BFS on an undirected graph, with actors as vertices and shared movies as edges. Unfortunately, the immense dataset adversely impacts performance. Putting on the profiler glasses gives us pretty much the same view: most of the time goes into loading the input data. But wait, why pickle?

cProfile output for the Python script
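The reduction the author describes can be sketched in a few lines of Python. This is my own illustration, with a made-up adjacency layout and hypothetical names, not the repository's actual code:

```python
from collections import deque

def bacon_number(graph, start, target="Kevin Bacon"):
    """BFS distance from start to target over an actor -> co-stars dict.

    Hypothetical sketch: the real repo builds the graph from IMDb data.
    """
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        actor, dist = queue.popleft()
        for costar in graph.get(actor, ()):
            if costar == target:
                return dist + 1
            if costar not in seen:
                seen.add(costar)
                queue.append((costar, dist + 1))
    return None  # no link to Kevin Bacon found

# toy graph
graph = {
    "Kevin Bacon": ["Tom Hanks"],
    "Tom Hanks": ["Kevin Bacon", "Meg Ryan"],
    "Meg Ryan": ["Tom Hanks"],
}
print(bacon_number(graph, "Meg Ryan"))  # 2
```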

The Elephant Coming into the Room

After reading almost every post containing 'pickle' on stackoverflow.com, I ventured to alter a few things.

  • Disabling the garbage collector: every freshly unpickled object invites a garbage-collection check, which can in turn slow down loading the input. [Source]
  • JSON as input: After a night's struggle, I managed to convert the pickled input files into JSON. This would obviously cause problems when a new dataset is used, but the difference in performance is worth considering.

Input files converted to JSON
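In code, those two tweaks amount to something like this. It is only a sketch; `load_pickle_fast` and `pickle_to_json` are hypothetical helper names, not functions from the repository:

```python
import gc
import json
import pickle
import tempfile

def load_pickle_fast(path):
    """Load a big pickle with the garbage collector paused."""
    gc.disable()
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    finally:
        gc.enable()  # always restore normal GC behaviour

def pickle_to_json(src, dst):
    """One-off conversion: read a pickled object, dump it as JSON."""
    with open(src, "rb") as f:
        data = pickle.load(f)
    with open(dst, "w") as f:
        json.dump(data, f)

# toy round trip on a temporary file
with tempfile.NamedTemporaryFile(suffix=".pkl", delete=False) as tf:
    pickle.dump({"Kevin Bacon": ["Tom Hanks"]}, tf)
    path = tf.name
print(load_pickle_fast(path))
```

Note that JSON only handles types that map onto its data model (dicts with string keys, lists, numbers, strings), which is exactly why a new dataset could break the conversion.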

A long-shot at parallelism

Having scored a win on memory, I asked myself whether there was any scope for parallelism. Intuitively, it seemed like a bad idea because the graph was implemented with huge in-memory dictionaries that were perplexing to deal with. Nevertheless, I gave it a try with the following approach.

  1. Given an actor, find his/her co-stars using the actor-to-movies and movies-to-actors mappings.
  2. If you are lucky enough to find Mr. Kevin Bacon among the co-stars, return 1.
  3. Else, do a parallel BFS from each co-star until you find a minimal link. (The actor's Bacon number is the smallest co-star's Bacon number, incremented by 1.)
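A minimal sketch of that approach with `multiprocessing`; the graph layout and the names `bfs_distance` and `parallel_bacon` are my own, not the repository's code:

```python
from collections import deque
from functools import partial
from multiprocessing import Pool

def bfs_distance(graph, start, target="Kevin Bacon"):
    """Serial BFS distance from start to target over an actor -> co-stars dict."""
    if start == target:
        return 0
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        actor, dist = queue.popleft()
        for costar in graph.get(actor, ()):
            if costar == target:
                return dist + 1
            if costar not in seen:
                seen.add(costar)
                queue.append((costar, dist + 1))
    return None

def parallel_bacon(graph, actor):
    """Steps 1-3 above: co-stars first, then one BFS per co-star in workers."""
    costars = [c for c in graph.get(actor, ()) if c != actor]
    if "Kevin Bacon" in costars:
        return 1
    with Pool(2) as pool:
        # the whole graph gets pickled and shipped to every worker
        dists = pool.map(partial(bfs_distance, graph), costars)
    found = [d for d in dists if d is not None]
    return min(found) + 1 if found else None

graph = {
    "Kevin Bacon": ["Tom Hanks"],
    "Tom Hanks": ["Kevin Bacon", "Meg Ryan"],
    "Meg Ryan": ["Tom Hanks"],
}

if __name__ == "__main__":
    print(parallel_bacon(graph, "Meg Ryan"))  # 2
```

The comment in the middle hints at the catch: every worker process receives its own pickled copy of the entire graph.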
Much to my disappointment, due to a number of factors (Python's GIL, the huge dictionaries being pickled and passed as arguments to worker processes, and the basic structure of the underlying code), I was not very successful. All the same, documentation of these and other attempts is available here.

Happy coding in a parallel world 😊








Monday, 17 April 2017

Just rand()-only!

Life was throwing stones at me, so I turned to the parallel world for a sweet, simple and sexy adventure. Little did I know that I would get hit by another revolting, wrecking boulder. Let me elaborate!

We have all met prime numbers in life. My adventure started off with a lucid attempt to write a code snippet that counts the prime numbers in a range. The vanilla approach involves finding factors of each number, counting primes as you iterate. But who likes vanilla? I wanted chocolate or strawberry!

There are plenty of fancy algorithms for primality testing, of which I chose Miller-Rabin. It is based on Fermat's little theorem, which states that if p is prime, then a^(p-1) ≡ 1 (mod p) for any integer a with 0 < a < p.

Me: So, all I have to do is check if that condition is satisfied for all a.
The wiser me: But wait, how is that better than my vanilla method?
Me: Then maybe I do it for just one such a.
The wiser me: But what if I am unlucky enough to select an a that satisfies the condition by coincidence?
Me: Oh! I get it, I have to choose a small number of such a and deal with the error.
The wiser me: *rolls eyes* Finally!

As you pretty much guessed, a^(something huge) is pretty knotty to compute. There are tricks such as modular exponentiation, splitting the exponent into powers of 2, etc. that our prudent mathematicians have discovered. Let us take a moment to say arigato! Exploiting all such clever techniques, I proudly came up with a working version of prime.c. It was no child's play, especially for someone who can only see, not look!

Get to the parallelism already!

Thinking broadly, I could come up with two ways to benefit from parallelism.

  • Parallelising the test for the numbers themselves
  • Given a number, parallelising the test over the trials (the a we spoke about earlier)
I set out to explore the first. After the necessary #pragma mantras, I executed the parallel code. It was slower than the serial code. Yes, that was precisely the boulder I mentioned. I searched. I inspected. I scrutinized. I probed. I found nothing.

Finally, a Mr. Robert Reed opened my eyes with this. rand() was the culprit! I was using rand() to pick those a's, and as it turns out, rand() is not thread-safe. Hence, all those calls to rand() were effectively being made in serial. Add the overhead of creating the threads, and my parallel code was bound to be slower.

Moral of the story: do not assume that you can parallelise anything and everything; and also, by the way, rand() is not always thread-safe.

All relevant code and logs will be found here. Happy coding in a parallel world! 😊