Thursday, June 30, 2011

Get the Intersection Points of Two Link List

Problem:  There are two link lists that have an intersection point (they have the same end), which makes them forms a "Y" shape. Find the intersection point.

Solution:  Definitely we can traverse one list and hash all the nodes. Then traverse the other list. But if we want a O(1) space complexity, we can try the following two methods:

  1. Traverse both lists to get the their length. Then move the pointer from the head of the longer list by len(long_list) - len(short_list) steps. Then move both pointers in parallel and compare.
  2. Traverse one list, make the last node point to the head to form a circle and remember the length l. Move one pointer l steps, starting from the head of the second list. Then also start to move a second pointer, still from the head of the second list. Two pointers move in parallel. The node they meet each other is the intersection point.

Tuesday, June 28, 2011

Segment Tree and Interval Tree

To build a Segment Tree, first sort all the end points of the segments, than adding -infinity and +infinity. For n segments, we then partition (-infinity, +infinity) into 2n+1 segments. These 2n+1 segments serve as elementary intervals. From them (considering them as leafs),  we can build a balanced binary tree. Then we add the n segments into the tree. Each node represents an interval. Non-leaf nodes represent the interval which is the union of their children. When adding a segment, we store the information of that segment in all nodes where the nodes' intervals are within the segment's interval. Besides, the nodes' parents' interval are not within the segment's interval.

  • building cost is O(nlogn), space cost is O(nlogn), the cost to find all segments that contain a point (stabbing query) is O(k+logn)
  • can also solve box overlap problem, Klee's measure problem.
  • For multi-level segment tree, first build a base segment tree on one axis. Then build small segment tree on each node of the base tree, only for the segments stored on that node.
  • For more details, please check http://www.cnblogs.com/TenosDoIt/p/3453089.html (in Chinese).
To build an Interval Tree, it is a little bit similar. First sort all the end points of the intervals and find the median points. Every node in the interval tree is associated with a point. And the node contains all the intervals that cover the point. The left child (sub-tree) of the node contains all the intervals whose upper bounds are smaller than the node's point. Similarly, we can define the right child.
  • building cost is O(nlogn), space cost is O(n), the cost to find all intervals that overlap with an interval   (range query) is O(k+logn). 
  • The result is a ternary tree with each node storing:
    • A center point
    • A pointer to another node containing all intervals completely to the left of the center point
    • A pointer to another node containing all intervals completely to the right of the center point
    • All intervals overlapping the center point sorted by their beginning point
    • All intervals overlapping the center point sorted by their ending point
  • If we need to do a range query with interval tree, see we want to find the set of r that overlap with interval q. We first to sort all the start/end points of intervals, then find those have either start/end inside q. Second, we need select one point in q, do the query on the interval tree to find all the intervals that have the point. At the end, we need to merge results and do the dedup.
See Also: Klee's measure problem



KMP algorithm

KMP algorithm stands for Knuth–Morris–Pratt algorithm. The problem it wants to solve is to search for a word W in a text string S efficiently. The followings only give the main idea of this algorithm:

  1. The essence of the algorithm is to try to avoid comparing a character more than once. Assume we have two cursors m and i. m points to the start matching position for W within S. n points to the current location within W where we fail (we only had a partial match). To search for W in the rest of S, we don't need to start the next match check beginning at m+1, we can choose the position in a smart way.
  2. To be smart, we need the help from an auxiliary table T . T has the same length as W. T[i] means if the match fails at the position i, how much we need to look back (reset the start cursor). Thus, m will be reset as m+i-T[i]. Moreover, we don't need to start matching process from the new start position, instead, we can just start from the new i that equals toT[i].
  3. This table is mainly for those patterns which contain some non-prefix substring that are also some prefix of the patterns. For example,  in "ABCDABD", substring starting at position 4, "AB", is also the prefix "AB". On the contrary, for "ABC", there are no need to fall back.
  4. The time complexity is O(n+k), where n and k are the length of S and W, respectively.
The following gives the table building algorithm:
    table[0] = 0;
        table[1] = 0;
    
        int cnd = 0;
    
        for(int n =2; n<p.size();)
        {
           if(p[n-1] == p[cnd])
           {
              cnd++;
              table[n] = cnd;
              n++;
           }
           else if(cnd>0)
           {
             // This is the complicated part, if we can't find 
             // a match,  we need to gradually fall back, 
             // to look at shorter string to see if we can 
             // find a match.
             cnd = table[cnd];
           }
           else
           {
              cnd = 0;
              table[n] = cnd;
              n++;
           }
    
        }
    
    

    Reservoir Sampling

    Reservoir sampling is a sampling algorithm that randomly pick k elements out of a set of n elements. It is adapted from Knuth Shuffling. The key idea is to maintain a buffer with size k, then replace the elements in buffer with a descending probability. The application of this algorithm is when sampling a finite subset from streaming data. Reservoir sampling is very useful for online sampling on streaming data.

    int buf[k];
    //initialize the buffer
    for(int i=0; i<k; i++)
       buf[i] = a[i];
    
    for(int i=k; i<n; i++)
    {
       int idx = random(0, i);
       if(idx<k) swap(buf[idx], a[i]);
       
    }
    

    The Myth of memset()

    memset() according to its specification:
                         void * memset ( void * ptr, int value, size_t num );
    However, the value will be converted to unsigned char before calling the actual memset().

    Monday, June 27, 2011

    "Mutable" Key Word

    mutable allows a class data member to be modified in a const function.
    Example:
    Class A{
       private:
          mutable int x;
    
       public:
          void foo(int a) const
          {
             x = a;
          }   
    }
    

    Lights and Switches

    Problem: Four lights in a room and four switches outside the room that control the light. You can only enter the room to check if the lights are on. If you are outside, you don't know. How to know the mapping from switches to lights by just entering the room once?

    Solution: The key is to find two binary classifier, then we can encode 4 lights. One classifier is On/Off. The other is Cold/Hot. So we can turn on two switches for a while, let the lights become hot. Then turn off one of them. Then turn on one of the the rest two switches.Then rush into the room to check. There must be four different statuses: On and Hot, Off and Hot, On and Cold, Off and Cold.

    Efficient Way to Calculate Power(x,y)

    Problem: Give your own implementation of power(x,y), where y is a nature number.

    Solution: There is a neat solution:

    double power(double base, int exp)
    {
       int res = 1;
       
       while(exp)
       {
         if(exp & 1) res *= base;
    
         exp >>= 1;
         base *= base;
       }
    
       return res;
    }
    

    Find the Super Cool Number

    Problem: A Super Cool number is a nature number which can be represented as a^b (a and b are both nature number and b>1. Given a nature number N, find the super cool number that is closest to N.

    Solution: The key is to know the upper bound of the exponent, ceiling (log_2_(N)). If we have functions such as log(), then it is easy. Otherwise, we need to guess an upper bound, we can use N/2.  Then we do a series of binary search,

    • find a value n between 1 and N such that n^exponent is closest to N.
    • for all the possible exponent, repeat the previous step.
    • binary search over the exponent is quicker than search over the base (i.e., 1~N)
    Then the total complexity is around O(logN*logN).

    Efficient Way to Validate Sudoku Results

    Problem: If you are playing a Sudoku game, provide a way to validate if a configuration is a valid solution to Sudoku?

    Solution: Basically, there are 27 zones (9 columns, 9 rows and 9 squares) you need to check. To check each zone, we can use a bit vector (2 bytes long) to check duplicates. For every number n in the zone, we do 1 <<  n and check whether the nth bit of current flag is "1" or "0" (by using "&"). If it is "1", then duplicates detected. Otherwise we update the current flag and move on.

    Caution: Some one propose to calculate the sum and product of all the numbers in a zone, to see if the sum are produce are 45 and 9! respectively. However, this doesn't guarantee a solution to Sudoku. For example, 4+4+4+1+9+5+9 +7 +2 = 45 and also their product equals to 9!. Moreover, the time complexity is of the similar level as the solution proposed.

    Thursday, June 23, 2011

    Shoot the Car

    Problem: There is a infinite road. At t0, a car is at coordinate a. It will move towards either left or right (you don't know) at a constant speed b. You don't know the value of a and b. You just know they are integers. You have a gun and at t1, t2, ... tn, ..., you can shoot at any coordinate on the road. Can you find a way to finally shoot the car within finite time?

    Solution: Since a and b will be finite, so the search space is finite. Naively, you can try to traverse a matrix M*N. At each tick, you try to shoot at M[i] + t*N[j]. If you fail to shoot the car after traversal, expand the matrix and traverse the part you haven't visited.

    While more elegant way is try to order the states you want to search. We can start from (0,0), then (1,0) -> (1,1) -> (0, 1) -> (-1, 1) -> (-1, 0) -> (-1, -1) -> (0, -1) -> (1, -1) -> (2, 0) -> .... Just start from the center and swirl!

    Single Precision and Double Precision Floating Number

    • Single precision floating number has 1 signed bit, exponent width 8 bits, fraction 23 bits, significand 24 bits. The value is  (from Wikipedia)
    • Double precision floating number has 1 signed bit, exponent width 11 bits, fraction 52 bits, significand 53 bits.The value is (from Wikipedia)

    A Coin, First Throw Get Head, Bet the Second Thow

    Problem: A coin (you don't know it is a fair one or biased one), in the first throw, you got a "head". What will you bet for the second throw and what is the probability to win?

    Solution: This is a conditional probability problem. Assume the probability to get head is p. f(p) is a uniform distribution. P(H2/H1) = P(H2H1)/P(H1) = int_{p^2*f(p)*dp} / int_{p*f(p)*dp} = (1/3) / (1/2) = 2/3.

    Wednesday, June 22, 2011

    4 Locks on a Treasure Chest

    Problem: A round treasure chest has 4 locks. The distance between any adjacent locks are 90 degree. You can't see the status of the locks because they are inside. You can only touch them and set them. Every time, you can chose to touch any two locks and set them. A lock can be set to either 1 or 0. If all the four locks are set as 1 or 0, the chest will open. Once you had set two locks, the round chest will rotate. The rotation distance is random. Is there a way you can guarantee to open the chest?

    Solution: The initial status is unknown, but definitely it is not 4 "1" or "0".  Try to think from the last step (under what condition, you can definitely open the box with one more move). We can do as follow:

    1. Set two adjacent locks to "0", if still not open, go to step 2.
    2. Set two locks that are opposite to each other to "0", if still not open, go to step 3.
    3. If you reach here, it means there is one "1" and three "0". Try two adjacent locks, if one is '1" and one is "0", Bingo! Set the "1" to "0". But if you encounter two "0", you need to set one of them to "1". Then there are two cases: a) "1 0 1 0"  and b) "1 1 0 0". Let's go to step 4.
    4. if it is a), then we try to set two locks that are opposite to each other to either "0" or "1". We are done! If it is b), we still try to set two locks that are opposite to each other. Basically, we just toggle them (or do nothing). Then step 5.
    5. So we still have  "1 1 0 0". This time try to set two adjacent locks, if we encounter two "1" or two "0", we are done. If not, we toggle them. Then we have "1 0 1 0". Then go to step 5.
    6. Try to set two locks that are opposite to each other, we got the treasure.
    Extension Question: One condition in previous problem is actually redundant. Say you can't feel the lock (you can't tell the status of the lock) and you can just toggle it. Every time you can toggle 1 or 2 locks. Still, we can solve it in 7 steps:
    1. Toggle two opposite ones.
    2. Toggle two adjacent ones.
    3. Toggle two opposite ones.
    4. Toggle a random one.
    5. Toggle two opposite ones.
    6. Toggle two adjacent ones.
    7. Toggle two opposite ones.
    Generally, if the initial status is unknown, there are two type of cases: 1) two "1" and two "0" 2) three "1" (or "0") and one "0" (or "1"). For the case 1, setp 1~3 can guarantee we finish the game. Step 4 is to handle case 2). After step 4, we are facing case 1), so just repeat the three steps again. 

      Next Permutation of an Integer or string

      Problem: Given an integer(or string), get the integer that is just bigger than it (lexicographical order).
      Example:  given 507632, we need find 520367; given 123, we need find 1023.  
      Solution: Basically, we need something like next_permutation in STL, but we can implement by our own.

      1. Start from the end, we try to find the longest non-increasing sequence. Then the integer (or string) is divided into halves. For the tail part, we can't find one that is bigger than it. Then we need to think of the front part.
      2. We swap the last element a of the front part with the element b in the tail part that is the smallest one but bigger than a. swap a and b. The tail part is still a non-increasing sequence. We just reverse it and append it to the front part. The reverse is neat!
      3. for the case where the integer (or string) is already the biggest one, we need to add zero.
      The code is as follow (not consider the case that needs adding zero):

      char* next_perm(char* s, int len)
      {
      
          int i = len-1;
      
          while(i>=1 && s[i-1] >= s[i])
                i--;
      
          if(i==0)  return 0;
      
          int lo = i;
          int hi = len - 1;
          
          //binary search
          int c = s[i-1];
          
          while(lo < hi)
          {
            int mid = lo + (hi - lo)/2;
            if(s[mid] <= c)
                 hi = mid-1;
            else if(s[mid] > c)
            {     
               if(mid+1 <= hi && s[mid+1] <=c)
               {   
                 lo = mid;
                 break;
               }
               lo = mid+1;
            }
          
          }
          
          s[i-1] = s[lo];
          s[lo] = c;
      
          //reverse
          lo = i;
          hi = len - 1;
      
          while(lo<hi)
          {
             char t = s[lo];
             s[lo] = s[hi];
             s[hi] = t;
             lo++;
             hi--;
          }
      
          return s;
      }
      
      

      Tuesday, June 21, 2011

      Print all the combination from a candidate set that sum to a target value

      Problem: Given a target set of integers (may contain negative integers) and a target value, print all the combinations that sum to the target value. Each element can only be used once in a combination. For the same combination, only print once.

      • Example:  candidate set [-1, 3, 2, 1, 5], target value 4. Then we can have 1+3, -1+2+3

      Solution: Recursion is used to solve this problem. Also first sort the candidate set.

      • Assume we have a prev_sum (the sum of the previous integers we had added up) and current integer we want to add, if prev_sum + current integer > target value, we return.
      • if prev_sum + current integer = target value, we print out the combination and also return. No need to explore further, since the candidate set is sorted.
      • if prev_sum + current integer < target value, we take the current integer, update prev_sum, and also update an array that is used to track the combination, then we make a recursive function call
      • an array, int index[] is used to track the integers we had used.  The advantage of this array is simplicity. Definitely we can use STL vectors.
      • If the target value is negative, taking the above approach naively will have some problem (see chunjie's comment below). There are different ways to handle this case. One way used below is to convert the target value into positive and also convert the whole set of integers by multiplying with a "-1".
      The related code is as follow:

      
      
      
      void print_comb(int index[], int n, vector<int> set, int negative)
      {
         for(int i =0; i<=n; i++)
           cout << (negative * set[index[i]]) << ((i<n)?'+':' ');
           cout << endl;
      }
      
      //assume a sorted list
      void _print_all_combine_n(int target, int negative, int n_prev, int index[], vector<int> set, int pos, int n)
      {
      
         for(int i = pos; i < set.size(); i++)
         {
           if(n_prev + set[i] == target) 
            {
               index[n] = i;
               print_comb(index, n, set, negative);
               return;
            }
            
            // we don't need to look at later cases, since they are not possible
            if(n_prev + set[i] > target)        
                  return;
      
            // skip duplicate numbers before advancing, since we only use each number once
            int m = i+1;
            while(set[m] == set[i] && m < set.size())
                 m++;
            i = m-1;
      
            if(n_prev + set[i] < target)
            {
               index[n] = i;
               print_all_combine_n(target, n_prev+set[i], index, set, i+1, n+1);
      
            }
      
         }
      
      }
      
      //assume a sorted list
      void print_all_combine_n(int target)
      {
      
         int index[10000];
         index[0] = 0;
         vector<int> set(myints, myints + sizeof(myints) / sizeof(int) );
         
         int negative = 1;
      
         if(target < 0) 
         {
            for(int i=0; i<set.size(); i++)
               set[i] = (-1)*set[i];
      
            negative = -1;
         }
      
         sort(set.begin(), set.end());
      
         print_all_combine_n(target * negative, negative, 0, index, set, 0, 0);
      }
      
      int main()
      {
      
         int myints[] = {10, 1, 2, 7, 6, 1, 5, -3};
      
         
         print_all_combine_n(8);
      
      }
      
      
      

      Multiple Inheritance, Virtual Inheritance and Private Inheritance

      • Multiple Inheritance refers to a class can inherit features and data members from multiple classes. For example, we have class Animal, Mammal and Carnivore. Mammal and Carnivore both derived from Animal. We further have a class Tiger inheriting both Mammal and Carnivore.  For example, 
                                                                Animal (eat())
                                                                /             \
                                   Mammal (eat())      Carnivore (eat())
                                                                          \             /
                                                                             Tiger (eat())

      • If we have the above inheritance hierarchic, if the base class Animal has a virtual method eat(), both Mammal and Carnivore will have its eat(), which lead to that in the Tiger class, when its instance call eat(), it doesn't know which one to call. It can only call t.Mammal::eat() or t.Carnivore::eat(). In order to solve this Diamond problem, Virtual Inheritance is introduced. If we have  "class Mammal :: public virtual Animal"  and "class Carnivore :: public virtual Animal", then for the later Tiger class, we won't have this ambiguity.  
      • When you use Private Inheritance (e.g., class A :: private  B), all the public methods and data members in B will be private after inheritance. This is so called "Implement A in term of B". And the implementation of B is not exposed after private inheritance. A better way is to use "Inclusion", try to make B a private data member of A. For certain situation, inclusion is not possible, so you have to use private inheritance.

        The difference between Static and Non-Static methods

        1. Static methods are owned by the class, while non-static methods are owned by the class instances.
        2. When passing parameter to non-static methods, an implicit "this" is passed which points to the class instance. Therefore, non-static methods can use "this" to access all those non-static members (methods).
        3. However, for static methods, "this" is not passed, so static methods have no way to access those non-static members (methods). On the other side, non-static methods can always access static members (methods).

        Monday, June 20, 2011

        NXN Boggle Game

        Problem: Solve the boggle game with 5X5 square. Each cell in the square is a letter (from A-Z).  The interior cells can have 8 directions (including diagonal)  as next move, the border cells have 5 or 3 directions. Start from any cell, just go with the possible directions, you will have a path. The path can only contain a cell once (you can't visit the cell you had visited). If the path represents a valid English word, output it:

        Solution: Use recursion to solve this problem. Also a trie is used.

        • Globally, we have a hash table for all valid words (paths) and a trie to store the dictionary
        • Start from each cell, we have current_word (the path we had visited), all the visited paths (can be implemented as a hash table or a matrix).
        • Start from that cell, we append the cell to the current_word. If current_word is not a visited path, we update visited path, otherwise we return.  If the path is over the maximum length (N*N), we return.
        • Then check if current_word is in trie (we can have a recursive checking method), if not, we return. The use of trie here can save us time from exploring some invalid path (not a word). 
        • If current_word is a valid word, we update the global hash table.
        • Then we explore the all possible directions and make those recursive calls.
        • For each cell (as the starting cell), we repeat the above process.

        Link all the nodes at the same level for a binary tree

        Problem: Given a binary tree, assume you have one extra pointer (TreeNode *next), link all the nodes at the same level by using this next pointer. No other extra space is allowed (e.g., queue)


        Solution:  if the tree is a full binary tree, where all non-leaf nodes have two children, the solution will be easier. You can find it at ihas1337code. If the tree is not necessarily a full binary tree, the solution is more difficult. But here we can still give an iterative solution. The key here is to leverage the established "next" pointer. When we are visiting the ith level (assume all the "next" pointers at ith level have been established), we construct the the "next" pointers at i+1th level. The details of the algorithm are as follow:
        • if a node has both left child and right child, then left child's next should link to right child. This is the most straightforward case.
        • if a node only have one child, to assign the child's next, we need to look at the node's siblings. basically, we need to find the first non-leaf sibling of the node, then find its left-most child as the target of the current node's child's next pointer. 
        • We construct the next pointer level by level. When we are visiting the ith level (assume all the "next" pointers at ith level have been established), we construct the the "next" pointers at i+1th level. During this process, we also record the first non-leaf node at i+1th level. Since when we proceed to visit i+1th level, we need to start from that node. If such node can't be found, it means we finish linking the tree by level.
         The code is as follow (may not be the optimal):

        typedef struct TreeNode{
                struct TreeNode *left;
                struct TreeNode *right;
                struct TreeNode *next;
                int value;     
        }TreeNode;
        
        
        bool isLeaf(TreeNode *root)
        {
            return root->left==NULL && root->right==NULL;
        }
        
        TNode * leftmostChild(TreeNode *root)
        {
          return root->left? root->left:root->right;
        }
        
        
        TreeNode * real_next_non_leaf(TreeNode *root)
        {
            while(root->next && isLeaf(root->next)) root = root->next;
            
            return root->next;
        
        }
        
        void link_tree_level(TreeNode *root)
        {
             if(!root || !isLeaf(root)) return;
             
             TreeNode *firstp = leftmostChild(root);
             
             while(true)
             {
                 TreeNode *nnext;
                 while(root)
                 {
                     if(root->left && !root->left->next) 
                     {
                       if(root->right)  
                       {
                          root->left->next = root->right;
                          nnext = real_next_non_leaf(root);
                          root->right->next = nnext ? leftmostChild(nnext) : nnext;
                          root = nnext;              
                       }
                       else
                       {
                          nnext = real_next_non_leaf(root);
                          root->left->next = nnext ? leftmostChild(nnext) : nnext;
                          root = nnext;
                       }
                     }
                     else if(root->right)  
                     {
                          nnext = real_next_non_leaf(root);
                          root->right->next = nnext ? leftmostChild(nnext) : nnext;
                          root = nnext;              
                    }
                 }
                        
               
                while(firstp && isLeaf(firstp))
                            firstp = firstp->next;
               
                if(!firstp) break;
                root = firstp;
                firstp = leftmostChild(root);
               
             }
         
        }
        
        

        Friday, June 17, 2011

        Something about Quicksort

             1. Why the average time complexity is O(NlogN)?
        • Think the whole process as a tree-like structure, in each iteration, quicksort can divide the problem in half, so the depth of the tree is logN, then in each depth level, essentially N comparison needs to be done. Therefore the average time complexity is O(NlogN).
        • Use Master Theory, T(N) = 2*T(N/2) + N. Then we have T(N/2) = 2*T(N/4) + N/2. Therefore, we can derive T(N) = 4*T(N/4) + 2*N = 8*T(N/8) + 3N = .... Then eventually we can have T(N) = a*T(1) + b*N, where a = O(2^(logN)), b =O(logN). So  T(N) = O(NlogN).
           2. When is the worst case O(N^2)?
        • if in each iteration, we always happen to select the smallest (or biggest, depends on how you do quicksort), then you can't effectively break the input array into two sub arrays. When you have N elements in the array, you can only get an array of size 1 and an array of size N-1. Then the cost is O(N^2). Especially when you try to sort a sorted list and you always choose the lowest index as pivot, you will always have O(N^2).

        Why there is no virtual constructor in C++?

        Answer: If a function is virtual, there needs to be a pointer to the virtual function table so that you can look up in the vtable. However, when the constructor is executed, the pointer to vtable is not there yet, so you can't be able to locate vtable. Besides, constructor is usually in-lined. If it is a virtual function, it can't be in-lined

        Polymorphism

        Polymorphism means different types of data can be handled with an uniform interface. The interface is the same, but the implementation can be different. There are three types of polymorphism:
        1. ad-hoc polymorphism: functions with same name but different parameter lists (i.e. function overloading).
        2. parametric polymorphism: generic data types, for example, template in C++,  which just provides a general data type.
        3. subtype polymorphism: different class objects (classes should have inheritance relations) can be applied with the same interfaces. For example, the virtual functions in C++.

        Calculate the multiplication of A[1], A[2]...A[N] but without A[i]

        Problem: An array of numbers, A[1], A[2]...A[N], provide a O(N) algorithm to calculate C[i] = A[1]*A[2]...A[N]/A[i]. Division is not allowed and no extra memory space allowed.

        Solution: An O(N) time and O(N) space algorithm may not be hard. It needs some thoughts to get an algorithm which doesn't require extra memory space. Here is a brilliant way from ihas1337code:

        int C[N];
        int left, right;
        left = right = 1;
        
        for(int i=0; i<N; i++)
        {
           C[i] *= left;
           C[N-i-1] *= right;
           left *= A[i];
           right *= A[N-i-1];
        }
        

        Thursday, June 16, 2011

        Something about conditional variables in PThread

        Step to use conditional variables:
        1.  Declare and initialize global data/variables which require synchronization (such as "count")
        2. Declare and initialize a condition variable object
        3. Declare and initialize an associated mutex
        4. Create threads A and B to do work
        Step for thread A:
        1. Lock the mutex
        2. Check the global data (e.g., counter), if not fulfilled, call pthread_cond_wait()pthread_cond_wait() will automatically release the lock and put the thread into block state.
        3. Get the signal from thread B and wake up. After waking up, the thread will automatically grab the mutex.
        4. Unlock the mutex and proceed.
        Step for thread B:
        1. Lock the mutex
        2. modify the global data (e.g., counter), if certain threshold is met, call pthread_cond_signal() or pthread_cond_broadcast to notify the blocked thread(s). pthread_cond_singal() is for one thread and pthread_cond_broadcast() assume there are multiple threads waiting. 
        3. Unlock the mutex and proceed.

        Wednesday, June 15, 2011

        Rotate an image 90 degree clockwise

        Problem: an image stored as a M*N matrix (two dimensional array), give an in-place algorithm to rotate it 90 degree clockwise.

        Solution: First assume a simpler case where M=N, we can only use O(1) memory. Basically, each iteration we change the location of four elements (sometime, these four elements may just points to the same one): a[x,y] -> a[y,N-1-x] -> a[N-1-x, N-1-y] -> a[N-1-y, x] -> a[x,y]. The algorithm is as follow:

        
        for(int x =0; x<n; x++)
        {
           y = x;
           if(y>=n-x-1) break;
           for(; y<n-x-1; y++)
           {
              temp = a[x,y];
              a[x,y] = a[y,n-1-x];
              a[y,n-1-x] = a[n-1-x,n-1-y];
              a[n-1-x,n-1-y] = a[n-1-y,x];
              a[n-1-y,x] = temp;
           }
        }
        
        
        If N!=M, things will be complex, we may need an auxiliary array to track if we had visited some elements or not, the algorithm (may not be optimal) is as follow:
        int visited[N][M];
          int i,j;
          for(i=0; i<N; i++)
          for(j=0; j<M; j++)
                visited[i][j] = 0;
        
          for(i=0; i<N; i++)
          for(j=0; j<M; j++)
          {
                if(visited[i][j]==0)
                {
                   int tmp = a[i][j];
                   int c_i = i;
                   int c_j = j;
        
                   do
                   {
                     int ii = c_j;
                     int jj = N - 1 - c_i;
        
                     int to_i = (ii*N + jj + 1)/M;
                     int to_j = (ii*N + jj + 1)%M;
        
                     if(to_j==0)
                     {
                        to_i--;
                        to_j = M - 1;
                     }
                     else to_j--;
        
                     int tmp1 = a[to_i][to_j];
                     a[to_i][to_j] = tmp;
                     visited[to_i][to_j] = 1;
                     c_i = to_i;
                     c_j = to_j;
                     tmp = tmp1;
                   }
                   while(!(c_i==i && c_j==j));
          }
        }
        

        Related:  In-place matrix transposition

        M balls and N boxes

        Problem: There are M balls and N boxes. Each ball is randomly placed into a box. What is the expected number of boxes that have at least one ball inside?

        Solution: We can create a series of identification variables, Ii, which denotes for the i th box, if there is at least one ball inside. If yes, Ii = 1, otherwise Ii = 0. Then our problem is to calculate the expected value to I, where I = I1 + I2 + ... + IN. There is symmetric here, so E(I) = N*E(Ii) = N*(1-(N-1/N)^M)

        Related: How many boxes you need to have in order to collect all the toys?

        Saturday, June 4, 2011

        How many boxes you need to have in order to collect all the toys?

        Problem: Mike likes potato chips and also like the free toys piggybacked with the chips. There are N different toys in total. In each box of potato chips, there will only be one toy and the type of the toy is randomly decided. If Mile wants to collect all the N different toys, how many boxes of chips he needs to buy (on average)?


        Solution: We can solve the problem by using Geometric Distribution. After we had collected i-1 toys ( i <=N), the number of  boxes we should get in order to get the i th toy conforms to a geometric distribution. The probability for the geometric distribution is (N-i+1)/N, thus the expected number of boxes to get the i th toy is N/(N-i+1). Then we can construct a random variable X and X = X1 + X2 + ... + XN, where Xi means the number of  boxes we should get in order to get the i th toy when we already have i-1 different toys. So E(X) = E(X1) + E(X2) + ... + E(XN) = N/N + N/N-1 + ...... + N/1 = N*(1/N + 1/(N-1) + ... + 1)

        Related: Coupon collector's problem, M balls and N boxes