Monday, May 28, 2012

Type Inference

I just finished implementing a type inference algorithm based on FALCON [1]. I like this algorithm for several reasons:
  1. It runs in O(n) time: this is important because we want to minimize JIT overhead.
  2. It decouples type inference from code generation: this will make it easier if we decide to switch from LLVM to a GNU equivalent.
  3. It supports type bounds: the algorithm can assign different types to the same variable when the variable is used in different contexts.
The algorithm works by introducing a type lattice (similar to those used in the MaJIC and McVM MATLAB JIT compilers [2][3]).
[Figure: the simple type lattice I am currently using]
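The lattice figure is not reproduced here, but a flat lattice of this kind can be sketched in a few lines of Python. The concrete scalar types (bool, int, double) are illustrative assumptions, not necessarily the exact ones in my branch:

```python
# A minimal flat type lattice: bottom (no information yet) sits below
# the concrete scalar types, and "any" is the top element. Joining two
# distinct concrete types generalizes to "any".
BOTTOM, ANY = "bottom", "any"
SCALARS = {"bool", "int", "double"}

def join(a, b):
    """Least upper bound of two lattice elements."""
    if a == b:
        return a
    if a == BOTTOM:
        return b
    if b == BOTTOM:
        return a
    return ANY  # two different concrete types: fall back to the top

print(join("double", BOTTOM))   # -> double
print(join("bool", "double"))   # -> any
```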
Each variable in the SSA is then assigned a type from our lattice. For example, in the code
b = 1;
for i=0:5
  temp = 5;
  b = b + temp;

  temp = [5 5 5];
  b = b + temp;
end
The algorithm is able to differentiate between the two occurrences of temp. It will also realize that b is a matrix inside of the loop (once matrices are added to the type lattice). Then, before the start of the loop, b will be converted to a single-element matrix. This means we can generate matrix addition operations inside of the loop.
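As a rough illustration (not the actual Octave implementation), the propagation step can be sketched as iterating to a fixed point over the SSA statements. Each SSA name gets exactly one type, so the two definitions of temp become temp_1 and temp_2 and naturally receive different types, while the phi node for b at the loop header joins the types flowing around the loop. For simplicity this sketch assumes matrices have already been added to the lattice, using the chain bottom < double < matrix < any:

```python
# Sketch of fixed-point type propagation over SSA form (illustrative).
# The lattice is simplified to a chain: bottom < double < matrix < any.
RANK = {"bottom": 0, "double": 1, "matrix": 2, "any": 3}

def join(a, b):
    """Least upper bound on the chain lattice."""
    return a if RANK[a] >= RANK[b] else b

# SSA statements as (result, rhs). rhs is either a type literal or a
# tuple of operand names whose types are joined; the join covers both
# phi nodes and elementwise '+', where double + matrix yields matrix.
ssa = [
    ("b_0",    "double"),
    ("b_1",    ("b_0", "b_3")),    # phi node at the loop header
    ("temp_1", "double"),
    ("b_2",    ("b_1", "temp_1")),
    ("temp_2", "matrix"),
    ("b_3",    ("b_2", "temp_2")),
]

types = {}
changed = True
while changed:                     # iterate until nothing changes
    changed = False
    for result, rhs in ssa:
        if isinstance(rhs, tuple):
            new = "bottom"
            for name in rhs:
                new = join(new, types.get(name, "bottom"))
        else:
            new = rhs
        if types.get(result) != new:
            types[result] = new
            changed = True

print(types["temp_1"], types["temp_2"], types["b_1"])  # -> double matrix matrix
```

Note how the fixed-point iteration converges: on the first pass b_1 is still double, but once b_3 becomes matrix, the phi join promotes b_1 (and hence b_2) to matrix on the next pass. This mirrors the conversion of b described above.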

[1] L. De Rose and D. Padua. Techniques for the Translation of MATLAB Programs into Fortran 90. ACM Transactions on Programming Languages and Systems, 21(2):286–323, Mar. 1999.
[2] G. Almási and D. Padua. MaJIC: Compiling MATLAB for Speed and Responsiveness. SIGPLAN Notices, 37(5):294–303, May 2002.
[3] M. Chevalier-Boisvert. McVM: An Optimizing Virtual Machine for the MATLAB Programming Language. M.Sc. thesis, McGill University, 2009.

Tuesday, May 22, 2012

Initial Work

Hi, I am Max Brister, a student working on JIT compilation in GNU Octave for Google Summer of Code. My project proposal can be found on Melange. In this blog I plan to explore implementation details, mark my progress, and present intermediate results.

I have already implemented a simple proof of concept which shows some promise. The simple script
a = 1;
b = 1;
for i=1:10000
  for j=1:10000
    a = a + b;
  end
end
executes on my computer in 0.178s using my jit branch, and in 253.124s using Octave 3.6.1. A speedup of 1422x is not bad.

There is still quite a ways to go before the code is ready for users, though. I need to take another look at type inference, ensure error cases are handled correctly, and lift the current requirement that the inner loop bounds be constant for type inference to work.

You can view my current progress by checking out my code from
hg clone
The specific revision this post refers to is cba58541954c.

I realized that I should mention that a speedup of 1422x is really a best-case speedup. Code which is already vectorized will see a much smaller speedup (if any), because Octave already implements vectorized operations efficiently.