Friday, July 27, 2012
I had a great time at OctConf 2012. There were a lot of interesting people there, and it was nice to be able to put faces to names. I definitely hope I will be able to make it next year.
I recently realized that there are only two weeks left before the GSoC suggested "pencils down" date (8/13/2012). In the time that remains I plan to focus on better matrix support and on supporting more builtin functions, and I should be able to make significant progress on both.
After GSoC is done, I plan to work on compiling user functions and function handles. I think adding support for user functions to JIT is important, but I'm not sure I would be able to complete it in two weeks.
Tuesday, July 3, 2012
Comparison of JIT with Oct files
In my last post I tested the Octave JIT compiler on a problem presented on the mailing list. I got a request for a comparison with oct files. I think this is an interesting comparison, because ideally the JIT compiler should reduce the need to rewrite Octave scripts as oct files.
Oct file
The oct file is mostly equivalent to the loopy version in my previous post.
#include <octave/oct.h>
#include <octave/parse.h>

DEFUN_DLD (oct_loopy, args, , "TODO")
{
  feval ("tic");

  octave_value ret;
  int nargin = args.length ();

  if (nargin != 2)
    print_usage ();
  else
    {
      NDArray data = args(0).array_value ();
      octave_idx_type nconsec;
      nconsec = static_cast<octave_idx_type> (args(1).double_value ());

      if (!error_state)
        {
          double *vec = data.fortran_vec ();
          octave_idx_type counter = 0;
          octave_idx_type n = data.nelem ();

          // Zero out any run of nonzero elements shorter than nconsec.
          for (octave_idx_type i = 0; i < n; ++i)
            {
              if (vec[i])
                ++counter;
              else
                {
                  if (counter > 0 && counter < nconsec)
                    std::fill (vec + i - counter, vec + i, 0);

                  counter = 0;
                }
            }

          // Handle a run that extends to the end of the vector.
          if (counter > 0 && counter < nconsec)
            std::fill (vec + n - counter, vec + n, 0);

          ret = octave_value (data);
        }
    }

  feval ("toc");

  return ret;
}
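For reference, the loopy Octave version it mirrors is along these lines (a rough reconstruction from the oct file above, not the exact script from the previous post):

function A = loopy (A, K)
  % Rough sketch: zero out runs of nonzero elements shorter than K.
  counter = 0;
  n = numel (A);
  for i = 1:n
    if (A(i))
      counter = counter + 1;
    else
      if (counter > 0 && counter < K)
        A(i-counter:i-1) = 0;
      endif
      counter = 0;
    endif
  endfor
  if (counter > 0 && counter < K)
    A(n-counter+1:n) = 0;
  endif
endfunction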
Results
I ran each test five times, taking the lowest time. I have also separated out the compile/link time from the run time. For JIT, compile time was determined by running the function twice and subtracting the second run time from the first. The compile time for the oct file was determined by timing mkoctfile. The initial parameters were a random vector, A, of size 1,000,000 and K = 3.
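Concretely, the timing procedure was along these lines (a sketch; jit_loopy here is a stand-in name for the plain Octave function from the previous post):

A = rand (1000000, 1);            % random test vector (exact distribution assumed)
K = 3;

tic; jit_loopy (A, K); t1 = toc;  % first call: JIT compile time + run time
tic; jit_loopy (A, K); t2 = toc;  % second call: run time only
jit_compile_time = t1 - t2;

tic; mkoctfile ("oct_loopy.cc"); oct_compile_time = toc;
oct_loopy (A, K);                 % run time is printed by the tic/toc inside the oct file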
|     | Compile time | Run time |
|-----|--------------|----------|
| JIT | 14ms         | 21ms     |
| OCT | 2400ms       | 3.3ms    |
When using JIT, the compile time is part of the run time for the first execution of the loop. This means that for this example, JIT is currently about 10 times slower than the oct file (roughly 14ms + 21ms = 35ms versus 3.3ms). However, if we were to execute the function 50 times on 1,000,000 element vectors, JIT would only be about 6 times slower (14ms + 50 × 21ms ≈ 1064ms versus 50 × 3.3ms = 165ms, ignoring the one-time mkoctfile cost).
After looking at the assembly, it appears that JIT loses time to checks that matrix indices are valid and to the fact that loop variables are doubles (in a loop like `for ii = 1:5', ii is a double). It should be possible to fix these issues in JIT, but doing so will result in a larger compile time.
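For example, the loop variable keeps the double type even when the range is integral:

for ii = 1:5
endfor
class (ii)    % prints "double": loop variables are doubles even for integral ranges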