This post covers how to use BLAS
and LAPACK
routines with ArrayFire-rb
, and how these functions
have been implemented by creating Ruby bindings to ArrayFire C API.
The performance benchmarks for ArrayFire against NMatrix can be represented by the following figures. The code used for benchmarking and generating the plots can be found here and be used to reproduce similar plots.
(Note: The above benchmarks have been done on an AMD FX 8350 octacore processor and Nvidia GTX 750Ti GPU. CUDA backend of ArrayFire was used with double floating points.)
The figure shows that ArrayFire takes the least computation time of all. ArrayFire is 3 e +6 times faster than NMatrix for JRuby and NMatrix for Ruby(not BLAS) whereas 7 e +5 times faster than NMatrix for Ruby(using BLAS).
For LAPACK routines, like calculating determinant and lower-upper factorization, ArrayFire is 100 times faster than NMatrix for JRuby whereas 6 times faster than NMatrix for Ruby(using LAPACKE).
Lets take a look at the implementation.
All the BLAS methods are singleton_methods
.
void Init_arrayfire() {
ArrayFire = rb_define_module("ArrayFire");
Blas = rb_define_class_under(ArrayFire, "BLAS", rb_cObject);
rb_define_singleton_method(Blas, "matmul", (METHOD)arf_matmul, 4);
rb_define_singleton_method(Blas, "dot", (METHOD)arf_dot, 4);
rb_define_singleton_method(Blas, "transpose", (METHOD)arf_transpose, 1);
}
arf_matmul
and arf_transpose
call af_matmul
and af_transpose
respectively.
static VALUE arf_matmul(VALUE self, VALUE left_val, VALUE right_val, VALUE left_prop_val, VALUE right_prop_val){
afstruct* left;
afstruct* right;
afstruct* result = ALLOC(afstruct);
Data_Get_Struct(left_val, afstruct, left);
Data_Get_Struct(right_val, afstruct, right);
af_mat_prop left_mat_prop = arf_mat_type_from_rbsymbol(left_prop_val);
af_mat_prop right_mat_prop = arf_mat_type_from_rbsymbol(right_prop_val);
af_matmul(&result->carray, left->carray, right->carray, left_mat_prop, right_mat_prop);
af_print_array(result->carray);
return Data_Wrap_Struct(CLASS_OF(left_val), NULL, arf_free, result);
}
static VALUE arf_transpose(VALUE self, VALUE input){
afstruct* obj;
afstruct* result = ALLOC(afstruct);
Data_Get_Struct(input, afstruct, obj);
af_transpose(&result->carray, obj->carray, false);
af_print_array(result->carray);
return Data_Wrap_Struct(CLASS_OF(input), NULL, arf_free, result);
}
af_matmul
expects the property of matrices passed to it. For example, af_mat_prop AF_MAT_TRANS
signifies that the matrix is transposed. So, to pass the symbols to Ruby C function arf_matmul, I
have used the code given below. I have creared a list of values MAT_PROPERTIES
and a function
arf_mat_type_from_symbol
which is responsible for figuring out af_mat_prop
from the symbol.
std::map<char*, size_t> MAT_PROPERTIES = {
{"AF_MAT_NONE", 0},
{"AF_MAT_TRANS", 1},
{"AF_MAT_CTRANS", 2},
{"AF_MAT_CONJ", 4},
{"AF_MAT_UPPER", 32},
{"AF_MAT_LOWER", 64},
{"AF_MAT_DIAG_UNIT", 128},
{"AF_MAT_SYM", 512},
{"AF_MAT_POSDEF", 1024},
{"AF_MAT_ORTHOG", 2048},
{"AF_MAT_TRI_DIAG", 4096},
{"AF_MAT_BLOCK_DIAG", 8192}
};
af_mat_prop arf_mat_type_from_rbsymbol(VALUE sym) {
ID sym_id = SYM2ID(sym);
for(std::map<char*, size_t>::value_type& entry : MAT_PROPERTIES) {
if (sym_id == rb_intern(entry.first)) {
return static_cast<af_mat_prop>(entry.second);
}
}
VALUE str = rb_any_to_s(sym);
rb_raise(rb_eArgError, "invalid matrix type symbol (:%s) specified",
RSTRING_PTR(str));
}
So, now I can check if the bindings work successfully.
$ rake pry
pry -r './lib/arrayfire.rb'
[1] pry(main)> left = ArrayFire::Af_Array.new 2 , [3,3] , [1, 4, 6, 4, 11 , 2 ,-5, 8, 10]
No Name Array
[3 3 1 1]
1.0000 4.0000 -5.0000
4.0000 11.0000 8.0000
6.0000 2.0000 10.0000
=> #<ArrayFire::Af_Array:0x000000014e56c8>
[2] pry(main)> right = ArrayFire::Af_Array.new 2 , [3,2] , [1, 0, 8, 10, -11, 8]
No Name Array
[3 2 1 1]
1.0000 10.0000
0.0000 -11.0000
8.0000 8.0000
=> #<ArrayFire::Af_Array:0x00000001591db0>
[3] pry(main)> result = ArrayFire::BLAS.matmul(left, right, :AF_MAT_NONE, :AF_MAT_NONE)
No Name Array
[3 2 1 1]
-39.0000 -74.0000
68.0000 -17.0000
86.0000 118.0000
=> #<ArrayFire::Af_Array:0x000000016136f8>
[4] pry(main)> transposed = ArrayFire::BLAS.transpose(left)
No Name Array
[3 3 1 1]
1.0000 4.0000 6.0000
4.0000 11.0000 2.0000
-5.0000 8.0000 10.0000
=> #<ArrayFire::Af_Array:0x00000001762f68>
It works!.
LAPACK routines similar to BLAS ones are singleton methods.
Lapack = rb_define_class_under(ArrayFire, "LAPACK", rb_cObject);
rb_define_singleton_method(Lapack, "svd_func", (METHOD)arf_svd_func, 4);
rb_define_singleton_method(Lapack, "svd_inplace_func", (METHOD)arf_svd_inplace_func, 1);
rb_define_singleton_method(Lapack, "lu_func", (METHOD)arf_lu_func, 4);
rb_define_singleton_method(Lapack, "lu_inplace_func", (METHOD)arf_lu_inplace_func, 1);
rb_define_singleton_method(Lapack, "qr_func", (METHOD)arf_qr_func, 4);
rb_define_singleton_method(Lapack, "qr_inplace_func", (METHOD)arf_qr_inplace_func, 1);
rb_define_singleton_method(Lapack, "cholesky_func", (METHOD)arf_cholesky_func, 3);
rb_define_singleton_method(Lapack, "cholesky_inplace_func", (METHOD)arf_cholesky_inplace_func, 1);
rb_define_singleton_method(Lapack, "solve_func", (METHOD)arf_solve_func, 0);
rb_define_singleton_method(Lapack, "solve_lu_func", (METHOD)arf_solve_lu_func, 0);
rb_define_singleton_method(Lapack, "inverse", (METHOD)arf_inverse, 1);
rb_define_singleton_method(Lapack, "rank", (METHOD)arf_rank, 1);
rb_define_singleton_method(Lapack, "det", (METHOD)arf_det, 1);
rb_define_singleton_method(Lapack, "norm", (METHOD)arf_norm, 1);
rb_define_singleton_method(Lapack, "is_lapack_available", (METHOD)arf_is_lapack_available, 0);
LAPACK.svd
must return multiple objects and hence calls LAPACK.arf_svd_func
. The implementation
is as follows.
static VALUE arf_svd_func(VALUE self, VALUE u_val, VALUE s_val, VALUE vt_val, VALUE val){
afstruct* input;
afstruct* u;
afstruct* s;
afstruct* vt;
Data_Get_Struct(val, afstruct, input);
Data_Get_Struct(u_val, afstruct, u);
Data_Get_Struct(s_val, afstruct, s);
Data_Get_Struct(vt_val, afstruct, vt);
af_svd(&u->carray, &s->carray, &vt->carray, input->carray);
return Qtrue;
}
module ArrayFire
class LAPACK
def self.svd(af_array)
u = ArrayFire::Af_Array.new
s = ArrayFire::Af_Array.new
vt = ArrayFire::Af_Array.new
ArrayFire::LAPACK.svd_func(u, s, vt, af_array)
return u, s, vt
end
end
end
LAPACK.det
calls af_det
to calculate the determinant. Currently, I haven’t
created support for complex dtype. So, I have ignored the imaginary part of
the result if it is not a real number.
static VALUE arf_det(VALUE self, VALUE val){
afstruct* matrix;
double det_real, det_imag;
Data_Get_Struct(val, afstruct, matrix);
af_det(&det_real, &det_imag, matrix->carray);
return DBL2NUM(det_real);
}
Now, we can check the elementwise operations using pry
.
$ rake pry
pry -r './lib/arrayfire.rb'
[1] pry(main)> matrix = ArrayFire::Af_Array.new 2, [4,2], [1,3,5,7,2,4,6,8]
No Name Array
[4 2 1 1]
1.0000 2.0000
3.0000 4.0000
5.0000 6.0000
7.0000 8.0000
=> #<ArrayFire::Af_Array:0x0000000265cbe0>
[2] pry(main)> u , s, vt = ArrayFire::LAPACK.svd(matrix)
=> [#<ArrayFire::Af_Array:0x000000026ff070>, #<ArrayFire::Af_Array:0x000000026ff048>, #<ArrayFire::Af_Array:0x000000026ff020>]
[3] pry(main)> ArrayFire::Util.print_array(u)
No Name Array
[4 4 1 1]
-0.1525 -0.8226 -0.3945 -0.3800
-0.3499 -0.4214 0.2428 0.8007
-0.5474 -0.0201 0.6979 -0.4614
-0.7448 0.3812 -0.5462 0.0407
=> true
[4] pry(main)> ArrayFire::Util.print_array(s)
No Name Array
[2 1 1 1]
14.2691
0.6268
=> true
[5] pry(main)> ArrayFire::Util.print_array(vt)
No Name Array
[2 2 1 1]
-0.6414 -0.7672
0.7672 -0.6414
=> true
[6] pry(main)> left = ArrayFire::Af_Array.new 2 , [3,3] , [1, 4, 6, 4, 11 , 2 ,-5, 8, 10]
No Name Array
[3 3 1 1]
1.0000 4.0000 -5.0000
4.0000 11.0000 8.0000
6.0000 2.0000 10.0000
=> #<ArrayFire::Af_Array:0x00000002984c78>
[7] pry(main)> det = ArrayFire::LAPACK.det(left)
=> 415.9999694824219
It works!
We can now use ArrayFire-rb
for linear algebra since the BLAS and LAPACK routines are in
place. Also, the performance is outstanding.
In the next blog, I will explain about using the Random Engine generator and Statistics methods for ArrayFire.
©2016-2024 Prasun Anand