Chapter 6. Performance

Table of Contents
6.1. High-Level Operations
6.2. Extending Scilab
6.3. Building an Optimized Scilab
 

Scilab—The fastest thing from France since Django Reinhardt.

  Ch. L. Spiel

In this chapter we discuss how expressions can be written to execute more quickly while doing the same thing. Scilab is powerful and flexible, so there are plenty of things one can do to speed up function execution. On the downside, there are a lot of things that can be done the wrong way, slowing down execution to a crawl.

In the first part of this chapter, Section 6.1, we focus on high-level operations that are inherently executed fast. The main class to name here is vectorized operations. Another class comprises all functions that construct or manipulate vectors or matrices as a whole. The second part of this chapter, Section 6.2, deals with extending Scilab through compiled functions for the sake of increased execution speed. We close with Section 6.3 on how to compile Scilab itself to increase its performance.

6.1. High-Level Operations

Not using vectorized operations is the main source of slow Scilab code. Here we present performance comparisons between different Scilab constructs that are semantically equivalent.

6.1.1. Vectorized Operations

The key to achieving high speed with Scilab is to avoid the interpreter and instead make use of the built-in vectorized operations. Let us explain that with a simple example.

Say we want to calculate the standard scalar product s of two vectors a and b which have the same length n. Naive as we are, we start with


s = 0                               // line 1
i = 1                               // line 2
while i <= n                        // line 3
    s = s + a(i) * b(i)             // line 4
    i = i + 1                       // line 5
end                                 // line 6

Here Scilab re-interprets lines 3 to 5 in every round-trip, which in total is n times. This results in slow execution. The example utilizes no vectorization at all. On the other hand it uses only very little memory, as no vectors have to be stored.

The first step to get some vectorization is to replace the while with a for loop.


s = 0                               // line 1
for i = 1:n                         // line 2
    s = s + a(i) * b(i)             // line 3
end                                 // line 4

Line 2 is only interpreted once; the vector i = 1:n is set up and the loop body, line 3, is threaded over it. So only line 3 is re-evaluated in each round-trip.

OK, it is time for a really fast vector operation. In the previous examples the expression in the loop body has not been modified, but we can replace it with the element-wise multiplication operator .* and replace the loop with the built-in sum function. (See also Section 6.1.3.3.)


s = sum(a .* b) 

One obvious advantage is that we have a one-liner now. Is that as good as it can get? No, the standard scalar product is not only a built-in function, it is also an operator:


s = a * b'

We summarize the timing results of a PII/330 GNU/Linux-system in Table 6-1.

Table 6-1. Comparison of various vectorization levels

construct MFLOPS
while 0.005
for 0.008
.* and sum 1.7
* 2.8

In other words, the speed ratio is roughly 1:1.6:340:560. Of course the numbers vary from system to system, but the general trend is clear. The figures tell us two things:

  1. Keeping the problem size the same, a vectorized operation is over a hundred times faster than the comparable interpreter (emulated) operation.

  2. In the same time Scilab executes several hundred or thousand vectorized operations, it can only run a single interpreted operation.

    
-->n=1000; timer(); for i=1:n, sqrt((i-1)*%pi/n); end; timer()
 ans  =
    0.05

-->n=100000; timer(); sqrt((1:n)*%pi/n); timer()
 ans  =
    0.04

The latter point is a valuable starting point for many vectorizations. This holds particularly for partial vectorizations, where the operations under consideration cannot be replaced by a single operator or function call. Even if a slow interpreted command cannot be replaced entirely by a vectorized operator, which would result in a speed-up of a factor of 500, say, parts of the command might still be amenable to vectorization. The important rule of thumb is that several hundred up to a thousand vectorized operations can be traded in for each interpreted operation replaced.

In the next example the matrix a is treated as a collection of row-vectors. The problem is to subtract row-vector b from each row of a. Obviously, this can be achieved with a loop. The faster way is to cast b into a matrix of the same shape as a and then subtract the two matrices. What seems to be a detour, duplicating the entries of b, turns out to be advantageous for performance.


a = [2.56, 2.85, 2.66; ..
     3.74, 3.25, 3.21; ..
     4.05, 4.89, 4.49; ..
     5.90, 5.94, 5.37];
b = [1.01, 1.67, 1.79];
[m, n] = size(a);

// non-vectorized
c0 = zeros(a);
for i = 1:m
    c0(i, :) = a(i, :) - b;
end
c0

// partial vectorization
c1 = a - b(ones(m, 1), :)

-->m = 1000; n = 200; a = rand(m, n); b = rand(1, n);
 
-->timer(); c0 = zeros(a); for i = 1:m, c0(i, :) = a(i, :) - b; end; timer()
 ans  =
    0.19  
 
-->timer(); c1 = a - b(ones(m, 1), :); timer()
 ans  =
    0.07  

6.1.2. Avoiding Indexing and Resizing

Accessing a single vector- or matrix-element in a loop (which is often even nested) is slow. Sometimes the loop/index construct cannot be avoided, but in many cases it can be replaced with an equivalent vectorizable expression. Moreover, if you cannot get around indexing single elements, at least avoid resizing (most often: growing) the vector or matrix. Compare the following three examples.


// (1) insert element at non-existent position => autovivify element
v = []
for i = 1:n
    v(i) = i
end

// (2) insert into pre-sized vector
v = zeros(1, n)
for i = 1:n
    v(i) = i
end

and


// (3) append to existing vector
v = []
for i = 1:n
    v = [v, i]
end

Snippet (2) is the fastest of the three. It should be used whenever the final size is known in advance, or if the final size can be calculated in an easy way. Appending to an existing vector or matrix (3) is almost twice as fast as forcing a new element to spring into existence by indexing (1). In the authors' opinion, snippet (3) is the clearer solution in comparison to (1) for all problems where the final vector size cannot be determined in advance.

But again for our specific example a built-in operator exists that does the same job at lightning speed: the range operator, colon ":", which is described in detail in Section 6.1.3.1.


// (4) range generator (colon operator)
v = 1:n

The speed ratio of examples (1), (2), (3) and (4) is approximately 1:20:2:4000.

In the next example, Example 6-1, the functions actually try to do something useful: they mirror a matrix along its columns or rows. We show different implementations of mirrorN that all do the same job, but utilize more and more of Scilab's vector power with increasing function index N.

Example 6-1. Variants of a matrix mirror function


function b = mirror1(a, dir)
// mirror matrix a along its
// rows, dir = 'r' (horizontal)
// or along its columns, dir = 'c' (vertical)

[rows, cols] = size(a)
select dir
case 'r' then
    for j = 1 : cols
        for i = 1 : rows
            b(i, j) = a(rows - i + 1, j)
        end
    end
case 'c' then
    for j = 1 : cols
        for i = 1 : rows
            b(i, j) = a(i, cols - j + 1)
        end
    end
else
    error("dir must be ''r'' or ''c''")
end


function b = mirror2(a, dir)
// same as mirror1

[rows, cols] = size(a)
b = []
select dir
case 'r' then
    for i = rows : -1 : 1
        b = [b; a(i, :)]
    end
case 'c' then
    for i = cols : -1 : 1
        b = [b, a(:, i)]
    end
else
    error("dir must be ''r'' or ''c''")
end


function b = mirror3(a, dir)
// same as mirror1

[rows, cols] = size(a)
select dir
case 'r' then
    i = rows : -1 : 1
    b = a(i, :)
case 'c' then
    i = cols : -1 : 1
    b = a(:, i)
else
    error("dir must be ''r'' or ''c''")
end


function b = mirror4(a, dir)
// same as mirror1

select dir
case 'r' then
    b = a($:-1:1, :)
case 'c' then
    b = a(:, $:-1:1)
else
    error("dir must be ''r'' or ''c''");
end
   

Besides the performance issues discussed here, the functions in Example 6-1 demonstrate how expressive Scilab is. The solutions look quite different, though they give the same results. The benchmark results of all functions are plotted in Figure 6-1, and an extensive discussion is found in Section 6.2.1. In brief, the functions get faster from top to bottom: function mirror1 is the slowest, mirror4 the fastest.

6.1.2.1. $-Constant

The last of the examples, mirror4, introduces a new symbol: the "highest index" $ along a given direction. The dollar sign is only defined in the index expression of a matrix. As 1 is always the lowest (or first) index, $ is always the highest (or last). The dollar represents a constant, but this constant varies across the expression! More precisely, it varies with each matrix dimension. Let us make things clear by giving an example.


-->m = [ 11 12 13; 21 22 23 ];

-->m(2, $)
 ans  =
    23.

-->m($, $)
 ans  =
    23.

-->m(:, $/2 + 1)
 ans  =
!   12. !
!   22. !

6.1.2.2. Reshaping

Reshaping a matrix in Scilab is a cheap operation. A 1000-times-1000 matrix is reshaped into a 2000-times-500, or a 250-times-4000 matrix at very little computational cost. However, keep in mind that the time to reshape is proportional to the total size of the matrix, i.e., reshaping an n-times-m matrix is an O(n*m) operation.

When to use reshaping? If an algorithm that requires multiple indices into a matrix can be mapped onto an equivalent one that accesses a vector, or vice versa, it can be a benefit to work with the more convenient representation and reshape afterwards.

Our example to illustrate this is simple, but gives you the gist of reshaping. Sorting into lexicographical order is most easily done with a vector. (gsort can sort a matrix into lexicographical order, see Section 6.1.3.3.6, but we want to demonstrate reshaping, not the functionality of gsort.) To get a matrix where strings with the same first letter are in the same row, we use matrix.


-->perm3 = ['cab', 'bca', 'acb', 'bac', 'cba', 'abc'];
-->sorted_perm3 = gsort(perm3, 'c', 'i');
-->matrix(sorted_perm3, 2, 3)'
 ans  =
!abc  acb  !
!          !
!bac  bca  !
!          !
!cab  cba  !

See also Section 6.1.3.3.8 about matrix, and the following section, Section 6.1.2.3 about the flattened matrix representation.

6.1.2.3. Flattened Matrix Representation

The $ sign leads us to the flattened or vector-like representation of a matrix if we index the matrix m from the above example with the full range 1:$.[1]


-->m(1:$)
 ans  =
!   11. !
!   21. !
!   12. !
!   22. !
!   13. !
!   23. !

The expression u = v(:) is a reshape operation, assigning to u the column-representation of v. For general reshaping of matrices, see the matrix function in Section 6.1.3.3.8.

Tip

Given the vector v, the expression v = v(:) is a very convenient idiom in a function to force v into column (i.e. N-times-1) form.

In general an n-times-m matrix mat can be accessed in three ways:

  • as a unit by saying mat,

  • by referencing its elements according to their row and column with mat(i, j), or

  • via indexing into the flattened form mat(i).

The following equivalence holds:

 \[ \mathrm{mat}_{i, j} = \mathrm{mat}_{i + (j - 1)n}. \]

Scilab follows Fortran in the way it stores matrices: in column-major form. See also the discussion of the function matrix in Section 6.1.3.3.

6.1.3. Built-In Vector-/Matrix-Functions

Scilab provides many built-in functions that work on vectors or matrices. Knowing what functions are available is important to avoid coding the same functionality with slow iterative expressions.

For further information about contemporary techniques of processing matrices with computers, the classical work "Matrix Computations" [Golub:1996] is recommended.

6.1.3.1. Vector Generation

There are two built-in functions and one operator to generate a row-vector of numbers.

6.1.3.1.1. Operator ":"

This syntax of the colon operator is


initial [: increment] : final

with a default increment of +1. To produce the equivalent piece of Scilab code, we write


x = initial
v = [ x ]
while x <= final - increment
    x = x + increment
    v = [v, x]
end

where v is the result. Note that the last element of the result will always be smaller than or equal to the value final.
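
For example, with an increment of 2 the sequence stops short of a final value that is not hit exactly:


-->2:2:7
 ans  =
!   2.    4.    6. !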

See also Section 2.6 for a discussion of the dangers involved in using a colon-expression with fractional parameters.

6.1.3.1.2. linspace

The syntax of linspace is


linspace(initial, final [, length])

using a default of 100 for length. linspace returns a row-vector with length entries, which divide the interval [initial, final] into equal-length sub-intervals. Both endpoints, i.e. initial and final, are always included.
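
For example, five entries split [0, 1] into four equal sub-intervals:


-->linspace(0, 1, 5)
 ans  =
!   0.    0.25    0.5    0.75    1. !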

6.1.3.1.3. logspace

logspace works much like linspace, and the following relation holds

 \[ \mathrm{logspace}(\mathrm{init}, \mathrm{final}) = 10^{\mathrm{linspace}(\mathrm{init}, \mathrm{final})} \]
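
The relation is element-wise, as a short example shows:


-->logspace(0, 2, 3)
 ans  =
!   1.    10.    100. !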

6.1.3.2. Whole Matrix Construction

All of the functions shown in this section are capable of producing arbitrary matrices, including the boundary cases of row- and column-vectors.

6.1.3.2.1. zeros

As the name suggests this function produces a matrix filled with zeros. The two possible instantiations are with two scalar arguments


n = 2
m = 5
mat = zeros(n, m)

or with one matrix argument


mat1 = [ 4 2; ..
         4 5; ..
         3 5 ]
mat2 = zeros(mat1)

The first form produces the n-times-m matrix mat made up of zeros, whereas the second builds the matrix mat2, which has the same shape as mat1 and also consists solely of zeros.

Caution Single scalar argument to zeros
 

In the case of a single scalar argument zeros returns a 1-times-1 matrix, the sole element being a zero.

Furthermore, note that


zeros()

is not allowed.

6.1.3.2.2. ones

ones is functionally equivalent to zeros, except that instead of returning a matrix filled with 0.0, it returns a matrix filled with 1.0. The only other difference is a third form which is permitted for ones: calling the function without any arguments:


-->ones()
 ans  =
    1.  

6.1.3.2.3. eye

The eye function produces a generalized identity matrix, that is, a matrix with elements

 \begin{eqnarray*} a_{i, j} & = & 0\quad\mathrm{for}\quad i \not= j,\quad\mathrm{and} \\ a_{i, j} & = & 1\quad\mathrm{for}\quad i = j. \end{eqnarray*}

This command is functionally equivalent to zeros. The only extension is the usage without any argument, where the result automatically takes over the dimensions of the matrix in the subexpression in which it is used.


-->a=[2 3 4 3; 4 2 6 7; 8 2 7 4]
 a  =
!   2.    3.    4.    3. !
!   4.    2.    6.    7. !
!   8.    2.    7.    4. !

-->a - 2*eye()
 ans  =
!   0.    3.    4.    3. !
!   4.    0.    6.    7. !
!   8.    2.    5.    4. !

6.1.3.2.4. diag

Function diag has two different working modes depending on the shape of its argument. Given a vector v, it constructs a diagonal matrix mat from the vector, with v being mat's main diagonal, i.e. mat(i, i) = v(i) for all i. Given an arbitrary matrix mat, diag extracts the main diagonal as a column-vector.


-->diag(2:2:8)
 ans  =
!   2.    0.    0.    0. !
!   0.    4.    0.    0. !
!   0.    0.    6.    0. !
!   0.    0.    0.    8. !

-->m = [2, 3, 8; 7, 6, -6; 0, -5, -8]
 m  =
!   2.    3.    8. !
!   7.    6.  - 6. !
!   0.  - 5.  - 8. !

-->diag(m)
 ans  =
!   2. !
!   6. !
! - 8. !

The 2-argument form of the diag function


diag(v, k)

constructs a matrix that has its diagonal k positions away from the main diagonal, the diagonal again being made up from v. Therefore, diag(v) is the special case of diag(v, 0). A positive k denotes diagonals above, a negative k diagonals below the main diagonal. As for the 1-argument form, extraction of the kth super-diagonal (positive k) or sub-diagonal (negative k) is also implemented.


-->diag([1 1 1 1]) + diag([2 2 2], 1) + diag([-2 -2 -2], -1)
 ans  =
!   1.    2.    0.    0. !
! - 2.    1.    2.    0. !
!   0.  - 2.    1.    2. !
!   0.    0.  - 2.    1. !

-->diag(m, -1) // using the same m as above
 ans  =
!   7. !
! - 5. !
Tip

Nesting two calls to diag is the building block for an interesting idiom to test whether a matrix m is a diagonal matrix.


and( abs(diag(diag(m)) - m) <= %eps * abs(m) )
   

The inner call to diag extracts m's main diagonal, the outer call takes this column-vector and constructs a matrix out of it. The rest of the code simply checks the relative error.

6.1.3.2.5. rand

The rand function generates pseudo-random scalars and matrices. Again the function shares its two fundamental forms with zeros. Moreover, the distribution of the numbers can be chosen from 'uniform', which is the default, and 'normal'. The generator's seed is set and queried with


rand('seed', new_seed)

and


current_seed = rand('seed')
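
A short session demonstrating the common forms; the seed value 17 is arbitrary:


-->rand('seed', 17);          // make the sequence reproducible
-->u = rand(2, 3);            // 2-times-3 matrix, uniform on (0, 1)
-->g = rand(2, 3, 'normal');  // same shape, standard normal distribution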

6.1.3.3. Functions Operating on a Matrix as a Whole

Although the section title might imply that the following functions apply to matrices only, Scilab accepts vectors anywhere a matrix is accepted (but not vice versa).

6.1.3.3.1. find

In our opinion one of the most useful functions in the group of whole-matrix functions is find. It takes a boolean expression of matrices, i.e. an expression which evaluates to a boolean matrix, as its argument, and in the form


index = find(expr)

returns the indices of the array elements that evaluate to true, i.e. %t, in a vector. See also Section 6.1.2.3.

In the form


[rowidx, colidx] = find(expr)

it returns the row- and column-index vectors separately. Here is a complete example:


-->a = [ 1 -4  3; 6  2 10 ]
 a  =
!   1.  - 4.    3.  !
!   6.    2.    10. !

-->index = find( a < 5 )
 index  =
!   1.    3.    4.    5. !

-->a(index)
 ans  =
!   1. !
! - 4. !
!   2. !
!   3. !

-->[rowidx, colidx] = find( a < 5 )
 colidx  =
!   1.    2.    2.    3. !
 rowidx  =
!   1.    1.    2.    1. !

The expression expr can be arbitrarily complex. It is not at all limited to a single matrix.


-->b = [1 2 3; 4 5 6]
 b  =
!   1.    2.    3. !
!   4.    5.    6. !

-->a < 5
 ans  =
! T T T !
! F T F !

-->abs(b) >= 4
 ans  =
! F F F !
! T T T !

-->a < 5 & abs(b) >= 4
 ans  =
! F F F !
! F T F !

-->find( a < 5 & abs(b) >= 4 )
 ans  =
    4.  

Last but not least, find works perfectly well on the left-hand side of an assignment. So, replacing all odd elements in a with 0 is simply


-->a( find(modulo(a, 2) == 1) ) = 0
 a  =
!   0.  - 4.    0.  !
!   6.    2.    10. !

To get the number of elements that match a criterion, just apply size(idxvec, '*') to the index vector idxvec of the find operation.
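
For example, counting the negative entries of a vector:


-->v = [3 -1 4 -1 5];

-->idx = find(v < 0)
 idx  =
!   2.    4. !

-->size(idx, '*')
 ans  =
    2.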

6.1.3.3.2. max, min

Searching for the smallest or the largest entry in a matrix is so common that Scilab has separate functions for these tasks. We discuss max only, as min behaves similarly.

To get the largest value saying


max_val = max(a)

is enough. The alternate form


-->[max_val, index] = max(a)
 index  =
!   2.    3. !
 max_val  =
    10.  

returns the position of the maximum element, too. The form of the index vector is the same as for size, i.e. [row-index, column-index]. Speaking of size, max also has the forms max(mat, 'r') and max(mat, 'c').


-->[max_val, rowidx] = max(b, 'r')
 rowidx  =
!   2.    2.    2. !
 max_val  =
!   4.    5.    6. !

-->[max_val, colidx] = max(b, 'c')
 colidx  =
!   3. !
!   3. !
 max_val  =
!   3. !
!   6. !

These forms return the maximum values of each row or column along with the respective indices of the elements' rows or columns.

The third way of using max is with more than one matrix or scalar argument. All the matrices must be compatible; scalars are expanded to full matrix size, as if by scalmat = scal * ones(mat). The returned matrix holds the largest elements from all argument matrices.


-->max(a, b, 3)
 ans  =
!   3.    3.    3.  !
!   6.    5.    10. !

6.1.3.3.3. and, or

Both and and or borrow their syntax from the size function: without a second argument, or with a star "*" as second argument, the function is applied to the argument as a whole. A 1 or an "r" applies the function along the row index, i.e. separately to each column, yielding a row-vector as result. Accordingly, a 2 or a "c" applies the function along the column index, i.e. separately to each row, yielding a column-vector as result.

The function and returns true if all components of the argument are true. Therefore, it is related to Fortran-9x's all function. Similarly function or returns true if any component of its argument is true, mimicking Fortran-9x's any function.

One of the fastest ways of testing whether a vector (or matrix) v contains any non-zero element uses or: or(v). As demonstrated with the find function, the arguments to and and or can be arbitrarily complex boolean expressions. If we want to test whether all components of the vector v = [1.0 0.95 1.02] are within 10% of the value 1, we do not need a loop: and( abs(v - 1.0) < 0.1 ).

6.1.3.3.4. Operator "&", Operator "|"

The operators "&" and "|" perform a component-wise logical-and or logical-or operation. See also Section 4.3.3. The arguments to either operator can be scalars or matrices.

6.1.3.3.5. sum, cumsum, prod, cumprod

These are the numeric cousins of the boolean function pair or and and. Their syntax is identical. The "cum" functions work cumulatively, returning a vector (matrices are processed in their flattened representation).

A fast factorial function?


function f = fact(n)

if n < 0 then
    error("fact: domain")
end

if n == 0 then
    f = 1
else
    f = prod(1 : n)
end

$1000 at 4.5% over 7 years?


-->1000.0 * cumprod( (1.0 + 0.045) * ones(7, 1) )
 ans  =
!   1045.     !
!   1092.025  !
!   1141.1661 !
!   1192.5186 !
!   1246.1819 !
!   1302.2601 !
!   1360.8618 !

though 1000.0 * (1.0 + 0.045)^(1:7)' produces the same result and requires fewer keystrokes.
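
cumsum works analogously for running totals, for example:


-->cumsum(1:5)
 ans  =
!   1.    3.    6.    10.    15. !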

6.1.3.3.6. gsort

Warning

Do not use sort! It is buggy in that it sometimes does not return a permutation of the input data. Use gsort instead of sort.

The gsort function is a versatile sorting function for vectors and matrices of real numbers or strings. It sorts into increasing or decreasing (default!) order, sorts a matrix's rows or columns separately, and can sort the rows or columns lexicographically. The output of gsort is not only the sorted matrix mat_sorted but also the permutation vector permutation that generates the sorted matrix from the input matrix. The synopsis is


    [mat_sorted, permutation] = gsort(mat_input, mode, direction)

where mode can have the values shown in Table 6-2, and direction the values displayed in Table 6-3.

Table 6-2. Mode Specifiers for gsort

Specifier Action Note
'g' sort flattened matrix default
'r' column-by-column  
'c' row-by-row  
'lr' rows lexicographically not 'rl'!
'lc' columns lexicographically not 'cl'!

Table 6-3. Direction Specifiers for gsort

Specifier Action Note
'i' increasing order or upgrade  
'd' decreasing order or downgrade default

Let us look at some simple examples. We use a numeric matrix in the example, but a string matrix would do as well.


-->mat1 = [11 12; 21 22; 31 32]
 mat1  =
!   11.    12. !
!   21.    22. !
!   31.    32. !
 
-->gsort(mat1)
 ans  =
!   32.    21. !
!   31.    12. !
!   22.    11. !
 
-->gsort(mat1, 'r')
 ans  =
!   31.    32. !
!   21.    22. !
!   11.    12. !
 
-->gsort(mat1, 'c')
 ans  =
!   12.    11. !
!   22.    21. !
!   32.    31. !

Applied without parameters, gsort sorts the flattened (see also Section 6.1.2.3) version, here: mat(:), of its argument into decreasing order. The 'r'- or 'c'-options tell gsort to sort each column or row separately.

Note

'r' means column wise, and 'c' means row wise!

The next example points out the difference between simple row- or column-sorting and lexicographical sorting of columns or rows.


-->mat2 = [6 72 23; 56 19 23; 66 54 21]
 mat2  =
!   6.     72.    23. !
!   56.    19.    23. !
!   66.    54.    21. !

-->gsort(mat2, 'r')  // col-by-col
 ans  =
!   66.    72.    23. !
!   56.    54.    23. !
!   6.     19.    21. !

-->gsort(mat2, 'lc')  // col lexico
 ans  =
!   72.    23.    6.  !
!   19.    23.    56. !
!   54.    21.    66. !

-->gsort(mat2, 'c')  // row-by-row
 ans  =
!   72.    23.    6.  !
!   56.    23.    19. !
!   66.    54.    21. !

-->gsort(mat2, 'lr')  // row lexico
 ans  =
!   66.    54.    21. !
!   56.    19.    23. !
!   6.     72.    23. !

Now what is the exact difference between row-by-row sorting and lexicographic row sorting? After row-by-row sorting (in decreasing order) of an m-times-n matrix a the following relation holds:

 \[ a_{i, j} \ge a_{i, j+1} \quad\mathrm{for}\quad 1 \le i \le m \quad\mathrm{and}\quad 1 \le j \le n - 1. \]

In other words each row is sorted separately by interchanging its columns. After a lexicographic sort the relation between the rows is:

 \[ a_{i, :} \ge a_{i + 1, :} \quad\mathrm{for}\quad 1 \le i \le m - 1. \]

This time whole rows are compared to each other. Analogous relations hold for column sorting.

In environments not as rich as Scilab gsort might be the heart of user-written min, max, and median functions. All three are predefined in Scilab.

6.1.3.3.7. size

The size function handles all shape inquiries. It comes in four different guises. Assuming that mat is a scalar or matrix, size can be used as an all-info-at-once function as in


[rows, cols] = size(mat)

as a row-only or column-only function


rows = size(mat, 'r')
cols = size(mat, 'c')

and finally as a totaling function


elements = size(mat, '*')
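
For example:


-->[rows, cols] = size([1 2 3; 4 5 6])
 cols  =
    3.
 rows  =
    2.

-->size([1 2 3; 4 5 6], '*')
 ans  =
    6.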

6.1.3.3.8. matrix

A (hyper-)matrix can be reshaped with the matrix command. To keep things simple we demonstrate matrix with a 2x6-matrix.


-->a = [1:6; 7:12]
 a  =
!   1.    2.    3.    4.     5.     6.  !
!   7.    8.    9.    10.    11.    12. !

-->matrix(a, 3, 4)
 ans  =
!   1.    8.    4.     11. !
!   7.    3.    10.    6.  !
!   2.    9.    5.     12. !

-->matrix(a, 4, 3)
 ans  =
!   1.    3.     5.  !
!   7.    9.     11. !
!   2.    4.     6.  !
!   8.    10.    12. !

In contrast to the Fortran-9x function RESHAPE, matrix allows neither padding nor truncation of the reshaped matrix. Put another way, for an m-times-n matrix a the reshaped dimensions p and q must obey

 \[ m n = p q \]

matrix works by column-wise "filling" the contents of the original matrix a into an empty template of a p-times-q matrix. (See also Section 6.1.2.3.) If this is too hard to imagine, the second way to think of it is to imagine a as a column vector of dimensions (m * n)-times-1 that is broken down column by column into a p-times-q matrix. In fact this is not pure imagination, as for many Scilab matrix operations the identity a(i, j) == a(i + m*(j - 1)) holds.


-->a(2,4)
 ans  =
    10.  

-->a(8)
 ans  =
    10.  

Moreover, the usual vector subscripting can be applied to a matrix.


-->a(:)
 ans  =
!   1.  !
!   7.  !
!   2.  !
!   8.  !
!   3.  !
!   9.  !
!   4.  !
!   10. !
!   5.  !
!   11. !
!   6.  !
!   12. !

6.1.4. Evaluation of Polynomials

Once upon a time there was a little Scilab newbie who coded an interface to the optim routine to make polynomial approximations easier. On the way, an evaluation function for polynomials had to be written. The author was very proud of herself because she knew the Right Thing(tm) to do in this case, namely the Horner algorithm. Actually, she immediately came up with two implementations.

Example 6-2. Naive functions to evaluate a polynomial


function yv = peval1(cv, xv)
// Evaluate the polynomial given by the vector of its
// coefficients cv in ascending order, i.e.
// cv = [p q r]  -->  p + q*x + r*x^2 at all
// points listed in vector xv and return the
// resulting vector.

yv = cv(1) * ones(xv)
px = xv
for c = cv(2 : $)
    yv = yv + c * px
    px = px .* xv
end


function yv = peval2(cv, xv)
// same as peval1

yv = cv($);
for i = length(cv)-1 : -1 : 1
    yv = yv .* xv + cv(i)
end
   

So what is wrong with that? This code looks OK and it does the job. But from the performance viewpoint it is not optimal! The fact that Scilab offers a separate type for polynomials has been ignored. Even if we are forced to supply an interface with the coefficients stored in vectors, the built-in function freq is preferable.

Example 6-3. Less naive functions to evaluate a polynomial


function yv = peval3(cv, xv)
// same as peval1, using horner()

p = poly(cv, 't', 'coeff')
yv = horner(p, xv)


function yv = peval4(cv, xv)
// same as peval1, using freq()
// The return value yv _always_ is a row-vector.

p = poly(cv, 't', 'coeff')
unity = poly(1, 't', 'coeff')
yv = freq(p, unity, xv)
   

Table 6-4 shows the speed ratios (each line is normalized separately) for a polynomial of degree 4 that we got on a P5/166 GNU/Linux system.

Table 6-4. Performance comparison of different polynomial evaluation routines

evaluations peval1 peval2 peval3 peval4
5 3.5 4.2 1 7.0
1000 1.4 2.5 1 2.5

If we now decide to change our interface to take Scilab's built-in polynomial type, the evaluation with freq can again be accelerated by a factor of more than three.
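
Such an interface might look like the following sketch. The function name peval5 and the use of varn to recover the polynomial's variable are our assumptions, not part of the original benchmark:


function yv = peval5(p, xv)
// same as peval1, but p is a Scilab polynomial
// instead of a coefficient vector
yv = freq(p, poly(1, varn(p), 'coeff'), xv)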

Notes

[1]

Remember that the colon operator returns a row-vector.