7.2. Internal Data Structure

FIXME: explain the parameter stack, data stack, etc.

7.2.1. Parameter Stack And Data Stack

FIXME: follow the documentation in "Internals".

7.2.2. Storage of Complex Matrices

Many programming languages store a scalar complex variable z in Cartesian representation,

 \[ z = x + iy, \]

where x and y are real numbers and i denotes the imaginary unit. Such a complex number is stored in memory as a record:


type Complex is record
    RealPart : Float;
    ImagPart : Float;
end record;

Fortran chooses to store complex matrices as sequences of Complex, and almost all other programming languages follow this convention.


declare
    CpxVec : array (1 .. 10) of Complex;

Thus, the memory image of CpxVec, broken into pieces, is


  --- address ---        --- contents ---

addr +            0  :  CpxVec(1).RealPart
addr +   Float'Size  :  CpxVec(1).ImagPart
addr + 2*Float'Size  :  CpxVec(2).RealPart
addr + 3*Float'Size  :  CpxVec(2).ImagPart
addr + 4*Float'Size  :  CpxVec(3).RealPart
addr + 5*Float'Size  :  CpxVec(3).ImagPart
...

where addr is the start address of the complex vector CpxVec in memory. The obvious advantage of this storage scheme is that the same memory image can be viewed as a vector of Complex scalars:


  --- address ---          --- contents ---

addr +              0  :       CpxVec(1)
addr +   Complex'Size  :       CpxVec(2)
addr + 2*Complex'Size  :       CpxVec(3)
...

Scilab does not follow this convention for storing complex numbers; if it did, we would not have to write this section. Instead of storing the real and imaginary parts of a complex vector in turn, Scilab stores the vector of real parts and the vector of imaginary parts separately.

Our example vector CpxVec from above gets stored by Scilab in the following way:


     --- address ---          --- contents ---

real_addr +            0  :  CpxVec(1).RealPart
real_addr +   Float'Size  :  CpxVec(2).RealPart
real_addr + 2*Float'Size  :  CpxVec(3).RealPart
...

imag_addr +            0  :  CpxVec(1).ImagPart
imag_addr +   Float'Size  :  CpxVec(2).ImagPart
imag_addr + 2*Float'Size  :  CpxVec(3).ImagPart
...

where real_addr and imag_addr are the start addresses of the two vectors. Nothing should be assumed about how these two addresses relate; e.g. imag_addr need not point to the first memory cell after the last cell of the vector of real parts.

The consequence for a Scilab programmer who wants to interface routines that use the conventional (Fortran) storage scheme for complex matrices is that she has to splice the real and imaginary parts into one interleaved vector before calling the routine, and to split them into separate vectors again after completion. See Example 7-2 for a demonstration of this technique.

Example 7-1 re-implements wmmul, Scilab's multiplication of two complex matrices. With conventional storage the function would be much shorter, for we could simply use zgemm from BLAS to compute the product C of two matrices A and B. dgemm and zgemm compute C := Alpha*A*B + Beta*C, where Alpha and Beta are scalars.


type OrientationType is new Character;[1]

procedure dgemm
    (OrientationA  : in     OrientationType;
     OrientationB  : in     OrientationType;
     M             : in     Natural;
     N             : in     Natural;
     K             : in     Natural;
     Alpha         : in     Float;
     A             : in     FloatMatrix;
     LdA           : in     Natural;
     B             : in     FloatMatrix;
     LdB           : in     Natural;
     Beta          : in     Float;
     C             :    out FloatMatrix;
     LdC           : in     Natural);

procedure zgemm
    (OrientationA  : in     OrientationType;
     OrientationB  : in     OrientationType;
     M             : in     Natural;
     N             : in     Natural;
     K             : in     Natural;
     Alpha         : in     Complex;
     A             : in     ComplexMatrix;
     LdA           : in     Natural;
     B             : in     ComplexMatrix;
     LdB           : in     Natural;
     Beta          : in     Complex;
     C             :    out ComplexMatrix;
     LdC           : in     Natural);

      subroutine wmmul(a, na, b, nb, c, nc, l, m, n)

      call zgemm('n', 'n', l, n, m, (1.0d0, 0.0d0), a, na, b, nb,
     $     (0.0d0, 0.0d0), c, nc)

      end

But as Scilab stores the real and imaginary parts of a complex matrix separately, we use a Karatsuba multiplication scheme with only three real matrix multiplications instead of the four required by the naive algorithm. Expressed in Scilab, we have


function [cr, ci] = mul_karatsuba(ar, ai, br, bi)

// fast multiplication of two complex matrices
// z1 := ar + i*ai
// z2 := br + i*bi
// cr + i*ci =: z3 = z1 * z2

p1 = ar * br
p2 = ai * bi
cr = p1 - p2

s1 = ar + ai
s2 = br + bi
p3 = s1 * s2
ci = p3 - p1 - p2

The actual implementation of wmmul is a more space-saving version of the above.


function [cr, ci] = mul_karatsuba_final(ar, ai, br, bi)

p1 = ar * br
p2 = ai * bi

s1 = ar + ai
s2 = br + bi
ci = s1 * s2
ci = ci - p1 - p2
cr = p1 - p2

This shows how much effort it takes even to express the algorithm in Scilab. The Fortran function wmmul is more convoluted still, because of its several explicit do-loops.

Example 7-1. Multiplication of complex matrices


      subroutine wmmul(ar, ai, na, br, bi, nb, cr, ci, nc, l, m, n)
*
*     name       : wmmul.f  --  multiplication of two complex matrices;
*                                c := a * b
*     author     : L. van Dijk
*     last. rev. : Sun Jan 16 22:41:27 UTC 2000
*     Scilab ver.: 2.5
*     compiler   : g77 version 2.95.1 19990816 (release)

*     Copyright (C) 2000 Lydia van Dijk

*     PARAMETERS
*     ai,ar, bi,br, ci,cr: real and imaginary parts of the respective
*                          matrices
*     na, nb, nc: number of rows of resp. matrix in calling routine
*     l: number of rows in a and c
*     m: number of columns in a, and number of rows in b
*     n: number of columns in b and c
      implicit none
      double precision ar(*), ai(*), br(*), bi(*), cr(*), ci(*)
      integer na, nb, nc, l, m, n

*     LOCAL VARIABLES
      integer i, j
      integer ia, ib, ic
      double precision p1(l, n), p2(l, n)
      double precision s1(l, m), s2(m, n)

*     TEXT     
      call dgemm('n', 'n', l, n, m, 1.0d0, ar, na, br, nb, 
     $     0.0d0, p1, l)                                    -- p1 = ar * br
      call dgemm('n', 'n', l, n, m, 1.0d0, ai, na, bi, nb, 
     $     0.0d0, p2, l)                                    -- p2 = ai * bi
      ia = 0
      do 20 j = 1, m                                        -- s1 = ar + ai
         do 10 i = 1, l
            s1(i, j) = ar(ia+i) + ai(ia+i)
 10      continue
         ia = ia + na
 20   continue
      ib = 0
      do 40 j = 1, n                                        -- s2 = br + bi
         do 30 i = 1, m
            s2(i, j) = br(ib+i) + bi(ib+i)
 30      continue
         ib = ib + nb
 40   continue
      call dgemm('n', 'n', l, n, m, 1.0d0, s1, l, s2, m,
     $     0.0d0, ci, nc)                                   -- ci = s1 * s2
      ic = 0
      do 60 j = 1, n                                        -- ci = ci - p1 - p2
                                                            -- cr = p1 - p2
         do 50 i = 1, l
            ci(ic+i) = ci(ic+i) - p1(i, j) - p2(i, j)
            cr(ic+i) = p1(i, j) - p2(i, j)
 50      continue
         ic = ic + nc
 60   continue

      end
   

Notes

[1]

A serious Ada interface would not define OrientationType, but introduce the two types


type RealOrientationType is (NoTranspose, Transpose);
type ComplexOrientationType is (NoTranspose, ConjugateTranspose);
   

to let the compiler do the type checking. The BLAS routines even accept strings where we use OrientationType. However, a BLAS routine is supposed to look only at the first character. The valid strings are: "No transpose", "Transpose", and "Conjugate transpose".