Getting started in C

Every data type object in libscientific is stored in the HEAP and then supports dynamic memory allocation.

Hence, every data object such as matrix, vectors, tensors, models, in general, need to be manually allocated/deallocated by the programmer using the library predefined constructs “NewSOMETHING(&…);” and “DelSOMETHING(&…);”.

Hence, every data object is a pointer and needs to pass by reference “&”. To avoid memory fragmentation problems, please, consider deallocating every allocated variable at the end of your program :-)

Compile a program that use libscientific

A program that use libscientific requires only one directive as follow:

1#include <scientific.h>

and to compile the code with a C or C++ compiler linking with -lscientific and specify the right paths using the -L<library path/of/libscientific and -I<include path/of libscientific> options.

1gcc -o example1 -L/usr/local/lib/ -I/usr/local/include -lscientific example1.c

Vector operations

Create/Allocate a vector

There are four different types of vectors

  • Double vector: dvector

  • Integer vector: ivector

  • Unsigned integer vector: uivector

  • String vector: strvector

Here we show an example on how to allocate/deallocate these four vector types.

 1#include <stdio.h>
 2#include <scientific.h>
 3
 4
 5int main(void){
 6  int i;
 7  /* Define the variable vector, which in this case is a double vector.
 8   * If we would like to use an integer or an unsigned integer or a 
 9   * string vector instead this construct became
10   * ivector *v; or uivector *v; or strvector *v;
11   */
12
13  dvector *v;
14  NewDVector(&v, 5); // Allocate the memory space.
15
16  // Fill the value inside the vector
17  for(i = 0; i < 5; i++){
18    v->data[i] = (float)i;
19  }
20
21  // Show the vector values to video
22  PrintDVector(v);
23
24  // Free the memory space
25  DelDVector(&v);
26}
27

Append a value to a given vector

Here we show an example on how to append a value to a vector.

 1#include <stdio.h>
 2#include <scientific.h>
 3
 4
 5int main(void){
 6  int i;
 7
 8  /*
 9   * We define here the double vector. 
10   * If we would like to utilize another vector
11   * we can change dvector with the other three possibilities:
12   * - uivector
13   * - ivector
14   * - strvector
15   *
16   */
17  dvector *v;
18
19  NewDVector(&v, 5); // We intialize the vector 
20
21  // We add 5 numbers
22  for(i = 0; i < 5; i++){
23    v->data[i] = (float)i;
24  }
25 
26  // We append a new number
27  DVectorAppend(v, 123.4);
28  
29  // Print to video the result
30  PrintDVector(v);
31
32  // Free up the memory
33  DelDVector(&v);
34}
35

Work with string vectors

Matrix operations

Matrix is a user-defined data type that contains: - the number of rows - the number of columns - the 2D data array, which defines the matrix.

The data array is selected explicitly as a double type to work with an extensive range of numbers.

 * - **data** two dimensional array of double
 * - **row** number of rows
 * - **col** number of columns
 */

Create/Allocate a matrix with a specific size

Create a simple matrix of 10 rows and 15 columns and fill it with numbers.

Then print it to the terminal using “PrintMatrix();”

 1#include <stdio.h>
 2#include <scientific.h>
 3
 4
 5int main(void)
 6{
 7    int i, j;
 8    matrix *m; // Definition of the pointer matrix variable
 9    NewMatrix(&m, 10, 15); // Create the matrix with 10 rows and 15 columns. Each value in the matrix is 0.
10    for(i = 0; i < 10; i++){
11        for(j = 0; j < 15; j++){
12            m->data[i][j] = (float)i+j; // Fill the matrix values with these numbers
13        }
14    }
15    PrintMatrix(m); // Print to video the matrix content
16    DelMatrix(&m); // Free the memory space
17}

Initialize an empty matrix and append a row/column to it

An empty matrix is an object with rows and cols equal to 0. However in that matrix object we can add dynamically rows and columns and or resize it later on.

In this example we will initialize an empty matrix and we will add to it several rows.

 1#include <stdio.h>
 2#include <scientific.h>
 3
 4int main(void)
 5{
 6    int i;
 7    matrix *mx; // Definition of the matrix variable as a pointer
 8    dvector *row; // Definition of the row variable as a pointer 
 9
10    NewDVector(&row, 15);
11    for(i = 0; i < row->size; i++){
12        row->data[i] = (double)i; // Fill one time the row vector 
13    }
14
15    initMatrix(&mx); // Initialize the empty matrix with rows and columns equal to 0
16
17    for(i = 0; i < 5; i++){
18        MatrixAppendRow(mx, row); // Append 5 times the row to the matrix mx
19    }
20
21    PrintMatrix(mx); // Print to video the matrix 
22
23    DelDVector(&row); // Free the memory space for the row vector
24    DelMatrix(&mx); // Free the memory space for the matrix
25}
26
27

Of course the same code can be reused to add a columns using the construct “MatrixAppendCol” instead of “MatrixAppendRow”.

Matrix x Column vector dot product

In this example we illustrate the product between a matrix of sizes M x N and a column double vector of size N.

 1#include <stdio.h>
 2#include <scientific.h>
 3
 4int main(void)
 5{
 6    int i, j;
 7    matrix *mx; // Definition of the matrix variable as a pointer
 8    dvector *cvect; // Definition of the column vector as a pointer 
 9    dvector *result; // Definition of the result vector between the matrix and the column vector
10
11    NewDVector(&cvect, 15);
12    for(i = 0; i < cvect->size; i++){
13        cvect->data[i] = (double)i; // Fill one time the row vector 
14    }
15
16    NewMatrix(&mx, 23, 15); // Initialize a matrix with 23 rows and 15 columns
17
18    for(i = 0; i < mx->row; i++){
19        for(j = 0; j < mx->col; j++){
20            mx->data[i][j] = (double)i+j;
21        }
22    }
23    NewDVector(&result, 23);
24
25    MatrixDVectorDotProduct(mx, cvect, result);
26    /*
27     * or MT_MatrixDVectorDotProduct if you want to run the multitask operation.
28     * This function is usefull for large matrix.
29     */
30
31
32    PrintDVector(result); // Print to video the result
33    // Free the memory spaces
34    DelDVector(&result); 
35    DelDVector(&cvect);
36    DelMatrix(&mx);
37}
38
39

Transpose a matrix

A matrix transpose is an operation that flips a matrix over its diagonal. Here is an example that shows how to produce a transpose of a given matrix.

 1#include <stdio.h>
 2#include <scientific.h>
 3
 4
 5int main(void)
 6{
 7    int i, j;
 8    matrix *m, *m_T; // Definition of the pointer matrix variable
 9    NewMatrix(&m, 10, 15); // Create the matrix with 10 rows and 15 columns. Each value in the matrix is 0.
10    NewMatrix(&m_T, m->col, m->row); // Create the transposed matrix with the flip of the columns and rows size
11
12    for(i = 0; i < 10; i++){
13        for(j = 0; j < 15; j++){
14            m->data[i][j] = (float)i+j; // Fill the matrix values with these numbers
15        }
16    }
17    MatrixTranspose(m, m_T);
18    PrintMatrix(m); // Print to video the original matrix content
19    puts("Transposed matrix");
20    PrintMatrix(m_T); // Print to video the transposed matrix 
21    // Free the memory spaces
22    DelMatrix(&m);
23    DelMatrix(&m_T);
24}

Invert a matrix

In this example we show how to invert a matrix with libscientific

 1#include <stdio.h>
 2#include <scientific.h>
 3#include <math.h>
 4
 5int main(void)
 6{
 7    int i;
 8    matrix *m; // Definition of the matrix variable as a pointer
 9    matrix *m_inv; // Definition of the inverted matrix variable as a pointer 
10
11    NewMatrix(&m, 10, 10); // Allocate the matrix to invert 
12    MatrixInitRandomFloat(m, -3., 3.); // Random fill the matrix with values within a range -3 < x < 3
13    PrintMatrix(m); // Print to video the matrix 
14
15    initMatrix(&m_inv); // Initialize the matrix to invert
16    MatrixInversion(m, m_inv); // Invert the matrix
17
18    double det = fabs(MatrixDeterminant(m)); // Calculate the determinant
19
20    printf("Determinant %.4f\n", det); // Print to video the matrix determinant
21    PrintMatrix(m_inv); // Print to video the inverted matrix
22
23    // Free the memory spaces
24    DelMatrix(&m_inv);
25    DelMatrix(&m);
26}
27
28

Calculate eigenvectors and eigenvalues of a matrix

This example shows how to calculate eigenvectors and eigenvalues of an N x N real nonsymmetric matrix. The eigenvector/eigenvalue is computed thanks to the dgeev.f code extracted from the Lapack library.

 1#include <stdio.h>
 2#include <scientific.h>
 3#include <math.h>
 4
 5int main(void)
 6{
 7    int i;
 8    matrix *m; // Definition of the matrix variable as a pointer
 9    dvector *eval; // Definition the variable to store the eigenvalues
10    matrix *evect; // Definition of the variable were to store the eigenvectors
11
12    NewMatrix(&m, 10, 10); // Allocate the matrix to invert 
13    MatrixInitRandomFloat(m, -3., 3.); // Random fill the matrix with values within a range -3 < x < 3
14    PrintMatrix(m); // Print to video the matrix 
15
16    // Initialize the variables
17    initDVector(&eval);
18    initMatrix(&evect);
19
20    EVectEval(m, eval, evect); // Calculate the eigenvectors and associated eigenvalues
21
22    PrintDVector(eval); // Print to video the eigenvalues
23    PrintMatrix(evect); // Print to video the eigenvectors. Each column correspond to an eingenvalue
24
25    // Free the memory spaces
26    DelDVector(&eval);
27    DelMatrix(&evect);
28    DelMatrix(&m);
29}
30
31

Singular Value Decomposition of a square matrix

In this example we show how to factorize a square matrix using the singular value decomposition (SVD) method

 1#include <stdio.h>
 2#include <scientific.h>
 3#include <math.h>
 4
 5int main(void)
 6{
 7    int i;
 8    matrix *m; // Definition of the matrix variable as a pointer
 9    matrix *U; // Definition of the complex unitary matrix
10    matrix *S; // Definition of the rectangular diagnonal matrix with non-negative real numbers on diagonal
11    matrix *Vt; // Definition of conjugate transpose of a complex unitary matrix
12
13    NewMatrix(&m, 10, 10); // Allocate the matrix to invert 
14    MatrixInitRandomFloat(m, -3., 3.); // Random fill the matrix with values within a range -3 < x < 3
15    PrintMatrix(m); // Print to video the matrix 
16
17    // Initialize the variables
18    initMatrix(&U);
19    initMatrix(&S);
20    initMatrix(&Vt);
21    SVD(m, U, S, Vt); // Internal method
22    // SVDlapack(m, U, S, Vt); lapack method using dgesdd
23    // Print to video the results of the factorization
24    PrintMatrix(U);
25    PrintMatrix(S);
26    PrintMatrix(Vt);
27
28    // Free the memory spaces
29    DelMatrix(&U);
30    DelMatrix(&S);
31    DelMatrix(&Vt);
32    DelMatrix(&m);
33}
34
35

Tensor operations

Tensor is a user-defined data type that contains: - order: the number of matrix - m: the array the 2D data array, which defines the tensor itself.

The data array is selected explicitly as a double type to work with an extensive range of numbers.

 * Tensor data structure 
 *
 * - **m** list of matrix
 * - **order** number of matrix layers

Create/Allocate a tensor with a specific size

Create a simple tensor of 3 blocks, 10 rows and 15 columns and fill it with numbers.

Then print it to the terminal using “PrintTensor();”

 1#include <stdio.h>
 2#include <scientific.h>
 3
 4
 5int main(void)
 6{
 7    int i, j, k;
 8    tensor *t; // Definition of the pointer tensor variable
 9    NewTensor(&t, 3); // Create the tensor with 3 blocks;
10    for(k = 0; k < 3; k++){
11        NewTensorMatrix(t, k, 10, 15);
12        for(i = 0; i < 10; i++){
13            for(j = 0; j < 15; j++){
14                t->m[k]->data[i][j] = (float)i+j; // Fill the tensor values with these numbers
15            }
16        }
17    }
18    PrintTensor(t); // Print to video the tensor content
19    DelTensor(&t); // Free the memory space
20}

Initialize an empty tensor and append different matrix to it

An empty tensor is an object with matrix equal to 0. In that object we can add dynamically different matrix with different rows and columns.

In this example we will initialize an empty tensor and we add different matrix to it

 1#include <stdio.h>
 2#include <scientific.h>
 3
 4
 5int main(void)
 6{
 7    size_t i, j, k;
 8    tensor *t; // Definition of the pointer tensor variable
 9    matrix *m;
10    initTensor(&t); // Tensor allocation
11    NewMatrix(&m, 10, 7); // Create of a matrix with 10 rows and 7 columns
12    TensorAppendMatrix(t, m); // Append this matrix to the tensor t
13    DelMatrix(&m); // Delete the matrix
14    
15    NewMatrix(&m, 10, 10); // Create of a second matrix with 10 rows and 10 columns
16    TensorAppendMatrix(t, m); // Append this new matrix to the tensor t
17    DelMatrix(&m);
18    
19    for(k = 0; k < t->order; k++){
20        for(i = 0; i < t->m[k]->row; i++){
21            for(j = 0; j < t->m[k]->col; j++){
22                t->m[k]->data[i][j] = (float)i+j; // Fill the tensor values with these numbers
23            }
24        }
25    }
26    PrintTensor(t); // Print to video the tensor content
27    DelTensor(&t); // Free the memory space
28}

Multivariate analysis algorithms

In this section, you will find examples of running multivariate analysis algorithms. In particular, the algorithm described here is extracted from official scientific publications and is adapted to run in multithreading to speed up the calculation.

  • PCA and PLS implements the NIPALS algorithm described in the following publication:

P. Geladi, B.R. Kowalski
Partial least-squares regression: a tutorial
Analytica Chimica Acta Volume 185, 1986, Pages 1-17
DOI:10.1016/0003-2670(86)80028-9
  • CPCA implements the NIPALS algorithm described in the following publication:

ANALYSIS OF MULTIBLOCK AND HIERARCHICAL PCA AND PLS MODELS
JOHAN A. WESTERHUIS, THEODORA KOURTI* AND JOHN F. MACGREGOR
J. Chemometrics 12, 301–321 (1998)
DOI:/10.1002/(SICI)1099-128X(199809/10)12:5<301::AID-CEM515>3.0.CO;2-S

Principal Component Analysis (PCA)

Here is an example that shows how to compute a principal component analysis on a matrix.

 1#include <stdio.h>
 2#include <scientific.h>
 3
 4int main(void)
 5{
 6    matrix *m; // Definition of the input matrix 
 7    PCAMODEL *model; // Definition of the PCA model
 8    int i, j;
 9    int nobj = 20;
10    int nvars = 8;
11    NewMatrix(&m, nobj, nvars);
12
13    // Fill with random values the matrix m
14    srand(nobj);
15    for(size_t i = 0; i < nobj; i++){
16        for(size_t j = 0; j < nvars; j++){
17            m->data[i][j] = randDouble(0,20);
18        }
19    }
20
21
22    NewPCAModel(&model); // Allocation of the PCA model
23    PCA(m, 1, 5, model, NULL); // Calculation of the PCA on matrix m using unit variance scaling (1) and the extraction of 5 principal components 
24
25    PrintPCA(model); // Print to video the PCA results
26
27    /* Of course you can print in a separate way the different results contained in the model variable
28     * model->scores is the matrix of scores
29     * model->loadings is the matrix of loadings
30     * model->colavg is the column average obtained from the input matrix
31     * model->scaling is the scaling factor obtained from the input matrix
32     */
33
34    // Free the memory spaces
35    DelPCAModel(&model);
36    DelMatrix(&m);
37}
38

Consensus Principal Component Analysis (CPCA)

Here is an example that shows how to compute a consenus principal component analysis on a tensor.

 1#include <stdio.h>
 2#include <scientific.h>
 3
 4int main(void)
 5{
 6    tensor *t; // Definition of the input tensor
 7    CPCAMODEL *model; // Definition of the CPCA model
 8    int i, j, k;
 9    int nblocks = 4;
10    int nobj = 20;
11    int nvars[4] = {8, 10, 5, 7}; //every block have different variables
12    NewTensor(&t, nblocks);
13
14    // Fill with random values the matrix m
15    srand(nobj);
16    for(k = 0; k < nblocks; k++){
17        NewTensorMatrix(t, k, nobj, nvars[k]);
18        for(i = 0; i < nobj; i++){
19            for(j = 0; j < nvars[k]; j++){
20                t->m[k]->data[i][j] = randDouble(0,20);
21            }
22        }
23    }
24
25
26    NewCPCAModel(&model); // Allocation of the CPCA model
27    CPCA(t, 1, 5, model); // Calculation of the CPCA on matrix m using unit variance scaling (1) and the extraction of 5 super principal components 
28
29    PrintCPCA(model); // Print to video the CPCA results
30
31    /* Of course you can print in a separate way the different results contained in the model variable
32     * model->super_scores is the matrix of super scores
33     * model->super_weights is the matrix of super weights
34     * model->block_scores is the tensor of scores for each block
35     * model->block_loadings is the tensor of loadings for each block
36     */
37
38    // Free the memory spaces
39    DelCPCAModel(&model);
40    DelTensor(&t);
41}
42

Partial Least Squares (PLS)

A matrix of features or independent vIariables and a matrix of targets or dependent variables is requested to calculate a PLS model. Here is a simple example that shows how to calculate a PLS model.

  1#include <stdio.h>
  2#include <scientific.h>
  3
  4int main(void)
  5{
  6    matrix *x, *y; // Define the feature matrix x and the target to predict y
  7    dvector *betas; // Define the beta coefficients
  8    PLSMODEL *m;
  9
 10    // Allocate the matrix
 11    NewMatrix(&x, 14, 6);
 12    NewMatrix(&y, 14, 1);
 13
 14    // Fill the matrix with values
 15    // This is a manual filling.
 16    // Of course we can read a csv file and fill it automatically
 17
 18    x->data[0][0] = 4.0000;  x->data[0][1] = 4.0000;  x->data[0][2] = 1.0000;  x->data[0][3] = 84.1400;  x->data[0][4] = 1.0500;  x->data[0][5] = 235.1500;
 19    x->data[1][0] = 5.0000;  x->data[1][1] = 5.0000;  x->data[1][2] = 1.0000;  x->data[1][3] = 79.1000;  x->data[1][4] = 0.9780;  x->data[1][5] = 231;
 20    x->data[2][0] = 4.0000;  x->data[2][1] = 5.0000;  x->data[2][2] = 1.0000;  x->data[2][3] = 67.0900;  x->data[2][4] = 0.9700;  x->data[2][5] = 249.0000;
 21    x->data[3][0] = 4.0000;  x->data[3][1] = 4.0000;  x->data[3][2] = 1.0000;  x->data[3][3] = 68.0700;  x->data[3][4] = 0.9360;  x->data[3][5] = 187.3500;
 22    x->data[4][0] = 3.0000;  x->data[4][1] = 4.0000;  x->data[4][2] = 2.0000;  x->data[4][3] = 68.0800;  x->data[4][4] = 1.0300;  x->data[4][5] = 363.0000;
 23    x->data[5][0] = 9.0000;  x->data[5][1] = 7.0000;  x->data[5][2] = 1.0000;  x->data[5][3] = 129.1600;  x->data[5][4] = 1.0900;  x->data[5][5] = 258.0000;
 24    x->data[6][0] = 10.0000;  x->data[6][1] = 8.0000;  x->data[6][2] = 0.0000;  x->data[6][3] = 128.1600;  x->data[6][4] = 1.1500;  x->data[6][5] = 352.0000;
 25    x->data[7][0] = 6.0000;  x->data[7][1] = 6.0000;  x->data[7][2] = 0.0000;  x->data[7][3] = 78.1118;  x->data[7][4] = 0.8765;  x->data[7][5] = 278.6400;
 26    x->data[8][0] = 16.0000;  x->data[8][1] = 10.0000;  x->data[8][2] = 0.0000;  x->data[8][3] = 202.2550;  x->data[8][4] = 1.2710;  x->data[8][5] = 429.1500;
 27    x->data[9][0] = 6.0000;  x->data[9][1] = 12.0000;  x->data[9][2] = 0.0000;  x->data[9][3] = 84.1600;  x->data[9][4] = 0.7800;  x->data[9][5] = 279.0000;
 28    x->data[10][0] = 4.0000;  x->data[10][1] = 8.0000;  x->data[10][2] = 1.0000;  x->data[10][3] = 72.1100;  x->data[10][4] = 0.8900;  x->data[10][5] = 164.5000;
 29    x->data[11][0] = 4.0000;  x->data[11][1] = 9.0000;  x->data[11][2] = 1.0000;  x->data[11][3] = 71.1100;  x->data[11][4] = 0.8660;  x->data[11][5] = 210.0000;
 30    x->data[12][0] = 5.0000;  x->data[12][1] = 11.0000;  x->data[12][2] = 1.0000;  x->data[12][3] = 85.1500;  x->data[12][4] = 0.8620;  x->data[12][5] = 266.0000;
 31    x->data[13][0] = 5.0000;  x->data[13][1] = 10.0000;  x->data[13][2] = 1.0000;  x->data[13][3] = 86.1300;  x->data[13][4] = 0.8800;  x->data[13][5] = 228.0000;
 32
 33    y->data[0][0] = 357.1500;
 34    y->data[1][0] = 388.0000;
 35    y->data[2][0] = 403.0000;
 36    y->data[3][0] = 304.5500;
 37    y->data[4][0] = 529.0000;
 38    y->data[5][0] = 510.0000;
 39    y->data[6][0] = 491.0000;
 40    y->data[7][0] = 353.3000;
 41    y->data[8][0] = 666.6500;
 42    y->data[9][0] = 354.0000;
 43    y->data[10][0] = 339.0000;
 44    y->data[11][0] = 360.0000;
 45    y->data[12][0] = 379.0000;
 46    y->data[13][0] = 361.0000;
 47
 48    // Allocate the PLS model
 49    NewPLSModel(&m);
 50
 51    /* Calculate the partial least squares algorithm taking as input:
 52     * x: the feature matrix x
 53     * y: the target matrix y
 54     * nlv: the number of latent variable nlv
 55     * xautoscaling: the autoscaling type for the x matrix
 56     * yautoscaling: the autoscaling type for the y Matrix
 57     * model: the PLSMODEL previously allocated
 58     * ssignal: a scientific signal to stop the calculation if requested by the user
 59     *
 60     * more information in the pls.h header file
 61     * void PLS(matrix *mx, matrix *my, size_t nlv, size_t xautoscaling, size_t yautoscaling, PLSMODEL *model, ssignal *s);
 62     */
 63    PLS(x, y, 3, 1, 0, m, NULL);
 64
 65    PrintPLSModel(m); // Print to video the PLS model
 66
 67    /*Validate the model using the internal validation method*/
 68    MODELINPUT minpt = initModelInput(); // Define the model input for the validation method
 69    minpt.mx = x;
 70    minpt.my = y;
 71    minpt.nlv = 3;
 72    minpt.xautoscaling = 1;
 73    minpt.yautoscaling = 0;
 74
 75    // Use the boot strap random group cross validation.
 76    BootstrapRandomGroupsCV(&minpt, 3, 100, _PLS_, m->predicted_y, m->pred_residuals, 4, NULL, 0);
 77    // We can also compute leave one out in case...
 78    // LeaveOneOut(&minpt, _PLS_, m->predicted_y, m->pred_residuals, 4, NULL, 0);
 79
 80    // Calculate the model validation statistics
 81    PLSRegressionStatistics(y, m->predicted_y, m->q2y, m->sdep, m->bias);
 82    //Print to video the results of the validation and the predicted values
 83
 84    puts("Q2 Cross Validation");
 85    PrintMatrix(m->q2y);
 86    puts("SDEP Cross Validation");
 87    PrintMatrix(m->sdep);
 88    puts("BIAS Cross Validation");
 89    PrintMatrix(m->bias);;
 90
 91    // Calculate the beta coefficients to see the importance of each feature
 92    puts("Beta coefficients");
 93    initDVector(&betas);
 94    PLSBetasCoeff(m, GetLVCCutoff(m->q2y), betas); // GetLVCCutoff select the best Q2 value from all the possibilities
 95    PrintDVector(betas);
 96
 97    puts("PREDICTED VALUES");
 98    PrintMatrix(m->predicted_y);
 99
100    puts("PREDICTED RESIDUALS");
101    PrintMatrix(m->pred_residuals);
102
103    puts("REAL Y");
104    PrintMatrix(y);
105
106    // Free the memory spaces
107    DelDVector(&betas);
108    DelPLSModel(&m);
109    DelMatrix(&x);
110    DelMatrix(&y);
111}