Getting started in C¶
Every data object in libscientific is allocated on the heap and supports dynamic memory allocation.
Hence, every data object, such as a matrix, vector, tensor, or model, must be manually allocated and deallocated by the programmer using the library's predefined constructs “NewSOMETHING(&…);” and “DelSOMETHING(&…);”.
Every data object is therefore a pointer and has to be passed by reference with “&”. To avoid memory leaks, please remember to deallocate every allocated variable at the end of your program :-)
Compile a program that uses libscientific¶
A program that uses libscientific requires only one include directive, as follows:
#include <scientific.h>
Then compile the code with a C or C++ compiler, linking with -lscientific and specifying the right paths using the -L<library/path/of/libscientific> and -I<include/path/of/libscientific> options:
gcc -o example1 -I/usr/local/include -L/usr/local/lib example1.c -lscientific
Vector operations¶
Create/Allocate a vector¶
There are four different types of vectors:
Double vector: dvector
Integer vector: ivector
Unsigned integer vector: uivector
String vector: strvector
Here we show an example of how to allocate/deallocate these four vector types.
#include <stdio.h>
#include <scientific.h>

int main(void){
  int i;
  /* Define the variable vector, which in this case is a double vector.
   * To use an integer, unsigned integer, or string vector instead,
   * this construct becomes
   * ivector *v; or uivector *v; or strvector *v;
   */

  dvector *v;
  NewDVector(&v, 5); // Allocate the memory space.

  // Fill the vector with values
  for(i = 0; i < 5; i++){
    v->data[i] = (double)i;
  }

  // Print the vector values to video
  PrintDVector(v);

  // Free the memory space
  DelDVector(&v);
}
Append a value to a given vector¶
Here we show an example of how to append a value to a vector.
#include <stdio.h>
#include <scientific.h>

int main(void){
  int i;

  /*
   * We define here the double vector.
   * To use another vector type, replace dvector with one of
   * the other three possibilities:
   * - uivector
   * - ivector
   * - strvector
   */
  dvector *v;

  NewDVector(&v, 5); // We initialize the vector

  // We add 5 numbers
  for(i = 0; i < 5; i++){
    v->data[i] = (double)i;
  }

  // We append a new number
  DVectorAppend(v, 123.4);

  // Print the result to video
  PrintDVector(v);

  // Free up the memory
  DelDVector(&v);
}
Work with string vectors¶
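The string vector follows the same allocate/append/deallocate pattern as the numeric vectors shown above. The sketch below is a minimal example, assuming the strvector API mirrors the dvector one (initStrVector, StrVectorAppend, PrintStrVector, DelStrVector); check scientific.h for the exact function names and signatures.

#include <stdio.h>
#include <scientific.h>

int main(void)
{
  /* Assumed API: initStrVector/StrVectorAppend/PrintStrVector/DelStrVector,
   * analogous to the dvector functions used in the previous examples. */
  strvector *s;

  initStrVector(&s); // Initialize an empty string vector

  // Append strings one by one
  StrVectorAppend(s, "alpha");
  StrVectorAppend(s, "beta");
  StrVectorAppend(s, "gamma");

  PrintStrVector(s); // Print the string vector to video

  DelStrVector(&s); // Free the memory space
}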
Matrix operations¶
A matrix is a user-defined data type that contains:
- the number of rows
- the number of columns
- the 2D data array, which defines the matrix
The data array is explicitly of type double so that it can handle a wide range of numbers.
/*
 * matrix data structure
 *
 * - **data** two dimensional array of double
 * - **row** number of rows
 * - **col** number of columns
 */
Create/Allocate a matrix with a specific size¶
Create a simple matrix of 10 rows and 15 columns and fill it with numbers.
Then print it to the terminal using “PrintMatrix();”
#include <stdio.h>
#include <scientific.h>

int main(void)
{
  int i, j;
  matrix *m; // Definition of the matrix pointer variable
  NewMatrix(&m, 10, 15); // Create the matrix with 10 rows and 15 columns. Each value in the matrix is initialized to 0.
  for(i = 0; i < 10; i++){
    for(j = 0; j < 15; j++){
      m->data[i][j] = (double)(i+j); // Fill the matrix with values
    }
  }
  PrintMatrix(m); // Print the matrix content to video
  DelMatrix(&m); // Free the memory space
}
Initialize an empty matrix and append a row/column to it¶
An empty matrix is an object with rows and columns equal to 0. Rows and columns can then be appended to it dynamically, or it can be resized later on.
In this example we initialize an empty matrix and append several rows to it.
#include <stdio.h>
#include <scientific.h>

int main(void)
{
  int i;
  matrix *mx; // Definition of the matrix variable as a pointer
  dvector *row; // Definition of the row variable as a pointer

  NewDVector(&row, 15);
  for(i = 0; i < row->size; i++){
    row->data[i] = (double)i; // Fill the row vector once
  }

  initMatrix(&mx); // Initialize the empty matrix with rows and columns equal to 0

  for(i = 0; i < 5; i++){
    MatrixAppendRow(mx, row); // Append the row to the matrix mx 5 times
  }

  PrintMatrix(mx); // Print the matrix to video

  DelDVector(&row); // Free the memory space of the row vector
  DelMatrix(&mx); // Free the memory space of the matrix
}
Of course, the same code can be reused to append columns by using “MatrixAppendCol” instead of “MatrixAppendRow”, as shown in the sketch below.
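For reference, here is a minimal sketch of the column case. It assumes that “MatrixAppendCol” takes the same arguments as “MatrixAppendRow” (the matrix and a dvector); the appended vector is treated as a column, so its size becomes the number of rows of the matrix.

#include <stdio.h>
#include <scientific.h>

int main(void)
{
  int i;
  matrix *mx;   // Empty matrix to be grown column by column
  dvector *col; // Column to append; its size becomes the number of matrix rows

  NewDVector(&col, 15);
  for(i = 0; i < 15; i++){
    col->data[i] = (double)i;
  }

  initMatrix(&mx); // Start from an empty matrix (0 rows, 0 columns)

  for(i = 0; i < 5; i++){
    MatrixAppendCol(mx, col); // Assumed to mirror MatrixAppendRow: append col as a new column
  }

  PrintMatrix(mx); // The result should be a 15 x 5 matrix

  DelDVector(&col); // Free the memory space of the column vector
  DelMatrix(&mx);   // Free the memory space of the matrix
}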
Matrix x Column vector dot product¶
In this example we illustrate the product between an M x N matrix and a column double vector of size N.
#include <stdio.h>
#include <scientific.h>

int main(void)
{
  int i, j;
  matrix *mx; // Definition of the matrix variable as a pointer
  dvector *cvect; // Definition of the column vector as a pointer
  dvector *result; // Definition of the vector that will store the product between the matrix and the column vector

  NewDVector(&cvect, 15);
  for(i = 0; i < cvect->size; i++){
    cvect->data[i] = (double)i; // Fill the column vector once
  }

  NewMatrix(&mx, 23, 15); // Initialize a matrix with 23 rows and 15 columns

  for(i = 0; i < mx->row; i++){
    for(j = 0; j < mx->col; j++){
      mx->data[i][j] = (double)(i+j);
    }
  }
  NewDVector(&result, 23);

  MatrixDVectorDotProduct(mx, cvect, result);
  /*
   * or MT_MatrixDVectorDotProduct if you want to run the multithreaded operation.
   * This function is useful for large matrices.
   */

  PrintDVector(result); // Print the result to video
  // Free the memory spaces
  DelDVector(&result);
  DelDVector(&cvect);
  DelMatrix(&mx);
}
Transpose a matrix¶
A matrix transpose is an operation that flips a matrix over its diagonal. Here is an example that shows how to produce a transpose of a given matrix.
#include <stdio.h>
#include <scientific.h>

int main(void)
{
  int i, j;
  matrix *m, *m_T; // Definition of the matrix pointer variables
  NewMatrix(&m, 10, 15); // Create the matrix with 10 rows and 15 columns. Each value in the matrix is initialized to 0.
  NewMatrix(&m_T, m->col, m->row); // Create the transposed matrix with rows and columns swapped

  for(i = 0; i < 10; i++){
    for(j = 0; j < 15; j++){
      m->data[i][j] = (double)(i+j); // Fill the matrix with values
    }
  }
  MatrixTranspose(m, m_T);
  PrintMatrix(m); // Print the original matrix content to video
  puts("Transposed matrix");
  PrintMatrix(m_T); // Print the transposed matrix to video
  // Free the memory spaces
  DelMatrix(&m);
  DelMatrix(&m_T);
}
Invert a matrix¶
In this example we show how to invert a matrix with libscientific.
#include <stdio.h>
#include <scientific.h>
#include <math.h>

int main(void)
{
  matrix *m; // Definition of the matrix variable as a pointer
  matrix *m_inv; // Definition of the inverted matrix variable as a pointer

  NewMatrix(&m, 10, 10); // Allocate the matrix to invert
  MatrixInitRandomFloat(m, -3., 3.); // Fill the matrix with random values in the range -3 < x < 3
  PrintMatrix(m); // Print the matrix to video

  initMatrix(&m_inv); // Initialize the output (inverted) matrix
  MatrixInversion(m, m_inv); // Invert the matrix

  double det = fabs(MatrixDeterminant(m)); // Calculate the absolute value of the determinant

  printf("Determinant %.4f\n", det); // Print the matrix determinant to video
  PrintMatrix(m_inv); // Print the inverted matrix to video

  // Free the memory spaces
  DelMatrix(&m_inv);
  DelMatrix(&m);
}
Calculate eigenvectors and eigenvalues of a matrix¶
This example shows how to calculate the eigenvectors and eigenvalues of an N x N real nonsymmetric matrix. The eigenvectors/eigenvalues are computed using the dgeev.f code extracted from the LAPACK library.
#include <stdio.h>
#include <scientific.h>
#include <math.h>

int main(void)
{
  matrix *m; // Definition of the matrix variable as a pointer
  dvector *eval; // Definition of the variable that will store the eigenvalues
  matrix *evect; // Definition of the variable that will store the eigenvectors

  NewMatrix(&m, 10, 10); // Allocate the input matrix
  MatrixInitRandomFloat(m, -3., 3.); // Fill the matrix with random values in the range -3 < x < 3
  PrintMatrix(m); // Print the matrix to video

  // Initialize the output variables
  initDVector(&eval);
  initMatrix(&evect);

  EVectEval(m, eval, evect); // Calculate the eigenvectors and associated eigenvalues

  PrintDVector(eval); // Print the eigenvalues to video
  PrintMatrix(evect); // Print the eigenvectors to video. Each column corresponds to an eigenvalue

  // Free the memory spaces
  DelDVector(&eval);
  DelMatrix(&evect);
  DelMatrix(&m);
}
Singular Value Decomposition of a square matrix¶
In this example we show how to factorize a square matrix using the singular value decomposition (SVD) method.
#include <stdio.h>
#include <scientific.h>
#include <math.h>

int main(void)
{
  matrix *m; // Definition of the matrix variable as a pointer
  matrix *U; // Definition of the unitary matrix
  matrix *S; // Definition of the rectangular diagonal matrix with non-negative real numbers on the diagonal
  matrix *Vt; // Definition of the transpose of the unitary matrix V

  NewMatrix(&m, 10, 10); // Allocate the matrix to factorize
  MatrixInitRandomFloat(m, -3., 3.); // Fill the matrix with random values in the range -3 < x < 3
  PrintMatrix(m); // Print the matrix to video

  // Initialize the output variables
  initMatrix(&U);
  initMatrix(&S);
  initMatrix(&Vt);
  SVD(m, U, S, Vt); // Internal method
  // SVDlapack(m, U, S, Vt); // LAPACK method using dgesdd
  // Print the results of the factorization to video
  PrintMatrix(U);
  PrintMatrix(S);
  PrintMatrix(Vt);

  // Free the memory spaces
  DelMatrix(&U);
  DelMatrix(&S);
  DelMatrix(&Vt);
  DelMatrix(&m);
}
Tensor operations¶
A tensor is a user-defined data type that contains:
- order: the number of matrices (layers)
- m: the array of matrices (2D data arrays), which defines the tensor itself
The data array is explicitly of type double so that it can handle a wide range of numbers.
/*
 * Tensor data structure
 *
 * - **m** list of matrices
 * - **order** number of matrix layers
 */
Create/Allocate a tensor with a specific size¶
Create a simple tensor of 3 blocks, 10 rows and 15 columns and fill it with numbers.
Then print it to the terminal using “PrintTensor();”
#include <stdio.h>
#include <scientific.h>

int main(void)
{
  int i, j, k;
  tensor *t; // Definition of the tensor pointer variable
  NewTensor(&t, 3); // Create the tensor with 3 blocks
  for(k = 0; k < 3; k++){
    NewTensorMatrix(t, k, 10, 15);
    for(i = 0; i < 10; i++){
      for(j = 0; j < 15; j++){
        t->m[k]->data[i][j] = (double)(i+j); // Fill the tensor with values
      }
    }
  }
  PrintTensor(t); // Print the tensor content to video
  DelTensor(&t); // Free the memory space
}
Initialize an empty tensor and append different matrices to it¶
An empty tensor is an object with zero matrices (order equal to 0). Matrices with different numbers of rows and columns can be added to it dynamically.
In this example we initialize an empty tensor and append different matrices to it.
#include <stdio.h>
#include <scientific.h>

int main(void)
{
  size_t i, j, k;
  tensor *t; // Definition of the tensor pointer variable
  matrix *m;
  initTensor(&t); // Initialize the empty tensor
  NewMatrix(&m, 10, 7); // Create a matrix with 10 rows and 7 columns
  TensorAppendMatrix(t, m); // Append this matrix to the tensor t
  DelMatrix(&m); // Delete the matrix

  NewMatrix(&m, 10, 10); // Create a second matrix with 10 rows and 10 columns
  TensorAppendMatrix(t, m); // Append this new matrix to the tensor t
  DelMatrix(&m);

  for(k = 0; k < t->order; k++){
    for(i = 0; i < t->m[k]->row; i++){
      for(j = 0; j < t->m[k]->col; j++){
        t->m[k]->data[i][j] = (double)(i+j); // Fill the tensor with values
      }
    }
  }
  PrintTensor(t); // Print the tensor content to video
  DelTensor(&t); // Free the memory space
}
Multivariate analysis algorithms¶
In this section you will find examples of running multivariate analysis algorithms. The algorithms described here are taken from official scientific publications and adapted to run in multithreading to speed up the calculation.
PCA and PLS implement the NIPALS algorithm described in the following publication:
CPCA implements the NIPALS algorithm described in the following publication:
Principal Component Analysis (PCA)¶
Here is an example that shows how to compute a principal component analysis on a matrix.
#include <stdio.h>
#include <stdlib.h>
#include <scientific.h>

int main(void)
{
  matrix *m; // Definition of the input matrix
  PCAMODEL *model; // Definition of the PCA model
  size_t i, j;
  size_t nobj = 20;
  size_t nvars = 8;
  NewMatrix(&m, nobj, nvars);

  // Fill the matrix m with random values
  srand(nobj);
  for(i = 0; i < nobj; i++){
    for(j = 0; j < nvars; j++){
      m->data[i][j] = randDouble(0, 20);
    }
  }

  NewPCAModel(&model); // Allocate the PCA model
  PCA(m, 1, 5, model, NULL); // Calculate the PCA on matrix m using unit variance scaling (1) and the extraction of 5 principal components

  PrintPCA(model); // Print the PCA results to video

  /* Of course you can print the different results contained in the model variable separately:
   * model->scores is the matrix of scores
   * model->loadings is the matrix of loadings
   * model->colavg is the column average obtained from the input matrix
   * model->scaling is the scaling factor obtained from the input matrix
   */

  // Free the memory spaces
  DelPCAModel(&model);
  DelMatrix(&m);
}
Consensus Principal Component Analysis (CPCA)¶
Here is an example that shows how to compute a consensus principal component analysis on a tensor.
#include <stdio.h>
#include <stdlib.h>
#include <scientific.h>

int main(void)
{
  tensor *t; // Definition of the input tensor
  CPCAMODEL *model; // Definition of the CPCA model
  int i, j, k;
  int nblocks = 4;
  int nobj = 20;
  int nvars[4] = {8, 10, 5, 7}; // Every block has a different number of variables
  NewTensor(&t, nblocks);

  // Fill the tensor t with random values
  srand(nobj);
  for(k = 0; k < nblocks; k++){
    NewTensorMatrix(t, k, nobj, nvars[k]);
    for(i = 0; i < nobj; i++){
      for(j = 0; j < nvars[k]; j++){
        t->m[k]->data[i][j] = randDouble(0, 20);
      }
    }
  }

  NewCPCAModel(&model); // Allocate the CPCA model
  CPCA(t, 1, 5, model); // Calculate the CPCA on tensor t using unit variance scaling (1) and the extraction of 5 super principal components

  PrintCPCA(model); // Print the CPCA results to video

  /* Of course you can print the different results contained in the model variable separately:
   * model->super_scores is the matrix of super scores
   * model->super_weights is the matrix of super weights
   * model->block_scores is the tensor of scores for each block
   * model->block_loadings is the tensor of loadings for each block
   */

  // Free the memory spaces
  DelCPCAModel(&model);
  DelTensor(&t);
}
Partial Least Squares (PLS)¶
A matrix of features (independent variables) and a matrix of targets (dependent variables) are required to calculate a PLS model. Here is a simple example that shows how to calculate a PLS model.
#include <stdio.h>
#include <scientific.h>

int main(void)
{
  matrix *x, *y; // Define the feature matrix x and the target matrix y to predict
  dvector *betas; // Define the beta coefficients
  PLSMODEL *m;

  // Allocate the matrices
  NewMatrix(&x, 14, 6);
  NewMatrix(&y, 14, 1);

  // Fill the matrices with values.
  // This is a manual filling.
  // Of course we could read a csv file and fill them automatically.

  x->data[0][0] = 4.0000; x->data[0][1] = 4.0000; x->data[0][2] = 1.0000; x->data[0][3] = 84.1400; x->data[0][4] = 1.0500; x->data[0][5] = 235.1500;
  x->data[1][0] = 5.0000; x->data[1][1] = 5.0000; x->data[1][2] = 1.0000; x->data[1][3] = 79.1000; x->data[1][4] = 0.9780; x->data[1][5] = 231;
  x->data[2][0] = 4.0000; x->data[2][1] = 5.0000; x->data[2][2] = 1.0000; x->data[2][3] = 67.0900; x->data[2][4] = 0.9700; x->data[2][5] = 249.0000;
  x->data[3][0] = 4.0000; x->data[3][1] = 4.0000; x->data[3][2] = 1.0000; x->data[3][3] = 68.0700; x->data[3][4] = 0.9360; x->data[3][5] = 187.3500;
  x->data[4][0] = 3.0000; x->data[4][1] = 4.0000; x->data[4][2] = 2.0000; x->data[4][3] = 68.0800; x->data[4][4] = 1.0300; x->data[4][5] = 363.0000;
  x->data[5][0] = 9.0000; x->data[5][1] = 7.0000; x->data[5][2] = 1.0000; x->data[5][3] = 129.1600; x->data[5][4] = 1.0900; x->data[5][5] = 258.0000;
  x->data[6][0] = 10.0000; x->data[6][1] = 8.0000; x->data[6][2] = 0.0000; x->data[6][3] = 128.1600; x->data[6][4] = 1.1500; x->data[6][5] = 352.0000;
  x->data[7][0] = 6.0000; x->data[7][1] = 6.0000; x->data[7][2] = 0.0000; x->data[7][3] = 78.1118; x->data[7][4] = 0.8765; x->data[7][5] = 278.6400;
  x->data[8][0] = 16.0000; x->data[8][1] = 10.0000; x->data[8][2] = 0.0000; x->data[8][3] = 202.2550; x->data[8][4] = 1.2710; x->data[8][5] = 429.1500;
  x->data[9][0] = 6.0000; x->data[9][1] = 12.0000; x->data[9][2] = 0.0000; x->data[9][3] = 84.1600; x->data[9][4] = 0.7800; x->data[9][5] = 279.0000;
  x->data[10][0] = 4.0000; x->data[10][1] = 8.0000; x->data[10][2] = 1.0000; x->data[10][3] = 72.1100; x->data[10][4] = 0.8900; x->data[10][5] = 164.5000;
  x->data[11][0] = 4.0000; x->data[11][1] = 9.0000; x->data[11][2] = 1.0000; x->data[11][3] = 71.1100; x->data[11][4] = 0.8660; x->data[11][5] = 210.0000;
  x->data[12][0] = 5.0000; x->data[12][1] = 11.0000; x->data[12][2] = 1.0000; x->data[12][3] = 85.1500; x->data[12][4] = 0.8620; x->data[12][5] = 266.0000;
  x->data[13][0] = 5.0000; x->data[13][1] = 10.0000; x->data[13][2] = 1.0000; x->data[13][3] = 86.1300; x->data[13][4] = 0.8800; x->data[13][5] = 228.0000;

  y->data[0][0] = 357.1500;
  y->data[1][0] = 388.0000;
  y->data[2][0] = 403.0000;
  y->data[3][0] = 304.5500;
  y->data[4][0] = 529.0000;
  y->data[5][0] = 510.0000;
  y->data[6][0] = 491.0000;
  y->data[7][0] = 353.3000;
  y->data[8][0] = 666.6500;
  y->data[9][0] = 354.0000;
  y->data[10][0] = 339.0000;
  y->data[11][0] = 360.0000;
  y->data[12][0] = 379.0000;
  y->data[13][0] = 361.0000;

  // Allocate the PLS model
  NewPLSModel(&m);

  /* Calculate the partial least squares algorithm taking as input:
   * x: the feature matrix x
   * y: the target matrix y
   * nlv: the number of latent variables nlv
   * xautoscaling: the autoscaling type for the x matrix
   * yautoscaling: the autoscaling type for the y matrix
   * model: the PLSMODEL previously allocated
   * ssignal: a scientific signal to stop the calculation if requested by the user
   *
   * More information in the pls.h header file:
   * void PLS(matrix *mx, matrix *my, size_t nlv, size_t xautoscaling, size_t yautoscaling, PLSMODEL *model, ssignal *s);
   */
  PLS(x, y, 3, 1, 0, m, NULL);

  PrintPLSModel(m); // Print the PLS model to video

  /* Validate the model using the internal validation method */
  MODELINPUT minpt = initModelInput(); // Define the model input for the validation method
  minpt.mx = x;
  minpt.my = y;
  minpt.nlv = 3;
  minpt.xautoscaling = 1;
  minpt.yautoscaling = 0;

  // Use the bootstrap random groups cross validation.
  BootstrapRandomGroupsCV(&minpt, 3, 100, _PLS_, m->predicted_y, m->pred_residuals, 4, NULL, 0);
  // We could also compute leave-one-out cross validation instead:
  // LeaveOneOut(&minpt, _PLS_, m->predicted_y, m->pred_residuals, 4, NULL, 0);

  // Calculate the model validation statistics
  PLSRegressionStatistics(y, m->predicted_y, m->q2y, m->sdep, m->bias);

  // Print the results of the validation and the predicted values to video
  puts("Q2 Cross Validation");
  PrintMatrix(m->q2y);
  puts("SDEP Cross Validation");
  PrintMatrix(m->sdep);
  puts("BIAS Cross Validation");
  PrintMatrix(m->bias);

  // Calculate the beta coefficients to see the importance of each feature
  puts("Beta coefficients");
  initDVector(&betas);
  PLSBetasCoeff(m, GetLVCCutoff(m->q2y), betas); // GetLVCCutoff selects the best Q2 value among all the possibilities
  PrintDVector(betas);

  puts("PREDICTED VALUES");
  PrintMatrix(m->predicted_y);

  puts("PREDICTED RESIDUALS");
  PrintMatrix(m->pred_residuals);

  puts("REAL Y");
  PrintMatrix(y);

  // Free the memory spaces
  DelDVector(&betas);
  DelPLSModel(&m);
  DelMatrix(&x);
  DelMatrix(&y);
}