Getting started in Python
Every data type object in libscientific is stored on the heap and therefore supports dynamic memory allocation.
In Python, there is no need to manually allocate or deallocate vectors, matrices, tensors, or models, because the Python binding handles this automatically.
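To illustrate the general idea of a binding that manages memory on the user's behalf, here is a minimal sketch that uses only the Python standard library. The `DoubleBuffer` class and its `released` list are hypothetical and are not part of libscientific; they only show one way to tie the lifetime of a C-compatible buffer to a Python object.

```python
import ctypes
import weakref


class DoubleBuffer:
    """Hypothetical wrapper that owns a C-compatible array of doubles."""

    released = []  # demo only: records the size of every buffer that was released

    def __init__(self, values):
        values = list(values)
        # ctypes allocates a C-compatible array; Python controls its lifetime
        self._data = (ctypes.c_double * len(values))(*values)
        # register a callback that runs when this wrapper is garbage collected
        self._finalizer = weakref.finalize(
            self, DoubleBuffer.released.append, len(values)
        )

    def __getitem__(self, i):
        return self._data[i]

    def __len__(self):
        return len(self._data)


buf = DoubleBuffer([1.0, 2.5, 4.0])
print(buf[1])  # prints 2.5
del buf        # no explicit free call: cleanup runs when the object is collected
```

The caller never calls a free function. Dropping the last reference to the wrapper is enough to trigger the cleanup callback.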
Use libscientific in Python
First, install the C library and the Python package. Please follow the process described here.
A program that uses libscientific must import the Python binding as follows:
import libscientific
...
Vector operations
Create a vector in Python
There are four different types of vectors:
Double vector: dvector
Integer vector: ivector
Unsigned integer vector: uivector
String vector: strvector
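As an analogy for what distinguishes these element types, the sketch below uses the standard-library `array` module, which also stores values as C doubles, signed integers, and unsigned integers. It does not use libscientific, and the variable names are only illustrative.

```python
from array import array

# Standard-library analogy only; these are not libscientific classes.
doubles = array("d", [0.5, 1.25, 3.0])   # C double
integers = array("i", [-3, 0, 7])        # C signed int
unsigned = array("I", [0, 4, 9])         # C unsigned int
strings = ["alpha", "beta", "gamma"]     # array has no string type; a plain list stands in

print(doubles.typecode, integers.typecode, unsigned.typecode)

# A typed container rejects values that do not fit its element type.
try:
    unsigned.append(-1)
except OverflowError as err:
    print("rejected:", err)
```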
Here is an example of how to create a vector. It uses a double vector (DVector).
#!/usr/bin/env python3
import libscientific
from random import random

# Create a list of values that you want to convert to a double vector
a = [random() for j in range(5)]

# Transform the list a into a double vector d
d = libscientific.vector.DVector(a)

# Print the content of vector d to the screen
d.debug()

# Read the value at position 1
print(d[1])

# Modify the value at position 1
d[1] = -2

# Get the result back as a list
dlst = d.tolist()

for item in dlst:
    print(item)
Append a value to a given vector
Here is an example of how to append a value to a vector and how to extend it with several values.
#!/usr/bin/env python3
import libscientific
from random import random

# Create a list of values that you want to convert to a double vector
a = [random() for j in range(5)]
d = libscientific.vector.DVector(a)

# Print the content of the double vector d
print("orig vector")
d.debug()

# Append the value 0.98765 at the end of d
d.append(0.98765)
print("append 0.98765 at the end")
d.debug()

# Extend the vector d with more values from a list
d.extend([0.4362, 0.34529, 0.99862])
print("extend the vector with 3 more values")
d.debug()
Matrix operations
A matrix is a user-defined data type that contains:
- the number of rows
- the number of columns
- the 2D data array that defines the matrix
The data array in Python uses the C implementation. However, memory allocation and destruction are handled directly by the Python class, so there is no need to free the memory manually.
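The three pieces of information listed above can be pictured with a small pure-Python stand-in. The `SimpleMatrix` class below is hypothetical and is not the libscientific implementation; it only mirrors the described layout (row count, column count, 2D data) and the `m[i, j]` access style.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleMatrix:
    """Hypothetical stand-in: row count, column count, and a 2D data array."""
    row: int
    col: int
    data: list = field(default_factory=list)

    @classmethod
    def from_lists(cls, rows):
        rows = [list(r) for r in rows]
        ncol = len(rows[0]) if rows else 0
        if any(len(r) != ncol for r in rows):
            raise ValueError("all rows must have the same number of columns")
        return cls(row=len(rows), col=ncol, data=rows)

    def __getitem__(self, key):
        i, j = key
        return self.data[i][j]

    def __setitem__(self, key, value):
        i, j = key
        self.data[i][j] = float(value)

    def tolist(self):
        # return a copy so callers cannot modify the internal data by accident
        return [list(r) for r in self.data]


m = SimpleMatrix.from_lists([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
m[1, 1] = -2
print(m.row, m.col, m[1, 1])
```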
Create a matrix in Python
In this example, we show how to create a matrix from a list of lists (or NumPy array), modify its content, and convert it back to a list of lists.
#!/usr/bin/env python3
import libscientific
from random import random

# Create a random list of lists
a = [[random() for j in range(2)] for i in range(10)]

# Convert the list of lists into a libscientific matrix
m = libscientific.matrix.Matrix(a)

# Get the value at row 1, column 1
print("Get value example")
print(m[1, 1])

# Modify the value at row 1, column 1
print("Set value example")
m[1, 1] = -2.
m.debug()

# Convert the matrix back to a list of lists
mlst = m.tolist()
for row in mlst:
    print(row)
Tensor operations
A tensor is a user-defined data type that contains:
- order: the number of matrices
- m: the array of 2D data arrays, which defines the tensor itself
The data array in Python uses the C implementation. However, memory allocation and destruction are handled directly by the Python class, so there is no need to free the memory manually.
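The description above can be pictured with plain nested lists, where the outer list holds the matrices. This is only a sketch of the layout; the variable names are illustrative, and how libscientific stores the data internally is not shown here.

```python
# Hypothetical nested-list tensor: 3 blocks, each a 2 x 2 matrix
blocks = [
    [[float(k * 10 + i * 2 + j) for j in range(2)] for i in range(2)]
    for k in range(3)
]

order = len(blocks)                              # number of matrices in the tensor
shapes = [(len(m), len(m[0])) for m in blocks]   # (rows, columns) of each matrix

# Element access follows block -> row -> column in this nested-list layout
value = blocks[1][0][1]
print(order, shapes, value)
```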
Create a tensor in Python
In this example, we show how to create a tensor from a list of lists of lists (or NumPy array), modify its content, and convert it back to a list of lists of lists.
#!/usr/bin/env python3
import libscientific
from random import random

# Create a random list of lists of lists
a = [[[random() for j in range(2)] for i in range(10)] for k in range(3)]

# Convert the list of lists of lists into a libscientific tensor
t = libscientific.tensor.Tensor(a)

# Get the value at index (1, 1, 1)
print("Get value example")
print(t[1, 1, 1])

# Modify the value at index (1, 1, 1)
print("Set value example")
t[1, 1, 1] = -2.
t.debug()

# Convert the tensor back to a list of lists of lists
tlst = t.tolist()
for i, block in enumerate(tlst, start=1):
    print("Block %d" % (i))
    for row in block:
        print(row)
Multivariate analysis algorithms
In this section, you will find examples of running multivariate analysis algorithms. The algorithms described here are taken from official libscientific publications and adapted to run with multithreading to speed up the calculation.
PCA and PLS implement the NIPALS algorithm described in the following publication:
CPCA implements the NIPALS algorithm described in the following publication:
Principal Component Analysis (PCA)
Here is an example that shows how to compute a principal component analysis on a matrix.
#!/usr/bin/env python3

import libscientific
import random

def mx_to_video(m, decimals=5):
    # Print a matrix row by row, rounding each value
    for row in m:
        print("\t".join([str(round(x, decimals)) for x in row]))

random.seed(123456)

# Create a random matrix of 10 objects and 4 features
a = [[random.random() for j in range(4)] for i in range(10)]
print("Original Matrix")
mx_to_video(a)

# Compute 2 principal components using UV scaling (unit variance scaling)
model = libscientific.pca.PCA(scaling=1, npc=2)
# Fit the model
model.fit(a)

# Show the scores
print("Showing the PCA scores")
scores = model.get_scores()
mx_to_video(scores, 3)

# Show the loadings
print("Showing the PCA loadings")
loadings = model.get_loadings()
mx_to_video(loadings, 3)

# Show the explained variance
print(model.get_exp_variance())

# Predict/project new data into the PCA model
print("Predict/Project new data into the PCA model")
p_scores = model.predict(a)
mx_to_video(p_scores)

# Reconstruct the original PCA matrix from the 2 principal components
print("Reconstruct the original PCA matrix using the PCA Model")
ra = model.reconstruct_original_matrix()
mx_to_video(ra)

# Save model
model.save("mymodel.sqlite3")

# Load the model into a new instance (same constructor arguments as above)
model2 = libscientific.pca.PCA(scaling=1, npc=2)
model2.load("mymodel.sqlite3")
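The example above delegates all computation to the library. To give a feel for what unit variance scaling and a NIPALS iteration do, here is a minimal pure-Python sketch. The functions `autoscale`, `nipals_component`, and `deflate` are hypothetical helpers written for this illustration, the toy data is made up, and the use of the sample standard deviation is an assumption. The library's own implementation may differ in scaling details, convergence criteria, and performance.

```python
import math


def autoscale(x):
    """Mean-center each column and divide it by its sample standard deviation."""
    n, ncol = len(x), len(x[0])
    means = [sum(row[j] for row in x) / n for j in range(ncol)]
    sdevs = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in x) / (n - 1))
        for j in range(ncol)
    ]
    return [[(row[j] - means[j]) / sdevs[j] for j in range(ncol)] for row in x]


def nipals_component(x, max_iter=500, tol=1e-12):
    """Extract one principal component (scores t, loadings p) from x."""
    n, ncol = len(x), len(x[0])
    t = [row[0] for row in x]  # start from the first column
    for _ in range(max_iter):
        tt = sum(v * v for v in t)
        # p = X^T t / (t^T t), then scale p to unit length
        p = [sum(x[i][j] * t[i] for i in range(n)) / tt for j in range(ncol)]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
        # t_new = X p (p has unit length, so no division is needed)
        t_new = [sum(x[i][j] * p[j] for j in range(ncol)) for i in range(n)]
        diff = sum((a - b) ** 2 for a, b in zip(t_new, t))
        t = t_new
        if diff < tol:
            break
    return t, p


def deflate(x, t, p):
    """Remove the variation explained by one component: E = X - t p^T."""
    return [[x[i][j] - t[i] * p[j] for j in range(len(p))] for i in range(len(t))]


# Toy data (made up): 5 objects, 3 features
data = [
    [2.0, 4.1, 1.0],
    [3.0, 6.2, 0.5],
    [4.0, 7.9, 1.5],
    [5.0, 10.1, 0.8],
    [6.0, 12.0, 1.2],
]
xs = autoscale(data)
t1, p1 = nipals_component(xs)
t2, p2 = nipals_component(deflate(xs, t1, p1))
print("loadings PC1:", [round(v, 3) for v in p1])
print("loadings PC2:", [round(v, 3) for v in p2])
```

Each component is extracted from the residual left by the previous one, which is why the second call runs on the deflated matrix. The sign of a component is arbitrary, so scores and loadings may come out with flipped signs compared with another implementation.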
Consensus Principal Component Analysis (CPCA)
Here is an example that shows how to compute a consensus principal component analysis on a tensor.
#!/usr/bin/env python3

import libscientific
import random

def mx_to_video(m, decimals=5):
    # Print a matrix row by row, rounding each value
    for row in m:
        print("\t".join([str(round(x, decimals)) for x in row]))

def t_to_video(t):
    # Print every block (matrix) of a tensor
    for i, m in enumerate(t, start=1):
        print("Block: %d" % (i))
        mx_to_video(m, 3)

random.seed(123456)

# Create a random tensor of 4 blocks, each with 10 objects and 4 features
a = [[[random.random() for j in range(4)] for i in range(10)] for k in range(4)]

print("Original Matrix")
t_to_video(a)

# Compute 2 principal components using UV scaling (unit variance scaling)
model = libscientific.cpca.CPCA(scaling=1, npc=2)
# Fit the model
model.fit(a)

# Show the super scores
print("Showing the CPCA super scores")
sscores = model.get_super_scores()
mx_to_video(sscores, 3)

# Show the super weights
print("Showing the CPCA super weights")
sweights = model.get_super_weights()
mx_to_video(sweights, 3)

# Show the block scores
print("Showing the CPCA block scores")
block_scores = model.get_block_scores()
t_to_video(block_scores)

# Show the block loadings
print("Showing the CPCA block loadings")
block_loadings = model.get_block_loadings()
t_to_video(block_loadings)

# Show the total variance explained by the super scores
print("Showing the CPCA total variance explained")
print(model.get_total_exp_variance())

# Predict/project new data into the model
print("Project/Predict new data into the CPCA model")
p_ss, p_bs = model.predict(a)
print("Showing the predicted super scores")
mx_to_video(p_ss, 3)
print("Showing the predicted block scores")
t_to_video(p_bs)

# Save model
model.save("mymodel.sqlite3")

# Load the model into a new instance (same constructor arguments as above)
model2 = libscientific.cpca.CPCA(scaling=1, npc=2)
model2.load("mymodel.sqlite3")
Partial Least Squares (PLS)
Calculating a PLS model requires a matrix of features (independent variables) and a matrix of targets (dependent variables).
Here is a simple example that shows how to calculate a PLS model.
#!/usr/bin/env python3

import libscientific
import random

def mx_to_video(m, decimals=5):
    # Print a matrix row by row, rounding each value
    for row in m:
        print("\t".join([str(round(x, decimals)) for x in row]))

random.seed(123456)
# x: features, y: targets, xp: new objects to predict
x = [[random.random() for j in range(4)] for i in range(10)]
y = [[random.random() for j in range(1)] for i in range(10)]
xp = [[random.random() for j in range(4)] for i in range(10)]

print("Original Matrix")
print("X")
mx_to_video(x)
print("Y")
mx_to_video(y)
print("XP")
mx_to_video(xp)

# Compute 2 latent variables
print("Computing PLS ...")
model = libscientific.pls.PLS(nlv=2, xscaling=1, yscaling=0)
model.fit(x, y)

print("Showing the PLS T scores")
tscores = model.get_tscores()
mx_to_video(tscores, 3)

print("Showing the PLS U scores")
uscores = model.get_uscores()
mx_to_video(uscores, 3)

print("Showing the PLS P loadings")
ploadings = model.get_ploadings()
mx_to_video(ploadings, 3)

print("Showing the X Variance")
print(model.get_exp_variance())

# Predict the targets and scores for the new objects
print("Predict XP")
py, pscores = model.predict(xp)
print("Predicted Y for all LVs")
mx_to_video(py, 3)
print("Predicted Scores")
mx_to_video(pscores, 3)

# Save model
model.save("mymodel.sqlite3")

# Load the model into a new instance (same constructor arguments as above)
model2 = libscientific.pls.PLS(nlv=2, xscaling=1, yscaling=0)
model2.load("mymodel.sqlite3")
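To clarify what the weights, scores, and loadings in a PLS model represent, here is a minimal pure-Python sketch of a single latent variable for a single target column (often called PLS1). The helper `pls1_component` and the toy data are hypothetical and written only for this illustration. The library itself handles several latent variables, several targets, and scaling, and its implementation may differ.

```python
import math


def pls1_component(x, y):
    """One latent variable for a single response y (inputs assumed mean-centered)."""
    n, ncol = len(x), len(x[0])
    # weights: w = X^T y, scaled to unit length
    w = [sum(x[i][j] * y[i] for i in range(n)) for j in range(ncol)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # X scores: t = X w
    t = [sum(x[i][j] * w[j] for j in range(ncol)) for i in range(n)]
    tt = sum(v * v for v in t)
    # X loadings: p = X^T t / (t^T t)
    p = [sum(x[i][j] * t[i] for i in range(n)) / tt for j in range(ncol)]
    # regression of y on the scores: q = y^T t / (t^T t)
    q = sum(y[i] * t[i] for i in range(n)) / tt
    return w, t, p, q


# Toy data (made up), already mean-centered: y depends only on the first feature
x = [[-1.0, 0.5], [0.0, -1.0], [1.0, 0.5]]
y = [-3.0, 0.0, 3.0]

w, t, p, q = pls1_component(x, y)
y_fit = [q * ti for ti in t]
print("weights:", w)
print("fitted y:", y_fit)
```

In this toy case the second feature is uncorrelated with y, so the whole weight falls on the first feature and one latent variable is enough to reproduce y.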