Matrix

To reduce overhead and external dependencies, PyHPO uses an internal data matrix, pyhpo.Matrix. It is used for row- and columnwise comparisons of HPOSets.

Matrix should not be used for other purposes: it contains very little error handling and expects callers to pass well-formed input.

Matrix class

class pyhpo.matrix.Matrix(rows, cols, data=None)[source]

Poor man’s implementation of a DataFrame/Matrix

This is used to calculate similarities between HPO sets and, in the profiles shown in the note below, is roughly three times faster than using pandas DataFrames (6.6 seconds versus 19.7 seconds).

Note

Pandas:

===== COMPARING SETS ======
23806489 function calls (23770661 primitive calls) in 19.705 seconds

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
....
9900    0.267    0.000   19.106    0.002 set.py:318(similarity)
9900    1.124    0.000   14.330    0.001 set.py:477(_sim_score)
....

Matrix:

===== COMPARING SETS ======
12870433 function calls in 6.642 seconds

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
....
9900    0.048    0.000    6.424    0.001 set.py:316(similarity)
9900    0.928    0.000    5.112    0.001 set.py:432(_sim_score)
....
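
The tables above have the layout produced by Python's built-in cProfile module. As a minimal sketch of how a comparable report could be generated, the block below profiles a hypothetical stand-in workload (`compare_all` is invented for illustration and is not part of PyHPO) and prints the entries sorted by cumulative time.

```python
import cProfile
import io
import pstats


def compare_all(n):
    """Hypothetical stand-in workload: n * n pairwise comparisons."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += abs(i - j)
    return total


# Collect statistics only for the code between enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
result = compare_all(100)
profiler.disable()

# Render the statistics sorted by cumulative time,
# limited to the ten most expensive entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

To reproduce a comparison like the one above, the stand-in workload would be replaced by the actual set comparison, run once per backend.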

Warning

This Matrix should not be used as a public interface. It’s only used internally for calculations.

Parameters:
  • rows (int) – The number of rows in the Matrix

  • cols (int) – The number of columns in the Matrix

  • data (list of values, default None) – A list with values to fill the Matrix.
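
To illustrate how a constructor with this signature can work without external dependencies, here is a minimal sketch of a list-backed matrix. `TinyMatrix` is a hypothetical stand-in, not the PyHPO implementation, and the row-by-row storage order of `data` is an assumption made for this example.

```python
class TinyMatrix:
    """Hypothetical stand-in for illustration, not pyhpo.Matrix itself."""

    def __init__(self, rows, cols, data=None):
        self.n_rows = rows
        self.n_cols = cols
        # Assumption: values are stored row by row in one flat list.
        if data is None:
            self._data = [None] * (rows * cols)
        else:
            self._data = list(data)

    @property
    def rows(self):
        # Each row is a contiguous slice of n_cols values.
        for r in range(self.n_rows):
            start = r * self.n_cols
            yield self._data[start:start + self.n_cols]

    @property
    def columns(self):
        # Every n_cols-th value, starting at offset c, forms one column.
        for c in range(self.n_cols):
            yield self._data[c::self.n_cols]


matrix = TinyMatrix(3, 4, [11, 12, 13, 14, 21, 22, 23, 24, 31, 32, 33, 34])
row_list = list(matrix.rows)
col_list = list(matrix.columns)
```

Like the class documented here, the sketch does no validation: passing a `data` list whose length differs from `rows * cols` yields malformed rows and columns rather than an error.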

n_rows

The number of rows in the Matrix

Type:

int

n_cols

The number of columns in the Matrix

Type:

int

rows

Iterator over all rows

Example:

print(matrix)

>>    ||   0|   1|   2|   3|
>> =========================
>> 0  ||  11|  12|  13|  14|
>> 1  ||  21|  22|  23|  24|
>> 2  ||  31|  32|  33|  34|

for row in matrix.rows:
    print(row)

>> [11, 12, 13, 14]
>> [21, 22, 23, 24]
>> [31, 32, 33, 34]

Type:

iterator

columns

Iterator over all columns

Example:

print(matrix)

>>    ||   0|   1|   2|   3|
>> =========================
>> 0  ||  11|  12|  13|  14|
>> 1  ||  21|  22|  23|  24|
>> 2  ||  31|  32|  33|  34|

for col in matrix.columns:
    print(col)

>> [11, 21, 31]
>> [12, 22, 32]
>> [13, 23, 33]
>> [14, 24, 34]

Type:

iterator
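
Row and column iteration is what makes row- and columnwise set comparisons straightforward. The sketch below uses a plain nested list with made-up scores to show one common way of combining pairwise similarities, a best-match average; whether PyHPO aggregates scores in exactly this way is not stated in this section, so treat it as an assumption.

```python
# Hypothetical pairwise scores: one row per term of set A,
# one column per term of set B.
scores = [
    [0.9, 0.1, 0.3],
    [0.2, 0.8, 0.4],
]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Best match for each term of A (row-wise maximum).
row_best = [max(row) for row in scores]

# Best match for each term of B (column-wise maximum);
# zip(*scores) transposes the nested list.
col_best = [max(col) for col in zip(*scores)]

# Average the best matches in each direction, then combine both directions.
a_to_b = mean(row_best)
b_to_a = mean(col_best)
symmetric = (a_to_b + b_to_a) / 2
```

With a matrix object that exposes `rows` and `columns` iterators, the two list comprehensions would iterate over those properties instead of the nested list.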