Similarity

Builtin Similarity algorithms

Resnik

class pyhpo.similarity.defaults.Resnik(**data)

Based on Resnik P, Proceedings of the 14th IJCAI, (1995)

https://www.ijcai.org/Proceedings/95-1/Papers/059.pdf

Parameters:

dependencies (List[str]) –
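As a rough illustration of the idea behind Resnik similarity: the score of two terms is the information content (IC) of their most informative common ancestor. The sketch below does not call pyhpo. It uses a hypothetical toy ontology (`ANCESTORS`, `IC`) with made-up values and only shows the shape of the calculation.

```python
from typing import Dict, Set

# Hypothetical toy data (not taken from the HPO): every term maps to the set
# of its ancestors, including itself, and has a made-up IC value.
ANCESTORS: Dict[str, Set[str]] = {
    "root": {"root"},
    "spine": {"root", "spine"},
    "scoliosis": {"root", "spine", "scoliosis"},
    "kyphosis": {"root", "spine", "kyphosis"},
}
IC: Dict[str, float] = {
    "root": 0.0,
    "spine": 1.5,
    "scoliosis": 3.0,
    "kyphosis": 4.0,
}


def resnik(term1: str, term2: str) -> float:
    """Return the IC of the most informative common ancestor."""
    common = ANCESTORS[term1] & ANCESTORS[term2]
    return max(IC[term] for term in common)


score = resnik("scoliosis", "kyphosis")  # shared ancestors: root, spine
```

In this toy data the most informative shared ancestor is `spine`, so its IC becomes the score.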

Lin

class pyhpo.similarity.defaults.Lin(**data)

Based on Lin D, Proceedings of the 15th ICML, (1998)

https://dl.acm.org/doi/10.5555/645527.657297

Parameters:

dependencies (List[str]) –
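For orientation, Lin's measure normalizes the shared information content by the information content of both terms. The following is a minimal sketch that works on plain IC values instead of pyhpo objects. The function name `lin` and the handling of the zero-IC case are assumptions made for this example.

```python
def lin(ic1: float, ic2: float, ic_mica: float) -> float:
    """Lin similarity: twice the shared IC divided by the summed IC.

    ic1, ic2 -- information content of the two terms
    ic_mica  -- information content of their most informative common ancestor
    """
    total = ic1 + ic2
    if total == 0:
        # Neither term carries information. Returning 0.0 here is a choice
        # made for this sketch to avoid a division by zero.
        return 0.0
    return 2 * ic_mica / total


score = lin(ic1=3.0, ic2=4.0, ic_mica=1.5)
```

Because the score is normalized, comparing a term with itself (`ic1 == ic2 == ic_mica`) yields 1.0.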

JC (Jiang & Conrath)

class pyhpo.similarity.defaults.JC(**data)

Jiang & Conrath similarity score, based on Jiang J, Conrath D, Rocling X, (1997) and Deng Y, et al., PLoS One, (2015)

https://aclanthology.org/O97-1002.pdf

Note

This method was implemented incorrectly in earlier versions and was fixed in 3.3.0, based on this discussion

Parameters:

dependencies (List[str]) –
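Jiang & Conrath originally define a distance between two terms. One common way to turn that distance into a similarity is `1 / (1 + distance)`. The sketch below assumes this conversion. The exact form used by pyhpo is not stated in this section, and the function name `jiang_conrath` is hypothetical.

```python
def jiang_conrath(ic1: float, ic2: float, ic_mica: float) -> float:
    """Similarity derived from the Jiang & Conrath distance.

    Assumption for this sketch: similarity = 1 / (1 + distance).
    """
    # Information that is NOT shared between the two terms
    distance = ic1 + ic2 - 2 * ic_mica
    return 1 / (1 + distance)


score = jiang_conrath(ic1=3.0, ic2=4.0, ic_mica=1.5)
```

With this conversion, identical terms have a distance of 0 and therefore a similarity of 1.0.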

Relevance

class pyhpo.similarity.defaults.Relevance(**data)

Based on Schlicker A, et al., BMC Bioinformatics, (2006)

https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-7-302

Parameters:

dependencies (List[str]) –

InformationCoefficient

class pyhpo.similarity.defaults.InformationCoefficient(**data)

Based on Li B, et al., arXiv, (2010)

https://arxiv.org/abs/1001.0958

Parameters:

dependencies (List[str]) –

GraphIC

class pyhpo.similarity.defaults.GraphIC(**data)

Graph-based information coefficient, based on Deng Y, et al., PLoS One, (2015)

https://pubmed.ncbi.nlm.nih.gov/25664462/

Parameters:

dependencies (List[str]) –
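To make the graph-based idea more concrete, the sketch below assumes the usual formulation: the summed information content (IC) of all shared ancestors divided by the summed IC of all ancestors of either term. It does not call pyhpo. The toy ontology (`ANCESTORS`, `IC`) and the function name `graph_ic` are hypothetical.

```python
from typing import Dict, Set

# Hypothetical toy data (not taken from the HPO): ancestor sets include the
# term itself, and the IC values are made up.
ANCESTORS: Dict[str, Set[str]] = {
    "scoliosis": {"root", "spine", "scoliosis"},
    "kyphosis": {"root", "spine", "kyphosis"},
}
IC: Dict[str, float] = {
    "root": 0.0,
    "spine": 1.5,
    "scoliosis": 3.0,
    "kyphosis": 4.0,
}


def graph_ic(term1: str, term2: str) -> float:
    """Summed IC of shared ancestors over summed IC of all ancestors."""
    common = ANCESTORS[term1] & ANCESTORS[term2]
    union = ANCESTORS[term1] | ANCESTORS[term2]
    denominator = sum(IC[term] for term in union)
    if denominator == 0:
        # No informative ancestors at all; 0.0 is a choice made for this sketch
        return 0.0
    return sum(IC[term] for term in common) / denominator


score = graph_ic("scoliosis", "kyphosis")
```

Unlike a measure that looks only at the single most informative common ancestor, this ratio takes every ancestor of both terms into account.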

Distance

class pyhpo.similarity.defaults.Distance(**data)

Actual distance (number of HPO terms) between terms

Parameters:

dependencies (List[str]) –

Custom Similarity algorithms

The similarity submodule lets you create custom similarity calculations for comparing single terms or term sets.

It provides a simple interface to register custom similarity handlers, so that they can be called directly on a pyhpo.term.HPOTerm or a pyhpo.set.HPOSet.

class pyhpo.similarity.base.SimilarityBase(**data)

Base class to use for custom similarity calculations.

Custom implementations must inherit from pyhpo.similarity.base.SimilarityBase and provide a __call__ method with the same signature as pyhpo.similarity.base.SimilarityBase.__call__().

You can also provide a list of dependencies on other similarity methods that should be called beforehand. The results of these calls are passed as the dependencies parameter to the __call__ method.

Parameters:

dependencies (List[str]) –
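The dependency mechanism described above can be pictured with a small stand-alone model. This is not pyhpo's implementation: `ToyMethod`, `ToyRegistry`, `Equality` and `Inverted` are hypothetical names, and terms are plain strings. The sketch only shows how results of dependency methods could be computed first and then handed to `__call__`.

```python
from typing import Dict, List, Type


class ToyMethod:
    """Stand-in for a similarity class; `dependencies` names other methods."""

    dependencies: List[str] = []

    def __call__(
        self, term1: str, term2: str, kind: str, dependencies: List[float]
    ) -> float:
        raise NotImplementedError


class Equality(ToyMethod):
    # No dependencies: 1.0 for identical terms, 0.0 otherwise
    def __call__(self, term1, term2, kind, dependencies):
        return 1.0 if term1 == term2 else 0.0


class Inverted(ToyMethod):
    # Builds on the result of the method registered as 'equality'
    dependencies = ["equality"]

    def __call__(self, term1, term2, kind, dependencies):
        return 1.0 - dependencies[0]


class ToyRegistry:
    """Resolves a method's dependencies, then calls the method itself."""

    def __init__(self) -> None:
        self._methods: Dict[str, ToyMethod] = {}

    def register(self, name: str, method_class: Type[ToyMethod]) -> None:
        self._methods[name] = method_class()

    def __call__(self, term1: str, term2: str, kind: str, method: str) -> float:
        handler = self._methods[method]
        # Compute every dependency first, in the order it is listed
        results = [self(term1, term2, kind, dep) for dep in handler.dependencies]
        return handler(term1, term2, kind, results)


registry = ToyRegistry()
registry.register("equality", Equality)
registry.register("inverted", Inverted)

same = registry("Scoliosis", "Scoliosis", "omim", "inverted")
different = registry("Scoliosis", "Kyphosis", "omim", "inverted")
```

In this model, the order of values in the `dependencies` argument follows the order of names in the class-level `dependencies` list.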

__call__

SimilarityBase.__call__(term1, term2, kind, dependencies)

This method performs the actual similarity calculation. It must be provided by every custom similarity class.

Parameters:
  • term1 (HPOTerm) – One of the two terms to compare

  • term2 (HPOTerm) – The other of the two terms to compare

  • kind (str) – An extra parameter, usually omim or gene, that specifies which annotations to consider for the similarity calculation

  • dependencies (List[float]) – A list of results from other similarity calculations that are computed beforehand. This is optional, but helpful if your implementation builds on an existing similarity calculation.

Return type:

float

Examples

from typing import List

from pyhpo import Ontology
from pyhpo.similarity.base import SimScore, SimilarityBase

class CustomSimscore(SimilarityBase):
    # For demo purposes, we will just check for equality

    def __call__(
        self,
        term1: 'pyhpo.HPOTerm',
        term2: 'pyhpo.HPOTerm',
        kind: str,
        dependencies: List[float]
    ) -> float:
        if term1 == term2:
            return 1
        else:
            return 0

SimScore.register('custom_method', CustomSimscore)

_ = Ontology()

term1 = Ontology.get_hpo_object('Scoliosis')
term2 = Ontology.get_hpo_object('Thoracic scoliosis')

sim_score = term1.similarity_score(
    other=term2,
    kind='omim',  # actually doesn't matter in this example
    method='custom_method'
)

assert sim_score == 0

# Now compare the term with itself
sim_score = term1.similarity_score(
    other=term1,
    kind='omim',  # actually doesn't matter in this example
    method='custom_method'
)

assert sim_score == 1