Similarity
Builtin Similarity algorithms
Resnik
- class pyhpo.similarity.defaults.Resnik(**data)[source]
Based on Resnik P, Proceedings of the 14th IJCAI, (1995)
https://www.ijcai.org/Proceedings/95-1/Papers/059.pdf
- Parameters:
dependencies (List[str]) –
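As a rough illustration of the idea in the cited paper, Resnik similarity is the information content (IC) of the most informative common ancestor of two terms. The sketch below uses a made-up toy ontology. The `ANCESTORS` and `IC` dictionaries and the `resnik` function are hypothetical and not part of pyhpo.

```python
from typing import Dict, Set

# Hypothetical toy data: each term maps to its ancestors (including itself).
ANCESTORS: Dict[str, Set[str]] = {
    "root": {"root"},
    "A": {"A", "root"},
    "A1": {"A1", "A", "root"},
    "A2": {"A2", "A", "root"},
}

# Hypothetical information content values (rarer terms have a higher IC).
IC: Dict[str, float] = {"root": 0.0, "A": 1.2, "A1": 2.5, "A2": 3.0}


def resnik(term1: str, term2: str) -> float:
    """Return the IC of the most informative common ancestor."""
    common = ANCESTORS[term1] & ANCESTORS[term2]
    return max((IC[t] for t in common), default=0.0)
```

In this toy data the two sibling terms `A1` and `A2` share the ancestors `A` and `root`, so their score is the IC of `A`.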
Lin
- class pyhpo.similarity.defaults.Lin(**data)[source]
Based on Lin D, Proceedings of the 15th ICML, (1998)
https://dl.acm.org/doi/10.5555/645527.657297
- Parameters:
dependencies (List[str]) –
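The measure from the cited paper normalizes the IC of the most informative common ancestor (MICA) by the IC of the two compared terms. A minimal sketch on plain numbers follows. The function name `lin` and its parameters are illustrative assumptions, not the pyhpo signature.

```python
def lin(ic1: float, ic2: float, ic_mica: float) -> float:
    """Lin similarity: 2 * IC(MICA) / (IC(term1) + IC(term2))."""
    denominator = ic1 + ic2
    if denominator == 0:
        # Both terms carry no information; avoid dividing by zero.
        return 0.0
    return 2 * ic_mica / denominator
```

Comparing a term with itself gives 1.0, because the MICA is then the term itself.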
JC (Jiang & Conrath)
- class pyhpo.similarity.defaults.JC(**data)[source]
Jiang & Conrath similarity score, based on Jiang J, Conrath D, Rocling X, (1997) and Deng Y, et al., PLoS One, (2015)
https://aclanthology.org/O97-1002.pdf
Note
This method was implemented incorrectly in earlier versions and was fixed in 3.3.0 based on this discussion
- Parameters:
dependencies (List[str]) –
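Jiang and Conrath define a distance between two terms from their IC values and the IC of their most informative common ancestor (MICA). The sketch below converts that distance to a similarity with `1 / (1 + distance)`. This conversion is an assumption made for illustration, and the function names are hypothetical. Check the pyhpo source for the exact form it uses.

```python
def jc_distance(ic1: float, ic2: float, ic_mica: float) -> float:
    """Jiang & Conrath distance: IC(term1) + IC(term2) - 2 * IC(MICA)."""
    return ic1 + ic2 - 2 * ic_mica


def jc_similarity(ic1: float, ic2: float, ic_mica: float) -> float:
    """Map the distance into (0, 1]; identical terms score 1.0 (assumed conversion)."""
    return 1 / (1 + jc_distance(ic1, ic2, ic_mica))
```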
Relevance
- class pyhpo.similarity.defaults.Relevance(**data)[source]
Based on Schlicker A, et al., BMC Bioinformatics, (2006)
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-7-302
- Parameters:
dependencies (List[str]) –
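The relevance measure in the cited paper weights a Lin-style score by how specific the common ancestor is. A minimal sketch follows, assuming that IC is defined as the negative natural logarithm of the term probability, so that the probability can be recovered with `exp(-IC)`. The function name and parameters are illustrative, not the pyhpo signature.

```python
import math


def relevance(ic1: float, ic2: float, ic_mica: float) -> float:
    """Lin similarity scaled by (1 - p(MICA)), with p = exp(-IC) as an assumption."""
    denominator = ic1 + ic2
    if denominator == 0:
        return 0.0
    lin_score = 2 * ic_mica / denominator
    p_mica = math.exp(-ic_mica)  # assumes IC = -ln(p)
    return lin_score * (1 - p_mica)
```

A very general common ancestor (probability close to 1) pushes the score toward 0, even if the Lin score alone would be high.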
InformationCoefficient
- class pyhpo.similarity.defaults.InformationCoefficient(**data)[source]
Based on Li B, et al., arXiv, (2010)
https://arxiv.org/abs/1001.0958
- Parameters:
dependencies (List[str]) –
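The information coefficient from the cited paper also starts from a Lin-style score, but scales it by a factor that grows with the IC of the most informative common ancestor (MICA). A minimal sketch on plain numbers follows. The function name and parameters are illustrative assumptions.

```python
def information_coefficient(ic1: float, ic2: float, ic_mica: float) -> float:
    """Lin similarity scaled by (1 - 1 / (1 + IC(MICA)))."""
    denominator = ic1 + ic2
    if denominator == 0:
        return 0.0
    lin_score = 2 * ic_mica / denominator
    return lin_score * (1 - 1 / (1 + ic_mica))
```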
GraphIC
- class pyhpo.similarity.defaults.GraphIC(**data)[source]
Graph-based information coefficient, based on Deng Y, et al., PLoS One, (2015)
https://pubmed.ncbi.nlm.nih.gov/25664462/
- Parameters:
dependencies (List[str]) –
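A graph-based IC score can be read as a weighted overlap of the two ancestor sets: the summed IC of the shared ancestors divided by the summed IC of all ancestors of either term. The sketch below follows that reading on hypothetical toy data. Whether pyhpo includes the terms themselves in the ancestor sets is an assumption here, and `graph_ic` is not a pyhpo function.

```python
from typing import Dict, Set


def graph_ic(
    term1: str,
    term2: str,
    ancestors: Dict[str, Set[str]],
    ic: Dict[str, float],
) -> float:
    """Summed IC of shared ancestors divided by summed IC of all ancestors."""
    common = ancestors[term1] & ancestors[term2]
    union = ancestors[term1] | ancestors[term2]
    union_ic = sum(ic[t] for t in union)
    if union_ic == 0:
        return 0.0
    return sum(ic[t] for t in common) / union_ic


# Hypothetical toy ontology; each ancestor set includes the term itself.
toy_ancestors = {"A1": {"A1", "A", "root"}, "A2": {"A2", "A", "root"}}
toy_ic = {"root": 0.0, "A": 1.2, "A1": 2.5, "A2": 3.0}
score = graph_ic("A1", "A2", toy_ancestors, toy_ic)
```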
Distance
Custom Similarity algorithms
The similarity submodule lets you create custom similarity calculations for comparing single terms or term sets.
It provides a simple interface for registering custom similarity handlers, so that they can be called directly on a pyhpo.term.HPOTerm or a pyhpo.set.HPOSet.
- class pyhpo.similarity.base.SimilarityBase(**data)[source]
Base class to use for custom similarity calculations.
Custom implementations must inherit from pyhpo.similarity.base.SimilarityBase and provide a __call__ method with the same signature as pyhpo.similarity.base.SimilarityBase.__call__().
You can also provide a list of dependencies, which are other similarity methods that should be called beforehand. The results of these calls are passed as the dependencies parameter to the __call__ method.
- Parameters:
dependencies (List[str]) –
__call__
- SimilarityBase.__call__(term1, term2, kind, dependencies)[source]
This method performs the actual similarity calculation. It must be provided in all custom similarity classes.
- Parameters:
term1 (HPOTerm) – One of the two terms to compare
term2 (HPOTerm) – The other of the two terms to compare
kind (str) – An extra parameter, usually omim or gene, that specifies which annotations to consider for similarity
dependencies (List[float]) – A list of results from other similarity calculations that are computed beforehand. You do not have to use it, but it is helpful if your implementation builds on an existing similarity calculation.
- Return type:
float
Examples
from typing import List

from pyhpo.similarity.base import SimScore, SimilarityBase
from pyhpo import Ontology


class CustomSimscore(SimilarityBase):
    # For demo purposes, we will just check for equality
    def __call__(
        self,
        term1: 'pyhpo.HPOTerm',
        term2: 'pyhpo.HPOTerm',
        kind: str,
        dependencies: List[float]
    ) -> float:
        if term1 == term2:
            return 1
        else:
            return 0


# Make the custom method available under the name 'custom_method'
SimScore.register('custom_method', CustomSimscore)

# Load the ontology before looking up any terms
_ = Ontology()
term1 = Ontology.get_hpo_object('Scoliosis')
term2 = Ontology.get_hpo_object('Thoracic scoliosis')

sim_score = term1.similarity_score(
    other=term2,
    kind='omim',  # actually doesn't matter in this example
    method='custom_method'
)
assert sim_score == 0

# Now comparing the same term to each other
sim_score = term1.similarity_score(
    other=term1,
    kind='omim',  # actually doesn't matter in this example
    method='custom_method'
)
assert sim_score == 1
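The dependencies mechanism described for SimilarityBase can be pictured with a small stand-alone sketch: a method lists other registered methods by name, and their results are passed to `__call__` as a list of floats in the same order. Everything below (`ToySimilarity`, `REGISTRY`, `register`, `score`) is a hypothetical stand-in written for illustration. It does not use pyhpo, and pyhpo's actual resolution logic may differ.

```python
from typing import Dict, List, Type


class ToySimilarity:
    # Names of other registered methods whose results this method needs.
    dependencies: List[str] = []

    def __call__(
        self, term1: str, term2: str, kind: str, dependencies: List[float]
    ) -> float:
        raise NotImplementedError


class Equality(ToySimilarity):
    def __call__(self, term1, term2, kind, dependencies):
        return 1.0 if term1 == term2 else 0.0


class HalfEquality(ToySimilarity):
    dependencies = ["equality"]

    def __call__(self, term1, term2, kind, dependencies):
        # dependencies[0] holds the result of the "equality" method.
        return dependencies[0] / 2


REGISTRY: Dict[str, ToySimilarity] = {}


def register(name: str, cls: Type[ToySimilarity]) -> None:
    REGISTRY[name] = cls()


def score(method: str, term1: str, term2: str, kind: str = "omim") -> float:
    handler = REGISTRY[method]
    # Compute each dependency first, then hand the results to the handler.
    results = [score(dep, term1, term2, kind) for dep in handler.dependencies]
    return handler(term1, term2, kind, results)


register("equality", Equality)
register("half_equality", HalfEquality)
```

With this toy registry, `score("half_equality", "a", "a")` first evaluates `equality` and then halves its result.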