HPOSet

An HPOSet instance contains a set of HPO terms. This class is useful for representing a patient’s clinical information.

It provides analytical helper functions to narrow down the provided clinical information.

HPOSet class

class pyhpo.set.HPOSet(items)
Parameters:

items (Iterable[pyhpo.HPOTerm]) –

child_nodes

HPOSet.child_nodes()

Returns a new HPOSet that contains only the most specific HPO term for each subtree.

In other words, it returns only those HPO terms that have no descendant HPO terms present in the set.

Returns:

HPOSet instance that contains only the most specific child nodes of the current HPOSet

Return type:

HPOSet
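The filtering rule described above can be illustrated without pyhpo. The following is a minimal sketch on a toy ontology; the `PARENTS` mapping, the term names, and the helper functions are hypothetical and are not part of the library.

```python
# Toy ontology (hypothetical ids): each term maps to its direct parents.
PARENTS = {
    "root": set(),
    "A": {"root"},
    "A1": {"A"},
    "A2": {"A"},
    "B": {"root"},
}


def ancestors(term):
    """Return every ancestor of `term`, excluding the term itself."""
    found = set()
    stack = list(PARENTS[term])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(PARENTS[current])
    return found


def most_specific(terms):
    """Keep only terms that are not an ancestor of another term in the set."""
    terms = set(terms)
    covered = set()
    for term in terms:
        covered |= ancestors(term)
    return terms - covered


# "A" is dropped because its descendant "A1" is also present.
result = most_specific({"A", "A1", "B"})
print(sorted(result))
```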

remove_modifier

HPOSet.remove_modifier()

Removes all modifier terms. By default, this includes:

  • Mode of inheritance: 'HP:0000005'

  • Clinical modifier: 'HP:0012823'

  • Frequency: 'HP:0040279'

  • Clinical course: 'HP:0031797'

  • Blood group: 'HP:0032223'

  • Past medical history: 'HP:0032443'

Returns:

HPOSet instance that contains only Phenotypic abnormality HPO terms

Return type:

HPOSet
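As a rough illustration, removing modifiers amounts to dropping every term that sits under one of the modifier roots listed above. The sketch below assumes a hypothetical `LINEAGE` lookup (term id to the ids of the term and all its ancestors) and made-up `EX:` term ids; pyhpo derives this information from the ontology itself.

```python
# Modifier roots as listed in the documentation above.
MODIFIER_ROOTS = {
    "HP:0000005",  # Mode of inheritance
    "HP:0012823",  # Clinical modifier
    "HP:0040279",  # Frequency
    "HP:0031797",  # Clinical course
    "HP:0032223",  # Blood group
    "HP:0032443",  # Past medical history
}

# Hypothetical lookup: term id -> the term itself plus all of its ancestors.
LINEAGE = {
    "EX:phenotype": {"EX:phenotype", "HP:0000118"},
    "EX:inheritance": {"EX:inheritance", "HP:0000005"},
    "EX:frequency": {"EX:frequency", "HP:0040279"},
}


def remove_modifiers(term_ids):
    """Drop every term whose lineage touches a modifier root."""
    return {t for t in term_ids if not (LINEAGE[t] & MODIFIER_ROOTS)}


kept = remove_modifiers(LINEAGE.keys())
print(kept)
```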

replace_obsolete

HPOSet.replace_obsolete(verbose=False)

Replaces obsolete terms with their replacement terms.

Warning

Not all obsolete terms have a replacement. Obsolete terms without replacements will be removed from the set.

Parameters:

verbose (bool, default: False) – Print warnings if an obsolete term does not have a replacement.

Returns:

A new HPOSet

Return type:

HPOSet
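The replacement rule, including the warning case, could be sketched as follows. The `TERMS` table and the function are hypothetical stand-ins, and using `warnings.warn` for the verbose output is a choice made for this sketch only.

```python
import warnings

# Hypothetical records: term id -> (is_obsolete, replacement id or None).
TERMS = {
    "T1": (False, None),
    "T2": (True, "T4"),
    "T3": (True, None),  # obsolete, no replacement available
    "T4": (False, None),
}


def replace_obsolete(term_ids, verbose=False):
    """Return a new set with obsolete terms swapped for their replacement.

    Obsolete terms without a replacement are dropped.
    """
    result = set()
    for term_id in term_ids:
        obsolete, replacement = TERMS[term_id]
        if not obsolete:
            result.add(term_id)
        elif replacement is not None:
            result.add(replacement)
        elif verbose:
            warnings.warn(f"{term_id} is obsolete and has no replacement")
    return result


updated = replace_obsolete({"T1", "T2", "T3"})
print(sorted(updated))
```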

all_genes

HPOSet.all_genes()

Calculates the union of the genes attached to the HPO Terms in this set

Returns:

Set of all genes associated with the HPOTerms in the set

Return type:

set of annotations.Gene

omim_diseases

HPOSet.omim_diseases()

Calculates the union of the Omim diseases attached to the HPO Terms in this set

Returns:

Set of all Omim diseases associated with the HPOTerms in the set

Return type:

set of annotations.Omim
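Both all_genes and omim_diseases describe a union over per-term annotations. A minimal sketch of that idea, using a hypothetical `GENES_BY_TERM` mapping with placeholder gene names rather than real pyhpo annotation objects:

```python
# Hypothetical annotations: term id -> associated gene symbols.
GENES_BY_TERM = {
    "T1": {"GENE_A", "GENE_B"},
    "T2": {"GENE_B", "GENE_C"},
    "T3": set(),
}


def union_of_annotations(term_ids, annotations):
    """Collect every annotation attached to at least one of the terms."""
    return set().union(*(annotations[t] for t in term_ids))


all_genes = union_of_annotations(["T1", "T2", "T3"], GENES_BY_TERM)
print(sorted(all_genes))
```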

information_content

HPOSet.information_content(kind='')

Returns basic information content statistics for the HPOTerms in the set

Parameters:

kind (str, default: omim) – Which kind of information content should be calculated. Options are ['omim', 'orpha', 'decipher', 'gene']

Returns:

Dict with the following items

  • mean - float - Mean information content

  • max - float - Maximum information content value

  • total - float - Sum of all information content values

  • all - list of float - List with all information content values

Return type:

dict
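The shape of the returned dict can be reproduced from a plain list of values. This is only a sketch: the input numbers are illustrative, and returning zeros for an empty input is an assumption, not documented pyhpo behavior.

```python
def ic_summary(values):
    """Summarize information content values as mean, max, total and all."""
    values = list(values)
    total = sum(values)
    return {
        "mean": total / len(values) if values else 0.0,  # assumption for empty input
        "max": max(values, default=0.0),
        "total": total,
        "all": values,
    }


stats = ic_summary([1.5, 2.5, 4.0])
print(stats["mean"], stats["max"], stats["total"])
```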

variance

HPOSet.variance()

Calculates the distances between all its term-pairs. It also provides basic calculations for variances among the pairs.

Returns:

Tuple with the variance metrics

  • float Average distance between pairs

  • int Smallest distance between pairs

  • int Largest distance between pairs

  • list of int List of all distances between pairs

Return type:

tuple of (float, int, int, list of int)
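A sketch of how such a tuple could be computed from pairwise distances. It assumes each unordered pair is counted once and uses a hypothetical distance function (absolute difference of integers) in place of the path length between HPO terms; the all-zero result for fewer than two items is also an assumption.

```python
from itertools import combinations


def distance_stats(items, distance):
    """Return (mean, min, max, all distances) over all unordered pairs."""
    distances = [distance(a, b) for a, b in combinations(items, 2)]
    if not distances:
        return (0, 0, 0, [])  # assumption: nothing to compare
    mean = sum(distances) / len(distances)
    return (mean, min(distances), max(distances), distances)


# Integers stand in for terms; abs difference stands in for path length.
stats = distance_stats([1, 4, 6], lambda a, b: abs(a - b))
print(stats)
```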

combinations

HPOSet.combinations()

Helper generator function that yields all possible pairwise combinations of its terms

This function is direction-dependent: every pair appears twice, once for each direction.

Yields:

Tuple of term.HPOTerm – Tuple containing the following items

  • HPOTerm instance 1 of the pair

  • HPOTerm instance 2 of the pair

Return type:

Iterator[Tuple[HPOTerm, HPOTerm]]

Examples

ci = HPOSet([term1, term2, term3])
list(ci.combinations())

# Output:
[
    (term1, term2),
    (term1, term3),
    (term2, term1),
    (term2, term3),
    (term3, term1),
    (term3, term2)
]

combinations_one_way

HPOSet.combinations_one_way()

Helper generator function that yields all possible pairwise combinations of its terms

This method reports each pair only once.

Yields:

Tuple of term.HPOTerm – Tuple containing the following items

  • HPOTerm instance 1 of the pair

  • HPOTerm instance 2 of the pair

Return type:

Iterator[Tuple[HPOTerm, HPOTerm]]

Example

ci = HPOSet([term1, term2, term3])
list(ci.combinations_one_way())

# Output:
[
    (term1, term2),
    (term1, term3),
    (term2, term3)
]
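The difference between the two pairing modes mirrors the standard library's `itertools.permutations` and `itertools.combinations`. The sketch below uses placeholder strings instead of HPOTerm objects and is not taken from the pyhpo implementation.

```python
from itertools import combinations, permutations

# Placeholder strings standing in for HPOTerm objects.
terms = ["term1", "term2", "term3"]

# Direction-dependent: every pair appears in both orders.
two_way = list(permutations(terms, 2))

# One-way: each unordered pair appears exactly once.
one_way = list(combinations(terms, 2))

print(len(two_way), len(one_way))
```

For n terms, the direction-dependent variant yields n(n-1) pairs and the one-way variant yields n(n-1)/2.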

similarity

HPOSet.similarity(other, kind='', method='', combine='funSimAvg')

Calculates the similarity to another HPOSet, according to Robinson et al., American Journal of Human Genetics, (2008) and Pesquita et al., BMC Bioinformatics, (2008)

Parameters:
  • other (HPOSet) – Another HPOSet to measure the similarity to

  • kind (str, default '') – Which kind of information content should be calculated. Options are ['omim', 'orpha', 'decipher', 'gene']. See pyhpo.term.HPOTerm.similarity_score() for options

  • method (string, default '') –

    The method to use to calculate the similarity. See pyhpo.term.HPOTerm.similarity_score() for options

    Additional options:

    • equal - Calculates exact matches between both sets

  • combine (string, default funSimAvg) –

    The method to combine similarity measures.

    Available options:

    • funSimAvg - Schlicker A, BMC Bioinformatics, (2006)

    • funSimMax - Schlicker A, BMC Bioinformatics, (2006)

    • BMA - Deng Y, et al., PLoS One, (2015)

Returns:

The similarity score to the other HPOSet

Return type:

float

Raises:
  • RuntimeError – The specified method or combine does not exist

  • NotImplementedError – This error can only occur with custom Similarity-Score methods that do not have a similarity method defined.

  • AttributeError – The information content for kind does not exist
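To make the combine options more concrete, here is a sketch of how a matrix of pairwise term similarities might be collapsed into one score. The formulas follow a common reading of funSimAvg, funSimMax, and BMA; whether pyhpo computes them exactly this way is an assumption, and the score matrix is made up.

```python
def combine_scores(matrix, combine="funSimAvg"):
    """Collapse a similarity matrix (rows: set A, columns: set B) into one score."""
    row_best = [max(row) for row in matrix]        # best match for each term in A
    col_best = [max(col) for col in zip(*matrix)]  # best match for each term in B
    row_mean = sum(row_best) / len(row_best)
    col_mean = sum(col_best) / len(col_best)

    if combine == "funSimAvg":
        return (row_mean + col_mean) / 2
    if combine == "funSimMax":
        return max(row_mean, col_mean)
    if combine == "BMA":
        return (sum(row_best) + sum(col_best)) / (len(row_best) + len(col_best))
    raise RuntimeError(f"Unknown combine method: {combine}")


# Illustrative scores for a set of 2 terms compared against a set of 3 terms.
scores = [
    [0.8, 0.2, 0.4],
    [0.1, 0.6, 0.0],
]

avg = combine_scores(scores, "funSimAvg")
best = combine_scores(scores, "funSimMax")
bma = combine_scores(scores, "BMA")
print(avg, best, bma)
```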

toJSON

HPOSet.toJSON(verbose=False)

Creates a JSON-like object of the HPOSet

Parameters:

verbose (bool, default False) – Include extra properties of the HPOTerm

Returns:

a list of HPOTerm dict objects

Return type:

list of dict

serialize

HPOSet.serialize()

Creates a string serialization that can be used to rebuild the same HPOSet via pyhpo.set.HPOSet.from_serialized()

Returns:

A string representation of the HPOSet

Return type:

str

BasicHPOSet class

class pyhpo.set.BasicHPOSet(items)

Child of HPOSet that automatically:

  • removes parent terms

  • removes modifier terms

  • replaces obsolete terms

Parameters:

items (Iterable[pyhpo.HPOTerm]) –

Class methods

from_queries

classmethod HPOSet.from_queries(queries)

Builds an HPO set by specifying a list of queries to run on the pyhpo.ontology.Ontology

Parameters:

queries (list of (string or int)) – The queries to be run to identify the HPOTerm from the ontology

Returns:

A new HPOSet

Return type:

pyhpo.set.HPOSet

Examples

ci = HPOSet.from_queries([
    'Scoliosis',
    'HP:0001234',
    12
])
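The example mixes three query styles: a term name, an HPO id string, and a bare integer id. One plausible way to resolve such mixed queries is sketched below. The `NAMES` table, its placeholder labels, and the `resolve` helper are hypothetical; in pyhpo the lookup is performed by the ontology.

```python
# Hypothetical mini-ontology: integer id -> term name (labels are placeholders).
NAMES = {
    2650: "Scoliosis",
    1234: "Example term A",
    12: "Example term B",
}
BY_NAME = {name: idx for idx, name in NAMES.items()}


def resolve(query):
    """Map a name, an 'HP:' id string, or an integer id to an integer id."""
    if isinstance(query, int):
        idx = query
    elif query.startswith("HP:"):
        idx = int(query[3:])
    else:
        idx = BY_NAME[query]  # raises KeyError for unknown names
    if idx not in NAMES:
        raise KeyError(query)
    return idx


resolved = {resolve(q) for q in ["Scoliosis", "HP:0001234", 12]}
print(sorted(resolved))
```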

from_serialized

classmethod HPOSet.from_serialized(pickle)

Rebuilds an HPO set from a serialized HPOSet object

Parameters:

pickle (str) – The serialized HPOSet object

Returns:

A new HPOSet

Return type:

pyhpo.set.HPOSet

Examples

ci = HPOSet.from_serialized('12+24+66628')
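The string in the example suggests a format of integer term ids joined by `+`. A round trip over such a format can be sketched as follows; sorting the ids and the helper names are assumptions of this sketch, not confirmed details of the library.

```python
def serialize_ids(term_ids):
    """Join integer term ids into a '+'-separated string (sorted, by assumption)."""
    return "+".join(str(i) for i in sorted(term_ids))


def deserialize_ids(text):
    """Parse a '+'-separated string back into a set of integer term ids."""
    return {int(part) for part in text.split("+")} if text else set()


def format_hpo_id(index):
    """Render an integer id in the usual 'HP:' plus seven digits form."""
    return f"HP:{index:07d}"


serialized = serialize_ids({66628, 12, 24})
restored = deserialize_ids(serialized)
print(serialized, [format_hpo_id(i) for i in sorted(restored)])
```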