metakb.repository.base

Declare the base repository interface and associated helper functions.

class metakb.repository.base.AbstractRepository

Abstract definition of a repository class.

Used to access and store core MetaKB data.

abstract get_all_assertion_ids()

Return all assertion IDs

Return type:

list[str]

abstract get_gene(gene_id)

Attempt to retrieve a gene by exact ID match

Parameters:

gene_id (str) – exact gene ID as stored in the DB (e.g., “metakb.gene:hgnc_6407”)

Return type:

Optional[MappableConcept]

Returns:

gene if available, with child gene objects in extensions

abstract get_statement(statement_id)

Retrieve a statement

Parameters:

statement_id (str) – ID of the statement minted by the source

Return type:

Optional[Statement]

Returns:

complete statement if available

abstract get_stats()

Fetch counts for entities

Return type:

RepositoryStats

Returns:

structured stats data class

abstract initialize()

Set up DB schema

Return type:

None

abstract load_assertion(assertion)

Add a complete assertion object to the DB, or update it if it already exists

Parameters:

assertion (Statement) – MetaKB assertion

Return type:

None

abstract search_statements(variation_ids=None, gene_ids=None, therapy_ids=None, disease_ids=None, statement_ids=None, start=0, limit=None)

Perform entity-based search over all statements.

Return all statements matching any item within a given list of entity IDs.

For example, given the lists [DrugA, DrugB] and [GeneA, GeneB], return all statements that involve one of the two given drugs AND one of the two given genes.

Probable future changes:
  • Combo-therapy specific search

  • Specific logic for searching diseases/conditionsets

  • Search on source values rather than normalized values

Parameters:
  • variation_ids (Optional[list[str]]) – list of normalized variation IDs

  • gene_ids (Optional[list[str]]) – list of normalized gene IDs

  • therapy_ids (Optional[list[str]]) – list of normalized therapy IDs

  • disease_ids (Optional[list[str]]) – list of normalized disease IDs

  • statement_ids (Optional[list[str]]) – list of source statement IDs

  • start (int) – pagination start point

  • limit (Optional[int]) – page size

Return type:

list[Statement | VariantDiagnosticStudyStatement | VariantPrognosticStudyStatement | VariantTherapeuticResponseStudyStatement]

Returns:

list of statements matching provided criteria
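
The matching rule described above (any ID within a list, all provided lists together) can be illustrated with a minimal in-memory sketch. Everything here is an assumption made for illustration: the flattened dict records, the `search_sketch` name, and the choice to treat `None` as "no filter on this entity type" are not taken from the MetaKB implementation, and only two of the entity filters are shown.

```python
from typing import Optional

# Hypothetical flattened records; real statements are richer objects.
STATEMENTS = [
    {"id": "s1", "gene": "GeneA", "therapy": "DrugA"},
    {"id": "s2", "gene": "GeneB", "therapy": "DrugC"},
    {"id": "s3", "gene": "GeneC", "therapy": "DrugB"},
    {"id": "s4", "gene": "GeneA", "therapy": "DrugB"},
]


def search_sketch(
    gene_ids: Optional[list[str]] = None,
    therapy_ids: Optional[list[str]] = None,
    start: int = 0,
    limit: Optional[int] = None,
) -> list[dict]:
    """Match any ID within a list, and require every provided list to match."""
    # Assumption: a filter left as None places no constraint on that entity type.
    filters = {"gene": gene_ids, "therapy": therapy_ids}
    matches = [
        stmt
        for stmt in STATEMENTS
        if all(stmt[field] in ids for field, ids in filters.items() if ids is not None)
    ]
    # Pagination: skip `start` results, then return at most `limit` of them.
    end = None if limit is None else start + limit
    return matches[start:end]


hits = search_sketch(gene_ids=["GeneA", "GeneB"], therapy_ids=["DrugA", "DrugB"])
print([stmt["id"] for stmt in hits])  # ['s1', 's4']
```

Statement `s2` is excluded because its therapy is not in the drug list, and `s3` because its gene is not in the gene list. A real backend would presumably push this filtering and the `start`/`limit` paging into the database query rather than slicing a Python list.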

abstract teardown_db()

Reset repository storage.

Return type:

None
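
As a rough illustration of how a concrete repository might satisfy an interface like this one, the sketch below implements a dict-backed store. To stay self-contained it does not import `metakb`: `SketchRepository` is a hypothetical stand-in that mirrors only five of the abstract methods, statements are plain dicts rather than `Statement` objects, and keeping loaded assertions and retrievable statements in the same store is an assumption.

```python
from abc import ABC, abstractmethod
from typing import Optional


class SketchRepository(ABC):
    """Stand-in for a subset of AbstractRepository (not the real class)."""

    @abstractmethod
    def initialize(self) -> None: ...

    @abstractmethod
    def load_assertion(self, assertion: dict) -> None: ...

    @abstractmethod
    def get_statement(self, statement_id: str) -> Optional[dict]: ...

    @abstractmethod
    def get_all_assertion_ids(self) -> list[str]: ...

    @abstractmethod
    def teardown_db(self) -> None: ...


class InMemoryRepository(SketchRepository):
    """Hypothetical dict-backed implementation, useful mainly for tests."""

    def __init__(self) -> None:
        self._statements: dict[str, dict] = {}

    def initialize(self) -> None:
        # An in-memory store has no schema to create.
        pass

    def load_assertion(self, assertion: dict) -> None:
        # Keying on the ID gives add-or-update behaviour.
        self._statements[assertion["id"]] = assertion

    def get_statement(self, statement_id: str) -> Optional[dict]:
        # Return None instead of raising when the ID is unknown.
        return self._statements.get(statement_id)

    def get_all_assertion_ids(self) -> list[str]:
        return list(self._statements)

    def teardown_db(self) -> None:
        self._statements.clear()


repo = InMemoryRepository()
repo.initialize()
repo.load_assertion({"id": "a1", "version": 1})
repo.load_assertion({"id": "a1", "version": 2})  # same ID: updates in place
print(repo.get_all_assertion_ids())  # ['a1']
```

Because every method on the base class is abstract, Python refuses to instantiate a subclass that leaves any of them unimplemented, which is what makes the interface enforceable across storage backends.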

class metakb.repository.base.RepositoryStats(**data)

Define structure for reporting entity counts from the DB

num_diseases: int
num_documents: int
num_drugs: int
num_genes: int
num_metakb_assertions: int
num_source_statements: int
num_variations: int
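
The shape of this structure can be sketched with a standard-library dataclass. This is a stand-in only: the `(**data)` signature suggests the real class is built on a model library that accepts keyword data, whereas `RepositoryStatsSketch` below is a hypothetical name that reuses the field names listed above, and the counts are made up for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RepositoryStatsSketch:
    """Stand-in with the same field names as RepositoryStats."""

    num_diseases: int
    num_documents: int
    num_drugs: int
    num_genes: int
    num_metakb_assertions: int
    num_source_statements: int
    num_variations: int


# Illustrative counts only, not real MetaKB figures.
stats = RepositoryStatsSketch(
    num_diseases=3,
    num_documents=10,
    num_drugs=4,
    num_genes=5,
    num_metakb_assertions=2,
    num_source_statements=20,
    num_variations=7,
)

# A structured object converts cleanly to a dict for logging or JSON output.
for name, count in asdict(stats).items():
    print(f"{name}: {count}")
```

Returning a typed structure from `get_stats()` rather than a bare dict means callers can rely on every count being present under a fixed name.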