metakb.normalizers

Handle construction of and relay requests to VICC normalizer services.

exception metakb.normalizers.IllegalUpdateError

Raised if an illegal update operation is attempted.

class metakb.normalizers.NormalizerName(value, names=_not_given, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Constrain normalizer CLI options.

DISEASE = 'disease'
GENE = 'gene'
THERAPY = 'therapy'

class metakb.normalizers.VariationRuntimeProbeResult(variation_found, warnings, error=None)

Define result model for variation-normalizer runtime probes.

__init__(variation_found, warnings, error=None)
error: Optional[Exception] = None
variation_found: bool
warnings: Optional[list[str]]

class metakb.normalizers.ViccNormalizers(db_url=None, cache_size=DEFAULT_CACHE_SIZE)

Manage VICC concept normalization services.

The therapy, disease, and gene normalizer wrappers all behave roughly the same way: given a list of possible terms, run each of them through the normalizer, and return the normalized record for the one with the highest match type (tie goes to the earlier search term). Variations are handled differently; from the provided list of terms, the first one that normalizes completely is returned, so order is particularly important when multiple terms are given.

See concept normalization services in the documentation for more.
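
The two selection rules described above can be illustrated with a small, self-contained sketch. It does not call the real normalizers: `fake_normalize`, `FAKE_RECORDS`, the `example:` IDs, and the numeric match scores are all hypothetical stand-ins. Only the behavior is taken from the description: highest match type wins with ties going to the earlier term, while variation-style selection returns the first term that normalizes.

```python
from typing import Optional

# Hypothetical lookup table: term -> (normalized ID, match-type score).
# A higher score means a stronger match. None of this is real normalizer data.
FAKE_RECORDS: dict[str, tuple[str, int]] = {
    "aspirin": ("example:1", 80),
    "asa": ("example:1", 60),
    "vazalore": ("example:2", 80),
}


def fake_normalize(term: str) -> Optional[tuple[str, int]]:
    """Return (ID, match score) for a term, or None if nothing matches."""
    return FAKE_RECORDS.get(term.lower())


def pick_best(terms: list[str]) -> Optional[str]:
    """Return the ID with the highest match score; ties go to the earlier term."""
    best_id: Optional[str] = None
    best_score = -1
    for term in terms:
        result = fake_normalize(term)
        if result is None:
            continue
        concept_id, score = result
        # Strict ">" keeps the earlier term when two scores are equal.
        if score > best_score:
            best_id, best_score = concept_id, score
    return best_id


def pick_first(terms: list[str]) -> Optional[str]:
    """Variation-style selection: return the first term that normalizes at all."""
    for term in terms:
        result = fake_normalize(term)
        if result is not None:
            return result[0]
    return None
```

Because `pick_first` stops at the first success, reordering the input list can change its answer even when `pick_best` would return the same record.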

__init__(db_url=None, cache_size=DEFAULT_CACHE_SIZE)

Initialize normalizers. Construct a normalizer instance for each service (gene, variation, disease, therapy) and retain them as instance properties.

  • Gene concept lookups within the Variation Normalizer are resolved using the Gene Normalizer instance, rather than creating a second sub-instance.

  • The normalizers are exposed internally as callback functions wrapped in an lru_cache at initialization, so that a configurable cache size can be passed to the caching wrapper.

Parameters:
  • db_url (Optional[str]) – optional definition of shared normalizer database. Because the same parameter is passed to each concept normalizer, this only works for connecting a DynamoDB backend. If not given, each normalizer falls back on default behavior for connecting to a database, which includes checking their corresponding environment variables.

  • cache_size (Optional[int]) – size of the LRU cache used for each normalizer. Use None for an unbounded cache (i.e. no max size).
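
As a rough illustration of why the cache is applied at initialization, the sketch below wraps a callback with `functools.lru_cache` at runtime so that each instance can choose its own `maxsize`. The `CachedLookups` class and `slow_lookup` function are hypothetical and are not part of metakb; only the general pattern is assumed to match.

```python
from functools import lru_cache
from typing import Callable, Optional


class CachedLookups:
    """Hypothetical container that wraps a lookup callback in an LRU cache."""

    def __init__(
        self, lookup: Callable[[str], str], cache_size: Optional[int] = 128
    ) -> None:
        # Calling lru_cache(maxsize=...) here, rather than using it as a
        # decorator at definition time, lets the caller pick the cache size.
        # maxsize=None makes the cache unbounded.
        self.lookup = lru_cache(maxsize=cache_size)(lookup)


calls: list[str] = []


def slow_lookup(term: str) -> str:
    """Stand-in for an expensive normalizer call; records each real invocation."""
    calls.append(term)
    return term.upper()


lookups = CachedLookups(slow_lookup, cache_size=2)
lookups.lookup("braf")
lookups.lookup("braf")  # served from the cache; slow_lookup is not called again
```

A decorator written as `@lru_cache(maxsize=...)` on a method would fix the size when the class is defined, which is why a runtime wrapper is the more flexible choice here.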

static get_regulatory_approval_extension(therapy_norm_resp)

Given a therapy normalization service response, extract the regulatory approval extension.

Parameters:

therapy_norm_resp (NormalizationService) – Response from normalizing therapy

Return type:

Optional[Extension]

Returns:

Extension containing transformed regulatory approval and indication data, if a regulatory_approval extension exists in the therapy normalizer response

normalize_disease(query)

Attempt to normalize a disease query.

>>> from metakb.normalizers import ViccNormalizers
>>> v = ViccNormalizers()
>>> v.normalize_disease("von hippel-lindau syndrome")[1]
'ncit:C3105'

Parameters:

query (str) – Disease query to normalize

Raises:

TokenRetrievalError – If AWS credentials are expired

Return type:

tuple[NormalizationService, Optional[str]]

Returns:

Disease normalization response and normalized disease ID, if available.

normalize_gene(query)

Attempt to normalize a gene query.

>>> from metakb.normalizers import ViccNormalizers
>>> v = ViccNormalizers()
>>> v.normalize_gene("BRAF")[1]
'hgnc:1097'

Parameters:

query (str) – Gene query to normalize

Raises:

TokenRetrievalError – If AWS credentials are expired

Return type:

tuple[NormalizeService, Optional[str]]

Returns:

Gene normalization response and normalized gene ID, if available.

normalize_therapy(query)

Attempt to normalize a therapy query.

>>> from metakb.normalizers import ViccNormalizers
>>> v = ViccNormalizers()
>>> v.normalize_therapy("VAZALORE")[1]
'rxcui:1191'

Parameters:

query (str) – Therapy query to normalize

Raises:

TokenRetrievalError – If AWS credentials are expired

Return type:

tuple[NormalizationService, Optional[str]]

Returns:

Therapy normalization response and normalized therapy ID, if available.
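
The three synchronous methods above each return a `(response, ID)` tuple in which the ID may be None. The sketch below shows one way a caller might handle that shape. `fake_normalize_gene` is a hypothetical stand-in that returns a plain dict instead of a real service response, and `collect_ids` is an illustrative helper, not part of metakb.

```python
from typing import Callable, Optional


def fake_normalize_gene(query: str) -> tuple[dict, Optional[str]]:
    """Hypothetical stand-in with the documented return shape: (response, ID or None)."""
    known = {"BRAF": "hgnc:1097"}
    return {"query": query}, known.get(query.upper())


def collect_ids(
    terms: list[str],
    normalize: Callable[[str], tuple[object, Optional[str]]],
) -> tuple[dict[str, str], list[str]]:
    """Split terms into those with a normalized ID and those without one."""
    found: dict[str, str] = {}
    missing: list[str] = []
    for term in terms:
        _response, concept_id = normalize(term)
        if concept_id is None:
            missing.append(term)
        else:
            found[term] = concept_id
    return found, missing


found, missing = collect_ids(["BRAF", "not-a-gene"], fake_normalize_gene)
```

With an initialized `ViccNormalizers` instance, a bound method such as `normalize_gene` could presumably be passed as `normalize` in place of the stand-in.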

async normalize_variation(query)

Attempt to normalize a variation query.

Parameters:

query (str) – Variation query to normalize

Raises:

TokenRetrievalError – If AWS credentials are expired

Return type:

UnionType[Allele, CopyNumberChange, CopyNumberCount, None]

Returns:

A normalized variation, if available.
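
Because `normalize_variation` is a coroutine, it has to be awaited or driven by an event loop. The sketch below shows that calling pattern with a hypothetical `StubNormalizers` class that returns plain dicts; the real method returns variation objects and needs configured data sources, which this example deliberately avoids.

```python
import asyncio
from typing import Optional


class StubNormalizers:
    """Hypothetical stand-in exposing a coroutine named like the documented method."""

    async def normalize_variation(self, query: str) -> Optional[dict]:
        # Pretend that only one query normalizes; anything else yields None.
        if query == "BRAF V600E":
            return {"type": "Allele", "label": query}
        return None


async def normalize_many(normalizers: StubNormalizers, queries: list[str]) -> dict:
    """Await each query in order and keep the result, including None for misses."""
    results = {}
    for query in queries:
        results[query] = await normalizers.normalize_variation(query)
    return results


# asyncio.run starts an event loop, runs the coroutine to completion, and closes the loop.
results = asyncio.run(
    normalize_many(StubNormalizers(), ["BRAF V600E", "not a variant"])
)
```

Inside code that is already asynchronous, the call would use `await` directly rather than `asyncio.run`.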

metakb.normalizers.check_normalizers(db_url, normalizers=None)

Perform basic health checks on the gene, disease, and therapy normalizers.

Uses the internal health check methods provided by each. Note that health check failures (i.e. tables unavailable or unpopulated) are logged as WARNINGs, but unhandled exceptions encountered during checks are suppressed and logged as ERRORs.

>>> from metakb.normalizers import check_normalizers, NormalizerName
>>> check_normalizers(None, [NormalizerName.DISEASE])
True  # indicates success
>>> check_normalizers(None, [NormalizerName.THERAPY])
False  # indicates failure

Parameters:
  • db_url (Optional[str]) – optional designation of DB URL to use for each. Currently only works for DynamoDB. If not given, normalizers will fall back on their own env var/default configurations.

  • normalizers (Optional[Iterable[NormalizerName]]) – names of specific normalizers to check (check all if empty/None)

Return type:

bool

Returns:

True if all normalizers pass all checks, False if any failures are encountered.
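
The logging and return behavior described for this function can be approximated as follows. This is a sketch under stated assumptions, not the actual implementation: `run_health_checks` and its dict of check callables are hypothetical, and treating a suppressed exception as a failed check is an assumption inferred from the documented return value.

```python
import logging
from typing import Callable

_logger = logging.getLogger(__name__)


def run_health_checks(checks: dict[str, Callable[[], bool]]) -> bool:
    """Run every check, log problems, and report whether all of them passed."""
    success = True
    for name, check in checks.items():
        try:
            passed = check()
        except Exception:
            # Suppress the exception but record it at ERROR level with a traceback.
            # Assumption: an exception also counts as a failed check.
            _logger.exception("Unexpected error while checking %s", name)
            success = False
            continue
        if not passed:
            # An ordinary failed check (e.g. a missing table) is only a WARNING.
            _logger.warning("Health check failed for %s", name)
            success = False
    return success
```

Every check runs even after an earlier one fails, so a single call reports all problems in the log rather than only the first.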

async metakb.normalizers.probe_variation_normalizer_runtime(normalizers, query='BRAF V600E')

Probe variation-normalizer runtime readiness for a query.

This helper intentionally does not suppress exceptions so callers can present targeted diagnostics.

Parameters:
  • normalizers (ViccNormalizers) – initialized normalizer container

  • query (str) – variation query to test

Return type:

VariationRuntimeProbeResult

Returns:

probe result with variation/warnings data or captured exception
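
A caller might turn a probe result into a diagnostic message along these lines. `ProbeResult` is a hypothetical dataclass that mirrors only the documented fields of `VariationRuntimeProbeResult` (`variation_found`, `warnings`, `error`), and `summarize` is an illustrative helper; neither is part of metakb.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """Stand-in mirroring the documented fields of VariationRuntimeProbeResult."""

    variation_found: bool
    warnings: Optional[list[str]]
    error: Optional[Exception] = None


def summarize(result: ProbeResult) -> str:
    """Build a one-line diagnostic, checking the error field first."""
    if result.error is not None:
        return f"error: {type(result.error).__name__}: {result.error}"
    if result.variation_found:
        return "ok"
    # warnings may be None or empty, so fall back to a fixed message.
    details = "; ".join(result.warnings or []) or "no warnings reported"
    return f"not found: {details}"
```

Checking `error` before `variation_found` keeps a captured exception from being reported as a plain "not found".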

metakb.normalizers.update_normalizer(normalizer, db_url)

Refresh data for a normalizer.

Parameters:
  • normalizer (NormalizerName) – name of service to refresh

  • db_url (Optional[str]) – normalizer DB URL. If not given, will fall back on normalizer defaults.

Raises:

IllegalUpdateError – if attempting to update cloud DB instances

Return type:

None