The exports module¶
CIViCpy supports exporting of CIViC records to Variant Call Format (VCF) files.
This enables downstream analyses such as integrating with IGV, VEP, and
other common bioinformatics tools. VCF exports are maintained via the civic_vcf_writer
and civic_vcf_record modules in the civicpy.exports namespace:
>>>from civicpy.exports.civic_vcf_writer import CivicVcfWriter
>>>from civicpy.exports.civic_vcf_record import CivicVcfRecord
CIViCPy also supports exporting of CIViC assertion records to JSON files, where
assertions are represented as Global Alliance for Genomics and Health (GA4GH) Genomic
Knowledge Standards (GKS) objects. GKS JSON exports are maintained via the
civic_gks_writer and civic_gks_record modules in the civicpy.exports
namespace:
>>>from civicpy.exports.civic_gks_writer import CivicGksWriter
>>>from civicpy.exports.civic_gks_record import CivicGksPredictiveAssertion, CivicGksDiagnosticAssertion, CivicGksPrognosticAssertion
Other file formats are planned for future releases. Suggestions are welcome on our GitHub issues page.
VCF¶
VCFs are written using the civicpy.exports.CivicVcfWriter class, to
which you add civicpy.exports.CivicVcfRecord by adding them during
initialization. civic.Variant records can be converted to
civicpy.exports.CivicVcfRecord records by passing in the variant
during initialization.
In order to verify whether a variant can be converted to a CivicVcfRecord object, the convenience method
is_valid_for_vcf can be called on a civic.Variant object.
Each CivicVcfRecord object passed to the CivicVcfWriter is written to the VCF file.
If two records share the same chromosome, start position, and
reference allele(s), they will not be combined into one VCF record but will instead be written
as separate VCF records. Additional CIViC data are added to the VCF as annotations to the
CSQ (consequence) INFO field. All CIViC molecular profiles that the underlying variant is a part
are identified and the evidence items and assertions linked to these molecular profiles are added
to the CSQ field with one CSQ entry for each evidence item and/or assertion. Whether
a specific CSQ entry reflects an evidence item or an assertion is determined by the
CIViC Entity Type CSQ field. By utilizing the CSQ field for annotations, the resulting VCF is compatible for
import into Google BigQuery (git.io/bigquery-variant-annotation).
The status of the Assertions and EvidenceItems added to the CSQ annotations can
be controlled by the include_status parameter. Only items matching the
desired include_status(es) will be added to the CSQ annotation.
VCF CSQ Field Attributes
CSQ Field |
Description |
Compound Field [*] |
|---|---|---|
Allele |
Alternate allele |
No |
Consequence |
CIViC sequence ontology variant types for this variant |
Yes |
SYMBOL |
HGNC gene symbol for the gene associated with this variant |
No |
Entrez Gene ID |
Entrez gene identifier for the gene associated with this variant |
No |
Feature_type |
“transcript” |
No |
Feature |
The Ensembl identifier for the CIViC representative transcripts of this variant |
No |
HGVSc |
Variant representation using HGVS notation (DNA level), corresponding to the Feature |
No |
HGVSp |
Variant representation using HGVS notation (Protein level), corresponding to the Feature |
No |
CIViC Variant Name |
The CIViC variant name of this variant |
No |
CIViC Variant ID |
The CIViC internal identifier for this variant |
No |
CIViC Variant Aliases |
CIViC aliases for this variant |
Yes |
CIViC Variant URL |
CIViC URL for this variant |
No |
CIViC Molecular Profile Name |
The CIViC molecular profile name for the molecular profile of the evidence item or assertion described in this CSQ record. The molecular profile may either be a simple molecular profile for just this variant or a complex molecular profile involving this variant in combination with other CIViC variants. |
No |
CIViC Molecular Profile ID |
The CIViC internal identifier for the molecular profile |
No |
CIViC Molecular Profile Aliases |
CIViC aliases for this molecular profile |
Yes |
CIViC Molecular Profile URL |
CIViC URL for this molecular profile |
No |
CIViC HGVS |
CIViC HGVS strings for this variant |
Yes |
Allele Registry ID |
The allele registry identifier for this variant |
No |
ClinVar IDs |
ClinVar IDs associated with this variant |
Yes |
CIViC Molecular Profile Score |
The CIViC score reflecting the reelative abundance of total available curated evidence for this molecular profile |
No |
CIViC Entity Type |
The type of entity being annotated, either “evidence” or “assertion” |
No |
CIViC Entity ID |
The CIViC internal identifier for the entity being annotated |
No |
CIViC Entity URL |
The CIViC direct URL to the entity being annotated |
No |
CIViC Entity Source |
For evidence entities, the identifier of the publication used to create the evidence including the source type in the format “sourceId_(sourceType)” |
No |
CIViC Entity Variant Origin |
The variant origin of the entity being annotated, either “Somatic”, “Rare Germline”, “Common Germline”, “Unknown”, “N/A”, or “Mixed” |
No |
CIViC Entity Status |
The status of the CIViC entity being annotated, either “submitted”, “accepted”, or “rejected” |
No |
CIViC Entity Significance |
The type of signifiance of the entity being annotated |
No |
CIViC Entity Direction |
The direction of the significance of the entity being annotated, either “Supports”, or “Does Not Support” |
No |
CIViC Entity Disease |
The cancer or cancer subtype context for the entity being annotated |
No |
CIViC Entity Therapies |
A list of therapies applicable to the entity being annotated |
Yes |
CIViC Entity Therapy Interaction Type |
A term describing now more than one therapy interact with each other in the context of the entity being annotated, either “Combination”, “Sequential”, or “Substitutes” |
No |
CIViC Evidence Phenotypes |
A list of HPO phenotype terms linked to entity being annotated |
Yes |
CIViC Evidence Level |
For evidence entities, a level describing the robustness of the of the study supporting the evidence |
No |
CIViC Evidence Rating |
For evidence entities, a 1-5 rating indicating the curator’s confidence in the quality of the summarized evidence as a number of stars |
No |
CIViC Assertion ACMG Codes |
For assertion entities, a list of ACMG codes used in the assessment of the variant under the ACMG/AMP classification guidelines |
Yes |
CIViC Assertion AMP Category |
For assertion entities, a clinical classification by AMP/ASCO/CAP guidelines |
No |
CIViC Assertion NCCN Guideline |
For assertoin entities, a string of the NCCN guideline and version |
No |
CIVIC Assertion Regulatory Approval |
For assertion entities, a boolean indicating whether or not the therapies in this assertion have regulatory approval for the treatment of the assertion disease |
No |
CIVIC Assertion FDA Companion Test |
For assertion entities, a boolean indication whether or not theassertion has an associated FDA companion test |
No |
CivicVcfRecord¶
CivicVcfWriter¶
- class civicpy.exports.civic_vcf_writer.CivicVcfWriter(filepath, vcf_records, version=4.2)[source]¶
- Parameters:
filepath (path) – A file path for the VCF output file
vcf_records ([civic.exports.CivicVcfRecord]) – A list of
civic.exports.CivicVcfRecordobjects to write to the output fileversion (string) – VCF version
Example¶
Here’s an example of how to export all variants from CIViC to VCF:
from civicpy import civic
from civicpy.exports.civic_vcf_writer import CivicVcfWriter
from civicpy.exports.civic_vcf_record import CivicVcfRecord
records = []
for variant in civic.get_all_variants():
if variant.is_valid_for_vcf():
records.append(CivicVcfRecord(variant))
CivicVcfWriter("civic_variants.vcf", records)
GKS JSON¶
GKS JSON files are written using the civicpy.exports.civic_gks_writer.CivicGksWriter
class to which you add civicpy.exports.civic_gks_record.CivicGksPredictiveAssertion,
civicpy.exports.civic_gks_record.CivicGksDiagnosticAssertion, or
civicpy.exports.civic_gks_record.CivicGksPrognosticAssertion during initialization.
In order to verify whether an assertion can be converted to a CivicGksAssertionRecord
object, the convenience method is_valid_for_gks_json can be called on a
civic.Assertion object.
_CivicGksAssertionRecord¶
- class civicpy.exports.civic_gks_record._CivicGksAssertionRecord(assertion: Assertion, approval: Approval | None = None)[source]¶
Bases:
_CivicGksEvidenceAssertionMixin,ABCAbstract class for CIViC assertion record represented as GKS
- Parameters:
assertion – CIViC assertion record
- Raises:
CivicGksRecordError – If CIViC assertion is not able to be represented as GKS object
- get_classification_and_strength(amp_level: str) tuple[ga4gh.core.models.MappableConcept | None, ga4gh.core.models.MappableConcept | None][source]¶
Get classification and strength
- Parameters:
amp_level – AMP/ASCO/CAP level
- Returns:
Classification and strength, if found
- static get_contributions(approval: Approval) list[ga4gh.va_spec.base.core.Contribution][source]¶
Get contributions for an approval
- Parameters:
approval – Approval for assertion
- Returns:
List of contributions containing when the approval was last reviewed
- get_evidence_lines(assertion: Assertion, strength: MappableConcept) list[ga4gh.va_spec.base.core.EvidenceLine][source]¶
Get evidence lines for a CIViC assertion
Only the CIViC evidence items that are supported for GKS will be included
- Parameters:
assertion – CIViC assertion
strength – The CIViC Assertion’s strength
- Returns:
List of CIViC evidence lines
CivicGksPredictiveAssertion¶
- class civicpy.exports.civic_gks_record.CivicGksPredictiveAssertion(assertion: Assertion, approval: Approval | None = None)[source]¶
Bases:
_CivicGksAssertionRecord,VariantTherapeuticResponseStudyStatementClass for representing CIViC predictive assertion as GKS VariantTherapeuticResponseStudyStatement
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
CivicGksDiagnosticAssertion¶
- class civicpy.exports.civic_gks_record.CivicGksDiagnosticAssertion(assertion: Assertion, approval: Approval | None = None)[source]¶
Bases:
_CivicGksAssertionRecord,VariantDiagnosticStudyStatementClass for representing CIViC diagnostic assertion as GKS VariantDiagnosticStudyStatement
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
CivicGksPrognosticAssertion¶
- class civicpy.exports.civic_gks_record.CivicGksPrognosticAssertion(assertion: Assertion, approval: Approval | None = None)[source]¶
Bases:
_CivicGksAssertionRecord,VariantPrognosticStudyStatementClass for representing CIViC prognostic assertion as GKS VariantPrognosticStudyStatement
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
CivicGksWriter¶
- class civicpy.exports.civic_gks_writer.CivicGksWriter(filepath: Path, gks_records: list[civicpy.exports.civic_gks_record.CivicGksPredictiveAssertion | civicpy.exports.civic_gks_record.CivicGksPrognosticAssertion | civicpy.exports.civic_gks_record.CivicGksDiagnosticAssertion], errors: list[civicpy.exports.civic_gks_writer.GksAssertionError] | None = None)[source]¶
Class for writing CIViC GKS assertions to JSON file
- Parameters:
filepath – The output file path to write the JSON file to
gks_records – List CIViC assertions represented as GKS objects
- Raises:
ValueError – If
filepathdoes not have .json suffix
Examples¶
Here’s an example of how to export all assertions to GKS JSON:
from civicpy import civic
from civicpy.exports.civic_gks_writer import CivicGksWriter
from civicpy.exports.civic_vcf_record import CivicVcfRecord
records = []
for assertion in civic.get_all_assertions():
if assertion.is_valid_for_gks_json():
try:
if assertion.assertion_type == "DIAGNOSTIC":
gks_record = CivicGksDiagnosticAssertion(assertion)
elif assertion.assertion_type == "PREDICTIVE":
gks_record = CivicGksPredictiveAssertion(assertion)
else:
gks_record = CivicGksPrognosticAssertion(assertion)
except CivicGksRecordError:
continue
records.append(gks_record)
CivicGksWriter("gks.json", records)
Here’s an example of how to export all assertions approved by a specific organization that are ready for submission to ClinVar.:
from civicpy import civic
from civicpy.exports.civic_gks_writer import CivicGksWriter
from civicpy.exports.civic_vcf_record import CivicVcfRecord
records = []
organization_id = 1
for assertion in civic.get_all_assertions_ready_for_clinvar_submission_for_org(organization_id):
if assertion.is_valid_for_gks_json():
try:
if assertion.assertion_type == "DIAGNOSTIC":
gks_record = CivicGksDiagnosticAssertion(assertion)
elif assertion.assertion_type == "PREDICTIVE":
gks_record = CivicGksPredictiveAssertion(assertion)
else:
gks_record = CivicGksPrognosticAssertion(assertion)
except CivicGksRecordError:
continue
records.append(gks_record)
CivicGksWriter("gks.json", records)