
document.docx

Background and reader guidance

1. Live demo of an almost up-to-date version of this content, with HL7 sample message generator

At https://lforms-demo.nlm.nih.gov/ you can see and try the latest version of the executable form for collecting almost all of the variables in this document. It can also generate example HL7 v2 messages, but that part is a few steps behind our specifications, so the OBX-4 values and the specification of the NR data type may not be quite up to date. The two variants of the genomics form are the first two entries on the left side of the web page. One shows all of the variables in one vertical list; the other (easier to read) shows them in many horizontal rows.

2. One constraint and one question

a) At present this specification addresses human variations only. We want to be sure that is understood. (The H in HGNC stands for Human.) It is not possible to deal with changes for a broader set of species now, but it would be relatively easy to extend to veterinary medicine species, and we could explore broader generalizations in the future.

b) We need to clarify whether everyone would prefer the use of the OBX-4 Dewey decimal hierarchy rather than the OBR-OBX nesting used in release 2 of the current implementation guide. We have had the definite impression since January that everyone preferred the OBX-4 approach, but we have not polled the group to be sure. There are also advantages to the OBR-OBX approach. The current examples all assume the OBX-4 dot notation, but if everyone preferred the other, we could change. We would NOT want to support both.

3. Guidance for reading this document

The table includes the LOINC terms, their optionality and cardinality, with sample values for OBX-3, OBX-4 and OBX-5. The whole table is organized as an example message, with repeating rows and valid data within each row as might appear in a real message. (We will produce full sample HL7 messages for the ballot.)

We have highlighted terms that are new since the last presentation by marking the LOINC code for the term in yellow. We accepted suggestions for including a number of additional attributes because they fit the model and would not have to be used by those who did not want them.


Almost all of the variables now have LOINC codes. We have added Word marginal comments to the terms that have been present in most of the past versions but have new LOINC codes.

We have added explanations and guidance to the comments (the last column), so although the terms may always have been there, you might want to review the comments.

We have organized the table into 5 sections, labeled as such. Different kinds of reports will use 2-3 of these sections in their reporting. Almost all will take terms from section 1.

1) Items that apply to the whole report. OBX-4 should be populated with 1, or with 1.1 (1.2, 1.3, etc.) for the terms that may repeat in the overall report section.

2) Items that apply to a simple variant. Simple variants may repeat, but with no overarching content beyond the overall report variables. OBX-4s in this section should begin with 2.i, where i increments for each repeat of a particular term.

3) Structural variants. The OBX-4s for these terms will contain a 3 or a 3.i, depending on whether the term repeats.

4) Pharmacogenomic summary statements. The OBX-4s will all include 4, 4.i or 4.i.j (where i and j are integers). See the table for an example.

5) Complex variants come at the end, because the example is so long and most of it is just repeats of the simple variant... OBX-4

Be aware that terms for the simple variants are present three different times: once under section 2 for simple variants and twice under section 5 for complex variants. Once you have reviewed the simple variant terms in one place, don’t bother reviewing the other instances except to understand the example.

Of special note, we added back 69548-6, genetic assessment of variant. Most genetic reporting of negatives is by default: only the positives are reported explicitly. LOINC 69548-6 is the term that enables interpretations on all variants, whether normal or not, and includes in its answer list the “no call” option. We also included a term for phase. It is intended only to assign phase within a report and is simpler than Bob M’s proposal, but we believe it could co-exist comfortably with his proposal.
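As an illustration of the OBX-4 dot notation described in the five sections above, the following minimal Python sketch splits an OBX-4 value into its report section and its repeat indices. The function name `parse_obx4` and the section labels are illustrative assumptions, not part of the specification.

```python
from typing import Tuple

# Section numbers follow the five report sections listed in the guidance.
# The label strings themselves are illustrative, not normative.
SECTIONS = {
    1: "overall report",
    2: "simple variant",
    3: "structural variant",
    4: "pharmacogenomic summary",
    5: "complex variant",
}


def parse_obx4(sub_id: str) -> Tuple[str, Tuple[int, ...]]:
    """Split an OBX-4 value such as '4.1.1' into (section label, repeat indices)."""
    parts = [int(p) for p in sub_id.strip().split(".")]
    section = SECTIONS[parts[0]]  # KeyError signals an unknown section number
    return section, tuple(parts[1:])


print(parse_obx4("2.3"))    # third repeat of a simple variant
print(parse_obx4("4.1.1"))  # first medication under the first gene
```

A receiver could use the first number to route an OBX to the right report section and the remaining numbers to group OBXs that belong to the same repeat.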


We added a whole section with nesting for reporting pharmacogenomic findings (apologies to Bob F). I had wanted to avoid that complexity but was pushed very hard by committee members and others who said it was essential. The new material deals mostly with reporting the “genotype”, which mostly comes out as star alleles, related to one or more genes. (We can use code lists in one OBX-5 to indicate that the finding is due to the genotypes of two or more genes.)

We did have a good discussion about the direction with Kevin Power and Don Rule to get some guidance. We meant to add an overall term to the pharmacogenomics panel for storing a PDF (or other kind of document) that presents the whole set of summary statements about lots of drugs. Most of the labs in this area produce a big (5-10 page) document that says all kinds of things about the influence of the genetic variants on all kinds of drugs, by class and sometimes by specialty. We don’t think there is any reasonable hope of coding this, but we should probably have a way to deliver it as a document, as they do now. We will try to make a new LOINC code for that soon, unless there are big objections.

It then allows the specific interpretations of one or multiple drugs. We would have liked to use the existing LOINC terms for CIPIC, but our whole model avoids the use of gene names in the variables. There is a way, by adding another term to the pharmacology results, to link to very specific results in any of the sections. OBX-21 is designated as a unique OBX identifier, and we can find a way to link the pharmacogenomic genotypes to detailed genetic results. I just did not have time to do it...

The first panel in the spreadsheet, called the configuration panel, is only there to allow us to configure the executable form in different ways. These choices will help us insert the right codes and coding systems into sample HL7 messages and configure the panel appropriately for different classes of genetic tests. Implementers of these messages can decide what coding systems to use (when there are choices) and insert them directly into their message. The comments for these terms explain how to deliver an arbitrary reference sequence that is not from one of the designated coding systems: you send the OID for that source in CWE.14. Further, be aware that when you don’t have a code and a coding system, with CWEs you can send text instead (we may not want to change CWE to CNE for some variables to avoid that). That text goes into CWE.9 (original text), nowhere else.
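To make the CWE component positions concrete, here is a minimal Python sketch that assembles a CWE value with the code, text and coding system in CWE.1 to CWE.3, free text in CWE.9, and an OID in CWE.14. The helper name `build_cwe`, the sample identifier and the sample OID are hypothetical; HL7 escaping of delimiter characters inside the text is deliberately not handled.

```python
def build_cwe(code: str = "", text: str = "", system: str = "",
              original_text: str = "", system_oid: str = "") -> str:
    """Assemble a CWE string; component positions are 1-based in HL7."""
    components = [""] * 14
    components[0] = code            # CWE.1 identifier
    components[1] = text            # CWE.2 text
    components[2] = system          # CWE.3 name of coding system
    components[8] = original_text   # CWE.9 original text
    components[13] = system_oid     # CWE.14 coding system OID
    # Drop trailing empty components, as HL7 v2 encoders usually do.
    while components and components[-1] == "":
        components.pop()
    return "^".join(components)


# Coded value from one of the designated coding systems.
print(build_cwe("21497", "ACAD9", "HGNC-symb"))
# Text only: nothing but CWE.9 is populated.
print(build_cwe(original_text="Worried about family planning"))
# Code from another source: identify the source by an OID (made up here).
print(build_cwe("X1", "Some reference sequence", system_oid="1.2.3.4"))
```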


Report Section 1 for Variables that apply to the Overall Study.

Data Type

LOINC # NoteLinks

Observation display name (draft version)

Card OBX-4

OBX-5 example data

Comments

81247-9 Master HL7 genetic variant reporting panel

1..1 This term provides a handle within the LOINC database on all of the terms and panels that could be used in a V2 Genomics Lite message. It would not be part of the message if the committee decides to use OBX-4 to organize the hierarchy in the messages.

Note that all of the genomic data reported in this panel assume that nucleotide positions start at 1 and focus on the positive strand. This is the assumption embedded in HGVS, the public distributions of NCBI, Ensembl, COSMIC, and most other genomic databases.

81294-1 Genetic form configuration panel

0..* Users of the NLM forms widget can configure the behavior of the widget by answering each of the questions in this panel. Message creators could make these same choices directly in the construction of their message. These are not LOINC codes that would be included in the HL7 message.

NEW Choose kind of mutations targeted

1..* N/A Choices: 1) Simple small variants; 2) Complex small variants; 3) Structural variants; 4) Pharmacogenomics

NEW Choose Region of interest specification

0..1 N/A Choices: 1) Specific targeted mutations; 2) Range targeted in the reference sequence

CWE 81248-7 Default Transcript reference sequence coding system

0..1 1 NCBI-NM; Answer list will include: 1) Ensembl transcript identifiers with prefix “ENST”, 2) NCBI’s RefSeq transcript identifiers with prefix “NM_”, 3) Other T RefSeq coding system

If the user picks this “other” choice, he/she will have to answer an additional question to identify the OID for this other transcript reference sequence coding system, and that OID should be recorded in CWE.14 (or CNE.14) for the term in question. An OID is only needed when a v2.x linkage or symbolic name has not been registered in the HL7 OID registry. The V2 Genomics Lite guide will include symbolic names for all of the coding systems it names (to come).

An entry for this question is needed only to control the NLM form behavior. Message implementers would assert their coding system by inserting their chosen coding system directly into the appropriate part of the CWE coding system specification (usually CWE.3) within the message

B 1) 81249-5 O Default Genomic reference sequence coding system

0..1 1 NCBI-NG-NC Answer list will include: 1) Ensembl genomic identifiers with prefix “ENSG”, 2) NCBI-NG-NC, which includes RefSeq identifiers with prefix “NG_” or “NC_”, 3) Other G RefSeq coding system

If the user picks “Other G RefSeq coding system”, he/she will have to answer an additional question to enter the OID for this other genomic reference sequence coding system, and that OID should be recorded in CWE.14 (or CNE.14) for the term in question. An OID is only needed when a v2.x linkage or symbolic name has not been registered in the HL7 OID registry. The V2 Genomics Lite guide will include symbolic names for all of the coding systems it names (to come).

An entry for this question is needed only to control the NLM form behavior. Message implementers would assert their coding system by inserting their chosen coding system directly into the appropriate part of the CWE coding system specification (usually CWE.3) within the message. OID goes

82122-3 O Genetic variant coding system

0..1 1 Identifies a public source that carries the code name and other attributes for the genetic variant.


Answer list includes 1) NCBI 2) COSMIC

An entry for this question is needed only to control the NLM form behavior. Message implementers would assert their coding system by inserting their chosen coding system directly into the appropriate part of the CWE coding system specification (usually CWE.3). Answer list choices for now will be ClinVar Variant ID and COSMIC. (Open question: we already include other coding systems beyond these two in the guide. Should we provide an “other” choice, with the option to specify a variant coding system besides those already included, and require an OID? Or just explain that one can do so when building messages?)

D ST 81295-8 C OID For other T RefSeq coding system

0..1 1 #### #### ##### If the answer to LOINC 81248-7 (Default transcript reference sequence coding system) is “Other T RefSeq coding system”, then ask for the OID for this other coding system.

E ST 81296-6 C OID For other G RefSeq coding system

0..1 1 ##### #### ##### If the answer to question 81249-5 is “Other G RefSeq coding system”, ask for the OID for this other coding system.

81306-3 Variables that apply to the overall study

A TX 53577-3 O Reason for study 0..* 1 “Worried about family planning”

This is an ask-at-order-entry question. HL7 provides OBR-31 for recording the reason for the study. The LOINC code is included in this panel for convenience of form definition, because it is often captured in a form with this variable. But it should be delivered in HL7 OBR-31.


B CWE 51967-8 C Genetic disease(s) assessed 0..1 1 2971795010^Deficiency of isobutyryl-coenzyme A dehydrogenase (disorder)^SCT

Applies only to genetic studies for specific disease

Identifies the disease (usually genetic) being assessed. If entered as a code, it may use a variety of coding systems: SNOMED CT, ICD-9-CM, ICD-10, or NCBI’s genetic diseases list (more than 20,000 genetic diseases, whose codes link to SNOMED codes when available).

We encourage the use of SNOMED CT in this field. It will be up to the message generator to specify the coding system within the message. The guide will supply coding system names for the code systems listed above. Other coding systems will require the use of CWE.14 to record the OID.

For the example values shown in the OBX-5 column we show SNOMED CT codes (with SCT as the coding system), because that is the direction we will go in the US. However, the NLM forms demo of this draft specification shows the content from NCBI MedGen, because it is focused on genetic diseases and we and the demo can show it publicly worldwide.

C CWE 51963-7 C Medications assessed [Nom]

0..* 1 50005^Fluoxetine^RxN-ingrd~84701^Atorvastatin^RxN-ingrd~45000^Naproxen^RxN-ingrd~11289^Coumadin^RxN-ingrd

Applies only to pharmacogenomic studies. Carries the medications for which there is concern that genetic variation might influence the efficacy and/or the rate of metabolism. (These would typically be medications being considered for use, or being used by the patient.)

This content can indicate what was specified by the requester or, if the user does not specify, what the laboratory targeted.

If this variable is empty or not used, the lab will provide the usual information they provide about commonly affected drugs. This content will usually be an ask-at-order-entry question. Multiple medication identifiers can be entered in one OBX-5, separated by repeat delimiters. Alternatively, each can be sent in a separate OBX, but in that case OBX-4 must be different for each OBX. We recommend 1.1, 1.2, 1.3, etc.
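The two ways of sending several values for one term (one OBX-5 with repeat delimiters, or one OBX per value with distinct OBX-4s) can be sketched as follows. This is a minimal Python illustration that only emits the first five OBX fields; the function names are hypothetical, and the medication codes are copied from the example row without verification against RxNorm.

```python
from typing import List

FIELD = "|"   # HL7 v2 default field separator
REPEAT = "~"  # HL7 v2 default repetition separator


def obx(set_id: int, value_type: str, obs_id: str, sub_id: str, value: str) -> str:
    """Build a truncated OBX segment: OBX-1 through OBX-5 only."""
    return FIELD.join(["OBX", str(set_id), value_type, obs_id, sub_id, value])


def as_single_obx(obs_id: str, values: List[str], sub_id: str = "1") -> str:
    """All values in one OBX-5, separated by the repeat delimiter."""
    return obx(1, "CWE", obs_id, sub_id, REPEAT.join(values))


def as_separate_obx(obs_id: str, values: List[str], section: str = "1") -> List[str]:
    """One OBX per value, with OBX-4 set to 1.1, 1.2, 1.3, ..."""
    return [obx(i, "CWE", obs_id, f"{section}.{i}", v)
            for i, v in enumerate(values, start=1)]


MEDS = ["50005^Fluoxetine^RxN-ingrd", "84701^Atorvastatin^RxN-ingrd"]
TERM = "51963-7^Medications assessed^LN"

print(as_single_obx(TERM, MEDS))
for segment in as_separate_obx(TERM, MEDS):
    print(segment)
```

The same pattern applies to the other repeating terms in section 1, such as the mutations tested for and the genes assessed.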

D CWE 36908-2 C Gene mutations tested for[Nom]

0..* 1 Required if the study is a targeted mutation analysis (e.g. either looking for known family mutations, or looking at a set of specific known mutations offered by the laboratory).

Multiple mutations can be entered in one OBX-5 if separated by repeat delimiters, and we would encourage that approach because it will incorporate compactly into most EMR systems. Alternatively, they can each be reported in a separate OBX. In that case OBX-4 should be different for each such OBX. We recommend 1.1, 1.2, 1.3, etc.

E CNE 48018-6 C Gene(s) assessed [ID] 0..* 1 21497^ACAD9^HGNC-symb

The code is the HGNC code for the gene, the print string (name) is the HGNC symbol. If the study includes more than one gene, they can all be entered in one OBX, separated by the repeat delimiter.

Alternatively, they can be entered into separate OBXs, but the content of OBX-4 will have to be unique for each such repeat. We recommend 1.1, 1.2, 1.3, etc. for this variable.

HGNC focuses on human genes (the H stands for human), so the specification is currently limited to human mutations. (We will address extension to other species in the future.)

This is an important variable for studies of large panels of genes, such as tens or hundreds of cancer-associated genes, but not for studies of single genes or small sets of genes where the target gene(s) are part of the order name.

Summary results

B CNE 51968-6 R Genetic analysis overall interpretation 1..1 1 51968-6^Positive^LN

Provides a coarse overall interpretation of the report. More detailed interpretations are associated with each separate variant reported below.

C FT;ED

51969-4 O Full narrative report (e.g. PDF, Word Document that would look like current reports)

0..1 1 Need example This attribute can carry the full narrative report in two different data types: FT (formatted text) or ED (encapsulated data), which can accommodate Word documents, PDFs and other special media types.

If this content is not reported as the simple formatted text, follow HL7 V2 specifications for recording the media type and other attributes of an HL7 encapsulated data type.

Technical details


D NR 51959-5 C Ranges of DNA sequence examined

0..* 1 2000753^2234579 Preferred if the method is a sequencing study. The first value of the range defines the start location and the second value the end location of the sequence. We recognize that this information may be proprietary and is often not revealed.

The locations are specified in terms of the associated genomic reference sequence. The variable may repeat if the range is discontinuous.

At present in HL7 V2, each repeat of an NR requires a separate OBX, and the OBX-4 values will have to differ among such repeats. We recommend 1.1, 1.2, 1.3, etc.

Note: NR data types are supported by HL7 V2, but not by the LRI lab specification. We have asked that NR be allowed in that spec, and that repeated NR values be allowed in one OBX-5, as is allowed for coded data types.

E TX 81293-3 C Description of ranges of DNA sequences examined

0..1 1 All coding regions and appropriate flanking regions

Genetic test reports only rarely include explicit numeric ranges because they are often proprietary. So reports tend to describe the regions in narrative. E.g. “all coding regions and appropriate flanking regions”. This variable is included to capture such descriptions. It is only relevant to sequencing studies.

F CWE 62374-4 C Human reference sequence assembly [ID]

0..1 1 GRCh37 May or may not be needed, depending on the reference sequences to which the results are anchored. It is needed when the reference sequences are NCBI genomic sequences without the version number, and for Ensembl genomic reference sequences. It is not required for transcript reference sequences.

One slot is provided for the assembly and build in the overall report section. This assumes that laboratories do not reference more than one assembly and build per report.


G ST 81303-0 O HGVS version [ID] 0..1 1 2.120831 HGVS now gives new version numbers, based on the date of the change, when its recommendations change. So if the change date was 2012-08-31, the HGVS version would be specified as 2.120831.
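The date-to-version mapping in this row can be sketched in a few lines of Python. The leading “2.” is taken from the single example value in the row; whether that prefix is fixed is an assumption here, so it is exposed as a parameter. The function name is hypothetical.

```python
from datetime import date


def hgvs_version_from_date(change_date: date, prefix: int = 2) -> str:
    """Format a change date as '<prefix>.<yymmdd>', following the row's example."""
    return f"{prefix}.{change_date:%y%m%d}"


print(hgvs_version_from_date(date(2012, 8, 31)))  # 2.120831
```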

Note: The reporting of the HGVS version is very limited at present

H CWE 82115-7 O dbSNP version [ID] 1 <need example> This variable is only relevant if dbSNP identifiers were included.

Version changes are only made to correct errors. The version does not change the meaning of the dbSNP RS # per se, but may change the value of the location number in relation to the build.


Report Section 2 for Variables that define a Simple Variant (could be more than one simple variant not related to each other).

Data Type

LOINC # RCO

Observation display Name- draft

card OBX-4

OBX-5 Example values Comments

N/A 81250-3 Simple genetic variant panel

0..* Does not carry values Repeats for each simple variant reported.

This panel code does not carry values in its OBX-5. It provides a handle for holding all of the LOINC terms needed to define a simple variation.

It will not be included in the message if we use the OBX-4 content to define the hierarchy rather than nested OBRs and OBXs.

A CWE 81252-9 C Simple genetic variant [ID] 0..1 2.1 30880^NM_014049.4(ACAD9):c.1249C>T (p.Arg417Cys)^ClinVar-V

The code for the CWE is the ID specified for the variant in the source public database; the name is that given by the public database, usually a combination of attributes, e.g. the RefSeq, gene, c.HGVS, etc. If the variant has been registered in a public database, these attributes can be automatically pulled from the public database and loaded into attribute-specific LOINC terms (see those that follow this term in the panel). If it has not been registered, reporters can enter the details for each component in the OBXs specified by the terms that follow. Either the simple variant or at least the three following terms must be included.

Separate observations for each of the components of the simple genetic variant name.

B CWE 48018-6 C Gene studied [ID] 0..1 2.1 21497^ACAD9^HGNC-symb

The code is the HGNC code for the gene; the print string (name) is the HGNC symbol. The gene identifier is also carried in the transcript reference sequence database, so the gene information tends to be redundant but is almost always stated separately.

C CWE 51958-7 C Transcript Reference Sequence ID:

0..1 2.1 NM_014049.4^^NCBI-NM

At least one of the transcript or genomic reference sequence must be included. If the c.HGVS is included, the transcript RefSeq must be included.

D CWE 48004-6 C DNA change c.HGVS 0..1 2.1 c.1249C>T^^c.HGVS HGVS specification of the change at the DNA level relative to the transcript RefSeq

E CWE 48005-3 C Amino acid change p.HGVS: 0..1 2.1 p.Arg417Cys^^p.HGVS HGVS specification of the change at the amino acid (protein) level caused by the DNA change. If the change is in a non-coding region, this variable will not be reported.

F CWE 48019-4 O DNA change [type] 0..1 2.1 LA6690-7^Substitution^LN

Type of DNA variation reported

G CWE 48006-1 O Amino acid change [type] 0..1 2.1 LA6698-0^missense^LN Type of amino acid change reported

Genomic specification

H CWE 48013-7 C Genomic Reference Sequence [ID]:

0..1 2.1 NG_017064.1^^NCBI-NG-NC

Required if 69547-8, 81254-5 or 69551-0 is present.

I ST 69547-8 C Genomic Ref allele: 0..1 2.1 C The DNA string in the reference sequence (ref allele) from which the DNA in the test sample differs, starting at the position given in 81254-5 Genomic Allele location

J NR 81254-5 C Genomic Allele location: 0..1 2.1 31731^31731 The beginning and end of the ref allele that was replaced by the alt allele. The beginning is counted as the first position in the genomic reference sequence where anything changed in the sample DNA being tested, and the end is the comparable last position.

K ST 69551-0 C Genomic Alt allele: 0..1 2.1 T The DNA sequence in the test sample (alt allele) that is different from the DNA in the reference sequence (ref allele). Note that the content of 69547-8, 81254-5 and 69551-0 could also be described in a g.HGVS expression as: g.31731C>T
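The relationship between the three genomic terms and the equivalent g.HGVS expression can be sketched as follows. This minimal Python illustration covers only the single-nucleotide substitution case shown in the example rows; insertions, deletions and longer alleles are out of scope, and the function name is hypothetical.

```python
def g_hgvs_substitution(ref_allele: str, location_nr: str, alt_allele: str) -> str:
    """Combine 69547-8 (ref), 81254-5 (NR location) and 69551-0 (alt)
    into a g.HGVS substitution such as 'g.31731C>T'."""
    start_text, end_text = location_nr.split("^")  # NR is low^high
    start, end = int(start_text), int(end_text)
    if start != end or len(ref_allele) != 1 or len(alt_allele) != 1:
        raise ValueError("only single-nucleotide substitutions are handled")
    return f"g.{start}{ref_allele}>{alt_allele}"


print(g_hgvs_substitution("C", "31731^31731", "T"))  # g.31731C>T
```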

Other optional codes related to a simple genetic variant

L ID 81255-2 O dbSNP ID: 0..1 2.1 rs368949613^^dbSNP The “SNP” in NCBI’s dbSNP database name originally meant “Single Nucleotide Variants”, but is now defined to mean “Simple Nucleotide Variants”. Each dbSNP entry is given an ID with a prefix of “rs”. The dbSNP database defines the location and size of the variant, but does not distinguish among different patterns of the same size at the same location. So there would be one rs code for AAA and ACA at the same location, but AAAA at that location would get a different dbSNP code. dbSNP has versions, but they don’t change the meaning of a dbSNP rs number. Some rs numbers may have different locations with respect to the build depending on the version, but only when a change was made to correct an error. The actual meaning of the rs code does not change.


M CWE 81256-0 C COSMIC-simple genetic variant

0..1 2.1 Example soon to come COSMIC (Catalogue of Somatic Mutations in Cancer) is the preferred code for cancer mutations. The COSMIC simple mutations database carries records that include a mutation ID and a specimen ID, as well as the mutation (HGVS), the organ, the histology and other specifics for each submission. The NLM forms lookup includes only one record per unique mutation ID.

Their simple variant file has fields that correspond to many of the fields in ClinVar, except that COSMIC uses Ensembl RefSeqs and the single-letter code for p.HGVS.

COSMIC stores simple mutations and structural mutations in separate tables, and this specification provides separate coding systems for each (see structural variants in report section 3).

N CWE Allele

81257-8 O CIGAR [Nom] 0..1 2.1 Example Used primarily for alignment in earlier stages of genetic study analysis. We have not seen usage in routine clinical reports.

Other possible attributes 1.1

O NM 81258-6 P Allelic Frequency [NFR] 0..1 2.1 0.47 This variable reports the fraction of all of the reads at this genomic location that were represented by the reported allele. For homozygotes it will be close to one. For heterozygotes it will be close to 0.5. It can be a smaller number with mosaicism, multiple chromosome copies, or mixtures of tumor cells and normal cells.
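As a worked illustration of the allelic frequency definition, the sketch below divides the reads supporting the reported allele by all reads at the location. The read counts used (98 of 208) are made-up numbers chosen to reproduce the example value 0.47; the function name is hypothetical.

```python
def allelic_frequency(allele_reads: int, total_reads: int, ndigits: int = 2) -> float:
    """Fraction of reads at a location that carry the reported allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= allele_reads <= total_reads:
        raise ValueError("allele_reads must lie between 0 and total_reads")
    return round(allele_reads / total_reads, ndigits)


print(allelic_frequency(98, 208))   # heterozygote-like value, 0.47
print(allelic_frequency(208, 208))  # homozygote-like value, 1.0
```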

P CWE 48001-2 O Chromosome location of variant

0..1 2.1 3q21 These locations can be recorded as text or taken from a list of cytogenetic chromosome locations, as explained by NLM’s Genetics Home Reference (https://ghr.nlm.nih.gov/primer/howgeneswork/genelocation). The set we provide access to in the NLM forms for this proposal includes all of those reported as locations in NCBI’s ClinVar. It is not exhaustive.

Q CNE 69548-6 O Genetic variant assessment in Blood or Tissue by Molecular genetics method [Imp]

0..1 2.1 LA9633-4^present^LN Genetic reporting is usually by exception. So for targeted mutation analyses, the lab lists the mutations it looked for and reports the ones it found. For sequencing studies, similarly, it reports the genetic range studied and the variants found.

This variable permits a different style of reporting, in which a set of examined variations or loci can be described individually as present, absent, or no call.

Q CNE 48002-0 O Genomic source class [Type]

2.1 LA6683-2^Germline^LN We associate this variable with the variant so that the kind of variant can be distinguished in the report when somatic and germline variants are observed in one study.

Allelic state and interpretive attributes

R CNE 53034-5 C Allelic state: 0..1 2.1 LA6706-1^Heterozygous^LN This variable describes the relationship between the alleles found at the same locus on different chromosomes. It is not always discerned by the study.

S CNE 53037-8 O Genetic variation clinical significance [Imp]:

0..1 2.1 LA6668-3^Pathogenic^LN

See answer list

T CWE 81259-4 O Probable associated phenotype [Nom]:

0..1 2.1 Acyl-CoA dehydrogenase family, member 9, deficiency of

The disorder with which this variant is associated. The same coding systems are allowed as for disease assessed, but this term may more often be recorded as narrative text to allow more qualifiers.

Other candidate variables

U CWE 82120-7 O Allelic Phase 2.1 1 Defines which variations are on the same or different chromosomes. Can accommodate trisomies and distinguish whether the chromosome is maternal or paternal when such details can be inferred (e.g. when the parents’ genotype is also available). (This is a trial set of answers and will likely expand or change.) Answers: 1, 2, 3, 4, Maternal, Paternal

V NM 82121-5 O Allelic read depth 2.1 208 A whole number, usually less than 300 and more than 20-25. Different methods and purposes require different numbers of reads to be acceptable.

If a second and a third simple variant were reported, without special meaning together, they would show up as additional simple variants with OBX-4 valued 2.2 for the 2nd set and 2.3 for the 3rd set of variables.


Report Section 3 – Structural Variations

Data Type

LOINC # RCO

card OBX-4

N/A 81297-4 Genomic structural variant

0..* Repeats for each structural variant. This is the first.

Does not carry values. Defines the structure, but would not be carried into the message if we organize via OBX-4 content rather than by nested OBXs.

Note some structural variants can be reported via strict ClinGen IDs and/or COSMIC IDs in the above sections

A CWE 81286-7 P Genomic structural variant 0..1 3.1 nsv995237^17p12(chr17:14184616-15581544)x1

CWE 82154-6 Genomic structural variant name 3.1 GRCh38/hg38 17p12 (chr17:14184616-15581544) x1

Need to research this further. We have found multiple ways to name this variant:

The one in row A, the one in row B and the following, which may be the best:

NC_000017.11:g.(?_14184616)_(15581544_?)del (GRCh38)

NM 82155-3 Genomic structural variant copy number

3.1 <need example> The number of repeats of the large variant.

B NR 81287-5 P Genomic structural variant start-end:

0..1 3.1 14184616^15581544 The reported start and end of the structural variant, when distinctions between outer bound and inner bound are not made (see Row G and Row H).


C NM 81299-0 C Genomic structural variant reported arrCGH ratio:

0..1 3.1 Usually only applicable to ArrCGH and related studies.

Its value is a number (less than 1.

D CWE 81289-1 P DNA structural variation type: 0..1 3.1 LA6686-5^Deletion^LN Base answer list is taken from NCBI: http://www.ncbi.nlm.nih.gov/dbvar/content/overview/#ref22

Structural variants require a different list of types from ordinary simple variants, so we created a new term with an appropriate answer list.

E CWE 82119-9 C COSMIC structural variant 0..1 3.1 <<need example>> COSMIC has separate tables for simple variants and structural variants, and they have different kinds of content. This variable uses the COSMIC mutation ID from the structural variant table as its code and will have a name analogous to the NCBI dbVar name.

Typically somatic (cancer) structural variants will use this variable and germline variants will use dbVar IDs.

The NLM Genetics report form will soon have a lookup for COSMIC structural variants.

F NM 81300-6 O Structural variant length: 0..1 3.1 1,396,929 Don’t often see this data in routine clinical reports.

G NR 81301-4 O Structural variant outer start and end: 0..1 3.1 13200589^15592000 Don’t often see this content in routine clinical reports.

H NR 81302-2 O Structural variant inner start and end: 0..1 3.1 14184616^15581544 Don’t often see this content in routine clinical reports.

I CWE 81290-9 C Structural variant HGVS: 0..1 3.1 NC_000017.11:g.(?_14184616)_(15581544_?)dup At least one of ISCN and HGVS representation should be included.

J CWE 81291-7 C Structural variant ISCN: 0..1 3.1 <<need example>> Include if available.

K CWE 81298-2 R Structural variant cytogenetic location: 0..1 3.1 17p12 Include if available.
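The example length in row F (1,396,929) is consistent with the inclusive span of the start-end value in row B (14184616^15581544), given that positions are 1-based. A minimal Python sketch of that arithmetic follows; the function names are hypothetical.

```python
from typing import Tuple


def parse_nr(value: str) -> Tuple[int, int]:
    """Parse an HL7 NR value 'low^high' into a pair of integers."""
    low, high = value.split("^")
    return int(low), int(high)


def inclusive_length(nr_value: str) -> int:
    """Number of positions from start to end, counting both endpoints."""
    start, end = parse_nr(nr_value)
    if end < start:
        raise ValueError("end location precedes start location")
    return end - start + 1


print(inclusive_length("14184616^15581544"))  # 1396929
```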


L CWE 81304-8 P Structural variant method type 0..1 3.1 Sequencing Answer list under development. NEED input

Information about the general class of methods is of special relevance to structural variation because it suggests the kind of precision available for their locations.

1) aCGH
2) Oligo aCGH
3) BAC aCGH
4) SNP genotyping data
5) PEM (paired end mapping)
6) EM with next-generation sequencing
7) FISH
8) MLFP
9) CNV-seq

2015 review of the newest methods: PMCID PMC4479793


Report Section 4: reporting of pharmacogenomics studies. The detailed allelic content could be reported in Section 5 and linked to the results.

Data Type

LOINC # RCO

Card. OBX-4 OBX-5 Example values Comments

82118-1

Pharmacogenomics results panel

0..* Will repeat for each gene tested

Results for first gene in the study

A CNE 48018-6 Gene studied 1..3 I 1559^CYP2C9^HGNC-Symbol~23661^VKORC1^HGNC-Symbol

In some cases, such as in the example of CYP2C9 and VKORC1, variations in more than one gene can determine the effect on drug metabolism or efficacy, in which case the genes with the combined effect can be listed in one OBX-5, separated by the repeat delimiter.

B ST 47998-0 Genotype display name

Display name ( general)

1..3 4.1 *2/*5~*A/*A In this context the corresponding alleles for each of the genes listed under gene(s) studied are shown separated by a slash, e.g. *1/*2, as is the common usage.

If the metabolism/efficacy effect is based on 2 genes, the results for each gene are shown separated by the repeat delimiter, in the same order as the gene symbols are displayed in YYYYY

C CWE 53040-2 C Genetic variation’s effect on drug metabolism [Imp] interp 0..1 4.1 LA9657-3^Rapid metabolizer^LN For pharmacogenomics studies at least one of 53040-2 (effect on drug metabolism) or 51961-1 (effect on drug efficacy) must be included in the panel.

D CWE 51961-1 C Genetic variation’s effect on drug efficacy interp 0..1 4.1 <not in this example> For pharmacogenomics studies at least one of 53040-2 (effect on drug metabolism) or 51961-1 (effect on drug efficacy) must be included in the panel.

E - 82117-3 O Medication usage implications panel 0..* 4.1.1 This term identifies the set of LOINC terms needed, but would not be included in the message assuming we use the OBX-4 construct to organize the hierarchy.

1st medication assessed under the first gene or pair of genes studied. Provides a way to present the guidance about specific drugs or drug classes. May repeat for as many drugs as are relevant to the tested gene. The alternative is the narrative or PDF of current guidance now being delivered. (See <<term 51969-4>>, provided for that purpose.)

F CWE 51963-7 R Medication assessed 0..* 4.1.1 7258^Naproxen^RxN-ingred Required when the medication usage panel is implemented. The coding system could be an RxNorm ingredient subset, RxN-ingrd.

G CWE 82116-5 C Medication usage suggestion [type] 1..1 4.1.1 ^Increase the dosage^LN There is little consistency in the answer lists used by different laboratories. Until there is a consensus, this will have to be locally decided. We have suggested a draft starter set. At least one of the medication usage type or narrative should be included when the panel is implemented.

H TX 82116-5 C Medication usage suggestion [narrative] 0..1 4.1.1 May need higher dosage than usual. Used to deliver whatever specific content laboratories want to deliver. At least one of the medication usage type or narrative should be included when the panel is implemented.

I - 82117-3 O Medication usage implications panel 0..* 4.1.2 This term carries the LOINC terms needed, but would not be included in the message assuming we use the OBX-4 construct to organize the hierarchy.

2nd medication assessed under first gene studied

J CWE 51963-7 R Medication assessed 0..* 4.1.2 611247^Fluoxetine olanzapine^RxN-ingd

K CWE 82116-5 C Medication usage suggestion [type] 1..1 4.1.2 ^Increase the dosage^LN

L TX 82116-5 C Medication usage suggestion [narrative] 0..1 4.1.2 May need higher dosage than usual.

Results for second gene in the study

A CNE 48018-6 Gene studied 1..3 4.2 1557^CYP2C19^HGNC-Symb

ST 47998-0 Genotype display name 1..3 4.2 *1/*1

CWE 53040-2 C Genetic variation’s effect on drug metabolism interp 0..1 4.2 LA25391-6^Normal metabolizer^LN For pharmacogenomics studies at least one of 53040-2 (effect on drug metabolism) or 51961-1 (effect on drug efficacy) must be included in the panel.

CWE 51961-1 C Genetic variation’s effect on drug efficacy interp 0..1 4.2 Not applicable -- term would not be included in this report. For pharmacogenomics studies at least one of 53040-2 (effect on drug metabolism) or 51961-1 (effect on drug efficacy) must be included in the panel.

N/A 82117-3 O Medication usage implications panel 0..* 4.2.1 This term carries the LOINC terms needed, but would not be included in the message assuming we use the OBX-4 construct to organize the hierarchy.

A way to present the guidance about specific drugs or drug classes. May repeat for as many drugs as are relevant to the tested gene. The alternative is the narrative or PDF of current guidance now being delivered. (See <<new term 51969-4>>, provided for that purpose.)

CWE 51963-7 R Medication assessed 1..1 4.2.1 6754^Meperidine^RxN-ingd Required when the medication usage panel is implemented. The coding system could be an RxNorm ingredient subset, RxN-ingrd. If there is really a modest-sized fixed list, we would entertain that.

CWE 82116-5 C Medication usage suggestion [type] 0..1 4.2.1 ^usual dosage^LN There is little consistency in the answer lists used by different laboratories. Until there is a consensus, this will have to be locally decided. We have suggested a draft starter set. At least one of the medication usage type or narrative should be included when the panel is implemented.

Results for third gene or gene pair studied

CNE 48018-6 Gene(s) studied 1..3 4.3 2637^CYP3A4^HGNC-Symbol~1577^CYP3A5^HGNC-Symb This names the gene(s); for pharmacogenomics they are usually CYP genes. As in the example of CYP2C9 and VKORC1, the changes in two genes may be reported as a unit because they have a combined effect. In this case the genes with the combined effect can be listed in one OBX, separated by repeat delimiters.

ST 47998-0 Genotype display name 1..3 4.3 *1/*1 In this context the corresponding alleles for one gene are shown separated by a slash, e.g., *1/*2, as is the common usage. If the metabolism/efficacy effect is based on 2 genes, the results for each gene are shown separated by the repeat delimiter, in the same order as the gene symbols are displayed in the gene(s) studied observation.

CWE 53040-2 C Genetic variation’s effect on drug metabolism interp 0..1 4.3 LA25391-6^Rapid Metabolizer^LN For pharmacogenomics studies at least one of 53040-2 (effect on drug metabolism) or 51961-1 (effect on drug efficacy) must be included in the panel.

CWE 51961-1 C Genetic variation’s effect on drug efficacy interp 0..1 4.3 <not represented in this example>
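The pairing rule described above (genes listed in one OBX-5 separated by the repeat delimiter, genotypes listed in the same order) can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, and it assumes the default HL7 v2 delimiters (`~` for repeats, `^` for components) and values that contain no escape sequences.

```python
from typing import Dict, List, Tuple


def split_repeats(obx5: str) -> List[List[str]]:
    """Split an OBX-5 value into repeats, and each repeat into its components."""
    return [repeat.split("^") for repeat in obx5.split("~")]


def pair_genes_with_genotypes(gene_obx5: str,
                              genotype_obx5: str) -> Dict[str, Tuple[str, ...]]:
    """Match each gene symbol (second component of each repeat of 48018-6)
    with the genotype in the same repeat position of 47998-0."""
    symbols = [components[1].strip() for components in split_repeats(gene_obx5)]
    genotypes = [g.strip() for g in genotype_obx5.split("~")]
    if len(symbols) != len(genotypes):
        raise ValueError("gene and genotype repeat counts differ")
    # Alleles within one genotype are separated by a slash, e.g. *1/*2
    return {symbol: tuple(genotype.split("/"))
            for symbol, genotype in zip(symbols, genotypes)}


# Example values taken from the first gene pair in the table above
pairs = pair_genes_with_genotypes(
    "1559^CYP2C9^HGNC-Symbol~23661^VKORC1^HGNC-Symbol",
    "*2/*5~*A/*A",
)
print(pairs)
```

Raising an error when the repeat counts differ is a design choice of this sketch; a receiver could instead log the mismatch and keep the unpaired values.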


Report Section 5: Complex Variants (those with multiple alleles). We have moved complex variants to the last section because the example is very long, they are most applicable to pharmacogenomics, and such reports do not always go into the details of the simple variants within the haplotypes.

Sources for example: http://www.ncbi.nlm.nih.gov/clinvar/variation/16895/ http://www.ncbi.nlm.nih.gov/gene/1565

Data Type LOINC # RCO Card. OBX-4

81251-1 Complex genetic variant panel 0..* (repeats for each complex variant)

Complex variants are made up of two or more simple variants which together have phenotypic implications. Usually they carry information about phase (i.e., whether the reported variants are on the same or different chromosomes). They would be needed for detailed haplotypes and compound heterozygotes, among other types.

In the OBXs that follow, OBX-4 increments by 1 for each repeated complex variant. The example presents only one complex variant.

Information that applies to one complex variant as a whole

B CWE 81260-2 C Complex genetic variant [Identifier] 0..1 5.1 16895^NM_000106.5(CYP2D6):c.[886C>T;1457G>C] - Haplotype^ClinVar Following the pattern of the simple variant, the code is the identifier from a public genetic database and the name is a concatenation of the RefSeq, the gene symbol, the HGVS description of the multiple variants, and the complex variant type.

F CWE 81262-8 C Complex variant HGVS name 0..1 5.1 c.[1749A>G ; 2549delA]^^HGVS Includes the HGVS for the separate variants that make up this complex variant. A single pair of square brackets surrounding multiple variants indicates they are together on one chromosome. When each simple variant is surrounded by its own square brackets, they are on separate chromosomes. HGVS syntax can also assert that the phase is unknown.

CWE 81265-6 C Complex variant type 0..1 5.1 LAXXXXX-X^Haplotype^LN Answer list can include Haplotype, Complex heterozygote, and others (to be determined).


H CWE 81259-4 O Associated phenotype 0..1 5.1 688395015^Debrisoquine adverse reaction (disorder)^SCT Disorder with which this complex variant is associated. (See same term in simple variant.)

I CNE 53037-8 O Clinical significance 0..1 5.1 LA6668-3^Pathogenic^LN Applies to the set of simple variants in the complex variant.

J CNE 53034-5 O Allelic state 0..1 5.1 LA6706-1^Heterozygous^LN See same term in simple variant, but applies to the whole complex variant.

Information that applies to the simple variants that make up the complex variant (one at a time)

81250-3 Simple genetic variant panel 1..* N/A Does not deliver data, and this code will not be included in the message if we use OBX-4 instead of nested OBRs to organize the hierarchy.

1st simple variant within the first complex variant. Same as stand-alone simple variant panel.

A CWE 48008-7 R Simple Variant: 0..1 5.1.1 31934^NM_000106.5(CYP2D6):c.886C>T (p.Arg296Cys)^ClinVar-V See previous instance of same term

Transcript specification

B CWE 48018-6 C Gene: 0..1 5.1.1 2625^CYP2D6^HGNC-symbol See previous instance of same term

C CWE 51958-7 C Transcript Reference Sequence ID (aka NM_RefSeq): 0..1 5.1.1 NM_000106.5^^RefSeq See previous instance of same term

D CWE 41103-3 C DNA change: 0..1 5.1.1 c.886C>T^^c.HGVS See previous instance of same term

E CWE 48005-3 C Amino acid change: 0..1 5.1.1 p.Arg296Cys^^p.HGVS See previous instance of same term

F CWE 48019-4 O DNA sequence variation type 0..1 5.1.1 LA6690-7^Substitution^LN See previous instance of same term

G CWE 48006-1 O Amino acid change type 0..1 5.1.1 LA6698-0^Missense^LN

Genomic specification

H CWE 48013-7 C Genomic Reference Sequence: 0..1 5.1.1 NG_008376.3^^RefSeqGene See previous instance of same term

I ST 69547-8 C Genomic Reference (Ref) allele: 0..1 5.1.1 C See previous instance of same term

J NM 81254-5 C Genomic Allele location: 0..1 5.1.1 42127941 See previous instance of same term

K ST 69551-0 C Genomic Alternate (Alt) allele: 0..1 5.1.1 T See previous instance of same term

Other optional codes related to simple variation

L CWE 48004-6 O dbSNP ID: 0..1 5.1.1 rs16947^^dbSNP

M CWE 81256-0 C COSMIC 0..1 5.1.1 See previous instance of same term

N CWE 81257-8 O CIGAR 0..1 5.1.1 See previous instance of same term

Other possible attributes

O NM 81258-6 P Allelic Frequency NFR 0..1 5.1.1 0.40045 See previous instance of same term

P CWE 48001-2 O Cytogenetic Location (Synonym - Chromosome region) 0..1 5.1.1 22q13.2 See previous instance of same term

81250-3 Simple variant panel 2nd simple variant within complex variant. Same as stand-alone simple variant panel.

A CWE 48008-7 Simple variant: 0..1 5.1.2 38485^NM_000106.5(CYP2D6):c.1457G>C (p.Ser486Thr)^ClinVar

Transcript specification

B CWE 48018-6 C Gene: 0..1 5.1.2 2625^CYP2D6^HGNC-symb See previous instance of same term

C CWE 51958-7 C Transcript Reference Sequence ID: 0..1 5.1.2 NM_000106.5^^NCBI-NM See previous instance of same term

D CWE 41103-3 C DNA change: 0..1 5.1.2 c.1457G>C^^c.HGVS See previous instance of same term

E CWE 48005-3 C Amino acid change: 0..1 5.1.2 p.Ser486Thr See previous instance of same term

F CWE 48019-4 C DNA sequence variation type 0..1 5.1.2 LA6690-7^Substitution^LN See previous instance of same term

G CWE 48006-1 O Amino acid change type 0..1 5.1.2 LA6698-0^Missense^LN See previous instance of same term

Genomic specification

H CWE 48013-7 C Genomic Reference Sequence: 0..1 5.1.2 NG_008376.3^^NCBI-NG-NC See previous instance of same term

I ST 69547-8 C Genomic Reference (Ref) allele: 0..1 5.1.2 G See previous instance of same term

J NM 81254-5 C Genomic Allele location: 0..1 5.1.2 42126611 See previous instance of same term

K ST 69551-0 C Genomic Alternate (Alt) allele: 0..1 5.1.2 C See previous instance of same term

Other optional codes related to simple variation

L CWE 48004-6 O dbSNP: 0..1 5.1.2 rs368949613^^dbSNP See previous instance of same term

M CWE 81256-0 C COSMIC 0..1 5.1.2 See previous instance of same term

N CWE 81257-8 O CIGAR 0..1 5.1.2 See previous instance of same term

Other possible attributes

O NM 81258-6 P Allelic Frequency NFR 0..1 5.1.2 0.59168 See previous instance of same term

P CWE 48001-2 O Chromosome region 0..1 5.1.2 22q13.2 See previous instance of same term

Allelic state and interpretive attributes

Q CNE 53034-5 C Allelic state: 0..1 5.1.2 LA6706-1^Heterozygous^LN See previous instance of same term

R CNE 53037-8 O Clinical significance: 0..1 5.1.2 LA6668-3^Pathogenic^LN See previous instance of same term

S TX 81259-4 Associated phenotype: 0..1 5.1.2 688395015^Debrisoquine adverse reaction (disorder)^SCT

If there were a second complex variant in the report, it would be identified with 5.2, its first constituent simple variant with 5.2.1, its second with 5.2.2, etc.
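The OBX-4 dot notation used throughout this section (5.1 for a complex variant, 5.1.1 and 5.1.2 for its simple variants) lets a receiver rebuild the hierarchy without nested OBRs. The following Python sketch shows one way a receiver might do that; the function names and the tuple layout `(obx4, loinc, value)` are assumptions made for illustration, not part of the specification.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Observation = Tuple[str, str, str]  # (OBX-4, LOINC code, OBX-5 value)


def group_by_obx4(observations: Iterable[Observation]) -> Dict[str, List[Tuple[str, str]]]:
    """Collect the (LOINC, value) pairs that share the same OBX-4 sub-ID."""
    groups: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for obx4, loinc, value in observations:
        groups[obx4].append((loinc, value))
    return dict(groups)


def children_of(groups: Dict[str, list], parent: str) -> List[str]:
    """Return the sub-IDs exactly one level below `parent`, in numeric order."""
    prefix = parent + "."
    depth = parent.count(".") + 1
    direct = [key for key in groups
              if key.startswith(prefix) and key.count(".") == depth]
    return sorted(direct, key=lambda key: [int(part) for part in key.split(".")])


# A few observations taken from the Section 5 example
example = [
    ("5.1", "81265-6", "LAXXXXX-X^Haplotype^LN"),
    ("5.1.1", "41103-3", "c.886C>T^^c.HGVS"),
    ("5.1.1", "48005-3", "p.Arg296Cys^^p.HGVS"),
    ("5.1.2", "41103-3", "c.1457G>C^^c.HGVS"),
]
groups = group_by_obx4(example)
print(children_of(groups, "5.1"))
```

Sorting on the integer parts rather than on the raw string keeps 5.1.10 after 5.1.2 if a complex variant ever carries ten or more simple variants.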

Source of answer list for structural variant type, from NCBI: http://www.ncbi.nlm.nih.gov/dbvar/content/overview/#ref22

Variant Call Type Type abbreviation Sequence Ontology ID

copy number gain SO:0001742 A sequence alteration whereby the copy number of a given region is greater than the reference sequence.

copy number loss SO:0001743 A sequence alteration whereby the copy number of a given region is less than the reference sequence.

duplication dup SO:0001742 (copy number gain) A sequence alteration whereby the copy number of a given region is greater than the reference sequence.

deletion del SO:0000159 The point at which one or more contiguous nucleotides were excised.

insertion SO:0000667 The sequence of one or more nucleotides added between two adjacent nucleotides in the sequence.

mobile element insertion SO:0001837 A kind of insertion where the inserted sequence is a mobile element.

novel sequence insertion SO:0001838 An insertion the sequence of which cannot be mapped to the reference genome.

tandem duplication SO:1000173 A duplication consisting of 2 identical adjacent regions.

inversion inv SO:1000036 A continuous nucleotide sequence is inverted in the same position.

intrachromosomal breakpoint SO:0001874 A rearrangement breakpoint within the same chromosome.

interchromosomal breakpoint SO:0001873 A rearrangement breakpoint between two different chromosomes.

translocation SO:0000199 A region of nucleotide sequence that has translocated to a new position.

complex SO:0001784 A structural sequence alteration or rearrangement encompassing one or more genome fragments.

sequence alteration SO:0001059 A sequence_alteration is a sequence_feature whose extent is the deviation from another sequence.

Variant Call Type Sequence Ontology ID Variant Region Type

copy number gain SO:0001742 A sequence alteration whereby the copy number of a given region is greater than the reference sequence. copy number variation

copy number loss SO:0001743 A sequence alteration whereby the copy number of a given region is less than the reference sequence. copy number variation

duplication SO:0001742 (copy number gain) A sequence alteration whereby the copy number of a given region is greater than the reference sequence. copy number variation

deletion SO:0000159 The point at which one or more contiguous nucleotides were excised. copy number variation

insertion SO:0000667 The sequence of one or more nucleotides added between two adjacent nucleotides in the sequence. insertion

mobile element insertion SO:0001837 A kind of insertion where the inserted sequence is a mobile element. mobile element insertion

novel sequence insertion SO:0001838 An insertion the sequence of which cannot be mapped to the reference genome. novel sequence insertion

tandem duplication SO:1000173 A duplication consisting of 2 identical adjacent regions. tandem duplication

inversion SO:1000036 A continuous nucleotide sequence is inverted in the same position. inversion

intrachromosomal breakpoint SO:0001874 A rearrangement breakpoint within the same chromosome. translocation or complex chromosomal mutation

interchromosomal breakpoint SO:0001873 A rearrangement breakpoint between two different chromosomes. translocation or complex chromosomal mutation

translocation SO:0000199 A region of nucleotide sequence that has translocated to a new position. translocation

complex SO:0001784 A structural sequence alteration or rearrangement encompassing one or more genome fragments. complex

sequence alteration SO:0001059 A sequence_alteration is a sequence_feature whose extent is the deviation from another sequence. sequence alteration
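The two tables above amount to a lookup from variant call type to Sequence Ontology ID and variant region type. A minimal Python sketch of that lookup is shown below; the dictionary is transcribed from the tables, while the names `SO_CALL_TYPES` and `lookup_call_type` are hypothetical.

```python
from typing import Dict, Tuple

# call type -> (Sequence Ontology ID, variant region type), transcribed from the tables
SO_CALL_TYPES: Dict[str, Tuple[str, str]] = {
    "copy number gain": ("SO:0001742", "copy number variation"),
    "copy number loss": ("SO:0001743", "copy number variation"),
    "duplication": ("SO:0001742", "copy number variation"),
    "deletion": ("SO:0000159", "copy number variation"),
    "insertion": ("SO:0000667", "insertion"),
    "mobile element insertion": ("SO:0001837", "mobile element insertion"),
    "novel sequence insertion": ("SO:0001838", "novel sequence insertion"),
    "tandem duplication": ("SO:1000173", "tandem duplication"),
    "inversion": ("SO:1000036", "inversion"),
    "intrachromosomal breakpoint": ("SO:0001874", "translocation or complex chromosomal mutation"),
    "interchromosomal breakpoint": ("SO:0001873", "translocation or complex chromosomal mutation"),
    "translocation": ("SO:0000199", "translocation"),
    "complex": ("SO:0001784", "complex"),
    "sequence alteration": ("SO:0001059", "sequence alteration"),
}


def lookup_call_type(name: str) -> Tuple[str, str]:
    """Return (SO ID, region type) for a call type, ignoring case and extra spaces."""
    key = " ".join(name.lower().split())
    if key not in SO_CALL_TYPES:
        raise ValueError(f"unknown variant call type: {name!r}")
    return SO_CALL_TYPES[key]


print(lookup_call_type("Tandem  Duplication"))
```

Note that duplication and copy number gain share SO:0001742 in the source table, so the SO ID alone does not identify the call type.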


Overview: Genomics LOINC panel names

81247-9 HL7 genetic variant reporting panel

81250-3 Simple variant panel

81251-1 Complex variant panel

81294-1 Genetic form configuration controls

81297-4 Structural variant panel

82118-1 Pharmacogenomics result panel

82117-3 Medication usage implications panel

Proposed symbolic names (CWE.3) for new coding systems

Name of source system

Coding system symbolic name

Coding system long name

Coding system OID (will fill in after we register in Hl7)

Comment URL

RxNorm Ingredients subset RxN-ingrd

HGVS transcript syntax c.HGVS

HGVS genomic syntax g.HGVS

HGVS protein syntax p.HGVS

NCBI transcript reference sequences NCBI-NM

NCBI genomic or chromosome reference sequence NCBI-NG-NC

Ensembl transcript reference sequence

Ensembl genomic reference sequence ENSG

Ensembl protein reference sequence

dbSNP dbSNP

Gene identifiers HGNC, with symbol defined as the print string (CWE.2) in the coding system HGNC-symb

SNOMED-CT SCT

LOINC LN

COSMIC simple variants COSM-Smpl

COSMIC structural variants COSMIC-Strc

ClinVar variant ID coding system ClinVar-V Uses variant ID as the code (CWE.1) rather than the allele code, which is also available in each record

NCBI MedGen disease subset NCBI-DS
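As an illustration of how the proposed symbolic names fill CWE.3, the Python sketch below assembles a CWE value as code^text^system and rejects coding system names that are not in the list above. The names `CODING_SYSTEMS` and `make_cwe` are hypothetical, the set covers only the symbolic names proposed here, and the sketch rejects rather than escapes HL7 delimiter characters.

```python
# Symbolic names proposed in the table above (rows without a name are omitted)
CODING_SYSTEMS = {
    "RxN-ingrd", "c.HGVS", "g.HGVS", "p.HGVS", "NCBI-NM", "NCBI-NG-NC",
    "ENSG", "dbSNP", "HGNC-symb", "SCT", "LN", "COSM-Smpl", "COSMIC-Strc",
    "ClinVar-V", "NCBI-DS",
}

HL7_DELIMITERS = "|^~&\\"


def make_cwe(code: str, text: str, system: str) -> str:
    """Build 'code^text^system' (CWE.1 to CWE.3) after basic checks."""
    if system not in CODING_SYSTEMS:
        raise ValueError(f"unrecognized coding system name: {system!r}")
    for part in (code, text):
        if any(ch in part for ch in HL7_DELIMITERS):
            raise ValueError(f"value needs HL7 escaping first: {part!r}")
    return f"{code}^{text}^{system}"


print(make_cwe("2625", "CYP2D6", "HGNC-symb"))
print(make_cwe("c.886C>T", "", "c.HGVS"))
```

A strict check like this would also flag the spelling variants that appear in the example rows (for instance HGNC-Symbol or RxN-ingd), which may or may not be desirable while the names are still proposals.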
