Virscidian Poster Pittcon 2010
Summary of Study Results
Using the Combined Query approach, it is possible to construct extensive result queries that evaluate a variety of conditions. The implementation is easily extensible to further customized queries. Only a snapshot is reported here.
A quantitative assessment of large scale data processing for LC/UV & MS based compound QC
Mark A. Bayliss, Joseph D. Simpkins, Virscidian Inc., Raleigh NC 27601
Abstract
In our experience, a significant number of situations force us to QC a much greater
percentage of our LC/MS, UV and ELSD compound QC results than we feel should be
necessary. This often means a 100% QC. The main reasons are:
• Target(s) found (green), but the purity or concentration of the sample is too low to be of
practical use.
• Targets found but eluting in a region with a significant level of impurities, and therefore
more challenging for auto‐purification.
• Targets eluting within the solvent front or at the end of the chromatographic run, typically
with poor integration.
• Targets poorly classified as found, maybe or not found due to challenges in signal
processing, baselining, peak integration, MS peak classification, poor assignment of adducts
and so on.
The major issue, of course, was that we were not sure how prevalent these issues were or
how far they were causing us to over‐QC results. To better understand these effects, we
undertook a relatively large‐scale review of our results to determine where most of the
problem situations occur and to remedy as many as possible. We were also looking to
increase the trust we have in our processing and to trap those situations where an analyst
needs to make informed decisions and communicate these effectively. This presentation
summarizes some of our findings and how we have attempted to address these needs.
Results
Conclusions
This study set out to quantitatively answer three basic questions:
1) Do we need additional tools to visualize hidden content within the results deck?
2) Do we need workflow based visualization tools that can assist a scientist through
the process of results review, reporting and publishing?
3) Is it possible to reliably reduce the need for 100% results review?
1) Results Review – Visualization of the hidden content
Analyzing results using only the traditional traffic light approach answers a very
limited question and can therefore under‐ or over‐represent the true state of the
results, as highlighted in this example study. For both effective targeted review
and discovery of the true nature of the results, we propose that additional ways to
visualize the results deck are necessary and advantageous.
We found that by making a focused review based on one result aspect at a time, the
overall quality of results review was much improved. It is important to state that this
has not been quantified and may be more about how the individual reviewer works. It
could be interesting to study this more deeply.
2) Workflow based results review
Effective review of results depends as much on the design of the visualization tools
as on the tools required to generate the results in the first place.
Reviewing large quantities of results requires a complementary yet different
implementation, which must be able to guide the user to often hidden problem areas in
the results.
For this type of sample analysis, chemists are really interested in those samples that
are “Found” AND “Pure” (AND, optionally, with a concentration above some minimum
acceptable level). These are typical requirements, for example, for target substance
activity screening.
To answer the first two parts of the question, analysts need to understand which
samples are “Found” AND “Pure (≥80% Area by Detector X)” AND “not eluting in the
solvent front” AND “not eluting at the end of the chromatography”, etc.
By embedding specific tags of information in the results and coupling them with a flexible
graphical query system, an analyst can easily generate any number of combinations of
test conditions that expose the hidden detail of the results. This makes for very
effective targeted decision making.
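The tag‐plus‐query idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Analytical Studio implementation: the SampleResult class, the tag names ("found", "solvent_front", "end_of_run") and the helper functions are all hypothetical, and the 80% threshold is taken from the example query quoted in this poster.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass
class SampleResult:
    """Hypothetical tagged result for one well (not the vendor's data model)."""
    well: str
    tags: Set[str] = field(default_factory=set)
    area_pct: float = 0.0  # integrated %Area on the chosen detector


Query = Callable[[SampleResult], bool]


def has_tag(tag: str) -> Query:
    """Query that is true when the result carries the given tag."""
    return lambda s: tag in s.tags


def min_area(threshold: float) -> Query:
    """Query that is true when %Area is at or above the threshold."""
    return lambda s: s.area_pct >= threshold


def all_of(*queries: Query) -> Query:
    """Logical AND of several queries."""
    return lambda s: all(q(s) for q in queries)


def negate(query: Query) -> Query:
    """Logical NOT of a query."""
    return lambda s: not query(s)


# "Found" AND "Pure (>= 80% Area)" AND NOT solvent front AND NOT end of run
screening_query = all_of(
    has_tag("found"),
    min_area(80.0),
    negate(has_tag("solvent_front")),
    negate(has_tag("end_of_run")),
)

results = [
    SampleResult("A1", {"found"}, 92.5),
    SampleResult("A2", {"found", "solvent_front"}, 88.0),
    SampleResult("A3", {"found"}, 41.0),
    SampleResult("A4", {"not_found"}, 0.0),
]

matches = [s.well for s in results if screening_query(s)]
print(matches)  # ['A1']
```

Because each condition is an independent predicate, further custom queries can be built by recombining the same pieces rather than by writing new review logic.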
3) Reducing the need for 100% results review
This study found that, even under difficult analysis conditions, it is not necessary
to perform a 100% data review; rather, a more targeted workflow‐based approach can
be used with good accuracy. While our goal is a 100% accurate evaluation, a
97% accuracy under these conditions represents a very respectable and usable level of
performance.
Final Comments
We have consistently found that the quality of processing and the quality of interface
design should be considered equally important. We continue to improve both and aim to
achieve 100% accuracy.
For Further Information
www.virscidian.com
Contact Joseph Simpkins at [email protected] Contact Mark Bayliss at [email protected]
Virscidian Inc. 7330 Chapel Hill Road, Suite 201, Raleigh, NC 27607,
USA (919) 809‐7651 or (919) 655 8050
Method
• A batch of 1015 random crude synthesis data sets was selected, representing what we
can refer to as challenging samples. The data were originally acquired using an Agilent
Technologies Ion Trap, with the following streams of data: MS1 (+ve), UV310 and ELSD. A
fast chromatographic gradient over 2 minutes was used for separation of the substances.
Data Processing
• The data were analysed using Virscidian’s Analytical Studio Professional‐Compound QC
software beta version 1.2 with a new statistical data collection plug‐in.
• The method for processing was optimized on a small random batch of the samples.
• Attention in optimization was given to: baselining, peak integration regions, solvent
front exclusion from calculations, peak selection and rejection criteria, adduct
classification criteria, peak demotion criteria and detector offset alignment.
• Batches of data were then selected from different non‐consecutive days of sample
acquisitions to make up the 1015 test sample collection.
• All samples were processed using the same processing method with no changes allowed.
Evaluation of results
• Results were captured for the following conditions:
  • Number of samples with Status = Found before and after manual review
  • Number of wells with Status = Maybe
  • Number of wells containing any maybe peak(s)
  • Number of wells with Status = Multiple Target substances before and after review
  • Number of wells with Status = Found AND Pure AND NOT eluting in the solvent front
  AND NOT eluting at the end of the chromatography
• Comparisons of the traditional “Traffic Light” Approach were made against a potentially
more appropriate and practical “Combined Query” approach.
• A review of the workflow optimization for large batch‐wise results review is also reported.
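The difference between counting wells whose overall status is Maybe and counting wells that contain any maybe peak can be illustrated with a short sketch. It assumes, purely for illustration, that a well's traffic‐light status is rolled up from the best classification among its peaks; the poster does not state how the status is actually derived, and the names below are hypothetical.

```python
from collections import Counter
from typing import List

# Hypothetical ranking used to roll peak classifications up to one well status.
RANK = {"not_found": 0, "maybe": 1, "found": 2}


def well_status(peaks: List[str]) -> str:
    """Assumed roll-up: the well takes the best classification among its peaks."""
    return max(peaks, key=RANK.__getitem__, default="not_found")


def has_any_maybe(peaks: List[str]) -> bool:
    """True when at least one peak in the well is classified as maybe."""
    return "maybe" in peaks


wells = {
    "A1": ["found", "maybe"],  # shows as Found, but hides a maybe peak
    "A2": ["maybe"],
    "A3": ["not_found"],
    "A4": ["found"],
}

traffic_light = Counter(well_status(p) for p in wells.values())
any_maybe = sum(has_any_maybe(p) for p in wells.values())

print(traffic_light["maybe"], any_maybe)  # 1 2
```

Under this assumption a single status per well can mask maybe peaks in wells reported as Found, which is one way the two counts could diverge in the manner reported in this study (47 versus 217 before review).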
Is a 100% Results QC Necessary to Ensure Accuracy?
As noted already, the sample datasets used in this study were chosen from synthetic crude mixtures. It so
happened that there was also significant chromatographic column contamination, with a large baseline
disturbance and some baseline resonance towards the end of the chromatography, as shown below. We were
able to deal with this effectively in almost all samples analyzed.
As part of this study, we wanted to determine whether it would be possible to use a combined query system as a
workflow and rely on the quality of the answers it provided. Of the 1015 samples analyzed, 110 were
determined, before a complete results review, to meet the following compound query criteria:
“Found” AND “Pure” {integrated %Area of UV310} AND NOT “eluting within the solvent front” AND NOT
“eluting at the end of the chromatography” (within 0.2 minutes). After review, the same
compound query returned 107 samples meeting the same criteria. This query is consistent with the requirements
for substance screening for activity.
We found that, for the 3% of results that were updated during the review, the main reason for change was the
target substance being classified as found when it should have been not found (a false positive). This
happened exclusively in the analysis of weak secondary adducts (e.g. [M+Na]+, [2M+H]+, [2M+Na]+, etc.), whose
peaks were often very spiky with low numbers of points across the peak. Our approach was to downgrade these to
target not found.
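The quoted percentages can be checked with a short calculation. The sketch below assumes that the 3% and 97% figures are taken relative to the 110 samples that matched the compound query before review; the poster does not state the denominator explicitly.

```python
# Counts reported for the compound query in this study
before_review = 110
after_review = 107

# Assumption: percentages are relative to the pre-review matches
changed = before_review - after_review
changed_fraction = changed / before_review
retained_fraction = after_review / before_review

print(f"changed: {changed} ({changed_fraction:.1%})")  # changed: 3 (2.7%)
print(f"retained: {retained_fraction:.1%}")            # retained: 97.3%
```

On this reading, three samples changed classification, which rounds to the 3% and 97% figures quoted.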
A Workflow Based Approach to Results Review
Review of samples within a workflow
Global visualization is at the plate level. Local visualization of individual samples is achieved with a results summary display and an auto‐advance (Play button) that visualizes only what remains unfiltered. A reviewer can therefore quickly evaluate large quantities of results without needing to use the mouse or keyboard.
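The auto‐advance behaviour can be pictured as stepping through only those samples that pass the active filter. The generator below is a minimal sketch of that idea, assuming a simple in‐memory list of (well, status) pairs; the function name, the dwell time parameter and the data layout are hypothetical and are not taken from the software.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple

Sample = Tuple[str, str]  # (well, status); hypothetical layout


def auto_advance(
    samples: Iterable[Sample],
    keep: Callable[[Sample], bool],
    dwell_seconds: float = 0.0,
) -> Iterator[Sample]:
    """Yield only the samples that remain unfiltered, pausing between them."""
    for sample in samples:
        if keep(sample):
            yield sample               # the reviewer sees this sample
            time.sleep(dwell_seconds)  # timed advance to the next position


plate = [("A1", "found"), ("A2", "maybe"), ("A3", "not_found"), ("A4", "maybe")]

shown = [well for well, _ in auto_advance(plate, lambda s: s[1] == "maybe")]
print(shown)  # ['A2', 'A4']
```

Filtered‐out positions are skipped entirely, so the reviewer only ever stops on samples relevant to the current review task.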
Exposing Hidden Detail Using a “Combined Query” Approach
Using a query‐based approach to results evaluation exposes the hidden detail in the samples and allows a reviewer to remain focused on a single review task. We found that this appears to improve the accuracy of final results, though this has not been quantified at this time and was not the initial aim of the experiment.
[Figure: review workflow, Step 1 to Step N, with timed auto‐advance to the next unfiltered position. Example query and visualization criteria shown: Traditional traffic light approach; Review as Area%; Maybe peaks present; Found, Area% ≥ 80%, no solvent front elution, does not elute at end of chromatography; Requires purification; Purification – slow gradient; Custom query 1 to Custom query N.]
[Figure: Using the Combined Query Approach to find all "Maybe" peaks {Sample size 1015 samples}. Maybe: Traffic Light Approach 47; Combined Query Approach 217.]
Traffic Light Approach    Before Review    After Review
Found                     690              715
Maybe                     47               0
Not Found                 278              300
Total                     1015             1015
Limited Visibility Data Review using the Traditional Traffic Light Approach
Combined Query Approach                                                            Before Review    After Review
“Found” AND “%Area UV310 >= 80%” AND NOT “Solvent Front” AND NOT “End of Chrom”    110              107
Any Maybe Peaks Present                                                            217              0
Not Found                                                                          278              300
Isobaric substances present                                                        140              163
Substance Elutes in Solvent Front                                                  6                6
Combined Query Approach that allows you to ask many varied questions of the results
Left – The Traditional Visualization of Found/Maybe/Not Found
Right ‐ Visualization of all samples containing any maybe peaks