Post on 22-Feb-2016
“An automated tool designed to ease the pain of test creation and maintenance.”
Nil Weerasinghe, Bryan Robbins, Mohamed Ibrahim
American University Presentation Copyright 2011 FINRA 2
About FINRA
■ Financial Industry Regulatory Authority
• Largest independent regulator for all securities firms doing business in the U.S.
• ~4,500 brokerage firms
• ~163,500 branch offices
• ~634,400 registered securities representatives
Providing independent, vigorous regulation
Educating & informing investors
Inviting active industry involvement & input
Actively supporting firms’ compliance efforts
Our Mission: Investor Protection. Market Integrity.
Computerized certification and continued education.
Series 7, 63 …etc.
FINRA Open Source Projects
■ Increase Community Involvement
■ FINRA Open Source Projects
• http://finraos.github.io/
■ DataGenerator
• http://finraos.github.io/DataGenerator/
■ JTAF-ExtWebDriver
• http://finraos.github.io/JTAF-ExtWebDriver/
How to get involved.
■ Use it
■ Extend it
• Fork it
• Discuss idea
– Open ticket
– Google group discussion
– opensource@finra.org
• Commit
– DCO and Apache V2
■ Report bugs
■ Help document
http://finraos.github.io/DataGenerator/
https://github.com/FINRAOS/DataGenerator
Agenda
• What is the DataGenerator?
• Demo.
– Dependency Modeling
– Pairwise Data Generation.
• Current Limitations.
• Re-architecture plan.
• Questions
Video
http://www.youtube.com/watch?v=Wxa1T0gp56k
http://finraos.github.io/DataGenerator/
Current Approach
■ Two ways to describe and generate datasets
• Equivalence Classes + Combinations
• Dependency Model + Graph Coverage
■ Both use Apache Velocity to generate output from templates
[Diagram: DataSpec, Model, Datasets, Outputs]
Demo
■ Pairwise Combinations
• Uses equivalence classes from DataSpec to populate datasets
■ All Paths
• Uses annotations from graphical model to populate datasets
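The pairwise idea can be sketched in a few lines of Python. This is a minimal greedy sketch, not the DataGenerator implementation: the function names are hypothetical, and the sample field `SIDE` with its values is made up for illustration. It picks rows from the full cross product of the equivalence classes until every pair of values from two different fields appears in at least one row.

```python
from itertools import combinations, product


def all_pairs(classes):
    """Every pair of (field, value) choices that pairwise coverage must hit."""
    pairs = set()
    for (f1, v1s), (f2, v2s) in combinations(sorted(classes.items()), 2):
        for v1, v2 in product(v1s, v2s):
            pairs.add(((f1, v1), (f2, v2)))
    return pairs


def pairs_of(row):
    """The pairs covered by one dataset row (a dict of field -> value)."""
    return set(combinations(sorted(row.items()), 2))


def pairwise(classes):
    """Greedily pick rows from the cross product until all pairs are covered."""
    fields = sorted(classes)
    candidates = [dict(zip(fields, values))
                  for values in product(*(classes[f] for f in fields))]
    uncovered = all_pairs(classes)
    rows = []
    while uncovered:
        # Take the candidate row that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda r: len(pairs_of(r) & uncovered))
        rows.append(best)
        uncovered -= pairs_of(best)
    return rows


# Hypothetical equivalence classes; SIDE is not part of the original slides.
classes = {
    "RECORD_TYPE": ["a", "b", "c"],
    "REQUEST_IDENTIFIER": ["1", "2", "3"],
    "SIDE": ["buy", "sell"],
}
rows = pairwise(classes)
```

The full cross product of these classes has 18 rows; the greedy selection needs fewer while still covering every pair. Enumerating all candidates up front is only practical for small specs.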
Limitations of Current Approach
■ Limited set of graph annotations
• Can only set variable values within model
• No support for logic, pos/neg equivalence classes in current version
• We need more powerful annotations
■ Logic often split across spec, model, and templates
• Anything dynamic must be injected into Velocity template, as model and spec are both static
• We need more dynamic evaluation
■ Performance considerations
• Breadth-first enumeration doesn’t scale well as domain becomes more complex
• We need a more performant implementation
Re-architecting Data Generator
■ Replacing Visio with SCXML, an open standard to represent the state machine.
<scxml xmlns="http://www.w3.org/2005/07/scxml" xmlns:cs="http://commons.apache.org/scxml" version="1.0" initial="start">
  <state id="start">
    <transition event="RECORD_TYPE" target="RECORD_TYPE"/>
  </state>
  <state id="RECORD_TYPE">
    <!-- Mandatory -->
    <onentry>
      <assign name="var_out_RECORD_TYPE" expr="set:{a,b,c}"/>
    </onentry>
    <transition event="REQUEST_IDENTIFIER" target="REQUEST_IDENTIFIER"/>
  </state>
  ...
Re-architecting Data Generator
■ SCXML allows for complex modelling using embedded EL
  <state id="PRODUCT_TYPE_CODE">
    <!-- Mandatory -->
    <onentry>
      <assign name="var_out_PRODUCT_TYPE_CODE" expr="#ProductTypeCode_Cycle"/>
    </onentry>
    <transition event="OPTIONS_SYMBOLOGY_IDENTIFIER" target="OPTIONS_SYMBOLOGY_IDENTIFIER" cond="${var_out_PRODUCT_TYPE_CODE=='Derivatives-Options'}" />
    <transition event="OPTIONAL_SECURITY_SYMBOL" target="OPTIONAL_SECURITY_SYMBOL" cond="${var_out_PRODUCT_TYPE_CODE!='Derivatives-Options'}" />
  </state>
  ...
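The effect of the two `cond` attributes can be illustrated with a small Python sketch. It is not how Apache Commons SCXML evaluates EL; the `Transition` class and `enabled_targets` function are hypothetical names, and Python lambdas stand in for the EL expressions. The value `Equity` is an invented example of a non-options product type.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Variables = Dict[str, str]


@dataclass
class Transition:
    """A transition guarded by a condition over the variables set so far."""
    target: str
    cond: Callable[[Variables], bool]


def enabled_targets(transitions: List[Transition], variables: Variables) -> List[str]:
    """Return the targets whose guard holds for the current variable values."""
    return [t.target for t in transitions if t.cond(variables)]


# Lambdas standing in for the cond="${...}" expressions of PRODUCT_TYPE_CODE.
transitions = [
    Transition("OPTIONS_SYMBOLOGY_IDENTIFIER",
               lambda v: v["var_out_PRODUCT_TYPE_CODE"] == "Derivatives-Options"),
    Transition("OPTIONAL_SECURITY_SYMBOL",
               lambda v: v["var_out_PRODUCT_TYPE_CODE"] != "Derivatives-Options"),
]

options = enabled_targets(transitions, {"var_out_PRODUCT_TYPE_CODE": "Derivatives-Options"})
other = enabled_targets(transitions, {"var_out_PRODUCT_TYPE_CODE": "Equity"})  # "Equity" is hypothetical
```

Because the two guards are complementary, exactly one branch is enabled for any product type code, so the generated record carries either the options symbology field or the optional security symbol, never both.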
Re-architecting Data Generator
■ SCXML allows for complex modelling: a state can itself be written as a state machine
■ We’re using Apache Commons SCXML in our POC
Re-architecting Data Generator
■ Overcome memory issues by enhancing the all-paths algorithm: use depth-first search (DFS), which keeps memory overhead minimal
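A minimal Python sketch of depth-first all-paths enumeration follows. It is an illustration of the general technique under stated assumptions, not the project's algorithm: the function name and the sample graph are hypothetical, and the start node is assumed to differ from the end node. Only the current path and one successor iterator per node on that path are held in memory, whereas a breadth-first enumeration keeps every partial path of the current depth in its queue.

```python
def all_paths_dfs(graph, start, end):
    """Yield each start-to-end path, keeping only the current path in memory."""
    path = [start]
    # One iterator over successors for each node on the current path.
    stack = [iter(graph.get(start, ()))]
    while stack:
        nxt = next(stack[-1], None)
        if nxt is None:
            # Successors exhausted: backtrack one level.
            stack.pop()
            path.pop()
            continue
        if nxt in path:
            # Skip nodes already on the path so cycles cannot loop forever.
            continue
        path.append(nxt)
        if nxt == end:
            yield list(path)
            path.pop()
        else:
            stack.append(iter(graph.get(nxt, ())))


# Hypothetical model graph: adjacency lists keyed by state id.
graph = {
    "start": ["A", "B"],
    "A": ["end"],
    "B": ["C", "end"],
    "C": ["end"],
}
paths = list(all_paths_dfs(graph, "start", "end"))
```

Because the function is a generator, a caller can write each path out as soon as it is produced instead of collecting them all, as `list(...)` does here for demonstration.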
Re-architecting Data Generator
■ Short demo:
<scxml xmlns="http://www.w3.org/2005/07/scxml" xmlns:cs="http://commons.apache.org/scxml" version="1.0" initial="start">
  <state id="start">
    <transition event="RECORD_TYPE" target="RECORD_TYPE"/>
  </state>
  <state id="RECORD_TYPE">
    <onentry>
      <assign name="var_out_RECORD_TYPE" expr="set:{a,b,c}"/>
    </onentry>
    <transition event="REQUEST_IDENTIFIER" target="REQUEST_IDENTIFIER"/>
  </state>
  <state id="REQUEST_IDENTIFIER">
    <onentry>
      <assign name="var_out_REQUEST_IDENTIFIER" expr="set:{1,2,3}"/>
    </onentry>
    <transition event="MANIFEST_GENERATION_DATETIME" target="MANIFEST_GENERATION_DATETIME"/>
  </state>
  <state id="MANIFEST_GENERATION_DATETIME">
    <onentry>
      <assign name="var_out_MANIFEST_GENERATION_DATETIME" expr="#{nextint}"/>
    </onentry>
    <transition target="end"/>
  </state>
  <state id="end">
  </state>
</scxml>
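To make the shape of the generated data concrete, here is a rough Python sketch that walks a trimmed copy of the demo model and expands its assignments. It rests on several assumptions that the slides do not confirm: that `set:{a,b,c}` means "one dataset per listed value", that any other expression can be treated as a single literal, and that the model is a linear chain of states. The `MANIFEST_GENERATION_DATETIME` state is left out because the semantics of `#{nextint}` are not defined here, so `REQUEST_IDENTIFIER` is rewired straight to `end`. The function names `expand` and `generate` are hypothetical.

```python
import itertools
import re
import xml.etree.ElementTree as ET

NS = {"sc": "http://www.w3.org/2005/07/scxml"}

# Trimmed copy of the demo model; REQUEST_IDENTIFIER goes straight to "end" here.
MODEL = """<scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0" initial="start">
  <state id="start">
    <transition event="RECORD_TYPE" target="RECORD_TYPE"/>
  </state>
  <state id="RECORD_TYPE">
    <onentry><assign name="var_out_RECORD_TYPE" expr="set:{a,b,c}"/></onentry>
    <transition event="REQUEST_IDENTIFIER" target="REQUEST_IDENTIFIER"/>
  </state>
  <state id="REQUEST_IDENTIFIER">
    <onentry><assign name="var_out_REQUEST_IDENTIFIER" expr="set:{1,2,3}"/></onentry>
    <transition target="end"/>
  </state>
  <state id="end"/>
</scxml>"""


def expand(expr):
    """Assumed semantics: 'set:{x,y}' lists alternatives; anything else is one literal."""
    match = re.fullmatch(r"set:\{(.*)\}", expr)
    return match.group(1).split(",") if match else [expr]


def generate(model_xml):
    """Follow the first transition of each state and expand every assignment."""
    root = ET.fromstring(model_xml)
    states = {s.get("id"): s for s in root.findall("sc:state", NS)}
    names, choices = [], []
    state = states[root.get("initial")]
    while state is not None:
        for assign in state.findall("sc:onentry/sc:assign", NS):
            names.append(assign.get("name"))
            choices.append(expand(assign.get("expr")))
        transition = state.find("sc:transition", NS)
        state = states[transition.get("target")] if transition is not None else None
    # One dataset per combination of the expanded values.
    return [dict(zip(names, combo)) for combo in itertools.product(*choices)]


datasets = generate(MODEL)
```

With two three-valued sets, the sketch yields nine datasets, one for each combination of record type and request identifier.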
Re-architecting Data Generator
■ Restructure the code to allow Hadoop MapReduce and Giraph to operate on it.
■ Data Generator won’t itself directly depend on Hadoop or Giraph, but will abstract the following:
• Input: Allow input from files
• Execution: Allow execution from a middle state, given input variables
• Output: Allow output to different formats: text files, several files, gz. The user will be able to extend the output to support sequence files, Redshift, HBase
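One way to picture the output abstraction is a small writer interface with interchangeable implementations. The Python sketch below is only an illustration of that design idea: `DatasetWriter`, `TextWriter`, `GzipWriter`, `export`, and the pipe separator are all hypothetical and do not come from the DataGenerator code base.

```python
import gzip
from typing import Dict, Iterable, Protocol

Row = Dict[str, str]


class DatasetWriter(Protocol):
    """Anything that can persist generated rows; users could add more formats."""

    def write(self, rows: Iterable[Row]) -> None: ...


class TextWriter:
    """Writes one delimited line per row to a plain text file."""

    def __init__(self, path: str, sep: str = "|") -> None:
        self.path = path
        self.sep = sep

    def write(self, rows: Iterable[Row]) -> None:
        with open(self.path, "w", encoding="utf-8") as out:
            for row in rows:
                out.write(self.sep.join(row.values()) + "\n")


class GzipWriter:
    """Same line format as TextWriter, but gzip-compressed."""

    def __init__(self, path: str, sep: str = "|") -> None:
        self.path = path
        self.sep = sep

    def write(self, rows: Iterable[Row]) -> None:
        with gzip.open(self.path, "wt", encoding="utf-8") as out:
            for row in rows:
                out.write(self.sep.join(row.values()) + "\n")


def export(rows: Iterable[Row], writer: DatasetWriter) -> None:
    """The generator side depends only on the interface, not on a format."""
    writer.write(rows)
```

Under this design, a writer for sequence files or HBase would be another class with the same `write` method, so the generator itself would not need a Hadoop dependency.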
Re-architecting Data Generator