Big Data Task Force: Legacy from NAC IT Infrastructure Committee. Charles P Holmes, Chair, BDTF; formerly Vice-Chair, ITIC. February 16, 2016

  • Big Data Task Force: Legacy from NAC IT Infrastructure Committee

    Charles P Holmes, Chair, BDTF

    Formerly Vice-Chair, ITIC

    February 16, 2016

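  • Mandatory Flow Chart

    [Organization chart: NAC Structure 2010 - 2013. The NASA Advisory Council sits above its committees (Human Ops, Aeronautics, Science, IT Infrastructure, etc.); the Science Committee's subcommittees are Planetary Science, Heliophysics, Astrophysics, Earth Science, and Planetary Protection.]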

    2/16/16 NAC Big Data Task Force 2

  • NAC Committee on IT Infrastructure Recommendation #1 July 31, 2013

    • Recommendation: The NASA NAC ITIC & Science Committees should collaboratively explore the existing and planned evolution of NASA’s science data cyberinfrastructure that supports broad access to data repositories for NASA SMD missions. This exploration should be undertaken in the context of effective practices within NASA, other Federal agencies, as well as industry and research institutions.

    Wording Agreed to by Both ITIC and Science Committees, July 31, 2013

    Work Will Continue as Big Data Taskforce Under Science Committee

  • NAC Committee on IT Infrastructure Recommendation #1

    • Recommendation: To enable NASA to gain experience on emerging leading-edge IT technologies such as:

      • Data-Intensive Cyberinfrastructure,

      • 100 Gbps Networking,

      • GPU Clusters, and

      • Hybrid HPC Architectures,

    we recommend that NASA aggressively pursue partnerships with other Federal agencies, specifically NSF and DOE, as well as public/private opportunities. We believe joint agency program calls for end users to develop innovative applications will help keep NASA at the leading edge of capabilities and enable training of NASA staff to support NASA researchers as these technologies become mainstream.

  • NAC Committee on IT Infrastructure DRAFT* Recommendation #2

    • Recommendation: NASA should formally review the existing national data cyberinfrastructure supporting access to data repositories for NASA SMD missions. A comparison with best-of-breed practices within NASA and at other Federal agencies should be made. We request a briefing on this review to a joint meeting of the NAC IT Infrastructure, Science, and Education committees within one year of this recommendation. The briefing should contain recommendations for a NASA data-intensive cyberinfrastructure to support science discovery by both mission teams, remote researchers, and for education and public outreach appropriate to the growth driven by current and future SMD missions.

    * To be completed after a joint meeting of ITIC, Science, and Education Committees in July 2012 and the final recommendation submitted to July 2012 NAC meeting

  • NAC Committee on IT Infrastructure Recommendation #2 (continued)

    • Major Reasons for the Recommendation: NASA data repository and analysis facilities for SMD missions are distributed across NASA centers and throughout U.S. universities and research facilities. There is considerable variation in the sophistication of the integrated cyberinfrastructure supporting scientific discovery, educational reuse, and public outreach across SMD subdivisions. The rapid rise in the last decade of “mining data archives” by groups other than those funded by specific missions implies a need for a national-scale cyberinfrastructure architecture that can allow for free-flow of data to where it is needed. Other agencies, specifically NSF’s Ocean Observatories Initiative Cyberinfrastructure program, should be used as a benchmark for NASA’s data-intensive architecture.

    • Consequences of No Action on the Recommendation: The science, education, and public outreach potential of NASA’s investment in SMD space missions will not be realized.

  • ITIC Finding

    • SMD Data Resides in Highly Distributed Servers

      – Many Data Storage and Analysis Sites Are Outside NASA Centers

      – Access to Entire Research Community Essential

        • Over Half of Science Publications Are From Using Data Archives

        • Secondary Storage Needed in Cloud with High Bandwidth and User Portal

      – Education and Public Outreach of Data Rapidly Expanding

        • Images/Videos for Public Relations

        • Apps for Smart Phones

        • Crowd Sourcing

    [Diagram labels: PI, Research Community, Education, Public Outreach]

  • Partnering Opportunities with DOE: ARRA Stimulus Investment for DOE ESnet

    National-Scale 100Gbps Network Backbone

    Source: Presentation to ESnet Policy Board

  • Global Partnering Opportunities: The Global Lambda Integrated Facility

    Research Innovation Labs Linked by 10 Gbps Dedicated Networks

    www.glif.is/publications/maps/GLIF_5-11_World_2k.jpg

  • SMD is a Growing NASA HPC User Community

    [Chart not reproduced; part of the plotted range is labeled "projected".]

    Source: Tsengdar Lee, Mike Little, NASA

  • EOS-DIS Data Products Distribution Approaching ½ Billion/Year!

  • NAC – Information Technology Infrastructure Committee

  • Solar Dynamics Observatory 4096x4096 AIA Camera – 57,600 Images/Day

    JSOC is Archiving ~5 TB/day From 6 Cameras

    Leads to over 1 Petabyte per year!

    March 6, 2012 X5.4 Flare from Sunspot AR1429 Captured by the Solar Dynamics Observatory (SDO) in the 171 Angstrom Wavelength

    Credit: NASA/SDO/AIA
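
    The figures on this slide can be sanity-checked with a little arithmetic. The sketch below is only a back-of-envelope estimate: the frame size, image count, and ~5 TB/day archive rate come from the slide, while the 2 bytes per pixel (uncompressed 16-bit data), the 365-day year, and decimal terabytes are assumptions made here for illustration. The function names are hypothetical.

```python
# Back-of-envelope check of the SDO figures quoted on the slide.
PIXELS_PER_IMAGE = 4096 * 4096   # AIA frame size, from the slide
IMAGES_PER_DAY = 57_600          # from the slide
ARCHIVE_TB_PER_DAY = 5.0         # approximate JSOC archive rate, from the slide

BYTES_PER_PIXEL = 2              # ASSUMPTION: uncompressed 16-bit pixels
DAYS_PER_YEAR = 365              # ASSUMPTION: archiving runs every day
BYTES_PER_TB = 1e12              # ASSUMPTION: decimal terabytes
TB_PER_PB = 1000


def raw_image_tb_per_day(bytes_per_pixel: int = BYTES_PER_PIXEL) -> float:
    """Uncompressed volume of the 4096x4096 images, in TB per day."""
    return PIXELS_PER_IMAGE * IMAGES_PER_DAY * bytes_per_pixel / BYTES_PER_TB


def archive_pb_per_year(tb_per_day: float = ARCHIVE_TB_PER_DAY) -> float:
    """Yearly archive growth in PB for a given daily rate in TB."""
    return tb_per_day * DAYS_PER_YEAR / TB_PER_PB


if __name__ == "__main__":
    print(f"Raw image volume: {raw_image_tb_per_day():.2f} TB/day")
    print(f"Archive growth:   {archive_pb_per_year():.3f} PB/year")
```

    Under these assumptions the raw images alone come to roughly 1.9 TB/day, and a sustained 5 TB/day archive rate gives about 1.8 PB/year, consistent with the slide's "over 1 Petabyte per year".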

  • Multi-Mission Data Archives at STScI Will Continue to Grow - Doubling by 2018

    Cumulative Petabyte Over 20 Years

    [Chart: archive volume in terabytes (axis 0 to 1000) by calendar year, 1994 to 2018, with the later years marked as projected. Series: HST, HLA, GALEX, KEPLER, Other, JWST S&IT, JWST.]

  • 32 of the 200+ Apps in the Apple iStore that Return from a Search on “NASA”

  • Crowdsourcing Science: Galaxy Zoo and Moon Zoo Bring the Public into Scientific Discovery

    More than 250,000 people have taken part in Galaxy Zoo so far. In the 14 months the site was up, Galaxy Zoo 2 users helped us make over 60,000,000 classifications. Over the past year, volunteers from the original Galaxy Zoo project created the world's largest database of galaxy shapes.

    www.galaxyzoo.org

  • Finding #1

    • The U.S. government has issued several new guidance documents and directives on open data:

      – OSTP, February 22, 2013: Increasing Access to the Results of Federally Funded Scientific Research

      – OSTP, March 29, 2013: Big Data is a Big Deal

      – Presidential Executive Order, May 9, 2013: Open Data Policy-Managing Information as an Asset

  • White House Big Data Initiative

    • National Science Foundation

    • National Institutes of Health

    • Department of Defense

    • Department of Energy

    • U.S. Geological Survey

  • NAC Information Technology Infrastructure Committee

  • Re-organization of the NASA Advisory Council (Memo signed April 28, 2014)

    The NASA Administrator shall establish the following Council committees, subcommittees, and task forces:

    – Aeronautics Committee.

    – Human Exploration and Operations Committee.

    – Science Committee.

      • Astrophysics Subcommittee.

      • Earth Science Subcommittee.

      • Heliophysics Subcommittee.

      • Planetary Protection Subcommittee.

      • Planetary Science Subcommittee.

      • Ad Hoc Task Force on Big Data.

    – Technology, Innovation, and Engineering Committee.

    – Institutional Committee.

    – Ad Hoc Task Force on Science, Technology, Engineering, and Mathematics (STEM) Education.

  • Timeline

    • ITIC in existence – April 2010 – Dec 2013

    • NAC reorganized – April 2014

      – Science Committee to have a Big Data Task Force

    • BDTF Terms of Reference signed Jan 8, 2015

    • BDTF members appointed Dec 2015

    • SMD appoints Exec. Sec. who solicits feedback from the Committee members and subcommittees

    • 1st meeting of BDTF – Feb 16, 2016

  • NAC Committee on IT Infrastructure Recommendation #1 July 31, 2013

    • Recommendation: The NASA NAC ITIC & Science Committees should collaboratively explore the existing and planned evolution of NASA’s science data cyberinfrastructure that supports broad access to data repositories for NASA SMD missions. This exploration should be undertaken in the context of effective practices within NASA, other Federal agencies, as well as industry and research institutions.

    Wording Agreed to by Both ITIC and Science Committees, July 31, 2013

    Work Will Continue as Big Data Taskforce Under Science Committee

  • Need I say more?
