Aspera bt-big-data-cloud
Enabling The Big Data Cloud for HPC and Collaboration With High-Speed Data Transport
PRESENTER AND AGENDA
PRESENTER
Daniel Kumi, Director, New Market Development
[email protected]
AGENDA
• Who and Why Aspera?
• WAN Transport
• Wireless Transport
• Customer Use Cases
• Cloud and Big Data – Transfer Challenges for HPC and Collaboration
• Aspera On Demand
• BT-Aspera Discussion
ASPERA’S MISSION
Creating next-generation transport technologies
that move the world’s digital assets at maximum speed,
regardless of file size, transfer distance and network conditions.
Aspera: moving the world’s digital assets at maximum speed
Expanded to Asia PAC and Latin America through direct and channel
50% YOY growth in revenue and employees
Over 10,000 licenses sold, and over 1,500 customers worldwide
Patents issued or pending in 32 countries
Continuing to innovate: fasp3™, fasp-MC™, mobile transport, cloud enablement
Aspera Ecosystem of Partners
Life Sciences
BIG DATA TRANSFER CHALLENGE
What Happened to my Bandwidth?
Paris – Seattle WAN link:
• 1000 Mbps
• 170 ms RTT
• 0.001% packet loss rate
WAN throughput is 1000 Mbps
Max TCP throughput ~29 Mbps
Where’s my 970 Mbps?
At 29 Mbps:
• 50 GB transfer will take 4 hrs
• 1 TB transfer will take 3.3 days
BIG-DATA and WAN TRANSFER WITH TCP
TCP WAS DESIGNED IN THE EARLY 80’S
• When data was small & bandwidth was limited
• Fantastic for reliable data delivery
• Not fast enough for big-data
TCP IS THE ENGINE THAT DRIVES
• FTP, HTTP & HTTPS
• RSYNC, SCP & DICOM
• CIFS & NFS
TCP DOES NOT LIKE NETWORK LATENCY / RTT
• Geographic distance increases latency
• Network congestion increases latency
TCP DOES NOT LIKE PACKET LOSS
• Loss is caused by congestion
• Different network capacity
• Wireless and satellite communications
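The deck does not say how the ~29 Mbps TCP ceiling was obtained. One common way to see why latency and loss, rather than link capacity, bound a single TCP stream is the Mathis et al. steady-state approximation. The sketch below applies it to the 170 ms / 0.001% link described above. The 1460-byte segment size and the constant C = sqrt(3/2) are assumptions, not values from the slides, and the function name is illustrative.

```python
import math


def mathis_tcp_ceiling_mbps(rtt_s: float, loss_rate: float, mss_bytes: int = 1460) -> float:
    """Approximate steady-state throughput of one TCP stream, in Mbps.

    Mathis et al. approximation: rate <= (MSS / RTT) * C / sqrt(p),
    with C = sqrt(3/2). The MSS default is an assumption.
    """
    c = math.sqrt(3 / 2)
    bits_per_second = (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)
    return bits_per_second / 1e6


# Link from the slide: 170 ms RTT, 0.001% packet loss (p = 0.00001).
estimate = mathis_tcp_ceiling_mbps(rtt_s=0.170, loss_rate=0.00001)
print(f"single-stream TCP ceiling: {estimate:.1f} Mbps")

# Doubling the RTT halves the ceiling; link capacity never enters the formula.
doubled_rtt = mathis_tcp_ceiling_mbps(rtt_s=0.340, loss_rate=0.00001)
print(f"at 340 ms RTT: {doubled_rtt:.1f} Mbps")
```

Under these assumptions the estimate lands in the same range as the slide's figure, a few tens of Mbps on a 1000 Mbps link. The exact value shifts with the segment size and the constant chosen.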
The Aspera Solution
So if TCP doesn’t work, what’s the answer?
WAN is 1000Mbps
Max TCP Throughput ~29Mbps
Max Aspera Throughput ~995Mbps (gain of x34)
ROI measured in $$ cost of not using 971Mbps
Same WAN Scenario with Aspera
Paris – Seattle WAN link:
• 1000 Mbps
• 170 ms RTT
• 0.001% packet loss rate
At 29 Mbps (TCP):
• 50 GB transfer will take ~4 hrs
• 1 TB transfer will take 3.3 days
At 995 Mbps (Aspera):
• 50 GB transfer will take ~7 mins
• 1 TB transfer will take 2.4 hrs
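The transfer times quoted for the two rates follow from dividing file size by sustained throughput. The sketch below reproduces that arithmetic, assuming decimal units (1 GB = 10^9 bytes) and no protocol overhead. The slides do not state their unit convention, so results can differ slightly from the rounded figures. The helper names are illustrative.

```python
GB = 10**9   # decimal units are an assumption
TB = 10**12


def transfer_seconds(size_bytes: int, rate_mbps: float) -> float:
    """Time to move size_bytes at a sustained rate given in megabits per second."""
    return size_bytes * 8 / (rate_mbps * 1e6)


def describe(seconds: float) -> str:
    """Render a duration in the largest convenient unit."""
    if seconds < 3600:
        return f"{seconds / 60:.1f} min"
    if seconds < 86400:
        return f"{seconds / 3600:.1f} hrs"
    return f"{seconds / 86400:.1f} days"


for label, rate_mbps in (("TCP", 29), ("Aspera", 995)):
    for name, size in (("50 GB", 50 * GB), ("1 TB", TB)):
        duration = describe(transfer_seconds(size, rate_mbps))
        print(f"{label:6s} @ {rate_mbps:>3} Mbps  {name:>5}: {duration}")

gain = 995 / 29
print(f"throughput gain: x{gain:.0f}")
```

Because time is inversely proportional to rate, the x34 throughput gain translates directly into transfers that finish about 34 times sooner.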
FASP™ — HIGH-PERFORMANCE DATA TRANSPORT
MAXIMUM LINE-RATE WAN TRANSFER SPEED
• Transfer performance scales with bandwidth, independent of transfer distance and resilient to packet loss
• Optimal end-to-end throughput efficiency
CONGESTION AVOIDANCE AND POLICY CONTROL
• Automatic, full utilization of available bandwidth
• On-the-fly prioritization and bandwidth allocation
UNCOMPROMISING SECURITY AND RELIABILITY
• Secure user/endpoint authentication
• AES-128 cryptography in transit & at rest
SCALABLE MANAGEMENT, MONITORING AND CONTROL
• Real-time progress, performance and bandwidth utilization
• Detailed transfer history, logging, and manifest
ENTERPRISE-CLASS FILE DELIVERY
• Transfers up to thousands of times faster than FTP/HTTP(S)
• Precise and predictable transfer times
• Extreme scalability (concurrency and throughput)
fasp Bandwidth ROI
FTP      Across US     US – EU       US – ASIA     Satellite
1 GB     1 – 2 hrs     2 – 4 hrs     4 – 20 hrs    8 – 20 hrs
10 GB    15 – 20 hrs   20 – 40 hrs   Impractical   Impractical
100 GB   Impractical   Impractical   Impractical   Impractical
fasp™    2 Mbps     10 Mbps    45 Mbps    100 Mbps   200 Mbps   1 Gbps
1 GB     70 min.    14 min.    3.2 min.   1.4 min.   42 sec.    8.4 sec.
10 GB    11.7 hrs   140 min.   32 min.    14 min.    7 min.     1.4 min.
100 GB              23.3 hrs   5.3 hrs    2.3 hrs    1.2 hrs    14 min.
FTP: Limited by Distance & Packet Loss, Not B/W
Aspera: Scales Linearly with Bandwidth
Distance & Packet Loss Independent
FASP vs TCP PERFORMANCE
6 Gbps Scalable WAN Throughput
~6 Gbps Big-Data Throughput
• Latency independent
• Loss independent
x3000 improvement vs. TCP
• 1 TB data moved in 20 min
• 2 days with TCP over LAN conditions
Scale to ~10Gbps with IQ Accelerator
High Speed Mobile Data Transfer with fasp-AIR™
fasp-AIR SDK – maximum data transfer speed and predictability for mobile devices
• Embeddable software library that lets app developers integrate superior transport capabilities into their own applications, such as faster and more predictable downloads and uploads.
• Available for Android and iOS on Aspera Developer Network
• Designed for wireless networks with high latency and high packet loss
• Integrated transfer queuing, pause, resume and progress reporting
• Achieves significant performance improvements for upload and download speeds over 3G, 4G and 802.11 g/n.
fasp-AIR Benchmarks on Verizon 4G
In some cases (highlighted in orange), speeds will vary greatly, depending on available bandwidth and the underlying condition of the wireless network.
CUSTOMER USE CASES: NCBI/NIH, HUTCHINSON
Large-scale Global Collaboration: 1000 Genomes
Petabytes of data transferred monthly
• Files range in size from KBs to many GBs
Repository contents
• 2,500 genomes from 27 populations
• Several types of variations: SNPs, small insertions and deletions, structural variants, and copy number variants
Available on web - 4 locations
• 1000genomes.org, AWS, NCBI, and EBI websites
• Technology web sites use:
  • Aspera Connect Server
  • Aspera Developers’ Network and SDK
• Researchers across all locations use:
  • Aspera Connect client (freely distributable with server license)
[Diagram labels: NIH, Upload/Download, Data, Cloud]
Researcher to Researcher Collaboration
Use case : Genomic research
Genomic research results sharing
• Research made available to collaborators
• Research published globally
Workflow
• Illumina > Storage > Researcher > Aspera
• Publish one-to-many
Collaboration options
• Person-to-person, one-to-many (faspex server)
• Publish-subscribe (faspex or connect server)
Seattle
Faspex in use by world-renowned Cancer Research Center in Seattle, WA
CLOUD COMPUTING & BIG DATA
CLOUD COMPUTING: WHY IS IT SO COMPELLING?
THE POTENTIAL OF INFINITE COMPUTING RESOURCES, ON DEMAND
• Eliminates the need to plan ahead
• Allows companies to meet demand
• Without the lead-time bottleneck
THE ELIMINATION OF AN UP-FRONT COMMITMENT
• Reduce capital outlay and investment risk
• Start small & increase h/w resources to match need
• Auto-scale to meet demand
PAY-FOR-USE RESOURCE MODEL
• CPUs by the hour
• Storage by the day
• Bandwidth by the GB
SO? WHAT CAN I DO WITH IT?
STORAGE FOR ARCHIVE & D/R
• Near-line for editing, creative apps and processing
• B2B / B2C data workflow
• Offsite storage for disaster recovery and business continuity
DATA & CONTENT DISTRIBUTION
• OTT, play out, release, project & event specific marketing
• Collaborative data exchange
• CDN and global delivery
DATA PROCESSING & CONTENT CREATION
• Compute intensive: 10’s, 100’s, 1000’s of CPU cores
• Transcoding, rendering, encoding, watermarking
• Big-data analytics & HPC
GETTING IN AND OUT OF THE CLOUD
KNOWING WHEN TO CHOOSE THE RIGHT TOOL
CHALLENGES OF STORING BIG FILES IN THE CLOUD?
BEWARE THE OBJECT STORE:
• Not like traditional NAS or SAN
• Bigger, better, but possibly much more complex
• a.k.a. Google File System, Amazon S3, Hadoop Distributed File System
• Simple read/write of data “blobs”, indexed by a key
• Multiple replicas are distributed across storage for durability and optimized for access
• Should work well for storing large numbers of files
UNDERSTAND CHUNKS, BLOCKS and BLOBS
• You need to deal with chunks, blocks and blobs
• “Chunk” sizes are small (64 MB/128 MB)
• Large media files must be “chunked” (1 TB file = transporting and reassembling 10,000+ chunks!)
• Multi-chunk APIs impede workflow and are complex
• Data I/O uses the standard HTTP(S) protocol
• VERY SLOW at distance
• Single HTTP stream slow even locally (<100 Mbps).
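To make the chunking overhead concrete, the sketch below counts the chunks a 1 TB file breaks into at the two quoted chunk sizes and lists the byte ranges a client would have to send and later reassemble. Decimal units are an assumption, and the function names are illustrative rather than taken from any storage API.

```python
from typing import Iterator, Tuple

MB = 10**6   # decimal units are an assumption
TB = 10**12


def chunk_count(file_size: int, chunk_size: int) -> int:
    """Number of chunks needed, rounding the final partial chunk up."""
    return -(-file_size // chunk_size)


def chunk_ranges(file_size: int, chunk_size: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (index, start_offset, end_offset_exclusive) for every chunk."""
    for index, start in enumerate(range(0, file_size, chunk_size)):
        yield index, start, min(start + chunk_size, file_size)


for chunk_mb in (64, 128):
    count = chunk_count(TB, chunk_mb * MB)
    print(f"1 TB in {chunk_mb} MB chunks: {count} chunks")

# The last chunk is usually shorter than the rest.
last_index, last_start, last_end = list(chunk_ranges(TB, 128 * MB))[-1]
print(f"last 128 MB chunk covers bytes {last_start}..{last_end - 1}")
```

Either chunk size puts the count in the thousands to tens of thousands, which is the scale behind the "10,000+ chunks" remark. Each of those ranges is a separate request that has to be tracked, retried on failure, and stitched back together.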
BIG-DATA SERVICES WILL NEED A HIGH-SPEED BRIDGE TO THE CLOUD
• Large files moved at full bandwidth capacity with global access
• Overcome the WAN and storage bottleneck
• Support files of any size or quantity
• Transparent to the end user/data owner (GUI, command line, API, browser, etc.)
• No hardware to support B2B, B2C, C2B workflow
S3 & BIG-DATA: UNDERSTAND THE CONSTRAINTS
FIRST MAJOR BOTTLENECK: WAN TRANSFER
SECOND MAJOR BOTTLENECK: LOCAL HTTP I/O (DATA CENTER)
S3 & BIG-DATA: MEET ASPERA’s DIRECT-TO-S3
[Diagram labels: client, cargo downloader, mobile apps, connect plug-in, point-to-point]
OVERCOMING BOTH BOTTLENECKS
#1: TRANSFER DATA TO EC2 OVER WAN (EFFECTIVE THROUGHPUT)
• http transfer over WAN (single stream): <10 Mbps
  • Typical internet conditions
  • 50–250 ms latency & 0.1–3% packet loss
• 15 parallel http streams: <10 to 100 Mbps
• Aspera fasp transfer over WAN to EC2: up to 1 Gbps (per EC2 Extra Large Instance)
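As a rough plausibility check on the HTTP figures, the sketch below estimates a single TCP stream's ceiling at both ends of the quoted range of internet conditions, then multiplies by 15 streams. It uses the Mathis et al. approximation with an assumed 1460-byte segment size. The model is known to be optimistic at loss rates of several percent, and treating parallel streams as perfectly additive is also an assumption.

```python
import math


def tcp_stream_ceiling_mbps(rtt_ms: float, loss_pct: float, mss_bytes: int = 1460) -> float:
    """Mathis-style estimate for one TCP stream: (MSS / RTT) * sqrt(3/2) / sqrt(p)."""
    rtt_s = rtt_ms / 1000
    loss_rate = loss_pct / 100
    return (mss_bytes * 8 / rtt_s) * math.sqrt(3 / 2) / math.sqrt(loss_rate) / 1e6


STREAMS = 15
conditions = {"best case": (50, 0.1), "worst case": (250, 3.0)}
estimates = {}
for label, (rtt_ms, loss_pct) in conditions.items():
    single = tcp_stream_ceiling_mbps(rtt_ms, loss_pct)
    estimates[label] = (single, single * STREAMS)
    print(f"{label}: {rtt_ms} ms, {loss_pct}% loss -> "
          f"1 stream ~{single:.1f} Mbps, {STREAMS} streams ~{single * STREAMS:.1f} Mbps")
```

Under these assumptions a single stream stays below 10 Mbps across the whole range, and 15 streams span from a few Mbps to roughly the 100 Mbps order, which is broadly in line with the slide.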
#2: TRANSFER DATA FROM EC2 TO S3 (EFFECTIVE THROUGHPUT)
• Standard single stream http: 10 to 100 Mbps
• Aspera S3 Proxy with parallel I/O http streams: up to 1 Gbps (per EC2 Extra Large Instance)
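The general idea behind parallel I/O streams is to split one large object into parts and write several parts concurrently, then assemble them in order. The sketch below shows that pattern against a hypothetical in-memory store. `InMemoryObjectStore`, `put_part` and `complete` are illustrative stand-ins, not Aspera's S3 proxy or the S3 API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


class InMemoryObjectStore:
    """Hypothetical stand-in for an object store that accepts numbered parts."""

    def __init__(self) -> None:
        self._parts: Dict[int, bytes] = {}
        self._lock = threading.Lock()

    def put_part(self, part_number: int, body: bytes) -> None:
        with self._lock:
            self._parts[part_number] = body

    def part_count(self) -> int:
        return len(self._parts)

    def complete(self) -> bytes:
        """Assemble the object from its parts in part-number order."""
        return b"".join(self._parts[n] for n in sorted(self._parts))


def split_parts(data: bytes, part_size: int) -> List[Tuple[int, bytes]]:
    """Cut data into (part_number, body) pairs, numbered from 1."""
    offsets = range(0, len(data), part_size)
    return [(i + 1, data[off:off + part_size]) for i, off in enumerate(offsets)]


def parallel_put(data: bytes, store: InMemoryObjectStore, part_size: int, streams: int) -> bytes:
    """Write all parts using up to `streams` concurrent workers, then assemble."""
    with ThreadPoolExecutor(max_workers=streams) as pool:
        futures = [pool.submit(store.put_part, n, body)
                   for n, body in split_parts(data, part_size)]
        for future in futures:
            future.result()  # re-raise any failure from a worker
    return store.complete()


payload = bytes(range(256)) * 4096          # 1 MiB of test data
store = InMemoryObjectStore()
assembled = parallel_put(payload, store, part_size=64 * 1024, streams=8)
print(store.part_count(), "parts, intact:", assembled == payload)
```

Parts may finish in any order, so they are keyed by part number and only joined at the end. With a real store each worker would hold its own HTTP connection, which is where the aggregate throughput gain comes from.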
ASPERA + AWS | ~10 TB transferred per 24 hours | PER EC2 INSTANCE
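The ~10 TB per day figure is what a sustained 1 Gbps works out to over 24 hours. The sketch below shows that conversion and a simple scale-out estimate. Decimal terabytes and full link utilization are assumptions, and the 100 TB/day target is an illustrative number, not one from the slides.

```python
import math

SECONDS_PER_DAY = 86_400


def terabytes_per_day(rate_gbps: float, utilization: float = 1.0) -> float:
    """Daily volume in decimal TB for a sustained rate in gigabits per second."""
    bits_per_day = rate_gbps * 1e9 * SECONDS_PER_DAY * utilization
    return bits_per_day / 8 / 1e12


def instances_needed(target_tb_per_day: float, rate_gbps_per_instance: float = 1.0) -> int:
    """Instances required to reach a daily target if capacity scales linearly."""
    return math.ceil(target_tb_per_day / terabytes_per_day(rate_gbps_per_instance))


per_instance = terabytes_per_day(1.0)
print(f"1 Gbps sustained: {per_instance:.1f} TB per 24 hours")
print(f"instances for 100 TB/day: {instances_needed(100)}")
```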
ASPERA DIRECT-TO-S3 — LINE RATE ACCESS TO THE CLOUD
UNRIVALED ASPERA PERFORMANCE
• Built on Aspera fasp™ technology for maximum transfer speed, regardless of file size, transfer distance and network conditions
• Precise bandwidth control ensures the available bandwidth is utilized to achieve maximum transfer speeds, while being fair to other business-critical network traffic
SEAMLESS INTEGRATION WITH S3
• Integrated with S3 multi-part HTTP for maximum “last foot” performance
• Simple configuration of S3 credentials, for both shared and dedicated docroot
• Transfers directly into S3 are seamless and transparent to user
ENTERPRISE-GRADE SECURITY AND RELIABILITY
• Secure authentication with encryption in transit & at rest (AES-128, FIPS 140-2, HIPAA compliant)
• Packet-level data integrity verification
• Automatic resume of partial or failed transfers
• Full support for AWS S3 server-side encryption at rest
INTEROPERATES WITH ALL ASPERA HOST OPTIONS
• Any platform (Windows, Linux, Mac, UNIX, iOS, Android)
• Any Aspera Clients (CLI, Desktop, Point-to-Point, Mobile, Web, Embedded)
• Any Aspera Servers (Enterprise, Connect, faspex)
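The slides do not describe how resume and integrity verification are implemented. As a generic illustration of the two ideas, the sketch below resumes an interrupted file copy from the byte offset already present at the destination and then compares SHA-256 digests of source and destination. The function names and the simulated failure are hypothetical, and the sketch assumes the bytes already written are intact.

```python
import hashlib
import os
import tempfile
from typing import Optional


def sha256_of(path: str, block: int = 1 << 20) -> str:
    """Hex SHA-256 digest of a file, read in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(block), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resume_copy(src: str, dst: str, block: int = 1 << 16,
                fail_after: Optional[int] = None) -> int:
    """Append to dst starting at its current size; return the resume offset.

    fail_after simulates a network drop after that many bytes in this attempt.
    """
    offset = os.path.getsize(dst) if os.path.exists(dst) else 0
    sent = 0
    with open(src, "rb") as fin, open(dst, "ab") as fout:
        fin.seek(offset)
        while True:
            chunk = fin.read(block)
            if not chunk:
                break
            if fail_after is not None and sent >= fail_after:
                raise ConnectionError("simulated network drop")
            fout.write(chunk)
            sent += len(chunk)
    return offset


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "source.bin")
    dst = os.path.join(tmp, "dest.bin")
    with open(src, "wb") as handle:
        handle.write(os.urandom(1_000_000))

    try:
        resume_copy(src, dst, fail_after=300_000)   # first attempt is interrupted
    except ConnectionError:
        pass
    partial_size = os.path.getsize(dst)

    resumed_from = resume_copy(src, dst)            # second attempt picks up where it stopped
    verified = sha256_of(src) == sha256_of(dst)

print(f"resumed at byte {resumed_from}, verified: {verified}")
```

A whole-file digest is the simplest check. Verifying per block or per packet, as the slide describes, would catch corruption earlier and make it possible to resend only the damaged portion.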
ASPERA FOR AWS: DIRECT-TO-S3
[Diagram: Client (Dallas, TX) to Herndon, VA; labels: Aspera Client, fasp, Aspera Transfer Server, HTTP multipart, Scale out]
1. Upload using typical multi-part HTTP client
2. fasp high-speed upload Direct-to-S3
HYBRID CLOUD DEPLOYMENT (PUBLIC/PRIVATE)
Shares app transparently communicates with Aspera server Nodes in cloud and in enterprise
User browses content across authorized shares
High-speed data transfers with Datacenter
High-speed data transfers with Direct-to-S3
[Diagram: Client (NY, NY); Datacenter (Emeryville, CA); Herndon, VA; labels: fasp, Shares, Node, DMZ]
ASPERA SOFTWARE ON DEMAND
KEY FEATURES
• On demand high-performance data transport to and from remote infrastructures
• Unlimited scale out of transfer capacity with additional AMIs
• Support for all Aspera Server software and use cases
• Additional Client Options: Mobile, Outlook Plug-in & Cargo (Aspera faspex)
• Flexible Storage Options: Local, EBS, AWS S3
• Seamlessly interoperates with on-premise Aspera deployments
• Integrated Management and Monitoring
APPLICATIONS AND USE CASES
• High Performance Computing On Demand
• Content Aggregation, Transformation and Distribution
• Time-boxed event or project-based collaboration, ad-hoc distribution or content ingest
Aspera Console: Global transfer monitoring, reporting & control
Aspera Shares: Global person-to-person file transfer & exchange
Aspera faspex: Global person-to-person file ingest & distribution
Aspera Server: Universal file transfer server, supports desktop, web, mobile & embedded
Aspera software product & technology portfolio
Transport
Distribute
Complete portfolio of servers and end point clients for high-speed digital content delivery and distribution.
Enterprise and Connect Server
• Universal file transfer server and web-based interface and directory listing
Client and Point-to-point
• Uni- and bi-directional transfer clients
Connect
• Web browser plug-in for high-speed uploads and downloads
Mobile
• High-speed transfer for mobile devices
Sync
• Highly scalable, multidirectional file replication and synchronization
Collaborate
Global person-to-person and project-based exchange and collaboration of files and directories, of any size, over any distance, over any network.
faspex Server
• Secure digital delivery and collaborative file transfers with remote users and partners
• Integrated e-mail notifications for delivery and successful download
• Comprehensive administration, user management & access control
faspex Multi-Server / HA
• Automated bi-directional relays between sites and multiple servers
• 3-tier architecture with support for clustering and high availability
Cargo
• Automated client downloads
Automate
Web-based application and SDK for creating and managing automated workflows, from simple file forwarding, to complex process orchestration.
Orchestrator
• Intuitive graphical workflow designer
• File processing decision tree and flow
• Rich and flexible plug-in architecture for third-party process integration
• Comprehensive library of plug-ins for transcoding, virus checking, quality checking, archive, notifications
• High volume processing
• Detailed dashboard, workflow, and step-level progress reporting.
• Open development framework for designing and integrating processing and automation pipelines
Our unique, patented transport technologies provide unparalleled speed, efficiency, concurrency and bandwidth control over any size, distance, and network
fasp™: Patented, file-based bulk data transport
fasp-AIR™: Uploads and downloads over 3G, LTE and Wi-Fi networks
fasp3™: Next-gen protocol for any bulk data
fasp-MC™: High-speed delivery over multicast
Aspera On-Demand S3|Direct: High-speed transfer direct to cloud storage (S3)
Console transport management: Centralized web-based management, monitoring, and reporting
Aspera fasp™ software environment
ASPERA DEVELOPER NETWORK
A complete set of SDKs provides developers with guides, reference information, and sample code to assist them with integrating Aspera technology into their own applications. Aspera fasp™ technology can be used in desktop, network-based, and web applications in place of FTP, HTTP, or custom TCP-based copy protocols.
ASPERA MOBILE APIs
Android SDK: Aspera Android SDK provides a Java API to transfer files using fasp-AIR™.
iPhone SDK: Aspera iPhone SDK provides an Objective-C API to transfer files using fasp-AIR.
ASPERA APPLICATION APIs
faspex™ Web API: The Aspera faspex Web API provides a set of services that enables users to create and receive digital deliveries via a Web interface, while taking advantage of fasp high-speed transfer technology.
OTHER INFORMATION
Supporting Tools and Libraries: Supporting tools and libraries let you perform other common tasks surrounding file transfers.
General Reference: Reference on error codes, log file locations, configuration files and more.
ASPERA TRANSFER APIs
Aspera Web Services: A SOAP-based web service API that allows initiation, monitoring and control of fasp-based file transfers.
Aspera Web: JavaScript API exposed by the Aspera Connect client. It allows integration of fasp-based file transfers into web applications.
Connect 2.8 Developer Preview 2: Introducing the new Connect 2.8 developer preview! Integrate the functionality of Aspera Connect 2.8, a fasp-based file transfer client, into your own web applications, while customizing it to your unique brand.
fasp Manager: A class library that allows initiation, monitoring and control of fasp-based file transfers.
Aspera Multicast SDK: A Java class library that allows initiation and management of IP multicast-based data transmissions using Aspera fasp-MC™.
Aspera software product & technology portfolio
Transport
Distribute
Complete portfolio of servers and clients for high-speed data delivery and distribution.
Enterprise and Connect Server
• Universal file transfer server and web-based interface and directory listing
Client and Point-to-point
• Uni- and bi-directional transfer clients
Connect
• Web browser plug-in
Mobile
• High-speed transfer for mobile devices
Sync
• Highly scalable, multidirectional file replication and synchronization
Collaborate
Global person-to-person and project-based exchange and collaboration of files and directories.
faspex™ Server
• Secure digital delivery and collaborative file transfers with remote users and partners
• Web, email, mobile client options
• Comprehensive administration, user management & access control
faspex™ Multi-Server / HA
• Automated bi-directional relays between sites
• 3-tier architecture with support for clustering, HA
Cargo
• Automated package downloads
Automate
Web-based application and SDK for creating and managing automated file-based workflows.
Orchestrator
• Intuitive graphical workflow designer
• File processing decision tree and flow
• Rich and flexible plug-in architecture for third-party process integration
• Comprehensive library of plug-ins for transcoding, A/V, QC, archive, notifications
• High volume processing
• Detailed dashboard, workflow, and step-level progress reporting.
• Open development framework for designing and integrating automation pipelines
BT-ASPERA DISCUSSION
THANK YOU!
Daniel Kumi, Director, New Market Development
[email protected]