Talend Data Fabric Security architecture overview


  • Contents

    Summary
    Talend architecture
        Talend Management Console
        Talend Data Inventory
        Talend Data Preparation
        Talend Data Stewardship
        Talend API Designer
        Talend API Tester
        Talend Pipeline Designer
    Hybrid infrastructure
    Talend Data Fabric infrastructure
        Computation resources
        Data storage
            Data that we collect
            Data that customers process with Talend Data Fabric
        Network
    Data flows
        Data flows between Talend Studio and Talend Data Fabric
        Data flows between Talend Studio jobs and Talend Data Fabric
        Data flows between Remote Engine and Talend Data Fabric
            MSG service
            Repository service
            Remote Engine pairing service
            Data transfer service
            Log transfer service
            Vault gateway service
            Remote Engine Gen2 service
        Data flows in hybrid deployment between Talend Data Preparation, Talend Data Stewardship, Talend Dictionary Service, and Talend Data Fabric
        Public APIs
    Security at Talend
        Physical security
        Security training
        Secure software development
        Cloud workload protection and monitoring
        Authentication, authorization, and access control
            Standard access
            Administrative access
            Password management
        Key management
            On AWS
            On Azure
        Vulnerability management
        Backups
        Disaster recovery and business continuity
        Security certifications

  • Summary

    Talend Data Fabric (https://www.talend.com/products/data-fabric/) is a managed cloud integration platform that makes it easy for developers and data constituents to collect, transform, and clean data. Talend leverages security and privacy best practices to protect both the Talend platform and Talend, the company. Talend implements a combination of policies, procedures, and technologies to ensure your data is protected and secured. Talend’s chief information security officer (CISO) defines the Talend security strategy, architecture, and program. This document provides an overview of the Talend internal architecture and our policies and procedures as they pertain to employee, physical, network, infrastructure, platform, architecture, and data security.

  • Talend architecture

    Talend Data Fabric is a multitenant integration environment that allows businesses to collect, govern, transform, and share data. All managed components are hosted on either Amazon Web Services (AWS) or Microsoft Azure, according to customer preference.

    Talend Data Fabric comprises the following applications:

    • Talend Management Console

    • Talend Data Inventory

    • Talend Data Preparation

    • Talend Data Stewardship

    • Talend API Designer

    • Talend API Tester

    • Talend Pipeline Designer

    Additionally, Talend Studio, which runs on a local workstation, allows users to design data integration flows (or Talend Jobs) and publish them to Talend Data Fabric.

    Here is an overview of Talend’s functional architecture.

    Figure 1: Talend functional architecture

  • The table below summarizes where each application is available or can be installed. All Talend Data Fabric applications are available on AWS and Azure. Some components can optionally be installed in a hybrid configuration, residing on customer infrastructure. Please refer to the Hybrid Infrastructure section below for more details.

    Component                    Amazon Web Services   Azure   Hybrid Installation
    Talend Management Console    Yes                   Yes     N/A
    Talend Data Inventory        Yes                   Yes     N/A
    Talend Data Preparation      Yes                   Yes     Yes
    Talend Data Stewardship      Yes                   Yes     Yes
    Talend API Designer          Yes                   Yes     N/A
    Talend API Tester            Yes                   Yes     Yes
    Talend Pipeline Designer     Yes                   Yes     N/A

    Each of the following sections briefly describes a Talend Data Fabric application and gives an overview of its functional architecture. Please refer to our website at www.talend.com for more details about each application and terms used throughout the document.

    Talend Management Console

    Talend Management Console (TMC) is a browser-based application that provides access to all Talend Data Fabric applications and components, as well as the administrative features and configurations that surround them.

    TMC lets users schedule the execution of Talend Jobs via discrete components called execution engines. There are two types of engines:

    • Cloud Engines are fully managed components that are provisioned, deployed, and controlled by Talend within our platform. Cloud Engines do not share jobs from multiple tenants; they are provisioned at execution time (per job schedule), per tenant.

    • Remote Engines are execution agents deployed and managed by customers on their own systems, within their own physical or virtual (cloud) networks.

  • Talend Data Inventory

    Talend Data Inventory provides automated tools for dataset documentation, quality proofing, and promotion. It identifies data silos across data sources and targets to provide visualization of reusable and shareable data assets.

    Figure 2: Talend Data Inventory functional architecture

  • Talend Data Preparation

    Talend Data Preparation (TDP) allows customers to simplify and speed up the process of preparing data for analysis and other tasks. TDP allows customers to create, update, remove, and share datasets, then create preparations on top of the datasets that can be incorporated into Talend Jobs with Talend Studio.

    Figure 3: Talend Data Preparation functional architecture

    Figure 4: Talend Data Preparation functional architecture in hybrid deployment

  • Talend Data Stewardship

    Talend Data Stewardship (TDS) allows customers to collaboratively curate, validate, and resolve conflicts in data, as well as address potential data integrity issues.

    Figure 5: Talend Data Stewardship functional architecture

    Figure 6: Talend Data Stewardship functional architecture in hybrid deployment

  • Talend API Designer

    Talend API Designer lets users design APIs collaboratively and visually, then run simulations to test APIs and generate reference documentation.

    Talend API Tester

    Talend API Tester lets users automatically generate test cases from API contracts, then field-test APIs by grouping together tests that simulate real-world examples. Users can integrate unit tests into a managed CI/CD process to ensure quality.

    Figure 7: Talend API Services functional architecture

  • Talend Pipeline Designer

    Figure 8: Talend Pipeline Designer functional architecture

    Talend Pipeline Designer (TPD) allows customers to design and run data pipelines in the cloud.

    • A data pipeline is a data integration process: a series of transformation steps applied to data. It extracts data from customer-specified sources, transforms it step by step using prebuilt processors, and loads it into other datasets (destinations).

    • Data pipelines can be started directly from TPD or scheduled in Talend Management Console.

    • Data pipelines can be executed on Cloud Engines or Remote Engines.
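
    As a rough illustration of the extract, transform, and load sequence described above, the following Python sketch chains two processors over an in-memory source. It is not Talend Pipeline Designer code: the `run_pipeline` helper, the processor functions, and the sample records are all hypothetical.

```python
from typing import Callable, Iterable, Iterator, List

Record = dict
Processor = Callable[[Iterable[Record]], Iterator[Record]]


def drop_missing_email(records: Iterable[Record]) -> Iterator[Record]:
    """Filter processor: keep only records that carry a non-empty email."""
    return (r for r in records if r.get("email"))


def normalize_email(records: Iterable[Record]) -> Iterator[Record]:
    """Map processor: trim and lowercase the email field."""
    return ({**r, "email": r["email"].strip().lower()} for r in records)


def run_pipeline(source: Iterable[Record], processors: List[Processor]) -> List[Record]:
    """Apply each processor in order, then load the result into a destination list."""
    stream: Iterable[Record] = source
    for processor in processors:
        stream = processor(stream)
    return list(stream)


# Hypothetical source dataset (illustrative values only).
source = [
    {"id": 1, "email": "  Ada@Example.com "},
    {"id": 2, "email": ""},
    {"id": 3, "email": "grace@example.com"},
]
destination = run_pipeline(source, [drop_missing_email, normalize_email])
```

    Each processor consumes and yields a stream of records, so steps can be added or reordered without changing the others, which is the property the step-by-step description relies on.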

  • Hybrid infrastructure

    Organizations can deploy Talend in a hybrid configuration, with some components running on-premises and others running on cloud platforms. The only required component for running Talend in a hybrid environment is the Talend Studio development environment, which is installed on local workstations. Users may install additional applications or components in a hybrid configuration:

    • Talend Data Preparation

    • Talend Data Stewardship

    • Talend API Tester: a web browser extension

    • Remote Engine: a Java-based runtime to execute Talend Jobs on-premises or on a cloud platform that the customer controls

    • Remote Engine Gen2: a Docker-based runtime to execute Talend Pipeline Designer data pipelines on-premises or on a cloud platform that the customer controls

  • Talend Data Fabric infrastructure

    Each Talend Data Fabric customer has its own account to access the Talend environment. The account contains the number of users defined by the customer’s license. In the following section, “tenant” is equivalent to account; we use the terms interchangeably.

    Computation resources

    Talend Data Fabric is a multitenant platform, and customers can set up isolated execution environments for computation resources.

    • Remote Engines are deployed by customers on their own systems and therefore serve as computation resources that they manage and control.

    • Cloud Engines are deployed within Talend Data Fabric as separate tenant-specific AWS EC2 or Azure VM instances and are never shared with other tenants. Each tenant gets its own Cloud Engine instance on AWS or Azure.

    The live preview feature of Talend Pipeline Designer, which allows users to preview the output of processors while designing a pipeline, is executed in a dedicated Remote Engine or Cloud Engine.

    Talend Management Console, Talend Data Inventory, and Talend Pipeline Designer give separate computation resources to each tenant.

  • Data storage

    Talend works with two general types of data: data that we collect and data that customers process with the software.

    Data that we collect

    Talend, across its cloud applications, collects only customer information that it needs to provide its services or to manage customer accounts.

    All personally identifiable information that we collect (such as name, country, and email address) is encrypted at rest with AES-256 and in transit over HTTPS (TLS 1.2).
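
    On the client side, a caller can insist on the same transport floor. The sketch below uses Python’s standard `ssl` module to build a context that refuses anything older than TLS 1.2; it illustrates client configuration only and says nothing about how Talend configures its own servers. The `make_tls12_context` name is hypothetical.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and rejects TLS versions below 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls12_context()
```

    The resulting context can be passed to `urllib.request.urlopen(url, context=context)` or to any other standard-library client that accepts an `SSLContext`.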

    Secrets such as passwords, keys, and certificates are managed via third-party technologies and products. We go into more detail about this in the Key Management section below.

    No payment information is stored in Talend Data Fabric. We rely on third-party vendors to collect and manage payment information.

    Data that customers process with Talend Data Fabric

    Whether customers use Remote Engines or Cloud Engines, their datasets remain on systems and data repositories that they manage. Metadata, Designs, Talend Jobs, Artifacts, and any other objects that Talend stores to provide services or for security reasons are isolated via tenant-specific schemas and tenant-specific data encryption keys.

    Network

    To function properly and deliver its services, Talend Data Fabric may need to communicate with external third-party solutions. All communications between Talend Data Fabric and such external solutions need to be authorized and initiated by Talend Data Fabric. No external solution can communicate with Talend Data Fabric unless the communication was initiated by Talend Data Fabric.

    Talend networks and systems are protected via network and application firewalling, visibility mechanisms, and microsegmentation strategies.

  • Data flows

    This section gives an overview of the data flows between Talend Data Fabric applications and components.

    Data flows between Talend Studio and Talend Data Fabric

    The types of data that can be exchanged between Talend Studio and Talend Data Fabric include:

    a) Task artifact binaries

    b) Task artifact metadata (such as context variables and parameters)

    c) Talend API Designer definitions

    Users’ credentials (login name and password or API token generated in TMC) are required to authorize the exchange.

    Metadata is transferred to Talend Data Fabric in the cloud via the following URLs:

    Cloud   Region         Talend Inventory service URL
    AWS     US             https://tmc.us.cloud.talend.com/inventory
    AWS     Europe         https://tmc.eu.cloud.talend.com/inventory
    AWS     Asia-Pacific   https://tmc.ap.cloud.talend.com/inventory
    Azure   US             https://tmc.us-west.cloud.talend.com/inventory

    API designs are retrieved using the following secured endpoints:

    Cloud   Region         API Design service URL
    AWS     US             https://api-apid-service.us.cloud.talend.com/external/projects
                           https://api-apid-service.us.cloud.talend.com/external/projects/{projectId}
    AWS     Europe         https://api-apid-service.eu.cloud.talend.com/external/projects
                           https://api-apid-service.eu.cloud.talend.com/external/projects/{projectId}
    AWS     Asia-Pacific   https://api-apid-service.ap.cloud.talend.com/external/projects
                           https://api-apid-service.ap.cloud.talend.com/external/projects/{projectId}
    Azure   US             https://api-apid-service.us-west.cloud.talend.com/external/projects
                           https://api-apid-service.us-west.cloud.talend.com/external/projects/{projectId}

  • By default, Talend Studio uploads Talend Jobs using the following pre-signed URLs:

    Cloud   Region         S3 pre-signed URL
    AWS     US             https://*-talend-com.s3.us-east-1.amazonaws.com
    AWS     Europe         https://*-talend-com.s3.eu-central-1.amazonaws.com
    AWS     Asia-Pacific   https://*-talend-com.s3.ap-northeast-1.amazonaws.com
    Azure   US             https://minio.us-west.cloud.talend.com

    Data flows between Talend Studio jobs and Talend Data Fabric

    Talend Studio has the following components that can communicate with Talend Data Fabric:

    Data Preparation Job components (more details: https://help.talend.com/r/lQLpAh4etogKGGo7r4elSA/ZSb6PvTcL8AJXA9ZmdRSOQ)

    Cloud   Region         Data Preparation URL
    AWS     US             https://tdp.us.cloud.talend.com
    AWS     Europe         https://tdp.eu.cloud.talend.com
    AWS     Asia-Pacific   https://tdp.ap.cloud.talend.com
    Azure   US             https://tdp.us-west.cloud.talend.com

    Data Stewardship Job components (more details: https://help.talend.com/r/vi44A6xmH9sBDA5cCsHeEA/VZn4ylMFggH6nDwf6zBzqw)

    Cloud   Region         Data Stewardship URL
    AWS     US             https://tds.us.cloud.talend.com
    AWS     Europe         https://tds.eu.cloud.talend.com
    AWS     Asia-Pacific   https://tds.ap.cloud.talend.com
    Azure   US             https://tds.us-west.cloud.talend.com

  • Data flows between Remote Engine and Talend Data Fabric

    Talend Data Fabric never initiates connections to Remote Engines. Remote Engines always initiate outbound connections to Talend. Once a connection is established, all data is sent encrypted over HTTPS.

    In 2021, Talend is progressively rolling out support for AWS and Azure PrivateLink connectivity between Talend Data Fabric and Remote Engines, adding an extra layer of security by ensuring traffic is not exposed to the public internet. Talend private endpoints are available in AWS and Azure.

    Here are the types of data that can be exchanged between Remote Engines and Talend:

    a) Status information and metrics

    b) Lifecycle commands

    c) Task artifact metadata

    d) Job logs (optional)

    e) Task artifact binaries

    The next sections discuss each service endpoint and the type of data being transmitted.

    Figure 10: Talend data flows when using Remote Engines

  • MSG service

    This HTTPS-only service is used to send metadata of types a) to d) in the list above.

    This path is a control path used to schedule flow deployments and capture execution status (success or failure). It also carries the number of rows successfully processed or rejected, as well as the final success message.

    Cloud   Region         MSG service URL
    AWS     US             https://msg.us.cloud.talend.com
    AWS     Europe         https://msg.eu.cloud.talend.com
    AWS     Asia-Pacific   https://msg.ap.cloud.talend.com
    Azure   US             https://msg.us-west.cloud.talend.com

    Repository service

    This HTTPS-only service is used to fetch artifacts stored in Talend’s repositories (Nexus). Unique credentials are generated per tenant.

    Cloud   Region         Repository service URL
    AWS     US             https://repo.us.cloud.talend.com
    AWS     Europe         https://repo.eu.cloud.talend.com
    AWS     Asia-Pacific   https://repo.ap.cloud.talend.com
    Azure   US             https://repo.us-west.cloud.talend.com

  • Remote Engine pairing service

    This HTTPS-only service is used for the initial pairing of a Remote Engine with its tenant and for subsequent status exchanges such as heartbeats, availability, and engine status.

    Cloud   Region         Remote Engine pairing service URL
    AWS     US             https://pair.us.cloud.talend.com
    AWS     Europe         https://pair.eu.cloud.talend.com
    AWS     Asia-Pacific   https://pair.ap.cloud.talend.com
    Azure   US             https://pair.us-west.cloud.talend.com

    Data transfer service

    This HTTPS-only service is used to create one-time ephemeral presigned URLs to authorize resource file uploads from the Remote Engine to Talend.

    Cloud   Region         DTS service URL
    AWS     US             https://dts.us.cloud.talend.com
    AWS     Europe         https://dts.eu.cloud.talend.com
    AWS     Asia-Pacific   https://dts.ap.cloud.talend.com
    Azure   US             https://dts.us-west.cloud.talend.com
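
    The general idea behind an ephemeral presigned URL can be sketched with a keyed hash: the issuing service signs the resource path together with an expiry time, and the receiving side recomputes the signature before accepting an upload. The Python below is a simplified stand-in under stated assumptions, not Talend’s implementation and not the AWS Signature Version 4 scheme that S3 uses. The secret, host name, and function names are hypothetical, and it covers expiry only; single-use enforcement would additionally require server-side tracking of consumed URLs.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-signing-key"  # hypothetical; a real service keeps its key server-side


def presign(base_url: str, path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return a URL carrying an expiry time and an HMAC over the path and that expiry."""
    expires = int((time.time() if now is None else now) + ttl_seconds)
    message = f"{path}\n{expires}".encode()
    signature = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return f"{base_url}{path}?{urlencode({'expires': expires, 'signature': signature})}"


def verify(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if the signature matches and the expiry has not passed."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False
    message = f"{parsed.path}\n{expires}".encode()
    expected = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    current = time.time() if now is None else now
    return hmac.compare_digest(signature, expected) and current <= expires


# Hypothetical upload target, valid for 60 seconds from a fixed reference time.
url = presign("https://uploads.example.invalid", "/tenant-a/job.zip", 60, now=1000.0)
```

    Because the path is part of the signed message, a URL issued for one file cannot be reused for another, and because the expiry is signed too, a client cannot extend its own validity window.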

    Log transfer service

    This HTTPS-only service is used to create one-time ephemeral presigned URLs to authorize logfile uploads from the Remote Engine to Talend.

    Cloud   Region         Log transfer service URL
    AWS     US             https://lts.us.cloud.talend.com
    AWS     Europe         https://lts.eu.cloud.talend.com
    AWS     Asia-Pacific   https://lts.ap.cloud.talend.com
    Azure   US             https://lts.us-west.cloud.talend.com

  • Vault gateway service

    This HTTPS-only service is used by Remote Engine Gen2 to connect to Talend’s Vault.

    Cloud   Region         Vault gateway service URL
    AWS     US             https://vault-gateway.us.cloud.talend.com
    AWS     Europe         https://vault-gateway.eu.cloud.talend.com
    AWS     Asia-Pacific   https://vault-gateway.ap.cloud.talend.com
    Azure   US             https://vault-gateway.us-west.cloud.talend.com

    Remote Engine Gen2 service

    This HTTPS-only service is used by Remote Engine Gen2 to set up authentication based on one-time tokens and to enable WebSocket connections.

    Cloud   Region         Remote Engine Gen2 service URL
    AWS     US             engine.us.cloud.talend.com
    AWS     Europe         engine.eu.cloud.talend.com
    AWS     Asia-Pacific   engine.ap.cloud.talend.com
    Azure   US             engine.us-west.cloud.talend.com
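
    Because Remote Engines only ever open outbound connections, a customer firewall typically needs egress rules for the service hosts listed in the tables above. Those hosts appear to follow a `<service>.<region-code>.cloud.talend.com` pattern; the Python sketch below relies on that observed pattern, which is an inference from the tables rather than a documented guarantee. The function names are hypothetical, and port 443 is assumed as the default for HTTPS.

```python
from typing import List, Tuple

# Region codes as they appear in the URL tables above.
REGION_CODES = {
    ("AWS", "US"): "us",
    ("AWS", "Europe"): "eu",
    ("AWS", "Asia-Pacific"): "ap",
    ("Azure", "US"): "us-west",
}

# Host prefixes of the services listed above; "vault-gateway" and "engine"
# are described as used by Remote Engine Gen2.
REMOTE_ENGINE_SERVICES = ["msg", "repo", "pair", "dts", "lts", "vault-gateway", "engine"]


def remote_engine_hosts(cloud: str, region: str) -> List[str]:
    """Return the service hostnames for the given cloud and region."""
    code = REGION_CODES[(cloud, region)]
    return [f"{service}.{code}.cloud.talend.com" for service in REMOTE_ENGINE_SERVICES]


def outbound_allowlist(cloud: str, region: str, port: int = 443) -> List[Tuple[str, int]]:
    """Pair each hostname with the HTTPS port, ready to feed into egress rules."""
    return [(host, port) for host in remote_engine_hosts(cloud, region)]


europe_rules = outbound_allowlist("AWS", "Europe")
```

    The list covers only the service endpoints; the storage hosts behind the presigned upload URLs are not included and would need their own rules.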

  • Data flows in hybrid deployment between Talend Data Preparation, Talend Data Stewardship, Talend Dictionary Service, and Talend Data Fabric

    Guiding principle — Talend applications and components always initiate outbound HTTPS connections. Talend Data Fabric never initiates any inbound connection to these applications.

    Here are the types of data that can be exchanged between hybrid applications and Talend Data Fabric:

    a) During user login: The client ID and client secret of the installed application (as defined in the OIDC specification, https://openid.net/specs/openid-connect-core-1_0.html#CodeFlowAuth) are used to authorize its communication with Talend Data Fabric.

    b) After user login: A JSON Web Token (JWT) that represents the user’s identity, metadata, and claims is transferred back to the application.
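
    A JWT is three base64url-encoded segments (header, payload, signature) joined by dots, so its claims can be inspected with the standard library alone. The Python sketch below builds a sample token and reads its payload. The claim names `sub` and `tenant` are placeholders rather than Talend’s actual claims, and the code deliberately skips signature verification, which any real consumer must perform with a proper JWT library before trusting the contents.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url, the form used inside a JWT."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def read_claims(token: str) -> dict:
    """Return the payload claims WITHOUT verifying the signature (inspection only)."""
    _header_b64, payload_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(payload_b64))


# Sample token assembled for demonstration; the signature is a placeholder.
header = b64url_encode(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
payload = b64url_encode(json.dumps({"sub": "user-123", "tenant": "acme"}).encode())
sample_token = f"{header}.{payload}.signature-placeholder"
claims = read_claims(sample_token)
```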

    Public APIs

    In addition to the data flows between Talend applications, Talend exposes public APIs that let developers automate workflows. Access to these APIs is secured with Personal Access Tokens generated in Talend Management Console.

  • Security at Talend

    Talend’s security organization consists of a dedicated team of security experts distributed across the company who work closely with the Talend CISO. Their mission is to protect Talend and its clients with security best practices. This team supports all aspects of Talend business, including Talend development and operations. The responsibility of Talend security rolls up to the CISO, who also defines Talend security strategy, architecture, and program.

    Physical security

    Talend maintains security controls to prevent unauthorized physical access to buildings and data centers and to protect its systems and software, and by extension the Talend environment, from damage, interruption, misuse, or theft.

    Authorizations are reviewed regularly, and access is monitored continuously.

    Security training

    All Talend employees are trained on security best practices. All Talend employees involved in the Talend development lifecycle, from creation to deployment and operation, are guided through trainings, reviews, and drills.

    Secure software development

    Talend’s security organization is involved throughout the creation of any new application, capability, or feature.

    Our security experts conduct architecture, design, and code reviews.

    Software composition analysis (SCA) and static application security testing (SAST) scans are integrated in the software development lifecycle.

    Talend runs an Open Web Application Security Project (OWASP) Top 10 awareness program during application development, and schedules regular internal and external audits to assess compliance with OWASP best practices.

  • Cloud workload protection and monitoring

    We use a combination of security services from third-party vendors to protect Talend Data Fabric.

    Our security experts use external scanning tools to ensure that systems and containers are hardened, configured, and patched according to Talend guidelines and best practices.

    Talend uses the NIST Cybersecurity Framework as part of its global security strategy.

    Our deployments leverage the built-in segmentation capabilities of AWS EC2 Security groups and Microsoft Azure Network Security groups to restrict inter-resource communication.

    Talend Data Fabric’s perimeter security is composed of (but not limited to):

    • Web Application Firewall (WAF) — validates, monitors and filters all web application and API traffic

    • Network-based intrusion detection system (IDS) and intrusion prevention system (IPS) — alert on rogue activity and protect against threats such as zero-day attacks

    • Security information and event management system (SIEM) — monitoring and observability of system status and performance and detection of rogue processes

    Authentication, authorization, and access control

    Standard access

    Tenant users are authenticated with their own unique credentials: username plus password by default. Talend also supports integration with external SAML-based single sign-on and multifactor authentication (MFA) providers. In addition, source IP-based access control can be applied to restrict access to Talend Data Fabric from unauthorized locations.
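
    Source IP-based access control amounts to checking whether a client address falls inside a set of approved networks. The following Python sketch shows that check with the standard `ipaddress` module. The networks are documentation-reserved ranges standing in for real office or VPN ranges, and the `is_allowed` helper is hypothetical; it does not reflect how Talend Data Fabric stores or evaluates its rules.

```python
import ipaddress

# Hypothetical allowlist: documentation-reserved ranges stand in for real networks.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_allowed(source_ip: str) -> bool:
    """Return True when the source address falls inside any allowed network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in ALLOWED_NETWORKS)
```

    For example, `is_allowed("203.0.113.7")` is accepted while `is_allowed("198.51.100.1")` is rejected, since the second address lies outside both listed networks.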

    Talend issues X.509 public key certificates, which must be used to secure and encrypt all communications between user systems and Talend Data Fabric. Talend Data Fabric supports HTTPS over TLS.

    In 2020, Talend introduced a new identity manager based on the Auth0 platform. Auth0 (https://auth0.com/security) is a third-party service provider that complies with Talend security standards and certifications. This migration is part of a global security strategy that lets Talend concentrate on its core domain by working with trusted third-party security vendors.

  • In 2021, within each operational region, Talend will use a CDN with local points of presence paired with a global Auth0 private instance located in multiple countries, ensuring best performance and compliance with data sovereignty laws.

    Administrative access

    Talend Data Fabric administrative access requires management review and approval. Elevated privilege access requires the same level of approval by management.

    Access to any management console, Talend Data Fabric, AWS, or Azure requires multifactor authentication (credentials plus secret keys).

    Access to the AWS console is restricted to select members of the Talend Site Reliability Engineering (SRE) or Information Security teams. New account creation follows a strict approval process. Accounts are reviewed quarterly.

    System access is provided via SSH private keys. Public keys are automatically deployed with the Talend configuration management tool.

    Password management

    Talend maintains a password management policy that all employees must comply with. It requires strong passwords, mandates that those passwords be protected, and prohibits password reuse.

    Key management

    On AWS

    Talend relies on AWS-managed Customer Master Keys (CMK) for encryption. Talend uses its own AWS CMK to generate unique data encryption keys (DEK).

    Most DEKs are tenant-specific and are managed (including rotation) by Talend. DEKs that do not need to be tenant-specific are managed via the AWS Encryption SDK.

    Front-end TLS endpoints are managed through the AWS Certificate Manager (ACM). The private key is generated by Talend and the associated certificate signed by Talend’s approved Certificate Authority (CA), GoDaddy. The certificates are then published as part of the Certificate Transparency program and uploaded to the ACM.

  • On Azure

    Talend applications and components deployed on Azure obtain and use tenant-specific master keys from HashiCorp Vault to encrypt tenant-related data.

    Front-end TLS endpoints are managed through Traefik (edge router running as a Kubernetes service) and Kubernetes Secrets. Private keys are generated by Talend and certificates are signed by Talend’s approved Certificate Authority (CA), GoDaddy. The certificates are then published as part of the Certificate Transparency program and uploaded to Traefik configuration as Kubernetes secrets.

    Vulnerability management

    All applications are tested by Talend’s security experts, using dynamic application security testing (DAST) and penetration tests, at least twice a year.

    In addition, Talend leverages internal and third-party security services to perform external penetration tests.

    Third-party penetration tests are scheduled twice a year and prior to any new Talend Data Fabric release and deployment. The penetration tests cover a wide range of security aspects of the application and address modern web best practices.

    All detected vulnerabilities are logged by the Talend Quality Assurance team and analyzed by the Talend Information Security team, which then supports, tracks, and tests their remediation.

    Talend follows the Security Content Automation Protocol (SCAP) framework. Vulnerabilities are rated according to the Common Vulnerability Scoring System (CVSS) v3.0 equation. Vulnerabilities are resolved depending on their severity rating and their potential impact on the infrastructure.
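
    For reference, CVSS v3.0 maps a numeric base score to a qualitative severity rating: None at 0.0, Low from 0.1 to 3.9, Medium from 4.0 to 6.9, High from 7.0 to 8.9, and Critical from 9.0 to 10.0. The Python sketch below encodes that published scale; the function name is hypothetical, and how each rating translates into a remediation deadline at Talend is not stated here, so the code does not attempt it.

```python
def cvss_v3_severity(score: float) -> str:
    """Map a CVSS v3.0 score (0.0 to 10.0) to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS v3.0 scores range from 0.0 to 10.0, got {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"
```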

    Third-party penetration test reports are available upon request at Talend’s discretion.

    Backups

    Talend uses various AWS and Azure data storage services. All data storage services are regularly and automatically backed up and mirrored to a remote site. Most backups occur hourly.

    Backup executions are monitored.

    Integrity checks are systematically made one week following every new production deployment.

  • Disaster recovery and business continuity

    Talend maintains disaster recovery/business continuity (DR/BC) plans that are reviewed, updated, and tested at least annually.

    Talend operates in multiple AWS and Azure regions globally. The redundant infrastructure has primary and disaster recovery data centers in each of the Talend Cloud regions, and a multi-Availability Zone (AZ) architecture in each region.

    We are in close contact with both vendors and carefully monitor their service levels to make sure that they meet our required service levels. Latest uptime per region is available on https://trust.talend.com.

    Talend R&D and Operational teams span multiple geographical locations: US, Europe, and Asia. Every function and duty can be fulfilled by at least two people.

    Security certifications

    Talend is SOC 2 Type 2 compliant and eligible to sign Health Insurance Portability and Accountability Act (HIPAA) Business Associate Agreements (BAA).

    We use the Cloud Security Alliance (CSA) Security Trust Assurance and Risk (STAR) program to assess our security practices and validate the security posture of our cloud offerings.

    A comprehensive list of security certifications and privacy compliance is available on https://www.talend.com/security/.

    Refer to AWS and Azure websites for more details about their security certifications and compliance information.

  • About Talend

    Talend, a leader in data integration and data integrity, enables every company to find clarity amidst the chaos.

    Talend Data Fabric brings together in a single platform all the necessary capabilities that ensure enterprise data is complete, clean, compliant, and readily available to everyone who needs it throughout the organization. It simplifies all aspects of working with data for analysis and use, driving critical business outcomes.

    From Domino’s to L’Oréal, over 4,250 organizations across the globe rely on Talend to deliver exceptional customer experiences, make smarter decisions in the moment, drive innovation, and improve operations. Talend has been recognized as a leader in its field by leading analyst firms and industry publications including Forbes, InfoWorld and SD Times.

    Talend is Nasdaq listed (TLND) and based in Redwood City, California.

    For more information, please visit www.talend.com and follow us on Twitter: @Talend (https://twitter.com/talend).
