Skip to content

Latest commit

 

History

History
309 lines (225 loc) · 11.2 KB

us-21-Bridging-Security-Infrastructure-Between-The-Data-Center-And-Aws-Lambda.pdf.md

File metadata and controls

309 lines (225 loc) · 11.2 KB

Bridging Security Infrastructure Between the Data Center and AWS Lambda Michael Weissbacher Square, Inc. @mweissbacher Black Hat USA 2021 August 4th

Overview Background & Context Goals Solution Pt 1 Enabling Lambda to call microservices in the DC AWS EKS Solution Pt 2 Syncing application secrets Key Learnings 2

Bio Michael Weissbacher, PhD Infrastructure Security Team @ Square in NYC Subteam focusing on Cryptographic Identity and Secrets Previously: Security Research @ Northeastern University 3

Background & Context

Why do developers choose Lambda? Benefits of serverless or Lambda specifically: Focus: business logic, rather than infrastructure Speed: it's fast! Scalability: both up and down. No need to maintain idle servers Compatibility: can be triggered through close integration with AWS APIs 5

How does it work? Lambdas are code without permanent infrastructure Small VMs on Amazon Linux Firecracker MicroVM, Lambda Sandbox On demand, infrastructure is allocated: cold start Subsequent invocations: reuse state Demand fades: infrastructure is deallocated 6

The problem with serverless Serverless architecture is not compatible with security infrastructure in the DC Reasons Lambda is not compatible: Geared towards short lived workloads Deploy process different Immediate response core feature Serverless scalability cannot be held back 7

Goal

Goal at Square Square has been migrating to the cloud to achieve higher flexibility and scalability We need Lambdas to be treated the same as other workloads DC Kubernetes-like platform AWS Kubernetes on EKS Connections: Envoy service mesh What do we need for Lambda? Communicate securely Access application secrets

Lambda can't be an island! 9

System Goals Equal footing with DC security infrastructure ...

Communication: Connect to DC/AWS EKS mesh Application secrets: Access to protected data

...while still maintaining Lambda benefits

Speed: Must support fast response times Scalability: Must scale with Lambda demand Compatibility: Must plug into DC infrastructure Availability: Must be high 10

Solution Pt 1 Enabling Lambda to call microservices in DC/EKS

Workload identity at Square Everything is a micoservice mTLS at Square since 2012 a.k.a. "Zero trust networking" or "Identity is the new perimeter" Traffic via Envoy service mesh, sidecar handles connection Workloads in DC AWS EKS Identity is tuple (service name, environment) E.g.: (service1, staging), or (service2, production) 14

SPIFFE Workload identity standard Influenced by industry usage of workload and service identity at Google, Square, ... SPIRE is reference implementation for SPIFFE Square started migrating to SPIRE 3 years ago Deprecate homemade identity issuance... ... onboard open source solution We use SPIFFE identity in all environments 15

Shape of identity for Lambda Multi-account architecture Mapping ( service1, staging ) AWS Account 12345 ( service1, production ) AWS Account 23456 Multiple Lambdas in one account: one service We use accounts as a security boundary per service and environment We have different roles within an account (read only, execution role, ...) 16

Identity issuance: what options were available?

SPIRE

Bootstrap from Account Credentials

Externally signal account ownership

DC identity is solved problem Construct via KMS building bearer Works well for long-lived services token

Signed request

No support for serverless

  • Requires agent No sidecar support in Lambda

Could carry cloud implementation details into DC Doesn't fit into architecture picture

17

Identity issuance decision BUILD Existing options didn't meet our goals 18

Architecture for identity issuance

Pull

vs.

Push

Identity Issuance Security

Identity would be generated as a Lambda is invoked Create agent that operates similar to a sidecar Security model equivalent to DC

Availability For Lambda, agent creates blocking dependency for invocation

Issue identity and make sure it is readily available Anti-pattern compared to SPIRE using pull No security downsides Identity controlled by IAM and SCP Flexible

19

Architecture Decision

PUSH

PULL

No security downsides Enables us to be more flexible with SLA Lambda never has to wait for identity issuance

20

System components Identity Governance and Administration (IGA Square internal service Provides (service, environment, account ID with enabled Lambda identity Issuance Generate Certificate with SPIFFE URI per-service Short lived, 24h Implemented as Lambda AWS Private CA PCA Used by issuance HSM backed CA service Audit capabilities AWS Secrets Manager Centralized resource Access via IAM and SCP 21

Amazon Web Services, the "Powered by AWS" logo, "AWS", and "AWS Lambda" are trademarks of Amazon.com, Inc. or its affiliates in the United States and/or other countries

Amazon Web Services, the "Powered by AWS" logo, "AWS", and "AWS Lambda" are trademarks of Amazon.com, Inc. or its affiliates in the United States and/or other countries

Amazon Web Services, the "Powered by AWS" logo, "AWS", and "AWS Lambda" are trademarks of Amazon.com, Inc. or its affiliates in the United States and/or other countries

Amazon Web Services, the "Powered by AWS" logo, "AWS", and "AWS Lambda" are trademarks of Amazon.com, Inc. or its affiliates in the United States and/or other countries

Amazon Web Services, the "Powered by AWS" logo, "AWS", and "AWS Lambda" are trademarks of Amazon.com, Inc. or its affiliates in the United States and/or other countries

Amazon Web Services, the "Powered by AWS" logo, "AWS", and "AWS Lambda" are trademarks of Amazon.com, Inc. or its affiliates in the United States and/or other countries

Amazon Web Services, the "Powered by AWS" logo, "AWS Lambda", and "AWS" are trademarks of Amazon.com, Inc. or its affiliates in the United States and/or other countries

Hello! Lambda calling Lambda layer Pulls identity from Secrets Manager Golang process listening on localhost Overloads VerifyPeerCertificate in TLS to perform SPIFFE URI validation Entire process transparent to developers Meshproxy Modified version of envoy Pass-through TLS Routes by SNI between DC and EKS DC/EKS services ACL check against SPIFFE URI 29

System Goals Equal footing with DC security infrastructure ...

Communication: Connect to DC/AWS EKS mesh Application secrets: Access to protected data

...while still maintaining Lambda benefits

Speed: Must support fast response times Scalability: Must scale with Lambda demand Compatibility: Must plug into DC infrastructure Availability: Must be high 30

Risk Mitigation Stealing the root Attacker would be able to issue certs offline Root contained in AWS Private CA No intermediates used, leaf issuance off of root Attacking issuance Account locked down Audit trail Stealing identity from a Lambda Blast radius limited to 24h, ACL system limits damage IAM and SCP 31

Influenced SPIRE serverless architecture RFC posted after we published our system description RFC initially pull-style issuance, but has adopted push-style based on our implementation Implementation of serverless issuance in progress, target release: SPIRE v1.1 September/October 2021 Square looking to migrate to open source implementation 32

Solution Pt 2 Syncing application secrets

Application Secrets: Keywhiz What are secrets? API keys, GPG keys, ... Open source: github.com/square/keywhiz Secret ownership mapped to microservices DC Parallel PKI Syncer daemonset on each server node Syncer has access to all applications' secrets deployed on node Integrated web tooling in "Square Console" Self-serve adding secrets Tracking of expiration 36

Full decentralization? Evaluating using Secrets Manager directly We decided against it - Why? Security teams have expertise in handling centralized secrets Conflicting versions of secrets, e.g.: in multi-cloud + DC scenario No centralized expiration tracking Centralized tooling, such as GPG integration Bottom line: too risky, and wanted to do better Full centralization? Evaluating DC equivalent No deploy moment Can't block on invoke Bottom line: not compatible 37

Application secrets decision BUILD DC benefits + cloud native features 38

Security Boundaries DC Node syncers with wide ranging access Lambda An opportunity to reduce blast radius No concept of "node" An Idea SPIFFE identity work unlocked infrastructure capabilities (! Added SPIFFE support to Keywhiz Client-side syncer uses service identity Reduce exposure Opt-in to secrets vs. opt-out Action required via Square Console 39

Secrets Availability Observation: Secrets are updated rarely Majority of syncing operations: no-op Reliable cache > blocking on updates Unscheduled update: Trigger syncer Storage Fast reads: Secrets Manager No DC dependencies Default encryption key enforces account boundaries 40

Motivation

Amazon Web Services, the "Powered by AWS" logo, and "AWS" are trademarks of Amazon.com, Inc. or its affiliates in the United States and/or other countries 41

How to onboard Terraform module To apply: 24 lines Secrets Manager: direct access Implemented as Lambda Uses Lambda workload identity Client operates in each enabled account 42

1 4

2 3

44

System Goals Equal footing with DC security infrastructure ...

Communication: Connect to DC/AWS EKS mesh Application secrets: Access to protected data

...while still maintaining Lambda benefits

Speed: Must support fast response times Scalability: Must scale with Lambda demand Compatibility: Must plug into DC infrastructure Availability: Must be high 46

Risk Mitigation Access all secrets Attack Keywhiz ACL system blocks access Compromise syncer CI/CD pipeline S3 object version Access individual secrets Secrets tied to identity Lambda secret access service's secrets 47

Present and Future Secrets in sync between DC and Lambda in production SPIFFE support in Keywhiz offers interoperability with new environments 48

Key Learnings

How we knew it worked Identity issuance and secrets used in production SPIRE is implementing serverless support following our model 50

Summary Developers want Lambdas, whether you're ready or not Support your developers with infrastructure they already know mTLS Envoy service mesh Keep secrets in sync between multiple environments Hybrid environments are hard "Moving" to the cloud means operating in two environments This challenging interim state can last years Services in the cloud will rely on DC Use best of both: environments should support each other, do not block 51

Thank you Michael Weissbacher @mweissbacher

References Square blog posts covering this presentation Providing mTLS Identities to Lambdas Expanding Secrets Infrastructure to AWS Lambda Related Square blog posts Enabling Serverless Applications at Square Using AWS Lambda Extensions to Accelerate AWS Secrets Manager Access Adopting AWS VPC Endpoints at Square 53

References Service identity Envoy SPIFFE SPIRE RFC for serverless architecture AWS Certificate Manager Private Certificate Authority What is mutual TLS? Application secrets What is AWS Secrets Manager? Keywhiz 54

References Lambda Understanding Container Reuse in AWS Lambda Behind the scenes, AWS Lambda Firecracker - AWS Blog 55