Designing for Privacy in Amazon Web Services

KrzysztofKkol1 53 views 28 slides May 27, 2024

About This Presentation

Data privacy is one of the most critical issues that businesses face. This presentation shares insights on the principles and best practices for ensuring the resilience and security of your workload.

Drawing on a real-life project from the HR industry, the various challenges will be demonstrated: d...


Slide Content

Designing for Privacy in AWS
Krzysztof Kąkol – Chief of Data Engineering @ Xebia Poland

Krzysztof Kąkol
Chief of Data Engineering and Solutions Architect at Xebia Poland
AWS Community Builder & AWS Ambassador
https://www.linkedin.com/in/krzysztofkakol/
Other stuff: classical and jazz pianist; PhD in AI-driven sound processing

Data Privacy by Design in HRM system
- HR is about privacy: HRM systems store users' personal data and a lot of sensitive data, such as salaries.
- Clients want to be secure: clients want to know where the data is stored, who has access to it, and what we do to keep it from being compromised.
- Business requirements: clients have their own checklists of requirements.
- Vendor's risk: strong vendor responsibility comes with great risk that must be handled.

Prepare: Theory is King

Data Privacy by Design
Privacy by Design is a concept developed by Ann Cavoukian in the 1990s. It advances the view that the future of privacy cannot be assured solely by compliance with regulatory frameworks; rather, privacy assurance must ideally become an organization's default mode of operation.

Data Privacy by Design: the seven principles
- Proactive not Reactive
- Privacy as the Default Setting
- Privacy Embedded into Design
- Full Functionality – Positive-Sum
- End-to-End Security – Full Lifecycle Protection
- Visibility and Transparency – Keep it Open
- Respect for User Privacy – Keep it User-Centric

Data Protection by Design and by Default
The Norwegian Data Protection Authority (NDPA) has developed these guidelines to help organisations understand and comply with the requirement of data protection by design and by default in Article 25 of the General Data Protection Regulation.

Data Protection by Design and by Default: activities
- Training: planning specific training for management and employees.
- Requirements: setting requirements for data protection and information security for the final product.
- Design: ensuring that requirements for data protection and information security are reflected in the design.
- Coding: writing secure code by implementing the requirements for data protection and security.
- Testing: checking that the requirements for data protection and information security have been implemented as planned.
- Release: a comprehensive and final security review should be done before the software is released.
- Maintenance: planning for incident response handling (prepared during the release activity) and following the plan.

Prerequisites: Put your shoes on!

Prerequisites – Risk analysis
STRIDE is a model for identifying computer security threats, developed by Praerit Garg and Loren Kohnfelder at Microsoft. It provides a mnemonic for security threats in six categories:
- Spoofing
- Tampering
- Repudiation
- Information Disclosure
- Denial of Service
- Elevation of Privilege
https://www.eccouncil.org/threat-modeling/
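One way to keep a STRIDE analysis actionable is to store it as a small threat register. The sketch below is only an illustration: the component names, threat descriptions and mitigations are hypothetical and not taken from the project described in the talk.

```python
from dataclasses import dataclass
from enum import Enum


class Stride(Enum):
    """The six STRIDE threat categories."""
    SPOOFING = "Spoofing"
    TAMPERING = "Tampering"
    REPUDIATION = "Repudiation"
    INFORMATION_DISCLOSURE = "Information Disclosure"
    DENIAL_OF_SERVICE = "Denial of Service"
    ELEVATION_OF_PRIVILEGE = "Elevation of Privilege"


@dataclass(frozen=True)
class Threat:
    component: str        # hypothetical component name
    category: Stride
    description: str
    mitigation: str = ""  # empty string means "not mitigated yet"


def uncovered_categories(threats):
    """Return the STRIDE categories for which no threat has been recorded."""
    seen = {t.category for t in threats}
    return [c for c in Stride if c not in seen]


def unmitigated(threats):
    """Return the threats that still have no mitigation assigned."""
    return [t for t in threats if not t.mitigation]


# Hypothetical register for an HR workload.
register = [
    Threat("payroll-api", Stride.SPOOFING,
           "Stolen session token reused", "Short-lived tokens, 2FA"),
    Threat("payroll-db", Stride.INFORMATION_DISCLOSURE,
           "Salary data read by support staff"),
    Threat("document-store", Stride.TAMPERING,
           "Contract file overwritten", "Object versioning"),
]

for threat in unmitigated(register):
    print(f"TODO {threat.category.value}: {threat.component}")
for category in uncovered_categories(register):
    print(f"Not analysed yet: {category.value}")
```

Listing the categories with no entry at all is a cheap check that the analysis really covered all six letters of the mnemonic.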

Prerequisites – Workload review
A general review of the infrastructure and workload (including the software architecture used) should be done. It should be a high-level review, not a detailed project review:
- General architecture description
- Security setup (firewalls, network ACLs etc.)
- High-level recommendations
This is WRONG!

Prerequisites – Well-Architected Review
A Well-Architected Review (WAR) should be done periodically. It can be performed using the AWS Well-Architected Tool. The Well-Architected Framework behind the review describes key concepts, design principles, and best practices for creating and running workloads in the cloud. The most important pillars from the data privacy perspective are Security and Reliability.

Prerequisites – Summary
- Draw the borders of your "world of concerns" first: model your threats with simple yet comprehensive tools like STRIDE. Once you have done that, you know what you need to focus on.
- Use available tools and sources to learn as much as possible about the security and reliability perspectives. One of the best tools (if not the best) in this area is the AWS Well-Architected Tool.

Implementation: Let’s dance!

Confidentiality, Integrity, Accessibility, Resilience, Traceability

Confidentiality
Confidentiality refers to protecting sensitive data from unauthorized access.
- Principle of least privilege
  - Limited access to production infrastructure and DB
  - Using IaC to manage infrastructure
  - Managed access to codebase
  - Secrets stored in a vault (e.g. Secrets Manager)
- Authentication policies
  - Good password policy
  - 2FA used and enforced wherever possible
  - Access restriction policies (like bastion hosts, VPNs etc.)
- Encryption
  - Resource encryption (EBS, RDS, S3 etc.)
  - Content encryption
  - Communication encryption
  - Control of the key: KMS CMK
- Well-designed features
  - Permission model must be flexible enough
  - Sensitive data: special behavior
  - Elevating privileges should be impossible
  - Tested permission model
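As an illustration of the resource and communication encryption items, the following sketch builds an S3 bucket policy that denies uploads not marked for SSE-KMS and denies any request made without TLS. It is a minimal example, not the policy used in the project: the bucket name `hr-documents` is hypothetical, and a real setup would typically also pin a specific KMS key. Note that with `StringNotEquals` a request that omits the encryption header is denied as well, so clients have to send it explicitly.

```python
import json


def deny_unencrypted_access_policy(bucket_name: str) -> dict:
    """Build an S3 bucket policy enforcing SSE-KMS uploads and TLS-only access."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Reject PutObject calls that do not request SSE-KMS.
                "Sid": "DenyUploadsWithoutKmsEncryption",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
            {
                # Reject every request that does not use HTTPS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }


# "hr-documents" is a hypothetical bucket name.
print(json.dumps(deny_unencrypted_access_policy("hr-documents"), indent=2))
```

Generating the document in code makes it easy to keep it under version control next to the rest of the infrastructure definition.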

Integrity
Integrity means that data is protected against unwanted alteration, destruction, or loss.
- S3 object versioning
  - Versioning is a default setting
  - Used for all files, including contracts, internal documents, agreements etc.
- IAM roles
  - No long-lived AWS credentials
  - Only roles applied to resources
  - Semantic roles
- Data validation
  - Validation for all incoming data
  - Defined allowed values for most fields (e.g. min/max for numeric fields)
- Monitoring and logging
  - CloudTrail for API activity
  - Logs for S3, database, workload: sent to CloudWatch
  - Audit trail in the application
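The data validation items can be sketched as a small rule table that is applied to every incoming payload. The field names, ranges and allowed values below are hypothetical and only illustrate the idea of defined min/max values per field.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class NumericRule:
    minimum: Decimal
    maximum: Decimal


# Hypothetical rules; real limits would come from the business requirements.
NUMERIC_RULES = {
    "monthly_salary": NumericRule(Decimal("0"), Decimal("1000000")),
    "fte": NumericRule(Decimal("0.1"), Decimal("1.0")),
}
ALLOWED_CONTRACT_TYPES = {"permanent", "fixed_term", "b2b"}


def validate_employee(payload: dict) -> list:
    """Return a list of validation errors; an empty list means the payload is valid."""
    errors = []
    for field_name, rule in NUMERIC_RULES.items():
        raw = payload.get(field_name)
        if raw is None:
            errors.append(f"{field_name}: missing")
            continue
        try:
            value = Decimal(str(raw))
        except ArithmeticError:  # decimal.InvalidOperation is a subclass
            errors.append(f"{field_name}: not a number")
            continue
        # Reject NaN/Infinity first, then check the allowed range.
        if not value.is_finite() or not (rule.minimum <= value <= rule.maximum):
            errors.append(f"{field_name}: outside [{rule.minimum}, {rule.maximum}]")
    if payload.get("contract_type") not in ALLOWED_CONTRACT_TYPES:
        errors.append("contract_type: value not allowed")
    return errors


print(validate_employee({"monthly_salary": -1, "fte": "abc", "contract_type": "intern"}))
```

Collecting all errors instead of failing on the first one gives the caller a complete picture of what has to be corrected.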

Accessibility
Personal data must be available to authorised personnel who require it for their work.
- Access to infrastructure
  - Using Identity Center
  - Managing access through IaC
  - Limited and defined access to production resources
- Permission model
  - Flexible permission model
  - Using RBAC, ABAC, ACL and access policies
- Availability of data
  - Data and infrastructure redundancy
  - Data lifecycle policies (e.g. in S3)
- Incident management
  - Documentation of the incident management process
  - Well-described scenarios for potential incidents (STRIDE)
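A permission model that combines RBAC and ABAC can be sketched as two consecutive checks: a role must grant the action, and the attributes of the user and the record must match. The roles, actions and attribute rules below are assumptions made for the example, not the model used in the project.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission mapping (the RBAC part).
ROLE_PERMISSIONS = {
    "employee": {"profile:read", "salary:read"},
    "manager": {"profile:read", "salary:read"},
    "hr_admin": {"profile:read", "profile:write", "salary:read", "salary:write"},
}


@dataclass(frozen=True)
class User:
    user_id: str
    roles: frozenset
    department: str


@dataclass(frozen=True)
class Record:
    owner_id: str
    department: str


def is_allowed(user: User, action: str, record: Record) -> bool:
    """Allow an action only if both the role check and the attribute check pass."""
    # RBAC: at least one of the user's roles must grant the action.
    if not any(action in ROLE_PERMISSIONS.get(role, set()) for role in user.roles):
        return False
    # ABAC: HR admins act company-wide, owners on their own record,
    # managers only on records from their own department.
    if "hr_admin" in user.roles:
        return True
    if record.owner_id == user.user_id:
        return True
    return "manager" in user.roles and record.department == user.department


alice = User("u1", frozenset({"manager"}), "engineering")
print(is_allowed(alice, "salary:read", Record("u2", "engineering")))
print(is_allowed(alice, "salary:read", Record("u3", "sales")))
```

Keeping the check in one pure function makes the "tested permission model" requirement practical, since every rule can be covered by a unit test.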

Resilience
Software that is processing personal data must be able to resist vulnerabilities, attacks, and accidents.
- High availability
  - Workload in multiple AZs
  - Self-healing infrastructure
  - All components highly available: ALB, multi-AZ RDS, EKS nodes in ASG, replicated NATs
  - Using managed services
- Disaster recovery
  - Well-chosen strategy for DR (B&R, pilot light, warm standby, multisite)
  - Testing the DR plan
- Backup strategy
  - Backup for database: automatic snapshots, PITR, manual snapshots
  - Object replication in S3 (multi-region)
  - Backup retention
- Resilience to attacks
  - Using DDoS protection (CloudFront, Shield)
  - Resource firewall setup: using semantic Security Groups
  - Application firewalls: WAF
  - Workload in private subnets
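The backup retention item can be illustrated with a small, service-independent function that decides which snapshots may be deleted. The 35-day retention period, the rule of never deleting manual snapshots and the `Snapshot` type are assumptions for this sketch; managed services such as RDS apply their own retention settings for automatic backups.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: str
    created_at: datetime
    manual: bool = False


def expired_snapshots(snapshots, now, retention_days=35, keep_latest=1):
    """Return automatic snapshots older than the retention period.

    Manual snapshots are never returned, and the newest `keep_latest`
    automatic snapshots are always kept, whatever their age.
    """
    cutoff = now - timedelta(days=retention_days)
    automatic = sorted(
        (s for s in snapshots if not s.manual),
        key=lambda s: s.created_at,
        reverse=True,
    )
    return [s for s in automatic[keep_latest:] if s.created_at < cutoff]


now = datetime(2024, 5, 27, tzinfo=timezone.utc)
snapshots = [
    Snapshot("auto-old", now - timedelta(days=60)),
    Snapshot("auto-recent", now - timedelta(days=2)),
    Snapshot("manual-old", now - timedelta(days=400), manual=True),
]
print([s.snapshot_id for s in expired_snapshots(snapshots, now)])
```

Always keeping the newest automatic snapshot guards against a misconfigured schedule silently leaving the database without any backup.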

Traceability
Traceability is documentation of changes made within the software, infrastructure and to personal data.
- Log everything
  - VPC flow logs, S3 logs, ALB logs etc.
  - Meaningful application logs
  - Request tracking (X-Ray)
- Analyse & notify
  - Analyse logs: OpenSearch + Kibana
  - Notify when something is detected (SNS)
- Audit trail
  - History of inserts, updates and deletes
  - Every trail record contains the "before" and "after" state
  - Who made the change and when
- Slowly Changing Dimensions
  - Some data have historical changes and planned changes (e.g. employees)
  - Implementing SCD is difficult but builds a history of the record
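An audit trail record with "before" and "after" state, the author of the change and a timestamp could look like the sketch below. The `AuditRecord` type, its field names and the sample values are hypothetical; how the records are persisted is left out on purpose.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    entity: str
    entity_id: str
    operation: str          # "insert", "update" or "delete"
    before: Optional[dict]  # None for inserts
    after: Optional[dict]   # None for deletes
    changed_by: str
    changed_at: datetime


def make_audit_record(entity, entity_id, before, after, changed_by, changed_at=None):
    """Build an audit record, deriving the operation from the before/after states."""
    if before is None and after is None:
        raise ValueError("at least one of 'before' and 'after' is required")
    if before is None:
        operation = "insert"
    elif after is None:
        operation = "delete"
    else:
        operation = "update"
    return AuditRecord(
        entity=entity,
        entity_id=entity_id,
        operation=operation,
        # Copy the dicts so later changes to the entity do not alter the trail.
        before=dict(before) if before is not None else None,
        after=dict(after) if after is not None else None,
        changed_by=changed_by,
        changed_at=changed_at or datetime.now(timezone.utc),
    )


def changed_fields(record: AuditRecord) -> list:
    """List the fields whose value differs between the before and after states."""
    before = record.before or {}
    after = record.after or {}
    return sorted(k for k in before.keys() | after.keys() if before.get(k) != after.get(k))


record = make_audit_record(
    "employee", "42",
    before={"salary": 8000, "title": "Engineer"},
    after={"salary": 8500, "title": "Engineer"},
    changed_by="hr_admin_1",
)
print(record.operation, changed_fields(record))
```

Storing both full states rather than only a diff makes it possible to answer "what did this record look like at the time" without replaying the whole history.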

Documentation: Record the memories

Documentation
Security documentation
- OWASP Top 10 review
- Personal Data Access: a document describing the processes for accessing personal/private data, who has access to it, where it is stored etc.
- Personal Data Management rules, based on the requirements from the NDPA checklist: generally speaking, why personal data is stored, what the legal basis for storing it is, what preventive measures have been implemented to secure the data, how the data should be accessed, what the data backup requirements are etc.

Documentation
Implementation practices documentation
- Coding practices: describing the current coding practices in the project, including libraries used, vulnerability scanning, code review process, branching model and all other components mentioned in the NDPA checklist.
- Testing practices: describing the current testing implemented in the project, i.e. testing frameworks and unit, integration, end-to-end, system and performance testing practices etc. (NDPA checklist).

Documentation
Maintenance practices documentation
- Disaster Recovery plan (with test)
- Backup strategy
- Incident management plan: what to do after an incident occurs, in terms of formal and implementation practices
- Release management: what the rules of the release process are, who is responsible for it and how the process is generally handled, since it usually implies accessing personal data directly or indirectly
- Maintenance process: what the rules are for accessing the production environment in case of a failure or hotfix

Summary

Key takeaways
- Prepare well: define your world of concerns
- Analyse the risks
- Recognize best practices
- Document everything
- Evolution over revolution

Questions?
https://www.linkedin.com/in/krzysztofkakol/