Azure architecture design patterns - proven solutions to common challenges

ivoandreev 4,435 views 21 slides Aug 31, 2020

About This Presentation

Building reliable, scalable, secure applications can happen either by following verified design patterns or the hard way, by trial and error. Azure architecture patterns are tested and accepted solutions to common challenges, thus reducing the technical risk to the project by ...


Slide Content

Cloud Architecture Design Patterns Proven Solutions to Common Challenges


Speaker Bio
Software Architect with 18+ years of professional experience
Microsoft Azure MVP
External Expert: Horizon 2020, Eurostars-Eureka
External Expert: InnoFund Denmark, RIF Cyprus
Business Interests: Web Development, SOA, Integration; IoT, Machine Learning, Computer Intelligence; Security & Performance Optimization
Contact: [email protected], www.linkedin.com/in/ivelin, www.slideshare.net/ivoandreev

Takeaways
Azure Application Architecture Guide: https://docs.microsoft.com/en-us/azure/architecture/guide/
Cloud Design Patterns for Azure (Series): Design and Implementation; Availability and Resilience; Data Management and Performance

Architecture be like “bird’s eye view” Implementation be like “close-up”

Shortly after build completion “the results”

Cloud Architecture Design Challenges
Availability: the time a system is functional and working (SLA uptime %)
Data Management: data in multiple locations, performance and consistency
Consistency: predictable behaviour, reusable decisions, maintainability
Messaging: required by the distributed, loosely coupled nature of the cloud
Management & Monitoring: expose runtime and debug information
Performance & Scalability: responsiveness within a time unit, handle load without impact
Recover from Failures: detecting failures and recovering quickly and efficiently
Security: prevent malicious or accidental issues outside the designed usage

Management & Monitoring Patterns

Anti-corruption Layer
Adapter layer between two subsystems to isolate them and translate semantics.
Consider: data consistency; extra maintenance point; permanent vs. retiring layers; possible overhead and scalability challenges; interoperation with a legacy system requires shared semantics
Not Suitable: no significant semantic differences between the systems

Gateway Offloading Pattern
Offload shared or specialized service functionality to an API gateway / proxy.
Consider: central handling of shared features, which otherwise have to be maintained in multiple places; delegate certain responsibilities to a specialized team (e.g. security); secure appropriate scaling and avoid bottlenecks
Not Suitable: when it introduces coupling between services; no business logic shall be offloaded (keep the gateway reusable)
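A minimal sketch of an anti-corruption layer, assuming a hypothetical legacy record format (`CUST_NO`, `FNAME`, `LNAME`, `STATUS`) and a hypothetical `Customer` model on the new side. None of these names come from the slides; they only illustrate keeping the translation of semantics in one adapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Model used by the new subsystem (hypothetical)."""
    customer_id: int
    full_name: str
    active: bool


class LegacyCustomerAdapter:
    """Anti-corruption layer: the only place that knows the legacy format."""

    def to_modern(self, record: dict) -> Customer:
        # Translate legacy field names and encodings into the new model.
        return Customer(
            customer_id=int(record["CUST_NO"]),
            full_name=f'{record["FNAME"].strip()} {record["LNAME"].strip()}',
            active=record["STATUS"] == "A",
        )

    def to_legacy(self, customer: Customer) -> dict:
        # Translate back so the legacy system keeps seeing its own semantics.
        first, _, last = customer.full_name.partition(" ")
        return {
            "CUST_NO": str(customer.customer_id).zfill(6),
            "FNAME": first,
            "LNAME": last,
            "STATUS": "A" if customer.active else "I",
        }
```

Because both directions live in one class, the layer can be retired or replaced without touching either subsystem, which matches the "permanent vs. retiring layers" consideration.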

When to use Messaging Services in Azure?

Event Grid: event-driven reactive programming
Key Points: cheap (€0.5 per 1M); at-least-once delivery; does not deliver the data itself, only the event; ordered delivery not possible; push-based delivery
Reason to Use: react on an event

Event Hub: big data and telemetry pipeline
Key Points: semi-cheap (€9.2/month); at-least-once delivery; low latency, millions of events per second
Reason to Use: telemetry streaming

Service Bus: high-value secure messaging and control
Key Points: 13M free, €0.0114 unit/h; batches, filters, duplicate detection, transactions; pull-based delivery; ordered delivery
Reason to Use: cross-system transactions

Reference: https://docs.microsoft.com/en-us/azure/event-grid/compare-messaging-services

Messaging Patterns

Claim-Check Pattern
Split a large message into a claim check and a payload so that the message bus is not overwhelmed.
Consider: custom logic to apply the pattern to large messages only; delete the message data after consuming it
Not Suitable: overhead for small messages
Reference: https://github.com/SeanFeldman/ServiceBus.AttachmentPlugin

Async Request-Reply
Decouple backend processing from a frontend host.
Consider: validate the request before starting a long-running task; the API endpoint shall return a Location (a place to poll, including the CorrelationID) and a Retry Interval (when to ask again for a new status, to reduce unnecessary load); a callback endpoint could be used instead
Not Suitable: when response latency is important; when callbacks, WebSockets or SignalR are possible
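The Claim-Check pattern above can be sketched with in-memory stand-ins. This is an illustration under assumptions, not the linked plugin's API: `BlobStore`, `send`, `receive` and the 64-byte `SIZE_THRESHOLD` are hypothetical names and values, and a plain list plays the role of the message bus.

```python
import uuid

SIZE_THRESHOLD = 64  # bytes; hypothetical cut-off for "large" messages


class BlobStore:
    """In-memory stand-in for an external payload store."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = str(uuid.uuid4())
        self._blobs[key] = data
        return key

    def take(self, key: str) -> bytes:
        # Reading removes the payload: delete message data after consuming.
        return self._blobs.pop(key)


def send(queue: list, store: BlobStore, payload: bytes) -> None:
    # Apply the pattern to large messages only; small ones travel inline.
    if len(payload) > SIZE_THRESHOLD:
        queue.append({"claim_check": store.put(payload)})
    else:
        queue.append({"body": payload})


def receive(queue: list, store: BlobStore) -> bytes:
    message = queue.pop(0)
    if "claim_check" in message:
        return store.take(message["claim_check"])
    return message["body"]
```

The queue only ever carries the small claim check for large payloads, so the bus stays within its message size limit while the consumer still gets the full data.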

Availability Patterns

Throttling
Control resource consumption so that the system keeps functioning under extreme load.
Consider: reject requests from some users (service time or last); degrade quality of service (bandwidth, compression, time); delay operations (e.g. queue processing time); return a specific throttling error code to the denied user; detect high demand quickly and apply throttling

Health Endpoint Monitoring
Functional check of an application through an external endpoint at a given interval.
Consider: tools (App Insights Web Tests, System Center Operations Manager); measure the response time of a health verification endpoint; analyse the results (Alive != Working, check the response body); decide on an action in case of failure (e.g. restart); secure the endpoint with authentication and an IP filter
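One way to picture the throttling considerations is a per-user fixed-window limiter that answers with a dedicated status code. The sketch below is only one possible strategy; the `Throttle` class, its parameters and the choice of HTTP-style codes 200/429 are assumptions made for illustration.

```python
import time


class Throttle:
    """Fixed-window limiter: at most `limit` requests per user per window."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock          # injectable so the behaviour can be tested
        self._windows = {}           # user -> (window_start, request_count)

    def check(self, user: str) -> int:
        now = self._clock()
        start, count = self._windows.get(user, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0    # the previous window expired
        if count >= self.limit:
            self._windows[user] = (start, count)
            return 429               # specific throttling code for the denied user
        self._windows[user] = (start, count + 1)
        return 200
```

Each user is tracked separately, so one heavy consumer is rejected without affecting the others.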

Two Types of Azure Queue Services?

Storage Queue
Part of the Azure Storage infrastructure; simple Get/Put/Peek interface
Key Points: no message ordering guarantee; at-least-once delivery; intended for decoupling in scalability scenarios; scheduled delivery and poison message support; no duplicate detection; no session support; max message size 64KB

Service Bus Queue
Part of the Azure Messaging infrastructure; publish-subscribe model
Key Points: message ordering guarantee; at-least-once / at-most-once delivery; transactions within a queue; scheduled delivery and poison message support; duplicate detection based on MessageId; session support for message processing affinity; max message size 1MB

Reference: https://docs.microsoft.com/en-us/azure/event-grid/compare-messaging-services

Consistent Design Patterns

Pipes and Filters Integration
Decompose a complex task into separate, individually scalable elements.
Suitable: reusable pipeline filters; avoid bottleneck filters; breakable processing; flexibility to reorder steps; context sharing
Sample Implementation:
1. A message queue (e.g. Azure Service Bus) receives the raw message
2. A filter task (e.g. an Azure Function App) listens and transforms the message
3. The message is enqueued on the next queue
4. This repeats until the final message is built
Not Suitable: processing steps are not independent (i.e. bad design) or are transactional; a huge context may make the process inefficient; insufficient scalability of underlying resources (e.g. the DB)
Reference: https://github.com/mspnp/cloud-design-patterns/tree/master/pipes-and-filters
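The queue-to-filter-to-queue flow of the sample implementation can be sketched in-process. Here `queue.Queue` stands in for the message queues and plain functions stand in for the filter tasks; the filter names (`parse`, `convert`, `enrich`) and the message shape are hypothetical.

```python
from queue import Queue


def parse(message: str) -> dict:
    name, _, amount = message.partition(",")
    return {"name": name.strip(), "amount": amount.strip()}


def convert(message: dict) -> dict:
    return {**message, "amount": float(message["amount"])}


def enrich(message: dict) -> dict:
    return {**message, "large": message["amount"] >= 1000}


def run_filter(filter_fn, inbox: Queue, outbox: Queue) -> None:
    """Drain one queue, apply a single filter, enqueue on the next queue."""
    while not inbox.empty():
        outbox.put(filter_fn(inbox.get()))


def run_pipeline(raw_messages, filters) -> list:
    # One queue in front of every filter plus one for the final messages.
    queues = [Queue() for _ in range(len(filters) + 1)]
    for message in raw_messages:
        queues[0].put(message)
    for index, filter_fn in enumerate(filters):
        run_filter(filter_fn, queues[index], queues[index + 1])
    results = []
    while not queues[-1].empty():
        results.append(queues[-1].get())
    return results
```

Each filter only knows its inbox and outbox, so steps can be reordered or scaled individually; in a real deployment every `run_filter` call would be a separate worker.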

Consistent Design Patterns

Gateway Aggregation
Aggregate individual requests into one to improve performance on a high-latency network.
Suitable: CorrelationID to identify the original calls; partial response on service failure; caching; no service coupling for backend services; gateway placed near the backend to reduce latency
Not Suitable: the need is to reduce calls to a single backend (e.g. with batch handling); the application is near the backend and latency is practically zero

Command and Query Responsibility Segregation (CQRS)
Separate read and write operations on a datastore for performance, scalability and security.
Suitable: enqueue async commands; separate read/write databases, individually scaled read/write endpoints
Not Suitable: simple domain and business rules; when a CRUD interface implementation is sufficient
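A rough sketch of Gateway Aggregation with a partial response on service failure, using `asyncio`. The backend coroutines `fetch_profile` and `fetch_orders`, the `aggregate` helper and the response shape are hypothetical stand-ins for real service calls.

```python
import asyncio


async def fetch_profile(user_id: int) -> dict:
    return {"name": "demo"}  # stand-in for a real backend call


async def fetch_orders(user_id: int) -> list:
    raise ConnectionError("orders service unavailable")  # simulated outage


async def aggregate(user_id: int, backends: dict, timeout: float = 1.0) -> dict:
    """Fan out to all backends in parallel and merge the answers."""
    names = list(backends)
    calls = [asyncio.wait_for(backends[name](user_id), timeout) for name in names]
    # return_exceptions=True keeps one failing service from failing the rest.
    results = await asyncio.gather(*calls, return_exceptions=True)
    response = {"data": {}, "errors": {}}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            response["errors"][name] = type(result).__name__
        else:
            response["data"][name] = result
    return response
```

The client makes one call to the gateway instead of one per service, and still receives whatever data was available when a backend is down.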

Consistent Design Patterns

Ambassador Proxy
An external process sends requests on behalf of a consumer service or application.
Suitable: offload common concerns to a sidecar on the same host; extend legacy or non-modifiable apps
Not Suitable: latency overhead is unacceptable; context sharing is required; reusability cannot be achieved

Strangler Façade
Incrementally migrate a legacy system by gradually replacing pieces of functionality.
Suitable: avoid a bottleneck façade; assure common resources are accessible
Not Suitable: backend calls cannot be intercepted; wholesale replacement is not complex

Consistent Design Patterns

Sidecar Pattern
Deploy components in a separate process to provide isolation and encapsulation, e.g. an infrastructure sidecar that monitors the main app.
Consider: suitable interprocess communication (reliability, performance); a service or daemon instead of a sidecar
Suitable: heterogeneous languages; different teams or entities own a component; independent update of components shall be enabled
Not Suitable: performance of communication is critical; small solutions where the design benefit is not worth the overhead; individual scaling requirements may require a separate service

Resiliency Patterns

Circuit Breaker Pattern
Prevent an application from repeatedly trying to execute a remote service call that is likely to fail.
Suitable: temporary errors due to timeouts, network issues, high resource utilization
States:
Closed: requests are routed to the endpoint (failure threshold)
Open: requests fail immediately (timer)
Half-open: a limited number of requests are monitored to decide on Open/Closed
Challenges: resource differentiation and resource abstraction; manual override to the open state; enqueue failed requests for reprocessing
Not Suitable: local/in-memory resources
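The three states can be sketched as a small state machine. This is a simplified illustration: the `CircuitBreaker` class and its parameters are hypothetical, and the half-open state here lets through a single trial request rather than a configurable number.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock            # injectable for testing
        self.state = "closed"
        self._failures = 0
        self._opened_at = 0.0

    def call(self, operation):
        if self.state == "open":
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("failing fast, service assumed down")
            self.state = "half-open"   # timer expired: allow one trial request
        try:
            result = operation()
        except Exception:
            self._failures += 1
            # A failed trial, or too many failures while closed, opens the circuit.
            if self.state == "half-open" or self._failures >= self.failure_threshold:
                self.state = "open"
                self._opened_at = self._clock()
            raise
        self._failures = 0
        self.state = "closed"
        return result
```

A caller would wrap each remote call, for example `breaker.call(lambda: client.get(url))` with a hypothetical `client`, and treat `CircuitOpenError` as a signal to use a fallback or enqueue the request for later.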

Resiliency Patterns

Compensating Transaction Pattern
Undo the work of an eventually consistent transaction that consists of a series of steps.
Challenges: simply restoring the previous state is rarely possible; record for each step how it can be undone; undo might not be doable in exactly the reverse order; consider retry logic to try to avoid compensating transactions; restore first the entities that are more sensitive to changes; a compensating transaction shall be idempotent (repeatable)
Suitable: avoid distributed transactions by using eventual consistency; undo failed steps by performing the reverse action
Not Suitable: try to avoid the complications if possible
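A minimal sketch of recording an undo action per step and replaying those actions on failure. The `run_with_compensation` helper and the booking example are hypothetical, and the sketch undoes steps in plain reverse order, which, as noted above, is not always the right order in practice.

```python
def run_with_compensation(steps) -> bool:
    """Run (action, compensation) pairs; undo completed steps if one fails.

    Returns True when every action succeeded, False when the work was undone.
    """
    undo_log = []
    try:
        for action, compensation in steps:
            action()
            undo_log.append(compensation)  # record how this step can be undone
    except Exception:
        for compensation in reversed(undo_log):
            compensation()
        return False
    return True


# Hypothetical itinerary: the car rental fails, so flight and hotel are undone.
bookings = set()


def book_car():
    raise RuntimeError("no cars available")


succeeded = run_with_compensation([
    (lambda: bookings.add("flight"), lambda: bookings.discard("flight")),
    (lambda: bookings.add("hotel"), lambda: bookings.discard("hotel")),
    (book_car, lambda: bookings.discard("car")),
])
```

The compensations use `set.discard`, which can be repeated safely, so each undo step is idempotent.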

Security Patterns

Valet Key Pattern
Token for restricted access to a resource, to offload workload from the main application. Provide the client with a key or token that the data store can validate.
Consider: manage key validation and use short key expiration; issue the key only for the required operation; audit all operations; deliver the key securely
Not Suitable: when an action is required before sending to the datastore; limiting user behavior, e.g. subscribe to events of the resource to validate

Gatekeeper Pattern
Limit the attack surface by using a dedicated instance to sanitize and validate requests.
Challenges: the backend host shall not expose unprotected endpoints; may introduce a single point of failure or a performance hit; shall not perform actions other than sanitization
Suitable: services with a high degree of sensitive data; centralized validation
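The Valet Key idea of a short-lived token scoped to one operation can be sketched with an HMAC signature. This is not how Azure shared access signatures are formatted; the token layout, `SECRET`, `issue_key` and `validate_key` are assumptions made for illustration.

```python
import hashlib
import hmac
import time

SECRET = b"demo-secret-shared-by-app-and-store"  # hypothetical signing key


def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_key(resource: str, operation: str, ttl_seconds: int, now=None) -> str:
    """Issued by the application: one resource, one operation, short expiry."""
    issued_at = time.time() if now is None else now
    payload = f"{resource}|{operation}|{int(issued_at + ttl_seconds)}"
    return f"{payload}|{_sign(payload)}"


def validate_key(token: str, resource: str, operation: str, now=None) -> bool:
    """Run by the data store, without calling back into the application."""
    try:
        res, op, expires, signature = token.rsplit("|", 3)
    except ValueError:
        return False
    expected = _sign(f"{res}|{op}|{expires}")
    current = time.time() if now is None else now
    return (
        hmac.compare_digest(signature.encode(), expected.encode())
        and res == resource
        and op == operation
        and expires.isdigit()
        and current < int(expires)
    )
```

The client talks to the store directly with the token, so the main application only pays for issuing keys, not for moving the data.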

Resiliency Patterns

Queue-Based Load Leveling
Use a buffer queue between a task and a service to smooth demand peaks.
Suitable: maximize availability when overloading is expected; maximize scalability as tasks and services grow independently; reasonable cost, since services scale on actual load rather than on maximum load
Challenges: communication is one-way, so when a response is needed use Async Request-Reply with a correlation ID (e.g. a sequence number); control the rate of consuming messages to avoid overloading underlying resources (e.g. scaling consumers and DB contention)
Not Suitable: when minimal latency is critical
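The smoothing effect of the buffer queue can be seen in a tiny tick-based simulation. The `simulate` function, its parameters and the numbers used with it are hypothetical.

```python
from collections import deque


def simulate(arrivals_per_tick, service_rate):
    """Return (queue depth after each tick, tasks processed in each tick)."""
    queue = deque()
    depths, processed = [], []
    for arrivals in arrivals_per_tick:
        # Tasks enqueue immediately and never wait for the service.
        queue.extend(range(arrivals))
        # The service drains the queue at its own controlled rate.
        done = min(service_rate, len(queue))
        for _ in range(done):
            queue.popleft()
        depths.append(len(queue))
        processed.append(done)
    return depths, processed


# A burst of 10 requests against a service that handles 3 per tick.
depths, processed = simulate([10, 0, 0, 0], service_rate=3)
```

The service never sees more than its configured rate per tick; the cost is the waiting time of queued tasks, which is why the pattern does not fit when minimal latency is critical.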

Performance and Scalability Patterns

Materialized View Pattern
Prepopulated view in the necessary format to support efficient querying.
Suitable: performance improvement; limited data access; query simplification that hides data complexity; RDBMS + NoSQL
Challenges: reusability (likely to have multiple hits); disposability (can be regenerated at any time); variability (can vary by user or query parameters); consistency (data may become outdated); regeneration (update on new data, manual trigger)
Not Suitable: source data is easy to read; data changes very quickly and requires lots of regenerations; consistency is a priority; Domain Driven Design, behaviour without data (as in microservices)
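A minimal sketch of a materialized view as a precomputed, disposable summary of source data. The order records and the `build_sales_view` function are hypothetical examples.

```python
from collections import defaultdict


def build_sales_view(orders) -> dict:
    """Precompute per-product totals so that queries become simple lookups."""
    view = defaultdict(lambda: {"orders": 0, "revenue": 0.0})
    for order in orders:
        row = view[order["product"]]
        row["orders"] += 1
        row["revenue"] += order["quantity"] * order["unit_price"]
    return dict(view)


# Hypothetical source data; in practice this would be the system of record.
orders = [
    {"product": "book", "quantity": 2, "unit_price": 10.0},
    {"product": "book", "quantity": 1, "unit_price": 10.0},
    {"product": "pen", "quantity": 3, "unit_price": 2.5},
]

# The view can be dropped and rebuilt from the source at any time.
sales_view = build_sales_view(orders)
```

Reads hit `sales_view` instead of scanning the orders, and the view has to be regenerated when new orders arrive, which is the consistency trade-off listed above.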