Incident Response and Contingency Plan Training for Virtustream Healthcare Cloud March 2022

BrandonGuerrero47 73 views 45 slides Jun 29, 2024


March 2022 Incident Response and Contingency Plan Training

References
  VIMS Security Incident Response Plan (SIRP)
  SIOC SOP documentation
  NIST SP 800-61 Rev. 2, Computer Security Incident Handling Guide
  NIST SP 800-34 Rev. 1, Contingency Planning Guide for Federal Information Systems
  NIST SP 800-84, Guide to Test, Training, and Exercise Programs for IT Plans and Capabilities
  Virtustream Federal Cloud (VFC) Information System Contingency Plan (ISCP)

Guidelines and Standards
  ISO (27001:2013, 27017:2015, 27018:2014, 22301:2012)
  NIST
  FedRAMP
  PCI-DSS
  DoD DISA

Objectives: Incident Response
  Define and describe the Virtustream Information Management System (VIMS) Security Incident Response Plan (SIRP) and Standard Operating Procedures (SOPs)
  Review Security Intelligence and Operations Center (SIOC) Activities
  Review Monitored Environments
  Define the Incident Categories (CATs) and Response Types
  Define and review the Incident Response Lifecycle
  Define and review SIOC Tools & Capabilities

Objectives: Contingency Planning
  Describe Contingency Planning and Terms
  Describe Contingency Plan Phases
  Describe Backup and Site Types
  Describe Contingency Plan Invocation
  Describe Contingency Plan Test Types and Goals

SIRP & SOPs

Security Incident Response Plan
The VIMS Security Incident Response Plan is a document that provides overarching definitions for the activities and processes carried out by the SIOC. It can be referenced on the Dell intranet through the following link: VIMS SIRP, or on the SIOC Wiki page: SIOC Wiki.

The SIOC has developed SOP playbooks for incident-specific scenarios such as:
  Malicious code detected on a Linux/Windows host
  Domain Admin creation
  Host/Network IDS signatures
  Information spillage
The SOPs can be referenced in SNOW, under Security Incident - Knowledge SOPs.

SIOC Activities

SIOC Activities: Investigations
During an investigation, the analyst attempts to collect the following information and document it in the ticket:
  The nature of the incident
  The time and date of occurrence
  The affected users and assets
  Who the incident was reported to
  Any vulnerabilities identified
  Follow-up action items
  Lessons learned

Alerts
  The SOC utilizes dozens of custom alerts
    Provides immediate detection of a variety of threats
  Alerts automatically open a ticket
    Allows an incident to be tracked from beginning to end
  Alerts are regularly updated and expanded
    Driven by our strong threat intelligence program
  Alerts can be customized based on customer need

SIOC Activities: Containment and Escalation
During incident escalation, the following should be considered:
  How widespread is the incident?
  What is the impact to business operations?
  How difficult is it to contain the incident?
  Is it affecting just one or several clients?
  Is it related to Federal Missions?
  How fast is the incident propagating?
  What is the estimated financial impact to Virtustream or its customers?
  Will this negatively affect Virtustream’s image?

SIOC Activities: PagerDuty
The SIOC is staffed 24/7, and tier-3 members participate in an on-call rotation for one week at a time.
  Check schedule and coverage on PagerDuty
  Acknowledge the page
  Ticket opened by other teams, typically opened as an Incident
  Communications bridge opened via Zoom and possibly Slack
  Internal / customer facing?
  Customer paying for security services?
  Follow SIOC Incident Response methodology
  Lessons learned

SIOC Monitored Environments

SIOC Monitored Environments
The SIOC performs security monitoring within the following environments:
  VEC (Virtustream Enterprise Cloud, OKL, NEB)
  VFC (Virtustream Federal Cloud)
  MORO (DEWA/ENOC)
  APJ (Australia and Japan)
  ITAR (International Traffic in Arms Regulations)
For an updated version, visit: SIOC Coverage
Security tools used for monitoring are discussed in the “SIOC Tools and Capabilities” section of this presentation.

CATs & Response Types

Incident Categories
Category | Name | Description
CAT 0 | Tabletop IR Exercise/Network Defense Training | This category is used during Virtustream and/or tenant security response testing and training.
CAT 1 | Root Level Intrusion (Incident) | This category is used when an individual or entity gains unauthorized privileged access to a system within the Virtustream or tenant network. This category includes unauthorized access to information or unauthorized access to account credentials that could be used to perform administrative functions (e.g., domain administrator).
CAT 2 | User Level Intrusion (Incident) | Unauthorized non-privileged access to a Virtustream or tenant network. Non-privileged access, often referred to as user-level access, provides restricted access to the Information Systems based on the privileges granted to the user.
CAT 3 | Unsuccessful Activity Attempt (Event) | Deliberate attempts to gain unauthorized access to a system that are defeated by normal defensive mechanisms. The attacker fails to gain access to the Virtustream or tenant network (i.e., the attacker attempts valid or potentially valid username and password combinations) and the activity cannot be characterized as exploratory scanning.
CAT 4 | Denial of Service (Incident) | An attack that successfully prevents or impairs the normal authorized functionality of networks, systems, or applications by exhausting resources. This activity includes being the victim of or participating in the DoS attack.

Incident Categories (Cont.)
Category | Name | Description
CAT 5 | Non-Compliance Activity (Event) | Activity that potentially exposes Virtustream or tenant networks to increased risk as a result of the action or inaction of authorized users. This includes administrative and user actions such as failure to apply security patches, connections across security domains, installation of vulnerable applications, and other breaches of existing policies.
CAT 6 | Reconnaissance (Event) | This category includes any activity that seeks to access or identify a computer, open ports, protocols, services, or any combination for further exploit. This activity does not directly result in a compromise or denial of service.
CAT 7 | Malicious Logic (Incident) | Successful installation of malicious software (e.g., worm, Trojan horse, virus, or other code-based malicious entity) that infects the operating system or application. Virtustream does not report malicious logic that has been successfully quarantined by anti-malware software.
CAT 8 | Investigating (Event) | Unconfirmed incidents that are potentially malicious or anomalous activity deemed by the SOC to warrant further investigation.
CAT 9 | Explained Anomaly (Event) | Suspicious events that after further investigation are determined to be non-malicious activity and do not fit the criteria for any other category. This includes events such as IS malfunctions and false alarms.
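The category tables lend themselves to a small lookup structure, for example when tagging tickets in a script. The following is a minimal sketch, not part of the SIRP: the class name, the helper `by_number`, and the constant `INITIAL_CATEGORY` are hypothetical, and the Incident/Event labels are copied from the tables (CAT 0 carries no label there, so it is left as None).

```python
from enum import Enum


class IncidentCategory(Enum):
    """CAT 0-9 from the tables: (number, name, Incident/Event label)."""

    CAT0 = (0, "Tabletop IR Exercise/Network Defense Training", None)
    CAT1 = (1, "Root Level Intrusion", "Incident")
    CAT2 = (2, "User Level Intrusion", "Incident")
    CAT3 = (3, "Unsuccessful Activity Attempt", "Event")
    CAT4 = (4, "Denial of Service", "Incident")
    CAT5 = (5, "Non-Compliance Activity", "Event")
    CAT6 = (6, "Reconnaissance", "Event")
    CAT7 = (7, "Malicious Logic", "Incident")
    CAT8 = (8, "Investigating", "Event")
    CAT9 = (9, "Explained Anomaly", "Event")

    def __init__(self, number, label, classification):
        self.number = number
        self.label = label
        self.classification = classification

    @property
    def is_incident(self) -> bool:
        # True only for categories the tables mark "(Incident)".
        return self.classification == "Incident"


def by_number(number: int) -> IncidentCategory:
    """Hypothetical helper: look a category up by its CAT number."""
    for category in IncidentCategory:
        if category.number == number:
            return category
    raise ValueError(f"unknown category number: {number}")


# The Process Workflow slide lists CAT 8 as the category assigned
# when an incident ticket is opened.
INITIAL_CATEGORY = IncidentCategory.CAT8
```

For example, `by_number(7).is_incident` is true, while `by_number(3).is_incident` is false because CAT 3 is labeled an Event.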

Functional Impact Categories
Category | Definition | Assigned Ticket Priority | Response Time
None | No effect to the organization’s ability to provide all services to all users | 4 | 8 business hours
Low | Minimal effect; the organization can still provide all critical services to all users but has lost efficiency | 3 | 4 hours
Medium | Organization has lost the ability to provide a critical service to a subset of system users | 2 | 60 minutes
High | Organization is no longer able to provide some critical services to any users | 1 | 30 minutes

Information Impact Categories
Category | Definition | Assigned Ticket Priority | Response Time
None | No information was removed, changed, deleted, or otherwise compromised | 4 | 8 business hours
Privacy Breach | Sensitive personally identifiable information (PII) of taxpayers, employees, beneficiaries, customers, etc., was accessed or removed | 1 | 30 minutes
Proprietary Breach | Unclassified proprietary information, such as protected critical infrastructure information (PCII), was accessed or removed | 1 | 30 minutes
Integrity Loss | Sensitive or proprietary information was changed or deleted | 1 | 30 minutes
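The two impact tables can be read as lookups from category to ticket priority and response time. The sketch below encodes them that way. The rule for combining the two ratings (the more urgent one wins) is an assumption made for illustration and is not stated in the slides, and the function name `ticket_priority` is hypothetical.

```python
# Values copied from the Functional and Information Impact tables:
# category -> (assigned ticket priority, response time)
FUNCTIONAL_IMPACT = {
    "None": (4, "8 business hours"),
    "Low": (3, "4 hours"),
    "Medium": (2, "60 minutes"),
    "High": (1, "30 minutes"),
}

INFORMATION_IMPACT = {
    "None": (4, "8 business hours"),
    "Privacy Breach": (1, "30 minutes"),
    "Proprietary Breach": (1, "30 minutes"),
    "Integrity Loss": (1, "30 minutes"),
}


def ticket_priority(functional: str, information: str):
    """Return (priority, response_time) for a pair of impact ratings.

    ASSUMPTION: when the two tables disagree, the more urgent rating
    (lowest priority number) is used. The slides do not define this.
    """
    candidates = [
        FUNCTIONAL_IMPACT[functional],
        INFORMATION_IMPACT[information],
    ]
    return min(candidates, key=lambda pair: pair[0])
```

Under that assumption, a Low functional impact combined with an Integrity Loss yields priority 1 with a 30-minute response time.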

Incident Response Lifecycle

Process Workflow
  Incident Ticket Opened
  Assigned CAT 8
  Initial Investigation

Incident Response Lifecycle
The Incident Response Lifecycle contains four major stages:
  Analyze
  Contain
  Eradicate
  Review

Incident Response Lifecycle (cont.)
Analysis
The SIOC Analyst initiates the investigation to determine any effect on the confidentiality, integrity, and availability of Virtustream information. Artifacts are collected and uploaded to the SNOW incident ticket. The Analyst applies proper categorization and investigates the root cause. Additional actions may also be coordinated with other internal support teams to gather artifacts, take any necessary actions, and determine whether the incident is a false positive.
Identification Steps
Determine if it’s a false positive:
  a) Determine if it’s a scanning engine (see wiki)
  b) Host and Network IDS generating a false alarm
  c) Possible pentest in progress
Example hostname: A001US034SLG001
  A001: unique RID
  US: geolocation (AU/UK/EU/DE/FR/NL)
  03: geo instance
  4: Linux (2 is Windows and 3 is ESX)
  SLG: 3-character ID for what’s on the box (Syslog)
If you get an IP address, then you can:
  a) Search in SNOW, Slack, or Trust Platform
  b) Contact NetOps for more information
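The hostname convention above can be decoded mechanically during triage. The following is a minimal sketch, assuming the field widths seen in the single example A001US034SLG001 hold for every host. The slide does not explain the trailing three digits, so they are captured as an opaque `suffix`. The pattern, the function name, and the field names are hypothetical.

```python
import re
from typing import Dict

# OS codes as given on the slide: 4 is Linux, 2 is Windows, 3 is ESX.
OS_CODES = {"2": "Windows", "3": "ESX", "4": "Linux"}

# ASSUMPTION: field widths are inferred from the one example hostname.
HOSTNAME_PATTERN = re.compile(
    r"^(?P<rid>[A-Z]\d{3})"       # A001: unique RID
    r"(?P<geo>[A-Z]{2})"          # US: geolocation
    r"(?P<geo_instance>\d{2})"    # 03: geo instance
    r"(?P<os_code>\d)"            # 4: operating system code
    r"(?P<role>[A-Z]{3})"         # SLG: what is on the box
    r"(?P<suffix>\d{3})$"         # 001: not explained on the slide
)


def parse_hostname(hostname: str) -> Dict[str, str]:
    """Split a hostname into its naming-convention fields."""
    match = HOSTNAME_PATTERN.match(hostname.strip().upper())
    if match is None:
        raise ValueError(f"hostname does not follow the convention: {hostname!r}")
    fields = match.groupdict()
    fields["os"] = OS_CODES.get(fields["os_code"], "unknown")
    return fields
```

Calling `parse_hostname("A001US034SLG001")` returns the RID `A001`, geolocation `US`, geo instance `03`, OS `Linux`, and role `SLG`. Hostnames that do not match the assumed layout raise a `ValueError` rather than being guessed at.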

Incident Response Lifecycle (cont.)
Analysis: Intelligence Development
  When did the alert occur?
  What type of attack could this be?
  What are the src and dst IPs?
  What are the src and dst ports?
  What is the operating system?
  What is the direction of the event?
    Internal to Internal
    Internal to External
    External to Internal
  Rule out false positives
    Is it a vuln scanner?
    Is there a pen test going on?
  Log into Splunk/Sentinel and analyze the traffic (use Security Essentials App queries as a reference)
  Check Trust Platform to gather the latest vulnerability results
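The direction question above can be answered from the source and destination IPs alone. This is a minimal sketch using Python's standard `ipaddress` module. It assumes that "internal" means the RFC 1918 private ranges, which is only a placeholder: a real deployment would pass its own monitored CIDR blocks. The function and constant names are hypothetical.

```python
import ipaddress
from typing import Iterable

# ASSUMPTION: RFC 1918 private ranges stand in for "internal" networks.
DEFAULT_INTERNAL = ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")


def _is_internal(ip: str, networks) -> bool:
    """True if the address falls inside any of the given networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


def event_direction(src: str, dst: str,
                    internal_cidrs: Iterable[str] = DEFAULT_INTERNAL) -> str:
    """Label an event as '<src side> to <dst side>'."""
    networks = [ipaddress.ip_network(cidr) for cidr in internal_cidrs]
    src_side = "Internal" if _is_internal(src, networks) else "External"
    dst_side = "Internal" if _is_internal(dst, networks) else "External"
    return f"{src_side} to {dst_side}"
```

For example, `event_direction("10.1.2.3", "8.8.8.8")` gives "Internal to External". The function can also return "External to External", which the slide does not list and which would normally indicate a misconfigured internal range.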

Incident Response Lifecycle (cont.)
Analysis: Infected System
  Perform a thorough AV scan on the system if possible (internal systems only)
  Submit a ticket to the Compute team for a snapshot of the system
  Perform a memory analysis with Volatility (see Wiki for Forensic Procedures and Volatility Scripts)
    a) Identify rogue processes (pslist, psscan plugins)
    b) Analyze process DLLs and handles (dlllist, cmdscan, ldrmodules plugins)
    c) Review network artifacts (netscan plugin)
    d) Look for code injection (malsysproc, malfind plugins)
    e) Signs of rootkits (psxview, apihooks plugins)
    f) Suspicious drivers (driverbl, servicebl plugins)
    g) Registry analysis (autoruns, shimcache plugins)

Incident Response Lifecycle (cont.)
Contain
Support teams act to isolate affected systems, such as those infected with malicious code or accessed by an intruder. During this stage, an assessment is performed to determine the impact and next course of action. If the determined impact results in a disruption, outage, or disaster that extends beyond the established Recovery Time Objective (RTO) for the system, the Information System Contingency Plan should be activated. Throughout this process, the SIOC communicates through a private Slack channel with the stakeholders and a Zoom session. We will also reach out via email to support teams for follow-up and to ensure continuity during these phases. Additionally, the SNOW ticket will be regularly updated until closure.

Incident Response Lifecycle (cont.)
Contain
Contact list:
  Compute Team: [email protected]
  Monitoring Team: [email protected]
  Database Team: [email protected]
  OS Team: [email protected]
  Platform Team: [email protected]
  US Network Team: [email protected]
  On-shore OS: [email protected]
  Non SAP (mostly offshore): [email protected]
  CCS/AMS includes DB and OS (onshore): [email protected]

Incident Response Lifecycle (cont.)
Eradicate
The extent of any damage is determined. Remediation steps are prepared and communicated to all stakeholders. Coordination with support teams is ongoing during this stage.
  Remove the presence of all infected systems
  Deny access by:
    Disconnecting the VMNICs or the virtual switch
    Blocking malicious IP addresses or domain names
  Restrict access to known compromised accounts
  Reset compromised accounts and passwords
  Coordinate with support teams to rebuild the compromised systems
    System restored from backup, patched, and updated tooling
    Decommission system

Incident Response Lifecycle (cont.)
Review
Artifacts are reviewed. After-action response and resolution steps are taken to ensure that risks to impacted systems have been mitigated and vulnerabilities have been addressed. Lessons learned may also be shared with the team.
  Restore normal operations (Recovery Mode)
  Improve the overall security:
    Lessons learned
    Patch management
    Ad hoc SIEM alerts
    Security awareness program
    Network visibility enhancements
    Audits/pentests

SIOC Tools and Capabilities

SIOC Tools by Environment
These are some of the core tools relied upon during our monitoring activities:
  ITAR: Splunk, ServiceNow, Trend, Fortinet, Nessus
  VEC: Splunk, Sentinel, ServiceNow, Trend, Fortinet, Nessus
  VFC: Splunk, Trend, ServiceNow, ACAS, Fortinet, Nessus
  RCSnet: Splunk, ServiceNow
  MORO: Splunk, Trend, ServiceNow, Fortinet, Nessus

SIOC Tools (cont.)
These tools are also used throughout the IR stages:
  Splunk: aggregates data from several sources
  Sentinel: hosted in Azure and will be used to aggregate logs for IZ
  Trust Platform: customer-facing portal that aggregates logs and vulnerability data
  SNOW: ticket system; can help with finding out more details about systems or users during investigations
  SIFT/REMnux: Linux distros for forensics
  Nessus: vulnerability scanner
  Volatility: memory analysis tool
  Wireshark: packet analyzer
  MISP: threat information sharing
  Slack: collaboration, system owners

Contingency Plan

Contingency Planning: Definition and Terms
Contingency planning refers to “interim measures to recover information system services after a disruption. Interim measures may include relocation of information systems and operations to an alternate site, recovery of information system functions using alternate equipment, or performance of information system functions using manual methods.” (NIST SP 800-34, Rev 1, Contingency Planning Guide for Federal Information Systems).
  Recovery Time Objective (RTO): the maximum amount of time that a system resource can remain unavailable before there is an unacceptable impact on other system resources and supported mission/business processes.
  Maximum Tolerable Down-Time (MTD): the maximum period of time that a given business process can be inoperative.
  Recovery Point Objective (RPO): the point in time, prior to a disruption or system outage, to which data can be recovered (given the most recent backup copy of the data) after an outage.

Contingency Plan: Three Phases
Activation and Notification
  Activation of the ISCP occurs after a disruption, outage, or disaster that may reasonably extend beyond the RTO established for the system. Once the ISCP is activated, the information system stakeholders are notified of a possible long-term outage, and a thorough outage assessment is performed for the information system. Information from the outage assessment is analyzed and may be used to modify recovery procedures specific to the cause of the outage.
Recovery
  Details the activities and procedures for recovery of the affected system. Activities and procedures are written at a level such that an appropriately skilled technician can recover the system without intimate system knowledge. This phase includes notification and awareness escalation procedures for communication of recovery status to system stakeholders.
Reconstitution
  Defines the actions taken to test and validate system capability and functionality at the original or new permanent location. This phase consists of two major activities: validating data and operational functionality, followed by deactivation of the plan.
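The activation criterion described above reduces to comparing an estimated outage duration against the system's RTO. The sketch below shows that comparison only. The function name and the example durations are hypothetical, and a real activation decision also rests on the outage assessment and on stakeholder judgment.

```python
from datetime import timedelta

# The three ISCP phases, in the order given in the slide.
ISCP_PHASES = ("Activation and Notification", "Recovery", "Reconstitution")


def should_activate_iscp(estimated_outage: timedelta, rto: timedelta) -> bool:
    """True when the outage is expected to extend beyond the system's RTO."""
    return estimated_outage > rto


# Hypothetical values for illustration only: a 4-hour RTO and an
# outage currently estimated at 6 hours.
if should_activate_iscp(timedelta(hours=6), timedelta(hours=4)):
    next_phase = ISCP_PHASES[0]
```

An outage estimated at exactly the RTO does not trigger activation in this sketch, since the slide speaks of outages that extend beyond the RTO.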

Backup Types

A common understanding of data backup definitions is necessary in order to ensure that data restoration is successful. Virtustream recognizes different types of backups, which have different purposes.
Full Backup
  A full backup is the starting point for all other types of backup and contains all the data in the folders and files that are selected to be backed up. Because a full backup stores all files and folders, frequent full backups result in faster and simpler restore operations.
Differential Backup
  A differential backup contains all files that have changed since the last FULL backup. The advantage of a differential backup is that it shortens backup time compared to a full backup and restore time compared to a chain of incremental backups. However, if you perform the differential backup too many times, the size of the differential backup might grow to be larger than the baseline full backup.
Incremental Backup
  An incremental backup stores all files that have changed since the last FULL, DIFFERENTIAL, or INCREMENTAL backup. The advantage of an incremental backup is that it takes the least time to complete. However, during a restore operation, each incremental backup must be processed, which could result in a lengthy restore job.
Mirror Backup
  A mirror backup is identical to a full backup, with the exception that the files are not compressed in zip files and they cannot be protected with a password. A mirror backup is most frequently used to create an exact copy of the source data.
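The restore-time trade-off between the backup types is easier to see by working out which backups a restore actually needs. The following is a minimal sketch under the definitions above: a differential depends only on the last full backup, and an incremental depends on whatever backup came immediately before it. The `Backup` class, the labels, and the function name are hypothetical. Mirror backups are left out for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Backup:
    label: str
    kind: str  # "full", "differential", or "incremental"


def restore_chain(history: List[Backup]) -> List[Backup]:
    """Return the backups needed to restore the latest state, in apply order.

    `history` is ordered oldest first.
    """
    chain = []
    i = len(history) - 1
    while i >= 0:
        backup = history[i]
        chain.append(backup)
        if backup.kind == "full":
            # A full backup is the baseline: nothing older is needed.
            return list(reversed(chain))
        if backup.kind == "differential":
            # A differential depends only on the last full backup,
            # so skip everything in between.
            i -= 1
            while i >= 0 and history[i].kind != "full":
                i -= 1
        elif backup.kind == "incremental":
            # An incremental depends on the backup right before it.
            i -= 1
        else:
            raise ValueError(f"unknown backup kind: {backup.kind!r}")
    raise ValueError("no full backup found to restore from")
```

With a history of a Sunday full, Monday and Tuesday incrementals, a Wednesday differential, and a Thursday incremental, the chain is the Sunday full, the Wednesday differential, and the Thursday incremental. Without the differential, every incremental since Sunday would have to be processed.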

Site Types

The location from which the recovery will take place is known as the backup or alternate site. There are different types of alternate sites:
Cold Sites
  Cold Sites are typically facilities with adequate space and infrastructure (electric power, telecommunications connections, and environmental controls) to support information system recovery activities.
Warm Sites
  Warm Sites are partially equipped office spaces that contain some or all of the system hardware, software, telecommunications, and power sources.
Hot Sites
  Hot Sites are facilities appropriately sized to support system requirements and configured with the necessary system hardware, supporting infrastructure, and support personnel.
Mirrored Sites
  Mirrored Sites are fully redundant facilities with automated real-time information mirroring. Mirrored sites are identical to the primary site in all technical respects.

Invoking The Contingency Plan

What types of incidents require invoking the Contingency Plan?
  Security-related incidents
  Platform-related incidents
  Physical security incidents
  High voltage equipment failure
  Natural disaster
  Terrorist or Government disruption

Contingency Plan Tests and Goals
  What type of test: functional or tabletop?
  What are the differences between the two types of test?
  What are some possible scenarios?
  What are the goals of the test?
  What are the expected test outcomes?
  What systems are involved?
  Who are the participants?
  How should “Lessons Learned” be used to update and improve the existing Contingency Plan (CP)?

Questions