Incident Response and Contingency Plan Training for Virtustream Healthcare Cloud March 2022
BrandonGuerrero47
45 slides
Jun 29, 2024
About This Presentation
Incident Response Training Objectives:
Define and describe the Virtustream Information Management System (VIMS) Security Incident Response Plan (SIRP) and Standard Operating Procedures (SOPs)
Review Security Intelligence and Operations Center (SIOC) Activities
Review Monitored Environments
Define the Incident Categories (CATs) and Response Types
Define and review the Incident Response Lifecycle
Define and review SIOC Tools & Capabilities
Contingency Planning Training Objectives
Describe Contingency Planning and Terms
Describe Contingency Plan Phases
Describe Backup and Site Types
Describe Contingency Plan Invocation
Contingency Plan Test Types and Goals
Slide Content
March 2022 Incident Response and Contingency Plan Training
References
VIMS Security Incident Response Plan (SIRP)
SIOC SOP documentation
NIST SP 800-61 Rev. 2, Computer Security Incident Handling Guide
NIST SP 800-34 Rev. 1, Contingency Planning Guide for Federal Information Systems
NIST SP 800-84, Guide to Test, Training, and Exercise Programs for IT Plans and Capabilities
Virtustream Federal Cloud (VFC) Information System Contingency Plan (ISCP)
Guidelines and Standards
ISO (27001:2013, 27017:2015, 27018:2014, 22301:2012)
NIST
FedRAMP
PCI-DSS
DoD DISA
Objectives: Incident Response
Define and describe the Virtustream Information Management System (VIMS) Security Incident Response Plan (SIRP) and Standard Operating Procedures (SOPs)
Review Security Intelligence and Operations Center (SIOC) Activities
Review Monitored Environments
Define the Incident Categories (CATs) and Response Types
Define and review the Incident Response Lifecycle
Define and review SIOC Tools & Capabilities
Objectives: Contingency Planning
Describe Contingency Planning and Terms
Describe Contingency Plan Phases
Describe Backup and Site Types
Describe Contingency Plan Invocation
Contingency Plan Test Types and Goals
SIRP & SOPs
Security Incident Response Plan
The VIMS Security Incident Response Plan is a document that provides overarching definitions for the activities and processes carried out by the SIOC.
This document can be referenced on the Dell intranet through the following link: VIMS SIRP
Or on the SIOC Wiki page: SIOC Wiki
The SIOC has developed SOP playbooks for incident-specific scenarios such as:
Malicious Code detected on a Linux/Win host
Domain Admin Creation
Host/Network IDS signatures
Information Spillage
The SOPs can be referenced in SNOW, under Security Incident - Knowledge SOPs
SIOC Activities
SIOC Activities: Investigations
During an investigation, the analyst will attempt to collect the following information and document it in the ticket:
The nature of the incident
The time and date of occurrence
The affected users and assets
Who the incident was reported to
Any vulnerabilities identified
Follow-up action items
Lessons learned
Alerts
The SOC utilizes dozens of custom alerts
  Provides immediate detection of a variety of threats
Alerts automatically open a ticket
  Allows incident to be tracked from beginning to end
Alerts are regularly updated and expanded
  Driven by our strong threat intelligence program
Alerts can be customized based on customer need
SIOC Activities: Containment and Escalation
During incident escalation, the following should be considered:
How widespread is the incident?
What is the impact to business operations?
How difficult is it to contain the incident?
Is it affecting just one or several clients?
Is it related to Federal Missions?
How fast is the incident propagating?
What is the estimated financial impact to Virtustream or its customers?
Will this negatively affect Virtustream’s image?
SIOC Activities: PagerDuty
The SIOC is staffed 24/7, and tier-3 members participate in an on-call rotation for one week at a time.
Check schedule and coverage on PagerDuty
Acknowledge the page
Ticket opened by other teams, typically opened as an Incident
Communications bridge opened via Zoom and possibly Slack
Internal / Customer facing?
Customer paying for security services?
Follow SIOC Incident Response methodology
Lessons Learned
SIOC Monitored Environments
SIOC Monitored Environments
The SIOC performs security monitoring within the following environments:
VEC (Virtustream Enterprise Cloud, OKL, NEB)
VFC (Virtustream Federal Cloud)
MORO (DEWA/ENOC)
APJ (Australia and Japan)
ITAR (International Traffic in Arms Regulations)
For an updated version, visit: SIOC Coverage
Security tools used for monitoring are discussed in the “SIOC Tools and Capabilities” section of this presentation.
CATs & Response Types
Incident Categories
Category | Name | Description
CAT 0 | Tabletop IR Exercise/Network Defense Training | This category is used during Virtustream and/or tenant security response testing and training.
CAT 1 | Root Level Intrusion (Incident) | This category is used when an individual or entity gains unauthorized privileged access to a system within the Virtustream or tenant network. This category includes unauthorized access to information or unauthorized access to account credentials that could be used to perform administrative functions (e.g., domain administrator).
CAT 2 | User Level Intrusion (Incident) | Unauthorized non-privileged access to a Virtustream or tenant network. Non-privileged access, often referred to as user-level access, provides restricted access to the Information Systems based on the privileges granted to the user.
CAT 3 | Unsuccessful Activity Attempt (Event) | Deliberate attempts to gain unauthorized access to a System that are defeated by normal defensive mechanisms. Attacker fails to gain access to the Virtustream or tenant network (i.e., attacker attempts valid or potentially valid username and password combinations) and the activity cannot be characterized as exploratory scanning.
CAT 4 | Denial of Service (Incident) | An attack that successfully prevents or impairs the normal authorized functionality of networks, systems, or applications by exhausting resources. This activity includes being the victim of or participating in the DoS attack.
Incident Categories (Cont.)
Category | Name | Description
CAT 5 | Non-Compliance Activity (Event) | Activity that potentially exposes Virtustream or tenant networks to increased risk as a result of the action or inaction of authorized users. This includes administrative and user actions such as failure to apply security patches, connections across security domains, installation of vulnerable applications, and other breaches according to existing policies.
CAT 6 | Reconnaissance (Event) | This category includes any activity that seeks to access or identify a computer, open ports, protocols, services, or any combination for further exploit. This activity does not directly result in a compromise or denial of service.
CAT 7 | Malicious Logic (Incident) | Successful installation of malicious software (e.g., Worm, Trojan Horse, Virus, or other code-based malicious entity) that infects the operating system or application. Virtustream does not report malicious logic that has been successfully quarantined by anti-malware software.
CAT 8 | Investigating (Event) | Unconfirmed incidents that are potentially malicious or anomalous activity deemed by the SOC to warrant further investigation.
CAT 9 | Explained Anomaly (Event) | Suspicious events that after further investigation are determined to be non-malicious activity and do not fit the criteria for any other categories. This includes events such as IS malfunctions and false alarms.
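As a study aid, the category table above can be written as a small lookup. This is a minimal sketch, not part of the SIRP: the `Category` class, the `CATEGORIES` dictionary, and `is_incident` are hypothetical names, and labeling CAT 0 as "Exercise" is an assumption, since the slide gives it no Incident/Event tag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    number: int
    name: str
    kind: str  # "Incident" or "Event" as tagged on the slide; "Exercise" is our label for CAT 0


# Names and Incident/Event tags copied from the category tables.
CATEGORIES = {
    0: Category(0, "Tabletop IR Exercise/Network Defense Training", "Exercise"),
    1: Category(1, "Root Level Intrusion", "Incident"),
    2: Category(2, "User Level Intrusion", "Incident"),
    3: Category(3, "Unsuccessful Activity Attempt", "Event"),
    4: Category(4, "Denial of Service", "Incident"),
    5: Category(5, "Non-Compliance Activity", "Event"),
    6: Category(6, "Reconnaissance", "Event"),
    7: Category(7, "Malicious Logic", "Incident"),
    8: Category(8, "Investigating", "Event"),
    9: Category(9, "Explained Anomaly", "Event"),
}


def is_incident(cat_number: int) -> bool:
    """Return True when the category is tagged (Incident) in the table."""
    return CATEGORIES[cat_number].kind == "Incident"
```

For example, `is_incident(7)` is true for Malicious Logic, while `is_incident(6)` is false because Reconnaissance is tagged as an Event.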
Functional Impact Categories
Category | Definition | Assigned Ticket Priority | Response Time
None | No effect to the organization’s ability to provide all services to all users | 4 | 8 business hours
Low | Minimal effect; the organization can still provide all critical services to all users but has lost efficiency | 3 | 4 hours
Medium | Organization has lost the ability to provide a critical service to a subset of system users | 2 | 60 minutes
High | Organization is no longer able to provide some critical services to any users | 1 | 30 minutes
Information Impact Categories
Category | Definition | Assigned Ticket Priority | Response Time
None | No information was removed, changed, deleted, or otherwise compromised | 4 | 8 business hours
Privacy Breach | Sensitive personally identifiable information (PII) of taxpayers, employees, beneficiaries, customers, etc., was accessed or removed | 1 | 30 minutes
Proprietary Breach | Unclassified proprietary information, such as protected critical infrastructure information (PCII), was accessed or removed | 1 | 30 minutes
Integrity Loss | Sensitive or proprietary information was changed or deleted | 1 | 30 minutes
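The two impact tables above map each category to a ticket priority and a response time. The sketch below encodes them as dictionaries. The slides do not say how to combine a functional rating with an information rating, so taking the more urgent of the two (the numerically lower priority) is an assumption made here for illustration, and `ticket_priority` is a hypothetical helper name.

```python
# (priority, response time) pairs copied from the two impact tables.
FUNCTIONAL_IMPACT = {
    "None": (4, "8 business hours"),
    "Low": (3, "4 hours"),
    "Medium": (2, "60 minutes"),
    "High": (1, "30 minutes"),
}

INFORMATION_IMPACT = {
    "None": (4, "8 business hours"),
    "Privacy Breach": (1, "30 minutes"),
    "Proprietary Breach": (1, "30 minutes"),
    "Integrity Loss": (1, "30 minutes"),
}


def ticket_priority(functional: str, information: str) -> tuple:
    """Return (priority, response time) for a ticket.

    Assumption: when the two ratings disagree, the more urgent one
    (lower priority number) wins. The slides do not state this rule.
    """
    candidates = (FUNCTIONAL_IMPACT[functional], INFORMATION_IMPACT[information])
    return min(candidates, key=lambda row: row[0])
```

Under that assumption, a Low functional impact combined with a Privacy Breach yields priority 1 with a 30-minute response time.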
Incident Response Lifecycle
Process Workflow
Incident Ticket Opened
Assigned CAT8
Initial Investigation
Incident Response Lifecycle
The Incident Response Lifecycle contains four major stages:
Analyze
Contain
Eradicate
Review
Incident Response Lifecycle (cont.): Analysis
The SIOC Analyst initiates the investigation to determine any effect on the confidentiality, integrity, and availability of Virtustream information. Artifacts are collected and uploaded to the SNOW incident ticket. The Analyst applies proper categorization and investigates the root cause. Additional actions may also be coordinated with other internal support teams to gather artifacts, take any necessary actions, and determine whether the incident is a false positive.
Identification Steps
Determine if it’s a false positive:
a) Determine if it’s a scanning engine (see wiki)
b) Host and Network IDS generating false alarm
c) Possible pentest in progress
Example name: A001US034SLG001
A001 - Unique RID
US - Geolocation (AU/UK/EU/DE/FR/NL)
03 - Geo instance
4 - Linux (2 is Windows and 3 is ESX)
SLG - 3-character ID for what’s on the box (Syslog)
If you get an IP address then you can:
a) Search in SNOW, Slack, or Trust Platform
b) Contact NetOps for more information
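The naming breakdown above lends itself to a small parser that an analyst could use to read a name at a glance. This is a rough sketch under assumptions: the field widths are inferred from the single example A001US034SLG001, the trailing digits are not explained on the slide and are kept as an uninterpreted `suffix`, and `parse_name`, `NAME_RE`, and `OS_CODES` are hypothetical names.

```python
import re

# Field layout inferred from the one example on the slide (an assumption):
# RID (letter + 3 digits), 2-letter geolocation, 2-digit geo instance,
# 1-digit OS code, 3-letter role ID, then trailing digits the slide does not explain.
NAME_RE = re.compile(
    r"^(?P<rid>[A-Z]\d{3})"
    r"(?P<geo>[A-Z]{2})"
    r"(?P<instance>\d{2})"
    r"(?P<os>\d)"
    r"(?P<role>[A-Z]{3})"
    r"(?P<suffix>\d*)$"
)

# OS codes as listed on the slide.
OS_CODES = {"2": "Windows", "3": "ESX", "4": "Linux"}


def parse_name(name: str):
    """Split a name into its fields, or return None if it does not match."""
    match = NAME_RE.match(name.upper())
    if match is None:
        return None
    fields = match.groupdict()
    fields["os"] = OS_CODES.get(fields["os"], "unknown")
    return fields
```

Calling `parse_name("A001US034SLG001")` returns the RID `A001`, geolocation `US`, instance `03`, OS `Linux`, and role `SLG`, with `001` left as the unexplained suffix.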
Incident Response Lifecycle (cont.): Analysis
Intelligence Development
When did the alert occur?
What type of attack could this be?
What are the source and destination IPs?
What are the source and destination ports?
What is the operating system?
What is the direction of the event?
  Internal to Internal
  Internal to External
  External to Internal
Rule out false positives
  Is it a vulnerability scanner?
  Is there a pen test going on?
Log into Splunk/Sentinel and analyze the traffic (use Security Essentials App queries as a reference).
Check Trust Platform to gather the latest vulnerability results
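The "direction of the event" question above can be answered mechanically once the internal address ranges are known. The sketch below uses Python's standard `ipaddress` module. The internal ranges passed in are placeholders (an assumption), since the slides do not list Virtustream's networks, and `event_direction` is a hypothetical helper name.

```python
import ipaddress


def event_direction(src: str, dst: str, internal_networks) -> str:
    """Classify an event as 'Internal to External', etc.

    internal_networks is a list of CIDR strings. The caller must supply
    the real ranges; the ones used in examples here are placeholders.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in internal_networks]

    def side(ip: str) -> str:
        address = ipaddress.ip_address(ip)
        inside = any(address in network for network in networks)
        return "Internal" if inside else "External"

    return f"{side(src)} to {side(dst)}"
```

With `["10.0.0.0/8"]` as the internal range, `event_direction("203.0.113.5", "10.1.2.3", ["10.0.0.0/8"])` gives "External to Internal". The function can also return "External to External", a case the slide does not list.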
Incident Response Lifecycle (cont.): Analysis
Infected System
Perform a thorough AV scan on the system if possible (internal systems only)
Submit a ticket to the Compute team for a snapshot of the system
Perform a memory analysis with Volatility (see Wiki for Forensic Procedures and Volatility Scripts)
a) Identify rogue processes (pslist, psscan plugins)
b) Analyze process DLLs and handles (dlllist, cmdscan, ldr plugins)
c) Review network artifacts (netscan plugin)
d) Look for code injection (malsysproc, malfind)
e) Signs of rootkits (psxview, apihooks plugins)
f) Suspicious drivers (driverbl, servicebl plugins)
g) Registry analysis (autoruns, shimcache plugins)
Incident Response Lifecycle (cont.): Contain
Support teams act to isolate affected systems, such as those infected with malicious code or accessed by an intruder. During this stage, an assessment is performed to determine the impact and next course of action. If the determined impact results in a disruption, outage, or disaster that extends beyond the established Recovery Time Objective (RTO) for the system, the Information System Contingency Plan should be activated.
Throughout this process, the SIOC communicates with stakeholders through a private Slack channel and a Zoom session. We will also reach out via email to support teams for follow-up and to ensure continuity during these phases. Additionally, the SNOW ticket will be regularly updated until closure.
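The activation rule described above reduces to a single comparison between the expected outage and the system's RTO. The sketch below is illustrative only: `should_activate_iscp` is a hypothetical name, and the outage estimate is assumed to come from the impact assessment performed during containment.

```python
from datetime import timedelta


def should_activate_iscp(estimated_outage: timedelta, rto: timedelta) -> bool:
    """Return True when the expected outage extends beyond the RTO."""
    return estimated_outage > rto
```

For a system with a 4-hour RTO, an estimated 6-hour outage would call for activation, while a 2-hour outage would not.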
Incident Response Lifecycle (cont.): Eradicate
The extent of any damage is determined. Remediation steps are prepared and communicated to all stakeholders. Coordination with support teams is ongoing during this stage.
Remove the presence of all infected systems.
Deny access by:
  Disconnecting the VMNICs or the Virtual Switch
  Blocking malicious IP addresses or domain names
Restrict access to known compromised accounts
Reset compromised accounts and passwords
Coordinate with support teams to rebuild the compromised systems
  System restored from backup, patched and updated tooling
  Decommission system
Incident Response Lifecycle (cont.): Review
Artifacts are reviewed. After-action response and resolution steps are taken to ensure that risks to impacted systems have been mitigated and vulnerabilities have been addressed. Lessons learned may also be shared with the team.
Restore normal operations (Recovery Mode)
Improve the overall security:
  Lessons Learned
  Patch management
  Ad hoc SIEM alerts
  Security Awareness program
  Network Visibility Enhancements
  Audits/Pentest
SIOC Tools and Capabilities
SIOC Tools by Environment
These are some of the core tools relied upon during our monitoring activities:
ITAR - Splunk, ServiceNow, Trend, Fortinet, Nessus
VEC - Splunk, Sentinel, ServiceNow, Trend, Fortinet, Nessus
VFC - Splunk, Trend, ServiceNow, ACAS, Fortinet, Nessus
RCSnet - Splunk, ServiceNow
MORO - Splunk, Trend, ServiceNow, Fortinet, Nessus
SIOC Tools (cont.)
These tools are also used throughout the IR stages:
Splunk - Aggregates data from several sources
Sentinel - Hosted in Azure and will be used to aggregate logs for IZ
Trust Platform - Customer-facing portal that aggregates logs and vulnerability data
SNOW - Ticket system; can help with finding out more details about systems or users during investigations
SIFT/REMnux - Linux distros for forensics
Nessus - Vulnerability scanner
Volatility - Memory analysis tool
Wireshark - Packet analyzer
MISP - Threat information sharing
Slack - Collaboration, System Owners
Contingency Plan
Contingency Planning: Definition and Terms
Contingency planning refers to “interim measures to recover information system services after a disruption. Interim measures may include relocation of information systems and operations to an alternate site, recovery of information system functions using alternate equipment, or performance of information system functions using manual methods.” (NIST SP 800-34, Rev 1, Contingency Planning Guide for Federal Information Systems).
Recovery Time Objective (RTO): the maximum amount of time that a system resource can remain unavailable before there is an unacceptable impact on other system resources and supported mission/business processes.
Maximum Tolerable Down-Time (MTD): the maximum period of time that a given business process can be inoperative.
Recovery Point Objective (RPO): the point in time, prior to a disruption or outage, to which data must be recovered after the outage. In practice it expresses how much data loss is acceptable.
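A short worked check can make the relationship between these terms concrete. The sketch below treats RPO as NIST SP 800-34 Rev. 1 defines it (the point in time to which data must be recovered) and makes two simplifying assumptions that are not stated on the slide: the RTO should not exceed the MTD, and worst-case data loss equals the interval between backups. The function name `check_objectives` is hypothetical.

```python
from datetime import timedelta


def check_objectives(rto, mtd, rpo, backup_interval):
    """Return a list of planning problems (an empty list means none found).

    Assumptions for this sketch:
    - the RTO should fit within the MTD;
    - worst-case data loss equals the time between backups.
    """
    problems = []
    if rto > mtd:
        problems.append("RTO exceeds MTD")
    if backup_interval > rpo:
        problems.append("backup interval exceeds RPO")
    return problems
```

For example, a system with a 4-hour RTO, an 8-hour MTD, and a 1-hour RPO that is only backed up every 24 hours would be flagged for its backup interval.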
Contingency Plan: Three Phases
Activation and Notification
Activation of the ISCP occurs after a disruption, outage, or disaster that may reasonably extend beyond the RTO established for the system. Once the ISCP is activated, the information system stakeholders are notified of a possible long-term outage, and a thorough outage assessment is performed for the information system. Information from the outage assessment is analyzed and may be used to modify recovery procedures specific to the cause of the outage.
Recovery
Details the activities and procedures for recovery of the affected system. Activities and procedures are written at a level such that an appropriately skilled technician can recover the system without intimate system knowledge. This phase includes notification and awareness escalation procedures for communication of recovery status to system stakeholders.
Reconstitution
Defines the actions taken to test and validate system capability and functionality at the original or new permanent location. This phase consists of two major activities: validating data and operational functionality, followed by deactivation of the plan.
Backup Types
A common understanding of data backup definitions is necessary in order to ensure that data restoration is successful. Virtustream recognizes different types of backups, which have different purposes.
Full Backup
A full backup is the starting point for all other types of backup and contains all the data in the folders and files that are selected to be backed up. Because a full backup stores all files and folders, frequent full backups result in faster and simpler restore operations.
Differential Backup
A differential backup contains all files that have changed since the last FULL backup. The advantage of a differential backup is that it shortens restore time compared to an incremental backup, because a restore needs only the last full backup and the most recent differential. However, if too much time passes between full backups, the differential backup might grow to be larger than the baseline full backup.
Incremental Backup
An incremental backup stores all files that have changed since the last FULL, DIFFERENTIAL, OR INCREMENTAL backup. The advantage of an incremental backup is that it takes the least time to complete. However, during a restore operation, each incremental backup must be processed, which could result in a lengthy restore job.
Mirror Backup
A mirror backup is identical to a full backup, with the exception that the files are not compressed in zip files and they cannot be protected with a password. A mirror backup is most frequently used to create an exact copy of the source data.
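The restore trade-offs described above can be illustrated by computing which backups are needed to restore to the latest state. This is a minimal sketch based only on the definitions in this section (a differential depends on the last full backup, an incremental depends on whatever backup came immediately before it). `restore_chain` is a hypothetical name, and real backup products may track dependencies differently.

```python
def restore_chain(backups):
    """Return the labels needed to restore to the most recent backup.

    backups is a chronological list of (label, kind) pairs, where kind is
    "full", "differential", or "incremental".
    """
    chain = []
    i = len(backups) - 1
    while i >= 0:
        label, kind = backups[i]
        chain.append(label)
        if kind == "full":
            # A full backup stands alone, so the chain is complete.
            return list(reversed(chain))
        if kind == "differential":
            # A differential only needs the most recent full backup before it.
            i -= 1
            while i >= 0 and backups[i][1] != "full":
                i -= 1
        elif kind == "incremental":
            # An incremental needs the backup taken immediately before it.
            i -= 1
        else:
            raise ValueError(f"unknown backup kind: {kind}")
    raise ValueError("no full backup found to anchor the restore")
```

With a Sunday full backup followed by Monday and Tuesday incrementals, all three are needed for a restore. If Monday and Tuesday are differentials instead, only Sunday and Tuesday are needed, which is why differentials restore faster than incrementals.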
Site Types
The location from which the recovery will take place is known as the backup or alternate site. There are different types of alternate sites:
Cold Sites
Cold Sites are typically facilities with adequate space and infrastructure (electric power, telecommunications connections, and environmental controls) to support information system recovery activities.
Warm Sites
Warm Sites are partially equipped office spaces that contain some or all of the system hardware, software, telecommunications, and power sources.
Hot Sites
Hot Sites are facilities appropriately sized to support system requirements and configured with the necessary system hardware, supporting infrastructure, and support personnel.
Mirrored Sites
Mirrored Sites are fully redundant facilities with automated real-time information mirroring. Mirrored sites are identical to the primary site in all technical respects.
Invoking The Contingency Plan
What types of incidents require invoking the Contingency Plan?
Security-related incidents
Platform-related incidents
Physical security incidents
High voltage equipment failure
Natural disaster
Terrorist or Government disruption
Contingency Plan Tests and Goals
What type of test: functional or tabletop?
What are the differences between the two types of test?
What are some possible scenarios?
What are the goals of the test?
What are the expected test outcomes?
What systems are involved?
Who are the participants?
How should “Lessons Learned” be used to update and improve the existing Contingency Plan (CP)?