Wednesday, June 19, 2019

Data Protection Solutions and Prevention

How to prevent a breach?

  • Firewall (filtering, deep inspection, intrusion detection and prevention)
  • Antivirus protection
  • Data encryption
  • Email protection and filtering
  • A program of ongoing patching and updating of systems (vulnerability management plan, maintenance plan, change and release management plan, configuration control board)
  • Continuing education of users and employees
Vendors:
  • Varonis. Varonis Data Security Platform. Data protection. Analyzes users and their roles and establishes behavioral patterns. Detects unusual activity and automatically responds by disabling the account and revoking access to resources. Provides a quick report of what data a user has access to (e.g. folders, mailboxes, SharePoint sites), so that in the event of a breach that access can be removed. Supported products and services: Active Directory, Windows, SharePoint, Exchange, UNIX/Linux, Office 365, Dell EMC, NetApp, Nasuni, HPE, Box.
  • StealthBITS. Data protection.
  • Quest Software. Data protection.
  • Druva. On-premise and in-cloud backup.
  • Cohesity. On-premise and in-cloud backup.
  • Rubrik. On-premise backup to cloud.
  • N2WS (a Veeam company). Cloud Protection Manager (CPM).  Cloud backup.
Consider creating a "No Access" group and adding it as a "deny" permission or right everywhere. If an account is compromised, add that account to this group.

As with any product to be deployed in the federal government, be aware of Trade Agreements Act (TAA) compliance (FAR 52.225-5). For a list of TAA Designated Countries, see:  https://gsa.federalschedules.com/resources/taa-designated-countries/.

A data protection solution needs to cover not only the data but also the systems that provide access to the data. Documentation: Business Impact Analysis (BIA), Business Continuity Plan (BCP), Disaster Recovery Plan (DRP), Data Catalog.

What if you suffer a breach? Disaster Recovery Plan (DRP). 
Reactive:
- Logging. AWS CloudWatch, Splunk, Loggly
- Monitoring. Datadog
- Alerting
- Incident Management. PagerDuty.
Proactive:
- Site Reliability Engineering (SRE). Three O'Reilly books: Site Reliability Engineering, The Site Reliability Workbook, Seeking SRE.

Pyramid:
Base->Top
Monitoring and Observability
Incident Response
Post-incident Analysis
Testing & Release Procedures
Capacity Planning
Development
Product
sre.google.com/books
  • Gremlin. Injects failures for testing (chaos engineering).

Considerations:
  • General.
    • Full versus incremental backups. Incremental backups are fast to take, but a restore can take a while because it may have to apply multiple incrementals on top of the last full backup.
    • 3-2-1 backup strategy. 3 copies of data, 2 different media types, 1 offsite. How does "offsite" change in the cloud? Another AZ, Region, VPC? Need replication to a COOP site (or another cloud AZ, Region, VPC or cloud provider). Consider writing a read-only copy into another account.
    • How quickly can you recover (RTO)? How much data can you stand to lose (RPO)?
    • What to recover first? What is critical, and what depends on what? The BIA and BCP will help with this.
    • Policy(ies). Do you need different policies for different reasons? For example:
      • Production versus development, testing, QA? Maybe only production gets backed up. Implement this via a tag on the VM/instance (e.g. "Environment" = "Production").
      • Roles. Who needs access to backup/recovery console? Keep roles separated (separation of duties).
    • Make DR separate from backups/snapshots. Only need 1 copy of data for DR? Delete/purge old? Perform DR every "n" backups. Although your backups may be onsite or in the same Region as your production data (to avoid cross Region data transfer charges), keep your DR offsite or in another Region and account.
  • Image backups. Backing up a snapshot of an instance (e.g. virtual machine) so that it can be "snapped back" to a prior state. Define:
    • What's my recovery point objective (RPO)? This determines how often snapshots need to be made.
    • How long does it take to make a snapshot? Does this operation impact VM/instance performance?
    • How often are snapshots taken? How often does the state of the machine change? Should there be a combination of manual and automatic backups (e.g. nightly automatic, manual when there's a configuration change)? Or should a configuration change trigger a backup?
    • How many snapshots should be retained?
    • How to purge old snapshots?
  • Data backups. Backing up user data (e.g. a single file or volume). Do I need to restore an entire volume in order to recover a single file (e.g. a spreadsheet)? Does my backup solution allow single-file restores? Can users do this themselves?
  • Database backups.
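
To make a few of the considerations above concrete (full versus incremental restores, RPO, and snapshot retention), here is a minimal Python sketch. It is not tied to any vendor or cloud API. The `Backup` record, the "full"/"incremental" labels, and the function names are all hypothetical, made up for illustration. It assumes incrementals are chained off the most recent full backup.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Backup:
    taken_at: datetime
    kind: str  # "full" or "incremental" (hypothetical labels)


def restore_chain(backups, target):
    """Backups needed to restore to `target`: the latest full backup taken
    at or before it, followed by every later incremental up to `target`."""
    usable = sorted(
        (b for b in backups if b.taken_at <= target),
        key=lambda b: b.taken_at,
    )
    full_positions = [i for i, b in enumerate(usable) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup at or before the target time")
    return usable[full_positions[-1]:]


def meets_rpo(backups, rpo):
    """True if the largest gap between consecutive backups is within the RPO.
    The gap is the worst-case window of data that could be lost."""
    times = sorted(b.taken_at for b in backups)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return max(gaps, default=timedelta(0)) <= rpo


def purge_candidates(backups, keep):
    """Oldest backups beyond the newest `keep`. Assumes each one is
    restorable on its own (e.g. standalone snapshots, no incremental chain)."""
    ordered = sorted(backups, key=lambda b: b.taken_at)
    return ordered[: max(len(ordered) - keep, 0)]


if __name__ == "__main__":
    start = datetime(2019, 6, 16)
    history = [
        Backup(start, "full"),
        Backup(start + timedelta(days=1), "incremental"),
        Backup(start + timedelta(days=2), "incremental"),
        Backup(start + timedelta(days=7), "full"),
    ]
    for b in restore_chain(history, start + timedelta(days=3)):
        print(b.kind, b.taken_at.date())
    print("RPO of 24h met:", meets_rpo(history, timedelta(hours=24)))
```

The length of the list returned by `restore_chain` is a rough proxy for restore effort: the longer the run of incrementals since the last full backup, the more pieces a restore has to apply, which feeds directly into RTO.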
Other
Data Access Vendors:
  • Sonrai Security.
Scenarios (what could go wrong, how does it happen?). Need to assign a likelihood and impact to each of these so as to assign risk and prioritize preparation (protection, response, testing).
  • Misconfiguration, hack. Human error gives an attacker access or lets them exploit a vulnerability. The result is most likely administrator access, followed by denial of service, exfiltration of data, or malware/ransomware insertion.
  • Hack. No human error; the attacker possibly exploits a zero-day vulnerability, with the same result as above.
  • User accidentally deletes data
  • Insider Threat.
  • Service provider outage.
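
One simple way to turn likelihood and impact into a priority order is to multiply them and sort. The sketch below does that in Python. The 1-5 scales, the `Scenario` class, and especially the scores are placeholder assumptions for illustration, not an assessment of any real environment. Replace them with values from your own risk analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def risk(self):
        # Simple risk score: likelihood multiplied by impact.
        return self.likelihood * self.impact


def prioritize(scenarios):
    """Highest risk first; ties keep their original order (stable sort)."""
    return sorted(scenarios, key=lambda s: s.risk, reverse=True)


# Placeholder scores only, chosen to show the mechanics.
SCENARIOS = [
    Scenario("Misconfiguration exploited by attacker", 4, 5),
    Scenario("Zero-day exploit", 2, 5),
    Scenario("User accidentally deletes data", 5, 2),
    Scenario("Insider threat", 2, 4),
    Scenario("Service provider outage", 3, 3),
]

if __name__ == "__main__":
    for s in prioritize(SCENARIOS):
        print(f"{s.risk:>2}  {s.name}")
```

The ranked list then drives where to spend preparation effort first (protection, response, testing).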
Case Studies: