Over the past 12-18 months we have seen a lot of activity in the area of breach response. We not only launched our Big Data Security Analytics platform using ELK and began offering active threat hunting as a service, but also significantly strengthened our breach response capabilities. I spent most of my consulting hours responding to incidents: mostly security-related ones, but at least one was a (BA-level) system crash that was not a cybersecurity incident. Beyond the foundation blocks of breach response skills, tools, and process maturity, here are the top 10 lessons learnt from these experiences:
- The incident is not the incident. Whatever you are told at the outset about the incident may be completely wrong. A perceived cybersecurity incident may turn out to be an IT operational lapse, or vice versa. “Our email server was hacked” may turn out to be a simple business email compromise. On the other hand, an incident that is being downplayed may turn out to be much larger than initially thought. This is often because the person contacting you is a few levels removed from the situation on the ground. So be prepared to challenge the incident itself and to learn that it is much bigger, much smaller, or of a completely different nature than what was originally discussed.
- Build Trust. In any breach, there are many stakeholders and many aspects at play. People are trying to protect their own turf as jobs are at stake. They see any external investigating agency as a potential adversary. So your first goal is to build trust with the key people within the organization whose help you are going to need to carry out the investigation. Also, assume that most people will, to some extent, obfuscate facts, withhold details, or stall on sharing relevant information. Besides the technical challenges of any investigation, it is these political and psychological battles that can get exhausting. So be patient as you work your way through the people and the technology.
- Build Timelines. As soon as the investigation begins, your first task is to build timelines. A simple Excel sheet will do. Building the timeline of the incident is a collaborative effort. Get everyone involved in providing written inputs to the timeline you are building. Get the stakeholders to agree to the sequence of events. You will need to build the timeline using not just system logs, but also interviews with stakeholders.
- Establish communication protocols. There is immense pressure during an investigation as most people are in panic mode. Depending on the size and scale of the incident, it may attract attention not just from the senior-most management of the company but also from regulators and the media. Remember that you are bound by strict confidentiality agreements. So do not speak to anyone outside the direct communication points that have been discussed and agreed upon. Even off-the-record conversations with regulators or interested parties within the organization can take on a life of their own. Also, clients might expect to be notified every hour about what is happening, whereas you might not get a single vital clue in the first 3-4 days. So explain to them how the process works, what your approach is going to be, and what an effective communication protocol looks like. This ensures you are able to focus on the task at hand rather than on providing hourly updates in PPT format.
- Establish information sharing protocols. Will you be allowed to connect your laptop to their network? Will you be allowed to take logs back to your forensics lab? If you have to carry out disk imaging, how will that be done, and will the image be permitted off their premises so you can load it into EnCase or FTK back in your lab? If not, have they arranged systems with sufficient computing capacity for you to do most of the work onsite?
- Ask stupid questions. In one investigation we were involved in, the IT team had discovered the malware themselves, taken a backup of it, and then deleted it. We determined this when we took the server’s image for offline analysis and found suspicious files during the un-delete operation. When we asked the IT team why they had never told us about this malware, the response was: you never asked! So it is best to ask the stupidest of questions rather than work with assumptions that send your investigation completely off-track.
- Build and tear down hypotheses. Document your hypotheses from Day One, and be prepared for all your initial hypotheses to break down and be proven wrong. Do not bring your ego into the picture. In any incident, what you think in the first 72 hours is likely to be proven wrong in the next 72. So maintain a running account of your hypotheses, the assumptions they are built on, where your investigation stands vis-a-vis each one of them, and, if you have abandoned any, your reasons for doing so. Again, getting all stakeholders to review your hypotheses and challenge them helps the investigation move along.
- Build flexibility in your toolkit. You might love doing log analysis on your high-end laptop using Splunk or ELK. But what happens when you land onsite and the client says, "We can’t give you the logs on your laptop, but you can analyze them on ours"? And then it turns out that the system they give you does not have administrator rights and has very limited Internet access. Not communicating your requirements properly can result in you having to use grep/sed/awk instead of your dream toolkit. There are investigations in which I have had to analyze gigabytes of logs using Notepad++, the Windows findstr command, and Excel!
- Keep the larger picture in mind. Always ask for the IT asset inventory, the IT organizational chart, all possible network diagrams, all vendors involved, and a list of all connectivity into the compromised system or network. The attack vector could be any system or network connection. Keep going back to this larger picture as you work through your hypotheses.
- Set client expectations right. There is always the possibility that your investigation might reach a dead-end. Often this happens due to the absence of logs. Be prepared for such an eventuality and explain to the client that this is a distinct possibility. In cases like ransomware or business email compromise, it is best to explain to the client upfront that getting back their data or money may well not be possible. Also, in most cybersecurity incidents, attribution is an imaginary goal: you may never get there. Explaining these aspects upfront reduces a lot of unnecessary heartburn later on.
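
For readers who prefer a script to a spreadsheet, the timeline lesson above can be sketched in a few lines of Python. This is only an illustration, not a tool we used: the `TimelineEvent` fields, the helper names, and every sample event (dates, sources, descriptions) are made up for the example. It merges entries from system logs and stakeholder interviews into one chronological list and renders it as CSV that opens in Excel.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TimelineEvent:
    """One row of the incident timeline (hypothetical structure)."""
    when: datetime        # stored in UTC so that sources can be compared
    source: str           # e.g. "web server log" or "interview: IT admin"
    description: str
    agreed: bool = False  # set to True once stakeholders confirm the entry


def utc(text):
    """Parse an ISO 8601 timestamp and mark it as UTC."""
    return datetime.fromisoformat(text).replace(tzinfo=timezone.utc)


def build_timeline(*event_lists):
    """Merge events from any number of sources into chronological order."""
    merged = [event for events in event_lists for event in events]
    return sorted(merged, key=lambda event: event.when)


def to_csv(timeline):
    """Render the timeline as CSV text for review in a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["when_utc", "source", "description", "agreed"])
    for event in timeline:
        writer.writerow([
            event.when.isoformat(),
            event.source,
            event.description,
            "yes" if event.agreed else "no",
        ])
    return buffer.getvalue()


# Invented sample data, for illustration only.
log_events = [
    TimelineEvent(utc("2024-01-10T02:14:00"), "web server log", "Unusual admin login"),
    TimelineEvent(utc("2024-01-10T03:40:00"), "antivirus log", "Suspicious file flagged"),
]
interview_events = [
    TimelineEvent(utc("2024-01-10T03:05:00"), "interview: IT admin", "Noticed the server was slow"),
]

timeline = build_timeline(log_events, interview_events)
print(to_csv(timeline))
```

Keeping the source of each entry and an "agreed" flag next to the timestamp makes it easier to see which parts of the sequence rest on logs, which rest on recollection, and which the stakeholders have already signed off on.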
The way cybersecurity incidents are happening, incident response has now become the norm. No matter which aspect of cybersecurity you specialize in, understanding the kill chain, keeping abreast of various types of fraud, and being able to advise clients properly when there is an incident are non-negotiable skills. Organizations should focus on implementing formal incident management processes, developing incident runbooks, conducting cybersecurity drills and training both the security and IT teams on incident response.