Prospects for network flow analysis in enterprise network security

In a world where cybercrime keeps growing and the question is no longer whether you will be hacked but when, organizations try to protect their digital assets by every available means. In a cyber attack, valuable assets can be damaged, stolen, or rendered inaccessible, resulting in monetary losses and loss of intellectual property.

Visibility and logging

One of the most important aspects of detecting cyber attacks and breaches is the ability to determine who did what, when, and where. Similarly, when investigating a reported attack or breach, it is essential to have a history that lets you go back in time and analyze in detail what happened. This history can be used to develop new strategies and measures to prevent similar incidents in the future, and to verify that a new measure would indeed have prevented the original incident. In addition, this history can be used to document and verify compliance with any service level agreements under existing contracts.

Data analysis and validation

The main tool for analyzing and verifying data is packet capture. With a full packet capture, you can study and analyze an incident down to the smallest level of detail. Each packet can be inspected using DPI (Deep Packet Inspection), and any flow can be reconstructed, down to the exact timing of each packet. With an accurate packet trace at hand, you can apply new and more advanced types of analysis and search for trends, patterns, or anomalies. Any packet can be examined manually, drilling into the smallest details until the root cause of the incident is found. Finally, when a new security measure is developed, you can test it in a lab environment against the exact packet trace of the old attack to make sure the protection actually works as expected before deploying it on the production network. This is where the benefits of combining DPI and network flow analysis become apparent.
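
To make this concrete, here is a minimal sketch of offline flow reconstruction from a capture file, grouping packets by their 5-tuple so that each conversation can be replayed in arrival order. It assumes the scapy library is available; the file name incident.pcap is a placeholder for a real capture.

    # A minimal sketch of offline flow reconstruction from a packet capture.
    # Assumes scapy is installed; "incident.pcap" is a placeholder file name.
    from collections import defaultdict
    from scapy.all import rdpcap, IP, TCP, UDP

    packets = rdpcap("incident.pcap")
    flows = defaultdict(list)

    for pkt in packets:
        if IP not in pkt:
            continue
        layer = TCP if TCP in pkt else UDP if UDP in pkt else None
        if layer is None:
            continue
        # 5-tuple key: source/destination address and port, plus protocol.
        key = (pkt[IP].src, pkt[layer].sport,
               pkt[IP].dst, pkt[layer].dport, layer.__name__)
        flows[key].append((float(pkt.time), len(pkt)))

    # Each flow can now be inspected or replayed in order of arrival.
    for key, entries in sorted(flows.items()):
        entries.sort()
        print(key, "packets:", len(entries), "bytes:", sum(n for _, n in entries))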

Storage requirements for all network packets

To understand how much storage capacity is needed to retain network packets, and which class of packet capture solution to use, we first need to estimate how much packet data an average organization or enterprise generates. Let's assume a small/medium enterprise generates an average network load of 750 Mbit/s over a 24-hour window, and a large enterprise generates 5 Gbit/s under the same conditions. For these two profiles, we can calculate the minimum storage required for 1 day, 1 week, 1 month, and 1 year of packet retention.

Small/medium enterprise (750 Mbit/s):

  • Day - 8 TB
  • Week - 57 TB
  • Month - 243 TB
  • Year - 2957 TB

Large enterprise (5 Gbit/s):

  • Day - 54 TB
  • Week - 378 TB
  • Month - 1620 TB
  • Year - 19710 TB

Most packet capture solutions available today can scale up to 1000 TB of storage, which is approximately 4 months of packet-data history for a small/medium enterprise and less than a month for a large one. To extend the packet-data history further, the data can be compressed before being written to the packet capture store. The effect of compression depends on the chosen algorithm and on the packet contents, since some data compresses better than others. Typical network packet data achieves a compression ratio of about 3:1, so compression can roughly triple the packet capture history.
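
The arithmetic behind these figures is simple enough to script. The sketch below reproduces the table above (including the weekly figures) directly from the stated line rates, with the 3:1 compression ratio as an optional parameter.

    # Back-of-the-envelope storage sizing for full packet capture.
    # The line rates and the 3:1 compression ratio are the figures from the text.

    def capture_storage_tb(rate_gbit_s, days, compression=1.0):
        """Storage in TB needed to retain `days` of traffic at `rate_gbit_s`."""
        bytes_per_day = rate_gbit_s * 1e9 / 8 * 86400  # bits/s -> bytes/day
        return bytes_per_day * days / 1e12 / compression

    profiles = [("Small/medium enterprise", 0.75), ("Large enterprise", 5.0)]
    periods = [("Day", 1), ("Week", 7), ("Month", 30), ("Year", 365)]

    for label, rate in profiles:
        print(f"{label} ({rate} Gbit/s):")
        for name, days in periods:
            raw = capture_storage_tb(rate, days)
            packed = capture_storage_tb(rate, days, compression=3.0)
            print(f"  {name}: {raw:,.0f} TB raw, {packed:,.0f} TB compressed")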

The next approach: storing network flow records

As network speeds continue to increase along with the number of packets, DPI is no longer a viable option for live, on-the-fly analysis of the entire network. As a result, many network monitoring and analysis tools focus instead on flow records, which they collect from NetFlow/IPFIX data on the network. Based on these flow records, you can detect strange behavior and anomalies that require further analysis. Whenever an incident is detected, you can still analyze the underlying network packets along with the entire flow.
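
A flow record is essentially a compact summary of one conversation. The dataclass below sketches the core fields a NetFlow/IPFIX exporter typically carries; real IPFIX templates define many more information elements, so treat this as an illustrative subset.

    # A simplified flow record with the core fields of a NetFlow/IPFIX export.
    # Real IPFIX templates carry many more information elements.
    from dataclasses import dataclass

    @dataclass
    class FlowRecord:
        src_ip: str
        dst_ip: str
        src_port: int
        dst_port: int
        protocol: int        # IANA protocol number, e.g. 6 = TCP, 17 = UDP
        first_seen: float = 0.0
        last_seen: float = 0.0
        packets: int = 0
        bytes: int = 0

        def update(self, timestamp: float, size: int) -> None:
            """Fold one observed packet into the flow's counters."""
            if self.packets == 0:
                self.first_seen = timestamp
            self.last_seen = timestamp
            self.packets += 1
            self.bytes += size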

Many modern network security products are built this way: they include both packet capture and packet inspection, including DPI, which allows them to record network flows along with metadata. The flow records are generated and forwarded to a separate node that performs more advanced analytics, typically using artificial intelligence and machine learning to detect anomalies and other security issues. Whenever a problem is detected, the analytics engine can go back to the time in question and retrieve the underlying packets from the packet capture device for further verification and analysis.
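
The article does not prescribe a particular model, so the sketch below stands in for the analytics node using scikit-learn's IsolationForest, scoring each flow on a few simple features. It reuses the FlowRecord class from the previous sketch; flows flagged as outliers are the ones whose underlying packets would be pulled back from the capture store for inspection.

    # One possible anomaly-detection pass over flow records, sketched with
    # scikit-learn's IsolationForest. FlowRecord is the dataclass shown above.
    import numpy as np
    from sklearn.ensemble import IsolationForest

    def features(flow):
        """Feature vector per flow: duration, packets, bytes, mean packet size."""
        duration = max(flow.last_seen - flow.first_seen, 1e-6)
        return [duration, flow.packets, flow.bytes,
                flow.bytes / max(flow.packets, 1)]

    def find_anomalies(flows):
        X = np.array([features(f) for f in flows])
        model = IsolationForest(contamination=0.01, random_state=0).fit(X)
        # predict() returns -1 for outliers, 1 for inliers.
        return [f for f, label in zip(flows, model.predict(X)) if label == -1]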

The figure below shows an example configuration in which the same scheme can be deployed with or without packet capture, depending on the need for detailed incident analysis and documentation.

Any of the blocks in this scheme can be accelerated with a SmartNIC (smart network interface card). For packet capture, a SmartNIC can guarantee zero packet loss and stamp each packet with precise timestamps and metadata for easier indexing and later analysis. For packet validation, a SmartNIC can extract the metadata forwarded from the packet capture block and help by performing some level of packet decoding; it can also accelerate operations such as decryption and/or regular-expression matching. For the flow analytics engine, acceleration is needed mostly at the machine learning stage, but even here a SmartNIC can speed up some operations through look-aside mechanisms.

Today, network flow processing in security gateways is performed on the CPU, but in the future these operations may be offloaded to the SmartNIC. This requires the SmartNIC to support hardware-accelerated inference of artificial intelligence models, so that a trained model can be evaluated quickly to detect traffic anomalies. Napatech has demonstrated this approach: most of the network flow processing runs on the smart NIC, while the host CPU is used only for recording flows and training the models. This makes it possible to process network flows at speeds of up to 200 Gbit/s on a single 24-core server.

It seems clear that future SmartNICs will add artificial intelligence engines to accelerate flow analysis and machine-learning-based anomaly detection without consuming CPU resources.

Ron Amadeo
02/05/2019

