Obfuscation of Sensitive Data for Incremental Release of Network Flows
Large datasets of real network flows acquired from the Internet are an invaluable resource for the research community. Applications include network modelling and simulation, identification of security attacks, and validation of research results. Unfortunately, network flows carry extremely sensitive information, and this discourages the publication of those datasets. Indeed, existing techniques for network flow sanitization are vulnerable to different kinds of attacks, and solutions proposed for micro data anonymity cannot be directly applied to network traces.
In our previous research, we proposed an obfuscation technique for network flows, providing formal confidentiality guarantees under realistic assumptions about the adversary’s knowledge. In this paper, we identify the threats posed by the incremental release of network flows, we propose a novel defense algorithm, and we formally prove the achieved confidentiality guarantees. An extensive experimental evaluation of the algorithm for incremental obfuscation, carried out with billions of real Internet flows, shows that our obfuscation technique preserves the utility of flows for network traffic analysis.