USENIX Security '22 has three submission deadlines. Prepublication versions of the accepted papers from the fall submission deadline are available below. The full program will be available soon.
An Audit of Facebook's Political Ad Policy Enforcement
Victor Le Pochat, imec-DistriNet, KU Leuven; Laura Edelson, New York University; Tom Van Goethem and Wouter Joosen, imec-DistriNet, KU Leuven; Damon McCoy and Tobias Lauinger, New York University
Distinguished Paper Award Winner
Major technology companies strive to protect the integrity of political advertising on their platforms by implementing and enforcing self-regulatory policies that impose transparency requirements on political ads. In this paper, we quantify whether Facebook’s current enforcement correctly identifies political ads and ensures compliance by advertisers. In a comprehensive, large-scale analysis of 4.2 million political and 29.6 million non-political ads from 215,030 advertisers, we identify ads correctly detected as political (true positives), ads incorrectly detected (false positives), and ads missed by detection (false negatives). Facebook’s current enforcement appears imprecise: 61% more ads are missed than are detected worldwide, and 55% of U.S. detected ads are in fact non-political. Detection performance is uneven across countries, with some having up to 53 times higher false negative rates among clearly political pages than in the U.S. Moreover, enforcement appears inadequate for preventing systematic violations of political advertising policies: for example, advertisers were able to continue running political ads without disclosing them while they were temporarily prohibited in the U.S. We attribute these flaws to five gaps in Facebook’s current enforcement and transparency implementation, and close with recommendations to improve the security of the online political ad ecosystem.
Helping hands: Measuring the impact of a large threat intelligence sharing community
Xander Bouwman, Delft University of Technology; Victor Le Pochat, imec-DistriNet, KU Leuven; Pawel Foremski, Farsight Security, Inc. / IITiS PAN; Tom Van Goethem, imec-DistriNet, KU Leuven; Carlos H. Gañán, Delft University of Technology and ICANN; Giovane C. M. Moura, SIDN Labs; Samaneh Tajalizadehkhoob, ICANN; Wouter Joosen, imec-DistriNet, KU Leuven; Michel van Eeten, Delft University of Technology
We tracked the largest volunteer security information sharing community known to date: the COVID-19 Cyber Threat Coalition, with over 4,000 members. This enabled us to address long-standing questions on threat information sharing. First, does collaboration at scale lead to better coverage? And second, does making threat data freely available improve the ability of defenders to act? We found that the CTC mostly aggregated existing industry sources of threat information. User-submitted domains often did not make it to the CTC's blocklist as a result of the high threshold posed by its automated quality assurance using VirusTotal. Although this ensured a low false positive rate, it also caused the focus of the blocklist to drift away from domains related to COVID-19 (1.4%-3.6%) to more generic abuse, such as phishing, for which established mitigation mechanisms already exist. However, in the slice of data that was related to COVID-19, we found promising evidence of the added value of a community like the CTC: just 25.1% of these domains were known to existing abuse detection infrastructures at time of listing, as compared to 58.4% of domains on the overall blocklist. From the unique experiment that the CTC represented, we draw three lessons for future threat data sharing initiatives.
Back-Propagating System Dependency Impact for Attack Investigation
Pengcheng Fang, Case Western Reserve University; Peng Gao, Virginia Tech; Changlin Liu and Erman Ayday, Case Western Reserve University; Kangkook Jee, University of Texas at Dallas; Ting Wang, Penn State University; Yanfang (Fanny) Ye, Case Western Reserve University; Zhuotao Liu, Tsinghua University; Xusheng Xiao, Case Western Reserve University
Causality analysis on system auditing data has emerged as an important solution for attack investigation. Given a POI (Point-Of-Interest) event (e.g., an alert fired on a suspicious file creation), causality analysis constructs a dependency graph, in which nodes represent system entities (e.g., processes and files) and edges represent dependencies among entities, to reveal the attack sequence. However, causality analysis often produces a huge graph (> 100,000 edges) that is hard for security analysts to inspect. From the dependency graphs of various attacks, we observe that (1) dependencies that are highly related to the POI event often exhibit a different set of properties (e.g., data flow and time) from the less-relevant dependencies; (2) the POI event is often related to a few attack entries (e.g., downloading a file). Based on these insights, we propose DEPIMPACT, a framework that identifies the critical component of a dependency graph (i.e., a subgraph) by (1) assigning discriminative dependency weights to edges to distinguish critical edges that represent the attack sequence from less-important dependencies, (2) propagating dependency impacts backward from the POI event to entry points, and (3) performing forward causality analysis from the top-ranked entry nodes based on their dependency impacts to filter out edges that are not found in the forward causality analysis. Our evaluations on the 150 million real system auditing events of real attacks and the DARPA TC dataset show that DEPIMPACT can significantly reduce the large dependency graphs (∼ 1,000,000 edges) to a small graph (∼ 234 edges), which is 4611× smaller. The comparison with the other state-of-the-art causality analysis techniques shows that DEPIMPACT is 106× more effective in reducing the dependency graphs while preserving the attack sequences.
SecSMT: Securing SMT Processors against Contention-Based Covert Channels
Mohammadkazem Taram, University of California San Diego; Xida Ren and Ashish Venkat, University of Virginia; Dean Tullsen, University of California San Diego
This paper presents the first comprehensive analysis of contention-based security vulnerabilities in a high-performance simultaneous mulithreaded (SMT) processor. It features a characterization of contention throughout the shared pipeline, and potential resulting leakage channels for each resource. Further, it presents a set of unified mitigation/isolation strategies that dramatically cut that leakage while preserving most of the performance of a full, insecure SMT implementation. These results lay the groundwork for considering SMT execution, with its performance benefits, a reasonable choice even for security-sensitive applications.
Increasing Adversarial Uncertainty to Scale Private Similarity Testing
Yiqing Hua and Armin Namavari, Cornell Tech and Cornell University; Kaishuo Cheng, Cornell University; Mor Naaman and Thomas Ristenpart, Cornell Tech and Cornell University
Social media and other platforms rely on automated detection of abusive content to help combat disinformation, harassment, and abuse. One common approach is to check user content for similarity against a server-side database of problematic items. However, this method fundamentally endangers user privacy. Instead, we target client-side detection, notifying only the users when such matches occur to warn them against abusive content.
Our solution is based on privacy-preserving similarity testing. Existing approaches rely on expensive cryptographic protocols that do not scale well to large databases and may sacrifice the correctness of the matching. To contend with this challenge, we propose and formalize the concept of similarity-based bucketization~(SBB). With SBB, a client reveals a small amount of information to a database-holding server so that it can generate a bucket of potentially similar items. The bucket is small enough for efficient application of privacy-preserving protocols for similarity. To analyze the privacy risk of the revealed information, we introduce a framework for measuring an adversary's confidence in inferring a predicate about the client input correctly. We develop a practical SBB protocol for image content, and evaluate its client privacy guarantee with real-world social media data. We then combine SBB with various similarity protocols, showing that the combination with SBB provides a speedup of at least 29x on large-scale databases compared to that without, while retaining correctness of over 95%.
"How Do You Not Lose Friends?": Synthesizing a Design Space of Social Controls for Securing Shared Digital Resources Via Participatory Design Jams
Eyitemi Moju-Igbene, Hanan Abdi, Alan Lu, and Sauvik Das, Georgia Institute of Technology
Digital resources (streaming services, banking accounts, collaborative documents, etc.) are commonly shared among small, social groups. Yet, the security and privacy (S&P) controls for these resources map poorly onto the reality of shared access and ownership (e.g., one shared Netflix password for roommates). One challenge is that the design space for social S&P controls remains unclear. We bridged this gap by engaging end-users in participatory design workshops to envision social solutions to S&P challenges common to their groups. In analyzing the generated ideas and group discussions, we identified four design considerations salient to social S&P controls: social transparency; structures of governance; stakes and responsibility; and, promoting pro-group S&P behaviors. Additionally, we discovered trade-offs and challenges that arise when designing social S&P controls: balancing group security versus individual privacy; combating social friction; mitigating social herding behaviors; and, minimizing coordination costs.
Your Microphone Array Retains Your Identity: A Robust Voice Liveness Detection System for Smart Speakers
Yan Meng and Jiachun Li, Shanghai Jiao Tong University; Matthew Pillari, Arjun Deopujari, Liam Brennan, and Hafsah Shamsie, University of Virginia; Haojin Zhu, Shanghai Jiao Tong University; Yuan Tian, University of Virginia
Though playing an essential role in smart home systems, smart speakers are vulnerable to voice spoofing attacks. Passive liveness detection, which utilizes only the collected audio rather than the deployed sensors to distinguish between live-human and replayed voices, has drawn increasing attention. However, it faces the challenge of performance degradation under the different environmental factors as well as the strict requirement of the fixed user gestures.
In this study, we propose a novel liveness feature, array fingerprint, which utilizes the microphone array inherently adopted by the smart speaker to determine the identity of collected audios. Our theoretical analysis demonstrates that by leveraging the circular layout of microphones, compared with existing schemes, array fingerprint achieves a more robust performance under the environmental change and user's movement. Then, to leverage such a fingerprint, we propose ARRAYID, a lightweight passive detection scheme, and elaborate a series of features working together with array fingerprint. Our evaluation on the dataset containing 32,780 audio samples and 14 spoofing devices shows that ARRAYID achieves an accuracy of 99.84%, which is superior to existing passive liveness detection schemes.
Aardvark: An Asynchronous Authenticated Dictionary with Applications to Account-based Cryptocurrencies
Derek Leung, MIT CSAIL; Yossi Gilad, Hebrew University of Jerusalem; Sergey Gorbunov, University of Waterloo; Leonid Reyzin, Boston University; Nickolai Zeldovich, MIT CSAIL
We design Aardvark, a novel authenticated dictionary with short proofs of correctness for lookups and modifications. Our design reduces storage requirements for transaction validation in cryptocurrencies by outsourcing data from validators to untrusted servers, which supply proofs of correctness of this data as needed. In this setting, short proofs are particularly important because proofs are distributed to many validators, and the transmission of long proofs can easily dominate costs.
A proof for a piece of data in an authenticated dictionary may change whenever any (even unrelated) data changes. This presents a problem for concurrent issuance of cryptocurrency transactions, as proofs become stale. To solve this problem, Aardvark employs a versioning mechanism to safely accept stale proofs for a limited time.
On a dictionary with 100 million keys, operation proof sizes are about 1KB in a Merkle Tree versus 100–200B in Aardvark. Our evaluation shows that a 32-core validator processes 1492–2941 operations per second, saving about 800× in storage costs relative to maintaining the entire state.
OVRseen: Auditing Network Traffic and Privacy Policies in Oculus VR
Rahmadi Trimananda, Hieu Le, Hao Cui, and Janice Tran Ho, University of California, Irvine; Anastasia Shuba, Independent Researcher; Athina Markopoulou, University of California, Irvine
Virtual reality (VR) is an emerging technology that enables new applications but also introduces privacy risks. In this paper, we focus on Oculus VR (OVR), the leading platform in the VR space and we provide the first comprehensive analysis of personal data exposed by OVR apps and the platform itself, from a combined networking and privacy policy perspective. We experimented with the Quest 2 headset and tested the most popular VR apps available on the official Oculus and the SideQuest app stores. We developed OVRseen, a methodology and system for collecting, analyzing, and comparing network traffic and privacy policies on OVR. On the networking side, we captured and decrypted network traffic of VR apps, which was previously not possible on OVR, and we extracted data flows, defined as〈app, data type, destination〉. Compared to the mobile and other app ecosystems, we found OVR to be more centralized and driven by tracking and analytics, rather than by third-party advertising. We show that the data types exposed by VR apps include personally identifiable information (PII), device information that can be used for fingerprinting, and VR-specific data types. By comparing the data flows found in the network traffic with statements made in the apps' privacy policies, we found that approximately 70% of OVR data flows were not properly disclosed. Furthermore, we extracted additional context from the privacy policies, and we observed that 69% of the data flows were used for purposes unrelated to the core functionality of apps.
Lumos: Identifying and Localizing Diverse Hidden IoT Devices in an Unfamiliar Environment
Rahul Anand Sharma, Elahe Soltanaghaei, Anthony Rowe, and Vyas Sekar, Carnegie Mellon University
Hidden IoT devices are increasingly being used to snoop on users in hotel rooms or AirBnBs. We envision empowering users entering such unfamiliar environments to identify and locate (e.g., hidden camera behind plants) diverse hidden devices (e.g., cameras, microphones, speakers) using only their personal handhelds.
What makes this challenging is the limited network visibility and physical access that a user has in such unfamiliar environments, coupled with the lack of specialized equipment.
This paper presents Lumos, a system that runs on commodity user devices (e.g., phone, laptop) and enables users to identify and locate WiFi-connected hidden IoT devices and visualize their presence using an augmented reality interface. Lumos addresses key challenges in: (1) identifying diverse devices using only coarse-grained wireless layer features, without IP/DNS layer information and without knowledge of the WiFi channel assignments of the hidden devices; and (2) locating the identified IoT devices with respect to the user using only phone sensors and wireless signal strength measurements. We evaluated Lumos across 44 different IoT devices spanning various types, models, and brands across six different environments. Our results show that Lumos can identify hidden devices with 95% accuracy and locate them with a median error of 1.5m within 30 minutes in a two-bedroom, 1000 sq. ft. apartment.
AMD Prefetch Attacks through Power and Time
Moritz Lipp and Daniel Gruss, Graz University of Technology; Michael Schwarz, CISPA Helmholtz Center for Information Security
Modern operating systems fundamentally rely on the strict isolation of user applications from the kernel. This isolation is enforced by the hardware. On Intel CPUs, this isolation has been shown to be imperfect, for instance, with the prefetch side-channel. With Meltdown, it was even completely circumvented. Both the prefetch side channel and Meltdown have been mitigated with the same software patch on Intel. As AMD is believed to be not vulnerable to these attacks, this software patch is not active by default on AMD CPUs.
In this paper, we show that the isolation on AMD CPUs suffers from the same type of side-channel leakage. We discover timing and power variations of the prefetch instruction that can be observed from unprivileged user space. In contrast to previous work on prefetch attacks on Intel, we show that the prefetch instruction on AMD leaks even more information. We demonstrate the significance of this side channel with multiple case studies in real-world scenarios. We demonstrate the first microarchitectural break of (fine-grained) KASLR on AMD CPUs. We monitor kernel activity, e.g., if audio is played over Bluetooth, and establish a covert channel. Finally, we even leak kernel memory with 52.85 B/s with simple Spectre gadgets in the Linux kernel. We show that stronger page table isolation should be activated on AMD CPUs by default to mitigate our presented attacks successfully.
ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models
Yugeng Liu, Rui Wen, Xinlei He, Ahmed Salem, Zhikun Zhang, and Michael Backes, CISPA Helmholtz Center for Information Security; Emiliano De Cristofaro, UCL and Alan Turing Institute; Mario Fritz and Yang Zhang, CISPA Helmholtz Center for Information Security
Inference attacks against Machine Learning (ML) models allow adversaries to learn sensitive information about training data, model parameters, etc. While researchers have studied, in depth, several kinds of attacks, they have done so in isolation. As a result, we lack a comprehensive picture of the risks caused by the attacks, e.g., the different scenarios they can be applied to, the common factors that influence their performance, the relationship among them, or the effectiveness of possible defenses. In this paper, we fill this gap by presenting a first-of-its-kind holistic risk assessment of different inference attacks against machine learning models. We concentrate on four attacks -- namely, membership inference, model inversion, attribute inference, and model stealing -- and establish a threat model taxonomy.
Our extensive experimental evaluation, run on five model architectures and four image datasets, shows that the complexity of the training dataset plays an important role with respect to the attack's performance, while the effectiveness of model stealing and membership inference attacks are negatively correlated. We also show that defenses like DP-SGD and Knowledge Distillation can only mitigate some of the inference attacks. Our analysis relies on a modular re-usable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models, and equally serves as a benchmark tool for researchers and practitioners.
Jenny: Securing Syscalls for PKU-based Memory Isolation Systems
David Schrammel, Samuel Weiser, Richard Sadek, and Stefan Mangard, Graz University of Technology
Effective syscall filtering is a key component for withstanding the numerous exploitation techniques and privilege escalation attacks we face today. For example, modern browsers use sandboxing techniques with syscall filtering in order to isolate critical code. Cloud computing heavily uses containers, which virtualize the syscall interface. Recently, cloud providers are switching to in-process containers for performance reasons, calling for better isolation primitives. A new isolation primitive that has the potential to fill this gap is called Protection Keys for Userspace (PKU). Unfortunately, prior research highlights severe deficiencies in how PKU-based systems manage syscalls, questioning their security and practicability.
In this work, we comprehensively investigate syscall filtering for PKU-based memory isolation systems. First, we identify new syscall-based attacks that can break a PKU sandbox. Second, we derive syscall filter rules necessary for protecting PKU domains and show efficient ways of enforcing them. Third, we do a comparative study on different syscall interposition techniques with respect to their suitability for PKU, which allows us to design a secure syscall interposition technique that is both fast and flexible.
We design and prototype Jenny– a PKU-based memory isolation system that provides powerful syscall filtering capabilities in userspace. Jenny supports various interposition techniques (e.g., seccomp and ptrace), and allows for domain-specific syscall filtering in a nested way. Furthermore, it handles asynchronous signals securely. Our evaluation shows a minor performance impact of 0–5% for nginx.
DoubleStar: Long-Range Attack Towards Depth Estimation based Obstacle Avoidance in Autonomous Systems
Ce Zhou, Qiben Yan, and Yan Shi, Michigan State University; Lichao Sun, Lehigh University
Depth estimation-based obstacle avoidance has been widely adopted by autonomous systems (drones and vehicles) for safety purpose. It normally relies on a stereo camera to automatically detect obstacles and make flying/driving decisions, e.g., stopping several meters ahead of the obstacle in the path or moving away from the detected obstacle. In this paper, we explore new security risks associated with the stereo vision-based depth estimation algorithms used for obstacle avoidance. By exploiting the weaknesses of the stereo matching in depth estimation algorithms and the lens flare effect in optical imaging, we propose DoubleStar, a long-range attack that injects fake obstacle depth by projecting pure light from two complementary light sources.
DoubleStar includes two distinctive attack formats: beams attack and orbs attack, which leverage projected light beams and lens flare orbs respectively to cause false depth perception. We successfully attack two commercial stereo cameras designed for autonomous systems (ZED and Intel RealSense). The visualization of fake depth perceived by the stereo cameras illustrates the false stereo matching induced by DoubleStar. We further use Ardupilot to simulate the attack and demonstrate its impact on drones. To validate the attack on real systems, we perform a real-world attack towards a commercial drone equipped with state-of-the-art obstacle avoidance algorithms. Our attack can continuously bring a flying drone to a sudden stop or drift it away across a long distance under various lighting conditions, even bypassing sensor fusion mechanisms. Specifically, our experimental results show that DoubleStar creates fake depth up to 15 meters in distance at night and up to 8 meters during the daytime. To mitigate this newly discovered threat, we provide discussions on potential countermeasures to defend against DoubleStar.
PrivGuard: Privacy Regulation Compliance Made Easier
Lun Wang, UC Berkeley; Usmann Khan, Georgia Tech; Joseph Near, University of Vermont; Qi Pang, Zhejiang University; Jithendaraa Subramanian, NIT Tiruchirappalli; Neel Somani, UC Berkeley; Peng Gao, Virginia Tech; Andrew Low and Dawn Song, UC Berkeley
Continuous compliance with privacy regulations, such as GDPR and CCPA, has become a costly burden for companies from small-sized start-ups to business giants. The culprit is the heavy reliance on human auditing in today's compliance process, which is expensive, slow, and error-prone. To address the issue, we propose PrivGuard, a novel system design that reduces human participation required and improves the productivity of the compliance process. PrivGuard is mainly comprised of two components: (1) PrivAnalyzer, a static analyzer based on abstract interpretation for partly enforcing privacy regulations, and (2) a set of components providing strong security protection on the data throughout its life cycle. To validate the effectiveness of this approach, we prototype PrivGuard and integrate it into an industrial-level data governance platform. Our case studies and evaluation show that PrivGuard can correctly enforce the encoded privacy policies on real-world programs with reasonable performance overhead.
DeepDi: Learning a Relational Graph Convolutional Network Model on Instructions for Fast and Accurate Disassembly
Sheng Yu, University of California Riverside and Deepbits Technology Inc.; Yu Qu, University of California Riverside; Xunchao Hu, Deepbits Technology Inc.; Heng Yin, University of California Riverside and Deepbits Technology Inc.
Disassembly is the cornerstone of many binary analysis tasks. Traditional disassembly approaches (e.g., linear and recursive) are not accurate enough, while more sophisticated approaches (e.g., Probabilistic Disassembly, Datalog Disassembly, and XDA) have high overhead, which hinders them from being widely used in time-critical security practices. In this paper, we propose DEEPDI, a novel approach that achieves both accuracy and efficiency. The key idea of DEEPDI is to use a graph neural network model to capture and propagate instruction relations. Specifically, DEEPDI firstly uses superset disassembly to get a superset of instructions. Then we construct a graph model called Instruction Flow Graph to capture different instruction relations. Then a Relational Graph Convolutional Network is used to propagate instruction embeddings for accurate instruction classification. DEEPDI also provides heuristics to recover function entrypoints. We evaluate DEEPDI on several large-scale datasets containing real-world and obfuscated binaries. We show that DEEPDI is comparable or superior to the state-of-the-art disassemblers in terms of accuracy, and is robust against unseen binaries, compilers, platforms, obfuscated binaries, and adversarial attacks. Its CPU version is two times faster than IDA Pro, and its GPU version is 350 times faster.
Understanding and Improving Usability of Data Dashboards for Simplified Privacy Control of Voice Assistant Data
Vandit Sharma and Mainack Mondal, Indian Institute of Technology Kharagpur
Today, intelligent voice assistant (VA) software like Amazon's Alexa, Google's Voice Assistant (GVA) and Apple's Siri have millions of users. These VAs often collect and analyze huge user data for improving their functionality. However, this collected data may contain sensitive information (e.g., personal voice recordings) that users might not feel comfortable sharing with others and might cause significant privacy concerns. To counter such concerns, service providers like Google present their users with a personal data dashboard (called 'My Activity Dashboard'), allowing them to manage all voice assistant collected data. However, a real-world GVA-data driven understanding of user perceptions and preferences regarding this data (and data dashboards) remained relatively unexplored in prior research.
To that end, in this work we focused on Google Voice Assistant (GVA) users and investigated the perceptions and preferences of GVA users regarding data and dashboard while grounding them in real GVA-collected user data. Specifically, we conducted an 80-participant survey-based user study to collect both generic perceptions regarding GVA usage as well as desired privacy preferences for a stratified sample of their GVA data. We show that most participants had superficial knowledge about the type of data collected by GVA. Worryingly, we found that participants felt uncomfortable sharing a non-trivial 17.7% of GVA-collected data elements with Google. The current My Activity dashboard, although useful, did not help long-time GVA users effectively manage their data privacy. Our real-data-driven study found that showing users even one sensitive data element can significantly improve the usability of data dashboards. To that end, we built a classifier that can detect sensitive data for data dashboard recommendations with a 95% F1-score and shows 76% improvement over baseline models.
A Large-scale Temporal Measurement of Android Malicious Apps: Persistence, Migration, and Lessons Learned
Yun Shen and Pierre-Antoine Vervier, Norton Research Group; Gianluca Stringhini, Boston University
We study the temporal dynamics of potentially harmful apps (PHAs) on Android by leveraging 8.8M daily on-device detections collected among 11.7M customers of a popular mobile security product between 2019 and 2020. We show that the current security model of Android, which limits security products to run as regular apps and prevents them from automatically removing malicious apps opens a significant window of opportunity for attackers. Such apps warn users about the newly discovered threats, but users do not promptly act on this information, allowing PHAs to persist on their device for an average of 24 days after they are detected. We also find that while app markets remove PHAs after these become known, there is a significant delay between when PHAs are identified and when they are removed: PHAs persist on Google Play for 77 days on average and 34 days on third party marketplaces. Finally, we find evidence of PHAs migrating to other marketplaces after being removed on the original one. This paper provides an unprecedented view of the Android PHA landscape, showing that current defenses against PHAs on Android are not as effective as commonly thought, and identifying multiple research directions that the security community should pursue, from orchestrating more effective PHA takedowns to devising better alerts for mobile security products.
Midas: Systematic Kernel TOCTTOU Protection
Atri Bhattacharyya, EPFL; Uros Tesic, Nvidia; Mathias Payer, EPFL
Double-fetch bugs are a plague across all major operating system kernels. They occur when data is fetched twice across the user/kernel trust boundary while allowing concurrent modification. Such bugs enable an attacker to illegally access memory, cause denial of service, or to escalate privileges. So far, the only protection against double-fetch bugs is to detect and fix them. However, they remain incredibly hard to find. Similarly, they fundamentally prohibit efficient, kernel-based stateful system call filtering. We propose Midas to mitigate double-fetch bugs. Midas creates on-demand snapshots and copies of accessed data, enforcing our key invariant that throughout a syscall's lifetime, every read to a userspace object will return the same value.
Midas shows no noticeable drop in performance when evaluated on compute-bound workloads. On system call heavy workloads, Midas incurs 0.2-14% performance overhead, while protecting the kernel against any TOCTTOU attacks. On average, Midas shows a 3.4% overhead on diverse workloads across two benchmark suites.
Repurposing Segmentation as a Practical LVI-NULL Mitigation in SGX
Lukas Giner, Andreas Kogler, and Claudio Canella, Graz University of Technology; Michael Schwarz, CISPA Helmholtz Center for Information Security; Daniel Gruss, Graz University of Technology
Load Value Injection (LVI) uses Meltdown-type data flows in Spectre-like confused-deputy attacks. LVI has been demonstrated in practical attacks on Intel SGX enclaves, and consequently, mitigations were deployed that incur tremendous overheads of factor 2 to 19. However, as we discover, on fixed hardware LVI-NULL leakage is still present. Hence, to mitigate LVI-NULL in SGX enclaves on LVI-fixed CPUs, the expensive mitigations would still be necessary.
In this paper, we propose a lightweight mitigation focused on LVI-NULL in SGX, LVI-NULLify. We systematically analyze and categorize LVI-NULL variants. Our analysis reveals that previously proposed mitigations targeting LVI-NULL are not effective. Our novel mitigation addresses this problem by repurposing segmentation, a fast legacy hardware mechanism that x86 already uses for every memory operation. LVI-NULLify consists of a modified SGX-SDK and a compiler extension which put the enclave in control of LVI-NULL-exploitable memory locations. We evaluate LVI-NULLify on the LVI-fixed Comet Lake CPU and observe a performance overhead below 10% for the worst case, which is substantially lower than previous defenses with a prohibitive overhead of 1220% in the worst case. We conclude that LVI-NULLify is a practical solution to protect SGX enclaves against LVI-NULL today.
Orca: Blocklisting in Sender-Anonymous Messaging
Nirvan Tyagi and Julia Len, Cornell University; Ian Miers, University of Maryland; Thomas Ristenpart, Cornell Tech
Sender-anonymous end-to-end encrypted messaging allows sending messages to a recipient without revealing the sender's identity to the messaging platform. Signal recently introduced a sender anonymity feature that includes an abuse mitigation mechanism meant to allow the platform to block malicious senders on behalf of a recipient.
We explore the tension between sender anonymity and abuse mitigation. We start by showing limitations of Signal's deployed mechanism, observing that it results in relatively weak anonymity properties and showing a new griefing attack that allows a malicious sender to drain a victim's battery. We therefore design a new protocol, called Orca, that allows recipients to register a privacy-preserving blocklist with the platform. Without learning the sender's identity, the platform can check that the sender is not on the blocklist and that the sender can be identified by the recipient. We construct Orca using a new type of group signature scheme, for which we give formal security notions. Our prototype implementation showcases Orca's practicality.
Rendering Contention Channel Made Practical in Web Browsers
Shujiang Wu and Jianjia Yu, Johns Hopkins University; Min Yang, Fudan University; Yinzhi Cao, Johns Hopkins University
Browser rendering utilizes hardware resources shared within and across browsers to display web contents, thus inevitably being vulnerable to side channel attacks. Prior works have studied rendering side channels that are caused by rendering time differences of one frame, such as URL color change. However, it still remains unclear how rendering contentions play a role in side-channel attacks and covert communications.
In this paper, we design a novel rendering contention channel. Specifically, we stress the browser's rendering resource with stable, self-adjustable pressure and measure the time taken to render a sequence of frames. The measured time sequence is further used to infer any co-rendering event of the browser.
To better understand the channel, we study its cause via a method called single variable testing. That is, we keep all variables the same but only change one to test whether the changed variable contributes to the contention. Our results show that CPU, GPU and screen buffer are all part of the contention.
To demonstrate the channel's feasibility, we design and implement a prototype, open-source framework, called SIDER, to launch four attacks using the rendering contention channel, which are (i) cross-browser, cross-mode cookie synchronization, (ii) history sniffing, (iii) website fingerprinting, and (iv) keystroke logging. Our evaluation shows the effectiveness and feasibility of all four attacks.
OpenSSLNTRU: Faster post-quantum TLS key exchange
Daniel J. Bernstein, University of Illinois at Chicago and Ruhr University Bochum; Billy Bob Brumley, Tampere University; Ming-Shing Chen, Ruhr University Bochum; Nicola Tuveri, Tampere University
Google's CECPQ1 experiment in 2016 integrated a post-quantum key-exchange algorithm, newhope1024, into TLS 1.2. The Google-Cloudflare CECPQ2 experiment in 2019 integrated a more efficient key-exchange algorithm, ntruhrss701, into TLS 1.3.
This paper revisits the choices made in CECPQ2, and shows how to achieve higher performance for post-quantum key exchange in TLS 1.3 using a higher-security algorithm, sntrup761. Previous work had indicated that ntruhrss701 key generation was much faster than sntrup761 key generation, but this paper makes sntrup761 key generation much faster by generating a batch of keys at once.
Batch key generation is invisible at the TLS protocol layer, but raises software-engineering questions regarding the difficulty of integrating batch key exchange into existing TLS libraries and applications. This paper shows that careful choices of software layers make it easy to integrate fast post-quantum software, including batch key exchange, into TLS with minor changes to TLS libraries and no changes to applications.
As a demonstration of feasibility, this paper reports successful integration of its fast sntrup761 library, via a lightly patched OpenSSL, into an unmodified web browser and an unmodified TLS terminator. This paper also reports TLS 1.3 handshake benchmarks, achieving more TLS 1.3 handshakes per second than any software included in OpenSSL.
"OK, Siri" or "Hey, Google": Evaluating Voiceprint Distinctiveness via Content-based PROLE Score
Ruiwen He, Xiaoyu Ji, and Xinfeng Li, Zhejiang University; Yushi Cheng, Tsinghua University; Wenyuan Xu, Zhejiang University
A voiceprint is the distinctive pattern of human voices that is spectrographically produced and has been widely used for authentication in the voice assistants. This paper investigates the impact of speech contents on the distinctiveness of voiceprint, and has obtained answers to three questions by studying 2457 speakers and 14,600,000 test samples: 1) What are the influential factors that determine the distinctiveness of voiceprints? 2) How to quantify the distinctiveness of voiceprints for given words, e.g., wake-up words in commercial voice assistants? 3) How to construct wake-up words whose voiceprints have high distinctiveness levels. To answer those questions, we break down voiceprint into phones, and experimentally obtain the correlation between the false recognition rates and the richness of the phone types, the order, the length, and the elements of the phones. Then, we define PROLE Score that can be easily calculated based on speech content yet can reflect the voice distinctiveness. Under the guidance of PROLE Score, we tested 30 wake-up words of 19 commercial voice assistant products, e.g., "Hey, Siri'', "OK, Google'' and "Nihao, Xiaona'' in both English and Chinese. Finally, we provide recommendations for both users and manufacturers, on selecting secure voiceprint words.
PISTIS: Trusted Computing Architecture for Low-end Embedded Systems
Michele Grisafi, University of Trento; Mahmoud Ammar, Huawei Technologies; Marco Roveri and Bruno Crispo, University of Trento
Recently, several hardware-assisted security architectures have been proposed to mitigate the ever-growing cyber-attacks on Internet-connected devices. However, such proposals are not compatible with a large portion of the already deployed resource-constrained embedded devices due to hardware limitations. To fill this gap, we propose PISTIS, a pure-software trusted computing architecture for bare-metal low-end embedded devices. PISTIS enables several security services, such as memory isolation, remote attestation and secure code update, while fully supporting critical features such as Direct Memory Access (DMA) and interrupts. PISTIS targets a wide range of embedded devices including those that lack any hardware protection mechanisms, while only requiring a few kilobytes of Flash memory to store its root of trust (RoT) software. The entire architecture of PISTIS is built from the ground up by leveraging memory protection-enabling techniques such as assembly-level code verification and selective software virtualisation. Most importantly, PISTIS achieves strong security guarantees supported by a formally verified design. We implement and evaluate PISTIS on MSP430 architecture, showing a reasonable overhead in terms of runtime, memory footprint, and power consumption.
Stick It to The Man: Correcting for Non-Cooperative Behavior of Subjects in Experiments on Social Networks
Kaleigh Clary, University of Massachusetts Amherst; Emma Tosch and Jeremiah Onaolapo, University of Vermont; David D. Jensen, University of Massachusetts Amherst
A large body of research in network and social sciences studies the effects of interventions in network systems. Nearly all of this work assumes that network participants will respond to interventions in similar ways. However, in real-world systems, a subset of participants may respond in ways purposefully different than their true outcome. We characterize the influence of non-cooperative nodes and the bias these nodes introduce in estimates of average treatment effect (ATE). In addition to theoretical bounds, we empirically demonstrate estimation bias through experiments on synthetically generated graphs and a real-world network. We demonstrate that causal estimates in networks can be sensitive to the actions of non-cooperative members, and we identify network structures that are particularly vulnerable to non-cooperative responses.
Mining Node.js Vulnerabilities via Object Dependence Graph and Query
Song Li and Mingqing Kang, Johns Hopkins University; Jianwei Hou, Johns Hopkins University/Renmin University of China; Yinzhi Cao, Johns Hopkins University
Node.js is a popular non-browser JavaScript platform that provides useful but sometimes also vulnerable packages. On one hand, prior works have proposed many program analysis-based approaches to detect Node.js vulnerabilities, such as command injection and prototype pollution, but they are specific to individual vulnerability and do not generalize to a wide range of vulnerabilities on Node.js. On the other hand, prior works on C/C++ and PHP have proposed graph query-based approaches, such as Code Property Graph (CPG), to efficiently mine vulnerabilities, but they are not directly applicable to JavaScript due to the language's extensive use of dynamic features.
In the paper, we propose flow- and context-sensitive static analysis with hybrid branch-sensitivity and points-to information to generate a novel graph structure, called Object Dependence Graph (ODG), using abstract interpretation. ODG represents JavaScript objects as nodes and their relations with Abstract Syntax Tree (AST) as edges, and accepts graph queries—especially on object lookups and definitions—for detecting Node.js vulnerabilities.
We implemented an open-source prototype system, called ODGEN, to generate ODG for Node.js programs via abstract interpretation and detect vulnerabilities. Our evaluation of recent Node.js vulnerabilities shows that ODG together with AST and Control Flow Graph (CFG) is capable of modeling 13 out of 16 vulnerability types. We applied ODGEN to detect six types of vulnerabilities using graph queries: ODGEN correctly reported 180 zero-day vulnerabilities, among which we have received 70 Common Vulnerabilities and Exposures (CVE) identifiers so far.
Security and Privacy Perceptions of Third-Party Application Access for Google Accounts
David G. Balash, Xiaoyuan Wu, and Miles Grant, The George Washington University; Irwin Reyes, Two Six Technologies; Adam J. Aviv, The George Washington University
Online services like Google provide a variety of application programming interfaces (APIs). These online APIs enable authenticated third-party services and applications (apps) to access a user's account data for tasks such as single sign-on (SSO), calendar integration, and sending email on behalf of the user, among others. Despite their prevalence, API access could pose significant privacy and security risks, where a third-party could have unexpected privileges to a user's account. To gauge users' perceptions and concerns regarding third-party apps that integrate with online APIs, we performed a multi-part online survey of Google users. First, we asked n = 432 participants to recall if and when they allowed third-party access to their Google account: 89% recalled using at least one SSO and 52% remembered at least one third-party app. In the second survey, we re-recruited n = 214 participants to ask about specific apps and SSOs they've authorized on their own Google accounts. We collected in-the-wild data about users' actual SSOs and authorized apps: 86% used Google SSO on at least one service, and 67% had at least one third-party app authorized. After examining their apps and SSOs, participants expressed the most concern about access to personal information like email addresses and other publicly shared info. However, participants were less concerned with broader---and perhaps more invasive---access to calendars, emails, or cloud storage (as needed by third-party apps). This discrepancy may be due in part to trust transference to apps that integrate with Google, forming an implied partnership. Our results suggest opportunities for design improvements to the current third-party management tools offered by Google; for example, tracking recent access, automatically revoking access due to app disuse, and providing permission controls.
Online Website Fingerprinting: Evaluating Website Fingerprinting Attacks on Tor in the Real World
Giovanni Cherubin, Alan Turing Institute; Rob Jansen, U.S. Naval Research Laboratory; Carmela Troncoso, EPFL SPRING Lab
Distinguished Paper Award Winner and Second Prize Winner (tie) of the 2022 Internet Defense Prize
Website fingerprinting (WF) attacks on Tor allow an adversary who can observe the traffic patterns between a victim and the Tor network to predict the website visited by the victim. Existing WF attacks yield extremely high accuracy. However, the conditions under which these attacks are evaluated raises questions about their effectiveness in the real world. We conduct the first evaluation of website fingerprinting using genuine Tor traffic as ground truth and evaluated under a true open world. We achieve this by adapting the state-of-the-art Triplet Fingerprinting attack to an online setting and training the WF models on data safely collected on a Tor exit relay—a setup an adversary can easily deploy in practice. By studying WF under realistic conditions, we demonstrate that an adversary can achieve a WF classification accuracy of above 95% when monitoring a small set of 5 popular websites, but that accuracy quickly degrades to less than 80% when monitoring as few as 25 websites. We conclude that, although WF attacks may be possible, it is likely infeasible to carry them out in the real world while monitoring more than a small set of websites.
Rapid Prototyping for Microarchitectural Attacks
Catherine Easdon, Dynatrace Research and Graz University of Technology; Michael Schwarz, CISPA Helmholtz Center for Information Security; Martin Schwarzl and Daniel Gruss, Graz University of Technology
In recent years, microarchitectural attacks have been demonstrated to be a powerful attack class. However, as our empirical analysis shows, there are numerous implementation challenges that hinder discovery and subsequent mitigation of these vulnerabilities. In this paper, we examine the attack development process, the features and usability of existing tools, and the real-world challenges faced by practitioners. We propose a novel approach to microarchitectural attack development, based on rapid prototyping, and present two open-source software frameworks, libtea and SCFirefox, that improve upon state-of-the-art tooling to facilitate rapid prototyping of attacks.
libtea demonstrates that native code attacks can be abstracted sufficiently to permit cross-platform implementations while retaining fine-grained control of microarchitectural behavior. We evaluate its effectiveness by developing proof-of-concept Foreshadow and LVI attacks. Our LVI prototype runs on x86-64 and ARMv8-A, and is the first public demonstration of LVI on ARM. SCFirefox is the first tool for browser-based microarchitectural attack development, providing the functionality of libtea in JavaScript. This functionality can then be used to iteratively port a prototype to unmodified browsers. We demonstrate this process by prototyping the first browser-based ZombieLoad attack and deriving a vanilla JavaScript and WebAssembly PoC running in an unmodified recent version of Firefox. We discuss how libtea and SCFirefox contribute to the security landscape by providing attack researchers and defenders with frameworks to prototype attacks and assess their feasibility.
Caring about Sharing: User Perceptions of Multiparty Data Sharing
Bailey Kacsmar, Kyle Tilbury, Miti Mazmudar, and Florian Kerschbaum, University of Waterloo
Data sharing between companies is typically regarded as one-size-fits-all in practice and in research. For instance, the main source of information available to users about how a company shares their data is privacy policies. Privacy policies use ambiguous terms such as ‘third-parties' and ‘partners' with regard to who data is shared with. In the real-world, data sharing has more nuance than is captured by these overarching terms. We investigate whether users perceive different data sharing scenarios differently through an online survey with scenarios that describe specific types of multiparty data sharing practices. We determine users' perceptions when explicitly presented with how their data is shared, who it is shared with, and why. We show that users have preferences and that variations in acceptability exist which depend on the nature of the data sharing collaboration. Users caring about sharing, necessitates more transparent sharing practices and regulations.
Spoki: Unveiling a New Wave of Scanners through a Reactive Network Telescope
Raphael Hiesgen, HAW Hamburg; Marcin Nawrocki, Freie Universität Berlin; Alistair King, Kentik; Alberto Dainotti, CAIDA, UC San Diego and Georgia Institute of Technology; Thomas C. Schmidt, HAW Hamburg; Matthias Wählisch, Freie Universität Berlin
Large-scale Internet scans are a common method to identify victims of a specific attack. Stateless scanning like in ZMap has been established as an efficient approach to probing at Internet scale. Stateless scans, however, need a second phase to perform the attack. This remains invisible to network telescopes, which only capture the first incoming packet, and is not observed as a related event by honeypots, either. In this work, we examine Internet-wide scan traffic through Spoki, a reactive network telescope operating in real-time that we design and implement. Spoki responds to asynchronous TCP SYN packets and engages in TCP handshakes initiated in the second phase of two-phase scans. Because it is extremely lightweight it scales to large prefixes where it has the unique opportunity to record the first data sequence submitted within the TCP handshake ACK. We analyze two-phase scanners during a three months period using globally deployed Spoki reactive telescopes as well as flow data sets from IXPs and ISPs. We find that a predominant fraction of TCP SYNs on the Internet has irregular characteristics. Our findings also provide a clear signature of today's scans as: (i) highly targeted, (ii) scanning activities notably vary between regional vantage points, and (iii) a significant share originates from malicious sources.
Holistic Control-Flow Protection on Real-Time Embedded Systems with Kage
Yufei Du, UNC Chapel Hill and University of Rochester; Zhuojia Shen, Komail Dharsee, and Jie Zhou, University of Rochester; Robert J. Walls, Worcester Polytechnic Institute; John Criswell, University of Rochester
This paper presents Kage: a system that protects the control data of both application and kernel code on microcontroller-based embedded systems. Kage consists of a Kage-compliant embedded OS that stores all control data in separate memory regions from untrusted data, a compiler that transforms code to protect these memory regions efficiently and to add forward-edge control-flow integrity checks, and a secure API that allows safe updates to the protected data. We implemented Kage as an extension to FreeRTOS, an embedded real-time operating system. We evaluated Kage's performance using the CoreMark benchmark. Kage incurred a 5.2% average runtime overhead and 49.8% code size overhead. Furthermore, the code size overhead was only 14.2% when compared to baseline FreeRTOS with the MPU enabled. We also evaluated Kage's security guarantees by measuring and analyzing reachable code-reuse gadgets. Compared to FreeRTOS, Kage reduces the number of reachable gadgets from 2,276 to 27, and the remaining 27 gadgets cannot be stitched together to launch a practical attack.
"I feel invaded, annoyed, anxious and I may protect myself": Individuals' Feelings about Online Tracking and their Protective Behaviour across Gender and Country
Kovila P.L. Coopamootoo and Maryam Mehrnezhad, Newcastle University; Ehsan Toreini, Durham University
Online tracking is a primary concern for Internet users, yet previous research has not found a clear link between the cognitive understanding of tracking and protective actions. We postulate that protective behaviour follows affective evaluation of tracking. We conducted an online study, with N=614 participants, across the UK, Germany and France, to investigate how users feel about third-party tracking and what protective actions they take. We found that most participants' feelings about tracking were negative, described as deeply intrusive - beyond the informational sphere, including feelings of annoyance and anxiety, that predict protective actions. We also observed indications of a ‘privacy gender gap', where women feel more negatively about tracking, yet are less likely to take protective actions, compared to men. And less UK individuals report negative feelings and protective actions, compared to those from Germany and France. This paper contributes insights into the affective evaluation of privacy threats and how it predicts protective behaviour. It also provides a discussion on the implications of these findings for various stakeholders, make recommendations and outline avenues for future work.
Mistrust Plugins You Must: A Large-Scale Study Of Malicious Plugins In WordPress Marketplaces
Ranjita Pai Kasturi, Jonathan Fuller, Yiting Sun, Omar Chabklo, Andres Rodriguez, Jeman Park, and Brendan Saltaformaggio, Georgia Institute of Technology
Modern websites owe most of their aesthetics and functionalities to Content Management Systems (CMS) plugins, which are bought and sold on widely popular marketplaces. Driven by economic incentives, attackers abuse the trust in this economy: selling malware on legitimate marketplaces, pirating popular plugins, and infecting plugins post-deployment. This research studied the evolution of CMS plugins in over 400K production webservers dating back to 2012. We developed YODA, an automated framework to detect malicious plugins and track down their origin. YODA uncovered 47,337 malicious plugins on 24,931 unique websites. Among these, $41.5K had been spent on 3,685 malicious plugins sold on legitimate plugin marketplaces. Pirated plugins cheated developers out of $228K in revenues. Post-deployment attacks infected $834K worth of previously benign plugins with malware. Lastly, YODA informs our remediation efforts, as over 94% of these malicious plugins are still active today.
On the Security Risks of AutoML
Ren Pang and Zhaohan Xi, Pennsylvania State University; Shouling Ji, Zhejiang University; Xiapu Luo, Hong Kong Polytechnic University; Ting Wang, Pennsylvania State University
Neural architecture search (NAS) represents an emerging machine learning (ML) paradigm that automatically searches for model architectures tailored to given tasks, which significantly simplifies the development of ML systems and propels the trend of ML democratization. Yet, thus far little is known about the potential security risks incurred by NAS, which is concerning given the increasing use of NAS-generated models in critical domains.
This work represents a solid initial step towards bridging the gap. First, through an extensive empirical study of 10 popular NAS methods, we show that compared with their manually designed counterparts, NAS-generated models tend to suffer greater vulnerabilities to various malicious manipulations (e.g., adversarial evasion, model poisoning, functionality stealing). Further, with both empirical and analytical evidence, we provide possible explanations for such phenomena: given the prohibitive search space and training cost, most NAS methods favor models that converge fast at early training stages; this preference results in architectural properties associated with attack vulnerabilities (e.g., high loss smoothness, low gradient variance). Our findings not only reveal the relationships between model characteristics and attack vulnerabilities but also suggest the inherent connections underlying different attacks. Finally, we discuss potential remedies to mitigate such drawbacks, including increasing cell depth and suppressing skip connects, which lead to several promising research directions.
Morphuzz: Bending (Input) Space to Fuzz Virtual Devices
Alexander Bulekov, Boston University and Red Hat; Bandan Das and Stefan Hajnoczi, Red Hat; Manuel Egele, Boston University
The security of the entire cloud ecosystem crucially depends on the isolation guarantees that hypervisors provide between guest VMs and the host system. To allow VMs to communicate with their environment, hypervisors provide a slew of virtual-devices including network interface cards and performance-optimized VIRTIO-based SCSI adapters. As these devices sit directly on the hypervisor's isolation boundary and accept potentially attacker controlled input (e.g., from a malicious cloud tenant), bugs and vulnerabilities in the devices' implementations have the potential to render the hypervisor's isolation guarantees moot. Prior works applied fuzzing to simple virtual-devices, focusing on a narrow subset of the vast input-space and the state-of-the-art virtual-device fuzzer, Nyx, requires precise, manually-written, specifications to exercise complex devices.
In this paper we present MORPHUZZ, a generic approach that leverages insights about hypervisor design combined with coverage-guided fuzzing to find bugs in virtual device implementations. Crucially MORPHUZZ does not rely on expert knowledge specific to each device. MORPHUZZ is the first approach that automatically elicits the complex I/O behaviors of the real-world virtual devices found in modern clouds. To demonstrate this capability, we implemented MORPHUZZ in QEMU and bhyve and fuzzed 33 different virtual devices (a superset of the 16 devices analyzed by prior work). Additionally, we show that MORPHUZZ is not tied to a specific CPU architecture, by fuzzing 3 additional ARM devices. MORPHUZZ matches or exceeds coverage obtained by Nyx, for 13/16 virtual devices, and identified a superset (110) of all crashes reported by Nyx (44). We reported all newly discovered bugs to the respective developers. Notably, MORPHUZZ achieves this without initial seed-inputs, or expert guidance.
Towards More Robust Keyword Spotting for Voice Assistants
Shimaa Ahmed, University of Wisconsin-Madison; Ilia Shumailov, University of Cambridge; Nicolas Papernot, University of Toronto and Vector Institute; Kassem Fawaz, University of Wisconsin-Madison
Voice assistants rely on keyword spotting (KWS) to process vocal commands issued by humans: commands are prepended with a keyword, such as "Alexa" or "Ok Google," which must be spotted to activate the voice assistant. Typically, keyword spotting is two-fold: an on-device model first identifies the keyword, then the resulting voice sample triggers a second on-cloud model which verifies and processes the activation. In this work, we explore the significant privacy and security concerns that this raises under two threat models. First, our experiments demonstrate that accidental activations result in up to a minute of speech recording being uploaded to the cloud. Second, we verify that adversaries can systematically trigger misactivations through adversarial examples, which exposes the integrity and availability of services connected to the voice assistant. We propose EKOS (Ensemble for KeywOrd Spotting) which leverages the semantics of the KWS task to defend against both accidental and adversarial activations. EKOS incorporates spatial redundancy from the acoustic environment at training and inference time to minimize distribution drifts responsible for accidental activations. It also exploits a physical property of speech---its redundancy at different harmonics---to deploy an ensemble of models trained on different harmonics and provably force the adversary to modify more of the frequency spectrum to obtain adversarial examples. Our evaluation shows that EKOS increases the cost of adversarial activations, while preserving the natural accuracy. We validate the performance of EKOS with over-the-air experiments on commodity devices and commercial voice assistants; we find that EKOS improves the precision of the KWS task in non-adversarial settings.
Web Cache Deception Escalates!
Seyed Ali Mirheidari, University of Trento & Splunk Inc.; Matteo Golinelli, University of Trento; Kaan Onarlioglu, Akamai Technologies; Engin Kirda, Northeastern University; Bruno Crispo, University of Trento
Web Cache Deception (WCD) tricks a web cache into erroneously storing sensitive content, thereby making it widely accessible on the Internet. In a USENIX Security 2020 paper titled "Cached and Confused: Web Cache Deception in the Wild", researchers presented the first systematic exploration of the attack over 340 websites. This state-of-the-art approach for WCD detection injects markers into websites and checks for leaks into caches. However, this scheme has two fundamental limitations: 1) It cannot probe websites that do not present avenues for marker injection or reflection. 2) Marker setup is a burdensome process, making large-scale measurements infeasible. More generally, all previous literature on WCD focuses solely on personal information leaks on websites protected behind authentication gates, leaving important gaps in our understanding of the full ramifications of WCD.
We expand our knowledge of WCD attacks, their spread, and implications. We propose a novel WCD detection methodology that forgoes testing prerequisites, and utilizes page identicality checks and cache header heuristics to test any website. We conduct a comparative experiment on 404 websites, and show that our scheme identifies over 100 vulnerabilities while "Cached and Confused" is capped at 18. Equipped with a technique unhindered by the limitations of the previous work, we conduct the largest WCD experiment to date on the Alexa Top 10K, and detect 1188 vulnerable websites. We present case studies showing that WCD has consequences well beyond personal information leaks, and that attacks targeting non-authenticated pages are highly damaging.
Exploring the Unchartered Space of Container Registry Typosquatting
Guannan Liu, Virginia Tech; Xing Gao, University of Delaware; Haining Wang, Virginia Tech; Kun Sun, George Mason University
With the increasing popularity of containerized applications, container registries have hosted millions of repositories that allow developers to store, manage, and share their software. Unfortunately, they have also become a hotbed for adversaries to spread malicious images to the public. In this paper, we present the first in-depth study on the vulnerability of container registries to typosquatting attacks, in which adversaries intentionally upload malicious images with an identification similar to that of a benign image so that users may accidentally download malicious images due to typos. We demonstrate that such typosquatting attacks could pose a serious security threat in both public and private registries as well as across multiple platforms. To shed light on the container registry typosquatting threat, we first conduct a measurement study and a 210-day proof-of-concept exploitation on public container registries, revealing that human users indeed make random typos and download unwanted container images. We also systematically investigate attack vectors on private registries and reveal that its naming space is open and could be easily exploited for launching a typosquatting attack. In addition, for a typosquatting attack across multiple platforms, we demonstrate that adversaries can easily self-host malicious registries or exploit existing container registries to manipulate repositories with similar identifications. Finally, we propose CRYSTAL, a lightweight extension to existing image management, which effectively defends against typosquatting attacks from both container users and registries.
Can one hear the shape of a neural network?: Snooping the GPU via Magnetic Side Channel
Henrique Teles Maia and Chang Xiao, Columbia University; Dingzeyu Li, Adobe Research; Eitan Grinspun, Columbia University & University of Toronto; Changxi Zheng, Columbia University
Neural network applications have become popular in both enterprise and personal settings. Network solutions are tuned meticulously for each task, and designs that can robustly resolve queries end up in high demand. As the commercial value of accurate and performant machine learning models increases, so too does the demand to protect neural architectures as confidential investments. We explore the vulnerability of neural networks deployed as black boxes across accelerated hardware through electromagnetic side channels.
We examine the magnetic flux emanating from a graphics processing unit's power cable, as acquired by a cheap $3 induction sensor, and find that this signal betrays the detailed topology and hyperparameters of a black-box neural network model. The attack acquires the magnetic signal for one query with unknown input values, but known input dimension and batch size. The network reconstruction is possible due to the modular layer sequence in which deep neural networks are evaluated. We find that each layer component's evaluation produces an identifiable magnetic signal signature, from which layer topology, width, function type, and sequence order can be inferred using a suitably trained classifier and a joint consistency optimization based on integer programming.
We study the extent to which network specifications can be recovered, and consider metrics for comparing network similarity. We demonstrate the potential accuracy of this side channel attack in recovering the details for a broad range of network architectures, including random designs. We consider applications that may exploit this novel side channel exposure, such as adversarial transfer attacks. In response, we discuss countermeasures to protect against our method and other similar snooping techniques.
Augmenting Decompiler Output with Learned Variable Names and Types
Qibin Chen and Jeremy Lacomis, Carnegie Mellon University; Edward J. Schwartz, Carnegie Mellon University Software Engineering Institute; Claire Le Goues, Graham Neubig, and Bogdan Vasilescu, Carnegie Mellon University
Distinguished Paper Award Winner
A common tool used by security professionals for reverse-engineering binaries found in the wild is the decompiler. A decompiler attempts to reverse compilation, transforming a binary to a higher-level language such as C. High-level languages ease reasoning about programs by providing useful abstractions such as loops, typed variables, and comments, but these abstractions are lost during compilation. Decompilers are able to deterministically reconstruct structural properties of code, but comments, variable names, and custom variable types are technically impossible to recover.
In this paper we present DIRTY (DecompIled variable ReTYper), a novel technique for improving the quality of decompiler output that automatically generates meaningful variable names and types. DIRTY is built on a Transformer-based neural network model and is trained on code automatically scraped from repositories on GitHub. DIRTY uses this model to postprocesses decompiled files, recommending variable types and names given their context. Empirical evaluation on a novel dataset of C code mined from GitHub shows that DIRTY outperforms prior work approaches by a sizable margin, recovering the original names written by developers 66.4% of the time and the original types 75.8% of the time.
Inference Attacks Against Graph Neural Networks
Zhikun Zhang, Min Chen, and Michael Backes, CISPA Helmholtz Center for Information Security; Yun Shen, Norton Research Group; Yang Zhang, CISPA Helmholtz Center for Information Security
Graph is an important data representation ubiquitously existing in the real world. However, analyzing the graph data is computationally difficult due to its non-Euclidean nature. Graph embedding is a powerful tool to solve the graph analytics problem by transforming the graph data into low-dimensional vectors. These vectors could also be shared with third parties to gain additional insights of what is behind the data. While sharing graph embedding is intriguing, the associated privacy risks are unexplored. In this paper, we systematically investigate the information leakage of the graph embedding by mounting three inference attacks. First, we can successfully infer basic graph properties, such as the number of nodes, the number of edges, and graph density, of the target graph with up to 0.89 accuracy. Second, given a subgraph of interest and the graph embedding, we can determine with high confidence that whether the subgraph is contained in the target graph. For instance, we achieve 0.98 attack AUC on the DD dataset. Third, we propose a novel graph reconstruction attack that can reconstruct a graph that has similar graph structural statistics to the target graph. We further propose an effective defense mechanism based on graph embedding perturbation to mitigate the inference attacks without noticeable performance degradation for graph classification tasks.
LinKRID: Vetting Imbalance Reference Counting in Linux kernel with Symbolic Execution
Jian Liu, {CAS-KLONAT, BKLONSPT}, Institute of Information Engineering, Chinese Academy of Sciences and School of Cyber Security, University of Chinese Academy of Sciences; Lin Yi, {CAS-KLONAT, BKLONSPT}, Institute of Information Engineering, Chinese Academy of Sciences; Weiteng Chen, Chengyu Song, and Zhiyun Qian, UC Riverside; Qiuping Yi, Beijing University of Posts and Telecommunications and Beijing Key Lab of Intelligent Telecommunication Software and Multimedia
Linux kernel employs reference counters, which record the number of references to a shared kernel object, to track its lifecycle and prevent memory errors like use-after-free. However, the usage of reference counters can be tricky and often error-prone, especially considering unique kernel conventions of managing reference counters (e.g., external vs. internal reference counters). In this paper, we aim to automatically discover incorrect usage of reference counters, overcoming two key challenges: (1) scalability and (2) the aforementioned unique kernel conventions. Specifically, we develop a tiered program analysis based solution to efficiently and precisely check the imbalances between the change in the actual number of references and the corresponding reference counter. We apply our tool to the 4.14.0 kernel (with allyesconfig) and find 118 bugs, out of which 87 are new. The result shows our tool is scalable and effective.
Total Eclipse of the Heart – Disrupting the InterPlanetary File System
Bernd Prünster, Institute of Applied Information Processing and Communications (IAIK), Graz University of Technology; Alexander Marsalek, A-SIT Secure Information Technology Center Austria; Thomas Zefferer, A-SIT Plus GmbH
Peer-to-peer networks are an attractive alternative to classical client-server architectures in several fields of application such as voice-over-IP telephony and file sharing. Recently, a new peer-to-peer solution called the InterPlanetary File System (IPFS) has attracted attention, with its promise of re-decentralising the Web. Being increasingly used as a stand-alone application, IPFS has also emerged as the technical backbone of various other decentralised solutions and was even used to evade censorship. Decentralised applications serving millions of users rely on IPFS as one of their crucial building blocks. This popularity also makes IPFS attractive for large-scale attacks. We have identified a conceptual issue in one of IPFS's core libraries and demonstrate its exploitation by means of a successful end-to-end attack. We evaluated this attack against the IPFS reference implementation on the public IPFS network, which is used by the average user to share and consume IPFS content. The results obtained from mounting this attack on live IPFS nodes show that arbitrary IPFS nodes can be eclipsed, i.e. isolated from the network, with moderate effort and limited resources. Compared to similar works, we show that our attack scales well even beyond current network sizes and can disrupt the entire public IPFS network with alarmingly low effort. The vulnerability set described in this paper has been assigned CVE-2020-10937. Responsible disclosure procedures have led to mitigations being deployed. The issues presented in this paper were publicly disclosed together with Protocol Labs, the company coordinating the IPFS development in October 2020.
Post-Quantum Cryptography with Contemporary Co-Processors: Beyond Kronecker, Schönhage-Strassen & Nussbaumer
Joppe W. Bos, Joost Renes, and Christine van Vredendaal, NXP Semiconductors
There are currently over 30 billion IoT (Internet of Things) devices installed worldwide. To secure these devices from various threats one often relies on public-key cryptographic primitives whose operations can be costly to compute on resource-constrained IoT devices. To support such operations these devices often include a dedicated co-processor for cryptographic procedures, typically in the form of a big integer arithmetic unit. Such existing arithmetic co-processors do not offer the functionality that is expected by upcoming post-quantum cryptographic primitives. Regardless, contemporary systems may exist in the field for many years to come.
In this paper we propose the Kronecker+ algorithm for polynomial multiplication in rings of the form Z[X]/(X^n +1): the arithmetic foundation of many lattice-based cryptographic schemes. We discuss how Kronecker+ allows for re-use of existing co-processors for post-quantum cryptography, and in particular directly applies to the various finalists in the post-quantum standardization effort led by NIST. We demonstrate the effectiveness of our algorithm in practice by integrating Kronecker+ into Saber: one of the finalists in the ongoing NIST standardization effort. On our target platform, a RV32IMC with access to a dedicated arithmetic co-processor designed to accelerate RSA and ECC, Kronecker+ performs the matrix multiplication 2.8 times faster than regular Kronecker substitution and 1.7 times faster than Harvey's negated-evaluation-points method.
MAGE: Mutual Attestation for a Group of Enclaves without Trusted Third Parties
Guoxing Chen, Shanghai Jiao Tong University; Yinqian Zhang, Southern University of Science and Technology
Remote attestation mechanism enables an enclave to attest its identity (which is usually represented by the enclave's initial code and data) to another enclave. To verify that the attested identity is trusted, one enclave usually includes the identity of the enclave it trusts into its initial data in advance assuming no trusted third parties are available during runtime to provide this piece of information. However, when mutual trust between these two enclaves is required, it is infeasible to simultaneously include into their own initial data the other's identities respectively as any change to the initial data will change their identities, making the previously included identities invalid. In this paper, we propose MAGE, a framework enabling a group of enclaves to mutually attest each other without trusted third parties. Particularly, we introduce a technique to instrument these enclaves so that each of them could derive the others' identities using information solely from its own initial data. We also provide an open-sourced prototype implementation based on Intel SGX SDK, to facilitate enclave developers to adopt this technique.
Debloating Address Sanitizer
Yuchen Zhang, Stevens Institute of Technology; Chengbin Pang, Nanjing University; Georgios Portokalidis, Nikos Triandopoulos, and Jun Xu, Stevens Institute of Technology
Address Sanitizer (ASan) is a powerful memory error detector. It can detect various errors ranging from spatial issues like out-of-bound accesses to temporal issues like use-after-free. However, ASan has the major drawback of high runtime overhead. With every functionality enabled, ASan incurs an overhead of more than 1x.
This paper first presents a study to dissect the operations of ASan and inspects the primary sources of its runtime overhead. The study unveils (or confirms) that the high overhead is mainly caused by the extensive sanitizer checks on memory accesses. Inspired by the study, the paper proposes ASan--, a tool assembling a group of optimizations to reduce (or "debloat") sanitizer checks and improve ASan's efficiency. Unlike existing tools that remove sanitizer checks with harm to the capability, scalability, or usability of ASan, ASan-- fully maintains those decent properties of ASan.
Our evaluation shows that ASan-- presents high promise. It reduces the overhead of ASan by 41.7% on SPEC CPU2006 and by 35.7% on Chromium. If only considering the overhead incurred by sanitizer checks, the reduction rates increase to 51.6% on SPEC CPU2006 and 69.6% on Chromium. In the context of fuzzing, ASan-- increases the execution speed of AFL by over 40% and the branch coverage by 5%. Combined with orthogonal, fuzzing-tailored optimizations, ASan-- can speed up AFL by 60% and increase the branch coverage by 9%. Running in Chromium to support our daily work for four weeks, ASan-- did not present major usability issues or significant slowdown and it detected all the bugs we reproduced from previous reports.
Synthetic Data – Anonymisation Groundhog Day
Theresa Stadler, EPFL; Bristena Oprisanu, UCL; Carmela Troncoso, EPFL
Synthetic data has been advertised as a silver-bullet solution to privacy-preserving data publishing that addresses the shortcomings of traditional anonymisation techniques. The promise is that synthetic data drawn from generative models preserves the statistical properties of the original dataset but, at the same time, provides perfect protection against privacy attacks. In this work, we present the first quantitative evaluation of the privacy gain of synthetic data publishing and compare it to that of previous anonymisation techniques.
Our evaluation of a wide range of state-of-the-art generative models demonstrates that synthetic data either does not prevent inference attacks or does not retain data utility. In other words, we empirically show that synthetic data does not provide a better tradeoff between privacy and utility than traditional anonymisation techniques. Furthermore, in contrast to traditional anonymisation, the privacy-utility tradeoff of synthetic data publishing is hard to predict. Because it is impossible to predict what signals a synthetic dataset will preserve and what information will be lost, synthetic data leads to a highly variable privacy gain and unpredictable utility loss. In summary, we find that synthetic data is far from the holy grail of privacy-preserving data publishing.
FReD: Identifying File Re-Delegation in Android System Services
Sigmund Albert Gorski III, Seaver Thorn, and William Enck, North Carolina State University; Haining Chen, Google
The security of the Android platform benefits greatly from a privileged middleware that provides indirect access to protected resources. This architecture is further enhanced by privilege separating functionality into many different services and carefully tuning file access control policy to mitigate the impact of software vulnerabilities. However, these services can become confused deputies if they improperly re-delegate file access to third-party applications through remote procedure call (RPC) interfaces. In this paper, we propose a static program analysis tool called FReD, which identifies a mapping between Java-based system service RPC interfaces and the file paths opened within the Java and C/C++ portions of the service. It then combines the Linux-layer file access control policy with the Android-layer permission policy to identify potential file re-delegation. We use FReD to analyze three devices running Android 10 and identify 12 confused deputies that are accessible from third-party applications. These vulnerabilities include five CVEs with moderate severity, demonstrating the utility of semi-automated approaches to discover subtle flaws in access control enforcement.
WebGraph: Capturing Advertising and Tracking Information Flows for Robust Blocking
Sandra Siby, EPFL; Umar Iqbal, University of Iowa; Steven Englehardt, DuckDuckGo; Zubair Shafiq, UC Davis; Carmela Troncoso, EPFL
Users rely on ad and tracker blocking tools to protect their privacy. Unfortunately, existing ad and tracker blocking tools are susceptible to mutable advertising and tracking content. In this paper, we first demonstrate that a state-of-the-art ad and tracker blocker, AdGraph, is susceptible to such adversarial evasion techniques that are currently deployed on the web. Second, we introduce WebGraph, the first ML-based ad and tracker blocker that detects ads and trackers based on their action rather than their content. By featurizing the actions that are fundamental to advertising and tracking information flows – e.g., storing an identifier in the browser or sharing an identifier with another tracker – WebGraph performs nearly as well as prior approaches, but is significantly more robust to adversarial evasions. In particular, we show that WebGraph achieves comparable accuracy to AdGraph, while significantly decreasing the success rate of an adversary from near-perfect for AdGraph to around 8% for WebGraph. Finally, we show that WebGraph remains robust to sophisticated adversaries that use adversarial evasion techniques beyond those currently deployed on the web.
Adversarial Detection Avoidance Attacks: Evaluating the robustness of perceptual hashing-based client-side scanning
Shubham Jain, Ana-Maria Crețu, and Yves-Alexandre de Montjoye, Imperial College London
End-to-end encryption (E2EE) by messaging platforms enable people to securely and privately communicate with one another. Its widespread adoption however raised concerns that illegal content might now be shared undetected. Following the global pushback against key escrow systems, client-side scanning based on perceptual hashing has been recently proposed by tech companies, governments and researchers to detect illegal content in E2EE communications. We here propose the first framework to evaluate the robustness of perceptual hashing-based client-side scanning to detection avoidance attacks and show current systems to not be robust. More specifically, we propose three adversarial attacks–a general black-box attack and two white-box attacks for discrete cosine transform-based algorithms–against perceptual hashing algorithms. In a large-scale evaluation, we show perceptual hashing-based client-side scanning mechanisms to be highly vulnerable to detection avoidance attacks in a black-box setting, with more than 99.9% of images successfully attacked while preserving the content of the image. We furthermore show our attack to generate diverse perturbations, strongly suggesting that straightforward mitigation strategies would be ineffective. Finally, we show that the larger thresholds necessary to make the attack harder would probably require more than one billion images to be flagged and decrypted daily, raising strong privacy concerns. Taken together, our results shed serious doubts on the robustness of perceptual hashingbased client-side scanning mechanisms currently proposed by governments, organizations, and researchers around the world.
Elasticlave: An Efficient Memory Model for Enclaves
Jason Zhijingcheng Yu, National University of Singapore; Shweta Shinde, ETH Zurich; Trevor E. Carlson and Prateek Saxena, National University of Singapore
Trusted execution environments (TEEs) isolate user-space applications into secure enclaves without trusting the OS. Existing TEE memory models are rigid — they do not allow an enclave to share memory with other enclaves. This lack of essential functionality breaks compatibility with several constructs such as shared memory, pipes, and fast mutexes that are frequently required in data intensive use-cases. In this work, we present Elasticlave, a new TEE memory model which allows sharing. Elasticlave strikes a balance between security and flexibility in managing access permissions. Our implementation of Elasticlave on RISC-V achieves performance overheads of about 10% compared to native (non-TEE) execution for data sharing workloads. In contrast, a similarly secure implementation on a rigid TEE design incurs 1-2 orders of magnitude overheads for these workloads. Thus, Elasticlave enables cross-enclave data sharing with much better performance.
Practical Data Access Minimization in Trigger-Action Platforms
Yunang Chen and Mohannad Alhanahnah, University of Wisconsin–Madison; Andrei Sabelfeld, Chalmers University of Technology; Rahul Chatterjee and Earlence Fernandes, University of Wisconsin–Madison
Trigger-Action Platforms (TAPs) connect disparate online services and enable users to create automation rules in diverse domains such as smart homes and business productivity. Unfortunately, the current design of TAPs is flawed from a privacy perspective, allowing unfettered access to sensitive user data. We point out that it suffers from two types of overprivilege: (1) attribute-level, where it has access to more data attributes than it needs for running user-created rules; and (2) token-level, where it has access to more APIs than it needs. To mitigate overprivilege and subsequent privacy concerns we design and implement minTAP, a practical approach to data access minimization in TAPs. Our key insight is that the semantics of a user-created automation rule implicitly specifies the minimal amount of data it needs. This allows minTAP to leverage language-based data minimization to apply the principle of least-privilege by releasing only the necessary attributes of user data to TAPs and fending off unrelated API access. Using real user-created rules on the popular IFTTT TAP, we demonstrate that minTAP sanitizes a median of 4 sensitive data attributes per rule, with modest performance overhead and without modifying IFTTT.
Bedrock: Programmable Network Support for Secure RDMA Systems
Jiarong Xing, Kuo-Feng Hsu, Yiming Qiu, Ziyang Yang, Hongyi Liu, and Ang Chen, Rice University
Remote direct memory access (RDMA) has gained popularity in cloud datacenters. In RDMA, clients bypass server CPUs and directly read/write remote memory. Recent findings have highlighted a host of vulnerabilities with RDMA, which give rise to attacks such as packet injection, denial of service, and side channel leakage, but RDMA defenses are still lagging behind. As the RDMA datapath bypasses CPU-based software processing, traditional defenses cannot be easily inserted without incurring performance penalty. Bedrock develops a security foundation for RDMA inside the network, leveraging programmable data planes in modern network hardware. It designs a range of defense primitives, including source authentication, access control, as well as monitoring and logging, to address RDMA-based attacks. Bedrock does not incur software overhead to the critical datapath, and delivers native RDMA performance in data transfers. Moreover, Bedrock operates transparently to legacy RDMA systems, without requiring RNIC, OS, or RDMA library changes. We present a comprehensive set of experiments on Bedrock and demonstrate its effectiveness.
VerLoc: Verifiable Localization in Decentralized Systems
Katharina Kohls, Radboud University Nijmegen; Claudia Diaz, imec-COSIC KU Leuven and Nym Technologies SA
We tackle the challenge of reliably determining the geolocation of nodes in decentralized networks, considering adversarial settings and without depending on any trusted landmarks. In particular, we consider active adversaries that control a subset of nodes, announce false locations and strategically manipulate measurements. To address this problem we propose, implement and evaluate VerLoc, a system that allows verifying the claimed geo-locations of network nodes in a fully decentralized manner. VerLoc securely schedules roundtrip time (RTT) measurements between randomly chosen pairs of nodes. Trilateration is then applied to the set of measurements to verify claimed geo-locations. We evaluate VerLoc both with simulations and in the wild using a prototype implementation integrated in the Nym network (currently run by thousands of nodes). We find that VerLoc can localize nodes in the wild with a median error of 60 km, and that in attack simulations it is capable of detecting and filtering out adversarial timing manipulations for network setups with up to 20 % malicious nodes.
Lamphone: Passive Sound Recovery from a Desk Lamp's Light Bulb Vibrations
Ben Nassi, Yaron Pirutin, and Raz Swisa, Ben-Gurion University of the Negev; Adi Shamir, Weizmann Institute of Science; Yuval Elovici and Boris Zadov, Ben-Gurion University of the Negev
In this paper, we introduce "Lamphone," an optical side-channel attack used to recover sound from desk lamp light bulbs; such lamps are commonly used in home offices, which became a primary work setting during the COVID-19 pandemic. We show how fluctuations in the air pressure on the surface of a light bulb, which occur in response to sound and cause the bulb to vibrate very slightly (a millidegree vibration), can be exploited by eavesdroppers to recover speech passively, externally, and using equipment that provides no indication regarding its application. We analyze a light bulb's response to sound via an electro-optical sensor and learn how to isolate the audio signal from the optical signal. We compare Lamphone to related methods presented in other studies and show that Lamphone can recover sound at high quality and lower volume levels that those methods. Finally, we show that eavesdroppers can apply Lamphone in order to recover speech at the sound level of a virtual meeting with fair intelligibility when the victim is sitting/working at a desk that contains a desk lamp with a light bulb from a distance of 35 meters.
Automating Cookie Consent and GDPR Violation Detection
Dino Bollinger, Karel Kubicek, Carlos Cotrini, and David Basin, ETH Zurich
Distinguished Artifact Award Winner
The European Union's General Data Protection Regulation (GDPR) requires websites to inform users about personal data collection and request consent for cookies. Yet the majority of websites do not give users any choices, and others attempt to deceive them into accepting all cookies. We document the severity of this situation through an analysis of potential GDPR violations in cookie banners in almost 30k websites. We identify six novel violation types, such as incorrect category assignments and misleading expiration times, and we find at least one potential violation in a surprising 94.7% of the analyzed websites.
We address this issue by giving users the power to protect their privacy. We develop a browser extension, called CookieBlock, that uses machine learning to enforce GDPR cookie consent at the client. It automatically categorizes cookies by usage purpose using only the information provided in the cookie itself. At a mean validation accuracy of 84.4%, our model attains a prediction quality competitive with expert knowledge in the field. Additionally, our approach differs from prior work by not relying on the cooperation of websites themselves. We empirically evaluate CookieBlock on a set of 100 randomly sampled websites, on which it filters roughly 90% of the privacy-invasive cookies without significantly impairing website functionality.
LTrack: Stealthy Tracking of Mobile Phones in LTE
Martin Kotuliak, Simon Erni, Patrick Leu, Marc Röschlin, and Srdjan Čapkun, ETH Zurich
We introduce LTrack, a new tracking attack on LTE that allows an attacker to stealthily extract user devices' locations and permanent identifiers (IMSI). To remain stealthy, the localization of devices in LTrack is fully passive, relying on our new uplink/downlink sniffer. Our sniffer records both the times of arrival of LTE messages and the contents of the Timing Advance Commands, based on which LTrack calculates locations. LTrack is the first to show the feasibility of a passive localization in LTE through implementation on software-defined radio.
Passive localization attacks reveal a user's location traces but can at best link these traces to a device's pseudonymous temporary identifier (TMSI), making tracking in dense areas or over a long time-period challenging. LTrack overcomes this challenge by introducing and implementing a new type of IMSI Catcher named IMSI Extractor. It extracts a device's IMSI and binds it to its current TMSI. Instead of relying on fake base stations like existing IMSI Catchers, which are detectable due to their continuous transmission, IMSI Extractor relies on our uplink/downlink sniffer enhanced with surgical message overshadowing. This makes our IMSI Extractor the stealthiest IMSI Catcher to date.
We evaluate LTrack through a series of experiments and show that in line-of-sight conditions, the attacker can estimate the location of a phone with less than 6m error in 90% of the cases. We successfully tested our IMSI Extractor against a set of 17 modern smartphones connected to our industry-grade LTE testbed. We further validated our uplink/downlink sniffer and IMSI Extractor in a test facility of an operator.
How to Abuse and Fix Authenticated Encryption Without Key Commitment
Ange Albertini and Thai Duong, Google Research; Shay Gueron, University of Haifa and Amazon; Stefan Kölbl, Atul Luykx, and Sophie Schmieg, Google Research
Authenticated encryption (AE) is used in a wide variety of applications, potentially in settings for which it was not originally designed. Recent research tries to understand what happens when AE is not used as prescribed by its designers. A question given relatively little attention is whether an AE scheme guarantees "key commitment": ciphertext should only decrypt to a valid plaintext under the key used to generate the ciphertext. Generally, AE schemes do not guarantee key commitment as it is not part of AE's design goal. Nevertheless, one would not expect this seemingly obscure property to have much impact on the security of actual products. In reality, however, products do rely on key commitment. We discuss three recent applications where missing key commitment is exploitable in practice. We provide proof-of-concept attacks via a tool that constructs AES-GCM ciphertext which can be decrypted to two plaintexts valid under a wide variety of file formats, such as PDF, Windows executables, and DICOM. Finally we discuss two solutions to add key commitment to AE schemes which have not been analyzed in the literature: a generic approach that adds an explicit key commitment scheme to the AE scheme, and a simple fix which works for AE schemes like AES-GCM and ChaCha20Poly1305, but requires separate analysis for each scheme.
How Long Do Vulnerabilities Live in the Code? A Large-Scale Empirical Measurement Study on FOSS Vulnerability Lifetimes
Nikolaos Alexopoulos, Manuel Brack, Jan Philipp Wagner, Tim Grube, and Max Mühlhäuser, Technical University of Darmstadt
How long do vulnerabilities live in the repositories of large, evolving projects? Although the question has been identified as an interesting problem by the software community in online forums, it has not been investigated yet in adequate depth and scale, since the process of identifying the exact point in time when a vulnerability was introduced is particularly cumbersome. In this paper, we provide an automatic approach for accurately estimating how long vulnerabilities remain in the code (their lifetimes). Our method relies on the observation that while it is difficult to pinpoint the exact point of introduction for one vulnerability, it is possible to accurately estimate the average lifetime of a large enough sample of vulnerabilities, via a heuristic approach.
With our approach, we perform the first large-scale measurement of Free and Open Source Software vulnerability lifetimes, going beyond approaches estimating lower bounds prevalent in previous research. We find that the average lifetime of a vulnerability is around 4 years, varying significantly between projects (~2 years for Chromium, ~7 years for OpenSSL). The distribution of lifetimes can be approximately described by an exponential distribution. There are no statistically significant differences between the lifetimes of different vulnerability types when considering specific projects. Vulnerabilities are getting older, as the average lifetime of fixed vulnerabilities in a given year increases over time, influenced by the overall increase of code age. However, they live less than non-vulnerable code, with an increasing spread over time for some projects, suggesting a notion of maturity that can be considered an indicator of quality. While the introduction of fuzzers does not significantly reduce the lifetimes of memory-related vulnerabilities, further research is needed to better understand and quantify the impact of fuzzers and other tools on vulnerability lifetimes and on the security of codebases.
When Sally Met Trackers: Web Tracking From the Users' Perspective
Savino Dambra, EURECOM and Norton Research Group; Iskander Sanchez-Rola and Leyla Bilge, Norton Research Group; Davide Balzarotti, EURECOM
Web tracking has evolved to become a norm on the Internet. As a matter of fact, the web tracking market has grown to raise billions of dollars. Privacy cautious web practitioners and researchers extensively studied the phenomenon proving how widespread this practice is, and providing effective solutions to give users the option of feeling private while freely surfing the web. However, because all those studies looked at this trend only from the trackers' perspective, still there are a lot of unknowns regarding what the real impact of tracking is on real users. Our goal with this paper is to fill this gap in the web tracking topic. Thanks to logs of web browsing telemetry, we were able to look at this trend from the users' eyes. Precisely, we measure how fast a user encounters trackers and research on options to reduce her privacy risk. Moreover, we also estimate the fraction of browsing histories that are known by trackers and discuss two tracking strategies to increase the existing knowledge about users.
Regulator: Dynamic Analysis to Detect ReDoS
Robert McLaughlin, Fabio Pagani, Noah Spahn, Christopher Kruegel, and Giovanni Vigna, University of California, Santa Barbara
Regular expressions (regexps) are a convenient way for programmers to express complex string searching logic. Several popular programming languages expose an interface to a regexp matching subsystem, either by language-level primitives or through standard libraries. The implementations behind these matching systems vary greatly in their capabilities and running-time characteristics. In particular, backtracking matchers may exhibit worst-case running-time that is either linear, polynomial, or exponential in the length of the string being searched. Such super-linear worst-case regexps expose applications to Regular Expression Denial-of-Service (ReDoS) when inputs can be controlled by an adversarial attacker.
In this work, we investigate the impact of ReDoS in backtracking engines, a popular type of engine used by most programming languages. We evaluate several existing tools against a dataset of broadly collected regexps, and find that despite extensive theoretical work in this field, none are able to achieve both high precision and high recall. To address this gap in existing work, we develop Regulator, a novel dynamic, fuzzer-based analysis system for identifying regexps vulnerable to ReDoS. We implement this system by directly instrumenting a popular backtracking regexp engine, which increases the scope of supported regexp syntax and features over prior work. Finally, we evaluate this system against three common regexp datasets, and demonstrate a seven-fold increase in true positives discovered when comparing against existing tools.
Incremental Offline/Online PIR
Yiping Ma and Ke Zhong, University of Pennsylvania; Tal Rabin, University of Pennsylvania and Algorand Foundation; Sebastian Angel, University of Pennsylvania and Microsoft Research
Recent private information retrieval (PIR) schemes preprocess the database with a query-independent offline phase in order to achieve sublinear computation during a query-specific online phase. These offline/online protocols expand the set of applications that can profitably use PIR, but they make a critical assumption: that the database is immutable. In the presence of changes such as additions, deletions, or updates, existing schemes must preprocess the database from scratch, wasting prior effort. To address this, this paper introduces incremental preprocessing for offline/online PIR schemes, allowing the original preprocessing to continue to be used after database changes, while paying an update cost proportional to the number of changes rather than linear in the size of the database. We adapt two offline/online PIR schemes to use incremental preprocessing and show that our approach significantly improves throughput and reduces the latency of applications where the database changes over time.
Dos and Don'ts of Machine Learning in Computer Security
Daniel Arp, Technische Universität Berlin; Erwin Quiring, Technische Universität Braunschweig; Feargus Pendlebury, King's College London and Royal Holloway, University of London and The Alan Turing Institute; Alexander Warnecke, Technische Universität Braunschweig; Fabio Pierazzi, King's College London; Christian Wressnegger, KASTEL Security Research Labs and Karlsruhe Institute of Technology; Lorenzo Cavallaro, University College London; Konrad Rieck, Technische Universität Braunschweig
Distinguished Paper Award Winner
With the growing processing power of computing systems and the increasing availability of massive datasets, machine learning algorithms have led to major breakthroughs in many different areas. This development has influenced computer security, spawning a series of work on learning-based security systems, such as for malware detection, vulnerability discovery, and binary code analysis. Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance and render learning-based systems potentially unsuitable for security tasks and practical deployment.
In this paper, we look at this problem with critical eyes. First, we identify common pitfalls in the design, implementation, and evaluation of learning-based security systems. We conduct a study of 30 papers from top-tier security conferences within the past 10 years, confirming that these pitfalls are widespread in the current security literature. In an empirical analysis, we further demonstrate how individual pitfalls can lead to unrealistic performance and interpretations, obstructing the understanding of the security problem at hand. As a remedy, we propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible. Furthermore, we identify open problems when applying machine learning in security and provide directions for further research.
Expected Exploitability: Predicting the Development of Functional Vulnerability Exploits
Octavian Suciu, University of Maryland, College Park; Connor Nelson, Zhuoer Lyu, and Tiffany Bao, Arizona State University; Tudor Dumitraș, University of Maryland, College Park
Assessing the exploitability of software vulnerabilities at the time of disclosure is difficult and error-prone, as features extracted via technical analysis by existing metrics are poor predictors for exploit development. Moreover, exploitability assessments suffer from a class bias because "not exploitable" labels could be inaccurate.
To overcome these challenges, we propose a new metric, called Expected Exploitability (EE), which reflects, over time, the likelihood that functional exploits will be developed. Key to our solution is a time-varying view of exploitability, a departure from existing metrics. This allows us to learn EE using data-driven techniques from artifacts published after disclosure, such as technical write-ups and proof-of-concept exploits, for which we design novel feature sets.
This view also allows us to investigate the effect of the label biases on the classifiers. We characterize the noise-generating process for exploit prediction, showing that our problem is subject to the most challenging type of label noise, and propose techniques to learn EE in the presence of noise.
On a dataset of 103,137 vulnerabilities, we show that EE increases precision from 49% to 86% over existing metrics, including two state-of-the-art exploit classifiers, while its precision substantially improves over time. We also highlight the practical utility of EE for predicting imminent exploits and prioritizing critical vulnerabilities.
We develop EE into an online platform which is publicly available at https://exploitability.app/.
ProFactory: Improving IoT Security via Formalized Protocol Customization
Fei Wang, Jianliang Wu, and Yuhong Nan, Purdue University; Yousra Aafer, University of Waterloo; Xiangyu Zhang and Dongyan Xu, Purdue University; Mathias Payer, EPFL
As IoT applications gain widespread adoption, it becomes important to design and implement IoT protocols with security. Existing research in protocol security reveals that the majority of disclosed protocol vulnerabilities are caused by incorrectly implemented message parsing and network state machines. Instead of testing and fixing those bugs after development, which is extremely expensive, we would like to avert them upfront. For this purpose, we propose ProFactory which formally and unambiguously models a protocol, checks model correctness, and generates a secure protocol implementation. We leverage ProFactory to generate a group of IoT protocols in the Bluetooth and Zigbee families and the evaluation demonstrates that 82 known vulnerabilities are averted. ProFactory will be publicly available.
Empirical Understanding of Deletion Privacy: Experiences, Expectations, and Measures
Mohsen Minaei, Purdue University; Mainack Mondal, IIT Kharagpur; Aniket Kate, Purdue University
In recent years, social platforms are heavily used by individuals to share their thoughts and personal information. However, due to regret over time about posting inappropriate social content, embarrassment, or even life or relationship changes, some past posts might also pose serious privacy concerns for them. To cope with these privacy concerns, social platforms offer deletion mechanisms that allow users to remove their contents. Quite naturally, these deletion mechanisms are really useful for removing past posts as and when needed. However, these same mechanisms also leave the users potentially vulnerable to attacks by adversaries who specifically seek the users' damaging content and exploit the act of deletion as a strong signal for identifying such content. Unfortunately, today user experiences and contextual expectations regarding such attacks on deletion privacy and deletion privacy in general are not well understood.
To that end, in this paper, we conduct a user survey-based exploration involving 191 participants to unpack their prior deletion experiences, their expectations of deletion privacy, and how effective they find the current deletion mechanisms. We find that more than 80% of the users have deleted at least a social media post, and users self-reported that, on average, around 35% of their deletions happened after a week of posting. While the participants identified the irrelevancy (due to time passing) as the main reason for content removal, most of them believed that deletions indicate that the deleted content includes some damaging information to the owner. Importantly, the participants are significantly more concerned about their deletions being noticed by large-scale data collectors (e.g., a third-party data collecting company or the government) than individuals from their social circle. Finally, the participants felt that popular deletion mechanisms, although very useful to help remove the content in multiple scenarios, are not very effective in protecting the privacy of those deletions. Consequently, they identify design guidelines for improving future deletion mechanisms.
Hiding in Plain Sight? On the Efficacy of Power Side Channel-Based Control Flow Monitoring
Yi Han, Matthew Chan, and Zahra Aref, Rutgers University; Nils Ole Tippenhauer, CISPA Helmholtz Center for Information Security; Saman Zonouz, Georgia Tech
Physical side-channel monitoring leverages the physical phenomena produced by a microcontroller (e.g. power consumption or electromagnetic radiation) to monitor program execution for malicious behavior. As such, it offers a promising intrusion detection solution for resource-constrained embedded systems, which are incompatible with conventional security measures. This method is especially relevant in safety and security-critical embedded systems such as in industrial control systems. Side-channel monitoring poses unique challenges for would-be attackers, such as (1) limiting attack vectors by being physically isolated from the monitored system, (2) monitoring immutable physical side channels with uninterpretable data-driven models, and (3) being specifically trained for the architectures and programs on which they are applied to. As a result, physical side-channel monitors are conventionally believed to provide a high level of security.
In this paper, we propose a novel attack to illustrate that, despite the many barriers to attack that side-channel monitoring systems create, they are still vulnerable to adversarial attacks. We present a method for crafting functional malware such that, when injected into a side-channel-monitored system, the detector is not triggered. Our experiments reveal that this attack is robust across detector models and hardware implementations. We evaluate our attack on the popular ARMmicrocontroller platform on several representative programs, demonstrating the feasibility of such an attack and highlighting the need for further research into side-channel monitors.
FUGIO: Automatic Exploit Generation for PHP Object Injection Vulnerabilities
Sunnyeo Park and Daejun Kim, KAIST; Suman Jana, Columbia University; Sooel Son, KAIST
A PHP object injection (POI) vulnerability is a security-critical bug that allows the remote code execution of class methods existing in a vulnerable PHP application. Exploiting this vulnerability often requires sophisticated property-oriented programming to shape an injection object. Existing off-the-shelf tools focus only on identifying potential POI vulnerabilities without confirming the presence of any exploit objects. To this end, we propose FUGIO, the first automatic exploit generation (AEG) tool for POI vulnerabilities. FUGIO conducts coarse-grained static and dynamic program analyses to generate a list of gadget chains that serve as blueprints for exploit objects. FUGIO then runs fuzzing campaigns using these identified chains and produces exploit objects. FUGIO generated 68 exploit objects from 30 applications containing known POI vulnerabilities with zero false positives. FUGIO also found two previously unreported POI vulnerabilities with five exploits, demonstrating its efficacy in generating functional exploits.
SAID: State-aware Defense Against Injection Attacks on In-vehicle Network
Lei Xue, The Hong Kong Polytechnic University Shenzhen Research Institute; Yangyang Liu, Tianqi Li, Kaifa Zhao, Jianfeng Li, Le Yu, and Xiapu Luo, The Hong Kong Polytechnic University; Yajin Zhou, Zhejiang University; Guofei Gu, Texas A&M University
Modern vehicles are equipped with many ECUs (Electronic Control Unit) that are connected to the IVN (In-Vehicle Network) for controlling the vehicles. Meanwhile, various interfaces of vehicles, such as OBD-II port, T-Box, sensors, and telematics, implement the interaction between the IVN and external environment. Although rich value-added functionalities can be provided through these interfaces, such as diagnostics and OTA (Over The Air) updates, the adversary may also inject malicious data into IVN, thus causing severe safety issues. Even worse, existing defense approaches mainly focus on detecting the injection attacks launched from IVN, such as malicious/compromised ECUs, by analyzing CAN frames, and cannot defend against the higher layer MIAs (Message Injection Attacks) that can cause abnormal vehicle dynamics. In this paper, we propose a new state-aware abnormal message injection attack defense approach, named SAID. It detects the abnormal data to be injected into IVN by considering the data semantics and the vehicle dynamics and prevents the MIAs launched from devices connected to the vehicles, such as the compromised diagnostic tools and T-boxes. We develop a prototype of SAID for defending against MIAs and evaluate it using both real road data and simulation data. The experimental results show that SAID can defend against more than 99% of the network and service layer attack traffic and all state layer MIAs, effectively enforcing the safety of vehicles.
A Large-scale Investigation into Geodifferences in Mobile Apps
Renuka Kumar, Apurva Virkud, Ram Sundara Raman, Atul Prakash, and Roya Ensafi, University of Michigan
Recent studies on the web ecosystem have been raising alarms on the increasing geodifferences in access to Internet content and services due to Internet censorship and geoblocking. However, geodifferences in the mobile app ecosystem have received limited attention, even though apps are central to how mobile users communicate and consume Internet content. We present the first large-scale measurement study of geodifferences in the mobile app ecosystem. We design a semi-automatic, parallel measurement testbed that we use to collect 5,684 popular apps from Google Play in 26 countries. In all, we collected 117,233 apk files and 112,607 privacy policies for those apps. Our results show high amounts of geoblocking with 3,672 apps geoblocked in at least one of our countries. While our data corroborates anecdotal evidence of takedowns due to government requests, unlike common perception, we find that blocking by developers is significantly higher than takedowns in all our countries, and has the most influence on geoblocking in the mobile app ecosystem. We also find instances of developers releasing different app versions to different countries, some with weaker security settings or privacy disclosures that expose users to higher security and privacy risks. We provide recommendations for app market proprietors to address the issues discovered.
Ferry: State-Aware Symbolic Execution for Exploring State-Dependent Program Paths
Shunfan Zhou, Zhemin Yang, and Dan Qiao, Fudan University; Peng Liu, The Pennsylvania State University; Min Yang, Fudan University; Zhe Wang and Chenggang Wu, State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences
Symbolic execution and fuzz testing are effective approaches for program analysis, thanks to their evolving path exploration approaches. The state-of-the-art symbolic execution and fuzzing techniques are able to generate valid program inputs to satisfy the conditional statements. However, they have very limited ability to explore program-state-dependent branches (state-dependent branches in this paper) which depend on earlier program execution instead of the current program inputs.
This paper is the first attempt to thoroughly explore the state-dependent branches in real-world programs. We introduce program-state-aware symbolic execution, a novel technique that guides symbolic execution engines to efficiently explore the state-dependent branches. As we show in this paper, state-dependent branches are prevalent in many important programs because they implement state machines to fulfill their application logic. Symbolically executing arbitrary programs with state-dependent branches is difficult, since there is a lack of unified specifications for their state machine implementation. Faced with this challenging problem, this paper recognizes widely-existing data dependency between current program states and previous inputs in a class of important programs. Our deep insights into these programs help us take a successful first step on this task. We design and implement a tool Ferry, which efficiently guides symbolic execution engine by automatically recognizing program states and exploring state-dependent branches. By applying Ferry to 13 different real-world programs and the comprehensive dataset Google FuzzBench, Ferry achieves higher block and branch coverage than two state-of-the-art symbolic execution engines and three popular fuzzers due to its ability to explore deep program logics, and manages to locate three 0-day vulnerabilities in jhead. Further, we show that Ferry is able to reach more program-state-dependent vulnerabilities than existing symbolic executors and fuzzing approaches with 15 collected state-dependent vulnerabilities and a test suite of six prominent programs. In the end, we test Ferry on LAVA-M dataset to understand its strengths and limitations.
Polynomial Commitment with a One-to-Many Prover and Applications
Jiaheng Zhang and Tiancheng Xie, UC Berkeley; Thang Hoang, Virginia Tech; Elaine Shi, CMU; Yupeng Zhang, Texas A&M University
Verifiable Secret Sharing (VSS) is a foundational cryptographic primitive that serves as an essential building block in multi-party computation and decentralized blockchain applications. One of the most practical ways to construct VSS is through a polynomial commitment, where the dealer commits to a random polynomial whose 0-th coefficient encodes the secret to be shared, and proves the evaluation of the committed polynomial at a different point to each of N verifiers, i.e., the polynomial commitment is used in a "one-to-many" fashion.
The recent work of Tomescu et al. (IEEE S&P 2020) was the first to consider polynomial commitment with "one-to-many prover batching", such that the prover can prove evaluations at N different points at the cost of Oe(1) proofs. However, their scheme is not optimal and requires a trusted setup.
In this paper, we asymptotically improve polynomial commitment with one-to-many prover batching. We propose two novel schemes. First, we propose a scheme with optimal asymptotics in all dimensions in the trusted setup setting. Second, we are the first to consider one-to-many prover batching for transparent polynomial commitments, and we propose a transparent scheme whose performance approximately matches the best-known scheme in the trusted setup setting.
We implement our schemes and evaluate their performance. Our scheme in the trusted setup setting improves the proof size by 20× and the verifier time by 7.8× for 2 21 parties, with a small overhead on the prover time. Our transparent polynomial commitment removes the trusted setup and further improves the prover time by 2.3×.
SGXLock: Towards Efficiently Establishing Mutual Distrust Between Host Application and Enclave for SGX
Yuan Chen, Jiaqi Li, Guorui Xu, and Yajin Zhou, Zhejiang University; Zhi Wang, Florida State University; Cong Wang, City University of Hong Kong; Kui Ren, Zhejiang University
Since its debut, SGX has been used to secure various types of applications. However, existing systems usually assume a trusted enclave and ignore the security issues caused by an untrusted enclave. For instance, a vulnerable (or even malicious) third-party enclave can be exploited to attack the host application and the rest of the system. In this paper, we propose an efficient mechanism to confine an untrusted enclave's behaviors. In particular, the threats of an untrusted enclave come from the enclave-host asymmetries, which can be abused to access arbitrary memory regions of its host application, jump to any code location after leaving the enclave and forge the stack register to manipulate the saved context. Our solution breaks such asymmetries and establishes mutual distrust between the host application and the enclave. Specifically, it leverages Intel MPK for efficient memory isolation and the x86 single-step debugging mechanism to capture the exiting event of the enclave. Then it performs the integrity check of the jump target and the stack pointer. We have implemented a prototype system and solved two practical challenges. The evaluation with multiple micro-benchmarks and representative real-world applications demonstrated the effectiveness and the efficiency of our system, with less than 4% performance overhead.
Omnes pro uno: Practical Multi-Writer Encrypted Database
Jiafan Wang and Sherman S. M. Chow, The Chinese University of Hong Kong
Multi-writer encrypted databases allow a reader to search over data contributed by multiple writers securely. Public-key searchable encryption (PKSE) appears to be the right primitive. However, its search latency is not welcomed in practice for having public-key operations linear in the entire database. In contrast, symmetric searchable encryption (SSE) realizes sublinear search, but it is inherently not multi-writer.
This paper aims for the best of both SSE and PKSE, i.e., sublinear search and multiple writers, by formalizing hybrid searchable encryption (HSE), with some seemingly conflicting yet desirable features, requiring new insights to achieve.
Our first contribution is a history-based security definition with new flavors of leakage concerning updates and writer corruptions, which are absent in the only known multi-writer notion of PKSE since it is vacuously secure against writers. HSE, built on top of dynamic SSE (DSSE), should satisfy the de facto standard of forward privacy. Its multi-writer support, again, makes the known approach (of secret state maintenance) fails. HSE should also feature efficient controllable search – each search can be confined to a different writer subset, while the search token size remains constant. For these, we devise a new partial rebuild technique and two new building blocks (of independent interests) – ID-coupling key-aggregate encryption and (optimal) epoch-based forward-private DSSE.
Our evaluation over real-world datasets shows that HSE, surpassing prior arts by orders of magnitude, is concretely efficient for popular multi-writer database applications.
Secure Poisson Regression
Mahimna Kelkar, Cornell Tech; Phi Hung Le, Mariana Raykova, and Karn Seth, Google
We introduce the first construction for secure two-party computation of Poisson regression, which enables two parties who hold shares of the input samples to learn only the resulting Poisson model while protecting the privacy of the inputs.
Our construction relies on new protocols for secure fixed-point exponentiation and correlated matrix multiplications. Our secure exponentiation construction avoids expensive bit decomposition and achieves orders of magnitude improvement in both online and offline costs over state of the art works. As a result, the dominant cost for our secure Poisson regression are matrix multiplications with one fixed matrix. We introduce a new technique, called correlated Beaver triples, which enables many such multiplications at the cost of roughly one matrix multiplication. This further brings down the cost of secure Poisson regression.
We implement our constructions and show their extreme efficiency. In a LAN setting, our secure exponentiation for 20-bit fractional precision takes less than 0.07ms with a batch-size of 100,000. One iteration of secure Poisson regression on a dataset with 10,000 samples with 1000 binary features needs about 65.82s in the offline phase, 55.14s in the online phase and 17MB total communication. For several real datasets this translates into training that takes seconds and only a couple of MB communication.
Watching the Watchers: Practical Video Identification Attack in LTE Networks
Sangwook Bae, Mincheol Son, Dongkwan Kim, CheolJun Park, Jiho Lee, Sooel Son, and Yongdae Kim, Korea Advanced Institute of Science and Technology (KAIST)
A video identification attack is a tangible privacy threat that can reveal videos that victims are watching. In this paper, we present the first study of a video identification attack in Long Term Evolution (LTE) networks. We discovered that, by leveraging broadcast radio signals, an unprivileged adversary equipped with a software-defined radio can 1) identify mobile users who are watching target videos of the adversary's interest and then 2) infer the video title that each of these users is watching. Using 46,810 LTE traces of three video streaming services from three cellular operators, we demonstrate that our attack achieves an accuracy of up to 0.985. We emphasize that this high level of accuracy stems from overcoming the unique challenges related to the operational logic of LTE networks and video streaming systems. Finally, we present an end-to-end attack scenario leveraging the presented video identification attack and propose countermeasures that are readily applicable to current LTE networks.
Automated Side Channel Analysis of Media Software with Manifold Learning
Yuanyuan Yuan, Qi Pang, and Shuai Wang, The Hong Kong University of Science and Technology
The prosperous development of cloud computing and machine learning as a service has led to the widespread use of media software to process confidential media data. This paper explores an adversary's ability to launch side channel analyses (SCA) against media software to reconstruct confidential media inputs. Recent advances in representation learning and perceptual learning inspired us to consider the reconstruction of media inputs from side channel traces as a cross-modality manifold learning task that can be addressed in a unified manner with an autoencoder framework trained to learn the mapping between media inputs and side channel observations. We further enhance the autoencoder with attention to localize the program points that make the primary contribution to SCA, thus automatically pinpointing information-leakage points in media software. We also propose a novel and highly effective defensive technique called perception blinding that can perturb media inputs with perception masks and mitigate manifold learning-based SCA.
Our evaluation exploits three popular media software to reconstruct inputs in image, audio, and text formats. We analyze three common side channels — cache bank, cache line, and page tables — and userspace-only cache set accesses logged by standard Prime+Probe. Our framework successfully re-constructs high-quality confidential inputs from the assessed media software and automatically pinpoint their vulnerable program points, many of which are unknown to the public. We further show that perception blinding can mitigate manifold learning-based SCA with negligible extra cost.
FOAP: Fine-Grained Open-World Android App Fingerprinting
Jianfeng Li, Hao Zhou, Shuohan Wu, and Xiapu Luo, The Hong Kong Polytechnic University; Ting Wang, Pennsylvania State University; Xian Zhan, The Hong Kong Polytechnic University; Xiaobo Ma, Xi'an Jiaotong University
Despite the widespread adoption of encrypted communication for mobile apps, adversaries can still identify apps or infer selected user activities of interest from encrypted mobile traffic via app fingerprinting (AF) attacks. However, most existing AF techniques only work under the closed-world assumption, thereby suffering potential precision decline when faced with apps unseen during model training. Moreover, serious privacy leakage often occurs when users conduct some sensitive operations, which are closely associated with specific UI components. Unfortunately, existing AF techniques are too coarse-grained to acquire such fine-grained sensitive information. In this paper, we take the first step to identify method-level fine-grained user action of Android apps in the open-world setting and present a systematic solution, dubbed FOAP, to address the above limitations. First, to effectively reduce false positive risks in the open-world setting, we propose a novel metric, named structural similarity, to adaptively filter out traffic segments irrelevant to the app of interest. Second, FOAP achieves fine-grained user action identification via synthesizing traffic and binary analysis. Specifically, FOAP identifies user actions on specific UI components through inferring entry point methods correlated with them. Extensive evaluations and case studies demonstrate that FOAP is not only reasonably accurate but also practical in fine-grained user activity inference and user privacy analysis.
Behind the Tube: Exploitative Monetization of Content on YouTube
Andrew Chu, University of Chicago; Arjun Arunasalam, Muslum Ozgur Ozmen, and Z. Berkay Celik, Purdue University
The YouTube video sharing platform is a prominent online presence that delivers various genres of content to society today. As the viewership and userbase of the platform grow, both individual users and larger companies have recognized the potential for monetizing this content. While content monetization is a native capability of the YouTube service, a number of requirements are enforced on the platform to prevent its abuse. Yet, methods to circumvent these requirements exist; many of which are potentially harmful to viewers and other users. In this paper, we present the first comprehensive study on exploitative monetization of content on YouTube. To do this, we first create two datasets; one using thousands of user posts from eleven forums whose users discuss monetization on YouTube, and one using listing data from five active sites that facilitate the purchase and sale of YouTube accounts. We then perform both manual and automated analysis to develop a view of illicit monetization exploits used on YouTube by both individual users and larger channel collectives. We discover six distinct exploits used to execute illicit content monetization on YouTube; four used by individual users, and two used by channel collectives. Further, we identify real-world evidence of each exploit on YouTube message board communities and provide insight into how each is executed. Through this, we present a comprehensive view of illicit monetization exploits on the YouTube platform that can motivate future investigation into mitigating these harmful endeavors.
SkillDetective: Automated Policy-Violation Detection of Voice Assistant Applications in the Wild
Jeffrey Young, Song Liao, and Long Cheng, Clemson University; Hongxin Hu, University at Buffalo; Huixing Deng, Clemson University
Today's voice personal assistant (VPA) services have been largely expanded by allowing third-party developers to build voice-apps and publish them to marketplaces (e.g., the Amazon Alexa and Google Assistant platforms). In an effort to thwart unscrupulous developers, VPA platform providers have specified a set of policy requirements to be adhered to by third-party developers, e.g., personal data collection is not allowed for kid-directed voice-apps. In this work, we aim to identify policy-violating voice-apps in current VPA platforms through a comprehensive dynamic analysis of voice-apps. To this end, we design and develop SkillDetective , an interactive testing tool capable of exploring voice-apps' behaviors and identifying policy violations in an automated manner. Distinctive from prior works, SkillDetective evaluates voice-apps' conformity to 52 different policy requirements in a broader context from multiple sources including textual, image and audio files. With SkillDetective , we tested 54,055 Amazon Alexa skills and 5,583 Google Assistant actions, and collected 518,385 textual outputs, approximately 2,070 unique audio files and 31,100 unique images from voice-app interactions. We identified 6,079 skills and 175 actions violating at least one policy requirement.
Hand Me Your PIN! Inferring ATM PINs of Users Typing with a Covered Hand
Matteo Cardaioli, Stefano Cecconello, Mauro Conti, and Simone Milani, University of Padua; Stjepan Picek, Delft University of Technology; Eugen Saraci, University of Padua
Automated Teller Machines (ATMs) represent the most used system for withdrawing cash. The European Central Bank reported more than 11 billion cash withdrawals and loading/unloading transactions on the European ATMs in 2019. Although ATMs have undergone various technological evolutions, Personal Identification Numbers (PINs) are still the most common authentication method for these devices. Unfortunately, the PIN mechanism is vulnerable to shoulder-surfing attacks performed via hidden cameras installed near the ATM to catch the PIN pad. To overcome this problem, people get used to covering the typing hand with the other hand. While such users probably believe this behavior is safe enough to protect against mentioned attacks, there is no clear assessment of this countermeasure in the scientific literature.
This paper proposes a novel attack to reconstruct PINs entered by victims covering the typing hand with the other hand. We consider the setting where the attacker can access an ATM PIN pad of the same brand/model as the target one. Afterward, the attacker uses that model to infer the digits pressed by the victim while entering the PIN. Our attack owes its success to a carefully selected deep learning architecture that can infer the PIN from the typing hand position and movements. We run a detailed experimental analysis including 58 users. With our approach, we can guess 30% of the 5-digit PINs within three attempts – the ones usually allowed by ATM before blocking the card. We also conducted a survey with 78 users that managed to reach an accuracy of only 7.92% on average for the same setting. Finally, we evaluate a shielding countermeasure that proved to be rather inefficient unless the whole keypad is shielded.
SyzScope: Revealing High-Risk Security Impacts of Fuzzer-Exposed Bugs in Linux kernel
Xiaochen Zou, Guoren Li, Weiteng Chen, Hang Zhang, and Zhiyun Qian, UC Riverside
Fuzzing has become one of the most effective bug finding approach for software. In recent years, 24*7 continuous fuzzing platforms have emerged to test critical pieces of software, e.g., Linux kernel. Though capable of discovering many bugs and providing reproducers (e.g., proof-of-concepts), a major problem is that they neglect a critical function that should have been built-in, i.e., evaluation of a bug's security impact. It is well-known that the lack of understanding of security impact can lead to delayed bug fixes as well as patch propagation. In this paper, we develop SyzScope, a system that can automatically uncover new "high-risk" impacts given a bug with seemingly "low-risk" impacts. From analyzing over a thousand low-risk bugs on syzbot, SyzScope successfully determined that 183 low-risk bugs (more than 15%) in fact contain high-risk impacts, e.g., control flow hijack and arbitrary memory write, some of which still do not have patches available yet.
Label Inference Attacks Against Vertical Federated Learning
Chong Fu, Zhejiang University; Xuhong Zhang and Shouling Ji, Binjiang Institute of Zhejiang University; Jinyin Chen, Zhejiang University of Technology; Jingzheng Wu, Institute of Software, Chinese Academy of Sciences; Shanqing Guo, Shandong University; Jun Zhou and Alex X. Liu, Ant Group; Ting Wang, Pennsylvania State University
As the initial variant of federated learning (FL), horizontal federated learning (HFL) applies to the situations where datasets share the same feature space but differ in the sample space, e.g., the collaboration between two regional banks, while trending vertical federated learning (VFL) deals with the cases where datasets share the same sample space but differ in the feature space, e.g., the collaboration between a bank and an e-commerce platform.
Although various attacks have been proposed to evaluate the privacy risks of HFL, yet, few studies, if not none, have explored that for VFL. Considering that the typical application scenario of VFL is that a few participants (usually two) collaboratively train a machine learning (ML) model with features distributed among them but labels owned by only one of them, protecting the privacy of the labels owned by one participant should be a fundamental guarantee provided by VFL, as the labels might be highly sensitive, e.g., whether a person has a certain kind of disease. However, we discover that the bottom model structure and the gradient update mechanism of VFL can be exploited by a malicious participant to gain the power to infer the privately owned labels. Worse still, by abusing the bottom model, he/she can even infer labels beyond the training dataset. Based on our findings, we propose a set of novel label inference attacks against VFL. Our experiments show that the proposed attacks achieve an outstanding performance. We further share our insights and discuss possible defenses. Our research can shed light on the hidden privacy risks of VFL and pave the way for new research directions towards more secure VFL.
Under the Hood of DANE Mismanagement in SMTP
Hyeonmin Lee, Seoul National University; Md. Ishtiaq Ashiq, Virginia Tech; Moritz Müller, SIDN Labs; Roland van Rijswijk-Deij, University of Twente & NLnet Labs; Taekyoung "Ted" Kwon, Seoul National University; Taejoong Chung, Virginia Tech
The DNS-based Authentication of Named Entities (DANE) is an Internet security protocol that enables a TLS connection without relying on trusted third parties like CAs by introducing a new DNS record type, TLSA. DANE leverages DNSSEC PKI to provide the integrity and authenticity of TLSA records. As DANE can solve security challenges in SMTP, such as STARTTLS downgrade attacks and receiver authentication, it has been increasingly deployed surpassing more than 1 M domains with SMTP servers that have TLSA records. A recent study, however, reported that there are prevalent misconfigurations on DANE SMTP servers, which hinders DANE from being proliferated.
In this paper, we investigate the reasons why it is hard to deploy and manage DANE correctly. Our study uses largescale, longitudinal measurements to study DANE adoption and management, coupled with a survey of DANE operators, some of which serve more than 100 K domains. Overall, we find that keeping the TLSA records from a name server and certificates from an SMTP server synchronized is not straightforward even when the same entity manages the two servers. Furthermore, many of the certificates are configured to be reissued automatically, which may result in invalid TLSA records. From surveying 39 mail server operators, we also learn that the majority keeps using CA-issued certificates, despite this no longer being required with DANE, since they are worried about their certificates not being trusted by clients that have not deployed DANE. Having identified several operational challenges for correct DANE management, we release automated tools and shed light on unsolved challenges.
Lend Me Your Ear: Passive Remote Physical Side Channels on PCs
Daniel Genkin, Georgia Tech; Noam Nissan, Tel Aviv University; Roei Schuster, Tel Aviv University and Cornell Tech; Eran Tromer, Tel Aviv University and Columbia University
We show that built-in sensors in commodity PCs, such as microphones, inadvertently capture electromagnetic side-channel leakage from ongoing computation. Moreover, this information is often conveyed by supposedly-benign channels such as audio recordings and common Voice-over-IP applications, even after lossy compression.
Thus, we show, it is possible to conduct physical side-channel attacks on computation by remote and purely passive analysis of commonly-shared channels. These attacks require neither physical proximity (which could be mitigated by distance and shielding), nor the ability to run code on the target or configure its hardware. Consequently, we argue, physical side channels on PCs can no longer be excluded from remote-attack threat models.
We analyze the computation-dependent leakage captured by internal microphones, and empirically demonstrate its efficacy for attacks. In one scenario, an attacker steals the secret ECDSA signing keys of the counterparty in a voice call. In another, the attacker detects what web page their counterparty is loading. In the third scenario, a player in the Counter-Strike online multiplayer game can detect a hidden opponent waiting in ambush, by analyzing how the 3D rendering done by the opponent's computer induces faint but detectable signals into the opponent's audio feed.
99% False Positives: A Qualitative Study of SOC Analysts' Perspectives on Security Alarms
Bushra A. Alahmadi, Louise Axon, and Ivan Martinovic, University of Oxford
In this work, we focus on the prevalence of False Positive (FP) alarms produced by security tools, and Security Operation Centers (SOCs) practitioners' perception of their quality. In an online survey we conducted with security practitioners (n = 20) working in SOCs, practitioners confirmed the high FP rates of the tools used, requiring manual validation. With these findings in mind, we conducted a broader, discovery-orientated, qualitative investigation with security practitioners (n = 21) of the limitations of security tools, particularly their alarms' quality and validity. Our results highlight that, despite the perceived volume of FPs, most are attributed to benign triggers---true alarms, explained by legitimate behavior in the organization's environment, which analysts may choose to ignore. To properly evaluate security tools' adequacy and performance, it is critical that vendors and researchers are able make such distinctions between types of FP. Alarm validation is a tedious task that can cause alarm burnout and eventually desensitization. Therefore, we investigated the process of alarm validation in SOCs, identifying factors that may influence the outcome of this process. To improve security alarm quality, we elicit five properties (Reliable, Explainable, Analytical, Contextual, Transferable) required to foster effective and quick validation of alarms. Incorporating these requirements in future tools will not only reduce alarm burnout but improve SOC analysts' decision-making process by generating interpretable and meaningful alarms that enable prompt reaction.
Fuzzware: Using Precise MMIO Modeling for Effective Firmware Fuzzing
Tobias Scharnowski, Nils Bars, and Moritz Schloegel, Ruhr-Universität Bochum; Eric Gustafson, UC Santa Barbara; Marius Muench, Vrije Universiteit Amsterdam; Giovanni Vigna, UC Santa Barbara and VMware; Christopher Kruegel, UC Santa Barbara; Thorsten Holz and Ali Abbasi, Ruhr-Universität Bochum
Distinguished Artifact Award Winner
As embedded devices are becoming more pervasive in our everyday lives, they turn into an attractive target for adversaries. Despite their high value and large attack surface, applying automated testing techniques such as fuzzing is not straightforward for such devices. As fuzz testing firmware on constrained embedded devices is inefficient, state-of-the-art approaches instead opt to run the firmware in an emulator (through a process called re-hosting). However, existing approaches either use coarse-grained static models of hardware behavior or require manual effort to re-host the firmware.
We propose a novel combination of lightweight program analysis, re-hosting, and fuzz testing to tackle these challenges. We present the design and implementation of Fuzzware, a software-only system to fuzz test unmodified monolithic firmware in a scalable way. By determining how hardware-generated values are actually used by the firmware logic, Fuzzware can automatically generate models that help focusing the fuzzing process on mutating the inputs that matter, which drastically improves its effectiveness.
We evaluate our approach on synthetic and real-world targets comprising a total of 19 hardware platforms and 77 firmware images. Compared to state-of-the-art work, Fuzzware achieves up to 3.25 times the code coverage and our modeling approach reduces the size of the input space by up to 95.5%. The synthetic samples contain 66 unit tests for various hardware interactions, and we find that our approach is the first generic re-hosting solution to automatically pass all of them. Fuzzware discovered 15 completely new bugs including bugs in targets which were previously analyzed by other works; a total of 12 CVEs were assigned.
SIMC: ML Inference Secure Against Malicious Clients at Semi-Honest Cost
Nishanth Chandran, Divya Gupta, and Sai Lakshmi Bhavana Obbattu, Microsoft Research; Akash Shah, UCLA
Secure inference allows a model owner (or, the server) and the input owner (or, the client) to perform inference on machine learning model without revealing their private information to each other. A large body of work has shown efficient cryptographic solutions to this problem through secure 2- party computation. However, they assume that both parties are semi-honest, i.e., follow the protocol specification. Recently, Lehmkuhl et al. showed that malicious clients can extract the whole model of the server using novel model-extraction attacks. To remedy the situation, they introduced the client-malicious threat model and built a secure inference system, MUSE, that provides security guarantees, even when the client is malicious.
In this work, we design and build SIMC, a new cryptographic system for secure inference in the client malicious threat model. On secure inference benchmarks considered by MUSE, SIMC has 23 − 29× lesser communication and is up to 11.4× faster than MUSE. SIMC obtains these improvements using a novel protocol for non-linear activation functions (such as ReLU) that has > 28× lesser communication and is up to 43× more performant than MUSE. In fact, SIMC's performance beats the state-of-the-art semi-honest secure inference system!
Finally, similar to MUSE, we show how to push the majority of the cryptographic cost of SIMC to an input independent preprocessing phase. While the cost of the online phase of this protocol, SIMC++, is same as that of MUSE, the overall improvements of SIMC translate to similar improvements to the preprocessing phase of MUSE.
HyperDegrade: From GHz to MHz Effective CPU Frequencies
Alejandro Cabrera Aldaya and Billy Bob Brumley, Tampere University
Performance degradation techniques are an important complement to side-channel attacks. In this work, we propose HYPERDEGRADE—a combination of previous approaches and the use of simultaneous multithreading (SMT) architectures. In addition to the new technique, we investigate the root causes of performance degradation using cache eviction, discovering a previously unknown slowdown origin. The slowdown produced is significantly higher than previous approaches, which translates into an increased time granularity for FLUSH+RELOAD attacks. We evaluate HYPERDEGRADE on different Intel microarchitectures, yielding significant slowdowns that achieve, in select microbenchmark cases, three orders of magnitude improvement over state-of-the-art. To evaluate the efficacy of performance degradation in side-channel amplification, we propose and evaluate leakage assessment metrics. The results evidence that HYPERDEGRADE increases time granularity without a meaningful impact on trace quality. Additionally, we designed a fair experiment that compares three performance degradation strategies when coupled with FLUSH+RELOAD from an attacker perspective. We developed an attack on an unexploited vulnerability in OpenSSL in which HYPERDEGRADE excels—reducing by three times the number of required FLUSH+RELOAD traces to succeed. Regarding cryptography contributions, we revisit the recently proposed Raccoon attack on TLS-DH key exchanges, demonstrating its application to other protocols. Using HYPERDEGRADE, we developed an end-to-end attack that shows how a Raccoon-like attack can succeed with real data, filling a missing gap from previous research.
DoLTEst: In-depth Downlink Negative Testing Framework for LTE Devices
CheolJun Park, Sangwook Bae, BeomSeok Oh, Jiho Lee, Eunkyu Lee, Insu Yun, and Yongdae Kim, Korea Advanced Institute of Science and Technology (KAIST)
An implementation flaw in LTE control plane protocols at end-user devices directly leads to severe security threats. In order to uncover these flaws, conducting negative testing is a promising approach, whose test case only contains invalid or prohibited messages. Despite its importance, the cellular standard mostly focuses on positive test cases, producing many implementation vulnerabilities unchecked, as evidenced by many existing vulnerabilities. To fill this gap, we present DOLTEST, a negative testing framework, which can comprehensively test an end-user device. Enumerable test cases with a deterministic oracle produced from detailed specification analysis make it suitable to be used as a standard to find implementation vulnerabilities. We uncovered 26 implementation flaws from 43 devices from 5 different baseband manufacturers by using DOLTEST, demonstrating its effectiveness.
GhostTouch: Targeted Attacks on Touchscreens without Physical Touch
Kai Wang, Zhejiang University; Richard Mitev, Technical University of Darmstadt; Chen Yan and Xiaoyu Ji, Zhejiang University; Ahmad-Reza Sadeghi, Technical University of Darmstadt; Wenyuan Xu, Zhejiang University
Capacitive touchscreens have become the primary human-machine interface for personal devices such as smartphones and tablets. In this paper, we present GhostTouch, the first active contactless attack against capacitive touchscreens. GhostTouch uses electromagnetic interference (EMI) to inject fake touch points into a touchscreen without the need to physically touch it. By tuning the parameters of the electromagnetic signal and adjusting the antenna, we can inject two types of basic touch events, taps and swipes, into targeted locations of the touchscreen and control them to manipulate the underlying device. We successfully launch the GhostTouch attacks on nine smartphone models. We can inject targeted taps continuously with a standard deviation of as low as 14.6 x 19.2 pixels from the target area, a delay of less than 0.5s and a distance of up to 40mm. We show the real-world impact of the GhostTouch attacks in a few proof-of-concept scenarios, including answering an eavesdropping phone call, pressing the button, swiping up to unlock, and entering a password. Finally, we discuss potential hardware and software countermeasures to mitigate the attack.
RE-Mind: a First Look Inside the Mind of a Reverse Engineer
Alessandro Mantovani and Simone Aonzo, EURECOM; Yanick Fratantonio, Cisco Talos; Davide Balzarotti, EURECOM
When a human activity requires a lot of expertise and very specialized cognitive skills that are poorly understood by the general population, it is often considered `an art.' Different activities in the security domain have fallen in this category, such as exploitation, hacking, and the main focus of this paper: binary reverse engineering (RE).
However, while experts in many areas (ranging from chess players to computer programmers) have been studied by scientists to understand their mental models and capture what is special about their behavior, the `art' of understanding binary code and solving reverse engineering puzzles remains to date a black box.
In this paper, we present a measurement of the different strategies adopted by expert and beginner reverse engineers while approaching the analysis of x86 (dis)assembly code, a typical static RE task. We do that by performing an exploratory analysis of data collected over 16,325 minutes of RE activity of two unknown binaries from 72 participants with different experience levels: 39 novices and 33 experts.
RapidPatch: Firmware Hotpatching for Real-Time Embedded Devices
Yi He and Zhenhua Zou, Tsinghua University and BNRist; Kun Sun, George Mason University; Zhuotao Liu and Ke Xu, Tsinghua University and BNRist; Qian Wang, Wuhan University; Chao Shen, Xi'an Jiaotong University; Zhi Wang, Florida State University; Qi Li, Tsinghua University and BNRist
Nowadays real-time embedded devices are becoming one main target of cyber attacks. A huge number of embedded devices equipped with outdated firmware are subject to various vulnerabilities, but they cannot be timely patched due to two main reasons. First, it is difficult for vendors who have various types of fragmented devices to generate patches for each type of device. Second, it is challenging to deploy patches on many embedded devices without restarting or halting real-time tasks, hindering the patch installation on devices (e.g., industrial control devices) that have high availability requirements. In this paper, we present RapidPatch, a new hotpatching framework to facilitate patch propagation by installing generic patches without disrupting other tasks running on heterogeneous embedded devices. RapidPatch allows RTOS developers to directly release common patches for all downstream devices so that device maintainers can easily generate device-specific patches for different firmware. We utilize eBPF virtual machines to execute patches on resource-constrained embedded devices and develop three hotpatching strategies to support hotpatching for all major microcontroller (MCU) architectures. In particular, we propose two types of eBPF patches for different types of vulnerabilities and develop an eBPF patch verifier to ensure patch safety. We evaluate RapidPatch with major CVEs on four major RTOSes running on different embedded devices. We find that over 90% vulnerabilities can be hotpatched via RapidPatch. Our system can work on devices with 64 KB or more memory and 64 MHz MCU frequency. The average patch delay is less than 8 µs and the overall latency overhead is less than 0.6%.
Towards Automatically Reverse Engineering Vehicle Diagnostic Protocols
Le Yu, Yangyang Liu, Pengfei Jing, Xiapu Luo, Lei Xue, and Kaifa Zhao, The Hong Kong Polytechnic University; Yajin Zhou, Zhejiang University; Ting Wang, The Pennsylvania State University; Guofei Gu, Texas A&M University; Sen Nie and Shi Wu, Tencent Keen Security Lab
In-vehicle protocols are very important to the security assessment and protection of modern vehicles since they are used in communicating with, accessing, and even manipulating ECUs (Electronic Control Units) that control various vehicle components. Unfortunately, the majority of in-vehicle protocols are proprietary without publicly-available documentations. Although recent studies proposed methods to reverse engineer the CAN protocol used in the communication among ECUs, they cannot be applied to vehicle diagnostics protocols, which have been widely exploited by attackers to launch remote attacks. In this paper, we propose a novel framework for automatically reverse engineering the diagnostic protocols by leveraging professional diagnostic tools for vehicles. Specifically, we design and develop a new cyber-physical system that uses a set of algorithms to control a programmable robotics arm with the aid of cameras to automatically trigger and capture the messages of diagnostics protocols as well as reverse engineer their formats, semantic meanings, proprietary formulas used for processing the response messages. We perform a large scale experiment to evaluate our prototype by using 18 real vehicles. It successfully reverses engineers 570 messages (446 for reading sensor values and 124 for controlling components). The experimental results show that our framework achieves high precision in reverse engineering proprietary formulas and obtains much more messages than the prior approach based on app analysis.
Rolling Colors: Adversarial Laser Exploits against Traffic Light Recognition
Chen Yan, Zhejiang University; Zhijian Xu, Zhejiang University and The Chinese University of Hong Kong; Zhanyuan Yin, The University of Chicago; Xiaoyu Ji and Wenyuan Xu, Zhejiang University
Traffic light recognition is essential for fully autonomous driving in urban areas. In this paper, we investigate the feasibility of fooling traffic light recognition mechanisms by shedding laser interference on the camera. By exploiting the rolling shutter of CMOS sensors, we manage to inject a color stripe overlapped on the traffic light in the image, which can cause a red light to be recognized as a green light or vice versa. To increase the success rate, we design an optimization method to search for effective laser parameters based on empirical models of laser interference. Our evaluation in emulated and real-world setups on 2 state-of-the-art recognition systems and 5 cameras reports a maximum success rate of 30% and 86.25% for Red-to-Green and Green-to-Red attacks. We observe that the attack is effective in continuous frames from more than 40 meters away against a moving vehicle, which may cause end-to-end impacts on self-driving such as running a red light or emergency stop. To mitigate the threat, we propose redesigning the rolling shutter mechanism.