Person:

Smith, Michael

Loading...
Profile Picture

Email Address

AA Acceptance Date

Birth Date

Research Projects

Organizational Units

Job Title

Last Name

Smith

First Name

Michael

Name

Smith, Michael

Search Results

Now showing 1 - 10 of 17
  • Publication

    Improving Performance Isolation on Chip Multiprocessors via an Operating System Scheduler

    (IEEE Computer Society, 2007) Federova, Alexandra; Seltzer, Margo; Smith, Michael

    We describe a new operating system scheduling algorithm that improves performance isolation on chip multiprocessors (CMP). Poor performance isolation occurs when an application’s performance is determined by the behaviour of its co-runners, i.e., other applications simultaneously running with it. This performance dependency is caused by unfair, corunner-dependent cache allocation on CMPs. Poor performance isolation interferes with the operating system’s control over priority enforcement and hinders QoS provisioning. Previous solutions required modifications to the hardware. We present a new software solution. Our cache-fair algorithm ensures that the application runs as quickly as it would under fair cache allocation, regardless of how the cache is actually allocated. If the thread executes fewer instructions per cycle than it would under fair cache allocation, the scheduler increases that thread’s CPU timeslice. This way, the thread’s overall performance does not suffer because it is allowed to use the CPU longer. We describe our implementation of the algorithm in Solaris™ 10, and show that it significantly improves performance isolation for SPEC CPU, SPEC JBB and TPC-C.

  • Publication

    Eliminating voltage emergencies via software-guided code transformations

    (Association for Computing Machinery (ACM), 2010) Reddi, Vijay Janapa; Campanoni, Simone; Gupta, Meeta S.; Smith, Michael; Wei, Gu-Yeon; Brooks, David; Hazelwood, Kim

    In recent years, circuit reliability in modern high-performance processors has become increasingly important. Shrinking feature sizes and diminishing supply voltages have made circuits more sensitive to microprocessor supply voltage fluctuations. These fluctuations result from the natural variation of processor activity as workloads execute, but when left unattended, these voltage fluctuations can lead to timing violations or even transistor lifetime issues. In this article, we present a hardware--software collaborative approach to mitigate voltage fluctuations. A checkpoint-recovery mechanism rectifies errors when voltage violates maximum tolerance settings, while a runtime software layer reschedules the program's instruction stream to prevent recurring violations at the same program location. The runtime layer, combined with the proposed code-rescheduling algorithm, removes 60% of all violations with minimal overhead, thereby significantly improving overall performance. Our solution is a radical departure from the ongoing industry-standard approach to circumvent the issue altogether by optimizing for the worst-case voltage flux, which compromises power and performance efficiency severely, especially looking ahead to future technology generations. Existing conservative approaches will have severe implications on the ability to deliver efficient microprocessors. The proposed technique reassembles a traditional reliability problem as a runtime performance optimization problem, thus allowing us to design processors for typical case operation by building intelligent algorithms that can prevent recurring violations.

  • Publication

    Voltage Smoothing: Characterizing and Mitigating Voltage Noise in Production Processors via Software-Guided Thread Scheduling

    (IEEE, 2010) Reddi, Vijay Janapa; Kanev, Svilen; Kim, Wonyoung; Campanoni, Simone; Smith, Michael; Wei, Gu-Yeon; Brooks, David

    Parameter variations have become a dominant challenge in microprocessor design. Voltage variation is especially daunting because it happens so rapidly. We measure and characterize voltage variation in a running Intel Core2 Duo processor. By sensing on-die voltage as the processor runs single-threaded, multi-threaded, and multi-program workloads, we determine the average supply voltage swing of the processor to be only 4 percent, far from the processor's 14percent worst-case operating voltage margin. While such large margins guarantee correctness, they penalize performance and power efficiency. We investigate and quantify the benefits of designing a processor for typical-case (rather than worst-case) voltage swings, assuming that a fail-safe mechanism protects it from infrequently occurring large voltage fluctuations. With today's processors, such resilient designs could yield 15 percent to 20 percent performance improvements. But we also show that in future systems, these gains could be lost as increasing voltage swings intensify the frequency of fail-safe recoveries. After characterizing micro architectural activity that leads to voltage swings within multi-core systems, we show that a voltage-noise-aware thread scheduler in software can co-schedule phases of different programs to mitigate error recovery overheads in future resilient processor designs.

  • Publication

    Voltage Noise in Production Processors

    (Institute of Electrical & Electronics Engineers (IEEE), 2011) Janapa Reddi, Vijay; Kanev, Svilen; Kim, Wonyoung; Campanoni, Simone; Smith, Michael; Wei, Gu-Yeon; Brooks, David

    Voltage variations are a major challenge in processor design. Here, researchers characterize the voltage noise characteristics of programs as they run to completion on a production Core 2 Duo processor. Furthermore, they characterize the implications of resilient architecture design for voltage variation in future systems.

  • Publication

    Cache-Fair Thread Scheduling for Multicore Processors

    (2006) Fedorova, Alexandra; Seltzer, Margo; Smith, Michael

    We present a new operating system scheduling algorithm for multicore processors. Our algorithm reduces the effects of unequal CPU cache sharing that occur on these processors and cause unfair CPU sharing, priority inversion, and inadequate CPU accounting. We describe the implementation of our algorithm in the Solaris operating system and demonstrate that it produces fairer schedules enabling better priority enforcement and improved performance stability for applications. With conventional scheduling algorithms, application performance on multicore processors varies by up to 36% depending on the runtime characteristics of concurrent processes. We reduce this variability by up to a factor of seven.

  • Publication

    Quality and speed in linear-scan register allocation

    (1997) Traub, Omri; Holloway, Glenn; Smith, Michael

    A linear-scan algorithm directs the global allocation of register candidates to registers based on a simple linear sweep over the program being compiled. This approach to register allocation makes sense for systems, such as those for dynamic compilation, where compilation speed is important. In contrast, most commercial and research optimizing compilers rely on a graph-coloring approach to global register allocation. In this paper, we compare the performance of a linear-scan method against a modern graph-coloring method. We implement both register allocators within the Machine SUIF extension of the Stanford SUIF compiler system. Experimental results show that linear scan is much faster than coloring on benchmarks with large numbers of register candidates. We also describe improvements to the linear-scan approach that do not change its linear character, but allow it to produce code of a quality near to that produced by graph coloring.

  • Publication

    A Public-Key Infrastructure for Key Distribution in TinyOS Based on Elliptic Curve Cryptography

    (IEEE, 2004) Malan, David; Welsh, Matt; Smith, Michael

    We present the first known implementation of elliptic curve cryptography over F2p for sensor networks based on the 8-bit, 7.3828-MHz MICA2 mote. Through instrumentation of UC Berkeley's TinySec module, we argue that, although secret-key cryptography has been tractable in this domain for some time, there has remained a need for an efficient, secure mechanism for distribution of secret keys among nodes. Although public-key infrastructure has been thought impractical, we argue, through analysis of our own implementation for TinyOS of multiplication of points on elliptic curves, that public-key infrastructure is, in fact, viable for TinySec keys' distribution, even on the MICA2. We demonstrate that public keys can be generated within 34 seconds, and that shared secrets can be distributed among nodes in a sensor network within the same, using just over 1 kilobyte of SRAM and 34 kilobytes of ROM.

  • Publication

    Implementing Public-Key Infrastructure for Sensor Networks

    (Association for Computing Machinery, 2008) Malan, David; Welsh, Matt; Smith, Michael

    We present a critical evaluation of the first known implementation of elliptic curve cryptography over F2p for sensor networks based on the 8-bit, 7.3828-MHz MICA2 mote. We offer, along the way, a primer for those interested in the field of cryptography for sensor networks. We discuss, in particular, the decisions underlying our design and alternatives thereto. And we elaborate on the methodologies underlying our evaluation.

    Through instrumentation of UC Berkeley's TinySec module, we argue that, although symmetric cryptography has been tractable in this domain for some time, there has remained a need, unfulfilled until recently, for an efficient, secure mechanism for distribution of secret keys among nodes. Although public-key infrastructure has been thought impractical, we show, through analysis of our original implementation for TinyOS of point multiplication on elliptic curves, that public-key infrastructure is indeed viable for TinySec keys' distribution, even on the MICA2. We demonstrate that public keys can be generated within 34 seconds and that shared secrets can be distributed among nodes in a sensor network within the same time, using just over 1 kilobyte of SRAM and 34 kilobytes of ROM. We demonstrate that communication costs are minimal, with only 2 packets required for transmission of a public key among nodes. We make available all of our source code for other researchers to download and use. And we discuss recent results based on our work that corroborate and improve upon our conclusions.

  • Publication

    Exploiting Temporal Consistency to Reduce False Positives in Host-Based, Collaborative Detection of Worms

    (Association for Computing Machinery, 2006) Malan, David J.; Smith, Michael

    The speed of today’s worms demands automated detection, but the risk of false positives poses a difficult problem. In prior work, we proposed a host-based intrusion-detection system for worms that leveraged collaboration among peers to lower its risk of false positives, and we simulated this approach for a system with two peers. In this paper, we build upon that work and evaluate our ideas “in the wild.” We implement Wormboy 2.0, a prototype of our vision that allows us to quantify and compare worms’ and non-worms’ temporal consistency, similarity over time in worms’ and non-worms’ invocations of system calls. We deploy our prototype to a network of 30 hosts running Windows XP with Service Pack 2 to monitor and analyze 10,776 processes, inclusive of 511 unique non-worms (873 if we consider unique versions to be unique non-worms). We identify properties with which we can distinguish non-worms from worms 99% of the time. We find that our collaborative architecture, using patterns of system calls and simple heuristics, can detect worms running on multiple peers. And we find that collaboration among peers significantly reduces our probability of false positives because of the unlikely appearance on many peers simultaneously of non-worm processes with worm-like properties.

  • Publication

    Host-Based Detection of Worms through Peer-to-Peer Cooperation

    (Association for Computing Machinery, 2005) Malan, David J.; Smith, Michael

    We propose a host-based, runtime defense against worms that achieves negligible risk of false positives through peer-to-peer cooperation. We view correlation among otherwise independent peers’ behavior as anomalous behavior, indication of a fast-spreading worm. We detect correlation by exploiting worms’ temporal consistency, similarity (low temporal variance) in worms’ invocations of system calls. We evaluate our ideas on Windows XP with Service Pack 2 using traces of nine variants of worms and twenty-five non-worms, including ten commercial applications and fifteen processes native to the platform. We find that two peers, upon exchanging snapshots of their internal behavior, defined with frequency distributions of system calls, can decide that they are, more likely than not, executing a worm between 76% and 97% of the time. More importantly, we find that the probability that peers might err, judging a non-worm a worm, is negligible.