Person:
Shieber, Stuart

Last Name: Shieber
First Name: Stuart
Name: Shieber, Stuart

Search Results

Now showing 1 - 10 of 92
  • Publication
    Learning Neural Templates for Text Generation
    (Association for Computational Linguistics, 2018-10) Wiseman, Sam; Shieber, Stuart; Rush, Alexander Sasha
    While neural encoder-decoder models have had significant empirical success in text generation, there remain several unaddressed problems with this style of generation. Encoder-decoder models are largely (a) uninterpretable, and (b) difficult to control in terms of their phrasing or content. This work proposes a neural generation system using a hidden semi-Markov model (HSMM) decoder, which learns latent, discrete templates jointly with learning to generate. We show that this model learns useful templates, and that these templates make generation both more interpretable and more controllable. Furthermore, we show that this approach scales to real data sets and achieves strong performance nearing that of encoder-decoder text generation models.
  • Publication
    Reflexives and Reciprocals in Synchronous Tree Adjoining Grammar
    (2017-09) Aggazzotti, Cristina; Shieber, Stuart
    An attractive feature of the formalism of synchronous tree adjoining grammar (STAG) is its potential to handle linguistic phenomena whose syntactic and semantic derivations seem to diverge. Recent work has aimed at adapting STAG to capture such cases. Anaphors, including both reflexives and reciprocals, have presented a particular challenge due to the locality constraints imposed by the STAG formalism. Previous attempts to model anaphors in STAG have focused specifically on reflexives and have not expanded to incorporate reciprocals. We show how STAG can not only capture the syntactic distribution and semantic representation of both reflexives and reciprocals, but also do so in a unified way.
  • Publication
    On Evaluating the Generalization of LSTM Models in Formal Languages
    (Society for Computation in Linguistics, 2019-01) Suzgun, Mirac; Belinkov, Yonatan; Shieber, Stuart
    Recurrent Neural Networks (RNNs) are theoretically Turing-complete and have established themselves as a dominant model for language processing. Yet uncertainty remains regarding their language learning capabilities. In this paper, we empirically evaluate the inductive learning capabilities of Long Short-Term Memory networks, a popular extension of simple RNNs, on simple formal languages, in particular $a^n b^n$, $a^n b^n c^n$, and $a^n b^n c^n d^n$. We investigate the influence of various aspects of learning, such as training data regimes and model capacity, on generalization to unobserved samples. We find striking differences in model performance under different training settings and highlight the need for careful analysis and assessment when making claims about the learning capabilities of neural network models.
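    As a minimal, hypothetical sketch (assumed names and split points, not code from the paper), the kind of length-split data regime described above can be illustrated in Python by training on short strings of a language such as a^n b^n and evaluating generalization on longer, unobserved ones:
      def an_bn(n):
          return "a" * n + "b" * n
      def an_bn_cn(n):
          return "a" * n + "b" * n + "c" * n
      # Assumed regime: train on n = 1..50, test generalization on unseen n = 51..100.
      train = [an_bn(n) for n in range(1, 51)]
      test = [an_bn(n) for n in range(51, 101)]
      print(train[0], test[0][:10] + "...")  # "ab" and the start of "a"*51 + "b"*51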
  • Publication
    Adapting Sequence Models for Sentence Correction
    (Association for Computational Linguistics, 2017) Schmaltz, Allen; Kim, Yoon; Shieber, Stuart; Rush, Alexander Sasha
    In a controlled experiment of sequence-to-sequence approaches for the task of sentence correction, we find that character-based models are generally more effective than word-based models and models that encode subword information via convolutions, and that modeling the output data as a series of diffs improves effectiveness over standard approaches. Our strongest sequence-to-sequence model improves over our strongest phrase-based statistical machine translation model, with access to the same data, by $6 M^2$ (0.5 GLEU) points. Additionally, in the data environment of the standard CoNLL-2014 setup, we demonstrate that modeling (and tuning against) diffs yields similar or better $M^2$ scores with simpler models and/or significantly less data than previous sequence-to-sequence approaches.
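    As a hedged illustration (not the paper's exact target encoding), representing a correction as a series of word-level diffs rather than as the full corrected sentence might look like the following, using Python's difflib:
      import difflib

      source = "He have gone to school yesterday .".split()
      target = "He had gone to school yesterday .".split()

      diff_sequence = []
      for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(a=source, b=target).get_opcodes():
          if tag == "equal":
              diff_sequence.extend(source[i1:i2])  # copy unchanged words through
          else:
              if i2 > i1:
                  diff_sequence.append("<del> " + " ".join(source[i1:i2]) + " </del>")
              if j2 > j1:
                  diff_sequence.append("<ins> " + " ".join(target[j1:j2]) + " </ins>")
      print(" ".join(diff_sequence))
      # He <del> have </del> <ins> had </ins> gone to school yesterday .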
  • Publication
    LSTM Networks Can Perform Dynamic Counting
    (2019-06-09) Suzgun, Mirac; Gehrmann, Sebastian; Belinkov, Yonatan; Shieber, Stuart
    In this paper, we systematically assess the ability of standard recurrent networks to perform dynamic counting and to encode hierarchical representations. All the neural models in our experiments are designed to be small-sized networks, both to prevent them from memorizing the training sets and to make their behaviour at test time easy to visualize and interpret. Our results demonstrate that Long Short-Term Memory (LSTM) networks can learn to recognize the well-balanced parenthesis language (Dyck-1) and the shuffles of multiple Dyck-1 languages, each defined over different parenthesis pairs, by emulating simple real-time k-counter machines. To the best of our knowledge, this work is the first study to introduce shuffle languages to analyze the computational power of neural networks. We also show that a single-layer LSTM with only one hidden unit is practically sufficient for recognizing the Dyck-1 language. However, none of our recurrent networks was able to perform well on the Dyck-2 language learning task, which requires a model to have a stack-like mechanism for recognition.
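    A minimal sketch of the target behaviour (assumed symbols, not the paper's models): Dyck-1 can be recognized by a real-time machine that maintains a single counter, which is what a one-hidden-unit LSTM is argued to emulate:
      def is_dyck1(s, open_sym="(", close_sym=")"):
          count = 0  # the single counter a 1-counter machine maintains
          for ch in s:
              if ch == open_sym:
                  count += 1
              elif ch == close_sym:
                  count -= 1
                  if count < 0:  # a closing bracket with nothing open
                      return False
              else:
                  return False  # symbol outside the two-letter alphabet
          return count == 0  # accept iff every bracket was matched

      print(is_dyck1("(()())"))  # True
      print(is_dyck1("())("))    # False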
  • Publication
    Neo-Riemannian Cycle Detection with Weighted Finite-State Transducers
    (University of Miami, 2011) Bragg, Jonathan; Chew, Elaine; Shieber, Stuart
    This paper proposes a finite-state model for detecting harmonic cycles as described by neo-Riemannian theorists. Given a string of triads representing a harmonic analysis of a piece, the task is to identify and label all substrings corresponding to these cycles with high accuracy. The solution method uses a noisy channel model implemented with weighted finite-state transducers. On a dataset of four works by Franz Schubert, our model predicted cycles in the same regions as cycles in the ground truth with a precision of 0.18 and a recall of 1.0. The recalled cycles had an average edit distance of 3.2 insertions or deletions from the ground truth cycles, which average 6.4 labeled triads in length. We suggest ways in which our model could be used to contribute to current work in music theory, and be generalized to other music pattern-finding applications.
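    As a hedged illustration of the reported metric (assumed triad labels, not the paper's evaluation code), an edit distance that counts only insertions and deletions between a predicted cycle and a ground-truth cycle can be computed with a standard dynamic program:
      def indel_distance(pred, gold):
          # Cost 1 per insertion or deletion; no substitutions allowed.
          m, n = len(pred), len(gold)
          d = [[0] * (n + 1) for _ in range(m + 1)]
          for i in range(m + 1):
              d[i][0] = i
          for j in range(n + 1):
              d[0][j] = j
          for i in range(1, m + 1):
              for j in range(1, n + 1):
                  if pred[i - 1] == gold[j - 1]:
                      d[i][j] = d[i - 1][j - 1]
                  else:
                      d[i][j] = 1 + min(d[i - 1][j], d[i][j - 1])
          return d[m][n]

      # Hypothetical triad labels for a hexatonic cycle fragment.
      print(indel_distance(["C", "c", "Ab", "ab"], ["C", "c", "Ab", "ab", "E", "e"]))  # 2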
  • Publication
    Bimorphisms and synchronous grammars
    (Institute of Computer Science, Polish Academy of Sciences, 2014) Shieber, Stuart
    We tend to think of the study of language as proceeding by characterizing the strings and structures of a language, and we think of natural language processing as using those structures to build systems of utility in manipulating the language. But many language-related problems are more fruitfully viewed as requiring the specification of a relation between two languages, rather than the specification of a single language. We provide a synthesis and extension of work that unifies two approaches to such language relations: the automata-theoretic approach based on tree transducers that transform trees to their counterparts in the relation, and the grammatical approach based on synchronous grammars that derive pairs of trees in the relation. In particular, we characterize synchronous tree-substitution grammars and synchronous tree-adjoining grammars in terms of bimorphisms, which have previously been used to characterize tree transducers. In the process, we provide new approaches to formalizing the various concepts: a metanotation for describing varieties of tree automata and transducers in equational terms; a rigorous formalization of tree-adjoining and tree-substitution grammars and their synchronous counterparts, using trees over ranked alphabets; and generalizations of tree-adjoining grammar allowing multiple adjunction.
  • Publication
    Ecumenical open access and the Finch Report principles
    (British Academy for the Humanities and Social Sciences, 2013) Shieber, Stuart
  • Publication
    There Can Be No Turing-Test–Passing Memorizing Machines
    (Michigan Publishing, 2014) Shieber, Stuart
    Anti-behaviorist arguments against the validity of the Turing Test as a sufficient condition for attributing intelligence are based on a memorizing machine, which has recorded within it responses to every possible Turing Test interaction of up to a fixed length. The mere possibility of such a machine is claimed to be enough to invalidate the Turing Test. I consider the nomological possibility of memorizing machines, and how long a Turing Test they can pass. I replicate my previous analysis of this critical Turing Test length based on the age of the universe, show how considerations of communication time shorten that estimate and allow eliminating the sole remaining contingent assumption, and argue that the bound is so short that it is incompatible with the very notion of the Turing Test. I conclude that the memorizing machine objection to the Turing Test as a sufficient condition for attributing intelligence is invalid.
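    As a rough back-of-envelope sketch with assumed parameters (not the paper's figures), the storage argument can be made concrete: the number of interrogator turn sequences a memorizing machine must store grows exponentially with the number of exchanges, and quickly exceeds physical limits such as the roughly 10^80 atoms estimated to exist in the observable universe:
      BRANCHING = 10 ** 5          # assumed: distinct plausible interrogator turns per exchange
      ATOMS_IN_UNIVERSE = 10 ** 80

      exchanges = 1
      while BRANCHING ** exchanges <= ATOMS_IN_UNIVERSE:
          exchanges += 1
      print(exchanges)  # with these assumptions, the table outgrows the atom count at 17 exchanges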
  • Publication
    Eliciting and annotating uncertainty in spoken language
    (2014) Pon-Barry, Heather Roberta; Shieber, Stuart; Longenbaugh, Nicholas
    A major challenge in the field of automatic recognition of emotion and affect in speech is the subjective nature of affect labels. The most common approach to acquiring affect labels is to ask a panel of listeners to rate a corpus of spoken utterances along one or more dimensions of interest. For applications ranging from educational technology to voice search to dictation, a speaker’s level of certainty is a primary dimension of interest. In such applications, we would like to know the speaker’s actual level of certainty, but past research has only revealed listeners’ perception of the speaker’s level of certainty. In this paper, we present a method for eliciting spoken utterances using stimuli that we design to have a quantitative, crowdsourced legibility score. While we cannot control a speaker’s actual internal level of certainty, the use of these stimuli provides a better estimate of internal certainty compared to existing speech corpora. The Harvard Uncertainty Speech Corpus, containing speech data, certainty annotations, and prosodic features, is made available to the research community.