Rich Linguistic Structure from Large-Scale Web Data
Citation: Yamangil, Elif. 2013. Rich Linguistic Structure from Large-Scale Web Data. Doctoral dissertation, Harvard University.
Abstract: The past two decades have shown an unexpected effectiveness of Web-scale data in natural language processing. Even the simplest models, when paired with unprecedented amounts of unstructured and unlabeled Web data, have been shown to outperform sophisticated ones. It has been argued that the effectiveness of Web-scale data has undermined the necessity of sophisticated modeling or laborious data set curation. In this thesis, we argue for and illustrate an alternative view, that Web-scale data not only serves to improve the performance of simple models, but also can allow the use of qualitatively more sophisticated models that would not be deployable otherwise, leading to even further performance gains.
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:11181110
- FAS Theses and Dissertations