Publication:
Content-Aware Manipulations for Image and Video Collections

Date

2012-08-17

Citation

Dale, Kevin. 2012. Content-Aware Manipulations for Image and Video Collections. Doctoral dissertation, Harvard University.

Abstract

Digital photography and videography have become ubiquitous. With ever-cheaper and more capable cameras, found in dedicated devices and, increasingly, in multipurpose smartphones and tablets, it is now easier than ever for casual users to generate their own dense streams of personal multimedia data. With popular photo and video sharing sites like Flickr, Facebook, and YouTube, users can share their images and video with the world, resulting in a vast amount of multimedia data stored at home and on the web. This flood of data makes it challenging, particularly for non-professionals, to manage their data: for example, to apply simple photographic adjustments, or to find interesting shots worth keeping among megabytes of data from even a short weekend trip. At the same time, researchers have a unique opportunity to exploit the vast amount of publicly available multimedia data to make graphics tasks easier for the individual. This dissertation presents work that seeks to address some of these challenges and, where possible, to exploit existing data sets to do so. First, we discuss a general approach that finds images similar to a given input from among a collection of photographs and transfers various task-specific properties from them to the input. We demonstrate this basic approach in two distinct settings: image restoration and CG image enhancement. Next, we focus on collections of video. We first present a method for efficiently browsing and summarizing collections of related videos. Our approach is based on a simple pairwise video alignment that identifies a relevant sequence of video clips that best matches an input video. Finally, we discuss our work on replacing facial performances in video; our approach requires no special hardware and can be used to retarget existing footage to synthesize new performances.

Keywords

computer science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
