Publication:

Machine Learning for Tangible Effects: Natural Language Processing for Uncovering the Illicit Massage Industry & Computer Vision for Tactile Sensing

Date

2023-09-13

Published Version

The Harvard community has made this article openly available.

Citation

Ouyang, Nancy Rui. 2023. Machine Learning for Tangible Effects: Natural Language Processing for Uncovering the Illicit Massage Industry & Computer Vision for Tactile Sensing. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

I explore two questions in this thesis: how can computer science be used to fight human trafficking? And how can computer vision create a sense of touch?

The United States illicit massage industry (IMI) is a multi-billion dollar industry that offers not just therapeutic massages but also commercial sexual services. Illicit massage parlors number in the thousands and exist in every major city in the United States. Employees are often immigrant women with few other job opportunities, leaving them vulnerable to fraud, coercion, and other facets of human trafficking.

Using datasets built from three publicly accessible websites (Google Places, Rubmaps, and the AMPReviews forum), I show how natural language processing tools such as bag-of-words representations, combined with machine learning classifiers, can help monitor spatiotemporal trends in the IMI. Monitoring plays an essential role in preventing trafficking and protecting employees within the IMI. I further show how to use word embeddings such as Word2Vec to derive insights into the labor pressures and language barriers affecting IMI employees. Similarly, I analyze the income, demographics, and societal pressures (such as relationship status) affecting sex buyers. Other insights include linked domains and the use of word embeddings as a tool for acronym expansion.

I also consider counter-trafficking in the banking sector. Human trafficking is about money, much of which will eventually flow through the legal financial system. Banks are legally required to have safeguards to guarantee they are not aiding or abetting criminal activity. My preliminary work focuses on creating synthetic transaction data so that researchers can more easily prototype, evaluate, and collaborate on developing anti-money laundering algorithms. This work adopts agent-based modeling and is inspired by red-flagged transactional behaviors from the United States and the Canadian financial regulatory agencies. I show both the uses and limitations of my model in generating timestamps and payee-recipient graphs for transactions.

Finally, I consider the role of computer vision in creating tactile sensors. Tactile sensors are critical for robots that seek to manipulate and interact with the world, a prerequisite for helping with household tasks. Existing sensors include the GelSight sensor, which consists of a camera facing a gel that is lit from multiple angles. The surface of the gel is slightly translucent and slightly reflective (semi-specular), and when objects are pressed into the gel, the image becomes a tactile image. Adapting a GelSight sensor to the task of finding buried objects in sand required several modifications. Creating a wedge-shaped sensor allows for digging down into the granular media. The novel use of fluorescent paint instead of LEDs for gel lighting allows for significant sensor size reduction. Finally, an integrated vibrator motor counteracts jamming in the granular media, reducing force requirements for moving through the media.

This work also shows how to use a webcam and a printed reference marker, or fiducial, to create a low-cost six-axis force-torque sensor. Commercial six-axis force-torque sensors cost thousands of dollars and often contain delicate strain gauges. By contrast, this sensor is inexpensive, made using readily available rapid prototyping technologies, and easy to modify. All code and hardware design files are open sourced, opening up six-axis force-torque sensing to a wider range of applications.

Keywords

computational social science, computer vision, human trafficking, illicit massage industry, natural language processing, tactile sensing, Artificial intelligence, Robotics, Social research

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
