Grammar Type for String

Date

2023-10-10

Citation

Xu, Licheng. 2023. Grammar Type for String. Master's thesis, Harvard University Division of Continuing Education.

Abstract

Strings are ubiquitous in computer programs. Both the correctness and the security of programs that use Strings often depend on those Strings not being arbitrary, but belonging to specific sets of Strings. However, this restriction is often not enforced, let alone clearly specified. To remedy this issue, this thesis creates a language extension on top of the standard Java language. The extension introduces the Grammar Type, a subtype of String whose values must conform to the regular expression specified for that Grammar Type. For example, String[["a*b"]] is a Grammar Type that represents all Strings consisting of any number of a's (including none) followed by a b. When a variable is declared as, or cast to, String[[some regex]], its value must be a String that conforms to that regex; otherwise the result is either a compile-time error or a runtime error, depending on the situation. This makes the Java type system more powerful, because Strings can now be validated by the type system itself against any pattern that can be defined with a regex. At the same time, since Grammar Types compile down to Strings, they inherit all the functionality of Strings. With these advantages, Grammar Types can be used in applications that need input validation, such as validating email addresses, validating URLs, or mitigating SQL injection attacks.
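
The runtime side of this idea can be approximated in standard Java. The sketch below is an illustration only: the class GrammarTypeSketch and the methods conforms and castTo are hypothetical names that do not come from the thesis, and the abstract does not say how the extension compiles its checks. It assumes that "conforming" means the whole String matches the regex, and it emulates a failed cast to String[["a*b"]] by throwing a ClassCastException.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the thesis: emulates in plain Java the
// kind of runtime check a cast to String[["a*b"]] might perform.
final class GrammarTypeSketch {

    // Returns true when the entire value matches the regex (assumption:
    // conformance means a whole-string match, not a substring match).
    static boolean conforms(String regex, String value) {
        return Pattern.matches(regex, value);
    }

    // Emulates a checked cast: returns the value unchanged if it conforms,
    // otherwise throws, much as a failed reference cast does at runtime.
    static String castTo(String regex, String value) {
        if (!conforms(regex, value)) {
            throw new ClassCastException(
                "\"" + value + "\" does not conform to " + regex);
        }
        return value; // the result is still an ordinary String
    }

    public static void main(String[] args) {
        String ok = castTo("a*b", "aaab");
        System.out.println(ok.length());            // prints 4; String methods still apply
        System.out.println(conforms("a*b", "b"));   // prints true (zero a's, then b)
        System.out.println(conforms("a*b", "aba")); // prints false
    }
}
```

A check like this only covers the runtime case. The compile-time errors mentioned in the abstract would require support from the compiler itself, which plain Java cannot express.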

Keywords

Compiler, Java, Regular expressions, Software engineering, String, Type system, Computer science, Information technology, Computer engineering

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.
