# Alphabet (formal languages)

(Redirected from Alphabet (computer science))
Jump to navigation Jump to search

In formal language theory, a string is defined as a finite sequence of members of an underlying base set; this set is called the alphabet of a string or collection of strings.[1][2] The members of the set are called symbols, and are typically thought of as representing letters, characters, or digits.[1][2] For example, a common alphabet is {0,1}, the binary alphabet, and a binary string is a string drawn from the alphabet {0,1}. An infinite sequence of letters may be constructed from elements of an alphabet as well.

## Notation

If L is a formal language, i.e. a (possibly infinite) set of finite-length strings, the alphabet of L is the set of all symbols that may occur in any string in L. For example, if L is the set of all variable identifiers in the programming language C, L’s alphabet is the set { a, b, c, ..., x, y, z, A, B, C, ..., X, Y, Z, 0, 1, 2, ..., 7, 8, 9, _ }.

Given an alphabet ${\displaystyle \Sigma }$, the set of all strings of length ${\displaystyle n}$ over the alphabet ${\displaystyle \Sigma }$ is indicated by ${\displaystyle \Sigma ^{n}}$. The set ${\textstyle \bigcup _{i\in \mathbb {N} }\Sigma ^{i}}$ of all finite strings (regardless of their length) is indicated by the Kleene star operator as ${\displaystyle \Sigma ^{*}}$, and is also called the Kleene closure of ${\displaystyle \Sigma }$. The notation ${\displaystyle \Sigma ^{\omega }}$ indicates the set of all infinite sequences over the alphabet ${\displaystyle \Sigma }$, and ${\displaystyle \Sigma ^{\infty }}$ indicates the set ${\displaystyle \Sigma ^{\ast }\cup \Sigma ^{\omega }}$ of all finite or infinite sequences.

For example, using the binary alphabet {0,1}, the strings ε, 0, 1, 00, 01, 10, 11, 000, etc. are all in the Kleene closure of the alphabet (where ε represents the empty string).

## Applications

Alphabets are important in the use of formal languages, automata and semiautomata. In most cases, for defining instances of automata, such as deterministic finite automata (DFAs), it is required to specify an alphabet from which the input strings for the automaton are built. In these applications, an alphabet is usually required to be a finite set, but is not otherwise restricted.

When using automata, regular expressions, or formal grammars as part of string-processing algorithms, the alphabet may be assumed to be the character set of the text to be processed by these algorithms, or a subset of allowable characters from the character set.

## References

1. ^ a b Aho, Alfred V.; Sethi, Ravi; Ullman, Jeffrey D. (1985). Compilers: Principles, Techniques, and Tools (March 1988 reprint ed.). Addison-Wesley. p. 92. ISBN 0-201-10088-6. The term alphabet or character class denotes any finite set of symbols.
2. ^ a b Ebbinghaus, H.-D.; Flum, J.; Thomas, W. (1994). Mathematical Logic (2nd ed.). New York: Springer. p. 11. ISBN 0-387-94258-0. By an alphabet ${\displaystyle {\mathcal {A}}}$ we mean a nonempty set of symbols.

## Literature

Retrieved from "https://en.wikipedia.org/w/index.php?title=Alphabet_(formal_languages)&oldid=914587706"
This content was retrieved from Wikipedia : http://en.wikipedia.org/wiki/Alphabet_(computer_science)
This page is based on the copyrighted Wikipedia article "Alphabet (formal languages)"; it is used under the Creative Commons Attribution-ShareAlike 3.0 Unported License (CC-BY-SA). You may redistribute it, verbatim or modified, providing that you comply with the terms of the CC-BY-SA