The Infiniteness of Words

Maya Mubayi

Countable, uncountable, and infinite sets can be found everywhere, from pure mathematical cases to sets we find in everyday life. We know that sets of numbers, such as \mathbb{R}, \mathbb{N}, \mathbb{Z}, and many more can be classified as countable, uncountable, and infinite. Similarly, you can think of a tree as an infinite or finite set of branches, or even your own wardrobe as a finite set of pant-shirt-shoe combinations. What about the words in the English language? The number of possible English sentences? The number of possible books? Are those finite sets? Infinite sets? Uncountable sets? I will explore these questions further in this blog.

Why Are the Set of English Words Finite?

First define E as the set of all English words. Since we are dealing with words, we can assume that there is in fact a longest word in the English language. Let the length of this word be n for n \in \mathbb{N}. If E_{1} is the set of all English words of length 1, E_{2} is the set of all English words on length 2, and so on, we know that E_{1} \cup E_{2} \cup\cup E_{n} = E. Since there are at most 26 options for any letter, for each E_{i} \subseteq E for i \le n. This means that E is composed of a finite union of finite sets. Therefore, E is not only countable, but it’s also finite!

Why Are the Set of All Possible English Sentences Countable?

Before building up to books, we will now look at combinations of English words into sentences. Using grammar and the existence of compound sentences, unlike with words, we actually cannot define the length of the longest possible English sentence. For example, we could say “I went on a walk with my mom and my dad and my friend and …” with as many “ands” as we want to connect nouns. Therefore, we will show that the set of all possible English sentences is countably infinite.

Let S_{1} be the set of all English sentences containing one word, S_{2} be the set of all English sentences containing two words, and so on. Thus, we can define S such that S = S_{1} \cup S_{2} \cup S_{3} \cup \cdots. Thus S represents the set of all English sentences. Since we previously proved that there are a finite number of words, let |E| = a. Since there are at most a options for any word within a sentence, for each S_{i} \in S, we can say that |S_{i}| \le a^i. Therefore each S_{i} is a finite sets (therefore countable set). Moreover S_{1}, S_{2}, S_{3}… represent a countably infinite collection of sets. Thus, S is the union of an infinitely countable set of finite sets. As we proved in class, this mean that S is an infinitely countable set.

We have now found that although the set of English words is finite, the sets of possible English sentences is not!

Why Are the Set of All Possible English Books Countable?

Now we can begin to looks the the collection of all possible English books. We can think of books as a finite collection of sentences. Similar to sentences, there is no way to define the number of the longest possible books, since this number could be infinitely large. Therefore, we will show that the set of all possible English books is countably infinite. The proof for this will be very similar to the proof for countability of sentences so I won’t go into detail. The proof will proceed in the following outline.

  1. Let B_{1} be the set of all English books containing one sentence, B_{2} be the set of all English books containing two sentences, and so on. Thus, we can define B such that B = B_{1} \cup B_{2} \cup B_{3}\cup \cdots. Thus B represents the set of all possible English books. Note that each B_{i} is the cartesian product of i B_{1}‘s. Since we previously proved that there are countably infinite possible English sentences, we can say that B_{1} is a countably infinite set. Thus there is a bijection g: B_{1} \rightarrow \mathbb{N}. Then each B_{i} can be written as \mathbb{N} \times \mathbb{N} \times \cdots \times\mathbb{N} where the Cartesian product is an ordered i-tuple. Using properties that we proved in class, we can find a bijection h: \mathbb{N} \times \mathbb{N} \times \cdots \times\mathbb{N} \to \mathbb{N}. Thus each B_{i} is a countably infinite set.
  2. B_{1}, B_{2}, B_{3}… is a countable collection of sets.
  3. Thus since we have a countable collection of countable sets, the union must be countable. Therefore B is a countably infinite set.

We have shown that the set of all possible English novels is in fact a countably infinite set. How interesting is it that we can go from a finite set of words to an infinite set of sentences and books! This just shows how finite sets can be used to create infinite sets. Thanks for reading!

Sources

  • Chomsky, N., &; Miller, G. A. (1958). Finite State languages. Information and Control, 1(2), 91–112. https://doi.org/10.1016/s0019-9958(58)90082-2
  • Thompson, Henry. “D. T. Langendoen and P. M. Postal, The Vastness of Natural Languages. Oxford: Basil Blackwell, 1984. Pp. Ix 189.” Journal of Linguistics, vol. 22, no. 1, 1986, pp. 241–242., doi:10.1017/S0022226700010732.

mbm9123

2 Comments

  1. Wow! This was such an enjoyable read. I have never considered the possibility that something like the English language could be applied to the topic of cardinality. From reading your explanations, I would be interested in seeing how you might approach this using a language that doesn’t have a set alphabet (since presumably as long as there are a finite amount of characters in the alphabet, we cannot have an uncountable number of words). For example, languages like Chinese could theoretically have an infinite number of strokes in a “word”.

    I would argue that unless partial strokes exist, the strokes that consist of any word can be mapped onto the entirety of the natural numbers or a subset of the natural numbers, which should make the set of all words countably infinite rather than finite. Would you say that this logic makes sense given your research on this topic?

  2. Hi Maya,

    This is such an interesting application of what we discussed in class! I never even thought of trying to take the idea of bijections and apply them to letters, words, and sentences. I am so curious however, how far out the ideas you mentioned will extend! I write this sitting in a library, and am inspired by the question, what about bookshelves? Although you can create a bijection from the set of the cartesian product of all natural numbers to the different books in the English language, what about groupings of those? What if the bookshelves are finite, so there are an infinite number of finite bookshelves? Can I apply the same concepts? I love how much this piqued my interest in the more applied ideas of what I really saw throughout the course as extremely theoretical ideas. I wonder, as more words get added to the English language every year by the Oxford English Dictionary, given infinite time, does that affect the proof? I love how much this enticed me to explore the subject further, and thank you for the interesting approach to looking at the set of all English sentences!

Leave a Reply

Your email address will not be published. Required fields are marked *