Definitions of Turing machines are always explicit about the blank symbol not being part of the input alphabet.

I wonder what would go wrong if you made it part of the input alphabet, because effectively the blank symbol already seems to be part of the input.

To explain that 'seems' in the last sentence, consider the following.

In the default setup, infinitely many blank symbols appear to the right of the input. When the tape head moves onto the first blank symbol, computation can simply continue, since the machine need not be in an accept or reject state at that point.

Now suppose the computation subsequently writes symbols from the input alphabet to the right of that first blank symbol, then returns to the leftmost position and to the start state. It then 'starts over' with a different tape. Effectively, it now starts with a different input, one with input symbols to the right of the blank that weren't there before. The input seems to effectively include the blank symbol. The further behavior of the machine could now also be different: after encountering the blank again, it will find different symbols to its right.
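The first half of this scenario can be made concrete with a small simulator. What follows is only a minimal Python sketch: the `run` helper, the state names, and the use of `_` for the blank are all assumptions made for illustration, not part of any standard definition. The hypothetical machine scans right over its input, steps past the first blank, writes a `1` there, and halts.

```python
from collections import defaultdict

BLANK = "_"  # assumed spelling of the blank symbol

def run(delta, tape_input, start="scan", halt="done", max_steps=1000):
    """Run a one-tape machine and return the non-blank span of the final tape.

    If the machine has not halted after max_steps, the tape is returned as is.
    """
    tape = defaultdict(lambda: BLANK)  # every unwritten cell reads as blank
    for i, ch in enumerate(tape_input):
        tape[i] = ch
    state, head = start, 0
    for _ in range(max_steps):
        if state == halt:
            break
        state, write, move = delta[(state, tape[head])]
        tape[head] = write
        head += 1 if move == "R" else -1
    cells = [i for i, ch in tape.items() if ch != BLANK]
    if not cells:
        return ""
    return "".join(tape[i] for i in range(min(cells), max(cells) + 1))

# Hypothetical transition table: (state, read) -> (next state, write, move)
delta = {
    ("scan", "0"): ("scan", "0", "R"),
    ("scan", "1"): ("scan", "1", "R"),
    ("scan", BLANK): ("mark", BLANK, "R"),  # step over the first blank
    ("mark", BLANK): ("done", "1", "R"),    # write an input symbol beyond it
}

result = run(delta, "01")
print(result)  # 01_1
```

The final tape `01_1` has a blank in the middle of its non-blank region, which is exactly the kind of tape the scenario describes.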

Supposing this scenario is indeed possible, why wouldn't you consider the blank symbol part of the input alphabet and why wouldn't you allow including it as part of the 'initial' input?

Perhaps it is just a way to define the input such that it isn't always infinite?

Confusion

3 Answers

The main reason is that it allows the machine to detect the end of its input: the input ends at the character just before the first blank. If you allowed blanks in the input, the machine could never know whether it might find more input by scanning farther to the right. Of course, you could solve that by having a special "end of input" character, but then you have to insist that it can't appear in the input, so you've just shifted the problem one level deeper.
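One way to see the ambiguity is to look at the initial tape itself. The sketch below is illustrative only: `initial_tape`, `end_of_input`, the `_` blank, and the 8-cell window are assumed names and choices, and the window stands in for a tape that is really infinite.

```python
BLANK = "_"  # assumed spelling of the blank symbol

def initial_tape(word, length=8):
    """First `length` cells of the initial tape: the word, then blanks.

    Assumes the word is shorter than the window.
    """
    return (word + BLANK * length)[:length]

def end_of_input(tape):
    """Index of the first blank, i.e. where the machine decides its input ends."""
    return tape.index(BLANK)

print(initial_tape("01"))          # 01______
print(initial_tape("01" + BLANK))  # 01______  (same tape, "different" input)
print(end_of_input(initial_tape("0_1")))  # 1, although 3 symbols were intended
```

If blanks were allowed in the input, `01` and `01_` would produce identical initial tapes, so no machine could tell them apart.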

It also makes the initial conditions much easier to specify: the input is the non-blank section of the initial tape, which must be finite and contiguous. And if you want a blank character to be a part of the input alphabet, you can always add an extra character (call it "space" or something) and have the machine behave however you want when it sees it.

David Richerby

You can define the blank symbol to be part of the alphabet. The problem with that is that if a Turing machine with input b010010b (where b stands for blank) never reads past the second b, then the machine will behave exactly the same way on all inputs starting with b010010b.

These Turing machines are called prefix Turing machines, and they are very useful for proving some theorems about Kolmogorov complexity.
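The "same behavior on every extension" point can be illustrated with a toy example. This is not a full Turing machine simulator, just a loop imitating a hypothetical machine that skips the leading `b`, counts the `1`s, and halts at the next `b`. The function name and the choice to count `1`s are assumptions for illustration.

```python
def run_until_second_b(tape_input):
    """Imitate a machine that never reads past the second b.

    Returns (number of 1s seen, index of the last cell read).
    Assumes the input starts with b and contains a second b.
    """
    assert tape_input.startswith("b")
    head, ones = 1, 0
    while tape_input[head] != "b":
        ones += tape_input[head] == "1"  # True counts as 1
        head += 1
    return ones, head

print(run_until_second_b("b010010b"))      # (2, 7)
print(run_until_second_b("b010010b1101"))  # (2, 7)
```

Both runs stop at cell 7 with the same result, because whatever follows the second `b` is never read.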

Peter Shor

Very short answer: the tape alphabet is the set of symbols that can appear on the tape, and it includes the blank symbol. The input alphabet is the set of symbols that can appear in the initial input, and it does not include the blank symbol. The main alphabet the machine cares about is the tape alphabet: it still needs rules for what to do when it sees a blank, for example.
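As a rough sketch of the two alphabets, assuming a binary input alphabet, `_` for the blank, and a made-up work symbol `X` (none of which come from any particular textbook definition):

```python
BLANK = "_"
INPUT_ALPHABET = {"0", "1"}                     # symbols allowed in the input
TAPE_ALPHABET = INPUT_ALPHABET | {BLANK, "X"}   # "X" is a hypothetical work symbol

def is_valid_input(word):
    """True if every symbol of the word comes from the input alphabet."""
    return all(ch in INPUT_ALPHABET for ch in word)

def missing_rules(delta, states):
    """(state, symbol) pairs without a transition, checked over the tape alphabet."""
    return [(q, a) for q in states
            for a in sorted(TAPE_ALPHABET) if (q, a) not in delta]

print(is_valid_input("0110"))   # True
print(is_valid_input("01_1"))   # False: the blank is not an input symbol
print(missing_rules({("q", "0"): ("q", "0", "R")}, ["q"]))
# [('q', '1'), ('q', 'X'), ('q', '_')]
```

The second helper checks transitions against the tape alphabet, so a missing rule for the blank is reported just like a missing rule for any other symbol.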

This distinction is important, as others have said, so that the machine can tell where its input ends. It's the same reason you can't (usefully) put a zero-character in the middle of a string in C: the zero-character is reserved to mean "the last non-zero character before this is the end of the data, so when you see this, you're done". If you need to expect zero-characters in the middle of the string, writing strlen gets a whole lot harder.
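To make the analogy concrete, here is a Python imitation of how a C-style `strlen` scans a buffer. The name `c_strlen` is made up for this sketch, and the real C function works on a pointer rather than a `bytes` object.

```python
def c_strlen(buf: bytes) -> int:
    """Count the bytes before the first zero byte, as a C-style strlen would.

    Assumes the buffer contains a zero byte; without one this raises
    IndexError, where C would read past the end of the buffer.
    """
    n = 0
    while buf[n] != 0:
        n += 1
    return n

print(c_strlen(b"hello\x00"))     # 5
print(c_strlen(b"he\x00llo\x00")) # 2: everything after the first zero is invisible
```

The embedded zero byte cuts the string short in the same way that a blank inside the input would cut the input short for a Turing machine.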

Draconis