This is probably a very naive question, based on a lack of understanding, but I was watching a video by Andreas M. Antonopoulos about blockchain, Bitcoin, and consensus algorithms in which he asked for the SHA-256 output of the string Hello!. Someone told him the first few characters, and being a Windows/C# developer I decided, just for fun, to check the answer by implementing this very simple C# method:

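    // Requires: using System.Security.Cryptography; using System.Text;

    // Returns the SHA-256 digest of the ASCII encoding of valueToHash.
    public byte[] Generate(string valueToHash)
    {
        byte[] hashValue;
        // Note: Encoding.ASCII replaces any non-ASCII character with '?'.
        byte[] stringToBytes = Encoding.ASCII.GetBytes(valueToHash);

        using (SHA256 hashGenerator = SHA256.Create())
        {
            hashValue = hashGenerator.ComputeHash(stringToBytes);
        }

        return hashValue;
    }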

I then printed the result using this method:

    public static void PrintByteArray(byte[] array)
    {
        for (int i = 0; i < array.Length; i++)
        {
            Console.Write($"{array[i]:X2}");
            if ((i % 4) == 3) Console.Write(" ");
        }
        Console.WriteLine();
    }

But I got a different result.

I remember that in the past there were sites storing huge databases of hashes together with the values that produce them; you could paste in a hash and, if the input was something common, get the original string back. This led me to think that, no matter which language or OS you use, hashing a value with SHA-256 would always produce the same result. However, it seems that this is not entirely the case. The person in the video who provided the answer turned out (according to the comments) to be using a Mac, so what exactly causes SHA-256 to produce different outputs for the same input? Is it the OS? Is it the programming language? Or have I made a mistake in my simple code?

Maarten Bodewes
Leron

1 Answer

Yes, SHA-256 is defined to give the same output for the same input byte sequence on every platform. However, by design, even the smallest change in the input will change the output into something completely different. So the most likely reason for the discrepancy you observed is that the input you used is not actually the same as the original input from the video.
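To make both properties concrete, here is a minimal C# sketch (the class and method names are hypothetical, chosen only for illustration, and UTF-8 is an assumed encoding). It hashes the same string twice and then hashes a string that differs in one character, counting how many of the 256 output bits differ in each case.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class AvalancheDemo
{
    // Hashes the UTF-8 bytes of a string with SHA-256 (UTF-8 is an assumption here).
    public static byte[] Sha256Utf8(string text)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return sha.ComputeHash(Encoding.UTF8.GetBytes(text));
        }
    }

    // Counts the bit positions at which two equal-length digests differ.
    public static int CountDifferingBits(byte[] a, byte[] b)
    {
        int count = 0;
        for (int i = 0; i < a.Length; i++)
        {
            int x = a[i] ^ b[i];
            while (x != 0)
            {
                count += x & 1;
                x >>= 1;
            }
        }
        return count;
    }

    public static void Main()
    {
        byte[] first = Sha256Utf8("Hello!");
        byte[] again = Sha256Utf8("Hello!");
        byte[] changed = Sha256Utf8("Hello.");

        // Identical input bytes always give an identical digest.
        Console.WriteLine($"Same input: {CountDifferingBits(first, again)} bits differ");
        // A one-character change typically flips roughly half of the 256 bits.
        Console.WriteLine($"One character changed: {CountDifferingBits(first, changed)} bits differ");
    }
}
```

The first count should always be zero; the second varies with the input pair but should be far from zero.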

Some possible causes for this discrepancy might include:

  • An extra null byte at the end of one of the inputs.
  • An extra newline or space at the end of one of the inputs.
  • A difference in the encoding of newlines (if the string contains one).
  • A difference in the capitalization of one of the letters (e.g. hello vs. Hello).
  • A difference in punctuation (e.g. Hello! vs. Hello. or "Hello!").
  • A difference in the character encoding used to encode the string as a sequence of bytes (unlikely for a plain ASCII string like Hello!, since most commonly used character encodings today are extensions of ASCII, but possible e.g. if one string was encoded as UTF-16).
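One way to narrow down which of these applies is to hash each candidate byte sequence and compare the results against the value you are trying to reproduce. The following is a rough sketch rather than a definitive diagnostic; the class name, method name, and the particular set of variants are assumptions made for illustration. It prints the raw input bytes next to each digest so that the difference in input is visible.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class InputVariants
{
    // Returns the SHA-256 digest of the given bytes as lowercase hex.
    public static string Sha256Hex(byte[] input)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return BitConverter.ToString(sha.ComputeHash(input))
                               .Replace("-", "")
                               .ToLowerInvariant();
        }
    }

    public static void Main()
    {
        // Hypothetical candidates, one per cause listed above.
        var variants = new (string Label, byte[] Bytes)[]
        {
            ("UTF-8",            Encoding.UTF8.GetBytes("Hello!")),
            ("ASCII",            Encoding.ASCII.GetBytes("Hello!")),     // same bytes as UTF-8 here
            ("trailing newline", Encoding.UTF8.GetBytes("Hello!\n")),    // e.g. a shell echo appends this by default
            ("trailing null",    Encoding.UTF8.GetBytes("Hello!\0")),
            ("lowercase",        Encoding.UTF8.GetBytes("hello!")),
            ("with quotes",      Encoding.UTF8.GetBytes("\"Hello!\"")),
            ("UTF-16LE",         Encoding.Unicode.GetBytes("Hello!")),   // two bytes per character
        };

        foreach (var v in variants)
        {
            Console.WriteLine($"{v.Label,-17} {BitConverter.ToString(v.Bytes)}");
            Console.WriteLine($"{"",-17} {Sha256Hex(v.Bytes)}");
        }
    }
}
```

Only the UTF-8 and ASCII rows should share a digest, since those two encodings produce identical bytes for this string; every other row hashes a different byte sequence.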
Ilmari Karonen