Questions tagged [frequency-analysis]

Frequency analysis is the study of letters or groups of letters contained in a ciphertext in an attempt to partially reveal the message.

In English, as in most other languages, certain letters and groups of letters appear more often than others. The following chart shows the frequency distribution of letters in English text:

[Figure: frequency distribution chart of letters in the English alphabet. Image source: Wikipedia]

As the chart shows, the letter e is the most common, while j, q, and x are very uncommon.

If the encryption method does not effectively mask these frequencies, parts of the plaintext can be determined statistically from the ciphertext alone, using the known frequencies of letters in English text.

Many classical ciphers are vulnerable to frequency analysis.
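
As a minimal sketch of the idea above, the following Python counts letter frequencies in a ciphertext and uses them to guess a key. The Caesar cipher is chosen here only as an illustration, and the function names (letter_frequencies, guess_caesar_shift, caesar_shift) are hypothetical. The guess assumes that the most common ciphertext letter stands for plaintext e, which tends to hold only for reasonably long English text.

```python
from collections import Counter
import string


def letter_frequencies(text):
    """Return the relative frequency of each letter a-z, ignoring case and non-letters."""
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    counts = Counter(letters)
    total = len(letters)
    return {c: (counts[c] / total if total else 0.0) for c in string.ascii_lowercase}


def guess_caesar_shift(ciphertext):
    """Guess a Caesar shift, assuming the most common ciphertext letter encrypts 'e'."""
    freqs = letter_frequencies(ciphertext)
    most_common = max(freqs, key=freqs.get)
    return (ord(most_common) - ord("e")) % 26


def caesar_shift(text, shift):
    """Shift every letter by `shift` positions; a negative shift decrypts."""
    out = []
    for c in text:
        if c in string.ascii_lowercase:
            out.append(chr((ord(c) - ord("a") + shift) % 26 + ord("a")))
        elif c in string.ascii_uppercase:
            out.append(chr((ord(c) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(c)  # leave spaces, digits, and punctuation unchanged
    return "".join(out)


if __name__ == "__main__":
    ciphertext = caesar_shift("meet me here where we keep the secret", 3)
    shift = guess_caesar_shift(ciphertext)
    print(shift, caesar_shift(ciphertext, -shift))
```

On short or unusual texts the most common letter may not be e, so a more robust approach would compare the whole observed distribution against the expected one for every candidate shift.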

51 questions
25 votes, 5 answers

Developing algorithm for detecting plain text via frequency analysis

I'm currently attempting the Matasano Crypto Challenges as a basic intro to cryptography. For solving some of the earlier challenges I utilised n-grams to determine which is going to be the most likely English plain text. It has been quite…
CryptoNoob • 253 • 1 • 3 • 6
17 votes, 5 answers

Examples of frauds discovered because someone tried to mimic a random sequence

[Moderator note: this question now lives there] So, I'm preparing a talk about the well known fact that humans are bad at the task of generating uniformly random sequences of numbers when asked to do so, which is a huge flaw for simple cryptographic…
10 votes, 2 answers

Why are the ciphertexts of Ansible Vault's AES256-encrypted files disproportionately composed of '3' and '6'?

I was encrypting some Ansible secrets this morning and noticed that the ciphertexts seemed to have a lot of 3s and 6s in them. I did some frequency counts and found that yes, in fact, about 40% of the digits are 3s, and more than 20%…
David Moles • 213 • 1 • 7
8 votes, 3 answers

How to perform frequency analysis of a substitution cipher using a Base64 alphabet

Let's imagine a cipher that works like the following: Plaintext is encoded to Base64. The characters in the encoded plaintext are substituted with a randomly shuffled character set(A-z, 0-9, -, _, =). The shuffled alphabet is, in essence, the…
6 votes, 3 answers

How can frequency analysis be applied to modern ciphers?

I am building a computer program that deciphers Caesar, Vigenere and monoalphabetic substitution ciphers. All of those are susceptible to frequency analysis. However, it does not seem to be real-world applicable considering the complexity of the…
6 votes, 2 answers

How to break homophonic substitutions and nomenclators with too many symbols?

Early attempts to thwart frequency analysis attacks on ciphers involved using homophonic substitutions, i.e., some letters map to more than one ciphertext symbol. The earliest known example of this, from 1401, is shown below: [Source: “Quadibloc”…
5 votes, 2 answers

Will compression help defeat single letter frequency attack against a mono alphabetic substitution cipher?

Alice has a long message to send. She is using a monoalphabetic substitution cipher. She thinks that compressing the message may protect the text from a single-letter frequency attack by Eve. Does the compression help? Should she compress the…
5 votes, 2 answers

Is there a practical security difference between OTP with letters and OTP with numbers?

Is there a practical security difference between “OTP with letters” and “OTP with numbers”? If, for example, I encrypt a letter message using Tabula Recta with a random letter key, I would get: THISI SATES TXAAA - message MOSSK VFXNN EJRQW - OTP…
B37a4good • 53 • 4
5 votes, 5 answers

If classical ciphers are used with compressed plaintext, how much does it make frequency analysis attack harder?

Classical ciphers, such as the Vigenère cipher, are weak and no longer used. They can be broken by using frequency analysis, which is a well-known fact. However, frequency analysis often depends on the number of captured ciphertexts and/or their…
5 votes, 2 answers

Strategy to crack a presumed substitution cipher

The ciphertext given is: ejitp spawa qleji taiul rtwll rflrl laoat wsqqj atgac kthls iraoa twlpl qjatw jufrh lhuts qataq itats aittk stqfj cae I've done frequency analysis on the text. That yielded the following results: A(15), B(0), C(2), D(0),…
5 votes, 3 answers

How feasible is word-level frequency analysis over English (or any language)?

Say I have some black box which, given any English word, deterministically outputs a token for that word. Assume our black box is implemented using strong cryptography, i.e. the hardness of reversing a token to its word is reducible to some known…
pg1989 • 4,736 • 25 • 43
4 votes, 2 answers

Substitution ciphers amended with cipher block chaining: susceptible to frequency analysis?

I have been studying ways to amend a simple substitution cipher, and one of the toy suggestions was to use CBC in the following way: identify each letter with a number from $0\ldots 25$; start with a random IV, i.e. just a random letter; add IV to…
4 votes, 1 answer

How to break a Quagmire 3 cipher?

What would be a good way to go about attacking a Quagmire 3 cipher? I understand that it is polyalphabetic with a key and an indicator. I have started by taking the cipher text, taking every n-th letter and putting that in a string, computing the…
4 votes, 1 answer

Is frequency analysis the only attack on a simple substitution cipher?

I understand that "anagramming" is often used after frequency analysis of a traditional substitution cipher, but anagramming requires prior frequency analysis to provide a starting point. Is there any technique other than frequency analysis for a…
Roddus • 181 • 5
4 votes, 1 answer

Is frequency analysis a viable attack on non-text data encoded by substitution?

Is frequency analysis a viable attack on non-text data encoded by substitution (eg image and audio formats encoded with a substitution cipher)?
Roddus • 181 • 5