The Vigenère Cipher: Complete Examples

This page works through a few complete examples. Each example uses Kasiski's method and the index of coincidence method to estimate the keyword length; a candidate keyword of that length is then constructed and used to decrypt the ciphertext. The procedure repeats until a meaningful plaintext is found. In this section the keyword length search is limited to the range 2 to 20. This restriction is purely artificial: with a computer program, we can search for any length.

Example 1

The following is a ciphertext to be analyzed.

DAZFI SFSPA VQLSN PXYSZ WXALC DAFGQ UISMT PHZGA MKTTF TCCFX
KFCRG GLPFE TZMMM ZOZDE ADWVZ WMWKV GQSOH QSVHP WFKLS LEASE
PWHMJ EGKPU RVSXJ XVBWV POSDE TEQTX OBZIK WCXLW NUOVJ MJCLL
OEOFA ZENVM JILOW ZEKAZ EJAQD ILSWW ESGUG KTZGQ ZVRMN WTQSE
OTKTK PBSTA MQVER MJEGL JQRTL GFJYG SPTZP GTACM OECBX SESCI
YGUFP KVILL TWDKS ZODFW FWEAA PQTFS TQIRG MPMEL RYELH QSVWB
AWMOS DELHM UZGPG YEKZU KWTAM ZJMLS EVJQT GLAWV OVVXH KWQIL
IEUYS ZWXAH HUSZO GMUZQ CIMVZ UVWIF JJHPW VXFSE TZEDF

The first task is estimating the keyword length. Kasiski's method found the following repeated strings and their positions.

Length  String  Positions  Distance
6       YSZWXA  17 353     336
4       HQSV    84 294     210
4       MJEG    103 215    112
4       OSDE    121 303    182
3       ETZ     59 389     330
3       HPW     88 382     294
3       AZE     154 168    14
3       TAM     208 322    114
3       SZO     264 362    98
3       ELH     292 306    14
3       MUZ     309 366    57
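
The search for repeated strings can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program that produced the table above: the function names `repeated_strings` and `distances` are hypothetical, and positions are counted from 0, which is the convention the table appears to use.

```python
from collections import defaultdict


def repeated_strings(text: str, length: int) -> dict[str, list[int]]:
    """Map every substring of the given length that occurs more than once
    to the list of its 0-based starting positions."""
    positions: dict[str, list[int]] = defaultdict(list)
    for i in range(len(text) - length + 1):
        positions[text[i:i + length]].append(i)
    return {s: p for s, p in positions.items() if len(p) > 1}


def distances(positions: list[int]) -> list[int]:
    """Distances between consecutive occurrences of one repeated string."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Letters 150-174 of the Example 1 ciphertext (first 25 letters of its fourth line).
fragment = "OEOFAZENVMJILOWZEKAZEJAQD"
for string, pos in repeated_strings(fragment, 3).items():
    # Add 150 to turn fragment offsets into positions in the full ciphertext.
    print(string, [p + 150 for p in pos], distances(pos))
```

On this fragment the sketch reports AZE at positions 154 and 168, a distance of 14. Running it on the whole ciphertext for lengths 3 and up would give the raw material for a table like the one above.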

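The following table shows the distinct distances and their factors in the range 2 to 20; an X marks a factor of the distance. The most common factors are 2, 3, 7 and 14. Since the factor 2 is unlikely to be the keyword length, we have three estimates: 3, 7 and 14. Factors 7 and 14 share the highest count among these, and every distance that is divisible by 7 is also divisible by 14, so 14 is the most likely of the three.

Distance  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
336       X  X  X  -  X  X  X  -  -  -  X  -  X  -  X  -  -  -  -
330       X  X  -  X  X  -  -  -  X  X  -  -  -  X  -  -  -  -  -
294       X  X  -  -  X  X  -  -  -  -  -  -  X  -  -  -  -  -  -
210       X  X  -  X  X  X  -  -  X  -  -  -  X  X  -  -  -  -  -
182       X  -  -  -  -  X  -  -  -  -  -  X  X  -  -  -  -  -  -
114       X  X  -  -  X  -  -  -  -  -  -  -  -  -  -  -  -  X  -
112       X  -  X  -  -  X  X  -  -  -  -  -  X  -  X  -  -  -  -
98        X  -  -  -  -  X  -  -  -  -  -  -  X  -  -  -  -  -  -
57        -  X  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  X  -
14        X  -  -  -  -  X  -  -  -  -  -  -  X  -  -  -  -  -  -
Total     9  6  2  2  5  7  2  0  2  1  1  1  7  2  2  0  0  2  0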
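
Counting factors by hand is error-prone, so it may help to see the tally as code. The following is a minimal sketch; the function name `factor_counts` and its `lo`/`hi` parameters are illustrative choices, and the list of distances is copied from the Distance column of the table.

```python
from collections import Counter


def factor_counts(distances: list[int], lo: int = 2, hi: int = 20) -> Counter:
    """Count, for each candidate length lo..hi, how many distances it divides."""
    counts: Counter = Counter()
    for d in distances:
        for f in range(lo, hi + 1):
            if d % f == 0:
                counts[f] += 1
    return counts


# Distinct distances between repeated strings in Example 1.
example1 = [336, 330, 294, 210, 182, 114, 112, 98, 57, 14]
counts = factor_counts(example1)
for f in range(2, 21):
    print(f, counts[f])
```

A `Counter` returns 0 for factors that divide no distance, so the loop can print every column of the Total row without special cases.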

The following table gives the average index of coincidence for each candidate keyword length (i.e., number of cosets). The highest value is 0.064378, at length 14. Hence, we have strong evidence that the keyword length is 14.

Length 1 2 3 4 5
Average 0.042614 0.044759 0.042969 0.042926 0.041610
Length 6 7 8 9 10
Average 0.044117 0.050827 0.043529 0.041019 0.044318
Length 11 12 13 14 15
Average 0.042662 0.041208 0.040273 0.064378 0.040722
Length 16 17 18 19 20
Average 0.042582 0.047256 0.041294 0.042000 0.043187
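
The averages in such a table can be computed along the following lines. This sketch assumes the usual definition of the index of coincidence, the sum of f(f-1) over all letters divided by N(N-1), where f is a letter count and N the coset length; the function names are hypothetical.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn without replacement are equal."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def average_ic(ciphertext: str, length: int) -> float:
    """Split the ciphertext into `length` cosets (every length-th letter)
    and average the index of coincidence over the cosets."""
    cosets = [ciphertext[i::length] for i in range(length)]
    return sum(index_of_coincidence(c) for c in cosets) / length


def most_likely_length(ciphertext: str, max_length: int = 20) -> int:
    """Keyword length in 1..max_length with the highest average IC."""
    return max(range(1, max_length + 1), key=lambda m: average_ic(ciphertext, m))
```

The ciphertext passed in is expected to contain letters only, with the spaces between the five-letter groups removed.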

The table below has the smallest χ² value of each coset and the corresponding letter:

A 0.556354
M 0.435045
B 0.738187
R 0.589010
O 1.061695
I 0.824027
S 1.322445
E 0.372434
T 1.188328
H 0.981700
O 1.096823
M 2.236836
A 0.726619
S 0.625189
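
The per-coset search can be sketched as below. Two things here are assumptions rather than facts about this page: the English letter frequencies are a commonly quoted approximate table, and χ² is computed on relative frequencies, which seems consistent with the magnitude of the values above. The names `chi_squared` and `rank_shifts` are hypothetical.

```python
# Approximate relative frequencies of A..Z in English text (an assumption;
# the page does not say which frequency table it used).
ENGLISH_FREQ = [
    0.08167, 0.01492, 0.02782, 0.04253, 0.12702, 0.02228, 0.02015,
    0.06094, 0.06966, 0.00153, 0.00772, 0.04025, 0.02406, 0.06749,
    0.07507, 0.01929, 0.00095, 0.05987, 0.06327, 0.09056, 0.02758,
    0.00978, 0.02360, 0.00150, 0.01974, 0.00074,
]


def chi_squared(coset: str, shift: int) -> float:
    """Chi-squared distance between English frequencies and the coset
    after undoing the given shift (0 = A, 1 = B, ...)."""
    counts = [0] * 26
    for c in coset:
        counts[(ord(c) - ord("A") - shift) % 26] += 1
    n = len(coset)
    return sum((counts[i] / n - ENGLISH_FREQ[i]) ** 2 / ENGLISH_FREQ[i]
               for i in range(26))


def rank_shifts(coset: str, top: int = 3) -> list[tuple[str, float]]:
    """Return the `top` key letters with the smallest chi-squared values."""
    scored = [(chr(s + ord("A")), chi_squared(coset, s)) for s in range(26)]
    return sorted(scored, key=lambda pair: pair[1])[:top]
```

Calling `rank_shifts(coset, 1)` on each coset in turn gives one key letter per position. Keeping the top few candidates instead of only the best one is useful when the first guess produces a slightly wrong keyword.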

Thus, the recovered keyword is AMBROISETHOMAS. The following is the decrypted text with spaces and punctuation added. We are lucky to decrypt it in one shot.

DO YOU KNOW THE LAND WHERE THE ORANGE TREE BLOSSOMS?
THE COUNTRY OF GOLDEN FRUITS AND MARVELOUS ROSES,
WHERE THE BREEZE IS SOFTER AND BIRDS LIGHTER,
WHERE BEES GATHER POLLEN IN EVERY SEASON,
AND WHERE SHINES AND SMILES, LIKE A GIFT FROM GOD,
AN ETERNAL SPRINGTIME UNDER AN EVER-BLUE SKY!
ALAS! BUT I CANNOT FOLLOW YOU
TO THAT HAPPY SHORE FROM WHICH FATE HAS EXILED ME!
THERE! IT IS THERE THAT I SHOULD LIKE TO LIVE,
TO LOVE, TO LOVE, AND TO DIE!
IT IS THERE THAT I SHOULD LIKE TO LIVE, IT IS THERE, YES, THERE!

This is part one of "Connais-tu le pays" from Ambroise Thomas' opera Mignon. In the first act, Mignon speaks to Wilhelm and Lothario, who rescued her, tells of her abduction, and describes her past in this beautiful and well-known aria. The above is an English translation from the French. "Connais-tu le pays" means "Do you know this country (or land)". ♦
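
The decryption step used in this example, subtracting the keyword letters from the ciphertext letters modulo 26, can be sketched as follows. The function name `vigenere_decrypt` is hypothetical, and the sketch assumes A-Z text with A = 0.

```python
from itertools import cycle


def vigenere_decrypt(ciphertext: str, keyword: str) -> str:
    """Decrypt A-Z ciphertext with the given keyword; non-letters are skipped."""
    letters = [c for c in ciphertext.upper() if c.isalpha()]
    plain = []
    for c, k in zip(letters, cycle(keyword.upper())):
        # Subtract the key letter from the ciphertext letter, modulo 26.
        plain.append(chr((ord(c) - ord(k)) % 26 + ord("A")))
    return "".join(plain)


# First 16 letters of the Example 1 ciphertext.
print(vigenere_decrypt("DAZFI SFSPA VQLSN P", "AMBROISETHOMAS"))
```

The 16 ciphertext letters decrypt to DOYOUKNOWTHELAND, the opening words of the plaintext above.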

Example 2

The following is the ciphertext to be analyzed.

QRBAI UWYOK ILBRZ XTUWL EGXSN VDXWR XMHXY FCGMW WWSME LSXUZ
MKMFS BNZIF YEIEG RFZRX WKUFA XQEDX DTTHY NTBRJ LHTAI KOCZX
QHBND ZIGZG PXARJ EDYSJ NUMKI FLBTN HWISW NVLFM EGXAI AAWSL
FMHXR SGRIG HEQTU MLGLV BRSIL AEZSG XCMHT OWHFM LWMRK HPRFB
ELWGF RUGPB HNBEM KBNVW HHUEA KILBN BMLHK XUGML YQKHP RFBEL
EJYNV WSIJB GAXGO TPMXR TXFKI WUALB RGWIE GHWHG AMEWW LTAEL
NUMRE UWTBL SDPRL YVRET LEEDF ROBEQ UXTHX ZYOZB XLKAC KSOHN
VWXKS MAEPH IYQMM FSECH RFYPB BSQTX TPIWH GPXQD FWTAI KNNBX
SIYKE TXTLV BTMQA LAGHG OTPMX RTXTH XSFYG WMVKH LOIVU ALMLD
LTSYV WYNVW MQVXP XRVYA BLXDL XSMLW SUIOI IMELI SOYEB HPHNR
WTVUI AKEYG WIETG WWBVM VDUMA EPAUA KXWHK MAUPA MUKHQ PWKCX
EFXGW WSDDE OMLWL NKMWD FWTAM FAFEA MFZBN WIHYA LXRWK MAMIK
GNGHJ UAZHM HGUAL YSULA ELYHJ BZMSI LAILH WWYIK EWAHN PMLBN
NBVPJ XLBEF WRWGX KWIRH XWWGQ HRRXW IOMFY CZHZL VXNVI OYZCM
YDDEY IPWXT MMSHS VHHXZ YEWNV OAOEL SMLSW KXXFX STRVI HZLEF
JXDAS FIE

Kasiski's method found many repeating strings as shown in the table below.

Length  String     Positions  Distance
9       GOTPMXRTX  263 419    156
8       KHPRFBEL   194 242    48
5       DFWTA      389 569    180
4       KILB       9 225      216
4       TAIK       92 392     300
4       SILA       172 628    456
4       YNVW       252 456    204
4       GWIE       281 509    228
4       XTHX       331 427    96
4       HXZY       333 717    384
4       MAEP       355 523    168
3       LBR        11 278     267
3       EGX        20 140     120
3       MHX        31 151     120
3       WWS        40 554     514
3       MEL        43 486     443
3       ELS        44 728     684
3       MFS        52 364     312
3       IEG        62 283     221
3       RXW        68 677     609
3       GPX        109 385    276
3       NUM        120 300    180
3       WNV        134 722    588
3       LFM        137 149    12
3       LVB        168 408    240
3       LAE        174 618    444
3       MLW        189 477    288
3       MLW        477 561    84
3       NVW        217 253    36
3       NVW        253 349    96
3       NVW        349 457    108
3       LBN        227 647    420
3       UAL        276 444    168
3       UAL        444 612    168
3       WHG        287 383    96
3       AEL        297 619    322
3       TXT        378 405    27
3       TXT        405 426    21
3       NNB        396 649    253
3       YGW        433 508    75
3       SML        476 730    254
3       GWW        514 553    39
3       KMA        534 594    60
3       DDE        557 701    144
3       AMF        573 579    6
3       HZL        687 745    58

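Since there are potentially many repeating strings of length 3 (i.e., trigraphs), they can usually help determine the keyword length. The table below shows the distinct distances between repeating strings of length 3 and the factors of each distance; an X marks a factor. The most common factors are 2, 3, 4, 6 and 12, with 23, 25, 18, 19 and 18 occurrences, respectively. In this case, even if the factor 2 is ignored, the correct keyword length is not obvious. This is a common problem with Kasiski's method.

Distance  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
684       X  X  X  -  X  -  -  X  -  -  X  -  -  -  -  -  X  X  -
609       -  X  -  -  -  X  -  -  -  -  -  -  -  -  -  -  -  -  -
588       X  X  X  -  X  X  -  -  -  -  X  -  X  -  -  -  -  -  -
514       X  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
444       X  X  X  -  X  -  -  -  -  -  X  -  -  -  -  -  -  -  -
443       -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
420       X  X  X  X  X  X  -  -  X  -  X  -  X  X  -  -  -  -  X
322       X  -  -  -  -  X  -  -  -  -  -  -  X  -  -  -  -  -  -
312       X  X  X  -  X  -  X  -  -  -  X  X  -  -  -  -  -  -  -
288       X  X  X  -  X  -  X  X  -  -  X  -  -  -  X  -  X  -  -
276       X  X  X  -  X  -  -  -  -  -  X  -  -  -  -  -  -  -  -
267       -  X  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
254       X  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
253       -  -  -  -  -  -  -  -  -  X  -  -  -  -  -  -  -  -  -
240       X  X  X  X  X  -  X  -  X  -  X  -  -  X  X  -  -  -  X
221       -  -  -  -  -  -  -  -  -  -  -  X  -  -  -  X  -  -  -
180       X  X  X  X  X  -  -  X  X  -  X  -  -  X  -  -  X  -  X
168       X  X  X  -  X  X  X  -  -  -  X  -  X  -  -  -  -  -  -
144       X  X  X  -  X  -  X  X  -  -  X  -  -  -  X  -  X  -  -
120       X  X  X  X  X  -  X  -  X  -  X  -  -  X  -  -  -  -  X
108       X  X  X  -  X  -  -  X  -  -  X  -  -  -  -  -  X  -  -
96        X  X  X  -  X  -  X  -  -  -  X  -  -  -  X  -  -  -  -
84        X  X  X  -  X  X  -  -  -  -  X  -  X  -  -  -  -  -  -
75        -  X  -  X  -  -  -  -  -  -  -  -  -  X  -  -  -  -  -
60        X  X  X  X  X  -  -  -  X  -  X  -  -  X  -  -  -  -  X
58        X  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
39        -  X  -  -  -  -  -  -  -  -  -  X  -  -  -  -  -  -  -
36        X  X  X  -  X  -  -  X  -  -  X  -  -  -  -  -  X  -  -
27        -  X  -  -  -  -  -  X  -  -  -  -  -  -  -  -  -  -  -
21        -  X  -  -  -  X  -  -  -  -  -  -  -  -  -  -  -  -  -
12        X  X  X  -  X  -  -  -  -  -  X  -  -  -  -  -  -  -  -
6         X  X  -  -  X  -  -  -  -  -  -  -  -  -  -  -  -  -  -
Total    23 25 18  6 19  7  7  7  5  1 18  3  5  6  4  1  6  1  5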

The following table gives the average index of coincidence for each candidate keyword length (i.e., number of cosets). The highest value is 0.071725, at length 12. Hence, we have strong evidence that the keyword length is 12. Note that 12 does not have the highest count in Kasiski's method.

Length 1 2 3 4 5
Average 0.043760 0.044658 0.049472 0.050311 0.043309
Length 6 7 8 9 10
Average 0.058913 0.043981 0.049996 0.048465 0.045402
Length 11 12 13 14 15
Average 0.042631 0.071725 0.044905 0.045486 0.049118
Length 16 17 18 19 20
Average 0.050852 0.043347 0.057025 0.040518 0.048491

The following table has the smallest χ² value of each coset and the corresponding letter:

U 0.231760
N 0.317295
I 0.340955
T 0.529938
E 0.302274
D 0.395590
S 0.455630
T 0.393412
A 0.219040
T 0.354012
E 0.321425
S 0.404944

Thus, the recovered keyword is UNITEDSTATES. The following is the decrypted plaintext with spaces and punctuation added.

WE, THEREFORE, THE REPRESENTATIVES OF THE UNITED STATES OF AMERICA,
IN GENERAL CONGRESS, ASSEMBLED, APPEALING TO THE SUPREME JUDGE OF
THE WORLD FOR THE RECTITUDE OF OUR INTENTIONS, DO, IN THE NAME,
AND BY AUTHORITY OF THE GOOD PEOPLE OF THESE COLONIES, SOLEMNLY PUBLISH
AND DECLARE, THAT THESE UNITED COLONIES ARE, AND OF RIGHT OUGHT TO BE
FREE AND INDEPENDENT STATES, THAT THEY ARE ABSOLVED FROM ALL ALLEGIANCE
TO THE BRITISH CROWN, AND THAT ALL POLITICAL CONNECTION BETWEEN THEM AND
THE STATE OF GREAT BRITAIN, IS AND OUGHT TO BE TOTALLY DISSOLVED, AND THAT
AS FREE AND INDEPENDENT STATES, THEY HAVE FULL POWER TO LEVY WAR,
CONCLUDE PEACE, CONTRACT ALLIANCES, ESTABLISH COMMERCE, AND TO DO ALL
OTHER ACTS AND THINGS WHICH INDEPENDENT STATES MAY OF RIGHT DO. AND FOR
THE SUPPORT OF THIS DECLARATION, WITH A FIRM RELIANCE ON THE PROTECTION
OF DIVINE PROVIDENCE, WE MUTUALLY PLEDGE TO EACH OTHER OUR LIVES,
OUR FORTUNES AND OUR SACRED HONOR.

This is the last paragraph of the Declaration of Independence. ♦

Example 3

The following is the ciphertext to be analyzed.

LPROZ OOGRJ ZGFLV TUKMC WFDQM PZXIJ LVRWQ XEOSZ ZHTEK UYSCR
PTFCZ UHXIJ LPPTD CPRBY OSMGY TLEVD UAQMF IFMZV LVYTO QDLHX
LBPLL KYCQY ODRKS ACTEU XZEVO UAQMF OSDSU BKMBJ QEORF WFQCK
HKSOD INGJZ SHGVV LMSZD WWHFJ AVQGF NUWMW AOIXT CSRYC YPPTP
LFUCR AVQHR RVRES QCKHM LGARF YHXZP CSNWS NCRQV GHRLR ZEDHF
JVUPC XZJQC ATISQ SCGXN BALWY MAQCY OSD

The following table shows the count of each factor in the range 2 to 20. Kasiski's method suggests that the length of the unknown keyword may be 3, 5 or 15.

Factors 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Counts 5 5 2 5 3 1 0 2 3 0 0 0 1 5 0 1 0 1 0

The following table has the average index of coincidence for each length in the range 1 to 20. Only one of these 20 averages is larger than 0.06 (namely 0.06073, for length 15).

Length 1 2 3 4 5
Average 0.03849 0.03831 0.04325 0.04065 0.04496
Length 6 7 8 9 10
Average 0.04479 0.03877 0.04181 0.03915 0.04863
Length 11 12 13 14 15
Average 0.03506 0.04817 0.04186 0.03473 0.06073
Length 16 17 18 19 20
Average 0.04141 0.03725 0.03631 0.03906 0.04978
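
Combining the two estimates amounts to intersecting two sets of candidate lengths. The sketch below copies the averages from the table above and uses a cut-off of 0.06, the value this example compares the averages against; the variable names are illustrative, and the cut-off is a rule of thumb rather than a fixed constant.

```python
# Average index of coincidence per keyword length, copied from the table above.
average_ic = {
    1: 0.03849, 2: 0.03831, 3: 0.04325, 4: 0.04065, 5: 0.04496,
    6: 0.04479, 7: 0.03877, 8: 0.04181, 9: 0.03915, 10: 0.04863,
    11: 0.03506, 12: 0.04817, 13: 0.04186, 14: 0.03473, 15: 0.06073,
    16: 0.04141, 17: 0.03725, 18: 0.03631, 19: 0.03906, 20: 0.04978,
}

# Lengths suggested by the Kasiski factor counts for this ciphertext.
kasiski_candidates = {3, 5, 15}

# Cut-off for an average IC that looks like English (assumed rule of thumb).
THRESHOLD = 0.06

ic_candidates = {m for m, ic in average_ic.items() if ic > THRESHOLD}
agreed = sorted(kasiski_candidates & ic_candidates)
print(agreed)
```

If the intersection is empty, a reasonable fallback is to try the index of coincidence candidates first and the Kasiski candidates next.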

Since both methods agree on 15, we use this value as our estimate. The recovered keyword is THEILIADQFHOMER and the recovered plaintext is shown below:

SINGO GODBE SSTHE ANGER OFAAH ILLES SONOF PELCU STHAT BROUG
HTCMU NTLES SILLS UPOLT HEACH AEANS MANWA BRAVE SOULD IDIRS
ENDHU RRYIN GDOUN TOHAD ESAND MANWA HEROD IDITY IELBA PREYT
ODOGS ANDTU LTURE SFORS OWEPE THECO UNSEL SOFHO VEFUL FILLE
DFRMM THEDA YONWH ICHRH ESONO FATRE USKGN GOFME NANDG REARA
CHILL ESFIR STFCL LOUTW ITHON EANMT HER

You have perhaps already figured out what went wrong here: the keyword THE ILIAD QF HOMER does not look right (QF instead of OF). Consequently, the recovered plaintext reads a bit strangely. The wrong letters (for example, the B in GODBESS and the second A in AAHILLES) are the ones decrypted using Q instead of O.

SING, O GODBESS, THE ANGER OF AAHILLES SON OF PELCUS,
THAT BROUGHT CMUNTLESS ILLS UPOL THE ACHAEANS .....

The letter Q is the 9th letter of the keyword. The χ² method picks the shift Q for coset 9 because it produces the smallest χ² value. The following table shows the three smallest χ² values of this coset and their corresponding shifts:

Shift E O Q
χ² 3.62 3.32 2.82

Now, if we choose the second smallest χ² of coset 9, which corresponds to the shift of letter O, we are able to recover the plaintext completely.

SING, O GODDESS, THE ANGER OF ACHILLES SON OF PELEUS,
THAT BROUGHT COUNTLESS ILLS UPON THE ACHAEANS.
MANY A BRAVE SOUL DID IT SEND HURRYING DOWN TO HADES,
AND MANY A HERO DID IT YIELD A PREY TO DOGS AND VULTURES
FOR SO WERE THE COUNSELS OF JOVE FULFILLED FROM THE DAY
ON WHICH THE SON OF ATREUS, KING OF MEN AND GREAT ACHILLES,
FIRST FELL OUT WITH ONE ANOTHER.
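
When only one keyword letter is wrong, there is no need to decrypt the whole message again: only every 15th letter is affected. The following sketch re-shifts just those letters; the function name `fix_key_letter` and its parameters are hypothetical, and `index` is counted from 0, so the 9th keyword letter has index 8.

```python
def fix_key_letter(plaintext: str, key_len: int, index: int,
                   wrong: str, right: str) -> str:
    """Correct a plaintext that was decrypted with `wrong` instead of
    `right` at keyword position `index` (0-based)."""
    # Decrypting with `wrong` subtracted too much or too little; add the
    # difference back to every letter of the affected coset.
    delta = (ord(wrong) - ord(right)) % 26
    chars = list(plaintext)
    for i in range(index, len(chars), key_len):
        chars[i] = chr((ord(chars[i]) - ord("A") + delta) % 26 + ord("A"))
    return "".join(chars)


# Start of the plaintext recovered with the keyword THEILIADQFHOMER.
garbled = "SINGOGODBESSTHEANGEROFAAHILLESSONOFPELCUS"
print(fix_key_letter(garbled, key_len=15, index=8, wrong="Q", right="O"))
```

With Q replaced by O, GODBESS becomes GODDESS, AAHILLES becomes ACHILLES and PELCUS becomes PELEUS, while all other letters are left untouched.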

This is the opening of Book 1 of Homer's The Iliad. What we have learned from this example is that none of the methods discussed is perfect. We have to do some educated guesswork to crack a message. ♦

Example 4

The following is the ciphertext to be analyzed.

FIIFL VZOZS VPDCA ZVFSL EMRUL BQISC XVQTS NDMFT IDGIZ ILZDM
FFLVZ YMHCG DIGSL DSHEZ SIWMM XPNAN TIIRJ SFMWB XIDPS EWHAI
XYWQM EXVVV DMRUK XASPF OQTUP JLNTQ WTJYQ OLFOF EOVVW WTURX
DIGPT LLMFT INJYF OLKZU FXMVK CZISV AHDQQ VEVDM RTWIR MWYJI
GPRFO CFUWK ZYFUQ VGZZU KYLNT MXKZY SDEMW MMXPX SJUZK NAXQQ
ZVJSA ZICWN ERSIL BTUWJ HLUFI ZFNTQ GYMLO TARQJ MFLJL ISXMU
WUZPA VXUUD MVKNT MXUGL GZFPL BQFVZ HFQTI TSNQE XVSGR DSDLB
QBVVK YZOIF XNTQW LFZAX PFOCZ SHRJE ZQWJD CWQEU JYMYR FOUDQ
JIGFU ORFLU YAYJW MTMPC VCEFY ITNTU WYSFX AAUZI GEIZS GEQRK
OCFTF IGIYN IWGLQ FSJOY QBXYW XGEXS WBUZH KZYPA SI

The following factor counts from a Kasiski analysis show that the length of the keyword may be 3 or 6, and perhaps 12.

Factors 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Counts 21 19 10 4 19 2 6 5 4 0 10 2 2 4 3 0 5 0 2

The following table shows the average index of coincidence for each keyword length. The largest value is 0.06423 (length 12), followed by 0.06379 (length 6).

Length 1 2 3 4 5
Average 0.04142 0.04636 0.04888 0.04582 0.04116
Length 6 7 8 9 10
Average 0.06379 0.04195 0.04688 0.04729 0.04468
Length 11 12 13 14 15
Average 0.04005 0.06423 0.03853 0.04619 0.04735
Length 16 17 18 19 20
Average 0.05121 0.04218 0.06125 0.04229 0.04501

Therefore, the most likely keyword length is 6 because both methods reported this value, and the recovered keyword is SUMMER. What if we choose 12 from the IOC method? The recovered keyword is then SUMMERSUMMER. Both keywords deliver the same decrypted result. This should not surprise you, because SUMMERSUMMER simply repeats the actual keyword SUMMER twice, so decrypting with SUMMERSUMMER is exactly the same as decrypting with SUMMER. The following is the recovered plaintext.

NOWTH EHUNG RYLIO NROAR SANDT HEWOL FBEHO WLSTH EMOON WHILS
TTHEH EAVYP LOUGH MANSN ORESA LLWIT HWEAR YTASK FORDO NENOW
THEWA STEDB RANDS DOGLO WWHIL STTHE SCREE CHOWL SCREE CHING
LOUDP UTSTH EWRET CHTHA TLIES INWOE INREM EMBRA NCEOF ASHRO
UDNOW ITIST HETIM EOFNI GHTTH ATTHE GRAVE SALLG APING WIDEE
VERYO NELET SFORT HHISS PRITE INTHE CHURC HWAYP ATHST OGLID
EANDW EFAIR IESTH ATDOR UNBYT HETRI PLEHE CATES TEAMF ROMTH
EPRES ENCEO FTHES UNFOL LOWIN GDARK NESSL IKEAD REAMN OWARE
FROLI CNOTA MOUSE SHALL DISTU RBTHI SHALL OWDHO USEIA MSENT
WITHB ROOMB EFORE TOSWE EPTHE DUSTB EHIND THEDO OR
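
Because a keyword that repeats a shorter word decrypts identically, a recovered keyword can be reduced to its shortest repeating unit before it is reported. A minimal sketch, with the hypothetical helper name `shortest_period`:

```python
def shortest_period(keyword: str) -> str:
    """Return the shortest prefix that, repeated a whole number of times,
    reproduces the keyword."""
    n = len(keyword)
    for p in range(1, n + 1):
        if n % p == 0 and keyword[:p] * (n // p) == keyword:
            return keyword[:p]
    return keyword


print(shortest_period("SUMMERSUMMER"))
print(shortest_period("UNITEDSTATES"))
```

The first call reduces SUMMERSUMMER to SUMMER; the second leaves UNITEDSTATES unchanged because no shorter prefix repeats to form it.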

Adding punctuation and spaces back, we have the following:

Now the hungry lion roars,
And the wolf behowls the moon;
Whilst the heavy ploughman snores,
All with weary task fordone.
Now the wasted brands do glow,
Whilst the screech-owl, screeching loud,
Puts the wretch that lies in woe
In remembrance of a shroud.
Now it is the time of night
That the graves, all gaping wide,
Every one lets forth his sprite,
In the church-way paths to glide:
And we fairies, that do run
By the triple Hecate’s team,
From the presence of the sun,
Following darkness like a dream,
Now are frolic; not a mouse
Shall disturb this hallow’d house:
I am sent with broom before,
To sweep the dust behind the door.

This is what Puck says in Act V, Scene II of A Midsummer-Night's Dream by William Shakespeare. ♦