Burrow and Wheeler discovered in 1994 a clever but simple way to achieve very nice compression ratios when compressing blocks of data.
Basically, they proposes a very simple procedure to pre-process the blocks to be compressed returning a convenient word that enables simple compression methods to be used that give good compression rates.
Let us look at an example.
Imagine that you want to compress the word abracadabra.
Burrow and Wheeler propose to compute the set of all the possible
rotations (right to left rotations) of the word to be compressed. This
gives:
0 | abracadabra |
1 | bracadabraa |
2 | racadabraab |
3 | acadabraabr |
4 | cadabraabra |
5 | adabraabrac |
6 | dabraabraca |
7 | abraabracad |
8 | braabracada |
9 | raabracadab |
10 | aabracadabr |
Then they tell us to sort this list of strings and to look at the word composed by the letters of the last column. We get:
0 | 10 | aabracadabr |
1 | 7 | abraabracad |
2 | 0 | abracadabra |
3 | 3 | acadabraabr |
4 | 5 | adabraabrac |
5 | 8 | braabracada |
6 | 1 | bracadabraa |
7 | 4 | cadabraabra |
8 | 6 | dabraabraca |
9 | 9 | raabracadab |
10 | 2 | racadabraab |
and the resulting word is rdarcaaaabb.
Surprisingly enough this process tends to gather the same letters in the same part of the resulting word (and this then offers a very competitive compression ratio using very simple algorithms).
But more interesting is the following fact: there is an efficient way to recover the original word abacadabra by knowing only the word rdarcaaaabb and the position of the original word abacadabra in the rotation list that gave rdarcaaaabb, i.e. position 2, in this example.
Your task is compute the inverse transformation: given a preprocessed word and the position of the original word in the sorted rotation list write a program that computes the original word.
The input contains two lines. The first line is the preprocessed word. The second line contains the position (an integer) of the original word in the ordered rotation list.
The maximum size of a word to be compressed is 1000.
The ascii code of the input characters is exclusively included in the interval [33..126]. This constraint ensures that the given characters are human readable characters.
A single line with the original word.
rdarcaaaabb 2
abracadabra
This document was translated from LATEX by HEVEA.