I will come to the lexicographical part later. Let's take a look at this from the top down. There are three general scenarios to solve F(0,N1)F(0, N - 1):

1. S[0,N1]S[0, N - 1] is impossible to encode.

2. In the format x[encodedstring]x[encoded string], the complete string S[0,N1]S[0, N - 1] can be encoded.

3. S[0,N1]S[0, N - 1] can be encoded by concatenating the encoded strings of smaller sub-strings.

Cases 1 and 2 are rather basic; however, in example 3, we must loop over all split indexes and answer the sub-problems below:

F(0,0),F(1,N1);F(0,0), F(1, N - 1);

F(0,2),F(2,N1);F(0,2), F(2, N - 1);

...........

F(0,N2),F(N1,N1)F(0, N - 2), F(N - 1, N - 1)

In order to solve F(1,N1)F(1, N - 1) we need to solve F(1,2)F(1, 2), F(2,N1)F(2, N - 1), etc. We have overlapping sub-problems!

O(N^3) dynamic programming solution.

State definition:

dp[i][j]dp[i][j]: the shortest encoded string we can get from s[i,j]s[i, j];

Answer: dp[0][N1]dp[0][N - 1]

State transition:

Assign the original sub-string to dp[i][j]dp[i][j] for each possible length of sub-string. Then see if this entire sub-string can be encoded as x[encoded string] (more on that later); if so, we're done computing for this sub-string. Because no encoded concatenation of smaller sub-strings can match this. Otherwise, we'll have to loop through all of the split indices in ii, j1j - 1 to find the shortest encoded string.

So far, the discussion has been rather obvious, except for one point: how do we determine whether a string may be encoded as x[encodedstring]x[encoded string]? A string must be periodic if it can be encoded in this style. We wish to know how many times the shortest repeating sub-string appears in the original string. If we have aaaaaaaaaaaaaa, for example, we want to receive 8[a]8[a], not 4[aa]4[aa] or 2[aaaa]2[aaaa]. It turns out that there is a pretty pleasant way to achieve this objective:

For a given string SS, we append SS to SS to get SSSS, then we call SS.indexOf(S,1SS.indexOf(S, 1), meaning we try to find the starting index of the first sub-string SS in SSSS and we start our search from index 11, not 00. Because we just appended an entire copy of SS, so there is a guaranteed match. If the first matched index is >=S.length()>= S.length(), we know that S is not periodic thus can not be encoded for case 2. Otherwise, the length of the shortest repeating sub-string is this matched index idx and it repeats S.length()/idxS.length() / idx times. We then just need to construct this encoded string and make sure that it is shorter than the original string SS before using it to update our dp table.

Now, about the lexicographical part, it’s pretty obvious. Our first priority is shortening the length of the string. So, we need to handle lexicographical order only when the encoding gives us the same length as aaaaaaaa and 4[a]4[a]. And as numbers comes first in lexicographical order, so it’s better to encode in this scenario.

Statistics

75% Solution Ratio
mahdi.hasnatEarliest, Dec '21
TurinhstuFastest, 0.1s
mahdi.hasnatLightest, 8.3 MB
mahdi.hasnatShortest, 2139B
Toph uses cookies. By continuing you agree to our Cookie Policy.