Sumeet Singh February 2016

Find the minimal lexographical string formed by merging two strings

Suppose we are given two strings s1 and s2(both lowercase). We have two find the minimal lexographic string that can be formed by merging two strings.

At the beginning , it looks prettty simple as merge of the mergesort algorithm. But let us see what can go wrong.

s1: zyy

s2: zy

Now if we perform merge on these two we must decide which z to pick as they are equal, clearly if we pick z of s2 first then the string formed will be:

zyzyy

If we pick z of s1 first, the string formed will be:

zyyzy which is correct.

As we can see the merge of mergesort can lead to wrong answer.

Here's another example:

s1:zyy

s2:zyb

Now the correct answer will be zybzyy which will be got only if pick z of s2 first.

There are plenty of other cases in which the simple merge will fail. My question is Is there any standard algorithm out there used to perform merge for such output.

Answers


svs February 2016

You could use dynamic programming. In f[x][y] store the minimal lexicographical string such that you've taken x charecters from the first string s1 and y characters from the second s2. You can calculate f in bottom-top manner using the update:

f[x][y] = min(f[x-1][y] + s1[x], f[x][y-1] + s2[y]) \\ the '+' here represents
                                                    \\ the concatenation of a 
                                                    \\ string and a character

You start with f[0][0] = "" (empty string).

For efficiency you can store the strings in f as references. That is, you can store in f the objects

class StringRef {
    StringRef prev;
    char c;
}

To extract what string you have at certain f[x][y] you just follow the references. To udapate you point back to either f[x-1][y] or f[x][y-1] depending on what your update step says.


Serge Rogatch February 2016

It seems that the solution can be almost the same as you described (the "mergesort"-like approach), except that with special handling of equality. So long as the first characters of both strings are equal, you look ahead at the second character, 3rd, etc. If the end is reached for some string, consider the first character of the other string as the next character in the string for which the end is reached, etc. for the 2nd character, etc. If the ends for both strings are reached, then it doesn't matter from which string to take the first character. Note that this algorithm is O(N) because after a look-ahead on equal prefixes you know the whole look-ahead sequence (i.e. string prefix) to include, not just one first character.

EDIT: you look ahead so long as the current i-th characters from both strings are equal and alphabetically not larger than the first character in the current prefix.

Post Status

Asked in February 2016
Viewed 3,423 times
Voted 7
Answered 2 times

Search




Leave an answer