'Maximum String Mismatches in Java

I was recently going through a question in codehub and I was unable to solve this query. Can anyone help me how can this be solved?

You are given a string S of length N. You can select and reverse any substring of S of any length. You are allowed to perform this operation many number of times.

Determine maximum number of mismatches by performing operation.

Mismatch(S) is defined as number of corresponding positions were characters are different in S and reverse(S). For example : S = abab, reverse(S) = baba. Number of mismatches = 4. S= abca. Number of mismatches = 2.

Pseudo code :

static int solve( String S, int n)
{
//To do
}

Will be helpful if some one can explain once the code is done how this can be interpreted more easily and approached to solve.



Solution 1:[1]

I recently encountered the same question in one of the competency tests, I don't know about above solution, but my implementation below in python works for above problem

import itertools
def maximum_mismatches(s,n):
    if len(set(s)) == 1:
        return 0
    maxc = 0
    for str_c in set(itertools.permutations(s,n)):
        rev_str = str_c[::-1]
        counter = 0
        for i in range(n):
            if str_c[i] != rev_str[i]:
                counter += 1
        if maxc < counter:
            maxc = counter
    return maxc

I have tested for multiple test cases, it is working

Solution 2:[2]

Calculate the frequency of each character in the string and then take the difference of frequency of each character. See the code example in Java. It returns the max mismatch of characters of a string & it's reverse, given that a substring in the original string can be reversed any number of times.

Examples: 1. cantaaaa (answer = 6), 2. cbbaa (answer = 4), 3. aaaaabbbbccdd (answer = 12)

public static int maxMismatch(String str){
    if(str == null || str.length() <= 1) return 0;

    int freq[] = new int[26];

    for(int i = 0; i < str.length(); i++){
        freq[str.charAt(i) - 97]++;
    }

    int diff = 0;
    for(int i = 0; i < freq.length; i++){
        diff = Math.abs(freq[i] - diff);
    }

    return str.length() - diff;
}

Solution 3:[3]

import itertools as it
def mismatch(S,n):
    max =0
    for x in set(it.permutations(S[::-1],n)):
        cnt = 0
        for i in range(n):
            if x[i]!=S[i]:
                cnt+=1
        if cnt > max:
            max=cnt
    return max

Solution 4:[4]

#include <bits/stdc++.h>
using namespace std;
int max_mismatch(string s)
{
  unordered_map<char,int> fmap;
  for(auto &ch:s)
    fmap[ch]++;
  vector<int> tempV;
  for(auto &[_,f]:fmap)
     tempV.push_back(f);
  sort(tempV.begin(),tempV.end(),greater<int>());    
  vector<int> freq(tempV.size());
  int diff=0,left=0,right=freq.size()-1;
  bool leftinsert=true;
  for(auto f:tempV)
  {
      if(leftinsert)
          freq[left++]=f;
      else
         freq[right--]=f;
      leftinsert^=1;     
  }
  /*for(auto &val:freq)
     cout<<val<<" ";
  cout<<endl;  */
  left=0;right=freq.size()-1;
  for(int i=0;i<s.size()/2;i++)
  {
     if(left==right)
       break;
     if(!(--freq[left])) left++;
     if(!(--freq[right])) right--;
     diff+=2;
  }
  return diff;        
}
int main()
{
    string s="aatacanaa";
    cout<<max_mismatch(s);
}

Solution 5:[5]

First a function to calculate the mismatch score, using this spec:

Mismatch(S) is defined as number of corresponding positions were characters are different in S and reverse(S). For example : S = abab, reverse(S) = baba. Number of mismatches = 4. S= abca. Number of mismatches = 2.

We don't need to actually reverse the string. Just compare the first character with the last, then the second with the second last, etc. I find it straightforward to use two index variables for something like this:

int mismatches(String s) {
    int result = 0;
    int i = 0;
    int j = s.length() - 1;
    while (j >= 0) {
        if (s.charAt(i) != s.charAt(j)) result++;
        i++;
        j--;
    }
    return result;
}

And here are some tests to increase our confidence:

@Test
public void test_mismatches() {
    assertEquals(0, mismatches(""));
    assertEquals(0, mismatches("a"));
    assertEquals(0, mismatches("aa"));
    assertEquals(2, mismatches("ab"));
    assertEquals(0, mismatches("aaa"));
    assertEquals(2, mismatches("aab"));
    assertEquals(2, mismatches("abc"));
    assertEquals(0, mismatches("aaaa"));
    assertEquals(2, mismatches("aaab"));
    assertEquals(2, mismatches("aaba"));
    assertEquals(4, mismatches("abcd"));
}

Now let's find the max mismatch. Here is the spec:

You are given a string S of length N. You can select and reverse any substring of S of any length. You are allowed to perform this operation many number of times.

Determine maximum number of mismatches by performing operation.

This is unclear and has typos. It is lazily written and if it was client work, I would be asking for a clarification. I've been burned in the past by working from unclear specs, and building something that the client didn't intend.

I think the use of the word "reverse" is a misnomer. I think it is to help hint at a solution for calculating the mismatch. The spec should be:

You are given a string. Determine maximum number of mismatches from any substring. For clarity, the original string is a valid substring of itself.

With this revised spec, we can just use a nested for loop to traverse the search space:

int maxMismatches(String s) {
    int result = 0;
    for (int i = 0; i < s.length(); i++)
        for (int j = s.length(); j > i; j--) {
            int mismatch = mismatches(s.substring(i, j));
            if (mismatch > result)
                result = mismatch;
        }
    return result;
}

And some tests:

@Test
public void test_max_mismatches() {
    assertEquals(0, maxMismatches(""));
    assertEquals(0, maxMismatches("a"));
    assertEquals(0, maxMismatches("aa"));
    assertEquals(2, maxMismatches("ab"));
    assertEquals(2, maxMismatches("aba"));
    assertEquals(4, maxMismatches("aabbaa"));
    assertEquals(6, maxMismatches("aaabcda"));
}

There is a gap in this set of tests. Some of the test strings require truncating the start to find the max. Some require truncating the end, but none require truncating both the start and end.

Both functions have obvious optimisations but they reduce clarity in a scenario where I don't think performance is important.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Ashutosh Chamoli
Solution 3 Shruthi C
Solution 4 sumit kumar
Solution 5 Chris Foley