Sort a String array
Tag(s): Internationalization Varia


Sort utilities are part of the standard JDK (see java.util.Arrays).

Case sensitive

java.util.Arrays.sort(myArray);

Case insensitive

java.util.Arrays.sort(myArray, String.CASE_INSENSITIVE_ORDER);
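
To make the difference between the two calls concrete, here is a minimal sketch. The class name, the helper method names and the sample names are made up for illustration; only Arrays.sort and String.CASE_INSENSITIVE_ORDER come from the JDK.

```java
import java.util.Arrays;

class SortCaseDemo {

  // Returns a sorted copy using the natural (case-sensitive) String order.
  static String[] sortCaseSensitive(String[] input) {
    String[] copy = input.clone();
    Arrays.sort(copy);
    return copy;
  }

  // Returns a sorted copy that ignores upper/lower case differences.
  static String[] sortCaseInsensitive(String[] input) {
    String[] copy = input.clone();
    Arrays.sort(copy, String.CASE_INSENSITIVE_ORDER);
    return copy;
  }

  public static void main(String[] args) {
    String[] names = { "bob", "Alice", "carol", "Dave" }; // hypothetical sample data
    // prints [Alice, Dave, bob, carol] : uppercase letters sort before lowercase
    System.out.println(Arrays.toString(sortCaseSensitive(names)));
    // prints [Alice, bob, carol, Dave]
    System.out.println(Arrays.toString(sortCaseInsensitive(names)));
  }
}
```

Both helpers work on a copy, so the caller's array keeps its original order.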

Sort with international characters

Take the following example:

import java.util.*;
import java.io.*;

public class TestSort1 {

  static String[] words = { "Réal", "Real", "Raoul", "Rico" };

  public static void main(String[] args) throws Exception {
    try {
      Writer w = getWriter();
      w.write("Before :\n");

      for (String s : words) {
        w.write(s + " ");
      }

      java.util.Arrays.sort(words);

      w.write("\nAfter :\n");
      for (String s : words) {
        w.write(s + " ");
      }
      w.flush();
      w.close();
    }
    catch(Exception e){
      e.printStackTrace();
    }

  }


  // useful to output accented characters to the console
  public static Writer getWriter() throws UnsupportedEncodingException {
    if (System.console() == null) {
      Writer w =
        new BufferedWriter
         (new OutputStreamWriter(System.out, "Cp850"));
      return w;
    }
    else {
      return System.console().writer();
    }
  }
}
The output is:
Before :
Réal Real Raoul Rico
After :
Raoul Real Rico Réal
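which is wrong since we expect to find "Réal" right after "Real", before "Rico". Arrays.sort(words) relies on String.compareTo(), which compares character codes, and 'é' (U+00E9) has a higher code than any unaccented letter.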
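
The reason for this ordering can be checked directly. The sketch below is only an illustration (the class and the sortsAfter helper are hypothetical names); it shows that the natural String order compares character codes, so 'é' lands after 'i'. The accented letter is written as a \u00e9 escape so the result does not depend on the source file encoding.

```java
class CodeUnitOrderDemo {

  // True when a comes after b in the natural String order (String.compareTo).
  static boolean sortsAfter(String a, String b) {
    return a.compareTo(b) > 0;
  }

  public static void main(String[] args) {
    String real = "R\u00e9al"; // "Réal"
    System.out.println((int) '\u00e9');            // prints 233
    System.out.println((int) 'i');                 // prints 105
    System.out.println(sortsAfter(real, "Rico"));  // prints true
    System.out.println(sortsAfter(real, "Real"));  // prints true
  }
}
```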

To solve the problem, replace

java.util.Arrays.sort(words);
by
java.util.Arrays.sort(words, java.text.Collator.getInstance(java.util.Locale.FRENCH));
// or
// java.util.Arrays.sort(words, java.text.Collator.getInstance());
and the output will be:
Before :
Réal Real Raoul Rico
After :
Raoul Real Réal Rico
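
The same Collator can also be used when the words are held in a List instead of an array. This is a sketch under that assumption (Java 8 or later for List.sort); the class name and the sortedFor helper are hypothetical, not part of the JDK.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class CollatorListDemo {

  // Returns a new list sorted with the collation rules of the given locale.
  static List<String> sortedFor(Locale locale, List<String> input) {
    List<String> copy = new ArrayList<>(input);
    // Collator implements Comparator<Object>, so it can be passed to List.sort
    copy.sort(Collator.getInstance(locale));
    return copy;
  }

  public static void main(String[] args) {
    // "R\u00e9al" is "Réal", escaped to avoid source-encoding issues
    List<String> words = Arrays.asList("R\u00e9al", "Real", "Raoul", "Rico");
    // Expected order: Raoul, Real, Réal, Rico
    System.out.println(sortedFor(Locale.FRENCH, words));
  }
}
```

How the accented character is displayed still depends on the console encoding, as in the getWriter() method of the original example.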
Or you can do it the long way, with a hand-written sort that calls the Collator directly:
import java.util.Locale;
import java.text.Collator;

...
Locale loc = Locale.FRENCH;
sortArray(Collator.getInstance(loc), words);
...

public static void sortArray(Collator collator, String[] strArray) {
  // simple exchange sort: O(n^2) comparisons, fine for small arrays
  if (strArray.length <= 1) return;
  for (int i = 0; i < strArray.length; i++) {
    for (int j = i + 1; j < strArray.length; j++) {
      // swap when the collator says strArray[i] sorts after strArray[j]
      if (collator.compare(strArray[i], strArray[j]) > 0) {
        String tmp = strArray[i];
        strArray[i] = strArray[j];
        strArray[j] = tmp;
      }
    }
  }
}
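
As a quick sanity check, the hand-written loop can be compared with Arrays.sort using the same Collator. The sketch below repeats sortArray so that it is self-contained; the class name and the sameAsLibrarySort helper are hypothetical. The two results are only guaranteed to match when no two distinct words compare as equal under the collator, which holds for the sample words.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

class LongWayCheck {

  // Same exchange sort as the long way above.
  static void sortArray(Collator collator, String[] strArray) {
    for (int i = 0; i < strArray.length; i++) {
      for (int j = i + 1; j < strArray.length; j++) {
        if (collator.compare(strArray[i], strArray[j]) > 0) {
          String tmp = strArray[i];
          strArray[i] = strArray[j];
          strArray[j] = tmp;
        }
      }
    }
  }

  // Sorts two copies, one by hand and one with Arrays.sort, and compares them.
  static boolean sameAsLibrarySort(String[] words, Locale locale) {
    Collator collator = Collator.getInstance(locale);
    String[] manual = words.clone();
    String[] library = words.clone();
    sortArray(collator, manual);
    Arrays.sort(library, collator);
    return Arrays.equals(manual, library);
  }

  public static void main(String[] args) {
    String[] words = { "R\u00e9al", "Real", "Raoul", "Rico" }; // "Réal" escaped
    System.out.println(sameAsLibrarySort(words, Locale.FRENCH)); // prints true
  }
}
```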