Getting weird results from Java String code points on a Windows machine [duplicate]

length() returns the number of char values (UTF-16 code units), so a supplementary character stored as a surrogate pair counts as two. To count code points instead, use codePointCount(beginIndex, endIndex):

val1.codePointCount(0, val1.length())

See the following example:

String val1 = "\u5B66\uD8F0\uDE30";
System.out.println("character count: " + val1.length());
System.out.println("code points: " + val1.codePointCount(0, val1.length()));

Output:

character count: 3
code points: 2
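To see where the count of 3 comes from, it can help to look at each char individually. The following is a minimal sketch, not part of the original answer; the class name SurrogateInspect and the method describe are made up for illustration. It labels each UTF-16 code unit and then combines the surrogate pair with Character.toCodePoint:

```java
// Hypothetical helper class, for illustration only.
class SurrogateInspect {
    // Describes each UTF-16 code unit of s: index, hex value, and its role.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            String kind = Character.isHighSurrogate(c) ? "high surrogate"
                        : Character.isLowSurrogate(c) ? "low surrogate"
                        : "BMP character";
            sb.append(String.format("%d: U+%04X (%s)\n", i, (int) c, kind));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String val1 = "\u5B66\uD8F0\uDE30";
        System.out.print(describe(val1));
        // Combine the high and low surrogate into the code point they encode.
        int cp = Character.toCodePoint(val1.charAt(1), val1.charAt(2));
        System.out.println("combined: " + cp); // 311856, i.e. U+4C230
    }
}
```

The second and third char values are the two halves of one surrogate pair, which is why length() reports 3 while codePointCount reports 2.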

FYI, you cannot print a supplementary character from a String using charAt() either: it returns a single char (one UTF-16 code unit), which for a supplementary character is only one half of a surrogate pair. To print each code point of a String, including supplementary ones, use codePointAt together with offsetByCodePoints(index, codePointOffset), like this:

for (int i = 0; i < val1.codePointCount(0, val1.length()); i++) {
    // offsetByCodePoints(0, i) gives the char index where the i-th code point starts
    System.out.println("character at " + i + ": " + val1.codePointAt(val1.offsetByCodePoints(0, i)));
}

This gives:

character at 0: 23398
character at 1: 311856
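The loop above calls offsetByCodePoints(0, i) on every iteration, which rescans the String from the start each time. As an alternative sketch (my own variation, not the original answer's code; CodePointWalk and walk are hypothetical names), you can walk the String once and advance by Character.charCount(cp). It also converts each code point back to text with Character.toChars, so the character itself is shown rather than only its numeric value:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class CodePointWalk {
    // Returns one line per code point: index, hex value, and the character as text.
    static List<String> walk(String s) {
        List<String> out = new ArrayList<>();
        int index = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            String text = new String(Character.toChars(cp));
            out.add(String.format("%d: U+%04X %s", index++, cp, text));
            // Advance by 1 char for BMP code points, 2 for supplementary ones.
            i += Character.charCount(cp);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : walk("\u5B66\uD8F0\uDE30")) {
            System.out.println(line);
        }
    }
}
```

Whether the character at the end of each line renders correctly depends on the console's encoding and font, so the hex value is the more reliable thing to check.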

For Java 8 and later

You can use val1.codePoints(), which returns an IntStream of all code points in the sequence.

Since you are interested in the length of your String, use:

val1.codePoints().count();

To print the code points, use:

val1.codePoints().forEach(a -> System.out.println(a));
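Putting the two stream calls together, here is a small self-contained sketch. The class name CodePointStreams and its methods are assumptions made for illustration; only codePoints(), count(), and mapToObj() come from the standard library. It formats each code point in U+XXXX notation instead of printing the raw integer:

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only.
class CodePointStreams {
    // Number of code points in s; note that count() returns a long.
    static long count(String s) {
        return s.codePoints().count();
    }

    // Each code point of s formatted as U+XXXX.
    static List<String> hex(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String val1 = "\u5B66\uD8F0\uDE30";
        System.out.println(count(val1)); // 2
        System.out.println(hex(val1));   // [U+5B66, U+4C230]
    }
}
```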