The backslash “\” is the escape character in Java string literals. To put a literal backslash into a string, it must itself be escaped with a second backslash. For example, the string Yawin\tutor is written as “Yawin\\tutor” in Java source, while a lone “\” in a literal does not compile. Regular expressions also use the backslash as their escape character, so a pattern that consists of a single backslash is an incomplete escape sequence. The Java literal “\\” produces exactly that one-character pattern, and using it as a regular expression throws “java.util.regex.PatternSyntaxException: Unexpected internal error near index”.
Exception
The exception java.util.regex.PatternSyntaxException: Unexpected internal error near index 1 is reported with a stack trace like the one below.
Exception in thread "main" java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
\
 ^
	at java.util.regex.Pattern.error(Pattern.java:1955)
	at java.util.regex.Pattern.compile(Pattern.java:1702)
	at java.util.regex.Pattern.<init>(Pattern.java:1351)
	at java.util.regex.Pattern.compile(Pattern.java:1028)
	at java.lang.String.split(String.java:2380)
	at java.lang.String.split(String.java:2422)
	at com.yawintutor.StringSplit.main(StringSplit.java:9)
Root cause
The backslash is an escape character at two levels. The Java compiler first turns the literal “\\” into a string that contains a single backslash. The regular expression engine then reads that single backslash as the start of an escape sequence with nothing after it, and throws “java.util.regex.PatternSyntaxException: Unexpected internal error near index”.
To match one literal backslash, the regular expression itself must contain two backslashes. Each of them has to be escaped again in the Java string literal, so four backslashes “\\\\” are needed in the source code to match a single backslash in a String.
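The two levels of escaping can be made visible by checking how long each literal is at run time and whether it compiles as a pattern. This is a minimal sketch, not part of the original program: the class name EscapeLayers and the helper isValidRegex are hypothetical, and only standard java.util.regex calls are used.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class EscapeLayers {

    // Hypothetical helper: returns true if the string compiles as a regular expression.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String oneBackslash = "\\";      // Java literal with 2 backslashes: 1 character at run time
        String twoBackslashes = "\\\\";  // Java literal with 4 backslashes: 2 characters at run time

        System.out.println(oneBackslash.length());        // 1
        System.out.println(twoBackslashes.length());      // 2
        System.out.println(isValidRegex(oneBackslash));   // false: incomplete escape sequence
        System.out.println(isValidRegex(twoBackslashes)); // true: matches one literal backslash
    }
}
```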
How to reproduce this issue
The example below passes the literal “\\”, which is a single backslash at run time, to String.split() as the regular expression. This throws the “java.util.regex.PatternSyntaxException: Unexpected internal error near index” exception.
package com.yawintutor;

public class StringSplit {
    public static void main(String[] args) {
        String str = "Yawin\\tutor";  // run-time value: Yawin\tutor
        String str2 = "\\";           // run-time value: one backslash, an incomplete regex escape
        String[] strArray;

        // Throws PatternSyntaxException because str2 is not a valid regular expression
        strArray = str.split(str2);

        System.out.println("Given String : " + str);
        for (int i = 0; i < strArray.length; i++) {
            System.out.println("strArray[" + i + "] = " + strArray[i]);
        }
    }
}
Solution
Change the regular expression literal to four backslashes “\\\\” in the above program. This resolves the exception. The Java compiler converts the four backslashes into two, and the regular expression engine treats those two backslashes “\\” as one literal backslash when matching.
package com.yawintutor;

public class StringSplit {
    public static void main(String[] args) {
        String str = "Yawin\\tutor";  // run-time value: Yawin\tutor
        String str2 = "\\\\";         // run-time value: two backslashes, a regex for one literal backslash
        String[] strArray;

        // Splits the string at the literal backslash
        strArray = str.split(str2);

        System.out.println("Given String : " + str);
        for (int i = 0; i < strArray.length; i++) {
            System.out.println("strArray[" + i + "] = " + strArray[i]);
        }
    }
}
Output
Given String : Yawin\tutor
strArray[0] = Yawin
strArray[1] = tutor
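As an alternative to counting backslashes by hand, the standard method Pattern.quote() can build a pattern that matches its argument literally. The sketch below is an illustration under that assumption, not the approach used in the program above; the class name QuoteSplit and the helper splitOnBackslash are hypothetical.

```java
import java.util.regex.Pattern;

public class QuoteSplit {

    // Hypothetical helper: splits the input on a literal backslash.
    // Pattern.quote wraps the delimiter so the regex engine treats it as plain text.
    static String[] splitOnBackslash(String input) {
        return input.split(Pattern.quote("\\"));
    }

    public static void main(String[] args) {
        String[] parts = splitOnBackslash("Yawin\\tutor");
        for (String part : parts) {
            System.out.println(part);
        }
    }
}
```

The delimiter still has to be written as “\\” in the source because of the Java string literal rules, but the second level of escaping for the regular expression is handled by Pattern.quote().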