org.jikesrvm.classloader
Class UTF8Convert

java.lang.Object
  extended by org.jikesrvm.classloader.UTF8Convert

public abstract class UTF8Convert
extends Object

Abstract class containing routines that convert between Strings and utf8 or pseudo-utf8 byte sequences. Encodings longer than 3 bytes are not supported.

The difference between utf8 and pseudo-utf8 is the treatment of the null character (U+0000). In utf8 it is encoded directly as the single byte 0x00, whereas in pseudo-utf8 (the "modified UTF-8" of the JVM specification) it is encoded as the two-byte sequence 0xC0 0x80. See the JVM specification for more information.
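
The two treatments of the null character can be observed with JDK classes alone. The following is a minimal sketch that does not use UTF8Convert: the class and method names are hypothetical, and it relies on DataOutputStream.writeUTF, which writes modified UTF-8 preceded by a two-byte length that is stripped here.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of org.jikesrvm.classloader.
final class NullEncodingDemo {
  /** Standard utf8 bytes, as produced by the JDK charset encoder. */
  static byte[] standardUtf8(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  /** Modified (pseudo-)utf8 bytes: writeUTF output minus its 2-byte length prefix. */
  static byte[] modifiedUtf8(String s) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (DataOutputStream data = new DataOutputStream(out)) {
      data.writeUTF(s);
    } catch (IOException e) {
      // Not expected for an in-memory stream; rethrown unchecked for brevity.
      throw new UncheckedIOException(e);
    }
    byte[] withPrefix = out.toByteArray();
    return Arrays.copyOfRange(withPrefix, 2, withPrefix.length);
  }
}
```

For the one-character string containing U+0000, standardUtf8 yields a single zero byte while modifiedUtf8 yields the bytes 0xC0 0x80; for plain ASCII text the two results are identical.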


Nested Class Summary
private static class UTF8Convert.ByteArrayStringEncoderVisitor
          Visitor that builds up a char[] as characters are decoded
private static class UTF8Convert.ByteBufferStringEncoderVisitor
          Visitor that builds up a char[] as characters are decoded
private static class UTF8Convert.StringHashCodeVisitor
          Visitor that accumulates a hash code in the same form as String.hashCode as characters are decoded
private static class UTF8Convert.UTF8CharacterVisitor
          UTF8 character visitor abstraction
 
Field Summary
(package private) static boolean ALLOW_NORMAL_UTF8
          Set fromUTF8 to not throw an exception when given a normal utf8 byte array.
(package private) static boolean ALLOW_PSEUDO_UTF8
          Set fromUTF8 to not throw an exception when given a pseudo utf8 byte array.
(package private) static boolean STRICTLY_CHECK_FORMAT
          Strictly check the format of the utf8/pseudo-utf8 byte array in fromUTF8.
(package private) static boolean WRITE_PSEUDO_UTF8
          Set toUTF8 to write in pseudo-utf8 (rather than normal utf8).
 
Constructor Summary
UTF8Convert()
           
 
Method Summary
static boolean check(byte[] bytes)
          Check whether the given sequence of bytes is valid (pseudo-)utf8.
static int computeStringHashCode(byte[] utf8)
          Convert the given sequence of (pseudo-)utf8 formatted bytes into a String hashCode.
static String fromUTF8(byte[] utf8)
          Convert the given sequence of (pseudo-)utf8 formatted bytes into a String.
static String fromUTF8(ByteBuffer utf8)
          Convert the given sequence of (pseudo-)utf8 formatted bytes into a String.
private static void throwDataFormatException(String message, int location)
          Generate exception messages without bloating code
static byte[] toUTF8(String s)
          Convert the given String into a sequence of (pseudo-)utf8 formatted bytes.
static void toUTF8(String s, ByteBuffer b)
          Convert the given String into a sequence of (pseudo-)utf8 formatted bytes.
static int utfLength(String s)
          Returns the length of a string's UTF encoded form.
private static void visitUTF8(byte[] utf8, UTF8Convert.UTF8CharacterVisitor visitor)
          Visit all bytes of the given utf8 string, calling the visitor each time a character is decoded.
private static void visitUTF8(ByteBuffer utf8, UTF8Convert.UTF8CharacterVisitor visitor)
          Visit all bytes of the given utf8 string, calling the visitor each time a character is decoded.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

STRICTLY_CHECK_FORMAT

static final boolean STRICTLY_CHECK_FORMAT
Strictly check the format of the utf8/pseudo-utf8 byte array in fromUTF8.

See Also:
Constant Field Values

ALLOW_NORMAL_UTF8

static final boolean ALLOW_NORMAL_UTF8
Set fromUTF8 to not throw an exception when given a normal utf8 byte array.

See Also:
Constant Field Values

ALLOW_PSEUDO_UTF8

static final boolean ALLOW_PSEUDO_UTF8
Set fromUTF8 to not throw an exception when given a pseudo utf8 byte array.

See Also:
Constant Field Values

WRITE_PSEUDO_UTF8

static final boolean WRITE_PSEUDO_UTF8
Set toUTF8 to write in pseudo-utf8 (rather than normal utf8).

See Also:
Constant Field Values
Constructor Detail

UTF8Convert

public UTF8Convert()
Method Detail

fromUTF8

public static String fromUTF8(byte[] utf8)
                       throws UTFDataFormatException
Convert the given sequence of (pseudo-)utf8 formatted bytes into a String.

The acceptable input formats are controlled by the STRICTLY_CHECK_FORMAT, ALLOW_NORMAL_UTF8, and ALLOW_PSEUDO_UTF8 flags.

Parameters:
utf8 - (pseudo-)utf8 byte array
Returns:
unicode string
Throws:
UTFDataFormatException - if the byte array is not valid (pseudo-)utf8

fromUTF8

public static String fromUTF8(ByteBuffer utf8)
                       throws UTFDataFormatException
Convert the given sequence of (pseudo-)utf8 formatted bytes into a String.

The acceptable input formats are controlled by the STRICTLY_CHECK_FORMAT, ALLOW_NORMAL_UTF8, and ALLOW_PSEUDO_UTF8 flags.

Parameters:
utf8 - (pseudo-)utf8 byte buffer
Returns:
unicode string
Throws:
UTFDataFormatException - if the contents of the byte buffer are not valid (pseudo-)utf8

computeStringHashCode

public static int computeStringHashCode(byte[] utf8)
                                 throws UTFDataFormatException
Convert the given sequence of (pseudo-)utf8 formatted bytes into a String hashCode.

The acceptable input formats are controlled by the STRICTLY_CHECK_FORMAT, ALLOW_NORMAL_UTF8, and ALLOW_PSEUDO_UTF8 flags.

Parameters:
utf8 - (pseudo-)utf8 byte array
Returns:
the hash code that String.hashCode would return for the decoded String
Throws:
UTFDataFormatException - if the byte array is not valid (pseudo-)utf8
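
A hash code of this kind can be accumulated while decoding, without materializing a String. The following is a minimal sketch, not the Jikes RVM implementation: the class and method names are hypothetical, the input is assumed to be well formed with sequences of at most 3 bytes, and none of the format flags are modelled.

```java
// Hypothetical sketch; not part of org.jikesrvm.classloader.
final class Utf8HashSketch {
  /** Hashes (pseudo-)utf8 bytes the way String.hashCode would hash the decoded text. */
  static int hashOf(byte[] utf8) {
    int hash = 0;
    int i = 0;
    while (i < utf8.length) {
      int b0 = utf8[i] & 0xFF;
      char c;
      if (b0 < 0x80) {                    // 0xxxxxxx
        c = (char) b0;
        i += 1;
      } else if ((b0 & 0xE0) == 0xC0) {   // 110xxxxx 10xxxxxx
        c = (char) (((b0 & 0x1F) << 6) | (utf8[i + 1] & 0x3F));
        i += 2;
      } else {                            // 1110xxxx 10xxxxxx 10xxxxxx (assumed)
        c = (char) (((b0 & 0x0F) << 12)
            | ((utf8[i + 1] & 0x3F) << 6)
            | (utf8[i + 2] & 0x3F));
        i += 3;
      }
      hash = 31 * hash + c;               // same recurrence as String.hashCode
    }
    return hash;
  }
}
```

Because the recurrence matches the one documented for String.hashCode, the result for well-formed input should equal the hash code of the decoded String.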

throwDataFormatException

private static void throwDataFormatException(String message,
                                             int location)
                                      throws UTFDataFormatException
Generate exception messages without bloating code

Throws:
UTFDataFormatException

visitUTF8

private static void visitUTF8(byte[] utf8,
                              UTF8Convert.UTF8CharacterVisitor visitor)
                       throws UTFDataFormatException
Visit all bytes of the given utf8 string, calling the visitor each time a character is decoded.

The acceptable input formats are controlled by the STRICTLY_CHECK_FORMAT, ALLOW_NORMAL_UTF8, and ALLOW_PSEUDO_UTF8 flags.

Parameters:
utf8 - (pseudo-)utf8 byte array
visitor - called when characters are decoded
Throws:
UTFDataFormatException - if the byte array is not valid (pseudo-)utf8

visitUTF8

private static void visitUTF8(ByteBuffer utf8,
                              UTF8Convert.UTF8CharacterVisitor visitor)
                       throws UTFDataFormatException
Visit all bytes of the given utf8 string, calling the visitor each time a character is decoded.

The acceptable input formats are controlled by the STRICTLY_CHECK_FORMAT, ALLOW_NORMAL_UTF8, and ALLOW_PSEUDO_UTF8 flags.

Parameters:
utf8 - (pseudo-)utf8 byte buffer
visitor - called when characters are decoded
Throws:
UTFDataFormatException - if the contents of the byte buffer are not valid (pseudo-)utf8
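
The visitor arrangement described for visitUTF8 might look roughly like the sketch below. Since UTF8CharacterVisitor is private and its members are not documented here, the CharVisitor interface, the class name, and all method names are hypothetical stand-ins. The input is assumed to be well formed, and no validation or flag handling is shown.

```java
// Hypothetical sketch; not part of org.jikesrvm.classloader.
final class Utf8VisitorSketch {
  /** Stand-in for the private UTF8CharacterVisitor abstraction. */
  interface CharVisitor {
    void visit(char c);
  }

  /** Decodes 1- to 3-byte sequences and reports each decoded char to the visitor. */
  static void visit(byte[] utf8, CharVisitor visitor) {
    int i = 0;
    while (i < utf8.length) {
      int b0 = utf8[i++] & 0xFF;
      if (b0 < 0x80) {
        visitor.visit((char) b0);
      } else if ((b0 & 0xE0) == 0xC0) {
        int b1 = utf8[i++] & 0x3F;
        visitor.visit((char) (((b0 & 0x1F) << 6) | b1));
      } else {
        int b1 = utf8[i++] & 0x3F;
        int b2 = utf8[i++] & 0x3F;
        visitor.visit((char) (((b0 & 0x0F) << 12) | (b1 << 6) | b2));
      }
    }
  }

  /** One possible visitor: collect the decoded chars into a String. */
  static String decode(byte[] utf8) {
    StringBuilder sb = new StringBuilder(utf8.length);
    visit(utf8, c -> sb.append(c));
    return sb.toString();
  }
}
```

Keeping the decoding loop separate from what is done with each character lets the same loop serve both string construction and hash computation.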

toUTF8

public static byte[] toUTF8(String s)
Convert the given String into a sequence of (pseudo-)utf8 formatted bytes.

The output format is controlled by the WRITE_PSEUDO_UTF8 flag.

Parameters:
s - String to convert
Returns:
array containing sequence of (pseudo-)utf8 formatted bytes
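
The effect of a pseudo-utf8 output flag can be sketched as follows. This is an illustrative encoder, not the Jikes RVM code: the class name, method name, and the writePseudoUtf8 parameter (standing in for the WRITE_PSEUDO_UTF8 constant) are assumptions. Each Java char is encoded independently in at most 3 bytes, so surrogate pairs are not combined into 4-byte sequences.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch; not part of org.jikesrvm.classloader.
final class Utf8EncodeSketch {
  static byte[] encode(String s, boolean writePseudoUtf8) {
    ByteArrayOutputStream out = new ByteArrayOutputStream(s.length());
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == 0 && writePseudoUtf8) {
        out.write(0xC0);                    // pseudo-utf8: null as two bytes
        out.write(0x80);
      } else if (c < 0x80) {
        out.write(c);                       // one byte, includes null in normal utf8
      } else if (c < 0x800) {
        out.write(0xC0 | (c >> 6));         // two bytes
        out.write(0x80 | (c & 0x3F));
      } else {
        out.write(0xE0 | (c >> 12));        // three bytes
        out.write(0x80 | ((c >> 6) & 0x3F));
        out.write(0x80 | (c & 0x3F));
      }
    }
    return out.toByteArray();
  }
}
```

For strings without surrogate pairs, encode(s, false) should agree with the JDK's standard UTF-8 encoder; the two modes differ only for the null character.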

toUTF8

public static void toUTF8(String s,
                          ByteBuffer b)
Convert the given String into a sequence of (pseudo-)utf8 formatted bytes.

The output format is controlled by the WRITE_PSEUDO_UTF8 flag.

Parameters:
s - String to convert
b - Byte buffer to hold result

utfLength

public static int utfLength(String s)
Returns the length of a string's UTF encoded form.
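
A length computation of this kind needs only one pass over the chars. The sketch below is an assumption about how such a routine could work, not the documented behaviour of utfLength: the names are hypothetical, and whether the real method counts the null character as one or two bytes is not stated here, so that choice is exposed as a parameter.

```java
// Hypothetical sketch; not part of org.jikesrvm.classloader.
final class Utf8LengthSketch {
  /** Number of bytes needed to encode s, with each char taking at most 3 bytes. */
  static int encodedLength(String s, boolean writePseudoUtf8) {
    int length = 0;
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == 0) {
        length += writePseudoUtf8 ? 2 : 1;  // null: two bytes only in pseudo-utf8
      } else if (c < 0x80) {
        length += 1;
      } else if (c < 0x800) {
        length += 2;
      } else {
        length += 3;
      }
    }
    return length;
  }
}
```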


check

public static boolean check(byte[] bytes)
Check whether the given sequence of bytes is valid (pseudo-)utf8.

Parameters:
bytes - byte array to check
Returns:
true iff the given sequence is valid (pseudo-)utf8.
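
A structural validity check of this kind can be sketched as below. It is illustrative only: the names are hypothetical, both the single-byte and the two-byte null encodings are accepted, overlong encodings are not rejected, and the interaction with STRICTLY_CHECK_FORMAT and the ALLOW_* flags is not modelled.

```java
// Hypothetical sketch; not part of org.jikesrvm.classloader.
final class Utf8CheckSketch {
  /** True if bytes consists solely of complete 1-, 2-, or 3-byte sequences. */
  static boolean isValid(byte[] bytes) {
    int i = 0;
    while (i < bytes.length) {
      int b0 = bytes[i] & 0xFF;
      int extra;                            // continuation bytes expected
      if (b0 < 0x80) {
        extra = 0;
      } else if ((b0 & 0xE0) == 0xC0) {
        extra = 1;
      } else if ((b0 & 0xF0) == 0xE0) {
        extra = 2;
      } else {
        return false;                       // stray continuation byte or 4+ byte lead
      }
      if (i + extra >= bytes.length) {
        return false;                       // sequence truncated at end of input
      }
      for (int k = 1; k <= extra; k++) {
        if ((bytes[i + k] & 0xC0) != 0x80) {
          return false;                     // continuation must look like 10xxxxxx
        }
      }
      i += extra + 1;
    }
    return true;
  }
}
```

Under these assumptions a 4-byte sequence such as the standard UTF-8 encoding of a supplementary character is reported as invalid, consistent with the 3-byte limit stated in the class description.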