public abstract class

CharsetEncoder

extends Object
java.lang.Object
   ↳ java.nio.charset.CharsetEncoder

Class Overview

Transforms a sequence of 16-bit Java characters to a byte sequence in some encoding.

The input character sequence is a CharBuffer and the output byte sequence is a ByteBuffer.

Use encode(CharBuffer) to encode an entire CharBuffer to a new ByteBuffer, or encode(CharBuffer, ByteBuffer, boolean) for more control. When using the latter method, the entire operation proceeds as follows:

  1. Invoke reset() to reset the encoder if this instance has been used before.
  2. Invoke encode with the endOfInput parameter set to false until additional input is not needed (as signaled by the return value). The input buffer must be filled and the output buffer must be flushed between invocations.

    The encode method converts as many characters as possible and stops only when the input buffer has been exhausted, the output buffer has been filled, or an error has occurred. It returns a CoderResult instance indicating the current state. The caller should then fill the input buffer, flush the output buffer, or recover from the error, and try again accordingly.

  3. Invoke encode for the last time with endOfInput set to true.
  4. Invoke flush(ByteBuffer) to flush remaining output.
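
The four steps above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the class name StepwiseEncode, the helper methods, the 8-byte output buffer, and the choice of UTF-8 are all assumptions made for the example. Input arrives as a series of String chunks, and any characters the encoder leaves unread are carried over into the next call.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class StepwiseEncode {

    /** Encodes the chunks, in order, as one UTF-8 byte sequence. */
    static byte[] encodeChunks(String... chunks) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ByteBuffer out = ByteBuffer.allocate(8); // tiny on purpose, to force OVERFLOW
        CharBuffer in = CharBuffer.allocate(0);

        encoder.reset();                                  // step 1
        for (String chunk : chunks) {                     // step 2
            // Refill the input, keeping any characters the encoder left unread
            // (for example a high surrogate whose partner is in the next chunk).
            in = CharBuffer.wrap(in.toString() + chunk);
            encodeAll(encoder, in, out, sink, false);
        }
        encodeAll(encoder, in, out, sink, true);          // step 3
        while (encoder.flush(out).isOverflow()) {         // step 4
            drain(out, sink);
        }
        drain(out, sink);
        return sink.toByteArray();
    }

    /** Calls encode until it reports UNDERFLOW, draining the output on OVERFLOW. */
    private static void encodeAll(CharsetEncoder encoder, CharBuffer in, ByteBuffer out,
                                  ByteArrayOutputStream sink, boolean endOfInput) {
        while (true) {
            CoderResult result = encoder.encode(in, out, endOfInput);
            if (result.isUnderflow()) {
                return;
            } else if (result.isOverflow()) {
                drain(out, sink);
            } else {
                // Malformed or unmappable input; this sketch simply gives up.
                throw new IllegalArgumentException("Cannot encode input: " + result);
            }
        }
    }

    /** Copies the bytes written so far into the sink and empties the buffer. */
    private static void drain(ByteBuffer out, ByteArrayOutputStream sink) {
        out.flip();
        sink.write(out.array(), out.position(), out.remaining());
        out.clear();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeChunks("h\u00e9llo ", "w\u00f6rld");
        System.out.println(bytes.length + " bytes");
    }
}
```

The final encode call with endOfInput set to true is made even when no characters remain, so that the encoder reaches the state in which flush is permitted.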

There are two classes of encoding error: malformed input signifies that the input character sequence is not legal, while unmappable character signifies that the input is legal but cannot be mapped to a byte sequence (because the charset cannot represent the character, for example).

Errors can be handled in three ways. The default is to report the error to the caller. The alternatives are to ignore the error or replace the problematic input with the byte sequence returned by replacement(). The disposition for each of the two kinds of error can be set independently using the onMalformedInput(CodingErrorAction) and onUnmappableCharacter(CodingErrorAction) methods.

The default replacement bytes depend on the charset but can be overridden using the replaceWith(byte[]) method.
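
As a rough illustration of the three dispositions, the sketch below encodes text to US-ASCII under a caller-chosen action. The class ErrorActions and the method encodeAscii are hypothetical names for this example; the result is decoded back to a String only to make it easy to read, and the checked CharacterCodingException is wrapped in an UncheckedIOException for brevity.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class ErrorActions {

    /**
     * Encodes text as US-ASCII using the given action for both kinds of error.
     * A null replacement keeps the encoder's default replacement bytes.
     */
    static String encodeAscii(String text, CodingErrorAction action, byte[] replacement) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        if (replacement != null) {
            encoder.replaceWith(replacement);
        }
        try {
            ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text));
            return StandardCharsets.US_ASCII.decode(bytes).toString();
        } catch (CharacterCodingException e) {
            // Only reachable when the action is REPORT.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // the last character has no US-ASCII mapping
        System.out.println(encodeAscii(text, CodingErrorAction.REPLACE, null));
        System.out.println(encodeAscii(text, CodingErrorAction.IGNORE, null));
        System.out.println(encodeAscii(text, CodingErrorAction.REPLACE, new byte[] {'_'}));
    }
}
```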

This class is abstract and encapsulates many common operations of the encoding process for all charsets. Encoders for a specific charset should extend this class and need only to implement the encodeLoop method for basic encoding. If a subclass maintains internal state, it should also override the implFlush and implReset methods.
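
A toy subclass may make the division of labour clearer. The sketch below is hypothetical: ChecksumAsciiEncoder encodes 7-bit characters one byte each and, on flush, appends a single footer byte holding the XOR of everything it wrote. To stay short it borrows the existing US-ASCII Charset object instead of defining its own Charset subclass, and it reports every character above 0x7F (including surrogates) as unmappable; a real encoder would do neither.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class ChecksumAsciiEncoder extends CharsetEncoder {

    private int checksum; // internal state, hence implFlush and implReset below

    public ChecksumAsciiEncoder() {
        // Assumption for brevity: reuse the US-ASCII charset object.
        super(StandardCharsets.US_ASCII, 1.0f, 1.0f);
    }

    @Override
    protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out) {
        while (in.hasRemaining()) {
            if (!out.hasRemaining()) {
                return CoderResult.OVERFLOW;
            }
            char c = in.get();
            if (c > 0x7F) {
                // Leave the position at the offending character, as the contract requires.
                in.position(in.position() - 1);
                return CoderResult.unmappableForLength(1);
            }
            out.put((byte) c);
            checksum ^= c;
        }
        return CoderResult.UNDERFLOW;
    }

    @Override
    protected CoderResult implFlush(ByteBuffer out) {
        if (!out.hasRemaining()) {
            return CoderResult.OVERFLOW; // caller must retry with free space
        }
        out.put((byte) checksum);
        return CoderResult.UNDERFLOW;
    }

    @Override
    protected void implReset() {
        checksum = 0;
    }

    /** Runs one complete reset/encode/flush cycle; the buffer leaves room for the footer. */
    static byte[] encodeWithFooter(CharsetEncoder encoder, String text) {
        ByteBuffer out = ByteBuffer.allocate(text.length() + 1);
        encoder.reset();
        CoderResult result = encoder.encode(CharBuffer.wrap(text), out, true);
        if (result.isError()) {
            throw new IllegalArgumentException("Cannot encode input: " + result);
        }
        encoder.flush(out);
        out.flip();
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }
}
```

Because the public encode, flush, and reset methods are final, the subclass only supplies the charset-specific pieces; the state checks and error-action handling stay in CharsetEncoder.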

This class is not thread-safe.

Summary

Protected Constructors
CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar)
Constructs a new CharsetEncoder using the given parameters and the replacement byte array { (byte) '?' }.
CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar, byte[] replacement)
Constructs a new CharsetEncoder using the given Charset, replacement byte array, average number and maximum number of bytes created by this encoder for one input character.
Public Methods
final float averageBytesPerChar()
Returns the average number of bytes created by this encoder for a single input character.
boolean canEncode(CharSequence sequence)
Tests whether the given CharSequence can be encoded by this encoder.
boolean canEncode(char c)
Tests whether the given character can be encoded by this encoder.
final Charset charset()
Returns the Charset which this encoder uses.
final ByteBuffer encode(CharBuffer in)
This is a facade method for the encoding operation.
final CoderResult encode(CharBuffer in, ByteBuffer out, boolean endOfInput)
Encodes characters starting at the current position of the given input buffer, and writes the equivalent byte sequence into the given output buffer from its current position.
final CoderResult flush(ByteBuffer out)
Flushes this encoder.
boolean isLegalReplacement(byte[] replacement)
Tests whether the given argument is legal as this encoder's replacement byte array.
CodingErrorAction malformedInputAction()
Returns this encoder's CodingErrorAction for malformed input errors that occur during the encoding process.
final float maxBytesPerChar()
Returns the maximum number of bytes which can be created by this encoder for one input character. The value is always positive.
final CharsetEncoder onMalformedInput(CodingErrorAction newAction)
Sets this encoder's action on malformed input error.
final CharsetEncoder onUnmappableCharacter(CodingErrorAction newAction)
Sets this encoder's action on unmappable character error.
final CharsetEncoder replaceWith(byte[] replacement)
Sets the new replacement value.
final byte[] replacement()
Returns the replacement byte array, which is never null or empty.
final CharsetEncoder reset()
Resets this encoder.
CodingErrorAction unmappableCharacterAction()
Returns this encoder's CodingErrorAction for unmappable character errors that occur during the encoding process.
Protected Methods
abstract CoderResult encodeLoop(CharBuffer in, ByteBuffer out)
Encodes characters into bytes.
CoderResult implFlush(ByteBuffer out)
Flushes this encoder.
void implOnMalformedInput(CodingErrorAction newAction)
Notifies that this encoder's CodingErrorAction specified for malformed input error has been changed.
void implOnUnmappableCharacter(CodingErrorAction newAction)
Notifies that this encoder's CodingErrorAction specified for unmappable character error has been changed.
void implReplaceWith(byte[] newReplacement)
Notifies that this encoder's replacement has been changed.
void implReset()
Resets this encoder's charset related state.
Inherited Methods
From class java.lang.Object

Protected Constructors

protected CharsetEncoder (Charset cs, float averageBytesPerChar, float maxBytesPerChar)

Added in API level 1

Constructs a new CharsetEncoder using the given parameters and the replacement byte array { (byte) '?' }.

Parameters
cs Charset
averageBytesPerChar float
maxBytesPerChar float

protected CharsetEncoder (Charset cs, float averageBytesPerChar, float maxBytesPerChar, byte[] replacement)

Added in API level 1

Constructs a new CharsetEncoder using the given Charset, replacement byte array, average number and maximum number of bytes created by this encoder for one input character.

Parameters
cs Charset: the Charset to be used by this encoder.
averageBytesPerChar float: average number of bytes created by this encoder for one single input character, must be positive.
maxBytesPerChar float: maximum number of bytes which can be created by this encoder for one single input character, must be positive.
replacement byte[]: the replacement byte array. It cannot be null or empty, its length cannot be larger than maxBytesPerChar, and it must be a legal replacement, which can be checked with isLegalReplacement.
Throws
IllegalArgumentException if any parameters are invalid.

Public Methods

public final float averageBytesPerChar ()

Added in API level 1

Returns the average number of bytes created by this encoder for a single input character.

Returns
float

public boolean canEncode (CharSequence sequence)

Added in API level 1

Tests whether the given CharSequence can be encoded by this encoder.

Note that this method may change the internal state of this encoder, so it should not be called when another encode process is ongoing, otherwise it will throw an IllegalStateException.

Parameters
sequence CharSequence
Returns
boolean
Throws
IllegalStateException if another encode process is ongoing.

public boolean canEncode (char c)

Added in API level 1

Tests whether the given character can be encoded by this encoder.

Note that this method may change the internal state of this encoder, so it should not be called when another encoding process is ongoing, otherwise it will throw an IllegalStateException.

Parameters
c char
Returns
boolean
Throws
IllegalStateException if another encode process is ongoing.
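
A short sketch of both canEncode overloads follows. The class CanEncodeCheck and its fits methods are hypothetical helpers; each call creates a fresh encoder so that no other encoding process can be in progress on it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CanEncodeCheck {

    /** Returns true if every character of text can be encoded in the charset. */
    static boolean fits(Charset charset, CharSequence text) {
        return charset.newEncoder().canEncode(text);
    }

    /** Returns true if the single character can be encoded in the charset. */
    static boolean fits(Charset charset, char c) {
        return charset.newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        System.out.println(fits(StandardCharsets.ISO_8859_1, "caf\u00e9"));
        System.out.println(fits(StandardCharsets.ISO_8859_1, "5 \u20ac"));
        System.out.println(fits(StandardCharsets.US_ASCII, '\u00e9'));
    }
}
```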

public final Charset charset ()

Added in API level 1

Returns the Charset which this encoder uses.

Returns
Charset

public final ByteBuffer encode (CharBuffer in)

Added in API level 1

This is a facade method for the encoding operation.

This method encodes the remaining character sequence of the given character buffer into a new byte buffer. It performs a complete encoding operation: it first resets this encoder, then encodes, and finally flushes.

This method should not be invoked if another encode operation is ongoing.

Parameters
in CharBuffer: the input buffer.
Returns
ByteBuffer a new ByteBuffer containing the bytes produced by this encoding operation. The buffer's position will be zero and its limit will be just past the last byte produced.
Throws
IllegalStateException if another encoding operation is ongoing.
MalformedInputException if an illegal input character sequence for this charset is encountered, and the action for malformed input error is CodingErrorAction.REPORT.
UnmappableCharacterException if a legal but unmappable input character sequence for this charset is encountered, and the action for unmappable character error is CodingErrorAction.REPORT. Unmappable means the Unicode character sequence at the input buffer's current position cannot be mapped to an equivalent byte sequence.
CharacterCodingException if another exception occurs during the encoding operation.
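
The following sketch shows the state of the buffers after a call to the one-argument encode method. FacadeEncode and toUtf8 are hypothetical names, UTF-8 is an arbitrary choice, and the checked exception is wrapped in an UncheckedIOException only to keep the example compact.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

public class FacadeEncode {

    /** Encodes the remaining characters of in as UTF-8 in a single call. */
    static ByteBuffer toUtf8(CharBuffer in) {
        try {
            return StandardCharsets.UTF_8.newEncoder().encode(in);
        } catch (CharacterCodingException e) {
            // With the default REPORT actions, bad input surfaces here.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CharBuffer in = CharBuffer.wrap("h\u00e9llo");
        ByteBuffer out = toUtf8(in);
        // The result is ready for reading: remaining() is the encoded length.
        System.out.println("position=" + out.position() + " limit=" + out.limit());
        System.out.println("input left: " + in.remaining());
    }
}
```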

public final CoderResult encode (CharBuffer in, ByteBuffer out, boolean endOfInput)

Added in API level 1

Encodes characters starting at the current position of the given input buffer, and writes the equivalent byte sequence into the given output buffer from its current position.

The buffers' positions will be changed by the reading and writing operations, but their limits and marks will be kept intact.

A CoderResult instance will be returned according to the following rules:

  • A malformed input result indicates that a malformed input error was encountered. The erroneous characters start at the input buffer's position, and their number is given by the result's length. This kind of result can be returned only if the malformed input action is CodingErrorAction.REPORT.
  • CoderResult.UNDERFLOW indicates that as many characters as possible in the input buffer have been encoded. If there is no further input and no characters are left in the input buffer, then this task is complete. Otherwise the client should call this method again, supplying more input characters.
  • CoderResult.OVERFLOW indicates that the output buffer has been filled while there are still some characters remaining in the input buffer. This method should be invoked again with a non-full output buffer.
  • An unmappable character result indicates that an unmappable character error was encountered. The erroneous characters start at the input buffer's position, and their number is given by the result's length. This kind of result can be returned only if the unmappable character action is CodingErrorAction.REPORT.

The endOfInput parameter indicates whether the invoker can provide further input. It should be true if and only if the characters in the current input buffer are all of the remaining input for this encoding operation. Note that it is common, and will not cause an error, for the invoker to pass false and then find that no more input is available. Passing true in several consecutive invocations, however, may cause an error, because any input remaining in the buffer is then treated as malformed input.

This method invokes the encodeLoop method to implement the basic encode logic for a specific charset.

Parameters
in CharBuffer: the input buffer.
out ByteBuffer: the output buffer.
endOfInput boolean: true if all the input characters have been provided.
Returns
CoderResult a CoderResult instance indicating the result.
Throws
IllegalStateException if the encoding operation has already started or no more input is needed in this encoding process.
CoderMalfunctionError if the encodeLoop method threw a BufferUnderflowException or BufferOverflowException.
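
To make the four kinds of result more concrete, the sketch below makes a single encode call with the default REPORT actions and describes what came back. InspectResult and describe are hypothetical names, US-ASCII is chosen only because unmappable characters are easy to produce with it, and the outCapacity parameter exists purely so that an overflow can be provoked.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class InspectResult {

    /** Makes one encode call and summarizes the returned CoderResult. */
    static String describe(String text, int outCapacity) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder(); // REPORT by default
        CharBuffer in = CharBuffer.wrap(text);
        ByteBuffer out = ByteBuffer.allocate(outCapacity);

        CoderResult result = encoder.encode(in, out, true);
        if (result.isUnmappable()) {
            // The input position marks the start of the offending characters.
            return "unmappable at " + in.position() + " length " + result.length();
        } else if (result.isMalformed()) {
            return "malformed at " + in.position() + " length " + result.length();
        } else if (result.isOverflow()) {
            return "overflow after " + in.position() + " chars";
        }
        return "underflow";
    }

    public static void main(String[] args) {
        System.out.println(describe("abc", 8));
        System.out.println(describe("abc", 2));
        System.out.println(describe("ab\u00e9c", 8));
    }
}
```

Note that length() may only be called on error results; calling it on UNDERFLOW or OVERFLOW throws UnsupportedOperationException.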

public final CoderResult flush (ByteBuffer out)

Added in API level 1

Flushes this encoder.

This method will call implFlush. Some encoders may need to write some bytes to the output buffer when they have read all input characters. Subclasses can override implFlush to perform any writes that are required at the end of the output sequence, such as footers and other metadata.

The maximum number of written bytes won't be larger than out.remaining(). If the encoder wants to write more bytes than the output buffer's available remaining space, then it will return CoderResult.OVERFLOW. This method must then be called again with a byte buffer that has free space.

If the encoder is asked to flush its output when its input is incomplete (because it ends with an unpaired surrogate, say), it may return a malformed input CoderResult.

In all other cases the encoder will return CoderResult.UNDERFLOW, which signifies that all the input so far has been successfully encoded.

During the flush, the output buffer's position will be changed accordingly, while its mark and limit will be intact.

This method is a no-op if the encoder has already been flushed.

Parameters
out ByteBuffer: the given output buffer.
Returns
CoderResult CoderResult.UNDERFLOW, CoderResult.OVERFLOW, or a malformed input result.
Throws
IllegalStateException if this encoder isn't already flushed or at end of input.

public boolean isLegalReplacement (byte[] replacement)

Added in API level 1

Tests whether the given argument is legal as this encoder's replacement byte array. The given byte array is legal if and only if it can be decoded into characters.

Parameters
replacement byte[]
Returns
boolean

public CodingErrorAction malformedInputAction ()

Added in API level 1

Returns this encoder's CodingErrorAction for malformed input errors that occur during the encoding process.

Returns
CodingErrorAction

public final float maxBytesPerChar ()

Added in API level 1

Returns the maximum number of bytes which can be created by this encoder for one input character. The value is always positive.

Returns
float
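
One common use of maxBytesPerChar is sizing an output buffer so that a complete encoding fits in a single pass. The sketch below assumes that worst-case bound; BufferSizing and encodeInOneCall are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class BufferSizing {

    /** Encodes text into a buffer sized from the encoder's worst-case estimate. */
    static ByteBuffer encodeInOneCall(CharsetEncoder encoder, String text) {
        int capacity = (int) Math.ceil(text.length() * (double) encoder.maxBytesPerChar());
        ByteBuffer out = ByteBuffer.allocate(capacity);

        encoder.reset();
        CoderResult result = encoder.encode(CharBuffer.wrap(text), out, true);
        if (result.isUnderflow()) {
            result = encoder.flush(out);
        }
        if (!result.isUnderflow()) {
            throw new IllegalArgumentException("Encoding failed: " + result);
        }
        out.flip(); // make the encoded bytes readable
        return out;
    }

    public static void main(String[] args) {
        CharsetEncoder utf8 = StandardCharsets.UTF_8.newEncoder();
        ByteBuffer out = encodeInOneCall(utf8, "h\u00e9llo \u20ac");
        System.out.println(out.remaining() + " of " + out.capacity() + " bytes used");
    }
}
```

averageBytesPerChar is the better starting estimate when the buffer can be grown or drained on OVERFLOW, since the worst case usually over-allocates.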

public final CharsetEncoder onMalformedInput (CodingErrorAction newAction)

Added in API level 1

Sets this encoder's action on malformed input error. This method will call the implOnMalformedInput method with the given new action as argument.

Parameters
newAction CodingErrorAction: the new action on malformed input error.
Returns
CharsetEncoder this encoder.
Throws
IllegalArgumentException if the given newAction is null.

public final CharsetEncoder onUnmappableCharacter (CodingErrorAction newAction)

Added in API level 1

Sets this encoder's action on unmappable character error. This method will call the implOnUnmappableCharacter method with the given new action as argument.

Parameters
newAction CodingErrorAction: the new action on unmappable character error.
Returns
CharsetEncoder this encoder.
Throws
IllegalArgumentException if the given newAction is null.

public final CharsetEncoder replaceWith (byte[] replacement)

Added in API level 1

Sets the new replacement value. This method first checks the given replacement's validity, then changes the replacement value and finally calls the implReplaceWith method with the given new replacement as argument.

Parameters
replacement byte[]: the replacement byte array. It cannot be null or empty, its length cannot be larger than maxBytesPerChar, and it must be a legal replacement, which can be checked by calling isLegalReplacement(byte[] replacement).
Returns
CharsetEncoder this encoder.
Throws
IllegalArgumentException if the given replacement cannot satisfy the requirement mentioned above.
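
The replacement requirements can be exercised with a small helper such as the one below. ReplacementCheck and trySetReplacement are hypothetical names, and the US-ASCII encoder is used only because its maxBytesPerChar of one byte makes the length limit easy to hit.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class ReplacementCheck {

    /** Attempts to install candidate as the replacement; returns false if it is rejected. */
    static boolean trySetReplacement(CharsetEncoder encoder, byte[] candidate) {
        try {
            encoder.replaceWith(candidate);
            return true;
        } catch (IllegalArgumentException e) {
            return false; // null, empty, too long, or not decodable in this charset
        }
    }

    public static void main(String[] args) {
        CharsetEncoder ascii = StandardCharsets.US_ASCII.newEncoder();
        System.out.println(ascii.isLegalReplacement(new byte[] {'_'}));
        System.out.println(ascii.isLegalReplacement(new byte[] {(byte) 0xFF}));
        System.out.println(trySetReplacement(ascii, new byte[] {'_'}));
        System.out.println(trySetReplacement(ascii, new byte[] {'_', '_'}));
    }
}
```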

public final byte[] replacement ()

Added in API level 1

Returns the replacement byte array, which is never null or empty.

Returns
byte[]

public final CharsetEncoder reset ()

Added in API level 1

Resets this encoder. This method resets the internal state and then calls implReset() to reset any state related to the specific charset.

Returns
CharsetEncoder

public CodingErrorAction unmappableCharacterAction ()

Added in API level 1

Returns this encoder's CodingErrorAction for unmappable character errors that occur during the encoding process.

Returns
CodingErrorAction

Protected Methods

protected abstract CoderResult encodeLoop (CharBuffer in, ByteBuffer out)

Added in API level 1

Encodes characters into bytes. This method is called by encode.

This method will implement the essential encoding operation, and it won't stop encoding until either all the input characters are read, the output buffer is filled, or some exception is encountered. Then it will return a CoderResult object indicating the result of the current encoding operation. The rule to construct the CoderResult is the same as for encode. When an exception is encountered in the encoding operation, most implementations of this method will return a relevant result object to the encode method, and subclasses may handle the exception and implement the error action themselves.

The buffers are scanned from their current positions, and their positions will be modified accordingly, while their marks and limits will be intact. At most in.remaining() characters will be read, and out.remaining() bytes will be written.

Note that some implementations may pre-scan the input buffer and return CoderResult.UNDERFLOW until they receive sufficient input.

Parameters
in CharBuffer: the input buffer.
out ByteBuffer: the output buffer.
Returns
CoderResult a CoderResult instance indicating the result.

protected CoderResult implFlush (ByteBuffer out)

Added in API level 1

Flushes this encoder. The default implementation does nothing and always returns CoderResult.UNDERFLOW; this method can be overridden if needed.

Parameters
out ByteBuffer: the output buffer.
Returns
CoderResult CoderResult.UNDERFLOW or CoderResult.OVERFLOW.

protected void implOnMalformedInput (CodingErrorAction newAction)

Added in API level 1

Notifies that this encoder's CodingErrorAction specified for malformed input error has been changed. The default implementation does nothing; this method can be overridden if needed.

Parameters
newAction CodingErrorAction: the new action.

protected void implOnUnmappableCharacter (CodingErrorAction newAction)

Added in API level 1

Notifies that this encoder's CodingErrorAction specified for unmappable character error has been changed. The default implementation does nothing; this method can be overridden if needed.

Parameters
newAction CodingErrorAction: the new action.

protected void implReplaceWith (byte[] newReplacement)

Added in API level 1

Notifies that this encoder's replacement has been changed. The default implementation does nothing; this method can be overridden if needed.

Parameters
newReplacement byte[]: the new replacement byte array.

protected void implReset ()

Added in API level 1

Resets this encoder's charset related state. The default implementation does nothing; this method can be overridden if needed.