public abstract class CharsetEncoder extends Object
The input character sequence is a CharBuffer and the output byte sequence is a ByteBuffer.
Use encode(CharBuffer) to encode an entire CharBuffer to a new ByteBuffer, or encode(CharBuffer, ByteBuffer, boolean) for more control. When using the latter method, the entire operation proceeds as follows:
1. Call reset() to reset the encoder if this instance has been used before.
2. Call encode with the endOfInput parameter set to false until additional input is not needed (as signaled by the return value). The input buffer must be refilled and the output buffer must be emptied between invocations. The encode method converts as many characters as possible and stops only when the input buffer has been exhausted, the output buffer has been filled, or an error has occurred. It returns a CoderResult instance indicating the current state. Depending on that result, the caller should refill the input buffer, empty the output buffer, or recover from the error, and then try again.
3. Call encode for the last time with endOfInput set to true.
4. Call flush(ByteBuffer) to flush remaining output.

There are two classes of encoding error: malformed input signifies that the input character sequence is not legal, while unmappable character signifies that the input is legal but cannot be mapped to a byte sequence (because the charset cannot represent the character, for example).
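The four-step sequence above can be sketched as follows. This is a minimal illustration, not a reference implementation: the class name ChunkedEncodeSketch, the helper names, and the small output buffer size are assumptions chosen to make CoderResult.OVERFLOW occur, and any error result is simply turned into an exception.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

public class ChunkedEncodeSketch {

    /** Encodes the chunks in order, emptying a small output buffer whenever it fills up. */
    public static byte[] encodeChunks(CharsetEncoder encoder, String... chunks) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Small on purpose so that OVERFLOW occurs, but roomy enough for any single character.
        int capacity = Math.max(8, 2 * (int) Math.ceil(encoder.maxBytesPerChar()));
        ByteBuffer out = ByteBuffer.allocate(capacity);

        encoder.reset();                                   // step 1
        CharBuffer in = CharBuffer.wrap("");
        for (String chunk : chunks) {
            // Refill the input, keeping characters the encoder left unread
            // (for example a high surrogate at the end of the previous chunk).
            in = CharBuffer.wrap(in.toString() + chunk);
            encodeStep(encoder, in, out, false, sink);     // step 2
        }
        encodeStep(encoder, in, out, true, sink);          // step 3
        while (encoder.flush(out).isOverflow()) {          // step 4
            drain(out, sink);
        }
        drain(out, sink);
        return sink.toByteArray();
    }

    /** Calls encode until it reports UNDERFLOW, emptying the output buffer on OVERFLOW. */
    private static void encodeStep(CharsetEncoder encoder, CharBuffer in, ByteBuffer out,
                                   boolean endOfInput, ByteArrayOutputStream sink) {
        while (true) {
            CoderResult result = encoder.encode(in, out, endOfInput);
            if (result.isUnderflow()) {
                return;
            }
            if (result.isOverflow()) {
                drain(out, sink);
                continue;
            }
            // Malformed input or unmappable character with the default REPORT action.
            throw new IllegalArgumentException("Cannot encode input: " + result);
        }
    }

    /** Copies the bytes written so far into the sink and makes the buffer writable again. */
    private static void drain(ByteBuffer out, ByteArrayOutputStream sink) {
        out.flip();
        sink.write(out.array(), out.arrayOffset() + out.position(), out.remaining());
        out.clear();
    }
}
```

A caller might pass StandardCharsets.UTF_8.newEncoder() together with a few string chunks; the returned array should match what a single one-shot encoding of the concatenated text produces.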
Errors can be handled in three ways. The default is to report the error to the caller. The alternatives are to ignore the error or replace the problematic input with the byte sequence returned by replacement(). The disposition for each of the two kinds of error can be set independently using the onMalformedInput(java.nio.charset.CodingErrorAction) and onUnmappableCharacter(java.nio.charset.CodingErrorAction) methods.
The default replacement bytes depend on the charset but can be overridden using the replaceWith(byte[]) method.
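The three dispositions can be compared with a short sketch. The class and method names (ErrorActionSketch, encodeAscii) and the '_' replacement byte are illustrative assumptions; US-ASCII is used only because it cannot represent accented characters.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class ErrorActionSketch {

    /**
     * Encodes text as US-ASCII using the given action for both kinds of error,
     * then decodes the bytes again so the effect is easy to read.
     */
    public static String encodeAscii(String text, CodingErrorAction action) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        if (action == CodingErrorAction.REPLACE) {
            // Override the charset's default replacement bytes.
            encoder.replaceWith(new byte[] { (byte) '_' });
        }
        try {
            ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text));
            return StandardCharsets.US_ASCII.decode(bytes).toString();
        } catch (CharacterCodingException e) {
            // Only reachable when the action is REPORT.
            return "reported: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";
        System.out.println(encodeAscii(text, CodingErrorAction.REPLACE)); // caf_
        System.out.println(encodeAscii(text, CodingErrorAction.IGNORE));  // caf
        System.out.println(encodeAscii(text, CodingErrorAction.REPORT));  // reported: UnmappableCharacterException
    }
}
```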
This class is abstract and encapsulates many common operations of the encoding process for all charsets. Encoders for a specific charset should extend this class and need only implement the encodeLoop method for basic encoding. If a subclass maintains internal state, it should also override the implFlush and implReset methods.
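As a rough sketch of what such a subclass can look like, the following defines a toy stateless charset that maps the characters U+0000 to U+007F to single bytes and reports everything else as unmappable. The class name SevenBitCharset and the charset name X-SEVEN-BIT-SKETCH are made up for illustration; a production encoder would also distinguish malformed surrogates from unmappable characters.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

public class SevenBitCharset extends Charset {

    public SevenBitCharset() {
        super("X-SEVEN-BIT-SKETCH", null); // hypothetical name, no aliases
    }

    @Override
    public boolean contains(Charset cs) {
        return cs instanceof SevenBitCharset;
    }

    @Override
    public CharsetEncoder newEncoder() {
        // One byte per character on average and at most; default replacement is '?'.
        return new CharsetEncoder(this, 1f, 1f) {
            @Override
            protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out) {
                while (in.hasRemaining()) {
                    if (!out.hasRemaining()) {
                        return CoderResult.OVERFLOW;
                    }
                    char c = in.get(in.position()); // peek without consuming
                    if (c > 0x7F) {
                        // Leave the position at the offending character.
                        return CoderResult.unmappableForLength(1);
                    }
                    in.get();
                    out.put((byte) c);
                }
                return CoderResult.UNDERFLOW;
            }
        };
    }

    @Override
    public CharsetDecoder newDecoder() {
        // A decoder is provided as well so the charset is complete.
        return new CharsetDecoder(this, 1f, 1f) {
            @Override
            protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
                while (in.hasRemaining()) {
                    if (!out.hasRemaining()) {
                        return CoderResult.OVERFLOW;
                    }
                    byte b = in.get(in.position());
                    if (b < 0) {
                        return CoderResult.malformedForLength(1);
                    }
                    in.get();
                    out.put((char) b);
                }
                return CoderResult.UNDERFLOW;
            }
        };
    }
}
```

Because this encoder keeps no state between calls, it does not need to override implFlush or implReset.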
This class is not thread-safe.
See also: Charset, CharsetDecoder
Modifier | Constructor and Description |
---|---|
protected | CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar): Constructs a new CharsetEncoder using the given parameters and the replacement byte array { (byte) '?' }. |
protected | CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar, byte[] replacement): Constructs a new CharsetEncoder using the given Charset, replacement byte array, average number and maximum number of bytes created by this encoder for one input character. |
Modifier and Type | Method and Description |
---|---|
float | averageBytesPerChar(): Returns the average number of bytes created by this encoder for a single input character. |
boolean | canEncode(char c): Checks if the given character can be encoded by this encoder. |
boolean | canEncode(CharSequence sequence): Checks if a given CharSequence can be encoded by this encoder. |
Charset | charset(): Returns the Charset which this encoder uses. |
ByteBuffer | encode(CharBuffer in): This is a facade method for the encoding operation. |
CoderResult | encode(CharBuffer in, ByteBuffer out, boolean endOfInput): Encodes characters starting at the current position of the given input buffer, and writes the equivalent byte sequence into the given output buffer from its current position. |
protected abstract CoderResult | encodeLoop(CharBuffer in, ByteBuffer out): Encodes characters into bytes. |
CoderResult | flush(ByteBuffer out): Flushes this encoder. |
protected CoderResult | implFlush(ByteBuffer out): Flushes this encoder. |
protected void | implOnMalformedInput(CodingErrorAction newAction): Notifies that this encoder's CodingErrorAction specified for malformed input error has been changed. |
protected void | implOnUnmappableCharacter(CodingErrorAction newAction): Notifies that this encoder's CodingErrorAction specified for unmappable character error has been changed. |
protected void | implReplaceWith(byte[] newReplacement): Notifies that this encoder's replacement has been changed. |
protected void | implReset(): Resets this encoder's charset related state. |
boolean | isLegalReplacement(byte[] replacement): Checks if the given argument is legal as this encoder's replacement byte array. |
CodingErrorAction | malformedInputAction(): Returns this encoder's CodingErrorAction when a malformed input error occurred during the encoding process. |
float | maxBytesPerChar(): Returns the maximum number of bytes which can be created by this encoder for one input character, must be positive. |
CharsetEncoder | onMalformedInput(CodingErrorAction newAction): Sets this encoder's action on malformed input error. |
CharsetEncoder | onUnmappableCharacter(CodingErrorAction newAction): Sets this encoder's action on unmappable character error. |
byte[] | replacement(): Returns the replacement byte array, which is never null or empty. |
CharsetEncoder | replaceWith(byte[] replacement): Sets the new replacement value. |
CharsetEncoder | reset(): Resets this encoder. |
CodingErrorAction | unmappableCharacterAction(): Returns this encoder's CodingErrorAction when unmappable character occurred during encoding process. |
protected CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar)
Constructs a new CharsetEncoder using the given parameters and the replacement byte array { (byte) '?' }.
protected CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar, byte[] replacement)
Constructs a new CharsetEncoder using the given Charset, replacement byte array, average number and maximum number of bytes created by this encoder for one input character.
cs - the Charset to be used by this encoder.
averageBytesPerChar - the average number of bytes created by this encoder for a single input character; must be positive.
maxBytesPerChar - the maximum number of bytes which can be created by this encoder for a single input character; must be positive.
replacement - the replacement byte array; it cannot be null or empty, its length cannot be larger than maxBytesPerChar, and it must be a legal replacement, as determined by isLegalReplacement.
IllegalArgumentException - if any parameters are invalid.
public final float averageBytesPerChar()
Returns the average number of bytes created by this encoder for a single input character.
public boolean canEncode(char c)
Checks if the given character can be encoded by this encoder.
Note that this method can change the internal status of this encoder, so it should not be called when another encoding process is ongoing; otherwise it will throw an IllegalStateException.
This method can be overridden for performance improvement.
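For illustration, both canEncode overloads can be exercised on a freshly created encoder, which avoids the IllegalStateException mentioned above. The class name CanEncodeSketch and its helper methods are assumptions made for this sketch.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class CanEncodeSketch {

    /** Checks a single character against a fresh US-ASCII encoder. */
    public static boolean asciiCanEncode(char c) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        return encoder.canEncode(c);
    }

    /** Checks a whole character sequence against a fresh US-ASCII encoder. */
    public static boolean asciiCanEncode(CharSequence text) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println(asciiCanEncode('A'));          // true
        System.out.println(asciiCanEncode('\u00e9'));     // false
        System.out.println(asciiCanEncode("plain text")); // true
        System.out.println(asciiCanEncode("na\u00efve")); // false
    }
}
```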
c - the given character.
IllegalStateException - if another encoding process is ongoing, so that the current internal status is neither RESET nor FLUSH.
public boolean canEncode(CharSequence sequence)
Checks if a given CharSequence can be encoded by this encoder.
Note that this method can change the internal status of this encoder, so it should not be called when another encoding process is ongoing; otherwise it will throw an IllegalStateException.
This method can be overridden for performance improvement.
sequence - the given CharSequence.
Returns true if the given CharSequence can be encoded by this encoder.
IllegalStateException - if the current internal status is neither RESET nor FLUSH.
public final ByteBuffer encode(CharBuffer in) throws CharacterCodingException
This method encodes the remaining character sequence of the given character buffer into a new byte buffer. It performs a complete encoding operation: it first resets the encoder, then encodes, and finally flushes.
This method should not be invoked if another encode operation is ongoing.
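A minimal sketch of the one-shot form follows. The class name OneShotEncodeSketch and the choice to rethrow the checked CharacterCodingException as an UncheckedIOException are assumptions made to keep the example short.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

public class OneShotEncodeSketch {

    /** Encodes the whole string as UTF-8 in a single call. */
    public static ByteBuffer encodeUtf8(String text) {
        try {
            // encode(CharBuffer) resets, encodes and flushes in one step.
            return StandardCharsets.UTF_8.newEncoder().encode(CharBuffer.wrap(text));
        } catch (CharacterCodingException e) {
            // Thrown for malformed or unmappable input under the default REPORT action.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteBuffer bytes = encodeUtf8("h\u00e9llo");
        // The accented character takes two bytes in UTF-8, so six bytes are ready to read.
        System.out.println(bytes.position() + " " + bytes.remaining()); // 0 6
    }
}
```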
in - the input buffer.
Returns a ByteBuffer containing the bytes produced by this encoding operation. The buffer's position will be zero and its limit will mark the end of the encoded bytes.
IllegalStateException - if another encoding operation is ongoing.
MalformedInputException - if an illegal input character sequence for this charset is encountered, and the action for malformed input errors is CodingErrorAction.REPORT.
UnmappableCharacterException - if a legal but unmappable input character sequence for this charset is encountered, and the action for unmappable character errors is CodingErrorAction.REPORT. Unmappable means the Unicode character sequence at the input buffer's current position cannot be mapped to an equivalent byte sequence.
CharacterCodingException - if another exception occurred during the encoding operation.
public final CoderResult encode(CharBuffer in, ByteBuffer out, boolean endOfInput)
Encodes characters starting at the current position of the given input buffer, and writes the equivalent byte sequence into the given output buffer from its current position.
The buffers' positions will be changed by the reading and writing operations, but their limits and marks will be kept intact.
A CoderResult instance will be returned according to the following rules:
- A malformed input result indicates that some malformed input error was encountered; the erroneous characters start at the input buffer's position and their number can be obtained from the result's length. This kind of result can be returned only if the malformed input action is CodingErrorAction.REPORT.
- CoderResult.UNDERFLOW indicates that as many characters as possible in the input buffer have been encoded. If there is no further input and no characters are left in the input buffer, then this task is complete. Otherwise, the client should call this method again, supplying some more input characters.
- CoderResult.OVERFLOW indicates that the output buffer has been filled while there are still some characters remaining in the input buffer. This method should be invoked again with a non-full output buffer.
- An unmappable character result indicates that some unmappable character error was encountered; the erroneous characters start at the input buffer's position and their number can be obtained from the result's length. This kind of result can be returned only if the unmappable character action is CodingErrorAction.REPORT.
The endOfInput parameter indicates whether the invoker can provide further input. This parameter should be true if and only if the characters in the current input buffer are all of the input for this encoding operation. Note that it is common, and won't cause an error, for the invoker to pass false and then have no more input available. Passing true in several consecutive invocations, however, may cause an error, because any remaining input will then be treated as malformed input.
This method invokes the encodeLoop method to implement the basic encode logic for a specific charset.
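The effect of endOfInput on incomplete input can be seen with a lone high surrogate, which is only the first half of a supplementary character. The sketch below assumes the standard UTF-8 encoder with the default REPORT action; the class and method names are illustrative.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class EndOfInputSketch {

    /**
     * Encodes a lone high surrogate twice with the same encoder: first with
     * endOfInput set to false, then with endOfInput set to true.
     */
    public static CoderResult[] encodeLoneHighSurrogate() {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.wrap("\uD83D"); // first half of a surrogate pair
        ByteBuffer out = ByteBuffer.allocate(16);

        // More input may follow, so the encoder leaves the character unread and asks for more.
        CoderResult whileStreaming = encoder.encode(in, out, false);
        // No more input will follow, so the leftover character is treated as malformed.
        CoderResult atEnd = encoder.encode(in, out, true);
        return new CoderResult[] { whileStreaming, atEnd };
    }

    public static void main(String[] args) {
        CoderResult[] results = encodeLoneHighSurrogate();
        System.out.println(results[0].isUnderflow()); // true
        System.out.println(results[1].isMalformed()); // true
    }
}
```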
in - the input buffer.
out - the output buffer.
endOfInput - true if all the input characters have been provided.
Returns a CoderResult instance indicating the result.
IllegalStateException - if the encoding operation has already started or no more input is needed in this encoding process.
CoderMalfunctionError - if the encodeLoop method threw a BufferUnderflowException or BufferOverflowException.
protected abstract CoderResult encodeLoop(CharBuffer in, ByteBuffer out)
Encodes characters into bytes. This method is called by encode.
This method implements the essential encoding operation, and it won't stop encoding until either all the input characters are read, the output buffer is filled, or some exception is encountered. Then it returns a CoderResult object indicating the result of the current encoding operation. The rules for constructing the CoderResult are the same as for encode. When an exception is encountered in the encoding operation, most implementations of this method will return a relevant result object to the encode method, while some performance-optimized implementations may handle the exception and implement the error action themselves.
The buffers are scanned from their current positions, and their positions will be modified accordingly, while their marks and limits will be left intact. At most in.remaining() characters will be read, and at most out.remaining() bytes will be written.
Note that some implementations may pre-scan the input buffer and return CoderResult.UNDERFLOW until they receive sufficient input.
in - the input buffer.
out - the output buffer.
Returns a CoderResult instance indicating the result.
public final CoderResult flush(ByteBuffer out)
Flushes this encoder.
This method will call implFlush. Some encoders may need to write some bytes to the output buffer when they have read all input characters; subclasses can override implFlush to perform the writing action.
The number of bytes written will not be larger than out.remaining(). If an encoder wants to write more bytes than the output buffer's remaining space, then CoderResult.OVERFLOW will be returned, and this method must be called again with a byte buffer that has free space. Otherwise this method will return CoderResult.UNDERFLOW, which means one encoding process has been completed successfully.
During the flush, the output buffer's position will be changed accordingly, while its mark and limit will be left intact.
out - the given output buffer.
Returns CoderResult.UNDERFLOW or CoderResult.OVERFLOW.
IllegalStateException - if this encoder hasn't read all input characters during one encoding process, that is, if this method is called neither after encode(CharBuffer) nor after encode(CharBuffer, ByteBuffer, boolean) with true for the last boolean parameter.
protected CoderResult implFlush(ByteBuffer out)
Flushes this encoder. The default implementation returns CoderResult.UNDERFLOW; this method can be overridden if needed.
out - the output buffer.
Returns CoderResult.UNDERFLOW or CoderResult.OVERFLOW.
protected void implOnMalformedInput(CodingErrorAction newAction)
Notifies that this encoder's CodingErrorAction specified for malformed input error has been changed. The default implementation does nothing; this method can be overridden if needed.
newAction - the new action.
protected void implOnUnmappableCharacter(CodingErrorAction newAction)
Notifies that this encoder's CodingErrorAction specified for unmappable character error has been changed. The default implementation does nothing; this method can be overridden if needed.
newAction - the new action.
protected void implReplaceWith(byte[] newReplacement)
Notifies that this encoder's replacement has been changed.
newReplacement - the new replacement byte array.
protected void implReset()
Resets this encoder's charset related state.
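public boolean isLegalReplacement(byte[] replacement)
Checks if the given argument is legal as this encoder's replacement byte array.
replacement - the given byte array to be checked.
public CodingErrorAction malformedInputAction()
Returns this encoder's CodingErrorAction when a malformed input error occurred during the encoding process.
public final float maxBytesPerChar()
Returns the maximum number of bytes which can be created by this encoder for one input character, must be positive.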
public final CharsetEncoder onMalformedInput(CodingErrorAction newAction)
Sets this encoder's action on malformed input error. This method will call the implOnMalformedInput method with the given new action as argument.
newAction - the new action on malformed input error.
IllegalArgumentException - if the given newAction is null.
public final CharsetEncoder onUnmappableCharacter(CodingErrorAction newAction)
Sets this encoder's action on unmappable character error. This method will call the implOnUnmappableCharacter method with the given new action as argument.
newAction - the new action on unmappable character error.
IllegalArgumentException - if the given newAction is null.
public final byte[] replacement()
Returns the replacement byte array, which is never null or empty.
public final CharsetEncoder replaceWith(byte[] replacement)
Sets the new replacement value. This method will call the implReplaceWith method with the given new replacement as argument.
replacement - the replacement byte array; it cannot be null or empty, its length cannot be larger than maxBytesPerChar, and it must be a legal replacement, as determined by calling isLegalReplacement(byte[] replacement).
IllegalArgumentException - if the given replacement does not satisfy the requirements mentioned above.
public final CharsetEncoder reset()
Resets this encoder. This method will call implReset() to reset any status related to the specific charset.
public CodingErrorAction unmappableCharacterAction()
Returns this encoder's CodingErrorAction when an unmappable character occurred during the encoding process.