Package org.jcodings.specific
Class GB18030Encoding
- java.lang.Object
  - org.jcodings.Encoding
    - org.jcodings.AbstractEncoding
      - org.jcodings.MultiByteEncoding
        - org.jcodings.specific.GB18030Encoding
- All Implemented Interfaces:
java.lang.Cloneable
public final class GB18030Encoding extends MultiByteEncoding
-
-
Nested Class Summary
Modifier and Type Class Description
private static class
GB18030Encoding.State
-
Field Summary
Modifier and Type Field Description
private static int
C1
private static int
C2
private static int
C4
private static int
CM
private static java.lang.String
GB18030
private static int[]
GB18030_MAP
private static int[][]
GB18030Trans
static GB18030Encoding
INSTANCE
-
Constructor Summary
Modifier Constructor Description
protected
GB18030Encoding()
-
Method Summary
Modifier and Type Method Description
int
codeToMbc(int code, byte[] bytes, int p)
Extracts a code point into its multibyte representation.
int
codeToMbcLength(int code)
Returns the character length for a given code point. Oniguruma equivalent: code_to_mbclen
int[]
ctypeCodeRange(int ctype, IntHolder sbOut)
Returns the code range for a given character type. Oniguruma equivalent: get_ctype_code_range
java.lang.String
getCharsetName()
The name of the equivalent Java Charset for this encoding.
boolean
isCodeCType(int code, int ctype)
Checks whether the given code is of the given character type (e.g. used by isWord(someByte) and similar methods).
boolean
isReverseMatchAllowed(byte[] bytes, int p, int end)
Returns true if it is safe to use the reverse Boyer-Moore search fail-fast algorithm. Oniguruma equivalent: is_allowed_reverse_match
int
leftAdjustCharHead(byte[] bytes, int start, int s, int end)
Seeks the previous character head in a stream. Oniguruma equivalent: left_adjust_char_head
int
length(byte[] bytes, int p, int end)
Returns the character length given a stream, a character position, and the stream end. Returns 1 for single-byte encodings; for multibyte encodings, performs sanity validations and returns the character length, or the missing characters in the stream otherwise.
private int
lengthForThreeUptoFour(byte[] bytes, int p, int end, int s)
private int
lengthForTwoUptoFour(byte[] bytes, int p, int end, int s)
int
mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] lower)
onigenc_ascii_mbc_case_fold
int
mbcToCode(byte[] bytes, int p, int end)
Returns the code point for a character. Oniguruma equivalent: mbc_to_code
-
Methods inherited from class org.jcodings.MultiByteEncoding
caseMap, isInRange, length, lengthForTwoUptoFour, mb2CodeToMbc, mb2CodeToMbcLength, mb2IsCodeCType, mb4CodeToMbc, mb4CodeToMbcLength, mb4IsCodeCType, mbnMbcCaseFold, mbnMbcToCode, missing, missing, safeLengthForUptoFour, safeLengthForUptoThree, safeLengthForUptoTwo, strCodeAt, strLength
-
Methods inherited from class org.jcodings.AbstractEncoding
applyAllCaseFold, asciiApplyAllCaseFold, asciiCaseFoldCodesByString, asciiMbcCaseFold, caseFoldCodesByString, isCodeCTypeInternal, isNewLine, propertyNameToCType
-
Methods inherited from class org.jcodings.Encoding
asciiToLower, asciiToUpper, digitVal, equals, getCharset, getIndex, getName, hashCode, isAlnum, isAlpha, isAscii, isAscii, isAsciiCompatible, isBlank, isCntrl, isDigit, isDummy, isFixedWidth, isGraph, isLower, isMbcAscii, isMbcCrnl, isMbcHead, isMbcWord, isNewLine, isPrint, isPunct, isSbWord, isSingleByte, isSpace, isUnicode, isUpper, isUTF8, isWord, isWordGraphPrint, isXDigit, load, load, maxLength, maxLengthDistance, mbcodeStartPosition, minLength, odigitVal, prevCharHead, rightAdjustCharHead, rightAdjustCharHeadWithPrev, setDummy, setName, setName, step, stepBack, strByteLengthNull, strLengthNull, strNCmp, toLowerCaseTable, toString, xdigitVal
-
-
-
-
Field Detail
-
GB18030
private static final java.lang.String GB18030
- See Also:
- Constant Field Values
-
C1
private static final int C1
- See Also:
- Constant Field Values
-
C2
private static final int C2
- See Also:
- Constant Field Values
-
C4
private static final int C4
- See Also:
- Constant Field Values
-
CM
private static final int CM
- See Also:
- Constant Field Values
-
GB18030_MAP
private static final int[] GB18030_MAP
-
GB18030Trans
private static final int[][] GB18030Trans
-
INSTANCE
public static final GB18030Encoding INSTANCE
-
-
Method Detail
-
length
public int length(byte[] bytes, int p, int end)
Description copied from class: Encoding
Returns the character length given a stream, a character position, and the stream end. Returns 1 for single-byte encodings; for multibyte encodings, performs sanity validations and returns the character length, or the missing characters in the stream otherwise.
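To illustrate the kind of validation this method describes, here is a minimal sketch of length detection based on the published GB18030 byte ranges (one byte for ASCII, two bytes, or four bytes). It is not the jcodings implementation: the class name, the helper names, and the return convention (0 for a truncated sequence, -1 for an invalid one) are assumptions made for this example only.

```java
// Hypothetical sketch; names and return convention are NOT part of jcodings.
final class Gb18030LengthSketch {
    // Lead byte of a two- or four-byte sequence, and third byte of a four-byte one.
    static boolean isLead(int b) { return b >= 0x81 && b <= 0xFE; }

    // Second byte of a two-byte sequence.
    static boolean isTwoByteTrail(int b) {
        return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFE);
    }

    // Second and fourth byte of a four-byte sequence.
    static boolean isDigitByte(int b) { return b >= 0x30 && b <= 0x39; }

    /**
     * Returns 1, 2, or 4 for a well-formed character starting at p,
     * 0 if the sequence is cut off by end, and -1 if it is invalid.
     */
    static int charLength(byte[] bytes, int p, int end) {
        if (p >= end) return 0;
        int b0 = bytes[p] & 0xFF;
        if (b0 <= 0x7F) return 1;            // ASCII range
        if (!isLead(b0)) return -1;          // 0x80 and 0xFF are rejected in this sketch
        if (p + 1 >= end) return 0;
        int b1 = bytes[p + 1] & 0xFF;
        if (isTwoByteTrail(b1)) return 2;
        if (!isDigitByte(b1)) return -1;
        if (p + 2 >= end) return 0;
        if (!isLead(bytes[p + 2] & 0xFF)) return -1;
        if (p + 3 >= end) return 0;
        return isDigitByte(bytes[p + 3] & 0xFF) ? 4 : -1;
    }
}
```

The real method reports truncated input through the inherited missing helpers rather than through a 0 return value; the sketch only shows the byte-range checks.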
-
lengthForTwoUptoFour
private int lengthForTwoUptoFour(byte[] bytes, int p, int end, int s)
-
lengthForThreeUptoFour
private int lengthForThreeUptoFour(byte[] bytes, int p, int end, int s)
-
mbcToCode
public int mbcToCode(byte[] bytes, int p, int end)
Description copied from class: Encoding
Returns the code point for a character. Oniguruma equivalent: mbc_to_code
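As a rough illustration, the sketch below assumes that the code value for a multibyte character is simply its bytes packed big-endian into an int, which is what the inherited helper name mbnMbcToCode suggests. This assumption has not been checked against the jcodings source, and the class and method names are hypothetical.

```java
// Hypothetical sketch; assumes the code value is the raw bytes packed big-endian.
final class MbcPackSketch {
    /** Packs len bytes starting at p into a single int, most significant byte first. */
    static int packBytes(byte[] bytes, int p, int len) {
        int code = 0;
        for (int i = 0; i < len; i++) {
            code = (code << 8) | (bytes[p + i] & 0xFF);
        }
        return code;
    }
}
```

Under this assumption, the two-byte sequence D6 D0 yields 0xD6D0. A four-byte sequence fills all 32 bits, so the resulting int can be negative.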
-
codeToMbcLength
public int codeToMbcLength(int code)
Description copied from class: Encoding
Returns the character length for a given code point. Oniguruma equivalent: code_to_mbclen
- Specified by:
codeToMbcLength in class Encoding
-
codeToMbc
public int codeToMbc(int code, byte[] bytes, int p)
Description copied from class: Encoding
Extracts a code point into its multibyte representation.
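A minimal sketch of such an extraction follows. It assumes the code value holds the character's bytes packed big-endian, so writing it out means emitting the bytes from the most significant non-zero one downward. The class and method names are hypothetical, and the logic is an illustration rather than the jcodings implementation.

```java
// Hypothetical sketch; assumes the code value is the raw bytes packed big-endian.
final class MbcUnpackSketch {
    /** Writes the bytes of code into out starting at p and returns the number written. */
    static int unpackCode(int code, byte[] out, int p) {
        int q = p;
        // Emit each higher byte only if it is non-zero or a more significant byte was emitted.
        if ((code & 0xFF000000) != 0) out[q++] = (byte) (code >>> 24);
        if ((code & 0x00FF0000) != 0 || q != p) out[q++] = (byte) (code >>> 16);
        if ((code & 0x0000FF00) != 0 || q != p) out[q++] = (byte) (code >>> 8);
        out[q++] = (byte) code; // the lowest byte is always written
        return q - p;
    }
}
```

The caller is expected to provide a buffer with at least four bytes free at p.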
-
mbcCaseFold
public int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] lower)
Description copied from class: AbstractEncoding
onigenc_ascii_mbc_case_fold
- Overrides:
mbcCaseFold in class AbstractEncoding
- Parameters:
flag - case fold flag
pp - an IntHolder that points at the character head
lower - a buffer into which the case-folded character is extracted
Oniguruma equivalent: mbc_case_fold
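The following sketch shows what ASCII-only case folding over a multibyte stream can look like: an ASCII head byte is lowercased, and any other character is copied through unchanged. It is an illustration, not the jcodings code. A one-element int array stands in for the IntHolder, and the caller passes the character's byte length directly; both are simplifications assumed for this example.

```java
// Hypothetical sketch of ASCII-only case folding; not the jcodings implementation.
final class AsciiCaseFoldSketch {
    /**
     * Folds the character whose head is at pos[0] into lower.
     * pos[0] is advanced past the character; charLen is its byte length,
     * supplied by the caller. Returns the number of bytes written to lower.
     */
    static int foldOne(byte[] bytes, int[] pos, int charLen, byte[] lower) {
        int p = pos[0];
        int b = bytes[p] & 0xFF;
        if (b < 0x80) {
            // ASCII: map 'A'..'Z' to 'a'..'z', leave everything else as is.
            lower[0] = (byte) ((b >= 'A' && b <= 'Z') ? b + 0x20 : b);
            pos[0] = p + 1;
            return 1;
        }
        // Non-ASCII: copy the whole character unchanged.
        System.arraycopy(bytes, p, lower, 0, charLen);
        pos[0] = p + charLen;
        return charLen;
    }
}
```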
-
isCodeCType
public boolean isCodeCType(int code, int ctype)
Description copied from class: Encoding
Checks whether the given code is of the given character type (e.g. used by isWord(someByte) and similar methods).
- Specified by:
isCodeCType in class Encoding
- Parameters:
code - a code point of a character
ctype - a character type to check against
Oniguruma equivalent: is_code_ctype
-
ctypeCodeRange
public int[] ctypeCodeRange(int ctype, IntHolder sbOut)
Description copied from class: Encoding
Returns the code range for a given character type. Oniguruma equivalent: get_ctype_code_range
- Specified by:
ctypeCodeRange in class Encoding
-
getCharsetName
public java.lang.String getCharsetName()
Description copied from class: Encoding
The name of the equivalent Java Charset for this encoding. Defaults to the name of the encoding. Subclasses can override this to provide a different name.
- Overrides:
getCharsetName in class Encoding
- Returns:
the name of the equivalent Java Charset for this encoding
-
leftAdjustCharHead
public int leftAdjustCharHead(byte[] bytes, int start, int s, int end)
Description copied from class: Encoding
Seeks the previous character head in a stream. Oniguruma equivalent: left_adjust_char_head
- Specified by:
leftAdjustCharHead in class Encoding
- Parameters:
bytes - byte stream
start - position
s - stop
end - end
-
isReverseMatchAllowed
public boolean isReverseMatchAllowed(byte[] bytes, int p, int end)
Description copied from class: Encoding
Returns true if it is safe to use the reverse Boyer-Moore search fail-fast algorithm. Oniguruma equivalent: is_allowed_reverse_match
- Specified by:
isReverseMatchAllowed in class Encoding
-
-