Class BaseSJISEncoding

    • Field Detail

      • CR_Hiragana

        private static final int[] CR_Hiragana
      • CR_Katakana

        private static final int[] CR_Katakana
      • PropertyList

        private static final int[][] PropertyList
      • CTypeNameHash

        private static final BytesHash<java.lang.Integer> CTypeNameHash
      • SJIS_CAN_BE_TRAIL_TABLE

        static final boolean[] SJIS_CAN_BE_TRAIL_TABLE
      • SjisEncLen

        static final int[] SjisEncLen
    • Constructor Detail

      • BaseSJISEncoding

        protected BaseSJISEncoding(java.lang.String name,
                                   int[][] Trans)
    • Method Detail

      • getCharsetName

        public java.lang.String getCharsetName()
        Description copied from class: Encoding
        The name of the equivalent Java Charset for this encoding. Defaults to the name of the encoding. Subclasses can override this to provide a different name.
        Overrides:
        getCharsetName in class Encoding
        Returns:
        the name of the equivalent Java Charset for this encoding
      • mbcToCode

        public int mbcToCode(byte[] bytes,
                             int p,
                             int end)
        Description copied from class: Encoding
        Returns the code point for a character. Oniguruma equivalent: mbc_to_code
        Specified by:
        mbcToCode in class Encoding
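
        As a rough illustration of what "returns the code point for a character" can mean for a Shift_JIS-style encoding, the sketch below packs the bytes of one character into an int, high byte first. The class name SjisDecodeSketch, the helper length, the lead-byte ranges, and the handling of a truncated character are assumptions made for this example; they are not taken from the library source.

```java
// Hypothetical stand-alone sketch; not the jcodings implementation.
final class SjisDecodeSketch {
    // Assumed Shift_JIS lead-byte ranges: 0x81-0x9F and 0xE0-0xFC start a two-byte character.
    static int length(int firstByte) {
        boolean lead = (firstByte >= 0x81 && firstByte <= 0x9F)
                    || (firstByte >= 0xE0 && firstByte <= 0xFC);
        return lead ? 2 : 1;
    }

    // Packs the character that starts at index p into a single int, high byte first.
    // If the trail byte would fall at or beyond 'end', only the first byte is returned
    // (an assumption about truncated input).
    static int mbcToCode(byte[] bytes, int p, int end) {
        int first = bytes[p] & 0xFF;
        if (length(first) == 1 || p + 1 >= end) {
            return first;
        }
        return (first << 8) | (bytes[p + 1] & 0xFF);
    }
}
```

        Under these assumptions, the two bytes 0x82 0xA0 yield the code 0x82A0, while an ASCII byte such as 0x41 is returned unchanged.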
      • codeToMbcLength

        public int codeToMbcLength(int code)
        Description copied from class: Encoding
        Returns the character length for a given code point. Oniguruma equivalent: code_to_mbclen
        Specified by:
        codeToMbcLength in class Encoding
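
        A minimal sketch of the length computation, assuming codes are the packed byte values of a character (one byte below 0x100, two bytes up to 0xFFFF). The class name SjisLengthSketch and the use of IllegalArgumentException are hypothetical; the actual method may signal invalid codes differently and may also reject single-byte values that are really lead bytes.

```java
// Hypothetical stand-alone sketch; not the jcodings implementation.
final class SjisLengthSketch {
    // Returns how many bytes are needed to store the packed code.
    static int codeToMbcLength(int code) {
        if (code < 0) {
            throw new IllegalArgumentException("negative code: " + code);
        }
        if (code < 0x100) {
            return 1; // fits in a single byte
        }
        if (code <= 0xFFFF) {
            return 2; // lead byte in the high 8 bits, trail byte in the low 8 bits
        }
        throw new IllegalArgumentException("code does not fit in two bytes: " + code);
    }
}
```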
      • SJIS_ISMB_TRAIL

        private static boolean SJIS_ISMB_TRAIL(int code)
      • codeToMbc

        public int codeToMbc(int code,
                             byte[] bytes,
                             int p)
        Description copied from class: Encoding
        Extracts a code point into its multibyte representation
        Specified by:
        codeToMbc in class Encoding
        Returns:
        character length for the given code point. Oniguruma equivalent: code_to_mbc
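
        The write direction can be sketched in the same spirit: the code is unpacked into the target buffer starting at index p, and the number of bytes written is returned. The class name SjisEncodeSketch is hypothetical, and the sketch assumes the caller has already checked that the code fits in two bytes and that the buffer has room.

```java
// Hypothetical stand-alone sketch; not the jcodings implementation.
final class SjisEncodeSketch {
    // Writes the packed code into bytes starting at p and returns the byte count.
    static int codeToMbc(int code, byte[] bytes, int p) {
        int start = p;
        if ((code & 0xFF00) != 0) {
            bytes[p++] = (byte) (code >>> 8); // lead byte
        }
        bytes[p++] = (byte) code;             // trail byte, or the only byte
        return p - start;
    }
}
```

        For example, writing 0x82A0 at offset 1 of a buffer stores 0x82 at index 1 and 0xA0 at index 2, and the call returns 2.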
      • getLowerCase

        private static int getLowerCase(int code)
      • mbcCaseFold

        public int mbcCaseFold(int flag,
                               byte[] bytes,
                               IntHolder pp,
                               int end,
                               byte[] lower)
        Description copied from class: AbstractEncoding
        onigenc_ascii_mbc_case_fold
        Overrides:
        mbcCaseFold in class AbstractEncoding
        Parameters:
        flag - case fold flag
        pp - an IntHolder that points at character head
        lower - a buffer into which the case-folded character is written. Oniguruma equivalent: mbc_case_fold
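
        The parameter contract above (pp points at the character head, lower receives the folded character) can be illustrated with a simplified sketch. It assumes the fold lowercases ASCII letters and full-width Latin capitals (0x8260-0x8279 mapped to 0x8281-0x829A) and leaves everything else unchanged; the real method may fold more characters. The class SjisCaseFoldSketch and its nested IntHolder are stand-ins written for this example, and the flag parameter is omitted.

```java
// Hypothetical stand-alone sketch; not the jcodings implementation.
final class SjisCaseFoldSketch {
    // Minimal stand-in for a mutable int holder.
    static final class IntHolder {
        int value;
    }

    // Folds the character at pp.value into 'lower', advances pp past it,
    // and returns the number of bytes written to 'lower'.
    static int mbcCaseFold(byte[] bytes, IntHolder pp, int end, byte[] lower) {
        int p = pp.value;
        int first = bytes[p] & 0xFF;
        boolean lead = (first >= 0x81 && first <= 0x9F) || (first >= 0xE0 && first <= 0xFC);
        if (!lead || p + 1 >= end) {
            // Single byte: lowercase ASCII capitals, copy anything else as-is.
            lower[0] = (byte) ((first >= 'A' && first <= 'Z') ? first + 0x20 : first);
            pp.value = p + 1;
            return 1;
        }
        int code = (first << 8) | (bytes[p + 1] & 0xFF);
        if (code >= 0x8260 && code <= 0x8279) {
            code += 0x21; // assumed offset from full-width capital to full-width small letter
        }
        lower[0] = (byte) (code >>> 8);
        lower[1] = (byte) code;
        pp.value = p + 2;
        return 2;
    }
}
```

        A caller would typically loop until pp.value reaches end, consuming one character per call.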
      • propertyNameToCType

        public int propertyNameToCType(byte[] bytes,
                                       int p,
                                       int end)
        Description copied from class: AbstractEncoding
        Oniguruma equivalent: onigenc_minimum_property_name_to_ctype. Notably overridden by Unicode encodings.
        Overrides:
        propertyNameToCType in class AbstractEncoding
      • isCodeCType

        public boolean isCodeCType(int code,
                                   int ctype)
        Description copied from class: Encoding
        Checks whether the given code is of the given character type (used, for example, by isWord(someByte) and similar methods).
        Specified by:
        isCodeCType in class Encoding
        Parameters:
        code - a code point of a character
        ctype - a character type to check against. Oniguruma equivalent: is_code_ctype
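
        For encoding-specific character types such as Hiragana, a check of this kind usually comes down to testing a code against a table of ranges. The sketch below assumes a table laid out as a pair count followed by inclusive from/to pairs, and assumes the Hiragana block spans 0x829F to 0x82F1; the class name SjisCTypeSketch, the constant HIRAGANA, and the linear scan are all illustrative choices, not the library's code.

```java
// Hypothetical stand-alone sketch; not the jcodings implementation.
final class SjisCTypeSketch {
    // Assumed layout: [pairCount, from1, to1, from2, to2, ...], bounds inclusive.
    static final int[] HIRAGANA = {1, 0x829F, 0x82F1};

    // Returns true if 'code' falls inside any of the ranges in the table.
    static boolean inRange(int[] ranges, int code) {
        int pairs = ranges[0];
        for (int i = 0; i < pairs; i++) {
            int from = ranges[1 + 2 * i];
            int to = ranges[2 + 2 * i];
            if (code >= from && code <= to) {
                return true;
            }
        }
        return false;
    }
}
```

        A linear scan keeps the example short; a production table with many ranges would more likely be searched with a binary search.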
      • ctypeCodeRange

        public int[] ctypeCodeRange(int ctype,
                                    IntHolder sbOut)
        Description copied from class: Encoding
        Returns the code range for a given character type. Oniguruma equivalent: get_ctype_code_range
        Specified by:
        ctypeCodeRange in class Encoding