Text Module
Text encoding and decoding utilities for JV-Link data processing.
JV-Link returns data encoded in Shift-JIS (code page 932), which is the standard encoding for Japanese text in legacy Windows applications. This module provides functions to decode Shift-JIS to .NET strings and encode back for transmission.
Caching: Frequently decoded/encoded strings are cached in memory to improve performance when processing large volumes of data. Cache size is limited to prevent unbounded memory growth (max 1024 entries per cache).
Environment Variables:
XANTHOS_DISABLE_TEXT_CACHE - Set to any non-empty value to disable caching. Useful for memory-constrained environments or debugging.
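The caching rules above can be illustrated with a minimal Python sketch. It is not the module's own code: `BoundedCache` and `cache_enabled` are hypothetical names. What happens when a cache reaches 1024 entries is not documented here, so the choice below (stop adding entries) is an assumption.

```python
import os

MAX_ENTRIES = 1024  # per-cache limit stated in the module description


def cache_enabled(env=None):
    """Return False when XANTHOS_DISABLE_TEXT_CACHE is set to a non-empty value."""
    env = os.environ if env is None else env
    return not env.get("XANTHOS_DISABLE_TEXT_CACHE")


class BoundedCache:
    """Hypothetical helper: a dict that refuses new entries once it is full."""

    def __init__(self, max_entries=MAX_ENTRIES):
        self._max = max_entries
        self._data = {}

    def get_or_add(self, key, compute):
        if key in self._data:
            return self._data[key]
        value = compute(key)
        # Assumption: when the cache is full, the value is returned uncached.
        if len(self._data) < self._max:
            self._data[key] = value
        return value

    def clear(self):
        self._data.clear()

    def __len__(self):
        return len(self._data)
```

A caller would check `cache_enabled()` once and, when it returns False, call the underlying encode or decode routine directly instead of going through `get_or_add`.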
Functions and values
| Function or value | Description |
||
Full Usage:
clearCaches ()
Parameters:
unit
|
Clears both encode and decode caches.
Use this function to free memory when processing is complete or when switching between different data sources. The caches are rebuilt automatically as new data is processed. This function is thread-safe. If it runs concurrently with encode/decode operations, those operations may re-add entries immediately after the caches are cleared, which is harmless.
|
||
Full Usage:
decodeShiftJis bytes
Parameters:
bytes : byte[] - The Shift-JIS encoded byte array. May be null or empty.
Returns: string
The decoded string with trailing null characters removed.
Returns an empty string (String.Empty) for null or empty input.
|
Decodes a Shift-JIS encoded byte array to a .NET string.
This function first attempts strict Shift-JIS decoding. If that fails because of invalid byte sequences, it falls back, in order, to lenient Shift-JIS, strict UTF-8, and finally lenient UTF-8. Results are cached for performance unless caching is disabled, for example by setting XANTHOS_DISABLE_TEXT_CACHE.
|
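The fallback order described for decodeShiftJis can be sketched in Python, using the `cp932` codec for code page 932. This is an illustration, not the module's implementation. The description does not say how a lenient decode is judged to have failed, so this sketch assumes that a lenient result containing the replacement character U+FFFD is rejected, except at the final step.

```python
REPLACEMENT = "\ufffd"


def decode_shift_jis(data):
    """Decode bytes using the documented order of attempts."""
    if not data:
        return ""
    raw = bytes(data)
    attempts = [
        ("cp932", "strict"),
        ("cp932", "replace"),  # lenient Shift-JIS
        ("utf-8", "strict"),
        ("utf-8", "replace"),  # lenient UTF-8, last resort
    ]
    for index, (encoding, errors) in enumerate(attempts):
        is_last = index == len(attempts) - 1
        try:
            text = raw.decode(encoding, errors)
        except UnicodeDecodeError:
            continue
        # Assumption: a lenient result with U+FFFD counts as a failure,
        # unless no further attempt remains.
        if errors == "replace" and REPLACEMENT in text and not is_last:
            continue
        return text.rstrip("\x00")  # drop trailing null characters
    return ""
```

Null or empty input yields an empty string, and trailing null characters are stripped from every successful result, matching the return description above.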
||
Full Usage:
decodeShiftJisBstrBytesIfNeeded text
Parameters:
text : string
Returns: string
|
Decodes JV-Link Shift-JIS bytes that were mistakenly marshalled as a BSTR string.
Some JV-Link COM APIs populate out-parameters using Shift-JIS bytes, but the COM marshaller may expose them to .NET as a UTF-16 string without decoding. This function attempts to recover the original bytes from common BSTR layouts and decode them as Shift-JIS. If the input already looks like readable Japanese text, it is returned as-is.
|
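As a rough illustration of the recovery idea, the Python sketch below handles one hypothetical layout only: each UTF-16 code unit of the marshalled string carries a single raw byte (all code points at or below 0xFF). The layouts the module actually recognizes are not listed here, and the `looks_like_japanese` check is an assumed stand-in for its readability test.

```python
def looks_like_japanese(text):
    """Rough check: any hiragana, katakana, or CJK ideograph present."""
    return any(
        "\u3040" <= ch <= "\u30ff" or "\u4e00" <= ch <= "\u9fff"
        for ch in text
    )


def decode_bstr_bytes_if_needed(text):
    """Recover Shift-JIS bytes hidden in a string, one byte per character."""
    if not text or looks_like_japanese(text):
        return text  # already readable, or nothing to do
    if any(ord(ch) > 0xFF for ch in text):
        return text  # does not match the assumed layout
    raw = bytes(ord(ch) for ch in text)
    try:
        decoded = raw.decode("cp932").rstrip("\x00")
    except UnicodeDecodeError:
        return text
    # Only accept the result if it now reads as Japanese.
    return decoded if looks_like_japanese(decoded) else text
```

Returning the input unchanged whenever recovery is not clearly successful keeps the function safe to call on every string.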
||
Full Usage:
encodeShiftJis text
Parameters:
text : string - The string to encode. May be null or empty.
Returns: byte[]
The Shift-JIS encoded byte array.
Returns an empty array for null or empty input.
|
Encodes a .NET string to a Shift-JIS byte array.
Results are cached for performance unless caching is disabled, for example by setting XANTHOS_DISABLE_TEXT_CACHE.
|
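For comparison, the encode direction might look like this in Python. This is a sketch only: how the module treats characters that have no Shift-JIS mapping is not stated, so replacing them with `?` here is an assumption.

```python
def encode_shift_jis(text):
    """Encode a string to code page 932 bytes; empty bytes for None or ""."""
    if not text:
        return b""
    # Assumption: unmappable characters are replaced rather than raising.
    return text.encode("cp932", errors="replace")
```

Encoding and then decoding with the same code page returns the original text for any string that Shift-JIS can represent.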
||
Full Usage:
looksGarbledJvText text
Parameters:
text : string
Returns: bool
|
Heuristically detects obviously garbled (mojibake) text returned by JV-Link. The check is intentionally conservative: it only flags strings that contain a large number of private-use or control characters, which should not appear in normal Japanese explanations.
|
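A heuristic of this kind can be sketched in Python with Unicode general categories (`Co` for private use, `Cc` for control). The 30% threshold and the exemption for tab, carriage return, and line feed are assumptions made for this example; the module's actual cut-off is not documented here.

```python
import unicodedata


def looks_garbled(text, threshold=0.3):
    """Flag text when private-use/control characters reach `threshold`."""
    if not text:
        return False
    suspicious = 0
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Co":
            suspicious += 1  # private-use code point
        elif category == "Cc" and ch not in "\t\r\n":
            suspicious += 1  # control character other than whitespace
    return suspicious / len(text) >= threshold
```

Ordinary Japanese text contains neither category, so a conservative threshold keeps false positives rare.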
||
Full Usage:
normalizeJvText text
Parameters:
text : string - The text to normalize. May be null or empty.
Returns: string
The normalized string with fullwidth digits and letters converted to ASCII.
Returns the input unchanged if null or empty.
|
Normalizes JV-Link text by converting fullwidth characters to ASCII equivalents.
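JV-Link data often contains fullwidth (全角) characters that should be normalized for consistent processing; fullwidth digits and letters are converted to their ASCII equivalents.
Also applies Unicode normalization form KC (NormalizationForm.FormKC), which performs compatibility decomposition followed by canonical composition.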
|
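The two steps can be illustrated in Python: an explicit mapping of fullwidth digits and letters to ASCII, followed by NFKC normalization via `unicodedata.normalize`. This is a sketch and not the module's code. Note that NFKC alone already folds fullwidth alphanumerics, so the explicit table is shown only to mirror the description.

```python
import unicodedata

# Fullwidth digits and Latin letters sit 0xFEE0 above their ASCII twins.
_FULLWIDTH_TO_ASCII = {
    code: code - 0xFEE0
    for start, end in ((0xFF10, 0xFF19), (0xFF21, 0xFF3A), (0xFF41, 0xFF5A))
    for code in range(start, end + 1)
}


def normalize_jv_text(text):
    """Convert fullwidth digits/letters to ASCII, then apply NFKC."""
    if not text:
        return text  # null or empty input is returned unchanged
    return unicodedata.normalize("NFKC", text.translate(_FULLWIDTH_TO_ASCII))
```

For example, `normalize_jv_text("１２３ＡＢＣ")` returns `"123ABC"`. Because NFKC also affects other compatibility characters, halfwidth katakana such as `ｶﾞ` becomes `ガ`.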