Format: Language
Language blocks represent data associated with spoken and written language, the symbolic expression of the meaning of words, and categorization. Later they could become part of a universal language, including the definition of its meaning and usage.
Language Data
Language data includes information related to language use, such as text, language identification, text encoding, speaker identification, declarations of the meaning of words, translation between languages, and so on.
Catalog: XBUPProtocol / Society / Language /
Language
This block identifies the language of the stored text or image, or of other language-dependent data such as a speech or sign language recording. The basic version uses the numeric identifiers available for the individual world languages. Other versions and index blocks are reserved for future use.
A further block will make it possible to specify the conditions under which the language is used, such as the spoken word in a video stream.
Language Index - RFC
This block uses the standard language numbers ([RFC] LanguageNumber) published on the Internet, for example on the [IANA] website.
Catalog: XBUPProtocol / Society / Language / RFCLanguage
UBNatural - MajorLanguageIndex
UBNatural - MinorLanguageIndex
Language Name in ASCII
The second option is to use the RFC LanguageString and state the language name in ASCII encoding. The numeric index is of course the preferred option, because names are meant to be displayed only at the application level.
Codes are usually in the form of xxYY, where xx is the language code and YY is the country code.
Catalog: XBUPProtocol/Society/Language/RFCASCIILanguage
UBPointer - StringPointer
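As an illustration of the xxYY form described above, the following Python sketch splits such a code into its language and country parts. The function name `split_language_code` and the tolerance for a `-` or `_` separator are assumptions made for this example; they are not part of the block definition.

```python
from typing import Optional, Tuple


def split_language_code(code: str) -> Tuple[str, Optional[str]]:
    """Split a code such as 'enUS' into ('en', 'US').

    Hypothetical helper: assumes a two-letter language code followed by
    an optional two-letter country code, optionally separated by '-' or '_'.
    """
    compact = code.replace("-", "").replace("_", "")
    if len(compact) not in (2, 4) or not compact.isascii() or not compact.isalpha():
        raise ValueError(f"unsupported language code: {code!r}")
    language = compact[:2].lower()
    # An empty country part (two-letter input) is reported as None.
    country = compact[2:].upper() or None
    return language, country
```

A code without a country part, such as `cs`, yields `None` for the country.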
Multilanguage Data
This block is a simple derivation: a list of language identifiers. It is suitable when the data applies to multiple languages simultaneously.
Catalog: XBUPProtocol/Society/Language/MultiLanguage
UBList - Languages
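A minimal in-memory sketch of this block in Python, assuming each list entry is an RFCLanguage identifier made of a major and a minor index. The class names, the `applies_to` helper, and the index values used in the test data are hypothetical; the serialized form of UBNatural and UBList is not modeled here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class RFCLanguage:
    """Language identifier as a pair of natural numbers (assumed layout)."""
    major_language_index: int
    minor_language_index: int


@dataclass
class MultiLanguage:
    """List of language identifiers that all apply to the same data."""
    languages: List[RFCLanguage] = field(default_factory=list)

    def applies_to(self, language: RFCLanguage) -> bool:
        # Hypothetical convenience check, not defined by the block itself.
        return language in self.languages
```

Because `RFCLanguage` is frozen, two identifiers with the same indexes compare equal, which keeps the membership check simple.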
Text Encoding
A text encoding is essentially mandatory for a general language text string. A higher-level protocol should then define the table of characters and their graphical representation, as well as the equivalence of characters across encodings. An encoding can be defined in one of the following ways.
Catalog: XBUPProtocol/Society/Language/Encoding/
IANA Encoding Index
The following block determines the text encoding using the well-established IANA index numbers assigned to encodings.
Catalog: XBUPProtocol/Society/Language/Encoding/
UBNatural - IANAEncodingMajorNumber
UBNatural - IANAEncodingMinorNumber
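The IANA character set registry assigns each encoding a MIBenum number (for example 3 for US-ASCII, 4 for ISO_8859-1:1987, and 106 for UTF-8). The Python sketch below maps a few of these numbers to Python codec names. It assumes that the major number carries the MIBenum value and that the minor number can be ignored; the source does not say how the two numbers are split, so this is an illustration only. The function name `codec_for_index` is hypothetical.

```python
# A few MIBenum values from the IANA character set registry,
# mapped to the corresponding Python codec names.
MIBENUM_TO_CODEC = {
    3: "ascii",      # US-ASCII
    4: "latin-1",    # ISO_8859-1:1987
    106: "utf-8",    # UTF-8
}


def codec_for_index(major: int, minor: int = 0) -> str:
    """Hypothetical lookup: treats `major` as a MIBenum, ignores `minor`."""
    try:
        return MIBENUM_TO_CODEC[major]
    except KeyError:
        raise LookupError(
            f"no codec known for encoding index {major}.{minor}"
        ) from None
```

A real implementation would need the full registry rather than this three-entry table.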
ASCII Encoding Name
The following block determines the text encoding by its name, given as an ASCII string, based on the well-established IANA registry of encodings.
UBPointer - IANAEncodingStringPointer
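Resolving an encoding by name can be sketched in Python with the standard `codecs` module, which accepts many (though not all) IANA-registered names and aliases. The function name `decode_with_named_encoding` is hypothetical, and the pointer is assumed to have already been resolved to the ASCII bytes of the name.

```python
import codecs


def decode_with_named_encoding(data: bytes, encoding_name: bytes) -> str:
    """Decode `data` using an encoding given by its ASCII-encoded name."""
    # The name itself is stored in ASCII, as the block definition states.
    name = encoding_name.decode("ascii")
    # codecs.lookup raises LookupError if Python does not know the name.
    codec = codecs.lookup(name)
    return data.decode(codec.name)
```

Names that Python does not recognize raise `LookupError`, so an application would still need a fallback for those entries.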
Text String
A text string is the “meaning” of words encoded with an alphabet. Storing text should take into account support for any language, encoding, and other text attributes. Text can be seen as a form of compression of graphic symbols and meaning.
The basic block for general text is as follows (this is a transformation block):
Catalog: XBUPProtocol/Society/Language/Text/String
UBPointer - StringDataPointer
UBPointer - EncodingPointer
UBPointer - LanguagePointer
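The three pointers of the String block can be sketched in Python as follows. This is only a model under stated assumptions: pointers are represented as integer keys into a dictionary standing in for the document, and the encoding block is assumed to resolve to a Python codec name. The names `TextString` and `resolve_text` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class TextString:
    """General text block: raw data plus pointers to encoding and language."""
    string_data_pointer: int
    encoding_pointer: int
    language_pointer: int


def resolve_text(block: TextString, store: Dict[int, Any]) -> str:
    """Follow the data and encoding pointers and decode the text.

    `store` is a stand-in for the document: it maps pointer values to
    already-parsed blocks (bytes for data, a codec name for the encoding).
    """
    raw = store[block.string_data_pointer]
    encoding = store[block.encoding_pointer]
    return raw.decode(encoding)
```

The language pointer is carried along but not needed for decoding; it would matter to consumers such as spell checkers or text-to-speech.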
The encoding value should probably not be embedded directly in this block; instead an external block should be used, which may be defined automatically or by reference.
Another option is to create string blocks with a fixed encoding, where the encoding value is implied by the block type itself. Blocks such as ASCIIString and UTFString would be created this way.
ASCII Text String
A text string with ASCII encoding, obtained by fixing the encoding value in the String block.
Catalog: XBUPProtocol/Society/Language/Text/ASCIIString
UBPointer - StringDataPointer
UBPointer - LanguagePointer
UTF-8 Text String
As in the previous case, this time with the encoding value fixed to UTF-8.
Catalog: XBUPProtocol/Society/Language/Text/UTF8String
UBPointer - StringDataPointer
UBPointer - LanguagePointer
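The idea of fixing the encoding in the block type can be sketched in Python with a class-level constant. For brevity the pointers are assumed to be already resolved to a `bytes` value and a language string; the base class `FixedEncodingString` and the `text` method are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass(frozen=True)
class FixedEncodingString:
    """Text block whose encoding is implied by the block type."""
    string_data: bytes
    language: str
    # Set by each concrete subclass; not stored per instance.
    ENCODING: ClassVar[str] = ""

    def text(self) -> str:
        return self.string_data.decode(self.ENCODING)


class ASCIIString(FixedEncodingString):
    ENCODING = "ascii"


class UTF8String(FixedEncodingString):
    ENCODING = "utf-8"
```

Compared with the general String block, these variants need one pointer fewer, at the cost of one block type per supported encoding.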
Commentary Block
A direct application of the text block is a block for text comments. It can be inserted anywhere in the file, at any level. There will probably be several types of annotation blocks, depending on how they are presented visually.
Links
IANACharset IANA MIBEnum Character Set Registry, URL: ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets
RFC Request for Comments, URL: https://www.rfc.org
ASCII American Standard Code for Information Interchange
UTF-8 UCS Transformation Format, URL: https://www.faqs.org/rfcs/rfc2279.html
ISO 639.2 Codes for the Representation of Names of Languages, URL: https://www.loc.gov/standards/iso639-2/php/English_list.php
IANA Root-Zone Root-Zone Whois Information, URL: https://www.iana.org/root-whois/index.html