m (Update color reference link) |
(update documentation and fix some errors (left a lot of the game-specific stuff untouched)) |
||
Line 1: | Line 1: | ||
{{lowercase}} | {{lowercase}} | ||
<onlyinclude> | <onlyinclude> | ||
''' | '''Message Studio Binary Text''' | ||
<code>MSBT</code> | <code>MSBT</code> is a binary file format belonging to LibMessageStudio (LMS). These files store the game's text and can contain "tags" that define how said text is displayed/interacted with. | ||
</onlyinclude> | </onlyinclude> | ||
== | == File Layout == | ||
<code>MSBT</code> files are composed of a file header followed by blocks (each with their own block header). All sections/blocks must be aligned to 0x10 (16) bytes. In BotW, the file layout is as follows: | |||
* Header | * Header | ||
* Labels | * Labels Block | ||
* Attributes | * Attributes Block | ||
* | * Text Block | ||
The list of possible blocks is as follows: | |||
== | * LBL1 (labels) | ||
* TXT2 (text) | |||
* ATR1 (attributes) | |||
* TSY1 (style info) | |||
* ATO1 (unknown) | |||
== File Header == | |||
=== Header Structure === | === Header Structure === | ||
Line 23: | Line 30: | ||
| 0x00 | | 0x00 | ||
| 8 | | 8 | ||
| | | char[8] | ||
| msbt file signature (magic) <code>4D 73 67 53 74 64 42 6E</code> or "MsgStdBn" | | msbt file signature (magic) <code>4D 73 67 53 74 64 42 6E</code> or "MsgStdBn" | ||
|- | |- | ||
| 0x08 | | 0x08 | ||
| 2 | | 2 | ||
| | | u16 | ||
| Byte-Order Mark | | Byte-Order Mark | ||
|- | |- | ||
Line 34: | Line 41: | ||
| 2 | | 2 | ||
| | | | ||
| Padding | | Padding {{check}} | ||
|- | |- | ||
| 0x0c | | 0x0c | ||
| | | 1 | ||
| | | u8 | ||
| | | Encoding (0 = UTF8, 1 = UTF16, 2 = UTF32 - games generally only support one specific encoding) {{check}} | ||
|- | |||
|0x0d | |||
|1 | |||
|u8 | |||
|Version (3) | |||
|- | |- | ||
| 0x0e | | 0x0e | ||
| 2 | | 2 | ||
| | | u16 | ||
| | | Block Count {{check}} | ||
|- | |- | ||
| 0x10 | | 0x10 | ||
| 2 | | 2 | ||
| | | | ||
| Padding | | Padding {{check}} | ||
|- | |- | ||
| 0x12 | | 0x12 | ||
| 4 | | 4 | ||
| | | u32 | ||
| File Size | | File Size | ||
|- | |- | ||
| 0x16 | | 0x16 | ||
| 10 | | 10 | ||
| | | | ||
| Padding | | Padding {{check}} | ||
|} | |} | ||
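The header table above can be read with Python's <code>struct</code> module. The following is a minimal sketch, not a reference implementation: the function name <code>parse_file_header</code> and the returned dictionary keys are made up for illustration, and the byte-order mark values (<code>FE FF</code> for big-endian, <code>FF FE</code> for little-endian) are an assumption that this page does not state.

```python
import struct

MAGIC = b"MsgStdBn"
# Encoding values as listed in the header table (marked {{check}} there).
ENCODINGS = {0: "utf-8", 1: "utf-16", 2: "utf-32"}


def parse_file_header(data):
    """Parse the 0x20-byte file header (hypothetical helper)."""
    if data[0x00:0x08] != MAGIC:
        raise ValueError("not an MSBT file")
    # Assumption: FE FF marks a big-endian file, FF FE a little-endian one.
    bom = data[0x08:0x0A]
    if bom == b"\xfe\xff":
        endian = ">"
    elif bom == b"\xff\xfe":
        endian = "<"
    else:
        raise ValueError("unrecognized byte-order mark")
    encoding, version = struct.unpack_from("BB", data, 0x0C)
    (block_count,) = struct.unpack_from(endian + "H", data, 0x0E)
    (file_size,) = struct.unpack_from(endian + "I", data, 0x12)
    return {
        "endian": endian,
        "encoding": ENCODINGS.get(encoding),
        "version": version,
        "block_count": block_count,
        "file_size": file_size,
    }
```

The <code>endian</code> prefix returned here can be passed on to whatever code reads the blocks, since every multi-byte field after the byte-order mark uses the same byte order.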
== | == Block Header == | ||
The | This header is shared across all block types. The block data follows directly after the header and is aligned to 0x10 (16) bytes. The block size does not include the size of the header. | ||
{| class="wikitable" | {| class="wikitable" | ||
!Offset (h) | !Offset (h) | ||
Line 75: | Line 84: | ||
| 0x00 | | 0x00 | ||
| 4 | | 4 | ||
| | | u32 | ||
| Signature | | Block Signature | ||
|- | |- | ||
| 0x04 | | 0x04 | ||
| 4 | | 4 | ||
| | | u32 | ||
| | | Block Size | ||
|- | |- | ||
| 0x08 | | 0x08 | ||
Line 89: | Line 98: | ||
|} | |} | ||
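Walking the blocks of a file can be sketched as follows. This is an illustration under stated assumptions: <code>iter_blocks</code> and <code>align</code> are hypothetical names, the file header is taken to be 0x20 bytes (the sum of the fields in the header table), and the block header is assumed to be 0x10 bytes long, which is exposed as a parameter in case that assumption is wrong.

```python
import struct


def align(offset, boundary=0x10):
    """Round offset up to the next multiple of boundary."""
    return (offset + boundary - 1) & ~(boundary - 1)


def iter_blocks(data, endian="<", file_header_size=0x20, block_header_size=0x10):
    """Yield (signature, block_data) pairs (hypothetical helper).

    Assumption: block_header_size is 0x10 bytes.
    """
    offset = file_header_size
    while offset + block_header_size <= len(data):
        signature = data[offset:offset + 4].decode("ascii")
        (size,) = struct.unpack_from(endian + "I", data, offset + 4)
        start = offset + block_header_size
        # The block size excludes the header, so the data is size bytes long.
        yield signature, data[start:start + size]
        # The next block header starts at the next 0x10-aligned offset.
        offset = align(start + size)
```

Yielding only <code>size</code> bytes, without the alignment padding, keeps the block length usable as the end position of the last entry in a block.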
== Labels Block == | |||
The labels block contains the label names for the file's text. Its signature is <code>LBL1</code>. | |||
The section begins with a four-byte header specifying the number of label groups. | |||
{| class="wikitable" | {| class="wikitable" | ||
!Offset (h) | !Offset (h) | ||
Line 99: | Line 110: | ||
| 0x00 | | 0x00 | ||
| 4 | | 4 | ||
| | | u32 | ||
| | | Label Group Count | ||
|} | |} | ||
Each entry in the | |||
=== Label Groups === | |||
Following the header is a table of label groups. Each entry in the table is eight bytes. The first four bytes specify the number of labels in the group and the second four specify the offset of the first label relative to the start of the section. The number of label groups in many games appears to be the smallest prime number larger than half the number of labels (with a max of 101 groups and a minimum of 2). In BotW, this may not always be the case. | |||
{| class="wikitable" | {| class="wikitable" | ||
!Offset (h) | !Offset (h) | ||
Line 111: | Line 124: | ||
| 0x00 | | 0x00 | ||
| 4 | | 4 | ||
| | | u32 | ||
| | | Label Count | ||
|- | |- | ||
| 0x04 | | 0x04 | ||
| 4 | | 4 | ||
| | | u32 | ||
| | | Offset | ||
|} | |} | ||
Messages are sorted into label groups by hashing the label name. A recreation of the hash function is as follows:<syntaxhighlight lang="python3"> | |||
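def calc_hash(label, num_label_groups):
    # Polynomial hash over the label's characters, kept to 32 bits,
    # then reduced to a label group index.
    hash_value = 0
    for char in label:
        hash_value = (hash_value * 0x492 + ord(char)) & 0xFFFFFFFF
    return hash_value % num_label_groups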
</syntaxhighlight> | |||
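The group count heuristic and the hashing step can be combined into a small bucketing sketch. The names <code>is_prime</code>, <code>guess_group_count</code> and <code>group_labels</code> are hypothetical, and the heuristic is only the observation described above (which may not hold for every BotW file), not a confirmed rule. The hash function is restated with the group count as an explicit parameter so that the block is self-contained.

```python
def is_prime(n):
    """Trial-division primality test, sufficient for small counts."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def guess_group_count(num_labels, minimum=2, maximum=101):
    """Smallest prime larger than half the label count, clamped to [2, 101]."""
    candidate = num_labels // 2 + 1
    while not is_prime(candidate):
        candidate += 1
    return max(minimum, min(candidate, maximum))


def calc_hash(label, num_label_groups):
    """Label hash as recreated above, reduced to a group index."""
    value = 0
    for char in label:
        value = (value * 0x492 + ord(char)) & 0xFFFFFFFF
    return value % num_label_groups


def group_labels(labels):
    """Sort label names into groups by hash (hypothetical helper)."""
    count = guess_group_count(len(labels))
    groups = [[] for _ in range(count)]
    for label in labels:
        groups[calc_hash(label, count)].append(label)
    return groups
```

A reader looking up a label only needs to hash it and scan the one group it falls into, which is presumably why the labels are grouped this way.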
=== Labels | === Labels === | ||
Following the label groups is the array of labels. Each label consists of a u8 string length followed by a null-terminated string. At the end of the label is a u32 index specifying the index of the message that corresponds to the label in the text block. | |||
{| class="wikitable" | {| class="wikitable" | ||
!Offset (h) | !Offset (h) | ||
Line 132: | Line 150: | ||
| 0x00 | | 0x00 | ||
| 1 | | 1 | ||
| | | u8 | ||
| String | | String Length | ||
|- | |- | ||
| 0x01 | | 0x01 | ||
| ''n'' | | ''n'' | ||
| char[''n''] | | char[''n''] | ||
| String | | Label String | ||
|- | |- | ||
| 0xnn | | 0xnn | ||
| 4 | | 4 | ||
| | | u32 | ||
| | | Message Text Index | ||
|} | |} | ||
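Reading the whole labels block can be sketched as below. The function name <code>parse_labels</code> is hypothetical, and the sketch makes three assumptions: group offsets are relative to the start of the block data (after the block header), label names are ASCII, and the u8 length covers every byte between the length byte and the u32 index, as the table above suggests.

```python
import struct


def parse_labels(block, endian="<"):
    """Return {label: message_index} from LBL1 block data (hypothetical helper)."""
    (group_count,) = struct.unpack_from(endian + "I", block, 0)
    labels = {}
    for g in range(group_count):
        # Each group entry is a u32 label count followed by a u32 offset.
        label_count, offset = struct.unpack_from(endian + "II", block, 4 + g * 8)
        for _ in range(label_count):
            length = block[offset]
            # Assumption: label names are ASCII and exactly length bytes long.
            name = block[offset + 1:offset + 1 + length].decode("ascii")
            (index,) = struct.unpack_from(endian + "I", block, offset + 1 + length)
            labels[name] = index
            offset += 1 + length + 4
    return labels
```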
== Attributes | == Attributes Block == | ||
The attributes block stores additional, optional attributes that can be associated with messages. Its signature is <code>ATR1</code>. The interpretation of attribute data is entirely up to the game, and attributes in BotW are not fully understood at this time. The attributes seem to indicate which actor the dialog should be attributed to (a good example is in <code>100enemy.msbt</code>, where <code>NPC_GodVoice</code> is referenced). | |||
Each attribute corresponds to the message of the same index in the text block {{check}}. | |||
The block begins with an eight-byte header that specifies the number of attributes and the size of a single attribute. | |||
The | |||
{| class="wikitable" | {| class="wikitable" | ||
!Offset (h) | !Offset (h) | ||
Line 161: | Line 178: | ||
| 0x00 | | 0x00 | ||
| 4 | | 4 | ||
| | | u32 | ||
| | | Attribute Count | ||
|- | |- | ||
| 0x04 | | 0x04 | ||
| 4 | | 4 | ||
| | | u32 | ||
| | | Attribute Size | ||
|} | |} | ||
Following the brief header is an array of the attribute data. In many cases, this data is actually a string offset relative to the start of the section (as is the case in BotW). | |||
{| class="wikitable" | {| class="wikitable" | ||
!Offset (h) | !Offset (h) | ||
Line 185: | Line 195: | ||
| 0x00 | | 0x00 | ||
| 4 | | 4 | ||
| | | u32 | ||
| Attribute String Offset | |||
| Attribute | |||
|} | |} | ||
=== | === Attribute Strings === | ||
If the attribute data is a string offset, the data is followed by an array of null-terminated strings encoded using the encoding specified in the file header. In BotW, this means UTF16-LE on Switch and UTF16-BE on Wii U. | |||
{| class="wikitable" | {| class="wikitable" | ||
!Offset (h) | !Offset (h) | ||
Line 216: | Line 209: | ||
| 0x00 | | 0x00 | ||
| ''n'' | | ''n'' | ||
| | | char_type[] | ||
| | | String | ||
|} | |} | ||
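For the string-offset case, resolving each attribute to its string might look like the sketch below. The names <code>read_utf16_cstring</code> and <code>parse_attributes</code> are hypothetical. The sketch assumes UTF-16 text (as in BotW), that the first four bytes of every attribute are the string offset, and that offsets are relative to the start of the block data.

```python
import struct


def read_utf16_cstring(block, offset, endian="<"):
    """Read a UTF-16 string up to its two-byte null terminator."""
    end = offset
    while end + 2 <= len(block) and block[end:end + 2] != b"\x00\x00":
        end += 2
    codec = "utf-16-le" if endian == "<" else "utf-16-be"
    return block[offset:end].decode(codec)


def parse_attributes(block, endian="<"):
    """Return one string per attribute from ATR1 block data (hypothetical helper)."""
    count, size = struct.unpack_from(endian + "II", block, 0)
    result = []
    for i in range(count):
        # Assumption: each attribute begins with a u32 string offset.
        (str_offset,) = struct.unpack_from(endian + "I", block, 8 + i * size)
        result.append(read_utf16_cstring(block, str_offset, endian))
    return result
```

Stepping through the terminator search two bytes at a time avoids mistaking the zero high byte of an ASCII-range character for the end of the string.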
== | == Text Block == | ||
The text | The text block contains the text for messages. Its signature is <code>TXT2</code>. | ||
The section begins with a small four-byte header specifying the number of messages in the section. | |||
{| class="wikitable" | {| class="wikitable" | ||
!Offset (h) | !Offset (h) | ||
Line 257: | Line 225: | ||
| 0x00 | | 0x00 | ||
| 4 | | 4 | ||
| | | u32 | ||
| | | Message Count | ||
|} | |} | ||
Each | Following the brief header is an array of u32 offsets to the text strings. Each offset is relative to the beginning of the section. | ||
{| class="wikitable" | {| class="wikitable" | ||
!Offset (h) | !Offset (h) | ||
Line 269: | Line 237: | ||
| 0x00 | | 0x00 | ||
| 4 | | 4 | ||
| | | u32 | ||
| String | | String Offset | ||
|} | |} | ||
=== | === Message Strings === | ||
The | The strings are stored as an array of strings encoded using the encoding specified in the header. In BotW, this means UTF16-LE on Switch and UTF16-BE on Wii U. Each string is read from its specified offset until the next string offset. The last string uses the section size specified in the block header to determine its end position. | ||
{| class="wikitable" | {| class="wikitable" | ||
!Offset (h) | !Offset (h) | ||
Line 285: | Line 251: | ||
| 0x00 | | 0x00 | ||
| ''n'' | | ''n'' | ||
| | | char_type[] | ||
| | | String | ||
|} | |} | ||
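Slicing the messages out of the text block can be sketched as follows. The function name <code>parse_messages</code> is hypothetical, and the sketch assumes that <code>block</code> holds exactly the number of bytes given by the block size, with no alignment padding, so that its length marks the end of the last message. The messages are returned as raw bytes because they may contain embedded tags that should be handled before decoding.

```python
import struct


def parse_messages(block, endian="<"):
    """Return the raw bytes of each message in TXT2 block data (hypothetical helper)."""
    (count,) = struct.unpack_from(endian + "I", block, 0)
    offsets = list(struct.unpack_from(endian + str(count) + "I", block, 4))
    # Each message runs to the next offset; the last one runs to the block end.
    offsets.append(len(block))
    return [block[offsets[i]:offsets[i + 1]] for i in range(count)]
```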
== | == Tags == | ||
{{expand section}} | {{expand section}} | ||
Message strings can contain tags that alter how the message is displayed or processed. These tags are processed by <code>nn::ui2d::TagProcessorBase</code>, which in Nintendo EPD games is extended by <code>eui::TagProcessor</code>, which can in turn be extended by the game to handle game-specific tags. The interpretation of tags is completely game-dependent. | |||
Tags are embedded directly inside of the text and begin with a brief tag header. | |||
{| class="wikitable" | {| class="wikitable" | ||
!Offset (h) | !Offset (h) | ||
Line 300: | Line 266: | ||
!Description | !Description | ||
|- | |- | ||
| 0x00 | |0x00 | ||
| 2 | |2 | ||
| | |u16 | ||
| | |Signature (<code>00 0e</code> or <code>00 0f</code>) | ||
|- | |- | ||
| 0x02 | | 0x02 | ||
| 2 | | 2 | ||
| | | u16 | ||
| | | Tag Group | ||
|- | |||
|0x04 | |||
|2 | |||
|u16 | |||
|Tag Type | |||
|- | |||
|0x06 | |||
|2 | |||
|u16 | |||
|Extra Data Size (this value is ignored for <code>00 0f</code> tags which have no data) | |||
|} | |} | ||
In Nintendo EPD games, tag group 0 tags are system tags, group 1 is <code>eui</code> tags, group 2 is app (game) specific tags, and group 201 is grammar tags. The other groups are currently unknown. In BotW, group 4 appears to be for animations and group 5 appears to be for delays. | |||
For example, in <code>100enemy.msbt</code> the following data appears: | For example, in <code>100enemy.msbt</code> the following data appears: | ||
<pre> | 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F | | <pre> | 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F | | ||
Line 322: | Line 300: | ||
70 | 00 02 00 03 00 04 00 05 02 03 00 00 | ............ | | 70 | 00 02 00 03 00 04 00 05 02 03 00 00 | ............ | | ||
</pre> | </pre> | ||
At byte <code>0x68</code> is the | At byte <code>0x68</code> is the tag header, followed by tag group 1 and tag type 6. <code>0xa</code> is the size of the extra data, which is <code>[00 02 00 03 00 04 00 05 02 03]</code>. | ||
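The example above can be decoded with a short sketch. The names <code>parse_tag</code> and <code>decode_choices</code> are hypothetical. The eight header bytes used in the usage lines are reconstructed from the description (big-endian, as on Wii U) rather than copied from the dump, and the choice layout for four choices is assumed to match the one listed for two choices (u16 label indices, then a u8 selected index and a u8 cancel index). Only <code>00 0e</code> tags are handled here.

```python
import struct


def parse_tag(data, offset, endian=">"):
    """Parse one 00 0e tag; return (group, type, extra_data, next_offset)."""
    signature, group, tag_type, size = struct.unpack_from(endian + "4H", data, offset)
    if signature != 0x000E:
        raise ValueError("not a 00 0e tag")
    extra = data[offset + 8:offset + 8 + size]
    return group, tag_type, extra, offset + 8 + size


def decode_choices(extra, num_choices, endian=">"):
    """Assumed layout: num_choices u16 label indices, u8 selected, u8 cancel."""
    fields = struct.unpack(endian + str(num_choices) + "HBB", extra)
    return list(fields[:num_choices]), fields[num_choices], fields[num_choices + 1]


# Header bytes reconstructed from the description; extra data from the dump.
raw = bytes.fromhex("000e00010006000a") + bytes.fromhex("00020003000400050203")
group, tag_type, extra, next_offset = parse_tag(raw, 0)
choices, selected, cancel = decode_choices(extra, 4)
```

Under these assumptions the tag is group 1, type 6 (Four Choices), with label indices 2, 3, 4 and 5, a selected index of 2 and a cancel index of 3.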
=== Tags === | |||
Not all tag functions are currently known. | |||
==== System Tags ==== | |||
=== | |||
{| class="wikitable" | {| class="wikitable" | ||
!Group | |||
!Type | !Type | ||
!Notes | !Notes | ||
|- | |- | ||
| 0 | | 0 | ||
| | | 0 | ||
| | |Ruby (extra data is a u16 display span followed by the ruby text) | ||
|- | |- | ||
| 0 | |||
| 1 | | 1 | ||
| | | Font (extra data is a u16 font type) | ||
|- | |- | ||
| 0 | |||
| 2 | | 2 | ||
| | |Font Size (extra data is a u16 font size) | ||
|- | |- | ||
| 0 | |||
| 3 | | 3 | ||
| | |Font Color (extra data is a u16 color type) | ||
|- | |- | ||
| 0 | |||
| 4 | | 4 | ||
| | | Page Break (no extra data) | ||
|} | |} | ||
The available colors are red, green, blue, gray, and orange in that order. 0xffff indicates a reset to the default white text color. | |||
==== | ==== EUI Tags ==== | ||
{| class="wikitable" | {| class="wikitable" | ||
!Group | |||
!Type | !Type | ||
!Notes | !Notes | ||
|- | |- | ||
| | | 1 | ||
| | | 0 | ||
| | |Delay (extra data is a u32 frame count) | ||
|- | |||
| 1 | |||
| 1 | |||
| Text Speed? | |||
|- | |||
| 1 | |||
| 2 | |||
|No Text Scroll? | |||
|- | |- | ||
| | | 1 | ||
| | | 3 | ||
| | |Auto Advance (extra data is a u32 frame count) | ||
|- | |- | ||
| | |1 | ||
| | |4 | ||
| | |Two Choices (extra data is an array of u16 label indices for each choice followed by a u8 selected index and u8 cancel index) | ||
|- | |- | ||
| | |1 | ||
| | |5 | ||
| | |Three Choices | ||
|- | |- | ||
| | |1 | ||
| | |6 | ||
| | |Four Choices | ||
|- | |- | ||
| | |1 | ||
| | |7 | ||
| | |Picture Font (Icon) (extra data is two u8s, the second being the type) | ||
|- | |- | ||
| | |1 | ||
| | |8 | ||
| | | | ||
|- | |- | ||
| | |1 | ||
| | |9 | ||
| | | | ||
|- | |- | ||
| | |1 | ||
| | |10 | ||
| | | | ||
|} | |} | ||
==== | ==== App Tags ==== | ||
// TODO | |||
==== Group 3 Tags ==== | |||
// TODO | |||
==== Group 4 Tags ==== | |||
// TODO | |||
===== | ==== Grammar Tags ==== | ||
// TODO | |||
== Compiling the Sections == | == Compiling the Sections == |