Serialization - Encoding
Serialization
-
Represents complex data structures (like structs, maps, arrays) as a sequence of bytes for transmission or storage.
Binary
CBOR (Concise Binary Object Representation)
-
2013
-
Why unpopular :
-
CBOR emerged in niche areas (IoT, security) outside mainstream web ecosystems.
-
Unlike JSON (native support), CBOR relies on external libraries in most languages.
-
Lacks backing from major platforms compared to Protobuf, JSON, or Avro.
-
Binary format is not human-readable, complicating debugging.
-
Few official serializers, almost no widely used CLI or visual tools.
-
Adds tooling and debugging complexity that is not always justified.
-
-
RPC :
-
CBOR doesn't implement RPC itself, but it can serve as the payload format inside an RPC system, much as JSON does in JSON-RPC or Protobuf does in gRPC.
-
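To make the CBOR byte layout concrete, here is a minimal sketch of an encoder for a small subset of the format (non-negative integers, text strings, arrays, maps). The function names `cbor_head` and `cbor_encode` are made up for this note. In practice you would use an existing CBOR library rather than hand-rolling one.

```python
import struct

def cbor_head(major: int, value: int) -> bytes:
    """Build the initial byte (3-bit major type + 5-bit info) and any length bytes."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 2**8:
        return bytes([(major << 5) | 24, value])
    if value < 2**16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 2**32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", value)

def cbor_encode(obj) -> bytes:
    """Encode a small subset of CBOR: non-negative int, str, list, dict."""
    if isinstance(obj, bool):
        raise TypeError("booleans are outside this sketch")
    if isinstance(obj, int) and obj >= 0:
        return cbor_head(0, obj)                      # major type 0: unsigned int
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        return cbor_head(3, len(data)) + data         # major type 3: text string
    if isinstance(obj, list):
        return cbor_head(4, len(obj)) + b"".join(cbor_encode(x) for x in obj)
    if isinstance(obj, dict):
        body = b"".join(cbor_encode(k) + cbor_encode(v) for k, v in obj.items())
        return cbor_head(5, len(obj)) + body          # major type 5: map
    raise TypeError(f"unsupported type: {type(obj).__name__}")

encoded = cbor_encode({"a": 1, "b": [2, 3]})
print(encoded.hex())  # a26161016162820203
```

The 9-byte result is not readable without a decoder, which illustrates the debugging point above.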
Custom Binary Format
-
Advantages :
-
Extremely Efficient: Can optimize exactly what you need to send.
-
Full Control: Avoid extra bytes, sending minimal data.
-
-
Disadvantages :
-
Implementation & Maintenance Complexity: Requires skill and time, harder to modify later.
-
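As a rough illustration of a custom binary format, the sketch below packs a hypothetical player-state message with Python's `struct` module. The layout (message type, player id, x, y), the field widths and the function names are assumptions made up for this note, not a format taken from any engine.

```python
import struct

# Hypothetical little-endian layout, chosen only for illustration:
#   uint8   message type
#   uint16  player id
#   float32 x
#   float32 y
PLAYER_STATE = struct.Struct("<BHff")

def encode_player_state(msg_type: int, player_id: int, x: float, y: float) -> bytes:
    """Pack the four fields with no padding and no field names on the wire."""
    return PLAYER_STATE.pack(msg_type, player_id, x, y)

def decode_player_state(data: bytes):
    """Unpack the fields; both sides must agree on the exact layout."""
    return PLAYER_STATE.unpack(data)

packet = encode_player_state(1, 42, 1.5, -2.25)
print(len(packet))                  # 11
print(decode_player_state(packet))  # (1, 42, 1.5, -2.25)
```

The maintenance cost shows up as soon as a field is added or resized: the format string changes, and every sender and receiver has to be updated together.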
Cap'n Proto
FlatBuffers
-
2014, developed by Google
-
Advantages :
-
Direct Access: Can read data without deserialization, boosting performance in games needing immediate data.
-
Compact and Fast: Produces small, fast-to-read files.
-
-
Disadvantages :
-
Implementation Complexity: Setup and usage can be tricky for newcomers.
-
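The "direct access" idea can be sketched without the FlatBuffers library: if the position of a field inside the buffer is known, it can be read in place without decoding anything else. The fixed record layout below is hypothetical and is NOT the real FlatBuffers wire format (which uses offset tables). It only illustrates the principle.

```python
import struct

# Hypothetical fixed-size record, not the FlatBuffers format:
#   offset 0:  uint32  entity id
#   offset 4:  float32 x
#   offset 8:  float32 y
#   offset 12: uint16  health
RECORD = struct.Struct("<IffH")
HEALTH_OFFSET = 12

def read_health(buf: memoryview, record_index: int) -> int:
    """Read one field straight from the buffer, skipping every other field."""
    offset = record_index * RECORD.size + HEALTH_OFFSET
    (health,) = struct.unpack_from("<H", buf, offset)
    return health

# Build a buffer holding three records, as a sender might.
raw = b"".join(RECORD.pack(i, i * 1.0, i * 2.0, 100 - i) for i in range(3))
view = memoryview(raw)  # a view over the bytes, not a copy

print(read_health(view, 2))  # 98
```

Only two bytes are decoded per lookup, regardless of how many records the buffer holds.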
Protocol Buffers (Protobuf)
-
2008 (v2) / 2016 (v3)
-
Internal since 2001, public 2008
-
Developed by Google
-
Advantages :
-
Compact: Generates much smaller data packets than JSON/XML, reducing bandwidth usage.
-
Flexible Structure: Allows defining data schemas for multiple programming languages.
-
Performance: Fast serialization/deserialization reduces client/server load.
-
-
Disadvantages :
-
Schema Dependency: Requires a `.proto` schema file, adding a development step.
-
-
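Part of Protobuf's compactness comes from variable-length integers (varints) and numeric field tags instead of field names. The sketch below hand-encodes a single varint field to show the idea. The function names are made up for this note, and real projects would use code generated from the `.proto` file instead.

```python
def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 payload bits per byte, high bit set on all but the last byte."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)         # final byte
            return bytes(out)

def encode_varint_field(field_number: int, value: int) -> bytes:
    """Tag is (field_number << 3) | wire_type, where wire type 0 means varint."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)

print(encode_varint_field(1, 150).hex())  # 089601
```

Field 1 with value 150 takes 3 bytes here, whereas a JSON equivalent such as `{"a":150}` takes 9.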
Bindings :
MessagePack
-
2008
-
Advantages :
-
Compact and Fast: Nearly as efficient as Protobuf, often 50% smaller than JSON.
-
Familiar Structure: Similar to JSON, easy to implement.
-
-
Disadvantages :
-
Less Flexible than Protobuf: Lacks some space/performance optimizations.
-
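To show why MessagePack feels like "binary JSON", here is a minimal sketch of an encoder for a tiny subset of the format (booleans, integers 0 to 127, short strings, small maps), compared against compact JSON. The function name `msgpack_encode` is made up for this note. A real project would use a MessagePack library.

```python
import json

def msgpack_encode(obj) -> bytes:
    """Encode a tiny subset of MessagePack: bool, int 0..127, short str, small dict."""
    if obj is True:
        return b"\xc3"
    if obj is False:
        return b"\xc2"
    if isinstance(obj, int) and 0 <= obj <= 0x7F:
        return bytes([obj])                          # positive fixint
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        if len(data) <= 31:
            return bytes([0xA0 | len(data)]) + data  # fixstr
    if isinstance(obj, dict) and len(obj) <= 15:
        body = b"".join(msgpack_encode(k) + msgpack_encode(v) for k, v in obj.items())
        return bytes([0x80 | len(obj)]) + body       # fixmap
    raise TypeError("outside the subset covered by this sketch")

message = {"compact": True, "schema": 0}
packed = msgpack_encode(message)
as_json = json.dumps(message, separators=(",", ":")).encode("utf-8")

print(packed.hex())                # 82a7636f6d70616374c3a6736368656d6100
print(len(packed), len(as_json))   # 18 27
```

The keys are still sent as strings, which is one reason a schema-based format like Protobuf can be smaller still.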
Thrift
-
Originally developed at Facebook (open-sourced in 2007); now an Apache Software Foundation project
-
Advantages :
-
Compact and Flexible: Smaller than JSON/XML.
-
RPC Support: Integrated remote procedure calls.
-
-
Disadvantages :
-
Higher Complexity: Requires more setup and learning.
-
Karmem
SQLite
.db
-
SQLite is embedded; each game instance has its own `.db` file, complicating synchronization between players.
Godot: Built-In Encoding with `PacketPeer`
-
Advantages : Integrated into Godot networking, supports common data types, simple and efficient for small/moderate network games.
-
Disadvantages : Less control and advanced optimization than Protobuf/FlatBuffers.
-
Recommendation: Good for small/medium games needing quick state synchronization without extreme optimization.
Godot: Custom Encoding
-
Creating a custom network encoding system
-
Highly optimized, but requires more work per data type.
-
Example: Sending a vector as 1–8 bytes instead of 20 bytes.
-
-
- `var_to_bytes([type, node_path, method_stringname, array_params])`
-
- Progressive encoding:
- `var_to_bytes(type)`
- `var_to_bytes(type) + var_to_bytes(node_path)`
- `var_to_bytes(type) + var_to_bytes(node_path) + var_to_bytes(method_stringname)`
- `var_to_bytes(type) + var_to_bytes(node_path) + var_to_bytes(method_stringname) + var_to_bytes(array_params)`
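The idea of sending a vector in fewer bytes can be sketched independently of the engine. The example below is written in Python for illustration rather than GDScript, and quantizes a 2D vector to two 16-bit fixed-point integers (4 bytes total). The `SCALE` value and the resulting precision and range are assumptions to tune per game.

```python
import struct

SCALE = 100  # assumed precision of 1/100 unit; int16 then covers about -327 to +327 units

def pack_vec2(x: float, y: float) -> bytes:
    """Quantize both components to int16 fixed point: 4 bytes total."""
    return struct.pack("<hh", round(x * SCALE), round(y * SCALE))

def unpack_vec2(data: bytes):
    """Reverse the quantization; precision beyond 1/SCALE is lost."""
    qx, qy = struct.unpack("<hh", data)
    return qx / SCALE, qy / SCALE

packed = pack_vec2(12.34, -5.5)
print(len(packed), unpack_vec2(packed))  # 4 (12.34, -5.5)
```

Values outside the int16 range make `struct.pack` raise `struct.error`, so a real implementation would clamp the value or pick a wider type.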
Non-Binary
-
Formats like JSON/XML are common but not ideal for games due to extra size and overhead bytes.
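A quick way to see the overhead is to encode the same small update both ways. The message fields below (`id`, `x`, `y`) and the binary layout are hypothetical, picked only to make the comparison concrete.

```python
import json
import struct

update = {"id": 7, "x": 1.5, "y": -2.25}

# Compact JSON: field names, quotes and punctuation all travel on the wire.
as_json = json.dumps(update, separators=(",", ":")).encode("utf-8")

# Assumed binary layout: uint16 id, float32 x, float32 y (little-endian).
as_binary = struct.pack("<Hff", update["id"], update["x"], update["y"])

print(as_json)                        # b'{"id":7,"x":1.5,"y":-2.25}'
print(len(as_json), len(as_binary))   # 26 10
```

The gap widens with longer field names and with messages sent many times per second.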
JSON
-
2001
-
Standardized as ECMA-404 (2013), RFC 8259
Encoding
-
Maps characters (text) to byte sequences and vice versa. Used for text fields in serialization.
Naming Confusion
-
When someone says "custom packet encoding", they usually mean:
-
A framing protocol (how message start/end is delimited).
-
A custom serialization/deserialization strategy.
-
A binary or textual format for transmitting structures over the network.
-
-
Using "encoding" for packet framing strategies is technically valid but potentially ambiguous.
-
In networking, it's better to use more specific terms, such as framing, serialization, or wire format.
-
The word "encoding" itself isn't wrong but should be interpreted in the technical context.
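To separate framing from serialization, here is a minimal sketch of length-prefixed framing: each message is preceded by its byte length, so the receiver knows where it ends regardless of what the payload contains. The 4-byte big-endian header and the function names are assumptions for this note.

```python
import struct

HEADER = struct.Struct(">I")  # assumed 4-byte big-endian length prefix

def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length so the receiver can find its end."""
    return HEADER.pack(len(payload)) + payload

def split_frames(buffer: bytes):
    """Return (complete_messages, leftover_bytes) from a received byte stream."""
    messages = []
    pos = 0
    while len(buffer) - pos >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, pos)
        end = pos + HEADER.size + length
        if end > len(buffer):
            break  # partial message: keep the bytes until more arrive
        messages.append(buffer[pos + HEADER.size:end])
        pos = end
    return messages, buffer[pos:]

# Two complete frames followed by the first 6 bytes of a third.
stream = frame(b"hello") + frame(b"hi") + frame(b"partial")[:6]
messages, leftover = split_frames(stream)
print(messages, len(leftover))  # [b'hello', b'hi'] 6
```

The payload bytes could hold JSON, MessagePack, or a custom layout. Framing only decides where each message starts and ends.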
Text
UTF-8
-
Unicode Transformation Format, 8-bit
-
Size :
-
ASCII characters (0–127) use 1 byte
-
Non-ASCII characters use 2 to 4 bytes
-
For languages with many non-ASCII characters (e.g., Chinese, Japanese), it can take more space than UTF-16
-
-
Web standard (used by HTML, JSON, XML, etc.)
-
Backward compatible with ASCII; valid ASCII text is valid UTF-8
-
Serialization:
-
UTF-8 can itself be viewed as a form of serialization: it turns text (a sequence of Unicode code points) into a sequence of bytes
-
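The size rules can be checked directly with Python's built-in codecs. The sample characters below are arbitrary picks, one from each UTF-8 length class, and the helper name `encoded_sizes` is made up for this note.

```python
def encoded_sizes(ch: str):
    """Return (UTF-8 byte count, UTF-16 byte count) for one character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))

# Escapes keep this snippet pure ASCII: A, e-acute, a CJK ideograph, an emoji.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]
for ch in samples:
    utf8_len, utf16_len = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: utf-8={utf8_len} bytes, utf-16={utf16_len} bytes")

# Valid ASCII text is byte-for-byte identical in UTF-8.
print("hello".encode("ascii") == "hello".encode("utf-8"))  # True
```

The CJK sample takes 3 bytes in UTF-8 but 2 in UTF-16, which is the case where UTF-8 is the larger encoding.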
UTF-16
-
Size :
-
BMP characters (Basic Multilingual Plane, U+0000 to U+FFFF) use 2 bytes
-
Characters outside BMP (e.g., emojis, historical scripts) use 4 bytes (surrogate pairs)
-
More compact than UTF-8 for text dominated by BMP characters above U+07FF (e.g., Chinese, Japanese, Korean), which take 3 bytes in UTF-8 but 2 bytes in UTF-16
-
-
Widely used in some APIs and programming languages (e.g., Java, Windows, .NET)
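How a character outside the BMP becomes a surrogate pair can be worked out by hand. The sketch below computes the pair and checks it against Python's built-in UTF-16 codec. The function name `to_surrogates` is made up for this note.

```python
def to_surrogates(code_point: int):
    """Split a code point above U+FFFF into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points outside the BMP need surrogates")
    offset = code_point - 0x10000     # 20-bit value
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low

high, low = to_surrogates(0x1F600)
print(hex(high), hex(low))                       # 0xd83d 0xde00
print("\U0001F600".encode("utf-16-be").hex())    # d83dde00
```

Each surrogate is a 2-byte code unit, which is where the 4-byte total comes from.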
UTF-32
-
Size : Every code point takes exactly 4 bytes, which makes indexing and manipulation by code point straightforward
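The indexing benefit of a fixed width can be shown in a few lines. The sample string and the helper name `code_point_at` are arbitrary choices for this note.

```python
text = "a\u4e2d\U0001F600z"  # ASCII letter, CJK ideograph, emoji, ASCII letter
utf32 = text.encode("utf-32-le")

def code_point_at(data: bytes, index: int) -> str:
    """With a fixed width, code point i always occupies bytes [4*i, 4*i + 4)."""
    return data[4 * index : 4 * index + 4].decode("utf-32-le")

print(len(utf32))                                 # 16
print(code_point_at(utf32, 2) == "\U0001F600")    # True
```

The same lookup in UTF-8 or UTF-16 would require scanning from the start, because earlier characters can have different widths.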
ASCII
-
American Standard Code for Information Interchange
-
Legacy system compatibility : For old systems or devices that only support ASCII
-
Simple English text : When text contains only basic characters (A–Z letters, 0–9 digits, basic punctuation)
-
Simplicity : ASCII is a 7-bit code, normally stored as 1 byte (8 bits) per character, which simplifies processing in very basic systems