RGMP v2 Protocol
RGMP v2 (Rokoko General Motion Protocol, Version 2) is a binary protocol designed for real-time streaming of motion capture, sensor, and system data between applications. While it can be used for streaming data across a network, the protocol is primarily designed for inter-application communication.
It operates over a reliable, stream-oriented transport layer (TCP), using length-prefixed framing to separate messages.
RGMP v2 supports simultaneous streaming from multiple devices and data types, with a flexible grouping mechanism that allows different update rates and logical organization of data streams. Multiple clients can connect to an RGMP server at the same time, each receiving data from all devices connected to the server.
All values are encoded in little-endian format.
Protocol Header
The communication in RGMP v2 is structured as a sequence of frames, each consisting of a header followed by a payload. The frame contains the following:
| Field | Type | Size (bytes) | Description |
|---|---|---|---|
| msg_prefix | uint32 | 4 | Frame type (e.g., 1 for stream definition) |
| msg_len | uint32 | 4 | Payload length (bytes) |
| payload | byte[] | msg_len | Binary data for the frame |
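The framing described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper names `read_exact`, `write_frame`, and `read_frame` are hypothetical, and the stream is assumed to be any file-like object (for a TCP socket, `socket.makefile("rb")` would provide one).

```python
import io
import struct

# Frame header: msg_prefix (uint32) followed by msg_len (uint32), little-endian.
FRAME_HEADER = struct.Struct("<II")


def read_exact(stream, n):
    """Read exactly n bytes from a file-like stream, or raise EOFError."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed in the middle of a frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def write_frame(msg_prefix, payload):
    """Serialize one frame as header plus payload."""
    return FRAME_HEADER.pack(msg_prefix, len(payload)) + payload


def read_frame(stream):
    """Read one frame and return (msg_prefix, payload)."""
    header = read_exact(stream, FRAME_HEADER.size)
    msg_prefix, msg_len = FRAME_HEADER.unpack(header)
    return msg_prefix, read_exact(stream, msg_len)


# Round trip through an in-memory stream.
stream = io.BytesIO(write_frame(2, b"\x01\x02\x03"))
msg_prefix, payload = read_frame(stream)
```

Reading in a loop until `msg_len` bytes have arrived matters on TCP, where a single read may return only part of a frame.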
Frame Types
The msg_prefix field in the header indicates the type of frame being sent. The following frame types are defined in RGMP v2:
| msg_prefix | Frame Type | Description |
|---|---|---|
| 1 | Stream Definition | Describes all data streams, their types, semantics, and relationships |
| 2 | Data Frame | Contains values for one or more streams |
| 3 | Device Disconnect | Signals device disconnection |
Protocol Workflow Diagram
sequenceDiagram
participant Client
participant Server
Client->>Server: Connect (TCP)
Server-->>Client: Stream Definition Frame (Binary/JSON)
loop Streaming
Server-->>Client: Data Frame(s) (Binary)
end
opt Graceful Teardown
Server-->>Client: Device Disconnect Frame (Binary)
end
Client-->>Server: Close Connection
Workflow Rules:
- Initialization: Upon device connection, the server sends a Stream Definition Frame to each connected client.
- Error Handling: RGMP v2 relies entirely on TCP for packet reliability. Any application-level parsing errors must result in the connection being safely terminated.
- Termination: Sessions are gracefully closed via a Device Disconnect Frame (to signal device disconnection) or standard TCP socket closure (to signal session end).
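One way a client might apply these rules is a small per-connection dispatcher, sketched below. The names `Session`, `ProtocolError`, and `handle_frame` are hypothetical. Treating an unknown msg_prefix, or a Data Frame for a device that was never defined, as a parsing error is an assumption of this sketch rather than something the protocol text states.

```python
import json
import struct

STREAM_DEFINITION, DATA_FRAME, DEVICE_DISCONNECT = 1, 2, 3


class ProtocolError(Exception):
    """Raised on any parsing error; the caller should close the connection."""


class Session:
    """Tracks the devices announced on one connection (hypothetical helper)."""

    def __init__(self):
        self.definitions = {}  # device_id -> parsed Stream Definition

    def handle_frame(self, msg_prefix, payload):
        if msg_prefix == STREAM_DEFINITION:
            try:
                definition = json.loads(payload.decode("utf-8"))
                device_id = definition["device_id"]
            except (ValueError, KeyError, TypeError) as exc:
                raise ProtocolError("malformed Stream Definition Frame") from exc
            self.definitions[device_id] = definition
            return "defined"
        if msg_prefix == DATA_FRAME:
            if len(payload) < 16:
                raise ProtocolError("Data Frame shorter than its 16-byte header")
            (device_id,) = struct.unpack_from("<I", payload, 0)
            # Assumption: data for an unknown device is treated as an error.
            if device_id not in self.definitions:
                raise ProtocolError("Data Frame for an undefined device")
            return "data"
        if msg_prefix == DEVICE_DISCONNECT:
            if len(payload) != 4:
                raise ProtocolError("Device Disconnect payload must be 4 bytes")
            (device_id,) = struct.unpack("<I", payload)
            self.definitions.pop(device_id, None)
            return "disconnected"
        # Assumption: unknown frame types are treated as parsing errors.
        raise ProtocolError(f"unknown msg_prefix {msg_prefix}")
```

A caller would wrap `handle_frame` in a `try` block and close the socket when `ProtocolError` is raised, which corresponds to the error handling rule above.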
Stream Definition Frame
The Stream Definition Frame is sent at the start of a session and its payload is a UTF-8 formatted JSON string. Each definition describes the data from a real or synthetic device (e.g., smartsuit, smartglove) and contains:
- Protocol Version & Name: Identifies protocol version and implementation.
- Device Information: Type (e.g., smartsuit, smartglove), address, hub ID, device ID, and debug info.
- Static Data: An array of static values (e.g., fixed transforms, calibration, hardware info) that do not change during the session. Each entry describes the data type, semantics, reference frames, and value.
- Stream Groups: Each group contains a set of streams, a name, and an expected rate (Hz).
See rgmp_v2_definition_frame.schema.json for the full JSON schema.
Note: For this payload, the msg_len field in the header specifies the number of bytes in the string and does not include any null terminator (\0). The null terminator is not transmitted or counted in the length.
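The length rule in the note can be illustrated with a short sketch. The function name `encode_definition_frame` is hypothetical, and the dictionary below is a deliberately incomplete definition used only to show that msg_len counts encoded bytes rather than characters.

```python
import json
import struct


def encode_definition_frame(definition):
    """Build a Stream Definition Frame (msg_prefix 1) from a dict."""
    payload = json.dumps(definition, ensure_ascii=False).encode("utf-8")
    # msg_len counts encoded bytes, not characters, and no trailing \0 is added.
    return struct.pack("<II", 1, len(payload)) + payload


# Hypothetical, incomplete definition; "é" occupies two bytes in UTF-8.
frame = encode_definition_frame({"protocol_name": "RGMP", "device_type": "démo"})
msg_prefix, msg_len = struct.unpack_from("<II", frame)
text = frame[8:].decode("utf-8")
```

Here `msg_len` is one larger than `len(text)` because of the two-byte character, so a receiver must size its buffer from the byte count, not the character count.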
Top-Level Fields
| Field | Type | Description |
|---|---|---|
| protocol_name | string | Name of the protocol (e.g., “RGMP”) |
| protocol_version | string | Protocol version (e.g., “2.0.0”) |
| device_id | number | Unique identifier for the device/session (must fit in a 32-bit unsigned integer) |
| device_type | string | Type of the device |
| timestamp_epoch | string | The epoch of the timestamps used in the data stream (e.g., “unix_epoch” or “device_boot”) |
| device_info | object | Device metadata (see below) |
| static_data | array | List of static data entries (see below) |
| groups | array | List of stream groups (see below) |
Timestamp Epoch
The timestamp_epoch field specifies the reference point for all timestamps in the data stream. Common values include:
- "unix_epoch": Timestamps are in microseconds since the Unix Epoch.
- "device_boot": Timestamps are in microseconds since the device was powered on.
Device Info
The device_info object is optional and can be used to provide metadata about the source of the data.
Static Data
The static_data section allows the protocol to communicate fixed values such as sensor offsets, calibration matrices, or hardware metadata. Each entry includes the same semantic fields as a stream, plus a value field containing the static value. This enables clients to receive all necessary context for interpreting streamed data.
Stream Groups
Groups are a core concept in RGMP v2. Each group defines a logical collection of streams that are transmitted together at a specific rate. This enables, for example, high-frequency raw sensor data to be sent in one group, while lower-frequency processed pose data is sent in another. Each group has:
- name: A human-readable name for the group.
- expected_rate_hz: The expected update rate for this group (in Hz). A value of zero indicates that the group is event-driven and does not have a fixed update rate.
- streams: An array of stream definitions, each describing the data type, semantics, and reference frames for a single stream.
Groups allow the protocol to efficiently support mixed-rate data and logical separation of different data types (e.g., raw IMU, pose, diagnostics). Data streams should be grouped according to their natural update rates and semantic relationships, since it is not possible to partially emit a group frame. If a partial update is needed, the group should be split into multiple groups.
Stream, Group, and Static Data Entry Fields
The following fields are used for both streams (within groups) and static_data entries. This unified format allows for consistent semantics and metadata across all types of data in the protocol.
| Field | Type | Description |
|---|---|---|
| data_type | string | Data type (e.g., “FLOAT[3]”, “FLOAT[7]”, “UINT32”) |
| measure_type | string | Semantic meaning of the data (e.g., “TRANSFORM”, “STATUS_FLAGS”) |
| target_frame | string | Name of the target object or sensor the data describes |
| reference_frame | string | (Optional) The coordinate frame the data is measured relative to. If omitted, defaults to target_frame itself. Both forms (omitted, or explicitly equal to target_frame) are accepted on the wire and treated as semantically identical for uniqueness and lookup. |
| custom_label | string | Required when measure_type is CUSTOM; rejected on every other measure_type |
| bit_mapping | object | (Optional) Mapping of flag bits to names (for flag types) |
| value | varies | (Only for static_data) The static value (array, object, scalar) |
Note: If reference_frame is omitted, the data is assumed to be intrinsic to the target_frame (e.g., raw unrotated IMU data or hardware status flags). Setting reference_frame explicitly equal to target_frame is allowed and means the same thing.
Data Types
| Data Type | Description | Binary Encoding |
|---|---|---|
| INT32 | 32-bit signed integer | int32_t (two’s complement) |
| UINT32 | 32-bit unsigned integer | uint32_t |
| INT64 | 64-bit signed integer | int64_t (two’s complement) |
| UINT64 | 64-bit unsigned integer | uint64_t |
| FLOAT | Single float value | float (IEEE 754 single-precision) |
| DOUBLE | Double float value | double (IEEE 754 double-precision) |
Vector and matrix types:
| Data Type | Description | Binary Encoding |
|---|---|---|
| TYPE[N] | Array of N values of base type TYPE | N values of base type TYPE |
| TYPE[N, M] | N by M matrix of base type TYPE | N*M values of base type TYPE, row-major order |
The dimensions N and M must each be at least 1; TYPE[0], TYPE[0,M], and TYPE[N,0] are not valid data types.
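A consumer needs the wire size of each data type to walk a payload. The sketch below parses data_type strings and computes that size. The function names are hypothetical, and the exact whitespace accepted inside the brackets is an assumption, since the text above shows both "FLOAT[3]" and "TYPE[N, M]" but does not define a formal grammar.

```python
import re

# Size in bytes of each base type from the table above.
BASE_SIZES = {"INT32": 4, "UINT32": 4, "INT64": 8, "UINT64": 8, "FLOAT": 4, "DOUBLE": 8}

# Base type, optionally followed by [N] or [N, M].
_DATA_TYPE = re.compile(r"^([A-Z0-9]+)(?:\[\s*(\d+)\s*(?:,\s*(\d+)\s*)?\])?$")


def parse_data_type(data_type):
    """Return (base_type, dims) for strings such as 'FLOAT[3]' or 'DOUBLE[3, 3]'."""
    match = _DATA_TYPE.match(data_type.strip())
    if match is None or match.group(1) not in BASE_SIZES:
        raise ValueError(f"invalid data_type: {data_type!r}")
    dims = tuple(int(d) for d in match.groups()[1:] if d is not None)
    if any(d < 1 for d in dims):
        raise ValueError(f"dimensions must be at least 1: {data_type!r}")
    return match.group(1), dims


def byte_size(data_type):
    """Number of bytes one value of this data_type occupies on the wire."""
    base, dims = parse_data_type(data_type)
    count = 1
    for d in dims:
        count *= d
    return BASE_SIZES[base] * count
```

For example, `byte_size("FLOAT[7]")` gives 28 and `byte_size("DOUBLE[3, 3]")` gives 72, while `parse_data_type("FLOAT[0]")` raises `ValueError`.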
Measure Types
| Measure Type | Description | Units |
|---|---|---|
| POSITION | 3D position vector | meters |
| ORIENTATION | 3D orientation (i.e., quaternion given in x,y,z,w order) | unitless |
| TRANSFORM | Rigid body transform (pose), given as a position followed by a quaternion (x,y,z,w) | position: meters, quaternion: unitless |
| ANGULAR_VELOCITY | Angular velocity vector | radians/second |
| LINEAR_VELOCITY | Linear velocity | meters/second |
| LINEAR_ACCELERATION | Linear acceleration | meters/second^2 |
| PROPER_ACCELERATION | Proper acceleration (includes gravity) | meters/second^2 |
| MAGNETIC_FIELD | Magnetic field vector | Gauss |
| STATUS_FLAGS | System or device status flags | unitless |
| CUSTOM | Custom semantic meaning defined by the user | unitless |
Reference Frames and Transforms
When defining a stream, target_frame and reference_frame can be any frame name, but the following standard identifiers are recommended where applicable:
- LTP_NED: Local Tangent Plane (North, East, Down). Right-handed.
- LTP_ENU: Local Tangent Plane (East, North, Up). Right-handed.
Frame names should be written in snake_case, with the exception of the standard frames above. For example, left_hand, right_hand, head, etc.
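A rotation or transform defined with the target_frame and reference_frame describes how a vector can be transformed from the target frame to the reference frame. That is, if $R$ is the rotation from target_frame to reference_frame, then the position of object A in the reference frame can be computed as the following matrix-vector product:

$$p_A^{\text{ref}} = R \, p_A^{\text{target}}$$

where $p_A^{\text{target}}$ is the position of object A in the target frame. For a TRANSFORM with a position $t$ and a quaternion $q$:

$$p_A^{\text{ref}} = R(q) \, p_A^{\text{target}} + t$$

where $R(q)$ is the rotation matrix corresponding to $q$.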
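As an illustration of the target-to-reference convention, the sketch below maps a point through a TRANSFORM value. It assumes a unit quaternion in (x, y, z, w) order and that the point is rotated first and then offset by the position. The function names `rotate` and `apply_transform` are hypothetical.

```python
import math


def rotate(q, v):
    """Rotate vector v by unit quaternion q given in (x, y, z, w) order."""
    x, y, z, w = q
    vx, vy, vz = v
    # t = 2 * (q_xyz cross v)
    tx = 2.0 * (y * vz - z * vy)
    ty = 2.0 * (z * vx - x * vz)
    tz = 2.0 * (x * vy - y * vx)
    # v' = v + w * t + (q_xyz cross t)
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )


def apply_transform(transform, p_target):
    """Map a point from target_frame to reference_frame.

    transform is a FLOAT[7] value: position (x, y, z) then quaternion (x, y, z, w).
    """
    t, q = transform[:3], transform[3:]
    r = rotate(q, p_target)
    return tuple(r[i] + t[i] for i in range(3))


# 90 degree rotation about z, then an offset of (1, 2, 3).
s = math.sqrt(0.5)
p_ref = apply_transform((1.0, 2.0, 3.0, 0.0, 0.0, s, s), (1.0, 0.0, 0.0))
```

With these inputs the x axis of the target frame rotates onto the y axis, so `p_ref` is approximately (1, 3, 3).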
Stream uniqueness within a group
Within a single group, every stream must be uniquely identified. The uniqueness key depends on the measure_type:
- Standard measure types (POSITION, ORIENTATION, TRANSFORM, ANGULAR_VELOCITY, LINEAR_VELOCITY, LINEAR_ACCELERATION, PROPER_ACCELERATION, MAGNETIC_FIELD, STATUS_FLAGS): keyed by (measure_type, target_frame, reference_frame). Two POSITION streams describing the same target_frame but expressed in different reference_frames are distinct streams.
- CUSTOM: keyed by (measure_type, target_frame, reference_frame, custom_label).
Optional fields participate in the key as their presence: a stream with no reference_frame has a different key from one with reference_frame: "LTP_ENU". Lookup at the consumer is strict — passing no reference_frame will not match a stream that declared one.
Two streams in the same group sharing the same key are a protocol error.
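A duplicate check along these lines could look like the sketch below. It treats an omitted reference_frame as equal to target_frame, following the field description earlier in this document. The function names are hypothetical, and streams are assumed to be dictionaries parsed from the JSON definition.

```python
def stream_key(stream):
    """Uniqueness key for one stream definition (a dict parsed from JSON)."""
    target = stream["target_frame"]
    # An omitted reference_frame is treated as equal to target_frame.
    reference = stream.get("reference_frame", target)
    key = (stream["measure_type"], target, reference)
    if stream["measure_type"] == "CUSTOM":
        key += (stream.get("custom_label"),)
    return key


def find_duplicate_keys(group):
    """Return the keys that occur more than once in group['streams']."""
    seen, duplicates = set(), set()
    for stream in group["streams"]:
        key = stream_key(stream)
        (duplicates if key in seen else seen).add(key)
    return duplicates
```

A producer could run `find_duplicate_keys` on every group before sending the definition and refuse to start the session if the result is not empty.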
Custom label
The custom_label field is the disambiguator for streams whose measure_type is CUSTOM. Two CUSTOM streams in the same group with the same target_frame and reference_frame are distinguished by their custom_label.
Validity is bidirectional:
- custom_label is required when measure_type is CUSTOM. A CUSTOM stream with no label is a protocol error.
- custom_label is rejected on every other measure_type. Setting it on POSITION, STATUS_FLAGS, or any standard measure is a protocol error.
The label does not affect the binary encoding of the data and is purely informational — it surfaces only in the JSON Stream Definition Frame, where it tells consumers what the otherwise-opaque CUSTOM stream represents ("battery_pct", "emf_residual", etc.).
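The two-way rule for custom_label is small enough to check in one function. The sketch below uses a hypothetical name, `check_custom_label`, and tests only for the presence of the field. Whether an empty string counts as a valid label is not specified here, so the sketch does not decide it.

```python
def check_custom_label(stream):
    """Raise ValueError if custom_label is used inconsistently with measure_type."""
    is_custom = stream.get("measure_type") == "CUSTOM"
    has_label = "custom_label" in stream
    if is_custom and not has_label:
        raise ValueError("CUSTOM stream requires custom_label")
    if has_label and not is_custom:
        raise ValueError("custom_label is only allowed on CUSTOM streams")
```

A parser would typically map the `ValueError` to a protocol error and terminate the connection.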
Bit-mapped Flags
For streams with measure_type of STATUS_FLAGS, the bit_mapping field must be provided. This field is an object that maps each bit index (starting from 0) to a human-readable name for that flag. For example:
"bit_mapping": { "0": "is_tracking", "1": "is_calibrated", "2": "has_error" }
Value (for static_data entries)
For entries in the static_data array, the value field must be provided and contain the static value corresponding to the defined data type. Array data must be provided as a flat JSON array, even for matrix types (e.g., a 3x3 matrix should be provided as an array of 9 values in row-major order).
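Because matrix values arrive as flat arrays, a consumer has to rebuild the rows itself. A minimal sketch, with the hypothetical helper name `as_rows`:

```python
def as_rows(flat_values, n_rows, n_cols):
    """Rebuild an n_rows x n_cols matrix from a flat row-major list."""
    if len(flat_values) != n_rows * n_cols:
        raise ValueError("value length does not match the declared dimensions")
    return [flat_values[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


# A FLOAT[3, 3] identity matrix as it would appear in a static_data value.
identity = as_rows([1, 0, 0, 0, 1, 0, 0, 0, 1], 3, 3)
```

Element (r, c) of the matrix sits at index `r * n_cols + c` of the flat array, which is what row-major order means here.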
Data Frame
The Data Frame is sent repeatedly during streaming and contains the data values for all streams in the stream group given by group_id, in the order defined in the stream definition frame.
Data Frame Layout
Data Frame Header
| Field | Type | Size (bytes) | Description |
|---|---|---|---|
| device_id | UINT32 | 4 | Device/session identifier |
| group_id | UINT32 | 4 | Group index corresponding to the stream definition frame (starting from 0) |
| timestamp_us | UINT64 | 8 | Capture timestamp in microseconds since the defined epoch |
| payload | byte[] | according to stream definition | Binary data for all streams in the group |
Payload
| Field | Type | Description |
|---|---|---|
| Stream 1 | varies | Data for stream 1 (type and order defined by the stream definition) |
| Stream 2 | varies | Data for stream 2 (type and order defined by the stream definition) |
| … | … | … |
All stream values for the specified group must be present in the frame, in the order defined in the stream definition. Each value is packed according to its data_type.
All fields within the Data Frame are tightly packed without implicit memory padding. Because the Data Frame header is exactly 16 bytes and every defined base type is 4 or 8 bytes wide, every value in the payload falls on a 4-byte boundary relative to the start of the frame payload. 64-bit values (INT64, UINT64, DOUBLE) are guaranteed only 4-byte alignment, not 8-byte alignment.
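Putting the layout together, a Data Frame payload can be decoded by reading the 16-byte header and then unpacking each stream in definition order. This is a sketch under stated assumptions: `element_count` and `decode_data_frame` are hypothetical names, `groups` is the groups array from the matching Stream Definition Frame, and data_type strings are assumed to be well formed.

```python
import struct

# struct format character for each base type; "<" selects little-endian, no padding.
BASE_FORMATS = {"INT32": "i", "UINT32": "I", "INT64": "q", "UINT64": "Q",
                "FLOAT": "f", "DOUBLE": "d"}


def element_count(data_type):
    """Return (base_type, number of elements) for e.g. 'FLOAT[3]' or 'FLOAT[3, 3]'."""
    base, _, dims = data_type.partition("[")
    count = 1
    for dim in dims.rstrip("]").split(","):
        if dim.strip():
            count *= int(dim)
    return base.strip(), count


def decode_data_frame(payload, groups):
    """Decode a Data Frame payload using the groups list of its Stream Definition."""
    device_id, group_id, timestamp_us = struct.unpack_from("<IIQ", payload, 0)
    offset, values = 16, []
    for stream in groups[group_id]["streams"]:
        base, count = element_count(stream["data_type"])
        fmt = "<" + str(count) + BASE_FORMATS[base]
        values.append(struct.unpack_from(fmt, payload, offset))
        offset += struct.calcsize(fmt)
    if offset != len(payload):
        raise ValueError("payload size does not match the stream definition")
    return device_id, group_id, timestamp_us, values
```

Each stream is returned as a tuple of its elements, so a FLOAT[3] position yields three floats and a scalar UINT32 yields a one-element tuple.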
Timestamp Definition
The timestamp_us field must represent the Time of Validity (hardware capture time) of the measurements, not the time of network packet construction or transmission.
The reference epoch of this timestamp is defined by the timestamp_epoch field in the Stream Definition Frame.
Note: Timestamps must be strictly monotonically increasing within a single session.
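A receiver can convert "unix_epoch" timestamps to wall-clock time and verify the ordering rule with a few lines. The names `to_datetime` and `MonotonicCheck` are hypothetical. The text does not say whether ordering is tracked per device or per group, so the sketch leaves that choice to the caller, who creates one checker per scope.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_datetime(timestamp_us):
    """Convert a 'unix_epoch' timestamp_us to an aware UTC datetime."""
    return UNIX_EPOCH + timedelta(microseconds=timestamp_us)


class MonotonicCheck:
    """Rejects timestamps that do not strictly increase (hypothetical helper)."""

    def __init__(self):
        self.last_us = None

    def accept(self, timestamp_us):
        """Return True and remember the timestamp if it is newer than the last one."""
        if self.last_us is not None and timestamp_us <= self.last_us:
            return False
        self.last_us = timestamp_us
        return True
```

Timestamps with the "device_boot" epoch have no absolute meaning, so only the ordering check applies to them.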
Device Disconnect Frame
The Device Disconnect Frame is sent when a device is disconnected. It is a simple binary frame that signals the removal of a device from the session.
Device Disconnect Payload
| Field | Type | Size (bytes) | Description |
|---|---|---|---|
| device_id | uint32 | 4 | Device/session identifier |
Upon receiving this frame, the receiver should remove or mark the device as disconnected and clean up any associated resources. A new device with the same device_id may connect in the future, but it should be treated as a new session.