[Rust] add float16 to rust #3207

@chaokunyang

Feature Request

Add full IEEE 754 half-precision floating-point (float16, binary16) support to Fory/FDL, including a complete Rust runtime implementation built on a strong type named float16 (internally storing the u16 bits), plus exhaustive tests for conversion, rounding, NaN, and subnormal semantics.

Is your feature request related to a problem? Please describe

We want to use float16 in FDL to reduce payload size and memory footprint and to interoperate with other ecosystems (ML/graphics/etc.) where half precision is common. Currently Fory supports float32/float64 but not float16.

Stable Rust does not provide a built-in half-precision type with the exact APIs we need (nightly has f16, but it is unstable and the name may conflict with our public API). We need a portable, stable, first-party implementation with:

  • exact IEEE 754 binary16 bit representation (2 bytes),
  • well-defined IEEE conversion/rounding behavior,
  • a single strong type named float16 (to avoid conflicts with Rust f16 / nightly naming),
  • predictable serialization and cross-language compatibility.

Describe the solution you'd like

1) FDL / Type System

  • Introduce a new primitive type: float16.
  • Treat float16 as a true primitive (like float32/float64), usable in:
    • message fields
    • repeated fields
    • map values (and optionally keys if numeric keys are allowed; if not allowed today, keep consistent)
    • unions (if primitives are allowed)
  • Document the exact definition: IEEE 754 binary16 (“half precision”) per:
    https://en.wikipedia.org/wiki/Half-precision_floating-point_format

2) Wire Format / Serialization Semantics

  • Encode float16 as 2 bytes representing the raw IEEE 754 binary16 bit pattern (u16).
  • Define endianness exactly (match existing float32/float64 endianness rules).
  • NaN/Inf/±0/subnormal must round-trip correctly at the bit level.
    • If the framework canonicalizes NaNs today for float32/float64, specify whether float16 should:
      • preserve payload bits (preferred if feasible), or
      • canonicalize to a single quiet NaN pattern (acceptable but must be documented and consistent across languages).
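
A minimal sketch of the 2-byte encoding described above. Little-endian byte order is an assumption made for illustration; the real order should be whatever Fory already uses for float32/float64. The helper names `write_float16` and `read_float16` are hypothetical, and the struct is only a stand-in for the proposed type.

```rust
/// Stand-in for the proposed strong type; only the bit accessors matter here.
#[allow(non_camel_case_types)]
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct float16(u16);

impl float16 {
    pub const fn from_bits(bits: u16) -> float16 { float16(bits) }
    pub const fn to_bits(self) -> u16 { self.0 }
}

/// Hypothetical helper: append the raw binary16 bit pattern as 2 bytes.
/// Little-endian is an assumption, not a statement about Fory's wire format.
pub fn write_float16(buf: &mut Vec<u8>, value: float16) {
    buf.extend_from_slice(&value.to_bits().to_le_bytes());
}

/// Hypothetical helper: read 2 bytes back into a float16.
/// Returns None if fewer than 2 bytes are available.
pub fn read_float16(buf: &[u8]) -> Option<float16> {
    let b = buf.get(..2)?;
    Some(float16::from_bits(u16::from_le_bytes([b[0], b[1]])))
}
```

Because only the raw bits are copied, NaN payloads, signed zeros, and subnormals survive the round trip unchanged; a canonicalizing policy would need an extra step before `write_float16`.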

3) Rust Runtime (core requirement): float16 strong type only

Provide a public strong type named float16. All runtime APIs must accept/return float16 only (no passing raw u16 bits around as public API).

3.1 Type definition
  • Provide a transparent, copyable value type:
    • #[repr(transparent)]
    • pub struct float16(u16);
  • Must be Copy, Clone, and Default. The Eq/Hash policy must be explicitly defined (see the comparisons section).
  • No heap allocation.

Provide controlled construction and bit access:

  • pub const fn from_bits(bits: u16) -> float16
  • pub const fn to_bits(self) -> u16
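
A minimal sketch of the type definition and bit accessors listed above. The `#[allow(non_camel_case_types)]` attribute and the compile-time size check are additions for illustration; Eq/Hash are left out on purpose because that policy is decided in the comparisons section.

```rust
/// Sketch of the proposed strong type: a transparent wrapper over the raw bits.
#[allow(non_camel_case_types)] // the lowercase name is deliberate in this proposal
#[repr(transparent)]
#[derive(Clone, Copy, Default)] // Default is the all-zero pattern, i.e. +0.0
pub struct float16(u16);

impl float16 {
    /// Reinterpret a raw IEEE 754 binary16 bit pattern as a float16.
    pub const fn from_bits(bits: u16) -> float16 {
        float16(bits)
    }

    /// Return the raw IEEE 754 binary16 bit pattern.
    pub const fn to_bits(self) -> u16 {
        self.0
    }
}

// Compile-time check that the type stays exactly 2 bytes, with no heap data.
const _: () = assert!(std::mem::size_of::<float16>() == 2);
```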

3.2 Conversions (IEEE 754 compliant)
  • pub fn from_f32(x: f32) -> float16
    • Convert float32 -> binary16 using IEEE 754 rules and round-to-nearest, ties-to-even.
    • Must correctly handle:
      • NaN (produce a NaN in half; preserve payload if feasible, otherwise canonicalize; ensure quiet NaN if required)
      • ±Inf
      • ±0 (preserve sign)
      • normalized values
      • subnormals (gradual underflow)
      • overflow -> ±Inf
      • underflow -> subnormal/±0
  • pub fn to_f32(self) -> f32
    • Convert binary16 -> float32 (exact for all half values).
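
One way the two conversions could look, as a sketch rather than a reference implementation. It assumes the payload-preserving NaN option: the top 10 mantissa bits are kept, and the quiet bit is forced only when the payload would otherwise vanish and turn the NaN into an infinity.

```rust
#[allow(non_camel_case_types)]
#[repr(transparent)]
#[derive(Clone, Copy, Default)]
pub struct float16(u16);

impl float16 {
    pub const fn from_bits(bits: u16) -> float16 { float16(bits) }
    pub const fn to_bits(self) -> u16 { self.0 }

    /// f32 -> binary16 with round-to-nearest, ties-to-even.
    pub fn from_f32(x: f32) -> float16 {
        let bits = x.to_bits();
        let sign = ((bits >> 16) & 0x8000) as u16;
        let exp = ((bits >> 23) & 0xFF) as i32;
        let man = bits & 0x007F_FFFF;

        if exp == 0xFF {
            if man == 0 {
                return float16(sign | 0x7C00); // ±Inf
            }
            // NaN: keep the top payload bits (assumed policy, see lead-in).
            let payload = (man >> 13) as u16;
            let quiet = if payload == 0 { 0x0200 } else { 0 };
            return float16(sign | 0x7C00 | payload | quiet);
        }

        let e = exp - 127 + 15; // exponent rebiased for binary16
        if e >= 0x1F {
            return float16(sign | 0x7C00); // overflow -> ±Inf
        }
        if e <= 0 {
            if e < -10 {
                return float16(sign); // below half of 2^-24: rounds to ±0
            }
            // Subnormal result: shift the 24-bit significand into place.
            let m = man | 0x0080_0000; // restore the implicit leading 1
            let shift = (14 - e) as u32; // 14..=24
            let half = m >> shift;
            let rem = m & ((1u32 << shift) - 1);
            let halfway = 1u32 << (shift - 1);
            let round_up = rem > halfway || (rem == halfway && (half & 1) == 1);
            return float16(sign | (half + round_up as u32) as u16);
        }
        // Normal result: drop 13 mantissa bits. A carry out of the mantissa
        // bumps the exponent, and may correctly produce ±Inf.
        let half = ((e as u32) << 10) | (man >> 13);
        let rem = man & 0x1FFF;
        let round_up = rem > 0x1000 || (rem == 0x1000 && (half & 1) == 1);
        float16(sign | (half + round_up as u32) as u16)
    }

    /// binary16 -> f32; every half value is exactly representable in f32.
    pub fn to_f32(self) -> f32 {
        let h = self.0 as u32;
        let sign = (h & 0x8000) << 16;
        let exp = (h >> 10) & 0x1F;
        let man = h & 0x03FF;
        let magnitude = match exp {
            // Zero or subnormal: man * 2^-24 (0x3380_0000 is 2^-24 as f32).
            0 => (man as f32 * f32::from_bits(0x3380_0000)).to_bits(),
            // Inf or NaN: widen the payload, keep the quiet bit as it was.
            0x1F => 0x7F80_0000 | (man << 13),
            // Normal: rebias the exponent (127 - 15 = 112), widen the mantissa.
            _ => ((exp + 112) << 23) | (man << 13),
        };
        f32::from_bits(sign | magnitude)
    }
}
```

For example, `float16::from_f32(65520.0)` lies exactly halfway between 65504 and the next binade, so ties-to-even carries it into the exponent and the result is +Inf (0x7C00).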

Nightly mirroring guidance:

  • If building on nightly, optionally provide From<f16> / Into<f16> behind a feature gate, but public type name stays float16 to avoid conflicts:
    • #[cfg(feature = "nightly-f16")]
    • conversions should be lossless at the bit level.

3.3 Classification (IEEE-consistent)

All operate on float16:

  • pub fn is_nan(self) -> bool
  • pub fn is_infinite(self) -> bool and/or pub fn is_infinite_sign(self, sign: i32) -> bool (sign: +1/-1/0)
  • pub fn is_zero(self) -> bool (treat +0/-0 as zero)
  • pub fn is_sign_negative(self) -> bool
  • pub fn is_subnormal(self) -> bool
  • pub fn is_normal(self) -> bool
  • pub fn is_finite(self) -> bool
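
The classification predicates can all be answered from the bit pattern alone. A sketch, with the mask constant names being illustrative choices rather than part of the proposal:

```rust
#[allow(non_camel_case_types)]
#[repr(transparent)]
#[derive(Clone, Copy, Default)]
pub struct float16(u16);

const SIGN_MASK: u16 = 0x8000; // 1 sign bit
const EXP_MASK: u16 = 0x7C00; // 5 exponent bits
const MAN_MASK: u16 = 0x03FF; // 10 mantissa bits

impl float16 {
    pub const fn from_bits(bits: u16) -> float16 { float16(bits) }

    /// Exponent all ones and a non-zero mantissa.
    pub fn is_nan(self) -> bool {
        (self.0 & EXP_MASK) == EXP_MASK && (self.0 & MAN_MASK) != 0
    }
    /// Exponent all ones and a zero mantissa, either sign.
    pub fn is_infinite(self) -> bool {
        (self.0 & !SIGN_MASK) == EXP_MASK
    }
    /// +0 and -0 both count as zero.
    pub fn is_zero(self) -> bool {
        (self.0 & !SIGN_MASK) == 0
    }
    /// True for any value with the sign bit set, including -0 and negative NaN.
    pub fn is_sign_negative(self) -> bool {
        (self.0 & SIGN_MASK) != 0
    }
    /// Exponent zero and a non-zero mantissa.
    pub fn is_subnormal(self) -> bool {
        (self.0 & EXP_MASK) == 0 && (self.0 & MAN_MASK) != 0
    }
    /// Neither zero, subnormal, infinite, nor NaN.
    pub fn is_normal(self) -> bool {
        let exp = self.0 & EXP_MASK;
        exp != 0 && exp != EXP_MASK
    }
    /// Anything that is not infinite or NaN.
    pub fn is_finite(self) -> bool {
        (self.0 & EXP_MASK) != EXP_MASK
    }
}
```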

3.4 Arithmetic (explicit methods + traits)

Rust supports operator overloading via traits. To make float16 feel like a numeric primitive, implement both:

  • explicit methods (for clarity and symmetry with other languages), and
  • standard traits (Add, Sub, Mul, Div, Neg).

Minimum explicit API:

  • pub fn add(self, rhs: float16) -> float16
  • pub fn sub(self, rhs: float16) -> float16
  • pub fn mul(self, rhs: float16) -> float16
  • pub fn div(self, rhs: float16) -> float16
  • pub fn neg(self) -> float16
  • pub fn abs(self) -> float16

Implementation rule for arithmetic (unless full half-FPU emulation is desired):

  • compute in f32 and round back to half each op:
    • float16::from_f32(self.to_f32() op rhs.to_f32())
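
The f32 round trip above is what add, sub, mul, and div need. Of the listed operations, neg and abs only touch the sign bit, so a sketch of those two can work directly on the bits and also shows how an explicit method and the matching operator trait can coexist on the same type. Treating them as pure sign-bit operations is an assumption of this sketch, not something the proposal prescribes.

```rust
use std::ops::Neg;

#[allow(non_camel_case_types)]
#[repr(transparent)]
#[derive(Clone, Copy, Default)]
pub struct float16(u16);

const SIGN_BIT: u16 = 0x8000;

impl float16 {
    pub const fn from_bits(bits: u16) -> float16 { float16(bits) }
    pub const fn to_bits(self) -> u16 { self.0 }

    /// Explicit method: flip the sign bit. Maps +0 to -0 and leaves the
    /// exponent and mantissa (including any NaN payload) untouched.
    pub fn neg(self) -> float16 {
        float16(self.0 ^ SIGN_BIT)
    }

    /// Explicit method: clear the sign bit.
    pub fn abs(self) -> float16 {
        float16(self.0 & !SIGN_BIT)
    }
}

/// Operator form, so that `-x` works on float16 values.
impl Neg for float16 {
    type Output = float16;
    fn neg(self) -> float16 {
        float16(self.0 ^ SIGN_BIT)
    }
}
```

With this in place, `-x` and `x.neg()` give the same bits; the binary operators would follow the same pairing, with each trait impl delegating to the `from_f32(self.to_f32() op rhs.to_f32())` rule.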

Optional math parity (if needed by users):

  • sqrt, min, max, copysign, floor, ceil, trunc, round, round_ties_even

3.5 Comparisons + equality/hash policy (must be explicit)

Rust has strong expectations for Eq/Ord/Hash. IEEE floats are tricky due to NaN and signed zero.

Please choose and document one of these policies:

Policy A (recommended): bitwise Eq/Hash

  • Implement:
    • PartialEq/Eq/Hash based on to_bits() equality
  • Provide IEEE numeric comparison helpers separately:
    • pub fn eq_value(self, other: float16) -> bool (NaN != NaN, +0 == -0)
    • pub fn partial_cmp_value(self, other: float16) -> Option<Ordering> (None if NaN involved)

Policy B: IEEE-like PartialEq only (no Eq/Hash)

  • Implement PartialEq with IEEE rules, do not implement Eq/Hash.
  • This is closer to f32, but makes usage in hash maps harder.

Regardless of chosen policy, provide:

  • pub fn lt(self, other: float16) -> bool etc. (NaN => false)
  • pub fn partial_cmp(self, other: float16) -> Option<Ordering> aligned with f32 semantics
  • Optional: pub fn total_cmp(self, other: float16) -> Ordering (mirroring f32::total_cmp)
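
A sketch of what Policy A could look like: Eq and Hash are derived from the raw bits, while the IEEE value comparisons live in separate methods. The private helper `numeric_key` is a hypothetical name introduced here; `total_cmp` uses the same bit trick that the standard library documents for `f32::total_cmp`, narrowed to 16 bits.

```rust
use std::cmp::Ordering;

#[allow(non_camel_case_types)]
#[repr(transparent)]
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, Hash)] // Policy A: bitwise
pub struct float16(u16);

impl float16 {
    pub const fn from_bits(bits: u16) -> float16 { float16(bits) }
    pub const fn to_bits(self) -> u16 { self.0 }

    fn is_nan(self) -> bool {
        (self.0 & 0x7FFF) > 0x7C00
    }

    /// Hypothetical helper: map sign-magnitude bits to an integer whose
    /// ordering matches numeric ordering; +0 and -0 both map to 0.
    fn numeric_key(self) -> i32 {
        let magnitude = (self.0 & 0x7FFF) as i32;
        if self.0 & 0x8000 != 0 { -magnitude } else { magnitude }
    }

    /// IEEE value equality: NaN != NaN, +0 == -0.
    pub fn eq_value(self, other: float16) -> bool {
        self.partial_cmp_value(other) == Some(Ordering::Equal)
    }

    /// IEEE value ordering: None if either operand is NaN.
    pub fn partial_cmp_value(self, other: float16) -> Option<Ordering> {
        if self.is_nan() || other.is_nan() {
            return None;
        }
        Some(self.numeric_key().cmp(&other.numeric_key()))
    }

    /// Total order: -NaN < -Inf < ... < -0 < +0 < ... < +Inf < +NaN.
    pub fn total_cmp(self, other: float16) -> Ordering {
        let mut a = self.0 as i16;
        let mut b = other.0 as i16;
        // For negative values, flip every bit except the sign bit so that
        // two's-complement integer order matches the desired float order.
        a ^= (((a >> 15) as u16) >> 1) as i16;
        b ^= (((b >> 15) as u16) >> 1) as i16;
        a.cmp(&b)
    }
}
```

Under this policy a NaN is equal to itself for `==` and hashing, and +0 and -0 are distinct map keys, while `eq_value` gives the opposite answers in both cases. That difference is the main thing callers would need documented.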

3.6 Formatting / parsing
  • Implement Display (format via to_f32()).
  • Implement Debug.
  • Optional:
    • FromStr (parse as f32, then convert to float16).

4) Rust Codegen requirement

  • Generated Rust fields for float16 must use float16 (not u16).
  • Repeated float16 should use Vec<float16>.
  • Map values should be HashMap<K, float16> (or the map type Fory uses).
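
To make the codegen requirement concrete, this is roughly the shape the generated Rust could take. The message name `Sample`, its field names, and the derives are hypothetical; nothing here is a claim about what Fory's generator actually emits.

```rust
use std::collections::HashMap;

/// Stand-in for the runtime type that generated code would import.
#[allow(non_camel_case_types)]
#[repr(transparent)]
#[derive(Clone, Copy, Debug, Default)]
pub struct float16(u16);

impl float16 {
    pub const fn from_bits(bits: u16) -> float16 { float16(bits) }
    pub const fn to_bits(self) -> u16 { self.0 }
}

/// Hypothetical generated struct for a message with a scalar float16 field,
/// a repeated float16 field, and a map with float16 values.
#[derive(Clone, Debug, Default)]
pub struct Sample {
    pub scale: float16,                       // scalar field: float16, not u16
    pub weights: Vec<float16>,                // repeated float16
    pub thresholds: HashMap<String, float16>, // map<string, float16>
}
```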

5) Compiler / Reflection Integration

  • Update the FDL parser/type system so float16 is treated as a primitive type.
  • Ensure reflection/dynamic serialization recognizes Rust float16 as the float16 primitive (distinct from u16 integer).
  • Clarify schema evolution:
    • If float16 <-> float32 evolution is allowed, document conversion behavior/rounding; otherwise enforce strict matching.

6) Tests (must be exhaustive)

  1. Conversion tests (Rust)

    • ±0, ±Inf, NaN
    • max finite 65504
    • min normal 2^-14
    • min subnormal 2^-24
    • values around rounding boundaries
    • explicit ties-to-even cases (inputs exactly halfway between two representable half values)
    • overflow -> Inf, underflow -> subnormal/0
    • Optional stress: iterate all 65536 half bit patterns:
      • let h = float16::from_bits(bits);
      • let h2 = float16::from_f32(h.to_f32());
      • Verify bit preservation for all non-NaN values; for NaN validate the chosen policy (preserve payload vs canonicalize).
  2. Serializer/deserializer tests

    • Ensure wire output matches expected 16-bit patterns for known values (via to_bits()).
    • Round-trip for messages containing float16 fields, repeated float16 fields, maps with float16 values, optional fields, etc.
  3. Cross-language golden tests

    • Can be implemented in a future PR; must validate binary compatibility and NaN policy consistency.
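
For the explicit ties-to-even cases in the conversion tests, the inputs can be generated instead of hand-written: the midpoint of two adjacent half values needs at most 12 significant bits, so it is exactly representable in f32, and the expected result is whichever neighbour has an even bit pattern. A sketch with hypothetical helper names `half_value` and `tie_case`, independent of the runtime under test:

```rust
/// Decode a finite, non-negative binary16 bit pattern straight from the
/// format definition (5 exponent bits with bias 15, 10 mantissa bits).
fn half_value(bits: u16) -> f64 {
    let exp = ((bits >> 10) & 0x1F) as i32;
    let man = (bits & 0x03FF) as f64;
    if exp == 0 {
        man * 2f64.powi(-24) // zero or subnormal
    } else {
        (1.0 + man / 1024.0) * 2f64.powi(exp - 15) // normal
    }
}

/// For adjacent patterns `lo` and `lo + 1`, return the f32 input that sits
/// exactly halfway between them and the bit pattern that round-to-nearest,
/// ties-to-even is expected to produce.
fn tie_case(lo: u16) -> (f32, u16) {
    assert!(lo < 0x7BFF, "lo and lo + 1 must both be finite and positive");
    let midpoint = (half_value(lo) + half_value(lo + 1)) / 2.0;
    let expected = if lo & 1 == 0 { lo } else { lo + 1 };
    (midpoint as f32, expected)
}
```

A test would then loop over a range of `lo` values and assert `float16::from_f32(input).to_bits() == expected` for each pair returned by `tie_case(lo)`. The tie at the top of the range, between 65504 and overflow, falls outside this helper and needs its own case.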

Describe alternatives you've considered

  1. Store float16 as f32 in Rust and convert to float16 only during serialization.
    • Rejected: changes the in-memory footprint, delays rounding to serialization time, and can produce cross-language semantic differences.
  2. Expose raw u16 in generated code and APIs, and only provide helper functions on bits.
    • Rejected: loses type safety and makes user code error-prone; we want float16 everywhere.
  3. Rely directly on nightly f16 as the public API type.
    • Rejected: nightly instability and naming conflicts; we want a stable public type named float16.
  4. Use third-party crates (e.g., half crate) as a hard dependency.
    • Possible, but we prefer a first-party minimal implementation to guarantee exact IEEE behavior and rounding mode and to avoid extra dependencies. (An optional feature-gated integration could be considered later.)

Additional context

#3099
