Some cleanup of the System.Number class #20619

tannergooding · 2018-10-25T23:01:27Z

This is the first part of https://github.com/dotnet/corefx/issues/33053, and does some initial cleanup of the System.Number class:

Resolve the general formatting errors that were showing up in Number.Formatting and Number.Parsing
Removing duplicated code from the Parse/TryParse calls (doing the same for Format/TryFormat is not as easy)
Some minor renames/relocations
Updating NumberBuffer to take custom-sized digit buffers

tannergooding · 2018-10-25T23:02:58Z

src/System.Private.CoreLib/shared/System/Number.Formatting.cs

        }

-        private static void FormatCurrency(ref ValueStringBuilder sb, ref NumberBuffer number, int nMaxDigits, NumberFormatInfo info)
+        private static unsafe void FormatCurrency(ref ValueStringBuilder sb, bool sign, int scale, char* dig, int nMaxDigits, NumberFormatInfo info)


The Format* functions previously took a ref NumberBuffer; they were changed to take the sign, scale, and digit pointer individually so they could be shared between the IntegerBuffer and NumberBuffer implementations.

tannergooding · 2018-10-25T23:05:30Z

src/System.Private.CoreLib/shared/System/Number.IntegerFormatting.cs

+            }
+        }
+
+        internal static unsafe void IntegerBufferToString(ref ValueStringBuilder sb, ref IntegerBuffer integer, char format, int nMaxDigits, NumberFormatInfo info)


The ref NumberBuffer number parameters for the integer case where changed to ref IntegerBuffer integer

tannergooding · 2018-10-25T23:06:25Z

src/System.Private.CoreLib/shared/System/Number.Dragon4.cs

    {
        private static unsafe void Dragon4(double value, int precision, ref NumberBuffer number)
        {
+            const double Log10V2 = 0.30102999566398119521373889472449;


These were moved out of the NumberBuffer.cs file because they are only used by Dragon4

tannergooding · 2018-10-25T23:09:39Z

CC. @danmosemsft, @eerhardt, @jkotas.

I explicitly called out the few cases where it wasn't just copy/paste. I made this PR separate since it is just the initial refactoring and will make the actual implementation PR easier to review.

stephentoub · 2018-10-25T23:22:21Z

src/System.Private.CoreLib/shared/System/Number.IntegerBuffer.cs

+            //  * 19 for int64
+            //  * 20 for uint64
+            //  * 39 for int128/uint128
+            internal const int MaxDigits = 50;


Based on the comment, will a subsequent PR lower the value of MaxDigits to 39?

This one can be fixed in a follow-up PR, but is not required for correctness.

src/System.Private.CoreLib/shared/System/Number.NumberBuffer.cs

stephentoub · 2018-10-25T23:24:12Z

src/System.Private.CoreLib/shared/System/Number.NumberBuffer.cs

+            //  *   113 for Single
+            //  *   768 for Double
+            //  * 11563 for Quad
+            internal const int MaxDigits = 50;


Similarly, the comment makes this sound wrong. I assume it'll be fixed subsequently as part of all of this work?

Yes, this will be fixed in a follow-up PR.

jkotas · 2018-10-25T23:56:10Z

+4,879 −3,775

The PR is adding a net new 1000+ lines. Where is it coming from? Is it a duplicated 1000 lines? I was expecting much smaller duplication out from this.

tannergooding · 2018-10-26T00:44:11Z

Split into 3 commits to make this easier to review:

The first just formats the document (to get rid of the noise from copying/pasting the code between files)
The second contains the refactoring (splitting NumberBuffer into IntegerBuffer and FloatingPointBuffer)
The third splits the code into the respective files

tannergooding · 2018-10-26T00:47:02Z

src/System.Private.CoreLib/shared/System/Number.Formatting.cs

        }

-        internal static unsafe void NumberToString(ref ValueStringBuilder sb, ref NumberBuffer number, char format, int nMaxDigits, NumberFormatInfo info)
+        internal static unsafe void IntegerBufferToString(ref ValueStringBuilder sb, ref IntegerBuffer integer, char format, int nMaxDigits, NumberFormatInfo info)


@jkotas, the bulk of the "new lines" comes from duplicating BufferToString and BufferToStringFormat.

I'll think about this some more and see if we can reduce this duplication.

tannergooding · 2018-10-26T00:49:36Z

src/System.Private.CoreLib/shared/System/Number.Parsing.cs

+            return false;
+        }
+
+        private static unsafe bool ParseNumber(ref char* str, char* strEnd, NumberStyles styles, ref FloatingPointBuffer floatingPoint, NumberFormatInfo info, bool parseDecimal)


This is the other big "duplicated" code, but it will likely need to change a bit between the two implementations. I'll also give it some more thought, to see if we can share some of the code here.

jkotas · 2018-10-26T00:53:36Z

It may be interesting to look at storing pointer to the buffer in the structure - it is an extra indirection in a few places, but it should not really show up on the radar. Or just use the big buffer everywhere.

tannergooding · 2018-10-26T00:56:22Z

It may be interesting to look at storing pointer to the buffer in the structure - it is an extra indirection in a few places, but it should not really show up on the radar. Or just use the big buffer everywhere.

Right. I was considering just using the big buffer everywhere for float/double. It shouldn't be expensive with the .locals init stripping we do and is what the MSVCRT does already.

tannergooding · 2018-10-26T15:20:16Z

src/System.Private.CoreLib/shared/System/Number.IntegerFormatting.cs

+            }
+        }
+
+        internal static unsafe void IntegerBufferToString(ref ValueStringBuilder sb, ref IntegerBuffer integer, char format, int nMaxDigits, NumberFormatInfo info)


As commented here, the majority of the code duplication comes from NumberBufferToString, NumberBufferToStringFormat, and ParseNumber.

After looking more closely at the methods, I don't believe there is much that can be done with code sharing. These methods depend on RoundIntegerBuffer and RoundFloatingPointBuffer on each code path which will end up dealing with the buffers in very different ways. The former thinks in terms of value, scale and uses "normal" rounding logic; while the latter needs to think in terms of mantissa, exponent, and uses IEEE-compliant rounding logic (which will ultimately make the end formatting/parsing logic differ between these as well).

However, I think there is a bit of duplication we can remove in other parts of these code files. For example, we have quite a bit of duplication between several of the X and TryX code paths, namely due to the X code path needing to decide if it wants to throw an OverflowException or FormatException (which could be handled by returning an enum or a secondary bool).

Hmmm, actually, we might be able to do it by keeping a single buffer:

ref struct NumberBuffer { public int precision; public int scale; public int exponent; // New field, only used by floating-point buffers public bool sign; public NumberBufferKind kind; public Span<char> digits; // Allocated at creation so it is sized as needed }

The rounding only happens in a few places, at the end of parsing and isn't in the hot-path, so an additional check on kind shouldn't hurt. We only need scale for integer types, but we can make use of both scale and exponent for floating-point (the former can still be used to help with the fast path check).

… TryParse

…ded lengths

tannergooding · 2018-10-26T20:55:25Z

@jkotas, @stephentoub. This has changed a bit.

We are back to having a single NumberBuffer type, but it now takes a custom-sized digit buffer to handle the differences between integer and floating-point values.

The NumberBuffer type will need to carry an additional exponent field that will be unused for Integer parsing/formatting, but I don't believe that will be a problem.

…nt for the rounding digit.

stephentoub · 2018-10-26T21:07:37Z

Could you do some perf testing to ensure that some of these refactorings don't impact things like int.TryParse?

tannergooding · 2018-10-27T18:06:34Z

src/System.Private.CoreLib/shared/System/Number.NumberBuffer.cs

-        // DriftFactor = 1 - Log10V2 - epsilon (a small number account for drift of floating point multiplication)
-        private const double DriftFactor = 0.69;
+        //  We need 1 additional byte, per length, for the terminating null
+        private const int DecimalNumberBufferLength = 50 + 1;


I kept DecimalNumberBufferLength at the current value (50 + 1). However, I believe we can make it 30 (29 digits + 1 for rounding).

If the rounding digit is 5 then up to 20 additional digits are checked in NumberBufferToDecimal (so 50 in total). The number 20 looks arbitrary, theoretically it would be correct to check the entire string...

@pentp, maybe I am misremembering, but I thought decimal only supported 29 digits max. That is, it has 96-bits for the mantissa and it uses 8-bits for the exponent, which is strictly restricted to a value between 0 and 28, inclusive.

This should mean that you need to consider 29 digits, plus 1 for rounding. Anything more than that shouldn't be impactful.

It supports only 29 digits, but if the rounding digit is 5 and the number is even, then it checks the following digits to decide if it should round up or down.

@pentp, So you still only need 30 digits, but you also need a bool that indicates whether or not there was any trailing non-zero digits in the rest of the input string (same as double and float), correct?

Yes, that's correct.

tannergooding · 2018-10-28T00:55:59Z

Here are the perf results from my workbox:

Perf-Profile-System.Runtime.Performance.Tests-Base-i7-7700.csv.txt
- The Baseline
Perf-Profile-System.Runtime.Performance.Tests-Span-i7-7700.csv.txt
- The current changes that are up, which tracks the buffer as a Span<char> field in the NumberBuffer struct
Perf-Profile-System.Runtime.Performance.Tests-Ptr-i7-7700.csv.txt
- A test change which just tracks the raw char* and length, rather than using Span<char>, generally better numbers, but without the checks that Span provides

Numbers for the Integer tests generally look good some are marginally faster/slower, but all looked to be within the tolerance ranges (running a few times, and looking at benchview, some of the tests aren't super stable). The floating-point tests are similar, but most tend to look slightly on the slower side for the parsing (which all went through TryParse); the formatting all tended to be faster.

tannergooding · 2018-10-28T02:04:22Z

Similar results on a Ryzen 1800X:

stephentoub · 2018-10-29T00:40:10Z

Thank you for checking.

tannergooding · 2018-10-29T16:35:58Z

@jkotas, @stephentoub. Any other feedback here?

jkotas

Thanks

* Formatting Number.Formatting.cs and Number.Parsing.cs * Removing some duplicated parsing code by having the Parse method call TryParse * Moving two constants from NumberBuffer to Dragon4 * Rename FloatPrecision to SinglePrecision * Updating the casing of the NumberBuffer fields * Updating NumberBuffer to allow taking a custom-sized digit buffer. * Updating the various NumberBufferLength constants to be the exact needed lengths * Fixing DoubleNumberBufferLength and SingleNumberBufferLength to account for the rounding digit. * Fixing TryParseNumber to use the correct maxDigCount * Ensure the TryParseSingle out result is assigned on success Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>

* Formatting Number.Formatting.cs and Number.Parsing.cs * Removing some duplicated parsing code by having the Parse method call TryParse * Moving two constants from NumberBuffer to Dragon4 * Rename FloatPrecision to SinglePrecision * Updating the casing of the NumberBuffer fields * Updating NumberBuffer to allow taking a custom-sized digit buffer. * Updating the various NumberBufferLength constants to be the exact needed lengths * Fixing DoubleNumberBufferLength and SingleNumberBufferLength to account for the rounding digit. * Fixing TryParseNumber to use the correct maxDigCount * Ensure the TryParseSingle out result is assigned on success

* Formatting Number.Formatting.cs and Number.Parsing.cs * Removing some duplicated parsing code by having the Parse method call TryParse * Moving two constants from NumberBuffer to Dragon4 * Rename FloatPrecision to SinglePrecision * Updating the casing of the NumberBuffer fields * Updating NumberBuffer to allow taking a custom-sized digit buffer. * Updating the various NumberBufferLength constants to be the exact needed lengths * Fixing DoubleNumberBufferLength and SingleNumberBufferLength to account for the rounding digit. * Fixing TryParseNumber to use the correct maxDigCount * Ensure the TryParseSingle out result is assigned on success Commit migrated from dotnet/coreclr@aef0bc2

tannergooding commented Oct 25, 2018

View reviewed changes

stephentoub reviewed Oct 25, 2018

View reviewed changes

src/System.Private.CoreLib/shared/System/Number.NumberBuffer.cs Show resolved Hide resolved

stephentoub reviewed Oct 25, 2018

View reviewed changes

Formatting Number.Formatting.cs and Number.Parsing.cs

feaa8ea

tannergooding commented Oct 26, 2018

View reviewed changes

tannergooding added 4 commits October 26, 2018 11:24

Removing some duplicated parsing code by having the Parse method call…

21143c2

… TryParse

Moving two constants from NumberBuffer to Dragon4

4799c2e

Rename FloatPrecision to SinglePrecision

28dfc6f

Updating the casing of the NumberBuffer fields

a664e06

tannergooding changed the title ~~Splitting the Number.Parsing and Number.Formatting code into "Number" and "Integer"~~ Oct 26, 2018

tannergooding added 2 commits October 26, 2018 13:50

Updating NumberBuffer to allow taking a custom-sized digit buffer.

66250b3

Updating the various NumberBufferLength constants to be the exact nee…

86f9ff3

…ded lengths

Fixing DoubleNumberBufferLength and SingleNumberBufferLength to accou…

acfdc7e

…nt for the rounding digit.

Fixing TryParseNumber to use the correct maxDigCount

7753d54

tannergooding commented Oct 27, 2018

View reviewed changes

Ensure the TryParseSingle out result is assigned on success

f53769a

jkotas approved these changes Oct 29, 2018

View reviewed changes

tannergooding merged commit aef0bc2 into dotnet:master Oct 29, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Some cleanup of the System.Number class #20619

Some cleanup of the System.Number class #20619

Uh oh!

tannergooding commented Oct 25, 2018 •

edited

Loading

tannergooding Oct 25, 2018

tannergooding Oct 25, 2018

tannergooding Oct 25, 2018

tannergooding commented Oct 25, 2018

stephentoub Oct 25, 2018

tannergooding Oct 25, 2018

Uh oh!

stephentoub Oct 25, 2018

tannergooding Oct 25, 2018

jkotas commented Oct 25, 2018 •

edited

Loading

tannergooding commented Oct 26, 2018

tannergooding Oct 26, 2018

tannergooding Oct 26, 2018

jkotas commented Oct 26, 2018

tannergooding commented Oct 26, 2018

tannergooding Oct 26, 2018

tannergooding Oct 26, 2018

tannergooding commented Oct 26, 2018

stephentoub commented Oct 26, 2018

tannergooding Oct 27, 2018

pentp Oct 29, 2018

tannergooding Oct 29, 2018

pentp Oct 29, 2018 •

edited

Loading

tannergooding Oct 30, 2018

pentp Oct 30, 2018

tannergooding commented Oct 28, 2018 •

edited

Loading

tannergooding commented Oct 28, 2018

stephentoub commented Oct 29, 2018

tannergooding commented Oct 29, 2018

jkotas left a comment

Labels

4 participants

Some cleanup of the System.Number class #20619

Some cleanup of the System.Number class #20619

Uh oh!

Conversation

tannergooding commented Oct 25, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tannergooding commented Oct 25, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jkotas commented Oct 25, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

tannergooding commented Oct 26, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jkotas commented Oct 26, 2018

tannergooding commented Oct 26, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tannergooding commented Oct 26, 2018

stephentoub commented Oct 26, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pentp Oct 29, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tannergooding commented Oct 28, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

tannergooding commented Oct 28, 2018

stephentoub commented Oct 29, 2018

tannergooding commented Oct 29, 2018

jkotas left a comment

Choose a reason for hiding this comment

Labels

4 participants

tannergooding commented Oct 25, 2018 •

edited

Loading

jkotas commented Oct 25, 2018 •

edited

Loading

pentp Oct 29, 2018 •

edited

Loading

tannergooding commented Oct 28, 2018 •

edited

Loading