Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upEnhance some cmdlets with Culture parameter #167
Conversation
|
Thanks for writing this up! Couple small nits. Overall I think this is a good idea, but I'm not sure about the case-sensitive by default for most of the suggested values. It does match the c# underpinnings but it's also the reverse of PowerShell's default, where case-_in_sensitive is the assumption. I don't know about suddenly switching that around with these parameters. |
1-Draft/RFC0000-RFC-Enhance-some-cmdlets-with-Culture-and-Ordinal.md
Outdated
Show resolved
Hide resolved
1-Draft/RFC0000-RFC-Enhance-some-cmdlets-with-Culture-and-Ordinal.md
Outdated
Show resolved
Hide resolved
@vexx32 The RFC doesn't suggest to change the case defaults. |
|
Ah, you're right. That was the alternate proposal, I misread. I'm good with the primary proposal. |
Alternative proposal is not a breaking change. It only keep consistency - one API for script and binary. |
|
@SteveL-MSFT @vexx32 @mklement0 Could you please ping community to discuss the RFC? |
|
(As an aside: I personally cannot ping any community - I have no blog and have only a handful of Twitter followers; I rarely tweet myself.) As currently written, your RFC doesn't provide adequate motivation; there's more to be included from PowerShell/PowerShell#9348. As currently written, the RFC reads like a passive-aggressive resistance to the decision made in PowerShell/PowerShell#9348 (comment): Your desire for unconditional consistency between PowerShell and CoreFx / C# APIs is misguided, and such a discussion doesn't belong in this RFC. |
|
@PowerShell/powershell-committee had a limited review on this today (only 3 out of 6 of us in attendance). A couple notes:
|
|
Before adding commits, I want to discuss it here. Several cmdlets perform operations based solely on the current culture:
while the full set of capabilities should include operations with an indication of a specific culture, and not only linguistic operations but also non-linguistic operations. There are areas of scenarios that rather require non-linguistic operations (such as searching in logs). Linguistic operations include CurrentCulture, specific Culture and InvariantCulture. Non-linguistic is Ordinal. We can also see such a comment
(Another confirmation This suggests that OrdinalIgnoreCase is based on InvariantCulture (linguistic) .Net Core actively uses local performance optimization for ASCII characters. Measurements show that OrdinalIgnoreCase for ASCII characters is twice as fast as for other characters (which also confirms the use of InvariantCulture). Such optimization leads to the discrimination of languages other than English that looks unjustified. To achieve this, the CoreFXLab repository package is implemented with support for Simple Case Folding (SCF) based on Unicode standard. It should be noted that Unicode support in Core is based on the implementation of the underlying OS, which sometimes implies different behavior. With SFC, we will always have the same behavior on all platforms. (This is by the way important for PowerShell identifiers, but it’s not in the scope of this RFC) (Also there was an issue for HashSet fixed with InvariantCulture but it seems SFC is more suitable.) So the suggestion is to
Suggestions for implementation:
Select-String -Culture ru-RU -CaseSensetive # lingustic
Select-String -Culture ru-RU # lingustic
Select-String -Culture Ordinal -CaseSensetive # non-lingustic
Select-String -Culture Ordinal # lingustic - OrdinalIgnoreCase
Select-String -Culture Invariant -CaseSensetive # lingustic
Select-String -Culture Invariant # lingustic - InvariantCultureIgnoreCase
Select-String -Culture SimpleCaseFolding # non-lingustic
Select-String -Culture ru-RU -CaseSensetive # lingustic as expected for Culture term
Select-String -Culture ru-RU # lingustic
Select-String -Comparison Ordinal -CaseSensetive # non-lingustic
Select-String -Comparison Ordinal # lingustic - OrdinalIgnoreCase
Select-String -Comparison Invariant -CaseSensetive # lingustic
Select-String -Comparison Invariant # lingustic - InvariantCultureIgnoreCase
Select-String -Comparison SimpleCaseFolding # non-lingustic
Select-String -SimpleMatch "^Error.+count" # Current default is CurrentCulture - we'll have to change to SCF
Select-String -FastMatch "^Error.+count" # Based on SCF
Select-String -SimpleMatch "^Error.+count" -Match "<Full Regex>"
Select-String -FastMatch "^Error (.+) count" -Match "<Full Regex>" # First do fast search a candidate string in large file
# then parse it with full-featured RegexA separate problem is that the grep / ripgrep utilities have an option to interpret the input stream from the file as byte ("binary" in their terms) which can be useful. FileSystemProvider offers us the AsByteStream parameter, which can also be useful here. Select-String -Path source.txt -AsByteStream -SimpleMatch "qwerty"
Select-String -Path source.txt -Encoding Utf8 -SimpleMatch "qwerty" |
|
@PowerShell/powershell-committee reviewed your latest reply. First, really appreciate the additional examples and exposition. Feel free to incorporate them into the RFC when you're ready. Right now, we're leaning towards your proposal 2 with some notes:
|
I agree with best design ( what is best? :-) ). And it depends on how we resolve:
It's a vicious circle. :-) Core team want to see use-cases before. We want to see the API in .Net Core before. @joeyaiello Could you please discuss the SimpleCaseFolding design and incorporating in .Net Core with Core team? After that, a way will be opened without danger of making breaking changes in future. |
|
@mklement0 Could you please take another look? |
|
@PowerShell/powershell-committee reviewed some more:
|
Thanks! My main concern - if .Net Core will have another API design we will fall in a breaking change. If you could speed up the start of this design process it would be very good. Of cause we can postpone SimpleCaseFolding implementation without removing from the RFC. My current thoughts is to implement Culture, then Comparison, then FastMatch.
This was added only to show the full picture. It is not in the RFC and we could have another RFC for this proposal.
If you ask about My vision today is that we could split the work on 2 main parts:
I hope now the RFC ready for final review. |
This comment has been minimized.
This comment has been minimized.
essentialexch
commented on 1-Draft/RFC0000-RFC-Enhance-some-cmdlets-with-Culture-and-Ordinal.md in fb2c2a7
Oct 28, 2019
|
Minor nit: CaseSensitive is spelled incorrectly throughout this update. |
|
@joeyaiello @SteveL-MSFT The RFC is ready for final review. |
| ##### Samples | ||
|
|
||
| ```powershell | ||
| Select-String -Culture ru-RU -CaseSensitive # lingustic as expected for Culture term |
This comment has been minimized.
This comment has been minimized.
SteveL-MSFT
Oct 30, 2019
Member
Can you add some of the examples you already have in the code PR tests as examples here? Specifically looking to have both inputs and expected outputs.
This comment has been minimized.
This comment has been minimized.
|
Thanks for the updates, @iSazonov! We'll put this one on our list to review |

Formed in 2009, the Archive Team (not to be confused with the archive.org Archive-It Team) is a rogue archivist collective dedicated to saving copies of rapidly dying or deleted websites for the sake of history and digital heritage. The group is 100% composed of volunteers and interested parties, and has expanded into a large amount of related projects for saving online and digital history.

iSazonov commentedApr 24, 2019
No description provided.