Your problem is in your second capture group:
([a-zA-Z0-9]*\s{1})+
The + quantifier is applied to the parenthesized group itself, so the group matches each of the words 'SOME', 'TEXT', and 'HERE' in turn. A repeated capture group keeps only its last iteration, which leaves your second capture group with just the final match, 'HERE'.
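A minimal sketch of this behaviour, assuming a hypothetical input string (the sample text and variable names are illustrative, not taken from your data):

```ruby
# Hypothetical sample input; the real data may differ.
line = "SOME TEXT HERE 12.34"

# The + sits outside the capture group, so the group is re-captured on
# every iteration and only the last iteration survives.
m = line.match(/([a-zA-Z0-9]*\s{1})+/)

whole_match  = m[0] # everything the repeated group consumed
last_capture = m[1] # only the final iteration of the group

puts whole_match.inspect  # "SOME TEXT HERE "
puts last_capture.inspect # "HERE "
```

Note that each iteration includes the trailing whitespace, because \s{1} is inside the group.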
You need to put the + inside the capturing parentheses and use non-capturing parentheses (?:...) to enclose your existing group. Non-capturing parentheses, which start with (?: and end with ), group parts of a regular expression together without capturing what they match. You can apply a repetition operator (+, *, {n}, or {n,m}) to a non-capturing group and then capture the entire repeated expression:
((?:[a-zA-Z0-9]*\s{1})+)
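As a rough illustration of the difference, here is the corrected group run against a hypothetical sample string (the input and variable names are assumptions made for the example):

```ruby
# Hypothetical sample input; the real data may differ.
line = "SOME TEXT HERE 12.34"

# The inner (?:...) group only groups; the outer parentheses capture
# everything matched by all repetitions.
m = line.match(/((?:[a-zA-Z0-9]*\s{1})+)/)

all_words = m[1]            # the whole run of words, trailing space included
words     = all_words.split # split on whitespace if you need them one by one

puts all_words.inspect # "SOME TEXT HERE "
puts words.inspect     # ["SOME", "TEXT", "HERE"]
```

The non-capturing group does not add an entry to the match data, so the capture numbering stays the same as in your original pattern.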
In total:
/^\d{1}\s{1}[a-zA-Z]{3}\s{1}\d{2}\s{1}([a-zA-Z]{3}\s{1}\d{2})\s{1}((?:[a-zA-Z0-9]*\s{1})+)(\d+\.\d+)/
As a side note, this is a pretty clunky regex. You never need to specify {1}, because matching exactly once is the default. Similarly, \d\d is one character shorter than \d{2}. You probably also want \w instead of [a-zA-Z0-9], though note that \w matches the underscore as well. And since you don't seem to care about case, you can use the /i option and simplify the letter character classes. Something like this is a more idiomatic regular expression:
/^\d [a-z]{3} \d\d ([a-z]{3} \d\d) ((?:\w* )+)(\d+\.\d+)/i
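To check that the shorter pattern behaves like the long one, you could compare their captures on a sample line. This is only a sketch: the input line is hypothetical, the constant names are made up for the example, and the decimal point is written as \. on the assumption that the last group is meant to match a literal dot.

```ruby
# Hypothetical sample line; the real input format is an assumption.
line = "1 Jan 01 Feb 02 SOME TEXT HERE 123.45"

# Long form, with the non-capturing group fix applied.
VERBOSE = /^\d{1}\s{1}[a-zA-Z]{3}\s{1}\d{2}\s{1}([a-zA-Z]{3}\s{1}\d{2})\s{1}((?:[a-zA-Z0-9]*\s{1})+)(\d+\.\d+)/

# Shorter form: no {1}, literal spaces, \w, and the /i flag.
CONCISE = /^\d [a-z]{3} \d\d ([a-z]{3} \d\d) ((?:\w* )+)(\d+\.\d+)/i

verbose_captures = line.match(VERBOSE).captures
concise_captures = line.match(CONCISE).captures

puts verbose_captures.inspect # ["Feb 02", "SOME TEXT HERE ", "123.45"]
puts concise_captures.inspect # ["Feb 02", "SOME TEXT HERE ", "123.45"]
```

The two patterns are not strictly equivalent: the shorter one accepts only literal spaces where \s would also accept tabs and other whitespace, and \w additionally matches underscores.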
Finally, though the Ruby documentation for regular expressions is a little thin, Ruby uses fairly standard Perl-style regular expressions, and you can find more information about regular expressions generally at regular-expressions.info.