# R remove repeated digit sequences

Given the string:

``````
x <- 'foo123bar123baz123456abc1111def123456789'
``````

the goal is to keep the first run of digits and remove every later one:

``````
foo123barbazabcdef
``````

One way is to combine `\G` and `\K`, which requires `perl=TRUE`:

``````
x <- 'foo123bar123baz123456abc1111def123456789'
gsub('(?:\\d+|\\G(?<!^)\\D*)\\K\\d*', '', x, perl=TRUE)
# [1] "foo123barbazabcdef"
``````

The pattern breaks down as follows:

``````
(?:           # group, but do not capture:
  \d+         #   digits (0-9), 1 or more times
 |            #  OR
  \G(?<!^)    #   continue where the previous match ended, but not at the start of the string
  \D*         #   non-digits (anything but 0-9), 0 or more times
)             # end of grouping
\K            # reset the start of the match, so the text matched so far is kept
\d*           # digits (0-9), 0 or more times; only this part is replaced
``````
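The role of `\K` is easier to see on its own. The following is a minimal sketch on a made-up string (the input and the variable names `s`, `whole`, and `kept` are illustrative assumptions, not part of the original question): everything to the left of `\K` must still match, but it is left out of the text that gets replaced.

```r
# Made-up input, used only to illustrate \K; it is not from the original question.
s <- "id=42; id=77"

# Without \K the whole match, prefix included, is replaced.
whole <- gsub("id=\\d+", "N", s, perl = TRUE)

# With \K the prefix "id=" must still match, but only the digits after it are replaced.
kept <- gsub("id=\\K\\d+", "N", s, perl = TRUE)

print(whole)
print(kept)
```

In the pattern explained above, `\K` plays the same part: the first digit run, or the non-digits that follow a previous match, is consumed but kept, and only the trailing `\d*` is deleted.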

A shorter pattern that relies on `\K` alone:

``````
gsub('(?:^\\D*\\d+)?\\K\\d*', '', x, perl=TRUE)
``````

Without `perl=TRUE`, a capture group can put the first run of digits back:

``````
gsub('^(\\D*\\d+)|\\d+', '\\1', x)
``````
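The capture-group call works because a backreference to a group that did not take part in the match expands to an empty string. Here is a small sketch of that mechanism, followed by a hypothetical wrapper (the name `keep_first_digit_run` and the test strings are assumptions made for illustration):

```r
# Mechanism in isolation: "a" is captured and written back by "\\1";
# "b" is matched by the second branch, where group 1 is unset, so it is dropped.
demo <- gsub("(a)|b", "\\1", "abab")

# Hypothetical helper wrapping the same pattern; the name is illustrative only.
keep_first_digit_run <- function(s) {
  gsub("^(\\D*\\d+)|\\d+", "\\1", s)
}

# gsub is vectorised, so several strings can be processed at once.
res <- keep_first_digit_run(c("abc", "12ab34", "foo123bar123baz123456"))

print(demo)
print(res)
```

A string with no digits is returned unchanged, since neither branch of the alternation matches.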

Another option uses the PCRE verbs `(*SKIP)(*F)`:

``````
^\D*\d+(*SKIP)(*F)|\d+
``````

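`^\D*\d+` matches everything from the start of the string through the first run of digits. `(*SKIP)(*F)` then forces that match to fail and tells the engine not to retry at any position inside the text just consumed. The engine then applies the pattern to the right of `|`, namely `\d+`, to the rest of the string, so only the later digit runs are matched and removed. Because `(*SKIP)` and `(*F)` are PCRE backtracking verbs, you must set `perl=TRUE`.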

``````
> x <- 'foo123bar123baz123456abc1111def123456789'
> gsub("^\\D*\\d+(*SKIP)(*F)|\\d+", "", x, perl=TRUE)
[1] "foo123barbazabcdef"
``````
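The same `(*SKIP)(*F)` idiom can protect any span, not just the prefix of the string. As a rough sketch on a made-up input (the string and the `keep` marker are assumptions for illustration), digit runs that directly follow `keep` are skipped and all other digit runs are removed:

```r
# Made-up input; "keep" marks the digit runs that should survive.
s <- "keep12 drop34 keep56 x78"

# Left branch: match "keep" plus its digits, then fail and skip past them.
# Right branch: any remaining digit run is matched and replaced with "".
out <- gsub("keep\\d+(*SKIP)(*F)|\\d+", "", s, perl = TRUE)

print(out)
```

The general shape is `what_to_protect(*SKIP)(*F)|what_to_remove`, which can be easier to read than an equivalent lookbehind when the protected part has variable length.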