regex-all

Started by cameyo, June 13, 2025, 07:49:32 AM

cameyo

Does anyone know of a more efficient way than the following to find all the matches of a regex in a string?
Thanks.
(define (regex-all regexp str all)
"Find all occurrences of a regex in a string"
  (let ( (out '()) (idx 0) (res nil) )
    (setq res (regex regexp str 64 idx))
    (while res
      (push res out -1)
      (if all
          (setq idx (+ (res 1) 1))        ; contiguous matches (overlap allowed)
          (setq idx (+ (res 1) (res 2)))) ; non-contiguous matches (no overlap)
      (setq res (regex regexp str 64 idx)))
    out))

(setq a "AAAaBAAAABcADccAAAB")
(regex "[A]{3}" a)
;-> ("AAA" 0 3)
(regex-all "[A]{3}" a)
;-> (("AAA" 0 3) ("AAA" 5 3) ("AAA" 15 3))
(regex-all "[A]{3}" a true)
;-> (("AAA" 0 3) ("AAA" 5 3) ("AAA" 6 3) ("AAA" 15 3))

rrq

#1
I suppose one might use find-all for that, though find-all provides neither the character index nor the overlapping-match variant. It wouldn't be terribly hard to patch the source so that find-all offers the index (e.g. as $index), similar to the match count.
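
For comparison, here is a minimal sketch of what find-all gives you on the same test string. It only uses the documented pattern/text/expression form of find-all; note that neither result carries the offset of a match, which is the limitation mentioned above.

```newlisp
(setq a "AAAaBAAAABcADccAAAB")

; Without an expression, find-all returns only the matched strings.
(find-all "[A]{3}" a)
;-> ("AAA" "AAA" "AAA")

; With an expression, $0 holds the current whole match,
; so the length is available but the offset is not.
(find-all "[A]{3}" a (length $0))
;-> (3 3 3)
```

The matches are non-overlapping, so the "AAA" starting at index 6 is not reported.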

However, one can use collect and a small helper function to get the following alternative solution:
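> (define (regex-all EXP STR OPT ALL)
  (let ((i 0)
        ; move-i advances i by 1 (ALL) or by the match length, then returns the match.
        ; (X 2) is the length of the whole match, also when EXP contains subexpressions.
        (move-i (fn (X) (setq i (+ (X 1) (if ALL 1 (X 2)))) X)))
    (collect (if (regex EXP STR OPT i) (move-i $it)))))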

> (setq a "AAAaBAAAABcADccAAAB")
> (regex-all "[A]{3}" a 0)
(("AAA" 0 3) ("AAA" 5 3) ("AAA" 15 3))
> (regex-all "[A]{3}" a 0 true)
(("AAA" 0 3) ("AAA" 5 3) ("AAA" 6 3) ("AAA" 15 3))

EDIT: Fixed to use EXP rather than a hard-coded search pattern.

cameyo

Thanks.
You reminded me of 'collect' ... :)