Multiple regex matches
I know I have written this code multiple times from scratch, so I will blog it here for posterity (and so people smarter than me can suggest better solutions).
The problem is that I want to write a regex with a capturing group, then run it on a bunch of text and get back all the captured groups from all the matches. The best way I can figure out how to do this is:
public static List<String> getMatches(Pattern pattern, String text) {
    List<String> matches = new ArrayList<String>();
    Matcher m = pattern.matcher(text);
    int index = 0;
    // Search from just past the previous match each time around.
    while (m.find(index)) {
        matches.add(m.group(1));
        index = m.end();
    }
    return matches;
}
This code assumes that the pattern has exactly one capture group. It could be extended to handle multiple capture groups pretty easily.
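As a rough sketch of that extension (not something from the original utility), one option is to return an array per match, with one slot per capture group. The class name AllGroupsExample and the method name getAllGroups are made up for illustration, and the three-group phone pattern in main is just an assumed example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names here are illustrative only.
class AllGroupsExample {

    // Returns one String[] per match; element i holds capture group i + 1.
    static List<String[]> getAllGroups(Pattern pattern, String text) {
        List<String[]> matches = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            String[] groups = new String[m.groupCount()];
            for (int i = 1; i <= m.groupCount(); i++) {
                // group(i) is null if that group did not take part in the match
                groups[i - 1] = m.group(i);
            }
            matches.add(groups);
        }
        return matches;
    }

    public static void main(String[] args) {
        // Assumed pattern: area code, exchange, and line number as separate groups.
        Pattern p = Pattern.compile("([0-9]{3})-([0-9]{3})-([0-9]{4})");
        String text = "Albert Pujols\t111-456-7890\nDarth Vader\t222-123-4567\n";
        for (String[] groups : getAllGroups(p, text)) {
            System.out.println(String.join(" / ", groups));
        }
    }
}
```

Returning String[] keeps the sketch small; a list of lists or a small value class would work just as well if the caller needs something friendlier.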
For example, say I have some text (a file of phone numbers) and I want to match each phone number and return just the area codes. The text might be:
Albert Pujols 111-456-7890
Darth Vader 222-123-4567
Carrot Top 333-123-4444
And the answer should be 111, 222, and 333. So, you’d do something like this with my method:
Pattern p = Pattern.compile("([0-9]{3})-[0-9]{3}-[0-9]{4}");
String text = "Albert Pujols\t111-456-7890\nDarth Vader\t222-123-4567\nCarrot Top\t333-123-4444\n";
List<String> matches = RegexUtil.getMatches(p, text);  // 111, 222, 333
This works great for me. Is there a better way to do this? Is there some magical option on the regex pattern itself to avoid doing the loop?
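On the question of avoiding the loop: there is no flag on Pattern that does this, as far as I know, but on Java 9 or later Matcher.results() returns a stream of MatchResult objects, which hides the loop. A minimal sketch, assuming Java 9+ is available (the class name StreamMatchesExample is made up for illustration):

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical class name; requires Java 9+ for Matcher.results().
class StreamMatchesExample {

    // Collects capture group 1 from every match, in order of appearance.
    static List<String> getMatches(Pattern pattern, String text) {
        return pattern.matcher(text)
                .results()               // Stream<MatchResult>, one per match
                .map(r -> r.group(1))    // keep only the first capture group
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("([0-9]{3})-[0-9]{3}-[0-9]{4}");
        String text = "Albert Pujols\t111-456-7890\nDarth Vader\t222-123-4567\nCarrot Top\t333-123-4444\n";
        System.out.println(getMatches(p, text)); // [111, 222, 333]
    }
}
```

The loop is still there under the hood; the stream just moves it out of your code.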

Hi! My name is Alex Miller and I live in St. Louis. I write code for a living and currently work for
This seems to work just as well:
public static List<String> getMatches(Pattern pattern, String text) {
    List<String> matches = new ArrayList<String>();
    Matcher m = pattern.matcher(text);
    while (m.find()) {
        matches.add(m.group(1));
    }
    return matches;
}
Thanks! That’s better. As the Pattern javadoc says near the end (I hadn’t read that far, given that it’s describing the behavior of Matcher), find() starts where the last find() match ended.
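One side benefit of letting find() track its own position is that it copes with patterns that can match the empty string. After an empty match, find() moves forward one character before searching again, whereas a loop built on find(index) with index = m.end() would keep retrying at the same index and never terminate. A small sketch to illustrate; the pattern "x*", the input "ab", and the names FindResumeExample and matchStarts are all made up for the demonstration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class FindResumeExample {

    // Records the start offset of every match found by repeated find() calls.
    static List<Integer> matchStarts(Pattern pattern, String text) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // "x*" matches the empty string at offsets 0, 1, and 2 of "ab".
        System.out.println(matchStarts(Pattern.compile("x*"), "ab")); // [0, 1, 2]
    }
}
```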