HWYDI: String splitting with exceptions
August 22nd, 2008
the Problem
In an application I’m writing, I’m parsing information from a wiki and formatting it as XML. Some of the data I’m parsing needs to be split into an array, for example:
"To Do: Update description, add more details (client, date, ...), categorize, publish"
# Should become {"To Do" => ["Update description", "add more details (client, date, ...)", "categorize", "publish"]}
As you can guess, the problem is the commas between the brackets. I can’t just split on commas because then I’d get [..., "add more details (client", "date", "...)", ...]
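To make the failure concrete, here is a small sketch of the naive approach. The method name `naive_parse` is made up for illustration; it only shows what a plain comma split does to the example line.

```ruby
# Hypothetical helper (not from the application): split the label off at the
# first ": " and then split the rest on every comma, brackets or not.
def naive_parse(line)
  label, rest = line.split(': ', 2)
  { label => rest.split(',').map(&:strip) }
end

line = "To Do: Update description, add more details (client, date, ...), categorize, publish"
p naive_parse(line)
# The bracketed part is torn into "add more details (client", "date" and "...)".
```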
my Solution
Nothing yet. I tried something that looped over every char with a flag for whether to split or not, but that (of course) doesn’t work with a Regexp, so I’m back to square one.
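For reference, the character-loop idea looks roughly like the sketch below when the delimiter is a single character rather than a Regexp. The name `split_outside_parens` is an assumption for illustration, and a depth counter stands in for the flag so that nested brackets are tolerated.

```ruby
# Hypothetical helper: split on a single-character delimiter, but only when
# we are not inside parentheses. Does not accept a Regexp delimiter.
def split_outside_parens(text, delim = ',')
  parts = [String.new]
  depth = 0
  text.each_char do |ch|
    depth += 1 if ch == '('
    depth -= 1 if ch == ')' && depth > 0
    if ch == delim && depth.zero?
      parts << String.new   # delimiter outside brackets: start a new part
    else
      parts.last << ch      # everything else belongs to the current part
    end
  end
  parts.map(&:strip)
end

p split_outside_parens("Update description, add more details (client, date, ...), categorize, publish")
```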
How would You do it?
Once had a project that needed to split in quite the same manner. Tried to whip up a nifty RegEx, yet it was a massive FAIL.
Then went for the Briek & Brak solution, in order to get results fast (fast as in ‘today’, not ‘time it takes to execute’):
Told you it was B&B (Quick & Dirty) :-P
That’s exactly what I proposed yesterday afternoon, but Jan is a purist :)
I’d throw “CSV regex” into Google, look for a regex that works and then adapt it (quotes become braces, no big deal). :/
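One regex in that spirit, as a sketch rather than a tested CSV pattern: split on a comma only if it is not followed by a closing parenthesis before any other parenthesis. The constant and method names are made up here, and the pattern assumes balanced, non-nested parentheses.

```ruby
# Assumption: parentheses are balanced and never nested.
# A comma is a separator unless the next parenthesis after it is a ")".
SPLIT_OUTSIDE_PARENS = /,(?![^()]*\))/

# Hypothetical wrapper around String#split using the pattern above.
def split_with_lookahead(text)
  text.split(SPLIT_OUTSIDE_PARENS).map(&:strip)
end

p split_with_lookahead("Update description, add more details (client, date, ...), categorize, publish")
```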
You could replace all commas within parentheses with some other character or sequence of characters, then split it by commas, and then replace the replacements with commas.
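class String
  def split_ignoring_parentheses(delim = ',', temp_replace = '``')
    # Hide the delimiters inside each (...) group behind a placeholder, then split.
    # The non-greedy match does not handle nested parentheses.
    split_with_replacements = gsub(/\(.*?\)/) { |s| s.gsub(delim, temp_replace) }.split(delim)
    # Put the original delimiters back and trim surrounding whitespace.
    split_with_replacements.map { |e| e.gsub(temp_replace, delim).strip }
  end
end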