Title |
Pattern Title
|
Expression |
(?<=(?:^|,)")(?:[^"]|"")+ |
Description |
A regex that splits a line of a CSV file when used with a MATCH function, so that each match is one value.
All values must be in quotes and separated by commas.
Ex. "test1","test2","test3"
The quotes themselves are not captured.
Note: Only works in regex engines that support lookbehind (Java, .NET, PHP, etc.). JavaScript engines of the time were not among them. |
Matches |
"test1","test2" |
Non-Matches |
test1,test2 |
Author |
Andrew Stakhov
|
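Engines without variable-width lookbehind reject this expression, and Python's built-in `re` module is one of them. A minimal sketch of a workaround, assuming a capture group is an acceptable substitute for the lookbehind, is shown below. The pattern is an adaptation rather than the expression above, and the names `QUOTED_VALUE` and `split_quoted_csv` are made up for illustration.

```python
import re

# Adaptation, not the original expression: Python's `re` rejects the
# variable-width lookbehind (?<=(?:^|,)"), so the opening quote is consumed
# by the match and the value is returned through capture group 1 instead.
QUOTED_VALUE = re.compile(r'(?:^|,)"((?:[^"]|"")+)')


def split_quoted_csv(line):
    """Return the text between the quotes of each quoted, comma-separated value."""
    # With exactly one capture group, findall returns the group's text.
    return QUOTED_VALUE.findall(line)


print(split_quoted_csv('"test1","test2","test3"'))  # ['test1', 'test2', 'test3']
print(split_quoted_csv('test1,test2'))              # []
```

Because `findall` returns only group 1, the quotes are left out of the results, as with the lookbehind version.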
Title: cont.
Name: Tony
Date: 1/21/2005 2:22:59 PM
Comment:
...those values with doubled quotes ("") inside them
still contain the doubled quotes after parsing.
Title: In fact
Name: Tony
Date: 1/21/2005 2:20:16 PM
Comment:
the following add-on to the original regular expression
(?<=(?:^|,\s*)")(?:[^"]|"")*|(?<=(?:^|,))(?:\s*)(?=(?:,|$))
parses CSV input like
"Tony", "had","a",,"little",,, ,"", "lamb",
correctly, i.e. if all values are in quotes and separated by commas, the resulting match collection contains the proper number of matches. It allows for empty values inside quotes, pure empty values (two consecutive commas), and spaces between commas. As in the original, the quotes themselves are not captured.
The only problem that remains unhandled is values that contain doubled quotes, as in the following CSV:
"Richard ""Deadmeat"" Jones","never", "saw it","comming"
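Python's `re` module cannot run this expression as written, since it rejects variable-width lookbehind. A rough sketch of the same idea, using a capture group instead and adding a post-processing step for the doubled quotes, might look like the following. The names `FIELD` and `parse_quoted_line` and the `unescape` flag are hypothetical, and the pattern is an adaptation, not the expression from this comment.

```python
import re

# Hypothetical adaptation: one match per field, with the quoted text in group 1.
# Group 1 is None for a pure empty value (nothing, or only spaces, between commas).
FIELD = re.compile(r'(?:^|,)\s*(?:"((?:[^"]|"")*)")?\s*(?=,|$)')


def parse_quoted_line(line, unescape=True):
    """Split a line of quoted values, reporting pure empty values as ''."""
    fields = [m.group(1) or '' for m in FIELD.finditer(line)]
    if unescape:
        # Collapse each doubled quote inside a value into a single quote.
        fields = [f.replace('""', '"') for f in fields]
    return fields


print(parse_quoted_line('"Tony", "had","a",,"little",,, ,"", "lamb",'))
print(parse_quoted_line('"Richard ""Deadmeat"" Jones","never", "saw it","comming"'))
```

The first call yields eleven fields, five of them empty before and after `lamb`. The second yields four, with `Richard "Deadmeat" Jones` as the first value once the doubled quotes are collapsed.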
Title: Add-on
Name: Tony
Date: 1/20/2005 5:00:38 PM
Comment:
Changing the original to
(?<=(?:^|,\s*)")(?:[^"]|"")+
allows spaces between the quoted values.
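The effect of the added `\s*` can be seen in a small comparison. This is a sketch only: both patterns below are capture-group adaptations (Python's `re` does not accept the variable-width lookbehind), and the names `STRICT` and `LENIENT` are illustrative.

```python
import re

# Capture-group stand-ins for the two lookbehind expressions.
STRICT = re.compile(r'(?:^|,)"((?:[^"]|"")+)')      # quote must follow the comma directly
LENIENT = re.compile(r'(?:^|,\s*)"((?:[^"]|"")+)')  # whitespace allowed after the comma

line = '"a", "b","c"'
print(STRICT.findall(line))   # ['a', 'c']  (the value after ", " is skipped)
print(LENIENT.findall(line))  # ['a', 'b', 'c']
```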
Title: Another Option
Name: Jeff
Date: 12/7/2004 12:50:07 AM
Comment:
This one does a better job, but allows the comma in each result:
(?:^|,)(\x22(?:[^\x22]+|\x22\x22)*\x22|[^,]*)
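This expression has no lookbehind, so it can be tried unchanged in Python's `re` module. The sketch below assumes that "allows the comma in each result" refers to the leading comma being part of the full match. The variable names and the sample line are illustrative.

```python
import re

# The expression from this comment, unchanged; \x22 is the double-quote character.
PATTERN = re.compile(r'(?:^|,)(\x22(?:[^\x22]+|\x22\x22)*\x22|[^,]*)')

line = '"test1",test2,"a,b"'

# The full match carries the leading comma for every value after the first.
whole = [m.group(0) for m in PATTERN.finditer(line)]
# Capture group 1 holds the value alone, with any surrounding quotes still attached.
values = [m.group(1) for m in PATTERN.finditer(line)]

print(whole)   # ['"test1"', ',test2', ',"a,b"']
print(values)  # ['"test1"', 'test2', '"a,b"']
```

Reading group 1 instead of the full match drops the separator, and unquoted values such as `test2` are accepted as well.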
Title: Oh dear
Name: Barry
Date: 11/18/2004 10:24:52 AM
Comment:
What a shame. Throw in anything complicated, like spaces, and this expression simply dies.