Post #163,105
7/6/04 3:57:31 PM
|
reg-ex pair matching help
I am having some regular-expression difficulty expanding something that works for characters into something that works for phrases.
The following finds an XML statement:
<\w[^>]*>
But when I try to extrapolate this to phrases instead of single characters, it does not work. For example, if I want to extract content between matching opening and closing "foo" XML tags, this does NOT work:
<foo>(^(</foo>))*</foo>
Most examples I can find only deal with character matching, not phrase matching. Note that the middle part is to keep it from finding "wide" pairs, such as the opening of pair number 1 with the closing of pair number 99.
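One common way to say "any run of characters that does not contain the phrase </foo>" is a negative lookahead checked before every character. The sketch below uses Python's re module rather than ColdFusion, purely to illustrate the idea; the sample text is made up, and whether a given ColdFusion version accepts the (?!...) syntax is an assumption to verify.

```python
import re

# Hypothetical sample text; only the tag name "foo" comes from the question.
text = "<foo>first <b>bold</b></foo> filler <foo>second\nline</foo>"

# (?:(?!</foo>).)* consumes one character at a time, but only while the
# closing phrase "</foo>" does not begin at the current position.
# re.DOTALL lets "." match newlines too.
pattern = re.compile(r"<foo>((?:(?!</foo>).)*)</foo>", re.DOTALL)

# findall returns the captured group for each non-overlapping match.
contents = pattern.findall(text)
print(contents)
```

Because the lookahead stops at the first closing tag, each match pairs an opening tag with its nearest close instead of a "wide" pair.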
________________ oop.ismad.com
|
Post #163,108
7/6/04 4:10:06 PM
|
Which language?
You just need to tell it not to use greedy matching, and <foo>(.*)</foo> will work as expected.
And why don't you just use a parser?
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #163,131
7/6/04 5:10:14 PM
|
It's ColdFusion
"You just need to tell it not to use greedy matching"
If I turn off greedy matching, it does not return the length of the match. For some reason they tied the length info to greedy matching being on.
By the way, I need to match across newlines. Is there an "any character" symbol? I tried this, but it does not work:
<foo>([.|\n]*)</foo>
"And why don't you just use a parser?"
Am I reaching the limit of reg-ex such that a parser is warranted? I don't necessarily need perfect matching in this case. Thanks.
________________ oop.ismad.com
|
Post #163,138
7/6/04 5:37:07 PM
7/6/04 5:44:08 PM
|
Re: It's ColdFusion
.* will match anything, including newlines. If you're using a group, you shouldn't need the length, nicht wahr? I don't know the ColdFusion API, however. Every other regexp library I've used lets you just retrieve the actual text that was matched.
Edit: yes, it does:
<cfset sLenPos=REFind("<foo>(.*?)</foo>", someString, 1, "True")>
Then do this:
<cfoutput>
#mid(someString, sLenPos.pos[1], sLenPos.len[1])#
</cfoutput>
According to [link|http://www.dantor.com/support/cfdocs/Developing_ColdFusion_MX_Applications_with_CFML/regexp5.html#1099114|this] (at the bottom), minimal matching *does* return the length.
Will the text in between the foo tags ever include a < symbol? If not, just use <foo>([^<]*)</foo> .
Another alternative would be finding <foo> with REFind, then finding </foo> with REFind and a starting index: REFind("</foo>", someString, indexFromFirstREFind + 4) , then picking out the string in between.
As far as parsers go, the XMLparse() function looks easy enough to use. You can use that in combination with XMLsearch() to do XPath searches on the parsed data. A parser will automatically handle things like CDATA sections and the like.
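The two-step search (find the opening tag, then look for the closing tag from that point on) can be sketched without any regex at all. This is a minimal Python illustration, not ColdFusion; the helper name between and the sample string are hypothetical.

```python
def between(text, open_tag, close_tag, start=0):
    """Return the text between the first open_tag at or after `start`
    and the next close_tag that follows it, or None if either is missing."""
    open_at = text.find(open_tag, start)
    if open_at == -1:
        return None
    body_start = open_at + len(open_tag)
    # Search for the close only after the opening tag, so the nearest
    # closing tag is the one that gets paired with it.
    close_at = text.find(close_tag, body_start)
    if close_at == -1:
        return None
    return text[body_start:close_at]


sample = "<foo>Something</foo><foo>Something else</foo>"
first = between(sample, "<foo>", "</foo>")
second = between(sample, "<foo>", "</foo>", start=1)
print(first, "|", second)
```

Passing a start index past the first opening tag moves the search on to the next pair, which is the same role the starting-index argument plays in the REFind call described above.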
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
Edited by admin
July 6, 2004, 05:44:08 PM EDT
|
Post #163,152
7/6/04 7:47:38 PM
|
Re: It's ColdFusion
Oops, I forgot to hit the "next page" arrow icon to see the "Specifying minimal matching" section of that document in my offline Help version when looking for a solution. Personally I would make those navigation arrow icons much bigger than Macromedia did. For some reason many think that tiny fonts and tiny icons look more "professional". Maybe I am just getting old and blind. Either that, or they think esthetics are more important than usability.
Anyhow, I will play with "?". It does have other XML tags, such as CFIF, in between tags in the text, so excluding "<" alone won't work.
You are right about the newlines. I was looking at JavaScript reg-ex, which seems to treat "." differently for some reason.
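As an illustration of how "." varies between engines, here is a short Python sketch (Python is used only as an example engine; the sample text is made up). In Python, "." skips newlines unless re.DOTALL is set, and inside a character class both "." and "|" are literal characters, which is why a pattern like [.|\n]* does not act as "any character".

```python
import re

# Hypothetical sample with a newline between the tags.
text = "<foo>line one\nline two</foo>"

# Default: "." does not match "\n", so the match fails.
default_dot = re.search(r"<foo>(.*?)</foo>", text)

# re.DOTALL makes "." match newlines as well.
dotall = re.search(r"<foo>(.*?)</foo>", text, re.DOTALL)

# [\s\S] (whitespace or non-whitespace) is a flag-free "any character".
any_char = re.search(r"<foo>([\s\S]*?)</foo>", text)

# Inside [...] the "." and "|" are literals, so this class only allows
# dots, pipes, and newlines, and fails on ordinary letters.
literal_class = re.search(r"<foo>([.|\n]*)</foo>", text)

print(default_dot, dotall.group(1) == any_char.group(1), literal_class)
```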
Today's lessons:
1. Make sure you don't miss the "next page" icon when browsing offline docs.
2. Different languages treat some of the common symbols differently.
Thanks.
________________ oop.ismad.com
|
Post #163,157
7/6/04 8:24:46 PM
|
No problem.
Any opportunity to look at something new. Even if it's ColdFusion.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #163,159
7/6/04 8:29:48 PM
|
Re: No problem.
When you can't find anything new to look at, time to start your own.
Morph forum interface into a general purpose idiom. Everything is a polylog. A forum is an organized polylog.
Can a database be a polylog? The problem you are working on is just that.
-drl
|
Post #163,163
7/6/04 8:55:19 PM
|
WTF is a polylog?
To deny the indirect purchaser, who in this case is the ultimate purchaser, the right to seek relief from unlawful conduct, would essentially remove the word consumer from the Consumer Protection Act - [link|http://www.techworld.com/opsys/news/index.cfm?NewsID=1246&Page=1&pagePos=20|Nebraska Supreme Court]
|
Post #163,164
7/6/04 8:59:19 PM
|
An undeveloped frog?
|
Post #163,166
7/6/04 9:04:00 PM
|
An n-dimensional logarithm?
|
Post #163,168
7/6/04 9:07:13 PM
|
Re: WTF is a polylog?
A lot of interconnected dialogs.
I think the right word is actually polylogy.
-drl
|
Post #163,169
7/6/04 9:08:31 PM
|
Multiple BMs...
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #163,173
7/6/04 9:13:22 PM
|
"We have top men working on it!"
-drl
|
Post #163,175
7/6/04 9:48:40 PM
|
"Tells me my cookies are off."
|
Post #163,180
7/6/04 10:08:15 PM
|
Sorry, you are not a winner.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #163,233
7/7/04 10:41:43 AM
|
Blue Valkyrie needs food badly!
-- [link|mailto:greg@gregfolkert.net|greg], [link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwethey
Heard near the SCOG employee entry/exit way:
Security: We got another Mass Exodus Doorway Jam.
|
Post #163,269
7/7/04 1:31:11 PM
|
How convenient is that??
Alex
"If I seem unduly clear to you, you must have misunderstood what I said." -- Alan Greenspan, Federal Reserve chairman
|
Post #163,111
7/6/04 4:12:37 PM
|
Try argument substitution?
<(foo)>.*</\1>
|
Post #163,116
7/6/04 4:20:52 PM
|
I don't think that's the problem.
He's trying to avoid matching things like:
<foo>Something</foo><foo>Something else</foo>
with <foo>.*</foo> and getting Something</foo><foo>Something else as the value, which is what a greedy quantifier will do.
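The difference is easy to see by running both forms against that same input. This is a small sketch in Python's re module, used here only for illustration; ColdFusion's own call syntax differs.

```python
import re

text = "<foo>Something</foo><foo>Something else</foo>"

# Greedy: .* grabs as much as it can, so the match runs from the first
# opening tag to the LAST closing tag.
greedy = re.findall(r"<foo>(.*)</foo>", text)

# Minimal (non-greedy): .*? grabs as little as it can, so each opening
# tag is paired with the nearest closing tag.
minimal = re.findall(r"<foo>(.*?)</foo>", text)

print(greedy)
print(minimal)
```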
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|