New reg-ex pair matching help
I am having some regular-expression difficulty expanding something that works for characters into something that works for phrases.

The following finds an XML statement:

<\w[^>]*>

But when I try to extrapolate this to phrases instead of single characters, it does not work. For example, if I want to extract content between matching opening and closing "foo" XML tags, this does NOT work:

<foo>(^(</foo>))*</foo>

Most examples I can find only deal with character matching, not phrase matching. Note that the middle part is to keep it from finding "wide" pairs, such as the opening of pair number 1 with the closing of pair number 99.
________________
oop.ismad.com
New Which language?
You just need to tell it not to use greedy matching, and <foo>(.*)</foo> will work as expected.

And why don't you just use a parser?
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New It's ColdFusion
You just need to tell it not to use greedy matching


If I turn off greedy matching, it does not return the length of the match. For some reason they tied the length info to greedy matching being on.

By the way, I need to match across newlines. Is there an "any character" symbol? I tried this, but it does not work:

<foo>([.|\n]*)</foo>

And why don't you just use a parser


Am I reaching the limit of reg-ex such that a parser is warranted? I don't necessarily need perfect matching in this case.

Thanks.
________________
oop.ismad.com
New Re: It's ColdFusion
.* will match anything, including newlines.

If you're using a group, you shouldn't need the length, nicht wahr? I don't know the ColdFusion API, however. Every other regexp library I've used lets you just retrieve the actual text that was matched. Edit: yes, it does:

<cfset sLenPos=REFind("<foo>(.*?)</foo>", someString, 1, "True")>

Then do this:
<cfoutput>
  #mid(someString, sLenPos.pos[2], sLenPos.len[2])#
</cfoutput>

According to [link|http://www.dantor.com/support/cfdocs/Developing_ColdFusion_MX_Applications_with_CFML/regexp5.html#1099114|this] (at the bottom), minimal matching *does* return the length.

Will the text in between the foo tags ever include a < symbol? If not, just use <foo>([^<]*)</foo>.
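
To make the trade-off concrete, here is a minimal sketch in Python's re module (used only for illustration; ColdFusion's REFind call looks different). The sample strings are made up. It shows the negated character class working on plain content and finding nothing once the content contains another tag:

```python
import re

# [^<]* can never run past a "<", so it cannot swallow a closing tag.
pattern = re.compile(r"<foo>([^<]*)</foo>")

plain = "<foo>one</foo><foo>two</foo>"
nested = "<foo>a <b>bold</b> word</foo>"   # content contains other tags

plain_hits = pattern.findall(plain)    # each pair is matched separately
nested_hits = pattern.findall(nested)  # the inner "<" blocks the match

print(plain_hits)
print(nested_hits)
```

So the shortcut is only safe when the text between the tags is known to be free of "<".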

Another alternative would be finding <foo> with REFind, then finding </foo> with REFind and a starting index: REFind("</foo>", someString, indexFromFirstREFind + 4), then picking out the string in between.
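
A rough sketch of that two-step idea, written in Python with plain substring search standing in for the two REFind calls. The helper name `between` and its parameters are invented for the example:

```python
def between(text, open_tag="<foo>", close_tag="</foo>"):
    """Return the text between the first open_tag and the next close_tag."""
    start = text.find(open_tag)
    if start == -1:
        return None                      # no opening tag at all
    content_start = start + len(open_tag)
    # Search for the closing tag only after the opening tag ends.
    end = text.find(close_tag, content_start)
    if end == -1:
        return None                      # opening tag was never closed
    return text[content_start:end]


sample = "x<foo>a\n<cfif>b</foo><foo>c</foo>"
print(repr(between(sample)))
```

Because the second search starts right after the opening tag, it always stops at the nearest closing tag, and newlines or other tags in between do not matter.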

As far as parsers go, the XMLparse() function looks easy enough to use. You can use that in combination with XMLsearch() to do XPath searches on the parsed data. A parser will automatically handle things like CDATA sections and the like.
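
For comparison, the parser route looks roughly like this in Python, using the standard library's xml.etree.ElementTree as a stand-in for XMLparse() and XMLsearch(). The sample document is invented, and ElementTree only supports a subset of XPath:

```python
import xml.etree.ElementTree as ET

# Made-up input; the second <foo> hides a literal "</foo>" inside CDATA.
doc = """<root>
  <foo>first</foo>
  <foo><![CDATA[a < b, and </foo> is just text here]]></foo>
</root>"""

root = ET.fromstring(doc)

# ".//foo" selects every <foo> element below the root.
values = [el.text for el in root.findall(".//foo")]

for value in values:
    print(value)
```

The CDATA section would trip up any of the regex approaches, while the parser treats it as ordinary text. The catch is that the input has to be well-formed XML.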
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
Edited by admin July 6, 2004, 05:44:08 PM EDT
New Re: It's ColdFusion
Oops, I forgot to hit the "next page" arrow icon to see the "Specifying minimal matching" section of that document in my offline Help version when looking for a solution. Personally I would make those navigation arrow icons much bigger than Macromedia did. For some reason many think that tiny fonts and tiny icons look more "professional". Maybe I am just getting old and blind. Either that, or they think aesthetics are more important than usability.

Anyhow, I will play with "?". It does have other XML tags, such as CFIF, in between tags in the text, so ">" searching alone won't work.

You are right about the newlines. I was looking at JavaScript reg-ex, which seems to treat "." differently for some reason.

Today's lessons:

1. Make sure you don't miss the "next page" icon when browsing offline docs.
2. Different languages treat some of the common symbols differently.
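
Lesson 2 is easy to demonstrate. The sketch below uses Python's re module, which is one of the flavours where "." skips newlines unless asked otherwise; the sample string is made up, and other engines may behave differently:

```python
import re

text = "<foo>line one\nline two</foo>"

# By default "." does not match a newline in Python.
default = re.search(r"<foo>(.*?)</foo>", text)

# re.DOTALL makes "." match newlines as well.
dotall = re.search(r"<foo>(.*?)</foo>", text, re.DOTALL)

# [\s\S] means "whitespace or non-whitespace", i.e. any character.
portable = re.search(r"<foo>([\s\S]*?)</foo>", text)

# Inside [...] the "." and "|" are literal characters, so this class
# only matches a dot, a pipe, or a newline.
in_class = re.search(r"<foo>([.|\n]*)</foo>", text)

print(default, in_class)
print(repr(dotall.group(1)))
print(repr(portable.group(1)))
```

The last pattern shows why the bracketed "any character" attempt fails in a flavour like this one: a character class is a list of literal characters, not a place for alternation.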

Thanks.
________________
oop.ismad.com
New No problem.
Any opportunity to look at something new. Even if it's ColdFusion.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New Re: No problem.
When you can't find anything new to look at, time to start your own.

Morph forum interface into a general purpose idiom. Everything is a polylog. A forum is an organized polylog.

Can a database be a polylog? The problem you are working on is just that.
-drl
New WTF is a polylog?
To deny the indirect purchaser, who in this case is the ultimate purchaser, the right to seek relief from unlawful conduct, would essentially remove the word consumer from the Consumer Protection Act
- [link|http://www.techworld.com/opsys/news/index.cfm?NewsID=1246&Page=1&pagePos=20|Nebraska Supreme Court]
New An undeveloped frog?
New An n-dimensional logarithm?
New Re: WTF is a polylog?
A lot of interconnected dialogs.

I think the right word is actually polylogy.
-drl
New Multiple BMs...
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New "We have top men working on it!"
-drl
New "Tells me my cookies are off."
New Sorry, you are not a winner.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New Blue Valkyrie needs food badly!
--
[link|mailto:greg@gregfolkert.net|greg],
[link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwethey

Heard near the SCOG employee entry/exit way:

  Security: We got another Mass Exodus Doorway Jam.
New How convenient is that??
Alex

"If I seem unduly clear to you, you must have misunderstood what I said." -- Alan Greenspan, Federal Reserve chairman
New Try argument substitution?
<(foo)>.*</\1>
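
As a sketch of what the backreference buys you, here is a generalized version in Python's re module (illustration only). It swaps the literal foo for \w+ and uses a minimal .*?, both of which are changes from the pattern above, and the sample string is invented:

```python
import re

# \1 must repeat whatever tag name group 1 captured.
pattern = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)

text = "<foo>a</foo><bar>b</bar><foo>c</baz>"

# Collect (tag name, content) for every properly closed pair.
pairs = [(m.group(1), m.group(2)) for m in pattern.finditer(text)]

print(pairs)
```

The trailing <foo>c</baz> is skipped because the closing name does not match the opening one.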
New I don't think that's the problem.
He's trying to avoid matching things like:

<foo>Something</foo><foo>Something else</foo>

with <foo>.*</foo> and getting Something</foo><foo>Something else as the value, which is what a greedy quantifier will do.
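
The difference is easy to see side by side. A minimal sketch in Python's re module (for illustration; the ColdFusion call is different), run against the same sample string:

```python
import re

text = "<foo>Something</foo><foo>Something else</foo>"

# Greedy: .* runs on to the LAST </foo> in the string.
greedy = re.findall(r"<foo>(.*)</foo>", text)

# Minimal: .*? stops at the FIRST </foo> it can reach.
lazy = re.findall(r"<foo>(.*?)</foo>", text)

print(greedy)
print(lazy)
```

The greedy pattern returns one "wide" match spanning both pairs, while the minimal one returns each pair's content separately.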
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
     reg-ex pair matching help - (tablizer) - (18)
         Which language? - (admin) - (15)
             It's ColdFusion - (tablizer) - (14)
                 Re: It's ColdFusion - (admin) - (13)
                     Re: It's ColdFusion - (tablizer) - (12)
                         No problem. - (admin) - (11)
                             Re: No problem. - (deSitter) - (10)
                                 WTF is a polylog? -NT - (ben_tilly) - (9)
                                     An undeveloped frog? -NT - (ChrisR) - (1)
                                         An n-dimensional logarithm? -NT - (Another Scott)
                                     Re: WTF is a polylog? - (deSitter)
                                     Multiple BMs... -NT - (admin) - (5)
                                         "We have top men working on it!" -NT - (deSitter) - (4)
                                             "Tells me my cookies are off." -NT - (Another Scott) - (3)
                                                 Sorry, you are not a winner. -NT - (admin) - (2)
                                                     Blue Valkyrie needs food badly! -NT - (folkert) - (1)
                                                         How convenient is that?? -NT - (a6l6e6x)
         Try argument substitution? - (ChrisR) - (1)
             I don't think that's the problem. - (admin)
