IWETHEY v. 0.3.0 | TODO
1,095 registered users | 0 active users | 0 LpH | Statistics
Login | Create New User
IWETHEY Banner

Welcome to IWETHEY!

New Oracle: An inverse of the TRANSLATE function?
I just keep thinking 'there must be a better way' about this task, so I'll throw it to the IGM, who are bound to come up with a better idea...

This data is in an Oracle table, and whatever I do to fix it must be done in a select statement (long, not-very-interesting story). There are at least a million and one ways of doing it in numerous programming languages - alas they're currently not an option for this task.

Basically, I want to take any non-numeric data out of a character field, leaving only the digits. So '123ab 45' should become 12345.

So I'm currently doing a rather ugly translate on the data. eg:

SELECT TRANSLATE(textfield, '0123456789ABCDEFGHIJKLMNOPQRSTUVWXY!@#$%^&*()-+./?><|][}{ ', '0123456789') FROM Table;



As you can see, neither pretty, nor foolproof. But I'll be stumped if I can think of a better way of doing it at the moment. Anyone know of a 'dont-translate-anything-but-these' rather than a 'translate-those' type of function?




Two out of three people wonder where the other one is.
New No, seems to be exactly what Oracle recommends for that task
As per this docco page: [link|http://download-uk.oracle.com/docs/cd/A87860_01/doc/server.817/a85397/function.htm#79574|http://download-uk.o...unction.htm#79574] .

Why, what's (most) wrong with it -- the "neither pretty", or the "nor foolproof" bit?

As for the first, I'm sorry, but I suspect this is as pretty as it gets...

As for the second, how big a concern is this? Could it be ("suffuciently") fixed by expanding the 'From' string in your code (so as to include at least lowercase letters and quote characters, for instance)?

If not, and you want it "more foolproof", that could be done -- at the expense of becoming a LOT less "pretty", though... And possibly introducing some real performance concerns. So, before I present you my SUSTR()- and DECODE()-based solution, please tell us: How long is your "textfield"? How many gazillion records are there in your table? And what, if any, limits are there to your run-time-window and server resource usage?


   [link|mailto:MyUserId@MyISP.CountryCode|Christian R. Conrad]
(I live in Finland, and my e-mail in-box is at the Saunalahti company.)
Your lies are of Microsoftian Scale and boring to boot. Your 'depression' may be the closest you ever come to recognizing truth: you have no 'inferiority complex', you are inferior - and something inside you recognizes this. - [link|http://z.iwethey.org/forums/render/content/show?contentid=71575|Ashton Brown]
New Wrong attitude
In this case, "more foolproof" should not be in quotes. Because the TRANSLATE solution is buggy. Depending on the rest of the application, one piece of unexpected input and you'll crash a process. Sucks when it is a batch process running in the middle of the night. (Been there, done that.)

Now I'll accept that Oracle hasn't built any function to make this (fairly common) task accomplishable in a simple and reliable way. However it isn't the programmer's fault for wanting the equivalent of s/^\\d//g. It exists in every other programming environment. It is faster to negate a small character class than it is to try to translate a large set of characters to nothing. It is more reliable.

It just doesn't happen to exist in Oracle. And Oracle doesn't care that their offered workaround is so suboptimal. (Probably because they know that nobody would ever make a purchase decision based on the presence of absence of this feature.)

Cheers,
Ben
To deny the indirect purchaser, who in this case is the ultimate purchaser, the right to seek relief from unlawful conduct, would essentially remove the word consumer from the Consumer Protection Act
- [link|http://www.techworld.com/opsys/news/index.cfm?NewsID=1246&Page=1&pagePos=20|Nebraska Supreme Court]
New Trouble is, the process doesn't crash,
which seems to be more a feature of the ETL tool, than anything else. As soon as Oracle returns an 'Invalid Number' error, it just gives the indication that it has read in all the records it needs to, and continues with the rest of the program. Which is a bit scary.

The fun part has been altering a colleague's attitude from 'Your program is wrong!' to 'Oh, uh, you fixed my SQL? Umm, thanks...' :)
Two out of three people wonder where the other one is.
New OK, here's the "foolproof-but-fugly" solution:
Like I said, based on SubStr() and Decode():

Select
  Decode(SubStr(TextField, 1, 1), '0', '0', '1', '1', '2', '2', '3', '3',
    '4', '4', '5', '5', '6', '6', '7', '7', '8', '8', '9', '9', NULL) ||
  Decode(SubStr(TextField, 2, 1), '0', '0', '1', '1', '2', '2', '3', '3',
    '4', '4', '5', '5', '6', '6', '7', '7', '8', '8', '9', '9', NULL) ||
  Decode(SubStr(TextField, 3, 1), '0', '0', '1', '1', '2', '2', '3', '3',
    '4', '4', '5', '5', '6', '6', '7', '7', '8', '8', '9', '9', NULL) ||
  Decode(SubStr(TextField, 4, 1), '0', '0', '1', '1', '2', '2', '3', '3',
    '4', '4', '5', '5', '6', '6', '7', '7', '8', '8', '9', '9', NULL) ||
  Decode(SubStr(TextField, 5, 1), '0', '0', '1', '1', '2', '2', '3', '3',
    '4', '4', '5', '5', '6', '6', '7', '7', '8', '8', '9', '9', NULL) ||
  Decode(SubStr(TextField, 6, 1), '0', '0', '1', '1', '2', '2', '3', '3',
    '4', '4', '5', '5', '6', '6', '7', '7', '8', '8', '9', '9', NULL) ||
  Decode(SubStr(TextField, 7, 1), '0', '0', '1', '1', '2', '2', '3', '3',
    '4', '4', '5', '5', '6', '6', '7', '7', '8', '8', '9', '9', NULL) ||
  Decode(SubStr(TextField, 8, 1), '0', '0', '1', '1', '2', '2', '3', '3',
    '4', '4', '5', '5', '6', '6', '7', '7', '8', '8', '9', '9', NULL) ||
  Decode(SubStr(TextField, 9, 1), '0', '0', '1', '1', '2', '2', '3', '3',
    '4', '4', '5', '5', '6', '6', '7', '7', '8', '8', '9', '9', NULL) Num_Text
From Table;


As you can see, this kind of does the same thing as Translate() -- translate '1' to '1', '2' to '2', etc -- only, the "Otherwise" parameter(*) to Decode() guarantees that everything else gets translated into nothing at all. The key "magic" is the fact (? I certainly hope so!) that concatenating a string with NULL yields, not NULL, but the original string. The downside is, obviously, that we have to do it one character at a time. But, on the gripping hand: It works.

Now, this is based (and tested) on a nine-character "TextField"; if yours is longer, you just duplicate and adapt the long butt-ugly line(#) for each extra character; increment the second parameter to SubStr() (the bold red digits above) and keep concatenating each potentially-numeric result character to your total result. If yours is shorter, you delete a few lines from the end of the above, of course -- just remember to remove the "||" concatenation operator from the last one. The result of this operation will still be a string variable (i.e, of type VarChar2, presumably), albeit containing only numeric characters; if you want it as a genuine numeric one, then wrap the whole shebang in a call to the To_Num() function.

[Edit - forgot to add:] What I didn't test, is performance. If you recall the thread here about Decode() a little while ago, opinion about its use was divided; it can have a significant impact on performance sometimes; in other situations, it's negligible. Nothing for it but to try it out, I suspect... Oh, what the heck, I'll try it out forya! OK, results: Selecting, as per the above, the numerics of a VarChar2(10) field from a 208,000-row table took 4.1 seconds. This field, although defined as character, seems to contain only numbers. (I wrapped it in Max() and Min(), so as to eliminate 208,000 rows using up our bandwidth and scrolling up my screen; same reult both times.) Tried another field, VarChar2(30) (this one doesn't seem to contain any numerics at all), the same way; that took 15 seconds.

HTH!




(*): Very useful, that "default" parameter -- the rule-of-thumb to remember is, if the number of parameters to Decode() is even, then every source value not specified in the previous parameters stays as-is; if the number of parameters is odd, then every non-specified value gets translated into the last, "odd-man-out", parameter value.

(#): Each "conceptual" line is split into two "actual" lines above, so as to stave off horizontal scrolling in your browser; the "line" that you are supposed to duplicate is the text from "Decode(SubStr(" to "'9', NULL) ||".


   [link|mailto:MyUserId@MyISP.CountryCode|Christian R. Conrad]
(I live in Finland, and my e-mail in-box is at the Saunalahti company.)
Your lies are of Microsoftian Scale and boring to boot. Your 'depression' may be the closest you ever come to recognizing truth: you have no 'inferiority complex', you are inferior - and something inside you recognizes this. - [link|http://z.iwethey.org/forums/render/content/show?contentid=71575|Ashton Brown]
Expand Edited by CRConrad May 7, 2004, 04:14:18 AM EDT
New Believe me I came close to writing something similar
That's tops!

It's a 20 charater field, so it would be managable.

I'd have to test the 'concatinating Null to something still gives you something' though. You can't do TRANSLATE(NumField,'ABCDetc...','') because anything evaluated with a Null makes the whole thing Null (Oracle have a better explanation than I!)

Even if NULL isn't allowed, it could be DECODEd to something, (eg: 'Z') and then a further TRANSLATE could get rid of the Z. So there you go - I could take this code from fugly to ... fuglier.

Anyway, it goes like this: the requestor of this field knows the issue, knows it's not foolproof, and is likely to sign off on it anyway. It still bugs me that the solution isn't perfect, and I am reasonable sure the requestor would not claim ownership of the SUBSTR-each-character SQL, technically perfect though it is.

Whatever. It's Friday evening. I'm home now. What am I doing talking about work?!

/me cracks open a beer...

Edit: Typo in subject line. I blame the beer...
Two out of three people wonder where the other one is.
Expand Edited by Meerkat May 7, 2004, 04:16:30 AM EDT
New *shudder*
--
[link|mailto:greg@gregfolkert.net|greg],
[link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwethey

Give a man a match, he'll be warm for a minute.
Set him on fire, he'll be warm for the rest of his life!
New What version of Oracle?
If not 10g, upgrade.

[link|http://otn.oracle.com/oramag/webcolumns/2003/techarticles/rischert_regexp_pt1.html|http://otn.oracle.co...t_regexp_pt1.html]

hehehee.

Yeah, like anyone would get permssion to go with a bleeding edge version of Oracle just for regular expressions.
New That's cool
After reading the article, I'm guessing that they might be using a DFA engine, but I'd love to be proven wrong.

Cheers,
Ben
To deny the indirect purchaser, who in this case is the ultimate purchaser, the right to seek relief from unlawful conduct, would essentially remove the word consumer from the Consumer Protection Act
- [link|http://www.techworld.com/opsys/news/index.cfm?NewsID=1246&Page=1&pagePos=20|Nebraska Supreme Court]
New Gah.
Sheesh, we just finished the conversion to 9 this year.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New Well spotted!
For the problem I wrote about, I'd be tempted to push for an upgrade just for that one feature :-)

Trouble is, I fear my pleadings would mostly be met with the reply "...Regular expression?"
Two out of three people wonder where the other one is.
     Oracle: An inverse of the TRANSLATE function? - (Meerkat) - (10)
         No, seems to be exactly what Oracle recommends for that task - (CRConrad) - (2)
             Wrong attitude - (ben_tilly) - (1)
                 Trouble is, the process doesn't crash, - (Meerkat)
         OK, here's the "foolproof-but-fugly" solution: - (CRConrad) - (2)
             Believe me I came close to writing something similar - (Meerkat)
             *shudder* -NT - (folkert)
         What version of Oracle? - (broomberg) - (3)
             That's cool - (ben_tilly)
             Gah. - (admin)
             Well spotted! - (Meerkat)

Department of Redundancy Department
112 ms