Re: Unicode and ASCII - Nitpick II
ASCII and UNICODE define sets of code points; that is, numeric values assigned to characters. As it turns out, the ASCII code points are identical to the UNICODE code points for the characters represented by ASCII.
UTF-8, UTF-16 (both versions*), UTF-32, UCS-2, etc. are all encoding schemes; that is, mechanisms through which the code points can be represented as bytes. In ASCII, such things are not necessary because ASCII is defined to be fully representable in a single byte. UNICODE is not, and so we have come up with all sorts of ways to represent the 97,000+ characters that UNICODE currently represents (and more coming RSN!). The encoding schemes listed above (along with UCS-4) are specifically for UNICODE. So talking about representing ASCII as UTF-8 is (pedantically) meaningless. You can represent the ASCII subset of UNICODE using UTF-8, however (it's a "null translation"), but then you're really representing UNICODE.
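A quick Python sketch (mine, not from the original post) makes the "null translation" point concrete: characters in the ASCII range produce byte-for-byte identical output under ASCII and UTF-8, while other UNICODE encoding schemes use wider units, and non-ASCII code points need multiple bytes even in UTF-8.

```python
# Characters in the ASCII range (U+0041..U+0043).
text = "ABC"

# The "null translation": ASCII bytes and UTF-8 bytes are identical
# for the ASCII subset of UNICODE.
assert text.encode("ascii") == text.encode("utf-8")

# The same code points under other UNICODE encoding schemes use
# wider code units (2 bytes for UTF-16, 4 for UTF-32):
print(text.encode("utf-16-be").hex())  # 004100420043
print(text.encode("utf-32-be").hex())  # 000000410000004200000043

# A code point outside ASCII needs more than one byte in UTF-8:
print("é".encode("utf-8").hex())       # c3a9 (U+00E9 -> two bytes)
```

So a file of pure ASCII text is already valid UTF-8, but the reverse is not true.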
* Both versions means big-endian and little-endian, but you already knew that...
jb4
shrub·bish (Am., from shrub + rubbish, after the derisive name for America's 43rd president; 2003) n. 1. a form of nonsensical political doubletalk wherein the speaker attempts to defend the indefensible by lying, obfuscation, or otherwise misstating the facts; GIBBERISH. 2. any of a collection of utterances from America's putative 43rd president. cf. BULLSHIT