New programming methodology question
You can read this because of pattern matching by your brain and eye. If I were to use ASCII art, the pattern matching would still take place. Now let's embed words into a picture using pixel delineation. The brain reads it easily. A computer can't.

Now if I take a picture and separate it into patterns by color shift, I could then shrink the patterns to, say, 12 points, and fuzzily match each pattern to a similar bit representation of the 26 letters of the alphabet. It has to be quick, so that I could process several hundred pictures per second.

That's the problem: what is the best way to research how to write such a program, and which languages or approaches would you suggest?
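
The shrink-and-match steps described above can be sketched roughly as follows. This is only a minimal illustration, not a working OCR engine: the 5x5 `TEMPLATES`, the `resize` and `fuzzy_match` helpers, and the use of Hamming distance as the "fuzzy" measure are all assumptions made for the example, and it presumes each glyph has already been cut out of the picture.

```python
# Hypothetical 5x5 letter templates; a real system would cover all 26
# letters on a larger grid (for example 12x12).
TEMPLATES = {
    "I": ["#####", "..#..", "..#..", "..#..", "#####"],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
}


def resize(rows, size):
    """Nearest-neighbour rescale of a glyph to size x size cells."""
    h, w = len(rows), len(rows[0])
    return ["".join(rows[r * h // size][c * w // size] for c in range(size))
            for r in range(size)]


def to_bits(rows):
    """Flatten a grid of '#'/'.' characters into a list of 1s and 0s."""
    return [1 if ch == "#" else 0 for row in rows for ch in row]


def hamming(a, b):
    """Count the positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def fuzzy_match(rows, size=5):
    """Return (letter, distance) for the template closest to the glyph."""
    glyph = to_bits(resize(rows, size))
    distance, letter = min(
        (hamming(glyph, to_bits(t)), name) for name, t in TEMPLATES.items()
    )
    return letter, distance


# An 'L' with one stray pixel and one missing pixel.
noisy_l = ["#....", "#....", "#.#..", "#....", "####."]
print(fuzzy_match(noisy_l))  # ('L', 2)
```

Each comparison is a handful of integer operations, so the matching step itself is cheap; the hard parts are cutting the glyphs out of the picture and coping with distortion.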

thanx,
bill
Any opinions expressed by me are mine alone, posted from my home computer, on my own time as a free american and do not reflect the opinions of any person or company that I have had professional relations with in the past 50 years. meep
New Historically speaking
Marvin Minsky did a lot of the early work on character recognition back in the '50s. Most of the research over the years has been done in Lisp and Prolog (being associated with artificial intelligence), but most of the practical implementations have been done in C (for performance reasons). But I don't really think the programming language matters. It's more of a data representation and algorithm sort of problem.
New is C still quick compared to
Perl, which is fast at sorting (which is essentially what this would do), or a visual-type language like Squeak? I think I would need to look at rendering software for JPEG, GIF, etc. to see how they code the data reductions; then I would know what I was looking for. There are also licensing issues. GPL 2 seems extremely open, but my owners may have a say in that. If I could get a framework, GPL would be a faster route to a finished product than other license types, which would be my preference.
thanx,
bill
New Pattern recognition
Character recognition is a subset of the more general problem of pattern recognition. The programming language really doesn't matter - Perl is about as good as any other language. My hunch would be that C can be tuned for performance but for exploratory purposes, whatever PL you're comfortable with will suffice.

Pattern recognition is not a solved problem. There are algorithms that can be used for specialized domains, and you can get two-thirds recognition in a lot of instances, but getting that last third gets exponentially more difficult. Comparing a grid of bits with other grids of bits in a lookup table has the problem that the lookup table would have to be enormous. Using pixels has the problem that you can't know a priori where one character starts and the next ends (they may even overlap). And the pixels set for a particular pattern can have many sizes and distortions (handwritten text is much harder than typeset).
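
To make the two problems above concrete: a 12x12 grid of bits has 2**144 possible patterns, so an exact lookup table is out of the question and matching has to be approximate. The boundary problem can be seen with a minimal sketch of one common way to split a bitmap into candidate characters, connected-component labelling. The function name and the tiny test bitmaps are assumptions made for this example; the point is that two letters that touch come back as a single blob.

```python
from collections import deque


def connected_components(rows):
    """Return sorted bounding boxes (left, top, right, bottom) of each
    4-connected blob of '#' cells in the bitmap."""
    h, w = len(rows), len(rows[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if rows[r][c] != "#" or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited ink cell.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and rows[ny][nx] == "#" and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)


# An 'L' and a 'T' with a gap between them: two blobs.
separate = ["#...###",
            "#....#.",
            "###..#."]
# The same letters pushed together so the foot of the 'L' touches the
# stem of the 'T': one blob.
touching = ["#.###",
            "#..#.",
            "####."]

print(connected_components(separate))  # [(0, 0, 2, 2), (4, 0, 6, 2)]
print(connected_components(touching))  # [(0, 0, 4, 2)]
```

In the second case a per-blob matcher would be handed a shape that is neither letter, which is why segmentation and recognition usually cannot be treated as fully separate steps.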

There is a long history of studying pattern recognition. Your best bet would be to locate an open source project that does character recognition. I can't recall the name of the one I looked at, but its recognition accuracy wasn't quite high enough for the project I had in mind.
New Smalltalk has been used to do OCR
on Sanskrit.

[link|http://www.blackbagops.net/?p=75|http://www.blackbagops.net/?p=75]



[link|http://www.blackbagops.net|Black Bag Operations Log]

[link|http://www.objectiveclips.com|Artificial Intelligence]

[link|http://www.badpage.info/seaside/html|Scrutinizer]
New I'm led to believe that Ocaml is pretty darned quick


Peter
[link|http://www.no2id.net/|Don't Let The Terrorists Win]
[link|http://www.kuro5hin.org|There is no K5 Cabal]
[link|http://guildenstern.dyndns.org|Home]
Use P2P for legitimate purposes!
[link|http://kevan.org/brain.cgi?pwhysall|A better terminal emulator]
[link|http://darwinia.co.uk/|[image|http://i66.photobucket.com/albums/h262/pwhysall/Misc/saveus.png|0|Darwinia||]]
New according to the ocaml resources C is faster
and for better or worse I already know how to mangle C.
thanx,
bill
New I really think this is going to take genetic algo
The reason humans can pattern match text is that we've seen so much of it. Get a highly-ornate English script font and most people have a hard time of it. And when you do figure it out, a lot of it is looking at two characters and saying, "Well, this one looks more like an 'I' than that one does, so the other one must be the 'L'."

A deterministic program works from the assumption that we're automating a well-understood process. I don't think alphabet recognition is well understood quantitatively. Which pretty much means you don't write the program, you train it.
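
As a rough illustration of "train it, don't write it", here is a minimal sketch that learns letter weights from examples instead of hand-coding rules. Note that it uses a simple multi-class perceptron, not a genetic algorithm, because that is the shortest trainable thing that fits in a post; the 5x5 `LETTERS` bitmaps and all function names are assumptions made for the example.

```python
# Hypothetical 5x5 training bitmaps, one clean example per letter.
LETTERS = {
    "I": ["#####", "..#..", "..#..", "..#..", "#####"],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
}


def features(rows):
    """Flatten the bitmap to 0/1 values and append a constant bias input."""
    return [1 if ch == "#" else 0 for row in rows for ch in row] + [1]


def score(weights, x):
    """Dot product of one letter's weight vector with the feature vector."""
    return sum(w * v for w, v in zip(weights, x))


def predict(model, rows):
    """Pick the letter whose weight vector scores the bitmap highest."""
    x = features(rows)
    return max(model, key=lambda label: score(model[label], x))


def train(samples, max_epochs=50):
    """Multi-class perceptron: on each mistake, pull the right letter's
    weights toward the example and push the wrongly guessed letter's away."""
    n = len(features(samples[0][1]))
    model = {label: [0] * n for label, _ in samples}
    for _ in range(max_epochs):
        mistakes = 0
        for label, rows in samples:
            guess = predict(model, rows)
            if guess != label:
                mistakes += 1
                for i, v in enumerate(features(rows)):
                    model[label][i] += v
                    model[guess][i] -= v
        if mistakes == 0:  # every training example is classified correctly
            break
    return model


samples = list(LETTERS.items())
model = train(samples)

# An 'L' with one stray pixel and one missing pixel, not seen in training.
noisy_l = ["#....", "#....", "#.#..", "#....", "####."]
print(predict(model, noisy_l))  # L
```

Nothing in the code says what an 'L' looks like; the weights come entirely from the examples. Real fonts and handwriting would need far more training data and a far more capable model than this.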
===

Kip Hawley is still an idiot.

===

Purveyor of Doc Hope's [link|http://DocHope.com|fresh-baked dog biscuits and pet treats].
[link|http://DocHope.com|http://DocHope.com]
New so give me something to train
New Hey, I'm an idea man
And my idea was that you find someone who knows what they're doing to write the code.
Edited by drewk Dec. 3, 2006, 11:43:03 PM EST
New ICLRPD (new thread)
Created as new thread #274753 titled [link|/forums/render/content/show?contentid=274753|ICLRPD]
--
Steve
[link|http://www.ubuntulinux.org|Ubuntu]
     programming methodology question - (boxley) - (10)
         Historically speaking - (ChrisR) - (5)
             is C still quick compared to - (boxley) - (4)
                 Pattern recognition - (ChrisR)
                 Smalltalk has been used to do OCR - (tuberculosis)
                 I'm led to believe that Ocaml is pretty darned quick -NT - (pwhysall) - (1)
                     according to the ocaml resources C is faster - (boxley)
         I really think this is going to take genetic algo - (drewk) - (3)
             so give me something to train -NT - (boxley) - (2)
                 Hey, I'm an idea man - (drewk) - (1)
                     ICLRPD (new thread) - (Steve Lowe)