Current speech recognition software usually has to be trained to the speaker (e.g. voice dictation packages like IBM's ViaVoice), and it has a limited vocabulary.

[link|http://www.sensoryinc.com/|Sensory, Inc.] makes chips for voice activation and speech synthesis.

[link|http://www.wired.com/wired/archive/8.05/tpmap.html|Wired] has an article on universal translation issues.

The NSA and other government agencies probably have people working on this problem too...

In the problem you posed, you'd have to be able to tell heavy dialects and sloppy grammar apart from actual language differences. And identifying the language is only the first step: after all, you want to know what they're yammering about, not just which language it's in. :-)
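To give a feel for that "first step", here's a rough sketch of guessing a language from written text by comparing character trigrams. It works on text rather than speech, so it's a much easier problem than the one you posed. The sample sentences and the names (SAMPLES, trigrams, guess_language) are all made up for illustration, and a real system would need far more training data.

```python
from collections import Counter

# Tiny made-up training samples (an assumption for illustration only);
# a usable system would need much larger text samples per language.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and then the dog sleeps in the sun",
    "spanish": "el rápido zorro marrón salta sobre el perro perezoso y luego el perro duerme en el sol",
    "german": "der schnelle braune fuchs springt über den faulen hund und dann schläft der hund in der sonne",
}


def trigrams(text):
    """Count overlapping three-character sequences in the padded, lowercased text."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


# One trigram profile per language, built from the samples above.
PROFILES = {lang: trigrams(sample) for lang, sample in SAMPLES.items()}


def guess_language(text):
    """Return the language whose profile shares the most trigrams with the text."""
    grams = trigrams(text)

    def overlap(profile):
        # Count shared trigrams, capped by how often each appears on both sides.
        return sum(min(count, profile[gram]) for gram, count in grams.items())

    return max(PROFILES, key=lambda lang: overlap(PROFILES[lang]))


if __name__ == "__main__":
    print(guess_language("the dog and the fox"))
    print(guess_language("el perro y el zorro"))
    print(guess_language("der hund und der fuchs"))
```

Even when something like this guesses right, all it tells you is which language it probably is, not what was said, and a heavy dialect or sloppy spelling could easily throw it off.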

I remember a seminar by someone doing research for the Air Force in the late 1970s who was trying to figure out how to define "B-ness", independent of the font, as a first step toward designing a system to read printed text. It's not trivial, but it's a problem that's pretty well solved now (for some values of "solved").

He also told of how people in different regions of the country would say "Merry Mary got married": "Meery meery got meeried" and dozens of other variations.

Think of the different ways people say "water".

I'm no expert, so take my thoughts with a grain of salt. :-)

Happy hunting!

Cheers,
Scott.