- Automatic Language Identification
with computational identification language linguistics
- Comparing methods of Language ID
Abstract: This work compares three statistical language identification methods and studies in detail how some basic parameters influence them. The parameters analyzed are the size of the training set, the amount of text to be classified, and the set of languages the system can distinguish (both the number of languages and which languages are considered).
with application computational language linguistics object sepln04-pp.pdf
- Eidetica guesser
Claims 70 languages.
with identification language
- Inxight LinguistX_FinalWeb
with computational inxight language linguistics linguistx_finalweb
- Kevin Scannell Corpora
api corpora corpus crawler density google languages low minority nlp pi-languages spell spellchecker spider under-resourced web
with computational corpora kevin language linguistics scannel
- Language Guesser
Claims 70 languages.
Eidetica - hosted knowledge
hosting internet knowledge mining search text
with computational guesser language linguistics
- Language Tags mailing list
with computational language linguistics
- Languages and Character Set ID
with character computational language languages linguistics
- Lextek International
Language Identifier. Claims "260 different languages and character encodings."
api component document engine findex free full full-text index indexing language onix retrieval search software text toolkit
with identifier language
- LISTSERV 14.4
with computational language linguistics
- MIT Lincoln Laboratory: Information Systems Technology
Includes the Speech Processing group, which has worked in speech recognition; speaker recognition (identification, verification, and authentication); language and dialect identification; word spotting; speech coding; and speech and audio signal enhancement.
with identification language
- Polyglot
automatic language identifier
guesser identifier language languages recognize recognizer
with computational identifier language linguistics
- Polyglot
Automatic written-language identifier; claims 400 languages.
with computational identifier language linguistics
- Utrac
A command-line tool and library that recognizes the encoding of an input file (e.g. UTF-8, ISO-8859-1, CP437...) and its end-of-line type (CR, LF, CRLF).
Universal Text Recognizer And Converter. Detects encoding and end-of-line type.
charset code codepage converter convertion detection encoding page recognition recognizer utf-8
with character encoding identification
- vannord Language ID Tools
with computational language linguistics tools vannord