Computational Linguistics, by Adnan Baloch (Group 12)
Contents: Introduction, Origin, Main Application Areas, Approaches, Conclusion
Introduction: The Association for Computational Linguistics defines computational linguistics (CL) as the scientific study of language from a computational perspective. Computational linguists are interested in providing computational models of various kinds of linguistic phenomena. Work in computational linguistics is in some cases motivated from a scientific perspective, in that one is trying to provide a computational explanation for a particular linguistic or psycholinguistic phenomenon.
Definition: Computational linguistics is the application of linguistic theories and computational techniques to problems of natural language processing. Grishman (1986) defines computational linguistics as the study of computer systems for understanding and generating natural language.
Structure
Purpose: The purpose of CL is to develop applications that handle computing tasks related to human language, such as software for grammar correction, word sense disambiguation, compilation of dictionaries and corpora, and automatic translation from one language to another.
Origin: Computational linguistics is often grouped within the field of artificial intelligence, but it predates the development of artificial intelligence. It originated with efforts in the United States in the 1950s to use computers to automatically translate texts from foreign languages, particularly Russian scientific journals, into English. Since computers can perform arithmetic (systematic) calculations much faster and more accurately than humans, it was thought to be only a short matter of time before they could also begin to process language. Computational and quantitative methods have also been used historically in attempts to reconstruct earlier forms of modern languages and to sub-group modern languages into language families. Earlier methods, such as lexicostatistics and glottochronology, proved to be premature and inaccurate. However, recent interdisciplinary studies that borrow concepts from biology, especially gene mapping, have produced more sophisticated analytical tools and more reliable results. When machine translation (also known as mechanical translation) failed to yield accurate translations right away, automated processing of human languages was recognized as far more complex than had originally been assumed. Computational linguistics was born as the name of the new field of study devoted to developing algorithms and software for intelligently processing language data. The term "computational linguistics" was coined by David Hays, a founding member of both the Association for Computational Linguistics (ACL) and the International Committee on Computational Linguistics (ICCL).
To translate one language into another, it was observed that one had to understand the grammar of both languages, including both morphology (the grammar of word forms) and syntax (the grammar of sentence structure). To understand syntax, one also had to understand the semantics and the lexicon (or 'vocabulary'), and even something of the pragmatics of language use. Thus, what started as an effort to translate between languages evolved into an entire discipline devoted to understanding how to represent and process natural languages using computers. Nowadays, research within the scope of computational linguistics is done at computational linguistics departments, computational linguistics laboratories, computer science departments, and linguistics departments. Some research in the field aims to create working speech or text processing systems, while other research aims to create systems allowing human-machine interaction. Programs meant for human-machine communication are called conversational agents.
Main Application Areas: Machine Translation; Natural Language Interface; Speech Recognition; Intelligent Word Processing (grammar checking, spelling correction)
Machine Translation: Machine translation (MT) is a sub-field of computational linguistics that investigates the use of software to translate text or speech from one language to another. Translation involves moving texts from one human language (the source language) to another (the target language) in a way that preserves meaning. Machine translation automates this process, or part of it. Two modes are distinguished: fully automatic translation and computer-aided (human) translation. Is MT good or not?
Natural Language Interface: A natural language interface (NLI) is an interface that allows users to interact with the computer using a human language. An NLI essentially provides an abstraction layer between users and computers by enabling computers to understand human language instead of the other way around. It allows the user to enter natural language search queries as written or spoken text. Proven Capabilities: Natural-language (NL) interfaces built so far have primarily addressed the problem of accessing information stored in conventional database systems. Among the proven capabilities exhibited by these systems are those that: provide reasonably good access to specific databases; answer direct questions ("What is Komal's salary?"); and coordinate multiple files ("What is Komal's location?" translates into "What is the location of the department of Komal?").
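The two capabilities above (answering a direct question and coordinating multiple files) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the table names `EMPLOYEES` and `DEPARTMENTS`, the stored values, and the single question pattern "What is <name>'s <attribute>?". Real NL interfaces cover far more question forms.

```python
import re

# Hypothetical toy data standing in for two separate database files.
EMPLOYEES = {
    "komal": {"salary": 50000, "department": "research"},
}
DEPARTMENTS = {
    "research": {"location": "Building A"},
}

# Only one question form is recognised: "What is <name>'s <attribute>?"
QUESTION = re.compile(r"what is (\w+)['’]s (\w+)\??$", re.IGNORECASE)


def answer(question):
    """Answer a question of the supported form, or return None."""
    match = QUESTION.match(question.strip())
    if match is None:
        return None  # question form not covered by this sketch
    name = match.group(1).lower()
    attribute = match.group(2).lower()
    employee = EMPLOYEES.get(name)
    if employee is None:
        return None
    # Direct question: the attribute is stored with the employee.
    if attribute in employee:
        return employee[attribute]
    # Coordinating files: "Komal's location" is read as
    # "the location of the department of Komal".
    department = DEPARTMENTS.get(employee["department"], {})
    return department.get(attribute)


print(answer("What is Komal's salary?"))
print(answer("What is Komal's location?"))
```

The second query shows the coordination step: the location is not stored in the employee record, so the sketch follows the employee's department into the second table.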
Speech Recognition: Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers. How does it work? Speech recognition software works by breaking down the audio of a speech recording into individual sounds, analyzing each sound, using algorithms to find the most probable word fit in that language, and transcribing those sounds into text. Digital Assistants: Digital assistants are designed to help people perform basic tasks and respond to queries by accessing information from vast databases and various digital sources. Examples: Amazon's Alexa, Apple's Siri, and Google's Google Assistant.
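The "most probable word fit" step can be sketched in Python as follows. This is a toy illustration, not how any particular recognizer is implemented: the pronunciation lexicon, the prior word probabilities, and the per-sound accuracy `p_correct` are all made-up assumptions, and the audio is assumed to have already been split into sound symbols.

```python
# Hypothetical pronunciation lexicon: word -> sequence of sound symbols.
LEXICON = {
    "to": ("t", "uw"),
    "two": ("t", "uw"),
    "too": ("t", "uw"),
    "tea": ("t", "iy"),
}

# Hypothetical prior probabilities of each word in the language.
WORD_PRIOR = {"to": 0.6, "two": 0.2, "too": 0.15, "tea": 0.05}


def sound_likelihood(observed, expected, p_correct=0.95):
    """Probability of hearing `observed` if the word sounds like `expected`.

    Each sound is assumed to be recognised correctly with probability
    `p_correct`, independently of the other sounds.
    """
    if len(observed) != len(expected):
        return 0.0
    likelihood = 1.0
    for heard, reference in zip(observed, expected):
        likelihood *= p_correct if heard == reference else (1.0 - p_correct)
    return likelihood


def most_probable_word(observed):
    """Pick the word maximising likelihood of the sounds times word prior."""
    scores = {
        word: sound_likelihood(observed, sounds) * WORD_PRIOR[word]
        for word, sounds in LEXICON.items()
    }
    return max(scores, key=scores.get)


print(most_probable_word(("t", "uw")))
print(most_probable_word(("t", "iy")))
```

The sounds alone cannot separate "to", "two", and "too", so the prior decides between them. This is why the choice is described as the most probable fit "in that language".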
Grammar Checking: Grammar checking is the task of detecting and correcting grammatical errors in text. English is the dominant language in science and technology, so non-native English speakers need to use correct English grammar when reading, writing, or speaking. This creates a need for automatic grammar checking tools. Grammar Checking Tools: Such tools have been developed from the 1980s until now. The earliest grammar checking tools, e.g., Writer's Workbench, were aimed at detecting punctuation and style errors. In the 1990s, many tools became available, e.g., Right Writer. In recent decades, the field has developed rapidly. For example, Park et al. developed a grammar checker as a web application for university ESL students. Types of Errors: sentence structure errors, spelling errors, punctuation errors, syntax errors, preposition errors, semantic errors, etc.
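A minimal rule-based checker can make the idea of error detection concrete. The Python sketch below is an illustration under stated assumptions, not a description of Writer's Workbench, Right Writer, or the Park et al. system. The three rules and the function name `check` are chosen for the example, and the article rule looks only at spelling, so it would misjudge words such as "hour" or "university".

```python
import re


def check(text):
    """Return a list of (error_type, detail) pairs found by three simple rules."""
    issues = []

    # Rule 1: the same word twice in a row, e.g. "the the".
    for match in re.finditer(r"\b(\w+)\s+\1\b", text, re.IGNORECASE):
        issues.append(("repeated word", match.group(1).lower()))

    # Rule 2: "a" directly before a word starting with a vowel letter.
    for match in re.finditer(r"\ba\s+([aeiou]\w*)", text, re.IGNORECASE):
        issues.append(("article", "a " + match.group(1)))

    # Rule 3: the text does not end with sentence-final punctuation.
    stripped = text.strip()
    if stripped and stripped[-1] not in ".!?":
        issues.append(("punctuation", "missing end mark"))

    return issues


print(check("This is a apple"))
print(check("The the cat sat."))
print(check("She reads a book."))
```

Hand-written rules like these only catch surface patterns. Error types such as semantic or sentence structure errors need deeper analysis of the sentence.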
Approaches: Just as computational linguistics can be performed by experts in a variety of fields and through a wide assortment of departments, so too can the research fields broach a diverse range of topics. The following sections discuss some of the literature available across the entire field, broken into four main areas of discourse: developmental linguistics, structural linguistics, linguistic production, and linguistic comprehension. Developmental Approach: Language is a cognitive skill that develops throughout the life of an individual. This developmental process has been examined using several techniques, and a computational approach is one of them. Human language development imposes some constraints that make it harder to apply a computational method to understanding it.
Structural Approach: To create better computational models of language, an understanding of language's structure is crucial. To this end, the English language has been meticulously studied using computational approaches to better understand how the language works on a structural level. One of the most important prerequisites for studying linguistic structure is the availability of large linguistic corpora or samples. Production Approach: The production of language is equally complex in the information it provides and the skills a fluent producer must have. That is to say, comprehension is only half the problem of communication. The other half is how a system produces language, and computational linguistics has made interesting discoveries in this area. Text-based Interactive Approach: In this method, words typed by a user trigger the computer to recognize specific patterns and reply accordingly, through a process known as keyword spotting.
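Keyword spotting as described above can be sketched in a few lines of Python. The keyword sets, the replies, and the names `RESPONSES`, `DEFAULT_REPLY`, and `reply` are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical table of keyword sets and the reply each one triggers.
RESPONSES = [
    ({"hello", "hi"}, "Hello! How can I help you?"),
    ({"translate", "translation"}, "Which language would you like to translate into?"),
    ({"bye", "goodbye"}, "Goodbye!"),
]
DEFAULT_REPLY = "Sorry, I did not understand that."


def reply(message):
    """Return the reply for the first keyword set that the message triggers."""
    # Split the typed message into lowercase words.
    words = set(re.findall(r"[a-z']+", message.lower()))
    for keywords, response in RESPONSES:
        if words & keywords:  # at least one keyword was typed
            return response
    return DEFAULT_REPLY


print(reply("Hi there"))
print(reply("Can you translate this?"))
print(reply("What time is it?"))
```

Matching is done on whole words, so "this" does not trigger the "hi" keyword. The system recognizes patterns without understanding the message, which is why unmatched input falls through to a default reply.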
Speech-based Interactive Approach: Recent technologies have placed more emphasis on speech-based interactive systems. These systems, such as Siri on the iOS operating system, use a pattern-recognition technique similar to that of text-based systems, but the user's input is captured through speech recognition. Comprehension Approach: Much of the focus of modern computational linguistics is on comprehension. With the proliferation of the internet and the abundance of easily accessible written human language, a program capable of understanding human language would open up many broad and exciting possibilities, including improved search engines, automated customer service, and online education.
Conclusion: Nowadays, research within the scope of CL is done at computational linguistics departments, CL laboratories, computer science departments, and linguistics departments.