|
Named entity translation is important in multilingual natural language processing tasks such as cross-lingual information retrieval, statistical machine translation, and cross-lingual question answering. This course project focuses on English/Chinese named entity translation. I plan to use features of the Chinese language to improve named entity translation quality. |
|
Named entity translation is an important issue for several research areas, such as cross-lingual information retrieval, machine translation, and cross-lingual question answering. Currently, researchers prefer to give a general algorithm to solve the problem, but different languages have their own features. For example, Chinese named entities are usually rendered in Pinyin when translated into English. In addition, most researchers focus on English, for which commercial named entity identification tools such as the BBN identifier are available, whereas comparable tools for Chinese named entity extraction are lacking. In this project, I will therefore also put some effort into Chinese named entity extraction. |
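One way the Pinyin feature mentioned above might be used is to romanize a Chinese name and compare it with English candidate names as strings. The following is only a minimal sketch under stated assumptions: the `PINYIN` table is a hypothetical toy stand-in for a real resource such as a Chinese dictionary with Pinyin, and the function names and the choice of `difflib.SequenceMatcher` as the similarity measure are illustrative, not the project's method.

```python
from difflib import SequenceMatcher

# Hypothetical toy character-to-Pinyin table (an assumption for illustration);
# a real system would load readings from a dictionary resource.
PINYIN = {
    "北": "bei",
    "京": "jing",
    "上": "shang",
    "海": "hai",
}


def to_pinyin(chinese_name):
    """Concatenate the Pinyin of each character; return None if any is unknown."""
    syllables = []
    for ch in chinese_name:
        if ch not in PINYIN:
            return None
        syllables.append(PINYIN[ch])
    return "".join(syllables)


def pinyin_similarity(chinese_name, english_name):
    """Score in [0, 1] comparing the romanized Chinese name with the English name."""
    romanized = to_pinyin(chinese_name)
    if romanized is None:
        return 0.0
    # Keep only letters so spaces and hyphens do not affect the comparison.
    english = "".join(c for c in english_name.lower() if c.isalpha())
    return SequenceMatcher(None, romanized, english).ratio()


def best_match(chinese_name, english_candidates):
    """Return the English candidate with the highest Pinyin similarity."""
    return max(english_candidates,
               key=lambda e: pinyin_similarity(chinese_name, e))
```

For instance, `best_match("上海", ["Beijing", "Shanghai", "Hong Kong"])` selects the candidate whose spelling is closest to the romanized name. Names that are not transliterated through Pinyin would need a different feature.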
Working on …
|
|
· BBN Named Entity Tool
· LDC Chinese Dictionary with Pinyin
· English/Chinese parallel corpus: Hong Kong News |
Schedule
|
· Sep 15 – Oct 15:
  · Chinese named entity identification
· Oct 16 – Nov 10:
  · English/Chinese named entity alignment
  · Result: get translation dictionary
  · Check the queries including named entities
· Nov 11 – Dec:
  · Experiments
  · Comparing the performance with other methods
  · Report |
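The alignment step in the schedule, which should yield a translation dictionary, could be approached in several ways. As a rough sketch only, the code below assumes that named entities have already been extracted on both sides of each aligned sentence pair, and pairs each Chinese entity with the English entity that has the highest Dice co-occurrence score. The function name `build_ne_dictionary`, the input format, and the use of the Dice coefficient are all assumptions made for illustration.

```python
from collections import Counter


def build_ne_dictionary(sentence_pairs):
    """Build a Chinese-to-English entity dictionary from co-occurrence counts.

    sentence_pairs: iterable of (chinese_entities, english_entities), one
    tuple per aligned sentence pair (hypothetical input format).
    """
    zh_count = Counter()
    en_count = Counter()
    pair_count = Counter()

    for zh_entities, en_entities in sentence_pairs:
        # Count each entity at most once per sentence pair.
        zh_set, en_set = set(zh_entities), set(en_entities)
        zh_count.update(zh_set)
        en_count.update(en_set)
        for zh in zh_set:
            for en in en_set:
                pair_count[(zh, en)] += 1

    best = {}
    for (zh, en), joint in pair_count.items():
        # Dice coefficient: 2 * joint count / (sum of individual counts).
        dice = 2.0 * joint / (zh_count[zh] + en_count[en])
        if zh not in best or dice > best[zh][1]:
            best[zh] = (en, dice)

    return {zh: en for zh, (en, _) in best.items()}
```

A co-occurrence score like this could be combined with a Pinyin-based string similarity, so that rare entities with few co-occurrences can still be matched.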
|
[1] Learning Translations of Named-Entity Phrases from Parallel Corpora. Robert C. Moore.
[2] Improved Named Entity Translation and Bilingual Named Entity Extraction. Fei Huang and Stephan Vogel.
[3] Chinese Named Entity Identification Using Class-based Language Model. Jian Sun, Jianfeng Gao, Lei Zhang, Ming Zhou, Changning Huang. |