6th Natural Language Processing Pacific Rim Symposium
Post-Conference Workshop
Language Resources in Asia
November 30, 2001
National Center of Sciences
Tokyo, Japan
Preface
This volume contains the papers presented at the workshop on Language
Resources in Asia, held on 30 November 2001 in conjunction with the
6th Natural Language Processing Pacific Rim Symposium (NLPRS 2001).
Language resources play an essential role in empirical approaches to
natural language processing (NLP). Previous concerted efforts to
construct language resources, particularly in the US and EU,
have laid a solid foundation for pioneering NLP research in
these two communities over the last decade. In comparison, the
availability and accessibility of Asian language resources are still
very limited, even though Asia can boast much richer linguistic
content in terms of cultural, historical, and structural variation.
The purpose of this workshop is to provide an opportunity to investigate
and discuss the many problems related to the construction and
dissemination of Asian language resources and to NLP research based on
them. As both the demand for multilingual NLP and the size of language
resources increase, it is very important for us to look for ways for
Asian countries to collaborate in developing, sharing, and exchanging
Asian language resources. We hope this first workshop can contribute to
solving these issues.
Program Committee
- Tanaka, Hozumi (Chair) - Tokyo Institute of Technology (Japan)
- Chu-Ren Huang (Co-chair) - Academia Sinica (Taiwan)
- Tokunaga, Takenobu (Co-chair) - Tokyo Institute of Technology (Japan)
- Choi, Key-Sun - Korea Advanced Institute of Science and
Technology (Korea)
- Hammam Riza - BPPT (Indonesia)
- Iida, Hitoshi - Sony Computer Science Laboratory (Japan)
- Kawtrakul, Asanee - Kasetsart University (Thailand)
- Kurohashi, Sadao - University of Tokyo (Japan)
- Matsumoto, Yuji - Nara Institute of Science and
Technology (Japan)
- Rim, Hae-Chang - Korea University (Korea)
- Sangal, Rajeev - Indian Institute of Information Technology (India)
- Shirai, Kiyoaki - Japan Advanced Institute of Science and
Technology (Japan)
- Sornlertlamvanich, Virach - NECTEC (Thailand)
- Tsou, Benjamin - City University of Hong Kong (China)
- Kim, Jin-Dong - University of Tokyo (Japan)
Table of Contents
- A multilingual news database and its application to a
translation memory system
- Isao Goto, Naoto Kato and Terumasa Ehara
..........1
- The language resources development and language processing
service for Thai
- Asanee Kawtrakul, Yuen Poovorawan, Frederic Andres, Mukda
Suktarajarn, Patcharee Varasrai, Nithiwat Kampanya, Supavat
Vongwatthaporn, Nattakan Pengphon and Chaiwat Ketsuvarn
..........7
- Development of very large corpora in Thailand
- Rachod Thongprasirt, Thatsanee Charoenporn, Wasin Sinthupinyo
and Virach Sornlertlamvanich
..........15
- Japanese-English paraphrase corpus
- Satoshi Shirai, Kazuhide Yamamoto and Francis Bond
..........23
- The open language archives community and Asian language
resources
- Steven Bird, Gary Simons and Chu-Ren Huang
..........31
- A bilingual corpus in the legal domain and its applications
- Oi Yee Kwong, Benjamin K. Tsou, Tom B.Y. Lai, Robert W.P. Luk,
Lawrence Y.L. Cheung and Francis C.Y. Chik
..........39
- Defining principled but practically manageable lexical units in
Japanese textual corpora
- Maho Okada, Koichi Takeuchi, Masaharu Yoshioka, Kyo Kageura and
Teruo Koyama
..........47
- Towards a reference tagset for Japanese
- Yasuhiro Kawata
..........55
- Using multiple pivots to align Korean and Japanese lexical
resources
- Kyonghee Paik, Francis Bond and Satoshi Shirai
..........63
- LERIL: Collaborative effort for creating lexical
resources
- Akshar Bharati, Dipti M Sharma, Vineet Chaitanya, Amba P
Kulkarni and Rajeev Sangal
..........71
- Combining the lexicon knowledge base with Chinese corpus
processing
- Duan Huiming, Hu Junfeng, Zhu Xuefeng and Yu Shiwen
..........81