COLING 2002
Post-Conference Workshop
The 3rd Workshop on Asian Language Resources and International Standardization
Center of Academic Activities, Academia Sinica
Taipei, Taiwan
August 31, 2002
last modified: Wed, Jul 3, 2002
Description
Language resources play an important role in recent corpus-based natural language processing research. Considerable effort has gone into compiling various kinds of language resources, particularly in the US and European countries. In addition, standards represent a necessary step to consolidate technological achievements in this sector, to enhance and foster the exchange of know-how between research and industry, and to define infrastructures for the re-use and sharing of existing language resources through the specification of common formats and frameworks. Since 1993 the Commission of the European Union has been actively supporting the standardization process in human language technology, in particular by sponsoring the EAGLES initiative. This activity has extended to the framework of the EU-US international research co-operation, supported by NSF and the European Union (http://lingue.ilc.pi.cnr.it/eagles96/isle/isle_home_page.htm).
Compared to English and many European languages, the availability and accessibility of Asian language resources is still limited. Moreover, Asian languages are more diverse in their character sets and grammatical properties. Because of these characteristics, Asian languages do not always fit the existing standardization frameworks for linguistic resources.
We have held two workshops on the same topic: the first in January 2001 in Tokyo, on an invitation basis, and the second in November 2001 in Tokyo, in conjunction with the 6th Natural Language Processing Pacific Rim Symposium (NLPRS 2001) (http://www.cl.cs.titech.ac.jp/ALR/WS/2nd/index.html). In this third workshop, we would like to emphasize the standardization of Asian language resources and to provide a chance to discuss research results and the possibilities for future international collaboration on the development of Asian language resources. The workshop also aims to introduce the status of Asian language resources to researchers in other regions.
We invite papers on all topics related to language resources, in particular Asian language resources and their development, including but not limited to:
- Text corpora
- Machine-readable dictionaries
- Lexicons
- Grammars
- Exchange and annotation schemata
- Infrastructure for constructing and sharing language resources
- Exchange formats
- Best practices for creating and disseminating language resources
- Metadata for resource classification and discovery
- Strategies and priorities for EU-US and Asian cooperation
- Standards for language resources (lexicons, corpora, ontologies, etc.)
- Lexical standards and multilinguality
- Standards for content management
- Standards and applications
- Standards and evaluation
Program Committee
- Nicoletta Calzolari (co-chair) - Istituto di Linguistica Computazionale CNR (Italy)
- Key-Sun Choi (co-chair) - Korea Advanced Institute of Science and Technology (Korea)
- Asanee Kawtrakul (co-chair) - Kasetsart University (Thailand)
- Alessandro Lenci (co-chair) - Dipartimento di Linguistica - Universita di Pisa (Italy)
- Tokunaga Takenobu (co-chair) - Tokyo Institute of Technology (Japan)
- Steven Bird - University of Melbourne (Australia)
- Nuria Bel - GILCUB (Spain)
- Ehara Terumasa - NHK (Japan)
- Christiane Fellbaum - Princeton University (USA)
- Ralph Grishman - New York University (USA)
- Chu-Ren Huang - Academia Sinica (Taiwan)
- Hammam Riza - BPPT (Indonesia)
- Kurohashi Sadao - University of Tokyo (Japan)
- Martha Palmer - University of Pennsylvania (USA)
- Hae-Chang Rim - Korea University (Korea)
- Rajeev Sangal - International Institute of Information Technology Hyderabad (India)
- Shirai Kiyoaki - Japan Advanced Institute of Science and Technology (Japan)
- Virach Sornlertlamvanich - NECTEC (Thailand)
- Gregor Thurmair - SAIL Labs (Germany)
- Benjamin Tsou - City University of Hong Kong (China)
- Antonio Zampolli - Istituto di Linguistica Computazionale CNR (Italy)
Schedule
| Paper submission due | May 10, 2002 (Closed) |
| Notification of acceptance | June 17, 2002 (Done) |
| Deadline for camera-ready papers | June 30, 2002 (Closed) |
| Workshop date | August 31, 2002 |
Venue
Center of Academic Activities, Academia Sinica, Taipei, Taiwan.
Program
- 8:30-9:00
- Registration
- 9:00-9:30
- A State of the Art of Thai Language Resources and Thai Behavior Analysis and Modeling
Asanee Kawtrakul, Mukda Suktarachan, Patcharee Varasai, Hutchatai Chanlekha
- 9:30-10:00
- Broadening the Scope of the EAGLES/ISLE Lexical Standardization Initiative
Nicoletta Calzolari, Alessandro Lenci, Francesca Bertagna, Antonio Zampolli
- 10:00-10:30
- Lexicon-based Orthographic Disambiguation in CJK Intelligent Information Retrieval
Jack Halpern
- 10:30-11:00
- (Break)
- 11:00-11:30
- Decomposition for ISO/IEC 10646 Ideographic Characters
Lu Qin, Chan Shiu Tong, Li Yin, Li Ngai Ling
- 11:30-12:00
- Efficient Deep Processing of Japanese
Melanie Siegel, Emily M. Bender
- 12:00-12:30
- Urdu and the Parallel Grammar Project
Miriam Butt, Tracy Holloway King
- 12:30-13:30
- (Lunch)
- 13:30-14:00
- A Study in Urdu Corpus Construction
Dara Becker, Kashif Riaz
- 14:00-14:30
- Automatic Word Spacing Using Hidden Markov Model for Refining Korean Text Corpora
Do-Gil Lee, Sang-Zoo Lee, Hae-Chang Rim, and Heui-Seok Lim
- 14:30-15:00
- Constructing of a Large-Scale Chinese-English Parallel Corpus
Le Sun, Weimin Qu, Song Xue, Xiaofeng Wang, Yufang Sun
- 15:00-15:30
- AnnCorra: Building Tree-banks across Indian Languages
Rajeev Sangal, Vineet Chaitanya, Amba Kulkarni, Dipti Misra Sharma
- 15:30-16:00
- (Break)
- 16:00-16:30
- OLACMS: Comparisons and Applications in Chinese and Formosan Languages
Ru-Yng Chang, Chu-Ren Huang
- 16:30-17:30
- Discussion
- 18:00
- Shuttle bus leaves
Proceedings
(PDF file)
Tokunaga, Takenobu