Corpora, translation, terminology… and beyond: objectives and perspectives

Authors

  • Belinda Maia, Universidade do Porto. Centro de Linguística

DOI:

https://doi.org/10.11606/issn.2317-9511.v37p10-29

Keywords:

Corpus linguistics, Translation technology, Natural Language Processing (NLP)

Abstract

This paper will not describe any specific research in corpus linguistics. Instead, it will first reflect on the way many of us teaching languages and translation in university departments develop and use corpora in our research and teaching methodology. One objective is to highlight the work of Professor Stella Tagnin and of those of us with whom she has worked for over twenty years, even if this adds nothing new to the immediate area. It will go on to analyze how, apart from the didactic uses of these resources and related research, their potential for Natural Language Processing (NLP) became increasingly important, and to demonstrate how the methodology of corpus linguistics is now used in various disciplines, especially in interdisciplinary research. This analysis was prompted by involvement in a project to advise universities in two Central Asian countries on the creation of a master's degree in computational linguistics. The languages of these countries are very different from Western European languages, which obliged me to reassess my experience in linguistics and NLP, gained in the context of English and Portuguese, when considering how the world's less-resourced languages could join the mainstream of computational linguistics.

Author Biography

  • Belinda Maia, Universidade do Porto. Centro de Linguística

    Centro de Linguística da Universidade do Porto, Portugal

References

BERNARDINI, S.; ZANETTIN, F. I Corpora nella didattica della traduzione – Corpus Use and Learning to Translate. University of Bologna: 2000.

CAO, D. Consideration in Translating English/Chinese Contracts. Meta, v. 42, n. 4, 1997, p. 661–669. https://doi.org/10.7202/002199ar

CHOMSKY, N. Syntactic Structures. Paris: Mouton & Co. 1957.

COULTHARD, M.; JOHNSON, A. An Introduction to Forensic Linguistics – Language as Evidence. London: Routledge. 2007.

COULTHARD, M.; JOHNSON, A. (Ed.) The Routledge Handbook of Forensic Linguistics. London: Routledge. 2010.

HALLIDAY, M. A. K. Explorations in the Functions of Language. London: Edward Arnold. 1973.

HALLIDAY, M. A. K. An Introduction to Functional Grammar. 2nd Edition. London: Edward Arnold. 1985.

LEECH, G.; HUNDT, M.; MAIR, C.; SMITH, N. Change in Contemporary English – a Grammatical Study. Cambridge University Press. 2009.

LEWANDOWSKA-TOMASZCZYK, B. (Ed.) PALC 2001: Practical Applications in Language Corpora. Frankfurt: Peter Lang. 2003.

LEWANDOWSKA-TOMASZCZYK, B.; THELEN, M. (Ed.). Translation and Meaning, Part 2. Proceedings of the Lódz Session of the 1990 Duo Colloquium on ‘Translation and Meaning’, held in Lódz, Poland, 20-22 September, 1990. Maastricht: Euroterm. 1990.

LEWANDOWSKA-TOMASZCZYK, B.; MELIA, P. J. (Ed.) Proceedings of Practical Applications of Language Corpora. University of Lodz Press. 1997.

MAIA, B. Do-it-yourself corpora… with a little bit of help from your friends! In: LEWANDOWSKA-TOMASZCZYK, B.; MELIA, P. J. pp. 403-410. 1997.

MAIA, B. Training Translators in Terminology and Information Retrieval using Comparable and Parallel Corpora. In: ZANETTIN ET AL. pp. 43-54. 2003.

MAIA, B.; SARMENTO, L.; SANTOS, D.; CABRAL, L.; PINTO, A. S. The Corpógrafo - a Web-based environment for corpus research. Proceedings from the Corpus Linguistics 2005 Conference Series; Corpus Linguistics Conference (Birmingham, UK, 14-17 July 2005), unpaginated. ISSN: 1747-9398

SANTOS, D. Linguateca's infrastructure for Portuguese and how it allows the detailed study of language varieties. OSLa: Oslo Studies in Language, v. 3, n. 2, 2011, pp. 113-128. At https://www.linguateca.pt/Diana/download/SantosOSLa2010.pdf

SIMÕES, A.; BARREIRO, A.; SANTOS, D.; SOUSA-SILVA, R.; TAGNIN, S. E. O. Linguística, Informática e Tradução – mundos que se cruzam. Oslo Studies in Language, v. 7, n. 1, 2015. https://journals.uio.no/osla/issue/view/100

TAGNIN, S. COMET – a multilingual corpus for teaching and translation. In: LEWANDOWSKA-TOMASZCZYK, B. (Ed.) pp. 535-540. 2003.

TAGNIN, S. (Ed.) Cadernos de Tradução – Tradução e Corpora. v. 1, n. 9. Universidade de Santa Catarina, 2002.

TAGNIN, S. (Guest editor). CROP – vol. 10. São Paulo: FFLCH-USP, 2010.

TAGNIN, S. (Ed.) Tradterm, n. 10, 2004. http://www.revistas.usp.br/tradterm/issue/view/3912

TAGNIN, S.; TEIXEIRA, E. Lingüística de Corpus e Tradução Técnica - Relato da montagem de um corpus multivarietal de culinária. In: Tradterm, n. 10, Dec. 2004, p. 313-358. https://doi.org/10.11606/issn.2317-9511.tradterm.2004.47184

TAGNIN, S. Corpus driven glossaries in translator training courses. In: SIMÕES ET AL. pp. 359-377. 2015.

THELEN, M.; LEWANDOWSKA-TOMASZCZYK, B. (Ed.). Translation and Meaning, Part 1. Proceedings of the Maastricht Session of the 1990 Duo Colloquium on ‘Translation and Meaning’, held in Maastricht, The Netherlands, 4-6 January 1990. Maastricht: Euroterm. 1990.

THELEN, M.; LEWANDOWSKA-TOMASZCZYK, B. (Ed.). Translation and Meaning, Part 3. Proceedings of the Maastricht Session of the 2nd International Maastricht-Lódz Duo Colloquium on ‘Translation and Meaning’, held in Maastricht, The Netherlands, 19-22 April, 1995. Maastricht: University of Maastricht. 1995.

VARANTOLA, K. Translators and Disposable Corpora. In: ZANETTIN ET AL. pp. 55-70. 2003.

ZANETTIN, F.; BERNARDINI, S.; STEWART, D. (Ed.) Corpora in Translator Education. Manchester: St. Jerome Pub. Co. 2003.

Internet references – all sites last accessed May 2020.

British National Corpus (BNC) Official site - http://www.natcorp.ox.ac.uk Also consultable at: https://www.english-corpora.org/bnc/ & http://corpora.lancs.ac.uk/bnc2014/

COBUILD project - https://www.collinsdictionary.com/cobuild/

CLASS: Interdisciplinary Master Program on Computational Linguistics at Central Asian Universities – http://erasmus-class.eu

CoMET – Corpus Multilingue para Ensino e Tradução – http://comet.fflch.usp.br/corporamultilingue

COMPARA/DISPARA – online parallel corpus of Portuguese/English literary texts. Part of the Linguateca project. https://www.linguateca.pt/COMPARA/dispara.php?language=en

CORPÓGRAFO – a set of online tools for creating corpora and terminology databases. Part of the Linguateca project. https://www.linguateca.pt/corpografo/

DIRECTORATE GENERAL OF TRANSLATION OF THE EUROPEAN COMMISSION http://cdt.europa.eu/en/partners/european-commission-directorate-general-translation

ECKHARD BICK – VISL project – a research and development project at the Institute of Language and Communication at the University of Southern Denmark. https://visl.sdu.dk

ELSEVIER JOURNALS – Applied Corpus Linguistics - https://www.journals.elsevier.com/applied-corpus-linguistics

EUROPEAN LANGUAGE INDUSTRY PLATFORM – LIND https://ec.europa.eu/info/departments/translation/language-industry-platform-lind_pt

GOOGLE TRANSLATE - https://translate.google.com

IATE - Interactive Terminology for Europe https://iate.europa.eu/home

OPUS – open-source parallel corpus – compiled and organized by Jörg Tiedemann http://opus.nlpl.eu

LINGUATECA – a distributed language resource centre for Portuguese - https://www.linguateca.pt

LREC - International Conference on Language Resources and Evaluation - http://www.lrec-conf.org

MARK DAVIES’ CORPORA PROJECT, Brigham Young University - https://corpus.byu.edu/overview.asp

MARK DAVIES’ ENGLISH CORPORA at https://www.english-corpora.org

MARK DAVIES’ PORTUGUESE CORPORA at https://www.corpusdoportugues.org/

Quora – a Question and Answer platform that invites one to participate in debates https://pt.quora.com

SDL-Trados – well-known translation technology software https://www.sdltrados.com

SKYPE - https://www.skype.com/en/

TURKLANG CONFERENCES – conferences dedicated to the computational study of the Turkic languages – http://www.turklang.net/en

WHATSAPP - https://www.whatsapp.com

Published

2021-01-29

How to Cite

Maia, B. (2021). Corpora, translation, terminology… and beyond: objectives and perspectives. TradTerm, 37(1), 10-29. https://doi.org/10.11606/issn.2317-9511.v37p10-29