This page presents the work I carried out during my PhD at the University of Paris (defended in 2016), UFR d'Études anglophones.
Since then I have moved on:
2019-? Associate professor of Linguistics and ESP teacher, Centre de langues, Université Rennes 2: https://perso.univ-rennes2.fr/thomas.gaillat
2018 Postdoctoral Researcher at Insight: https://nuig.insight-centre.org/kdu/
2004-2018 Teacher: English for Specific Purposes - Medical and pharmaceutical schools of Université de Rennes 1 (SCELVA)
ResearchGate: https://www.researchgate.net/profile/Thomas_Gaillat/
GitHub: https://github.com/tgaillat
PhD defended on June 16, 2016. Awarded summa cum laude.
Reference in Interlanguage: the case of THIS and THAT. From annotation to interoperability
Director: Nicolas Ballier, Université Paris Diderot
Co-director: Pascale Sébillot, INRIA/INSA de Rennes
Second Language Acquisition is a domain in which learner corpora have played an increasing role over the last two decades. They provide insight into many aspects of learner language and help address research questions in fields such as phonology, syntax and semantics. Because they capture a wide variety of language features used by learners, they are a powerful tool for exploring the intricacies of the learners’ linguistic system, called ‘Interlanguage’ (IL) (Selinker 1972). In IL, it is possible to observe cognitive language processes, one of them being reference. Reference is a crucial component of human speech and comprises two concepts: deixis and anaphora. It has received a lot of attention in native languages, including English (Cornish 1999; Kleiber 1992; Ariel 1994; Halliday & Hasan 1976). However, learner-language research on the subject has been less intense, and many questions about the role of reference in IL remain unanswered. Developmental patterns of reference still need to be identified, and little is known about specific reference-related learner features. One approach to reference in IL is to study deixis and anaphora. This thesis follows this line of research by analysing how both concepts interact within IL to enable learners to point to referents in their discourse.
The resolution of reference is usually carried out without hindrance, as each deictic or anaphoric form refers to a unique referent in a given situation or context. However, the case of THIS and THAT seems specific because both concepts partake of these two forms, which are not simply lexical in nature. The two forms operate in a multidimensional system that includes not only the syntagmatic and paradigmatic dimensions of speech but also the pragmatic dimension of discourse.
This thesis investigates THIS and THAT, with a special focus on comparing their use across several L1s and on taking other competing forms into account. Through annotation and automated analysis, we design an experimental framework that supports corpus interoperability.
Corpora: LONGDALE, Diderot-LONGDALE, ICE-GB, Penn Treebank WSJ, ANGLISH, NOCE
Tools: LCA, L2SCA, TAALES, TAACO, TAASC; Classifier: TiMBL, Treetagger, Stanford Tregex/Tsurgeon
Structures: NITE NXT Search
Languages: R, XML, Perl
Ballier, N., Gaillat, T., Zarrouk, M., Simpkin, A., Bouyé, M., & Stearns, B. (2019). A Franco-Irish project for the automatic identification of criterial features in learners of English. Proceedings of the EuroCALL Conference. Presented at the EUROCALL 2019, Louvain-la-Neuve, Belgium.
Gaillat, T., & Ballier, N. (Accepted). Expérimentation de feedback visuel des productions écrites d’apprenants francophones de l’anglais sous MOODLE. Actes de La Conférence EIAH2019. Presented at the Environnements Informatiques pour l’Apprentissage Humain (EIAH’19), Paris, France.
Gaillat, T., & Ballier, N. (Accepted). Investigating the Scope of Textual Metrics for Learner Level Discrimination and Learner Analytics. Proceedings of the 5th Learner Corpus Research Conference. Presented at the LCR 2019, Warsaw, Poland.
Gaillat, T., Ballier, N., Zarrouk, M., Simpkin, A., Bouyé, M., & Stearns, B. (Accepted). Investigating Criterial Features of Learner English and Predicting CEFR Levels in French Learners of English. Proceedings of the EuroCALL Conference. Presented at the EUROCALL 2019, Louvain-la-Neuve, Belgium.
Gaillat, T., Ballier, N., Zarrouk, M., Simpkin, A., Bouyé, M., & Stearns, B. (Accepted). Investigating Criterial Features of Learner English and Predicting CEFR Levels in French Learners of English. Invited talk, NUI Galway, Data Science Institute, Ireland
Gaillat, Thomas. 2017. NLP with the SSIX project. Taxonomy Bootcamp London, Oct 16-18 2017. London, UK.
Gaillat, Thomas. 2017. Defining NLP tasks for data mining in security-related URLs. Workshop, Jan 13 2017. Thales, Rennes, France
Ballier, Nicolas, Thomas Gaillat. 2016. Classification d’apprenants francophones de l’anglais sur la base des métriques de complexité lexicale et syntaxique. JEP-TALN-RECITAL 2016, Jul 2016, Paris, France. Actes de la conférence conjointe JEP-TALN-RECITAL 2016, 9, pp.1-14, 2016, ELTAL. <https://jep-taln2016.limsi.fr/>
Gaillat, Thomas, Pascale Sébillot, Nicolas Ballier. 2015. Comparing corpora to identify learner-specific features of English: The case of this, that and it. Learner Corpus Research Conference (LCR 2015), Sep 2015, Radboud, Netherlands. Book of Abstracts LCR 2015, pp.68-69, 2015
Gaillat, Thomas, S. Ali. 2013d. L’actionnel et le phonétique : quelle mise en œuvre dans un didacticiel ? Colloque DILEM EA LIDILE “Franchir le mur des sons”, Rennes, France. http://www.sites.univ-rennes2.fr/lidile/?p=1290
Gaillat, Thomas. 2013c. « Les emplois adverbiaux de this et that dans les corpus d’apprenants ». Journée d’étude en hommage à Martine Schuwer. Université de Rennes 2.
Gaillat, Thomas. 2013b. Comparing French/Spanish L1 transfers in two English learner corpora: the case of indexicals ‘it’,‘this’ and ‘that’. Learner Corpus Research conference. Bergen, Norway. http://lcr2013.b.uib.no/files/2013/09/abstracts-book.pdf
Gaillat, Thomas. 2013a. Annotation automatique d’un corpus d’apprenants d’anglais avec un jeu d’étiquettes modifié du Penn Treebank. TALN/RECITAL 2013 conference in Sables-d’Olonne - France. http://www.taln2013.org/actes/www/index.html
Gaillat, Thomas, P. Sébillot & N. Ballier. 2012d. Automated classification of unexpected uses of this and that in a learner corpus of English. ICAME33, Université Catholique de Louvain. Louvain, Belgium. http://wwwling.arts.kuleuven.be/icame33/_pdf/icame33abstracts.pdf
Gaillat, Thomas. 2012c. « This et that dans les domaines de spécialité du corpus ICE-GB : quelles caractéristiques distributionnelles ? ». GERAS 2012. Grenoble. http://www.geras.fr/bibliotheque/File/GERAS2012_prog_detaille.pdf
Gaillat, Thomas, N. Ballier & D. Meurers. 2012b. « Des contraintes distributionnelles aux propriétés co-référentielles des pro-formes. » LASELDI : La cooccurrence : du fait statistique au fait textuel. Besançon.
Gaillat, Thomas. 2012a. « La place de la grammaire dans un parcours en ligne d’apprentissage de l’anglais ». RANACLES 2011. Rennes.
Gaillat, Thomas. 2011. « Towards multi-corpora analysis of this and that, from manual to automatic annotation. » Learner Corpus Research 2011. Louvain-la-Neuve, Belgium. http://www.uclouvain.be/en-328146.html
Gaillat, Thomas. 2010. « The issues of deixis and anaphora in speech construction among non-native English speakers. » Atelier ALOES. Paris 13.
Gaillat, Thomas & Belan, Sophie. 2009. « Les tâches collaboratives dans un parcours d’autoformation guidée en langue étrangère ». Journée d’étude InterGAP. Université de Paris 4.
Gaillat, Thomas, S. Belan. 2008. « Analyse d’un parcours d’auto-formation guidée sur la plate-forme Moodle », GERAS 2008. Orléans.
Arnold, T., Ballier, N., Gaillat, T., & Lissón, P. (2018). Predicting CEFRL levels in learner English on the basis of metrics and full texts. ArXiv:1806.11099 [Cs]. Retrieved from http://arxiv.org/abs/1806.11099
Ballier, N., Arnold, T., Balvet, A., & Gaillat, T. (submitted). The implicit metalinguistic discourse of tagsets for English: retagging the Brown corpus. Presented at Les discours métalinguistiques 3, Paris 13, France.
Gaillat, T. (Accepted). A Multifactorial analysis of this, that and it proforms in anaphoric constructions in learner English. Cahiers de Praxématique, 71.
Gaillat, T., Sousa, A., Zarrouk, M., & Davis, B. (2018). FinSentiA: Sentiment Analysis in English Financial Microblogs. Proceedings of the TALN-CORIA 2018. Presented at the CORIA-TALN-RJC, Rennes, France.
Gaillat, T., Stearns, B., Sridhar, G., McDermott, R., Zarrouk, M., & Davis, B. (2018). Implicit and Explicit Aspect Extraction in Financial Microblogs. Proceedings of ECONLP 2018, 1st Workshop on Economics and Natural Language Processing at ACL 2018. Presented at the ECONLP Workshop ACL 2018, Melbourne, Australia.
Gaillat, T., Zarrouk, M., Freitas, A., & Davis, B. (2018). The SSIX Corpora: Three Gold Standard Corpora for Sentiment Analysis in English, Spanish and German Financial Microblogs. In N. Calzolari (Conference chair), K. Choukri, C. Cieri, T. Declerck, S. Goggi, K. Hasida, … T. Tokunaga (Eds.), Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Paris, France: European Language Resources Association (ELRA).
Krishnamoorthy, K., & Gaillat, T. (2018). An Ensemble Classifier for Error Detection and Recommendation in the Use of Articles by Learners of English. Proceedings of the 11th International Conference on Innovation in Language Learning. Presented at Innovation in Language Learning, Florence, Italy.
Ballier, Nicolas, Thomas Gaillat. 2016. “Classification d’apprenants francophones de l’anglais sur la base des métriques de complexité lexicale et syntaxique”. JEP-TALN-RECITAL 2016, Jul 2016, Paris, France. Actes de la conférence conjointe JEP-TALN-RECITAL 2016, 9, pp.1-14, 2016, ELTAL. <https://jep-taln2016.limsi.fr/>
Gaillat, Thomas, Martine Schuwer & Sophie Belan. 2014b. Combiner les scénarios linguistique et actionnel au sein d’un parcours en ligne d’apprentissage de l’anglais. Recherche et pratiques pédagogiques en langues de spécialité. Cahiers de l’Apliut (Vol. XXXIII N° 3). 107–120. http://apliut.revues.org/4929
Gaillat, Thomas, Pascale Sébillot, and Nicolas Ballier. 2014a. Automated Classification of Unexpected Uses of This and That in a Learner Corpus of English, in Recent Advances in Corpus Linguistics: Developing and Exploiting Corpora, ed. by Lieven Vandelanotte, Kristin Davidse, Caroline Gentens, and Ditte Kimps, Amsterdam/New York: Rodopi. 309–24
Gaillat, Thomas. 2013c. « This et that dans les domaines spécialisés du corpus ICE-GB : quelles caractéristiques distributionnelles ? » ASp. la revue du GERAS 64 (2013): 161-183. http://asp.revues.org/3890
Gaillat, Thomas. 2013b. Annotation automatique d’un corpus d’apprenants d’anglais avec un jeu d’étiquettes modifié du Penn Treebank. Proceedings of TALN 2013 conference in Sables-d’Olonne - France. http://www.taln2013.org/actes/www/index.html
Gaillat, Thomas. 2013a. “This and that in native and learner English: From typology of use to tagset characterisation”. Twenty Years of Learner Corpus Research: Looking Back, Moving Ahead. Ed. by Sylviane Granger, Gaëtanelle Gilquin, and Fanny Meunier. 1 vol. Louvain-la-Neuve, Belgium: Presses universitaires de Louvain, 2013. 167-177. Corpora and Language in Use.
Director: Prof. Natalie Kübler
Centre de Linguistique Inter-langues,
de Lexicologie, de Linguistique Anglaise
et de Corpus-Atelier de Recherche sur la Parole
EA 3967
8 place Paul Ricœur
75013 Paris
Case courrier 7002
5 rue Thomas Mann
75205 Paris cedex 13