A Corpus Processor - A Linguistic Development Environment - A Linguistic Engine for developing Natural Language Processing software applications.


A Linguistic Text Analyzer

⟶ Develop linguistic resources to formalize various linguistic phenomena at the orthographical, lexical, morphological, syntactic and semantic levels, for any natural language (Linguistics).
⟶ Create your own corpora of texts, apply these linguistic resources to them and then perform various statistical analyses in Corpus Linguistics and Digital Humanities.
⟶ Use NooJ's engine and linguistic resources to construct Natural Language Processing applications.

NooJ's linguistic engine offers the four types of grammars of the Chomsky hierarchy: Regular Grammars, Context-Free Grammars, Context-Sensitive Grammars and Unrestricted Grammars. NooJ's parsers are optimized and can be applied to large texts in real time.
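To illustrate why the levels of the hierarchy matter, here is a minimal Python sketch with three toy recognizers, one per level of increasing power. This is not NooJ grammar syntax, and the function names are hypothetical; it only shows textbook languages that need a regular, a context-free, and a context-sensitive grammar respectively.

```python
import re

def is_regular_ab(s: str) -> bool:
    """Regular language a*b*: any number of a's followed by any number of b's."""
    return re.fullmatch(r"a*b*", s) is not None

def is_anbn(s: str) -> bool:
    """Context-free language a^n b^n: the counts of a's and b's must match,
    which no regular grammar can enforce."""
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n

def is_anbncn(s: str) -> bool:
    """Context-sensitive language a^n b^n c^n: three matching counts,
    which no context-free grammar can enforce."""
    n = len(s) // 3
    return len(s) % 3 == 0 and s == "a" * n + "b" * n + "c" * n
```

The string "aab" is accepted by the first recognizer but rejected by the second, and "aabbc" is rejected by the third: each level adds a constraint that the level below cannot express.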

All linguistic information is represented by annotations stored in the Text Annotation Structure (TAS); annotations are added to, or removed from, the TAS in cascade. Annotations can represent agglutinated and intra-word linguistic units (e.g. "cannot" = <can,V> <not,ADV>), simple words (e.g. <table,N>), multiword units (e.g. <as a matter of fact,ADV>), discontinuous expressions and phrasal verbs (e.g. "turn ... off" = <turn off,PV>), as well as any syntactic (constituent or dependency) or semantic structure. A text can be exported with its TAS to an XML file.
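The idea of a stand-off annotation layer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and method names (`Annotation`, `TextAnnotationStructure`, `add`, `remove`, `covering`, `surface`) are hypothetical and do not reflect NooJ's internal format or API. Each annotation holds one or more character spans, so a discontinuous unit such as "turn ... off" is a single annotation with two spans.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Annotation:
    lemma: str
    category: str
    spans: tuple  # one (start, end) pair per contiguous piece of the unit

@dataclass
class TextAnnotationStructure:
    text: str
    annotations: list = field(default_factory=list)

    def add(self, lemma, category, *spans):
        """Attach an annotation covering one or more character spans."""
        ann = Annotation(lemma, category, tuple(spans))
        self.annotations.append(ann)
        return ann

    def remove(self, ann):
        """Drop an annotation, e.g. when a later grammar rules it out."""
        self.annotations.remove(ann)

    def covering(self, offset):
        """All annotations with a span that contains the given offset."""
        return [a for a in self.annotations
                if any(start <= offset < end for start, end in a.spans)]

    def surface(self, ann):
        """Text covered by the annotation; gaps are shown as ' ... '."""
        return " ... ".join(self.text[s:e] for s, e in ann.spans)

def build_example():
    text = "I cannot turn the light off"
    tas = TextAnnotationStructure(text)
    c = text.index("cannot")
    tas.add("can", "V", (c, c + 3))        # intra-word unit "can"
    tas.add("not", "ADV", (c + 3, c + 6))  # intra-word unit "not"
    t, o = text.index("turn"), text.index("off")
    tas.add("turn", "V", (t, t + 4))       # competing simple-word reading
    tas.add("turn off", "PV", (t, t + 4), (o, o + 3))  # discontinuous unit
    return tas
```

In this sketch, a later processing step could call `remove` on the simple-verb reading of "turn" once the phrasal-verb reading is confirmed, which mirrors the cascaded adding and removing described above.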

NooJ has been used in a dozen NLP applications, including Text Mining, Text Information Extraction, Business Intelligence, Named Entity Recognition, Paraphrase Generation, Machine Translation, RDF-to-Text Generation, and Automatic Semantic Annotation.


"NooJ and NLG" Training Session, COST Multi3Generation Workshop, September 19-24 2022, Venice, Italy

The ATISHS software (and several other NooJ applications) presented at JADT2022, July 6th 2022, Naples, Italy

ATISHS presented at the Facultad de Filosofía y Letras, Universidad de Buenos Aires, June 13 2022, Buenos Aires, Argentina.

The new RA linguistic engine v0.9 has been released!

L'Analyseur de Textes Innovant pour les Sciences Humaines et Sociales (ATISHS), the Innovative Text Analyzer for the Humanities and Social Sciences, is available on the site

International NooJ Conference 2022, June 14-16 2022, Rosario, Argentina

NooJ Week at INALCO, January 3-7, 2022, Paris.

NooJ Intensive Tutorial Workshop, Univ. of Rosario, Argentina (Teams Virtual platform, December 13-17, 2021). Contact to register.

Intensive NooJ training workshop, Univ. of Damanhour, Egypt (Damanhour, November 7-11, 2021).

Presentation of NooJ’s automatic Text Generation engine and tutorial workshop at the COST Multi3Generation meeting (Lisbon, October 5-8, 2021).

Presentation of the Russian NooJ module at Corpora 2021 (St Petersburg) by Vincent Bénet (from 21:23 to 44:53).

A Google group for NooJ users, moderated by Prihantoro

International NooJ Conference 2021, 9-11 June 2021, Besançon

NEW: Linguistic resources for Romanian, thanks to Maria-Diana Manescu (Politehnica University of Bucharest, in partnership with Grenoble Alpes University).

A NooJ publication! Journal Aprendo con NooJ/Learning with NooJ, First issue:

A marketing study application of ATISHS presented at the 15th Journée de recherche en marketing horloger (watchmaking marketing research day), February 18, 2021, Neuchâtel.

A new dictionary of Arabic verbs, by Héla Fehri (Université de Gabès).

NooJ Week at INALCO, January 11-15, 2021, Paris

NooJ and ATISHS mentioned in a podcast.

Chinese Resources for NooJ, thanks to Zhen Cai (Université de Franche-Comté).

Detecting false information with ATISHS. Presented at the study day "Désinformation, journalisme et publics en Suisse" (Disinformation, journalism and audiences in Switzerland), Université de Neuchâtel, November 12, 2020.

Indonesian Resources for NooJ, thanks to Prihantoro (Lancaster University).

Presentation of the ATISHS tool. Linglunch virtual seminar at Université Diderot: September 24, 2020, Paris.

