Wiktionnaire:Actualités/031-October-2017

This page is a translated version of the page Wiktionnaire:Actualités/031-octobre-2017 and the translation is 100% complete.

Wiktionnaire:Actualités is a monthly periodical about French Wiktionary, dictionaries and words, published online since April 2015. Everyone is welcome to contribute to it. You can sign up to be notified of future issues, read old issues and contribute to the draft of the next edition. You can also have a look at Regards sur l’actualité de la Wikimedia. If you have any comments, criticisms or suggestions, our talk page is open!

Actualités - Issue 31 - October 2017
Painting.

Announcement

On November 17 and 18, two study days on dictionary creation will take place in Lyon. The first day will be devoted to discussing methods and practices, with a session dedicated to the Wiktionary project. The second day will be hands-on, with group training in dictionary writing for the Wiktionary project, focused on the ten words of the Francophonie. It is co-organized by Lyokoï and Noé, two editors of Actualités!

Highlights

  • In August, we mentioned a Quebec-based association for the promotion of the Shawiya language. An interview in the newspaper L’initiative tells you more about their activities, which include a workshop on contributing to Wiktionary!
  • In an article in the newspaper Le Soir, Michel Francard takes an interest in the term tote-bag and notes that “Wiktionary, well known for its lexical scouting, mentions one [attestation] as early as June 2013”.
  • The Swiss newspaper 24 heures reports on a court case in which Wiktionary was called upon because it was the only dictionary to offer a definition for a problematic term. At the time of the first judgment the court had to determine whether the term used by the accused was a racist insult and relied solely on the definition given in Wiktionary. On appeal, the Federal Court pointed out that “Wiktionary has no official status and its definitions are open to modifications”. An entirely correct analysis and especially true in this case, since the page in question doesn't yet have any usage examples that could help identify the context in which the term is used. The case was transferred to the Cantonal Court of Vaud.
  • A columnist for the website Jeuxvideo.com does not hesitate to rely on the “knowledge of Wiktionary” to define video game in the introduction to a controversial article.
  • In a long paper published on the website Les Numériques, Jérôme Cartegini presents and compares the encyclopedia Universalis 2018, Le Grand Robert online and Wikipedia. Too bad Wiktionary was not included in the comparison, as it contains more entries than the Grand Robert and also has more usage examples.
Painting.

Statistics

From mid-September to mid-October (from 09/20/2017 to 10/20/2017)
  • French entries increased by 1,508 and quotations increased by 1,080. There are now 357,235 lemmas, 527,416 definitions and 331,833 quotations or examples.
  • The three other languages which progressed the most are Northern Sami (+ 6,554 entries), Italian (+ 1,209 entries) and Esperanto (+ 280 entries).
  • Four languages were added to the project (here with French names): lhomi (+1), merei (+1), limilngan (+1) and lorrain (+1).
  • In October, 11,272 entries were created in 76 languages!

Statistics provided by Wikiscan:

Thesaurus

A vote to split existing thesauri with ambiguous titles led to the creation of: cirque (naturel) and cirque (spectacle); langue (anatomie) and langue (linguistique); paresseux (animal) and paresseux (personne); assimilation culturelle and assimilation (biologie); racine (végétale), racine (odontologie), racine (linguistique), racine (informatique), racine (géologie) and racine (figuré et sociologique).

By the way, Assassas77 is still working on thesauri in Tagalog, adding six more!

As of October 31, 2017, the French Wiktionary contains 317 thesauri in French and a total of 452 thesauri in 54 languages!
There are 23 new thesauri this month, five of them in French: punition [punishment], peine de mort [death penalty], prison [jail] (first thesaurus creation by Classiccardinal!), armure [armour] and tissage [weaving].

Miscellaneous
  • There are 32,855 illustrations (images and videos) in the French Wiktionary entries and 258 have been added since last month.

The Questions on words page (WT:QM) records 189 questions in October, compared to 197 in September, 141 in August and 124 in July.

Identifying a root

In natural language processing, several operations can be used to build tools for a language. Richard Khoury and Francesca Sapsford built a Latin stemmer from the English Wiktionary, which they describe in their article “Latin word stemming using Wiktionary” (in Digital Scholarship in the Humanities, volume 31, number 2, June 2016, pages 368–373). Their approach uses the database and the links between pages, which are specified in very precise declension templates, to link roots to endings for verbs and to suffixes for nouns. Starting from a database dump of May 2015, they applied three cleaning steps and obtained 655,434 word forms for 32,860 roots.

The best tool before their experiments, the Schinke Stemmer, works on a different principle. It relies on a set of rules that stem words automatically by creating hypothetical roots. These roots are not always valid words, but they reduce the overall number of distinct words in a text, which makes the text easier to search with a search engine.

Comparing the two tools, they observe that the one based on Wiktionary misses words that it does not know, but nevertheless reduces the vocabulary of a text much more effectively. It also gives direct access to a dictionary of definitions afterwards, which is not possible with the previous tool. They even plan to extend their use of the Wiktionary database to include the part-of-speech categories of entries, in order to produce an additional tool for morphosyntactic tagging of a corpus.
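The contrast between the two principles can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the paradigms in TOY_PARADIGMS stand in for data that would be extracted from Wiktionary, TOY_SUFFIXES is an invented suffix list rather than the actual Schinke rule set, and all function names are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical (root, endings) pairs standing in for inflection tables.
TOY_PARADIGMS: List[Tuple[str, List[str]]] = [
    ("ros", ["a", "ae", "am", "arum", "is", "as"]),
    ("am", ["o", "as", "at", "amus", "atis", "ant"]),
]

# Illustrative suffixes only; this is NOT the Schinke rule set.
TOY_SUFFIXES: Tuple[str, ...] = (
    "ibus", "arum", "orum", "ae", "am", "as", "is", "us", "um", "a", "o",
)


def build_lookup(paradigms: Iterable[Tuple[str, List[str]]]) -> Dict[str, str]:
    """Map every generated word form to its root."""
    lookup: Dict[str, str] = {}
    for root, endings in paradigms:
        for ending in endings:
            # Keep the first root seen if two paradigms share a form.
            lookup.setdefault(root + ending, root)
    return lookup


def lookup_stem(word: str, lookup: Dict[str, str]) -> Optional[str]:
    """Dictionary-based stemming: returns None for unknown forms."""
    return lookup.get(word.lower())


def rule_stem(word: str, suffixes: Iterable[str] = TOY_SUFFIXES,
              min_stem: int = 2) -> str:
    """Rule-based stemming: strip the longest matching suffix."""
    lowered = word.lower()
    for suffix in sorted(suffixes, key=len, reverse=True):
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= min_stem:
            return lowered[: -len(suffix)]
    return lowered


def vocabulary_size(words: Iterable[str], stem: Callable[[str], str]) -> int:
    """Count the distinct stems that a stemmer produces for a text."""
    return len({stem(word) for word in words})
```

With this toy data, the lookup stemmer returns a root only for forms it has generated (unknown words come back as None), while the rule stemmer always returns something, even when the result is not a real root. The function vocabulary_size gives one simple way to compare how much each approach shrinks the vocabulary of a text.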

These uses show that the Wiktionary projects contain data that are not only usable as a dictionary, but also allow, through their regular structures, reuse by machines to create new tools — a review by Noé.

About patrol and patrollers

Some remarks about the role of patrollers:

Patrollers are editors who spend some of their time reading contributions made to Wiktionary.

They have a tool which tells them about changes still in need of patrolling. Only anonymous contributions or edits made by users lacking the "auto-patrol" flag have to be checked.

After proofreading they can mark a contribution as patrolled.

A patrolled contribution is one that is free of vandalism in a broad sense, which implies:

  • deletion of clearly defamatory material
  • deletion of material containing personal information
  • deletion of information irrelevant to the page title
  • deletion of copyrighted information
  • restoration of correct information after deletion or corruption

These are the basic actions of the patroller. Patrollers who are not administrators may need to ask an administrator to hide contributions containing defamation, personal information or copyright violations.

The patroller can then, if they wish, go further and work on presentation through additional actions such as:

  • to correct a page to conform to the expected structure of a Wiktionary page
  • to correct typography
  • to correct spelling
  • to correct or add templates
  • to correct or add categories
  • to check the sources and references that are used.

Last but not least, and by far the most interesting, the patroller can look into the substance, checking the accuracy of a contribution or even providing additional information or corrections.

Admittedly, this part is by far the most time-consuming and also the hardest.

Thus, it is possible:

  • to add missing inflections
  • to add quotes
  • to add pronunciations, anagrams, etc.
  • to verify the accuracy of translations.

This last point requires a certain level of linguistic skill, extensive reference material on a large number of languages and knowledge of the grammar of several languages, which not everyone has.

Translation errors are indeed numerous, even though they are made in good faith, often because metonymy does not work the same way in all languages. This means that copying a translation found elsewhere (dictionary, Wikipedia, etc.) can sometimes be a serious mistake.

For example, many languages distinguish by different names the action from its result, the content from its container, the building from the institution, etc., where the French language does not necessarily do so. Thus, in Finnish: loading (action): kuormaus / loading (what is loaded): kuormitus; the town hall (the building): kaupungintalo / the town hall (administration): pormestarin

And of course, we find the same problem in the opposite direction, from Finnish to French.

It is, however, quite rare to encounter real mistranslations. I remember one, several years ago, on the English Wiktionary that amused me. I was intrigued to find several pages on the net giving the Inuktitut word anaullaut, which I knew meant stick. After some research, I found the origin: a contributor had found anaullaut : bat in an Inuktitut/English dictionary and had created this entry on the English Wiktionary with Category:Animals. It was then reused and translated into French by other websites.

Alas for that contributor, it was indeed the English word bat, but in its sense of batte (as in baseball) and not bat (animal)...
If you have also noticed some crazy or funny contributions, do not hesitate to report them here for a future issue. — a chronicle by Unsui

Painted branches.

Detail of a picture by Albarubescens during the photo challenge of October.

Dictionary of the month

Yann Lukas, Les Mots celtes clandestins, Coop Breizh, 2017, ISBN 978-2-84346-834-6

What happens when Wiktionary becomes a reference against its own will? When discussing the sources of our project it becomes clear that they aren't at all structured like on Wikipedia. We don't share the same attitude towards original research and could even count as a source ourselves. Well, actually we're doing this already. And I can prove it with the little dictionary of the month. A pocket reference which gives an overview of “French vocabulary borrowed from Gaulish, Breton and the Celtic languages”. Yann Lukas shows us some familiar words and some with an unexpected Celtic origin. He suggests Celtic roots for some slang words where standard dictionaries are lost: à dache, loufer, morfal and many more.

But on page 62 we find a funny turn of phrase: “Tamis: although disputed, the Gaulish etymology of tamis is tempting. In his Dictionnaire des étymologies obscures (Payot, 1982), Pierre Guiraud opts for a Latin origin stamen, also the root of étamine. Wiktionary prefers the Low-Franconian tamisa (source of Old Dutch teems). [...]” So we are cited in a recent etymological analysis. And our hypothesis for tamis isn't very solid. It was actually added by an IP without any sources, and other users have built on top of it. Still, it shouldn't be discarded entirely, since an etymologist has granted it some value.

Apart from this small appearance, which might bring us fame (or not), or at least acknowledgment, this short dictionary of Celtic words is filled with anecdotes about Celtic languages that help us better understand their place in our world today. We are also led to wonder about poor Breton, which has been saddled with words that don't suit it: menhir (the Bretons say peulvan), dolmen (they say lichaven), kermesse (from the Flemish kerkmisse) or even triskèle (from Greek and written as triskell to make it look more Celtic). — a chronicle by Lyokoï

Painting.

On video

This section gives you a monthly selection of videos related to linguistics or the French language. Don't hesitate to add more videos you find!

LexiSession on punishment

Initiated by the Tremendous Wiktionary User Group, the LexiSessions suggest monthly themes to simultaneously engage all Wiktionaries.

The themes are suggested in advance on Meta and announced every month on Wikidémie, the main community portal.

The October LexiSession was about punishment and led to the creation of three thesauri!

For the month of November we invite you to take an interest in toilets!
A workshop.

Detail of a photo of the Kizhi Pogost by MatthiasKabel.

Wikiconvention francophone 2017

Three days of encounters provided plenty of time to meet the roughly 100 people who came to talk about their projects, ranging from personal contributions to dynamic collectives springing up all over the world.

The team of "Actualités du Wiktionnaire" was present in Strasbourg to cover the event and to collect enough material to fill the next few issues, and of course to promote Wiktionnaire in every conversation!

With two talks and one project meeting, Wiktionnaire was well represented among a number of high-quality presentations.

Just to mention a few topics brought up by participating editors of Wiktionnaire: inclusion of African languages, support for new participants, audio recordings with Lingua Libre, organisation of edit-a-thons and the vitality of collective initiatives.

On this occasion we also met two researchers from the Logoscope project who are interested in cooperating with Wiktionnaire. Expect some updates on this very soon!

Curiosity: The phatic function

Among the six major functions of language defined by Roman Jakobson, the phatic function ensures that the communication channel works well.

It covers words and expressions like "you see" or "do you follow?" but also words used at the beginning of a telephone conversation, such as allô?.

Marina Yaguello extends the analysis to all discourse whose only goal is to maintain the conversation without conveying any information at all.

Because a dictionary concentrates on the level of sentences and words, it has difficulty describing these usages. On the one hand, the terms used vary a great deal, and finding written attestations isn't always straightforward.

On the other hand, the function of these terms is difficult to explain well.

Sometimes they are entire sentences, including a verb emptied of its meaning, that fulfil a purely communicative purpose.

— a chronicle by Noé