A tempting paper in math linguistics
Oct. 10th, 2010 05:58 am
Perhaps someone would want to comment on this, or would find the reference interesting.
http://arxiv.org/abs/1003.4394
"Mathematical Foundations for a Compositional Distributional Model of Meaning
Authors: Bob Coecke, Mehrnoosh Sadrzadeh, Stephen Clark
(Submitted on 23 Mar 2010)
Abstract: We propose a mathematical framework for a unification of the distributional theory of meaning in terms of vector space models, and a compositional theory for grammatical types, for which we rely on the algebra of Pregroups, introduced by Lambek. [...]"
Update: I think I understand it completely now.
It's a paper worth reading. It's not too difficult, and is a very inexpensive way to get a review of quite a few cool things it relies upon.
The objective of this paper is to provide a civilized, compositional semantics, but with meanings naturally represented by ordinary vectors, and this is what I'd like to see very much.
What remains to be seen is whether this can be further developed to be applicable in practice.
One remark about their particular examples is that the meanings for all their example sentences are represented by fuzzy Booleans, that is, a sentence "John likes Mary" is represented by a fuzzy Boolean expressing to which extent the statement is true (and this fuzzy Boolean is represented as a vector in a two-dimensional space generated by atoms "true" and "false").
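To make the fuzzy-Boolean picture concrete, here is a minimal sketch of how such a sentence meaning could be computed. It is my own toy illustration under stated assumptions, not code or numbers from the paper: the three-element noun space, the `LIKES_DEGREE` weights, and the helper names `basis`, `verb_tensor` and `sentence_meaning` are all hypothetical. The verb is stored as a tensor over N x S x N, where S is the two-dimensional space spanned by "true" and "false", and the sentence meaning is obtained by contracting the subject and object vectors against it.

```python
# Toy noun space N; the sentence space S has basis ("true", "false").
# All names and weights below are invented for illustration.
NOUNS = ["john", "mary", "bob"]

# Hypothetical degrees to which "x likes y" holds (missing pairs count as 0).
LIKES_DEGREE = {
    ("john", "mary"): 0.9,
    ("mary", "john"): 0.2,
    ("bob", "mary"): 0.5,
}


def basis(name):
    """Return the basis vector of N corresponding to a noun."""
    return [1.0 if n == name else 0.0 for n in NOUNS]


def verb_tensor(degrees):
    """Build c[i][j][k] over N x S x N; j=0 is 'true', j=1 is 'false'."""
    size = len(NOUNS)
    c = [[[0.0] * size for _ in range(2)] for _ in range(size)]
    for i, subj in enumerate(NOUNS):
        for k, obj in enumerate(NOUNS):
            d = degrees.get((subj, obj), 0.0)
            c[i][0][k] = d        # weight on "true"
            c[i][1][k] = 1.0 - d  # weight on "false"
    return c


def sentence_meaning(subject, verb, obj):
    """Contract subject and object against the verb:
    s_j = sum over i, k of subject_i * verb[i][j][k] * obj_k."""
    return [
        sum(
            subject[i] * verb[i][j][k] * obj[k]
            for i in range(len(subject))
            for k in range(len(obj))
        )
        for j in range(2)
    ]


likes = verb_tensor(LIKES_DEGREE)
meaning = sentence_meaning(basis("john"), likes, basis("mary"))
print(meaning)  # a vector in the (true, false) plane
```

With basis vectors for the subject and object, the contraction simply reads off one (subject, object) cell of the verb tensor, so the result says nothing about who John and Mary are, only how true the statement is.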
In practice, one would probably want a more intensional approach, preserving information about the identities of John and Mary in the combined meaning of the sentence. But then one would lose the property that all sentences take their meaning in the same vector space, and restoring that would result in a very large space. Perhaps, one would need to consider a projection of that very big space onto the space expressing meanings of the set of sentences under consideration at a given point, or something like that.
Update 2: Of course, this does not even start to model the connotations of the word "likes". It might be possible to introduce those connotations simply as the lexical vector of nearby words, but the paper does not even start to explore how this should or might be combined with the function of the word "likes" as an operator containing the sentence type and yielding the sentence value.
no subject
Date: 2010-10-12 04:10 am (UTC)
Now, it seems, we have finally gotten to the topic itself. Hooray.
I wouldn't say I have understood everything yet, but I was bothered by the same thing you were.
"likes" is an insane matrix, and the statement under discussion determines a single cell in that matrix; there is no hint of any constructive consideration as to what the matrix as a whole looks like. But yes, a lot of things are tied together, and, in my amateur opinion, tied together quite soundly.