Neural Network Methods for Quantifiers

January 2018 Coordinated Project @ ILLC

News and Updates

Scroll down for a detailed description of this project's motivation and structure.

If you are interested in this project, please contact the instructor -- Shane Steinert-Threlkeld -- by e-mail: S.N.M.Steinert-Threlkeld (at)

  • 13 Dec: updated reading list
  • 06 Dec: website created!

Project Description

In this project, students will develop tools and run (computational) experiments to test the hypothesis that semantic universals arise because expressions satisfying them are easier to learn than those that do not.

A semantic universal is a property of meaning shared by (almost) all natural languages (possibly conditional on the languages having additional properties). Because languages vary quite a bit, when one finds a universal, one naturally wonders whether there's an explanation for it. Why do all languages have this semantic property? We are interested in exploring the following hypothesis.

Hypothesis: Semantic universals arise because they make meaning systems easier to learn.

Of course, the hypothesis can only be supported or refuted when a model of learning a semantic system has been specified. Thus, the hypothesis naturally gives rise to a challenge.

Challenge: Provide a model of learning which makes good on the Hypothesis (at least, for some semantic universals).

In recent work, Steinert-Threlkeld and Szymanik attempt to meet the Challenge by training recurrent neural networks to learn the meanings of quantifiers, a domain where many semantic universals have been posited. They use this framework to explain universals like monotonicity and quantity. The monotonicity universal works as follows. Consider the following two sentences.

  1. Many French people smoke cigarettes.
  2. Many French people smoke.

Sentence (1) entails sentence (2): the former cannot be true without the latter being true. Notice that all we have done is replace the term "smoke cigarettes" with the strictly more general term "smoke". Also notice that the inference pattern holds for any choice of restrictor (not just "French people") and for any pair of nuclear scopes that stand in the same specific-general relation. Because of this, we say that the quantifier "many" is upward monotone. If "many" is replaced by "few", the inference pattern reverses: sentence (2) then entails sentence (1), so "few" is downward monotone. The proposed universal then states:

Monotonicity: All simple determiners are monotone.
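
To make the definition concrete, here is a minimal sketch that checks monotonicity by brute force over a small finite universe. Everything in it is an illustrative assumption rather than code from the paper: quantifiers are modelled as Python functions from a restrictor set A and a scope set B to a truth value, and because "many" and "few" are context-dependent, the hypothetical stand-ins at_least_two, at_most_one, and exactly_two are used instead.

```python
from itertools import combinations


def subsets(universe):
    """Yield every subset of `universe` as a frozenset."""
    items = list(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


# Hypothetical toy quantifiers: each maps (restrictor A, scope B) to a bool.
def at_least_two(a, b):
    return len(a & b) >= 2


def at_most_one(a, b):
    return len(a & b) <= 1


def exactly_two(a, b):
    return len(a & b) == 2


def is_upward_monotone(quantifier, universe):
    """True if Q(A, B) and B <= B' always imply Q(A, B')."""
    all_sets = list(subsets(universe))
    return all(
        quantifier(a, b_big)
        for a in all_sets
        for b in all_sets
        if quantifier(a, b)
        for b_big in all_sets
        if b <= b_big  # b is a subset of b_big
    )


def is_downward_monotone(quantifier, universe):
    """True if Q(A, B) and B' <= B always imply Q(A, B')."""
    all_sets = list(subsets(universe))
    return all(
        quantifier(a, b_small)
        for a in all_sets
        for b in all_sets
        if quantifier(a, b)
        for b_small in all_sets
        if b_small <= b  # b_small is a subset of b
    )


if __name__ == "__main__":
    universe = range(4)
    for q in (at_least_two, at_most_one, exactly_two):
        print(q.__name__,
              "upward:", is_upward_monotone(q, universe),
              "downward:", is_downward_monotone(q, universe))
```

On a four-element universe, at_least_two comes out upward monotone only, at_most_one downward monotone only, and exactly_two neither, which makes it the kind of non-monotone quantifier the universal rules out for simple determiners.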

In the paper, we train neural networks to learn monotone and non-monotone quantifiers and show that the former are learned significantly faster than the latter. We also show that this pattern holds for another universal called Quantity, which states that quantifiers only care about the sizes of sets, and not the identity of objects or their position in a structure. (This is related to a conception of logicality.)
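
Quantity can be illustrated in the same brute-force style. The sketch below assumes, purely for illustration, that a model is a sequence of objects, each labelled with the zone it occupies relative to the restrictor A and the scope B. The zone names, both example quantifiers, and the helper satisfies_quantity are hypothetical. A quantifier satisfies Quantity in this setting if its truth value depends only on how many objects fall in each zone, not on where they appear in the sequence.

```python
from itertools import product

# Assumed encoding: each object in a model is labelled with one of four zones.
ZONES = ("A&B", "A-B", "B-A", "neither")


def at_least_three(model):
    """True if at least three objects are in both A and B."""
    return sum(zone == "A&B" for zone in model) >= 3


def first_three(model):
    """True if there are at least three As and the first three As are all Bs."""
    a_objects = [zone for zone in model if zone in ("A&B", "A-B")]
    return len(a_objects) >= 3 and all(zone == "A&B" for zone in a_objects[:3])


def satisfies_quantity(quantifier, max_length):
    """Check that models with the same zone counts always get the same verdict.

    Only models of length 0..max_length are inspected, so a True result is
    evidence for Quantity on small models, not a proof.
    """
    for length in range(max_length + 1):
        verdict_by_counts = {}
        for model in product(ZONES, repeat=length):
            counts = tuple(model.count(zone) for zone in ZONES)
            verdict = quantifier(model)
            # setdefault stores the first verdict seen for these counts and
            # returns the stored one on later visits.
            if verdict_by_counts.setdefault(counts, verdict) != verdict:
                return False
    return True


if __name__ == "__main__":
    print("at_least_three:", satisfies_quantity(at_least_three, 5))
    print("first_three:", satisfies_quantity(first_three, 5))
```

Here at_least_three passes the check, while first_three fails it: moving a single "A-B" object to the front of a sequence changes the verdict without changing any of the counts.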

In this project, students will develop new tools and run more experiments in order to further develop the explanation of semantic universals in terms of learnability. They will be doing original research that could become (part of) publications. Existing code as well as access to computing infrastructure will be provided. Possible topics that can be addressed are:

  • Running larger experiments, with many more quantifiers being learned at once, using their semantic properties as statistical predictors of how quickly each quantifier is learned.
  • Developing tools to look inside the resulting neural networks and see how they actually process quantifiers. Do they behave similarly to other proposals from the semantics literature, like semantic automata?
  • Extending the framework to explain more universals for quantifiers (such as Extensionality).
  • Extending the framework to explain semantic universals in other domains.
  • Developing theoretical connections between learnability in neural networks and semantic properties.
  • A project of your own choice!
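
For the topic on looking inside the networks, it may help to see what a semantic automaton looks like in code. The sketch below is one simple way to write it down, not the formulation from the literature or from the provided code base: a model is encoded as one symbol per element of the restrictor (1 if the element is also in the scope, 0 otherwise), and a quantifier is a finite automaton that accepts exactly the encodings of models in which it is true. The dictionary layout and the names run_automaton, encode, EVERY, and AT_LEAST_TWO are all hypothetical.

```python
def run_automaton(automaton, word):
    """Return True if the automaton ends in an accepting state after reading `word`."""
    state = automaton["start"]
    for symbol in word:
        state = automaton["transitions"][(state, symbol)]
    return state in automaton["accepting"]


def encode(restrictor, scope):
    """One symbol per restrictor element: 1 if it is in the scope, else 0."""
    return [1 if element in scope else 0 for element in sorted(restrictor)]


# "every": reject as soon as one restrictor element falls outside the scope.
EVERY = {
    "start": "ok",
    "accepting": {"ok"},
    "transitions": {
        ("ok", 1): "ok", ("ok", 0): "fail",
        ("fail", 1): "fail", ("fail", 0): "fail",
    },
}

# "at least two": count 1s, saturating at two.
AT_LEAST_TWO = {
    "start": 0,
    "accepting": {2},
    "transitions": {
        (0, 0): 0, (0, 1): 1,
        (1, 0): 1, (1, 1): 2,
        (2, 0): 2, (2, 1): 2,
    },
}


if __name__ == "__main__":
    students = {"ann", "bo", "cy"}
    smokers = {"ann", "bo"}
    word = encode(students, smokers)
    print("every:", run_automaton(EVERY, word))
    print("at least two:", run_automaton(AT_LEAST_TWO, word))
```

One way to compare a trained network with such an automaton would be to ask whether the network's hidden state, read symbol by symbol, tracks the automaton's state on the same input.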


The class will meet three times a week during the four weeks from January 8 to February 2. We will provide the necessary background in the first two weeks, then transition into coding and experimenting sessions in the last two weeks. The course should be mostly self-contained, though some prerequisites are listed.


Week 1: theoretical background on semantic universals and quantifiers (possibly color as well)

Week 2: background on training neural networks, tutorials on how to run your own experiments by modifying provided code

Week 3: run experiments! We will have in-class coding sessions for support and question answering.

Week 4: finish experiments; write up the results and deliver a short presentation


Working knowledge of Python will be very valuable. Knowledge of specific libraries (NumPy, TensorFlow/Keras) will be helpful, but can be learned on the fly. While we will cover neural networks and their training and evaluation in the second week, familiarity with those topics will also help.


Students will be conducting their own (computational) experiments. They will be expected to produce a short write-up (~5 pages) of at least one experimental result, explaining the motivation for their experiment and what they found. On the final day of class, there will be short presentations of results and discussion.

Reading & Resources

Here we include both reading and coding resources.

Required Reading

  • Barwise and Cooper 1981, "Generalized Quantifiers and Natural Language"
    This very influential paper introduced generalized quantifiers into natural language semantics and developed many semantic universals.
  • Steinert-Threlkeld and Szymanik 2017/2018, "Learnability and Semantic Universals"
    We attempt to meet the Challenge above by training neural networks to learn the meanings of quantifiers. This paper forms the motivational basis for this project.

Semantic Background

  • Westerståhl 2011, "Generalized Quantifiers"
    A modern overview of generalized quantifiers.
  • Szymanik 2016, Quantifiers and Cognition (especially chapter 4)
    An overview of quantifiers integrating logic and cognitive science. Chapter 4 introduces the semantic automata approach, which could be connected to the approach in our paper above.

Neural Network Background

  • 3Blue1Brown 2017, "Deep learning"
    Very well-produced videos introducing neural networks, gradient descent, and back-propagation.
  • Nielsen 2015, "Neural Networks and Deep Learning"
    Good e-book explaining the basic concepts behind neural networks and their training. (I find the notation can be a bit cumbersome here.)
  • LeCun, Bengio, & Hinton 2015, "Deep learning"
    A great scientific review in Nature of deep learning.

Coding Resources

  • shanest/quantifier-rnn-learning: source code for the Steinert-Threlkeld and Szymanik paper.
    Note: I will be refactoring this code before the project starts to make it easier for other people to run their own experiments and/or extend it to other cases. So you can read the code now to get some understanding, but wait a bit before forking or modifying it.
  • Keras: library for building and training neural network models
  • TensorFlow: Google's open source machine learning library; this is what we used in our paper and is the back-end behind Keras