Paper Reading - A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks

Jul 3, 2019

Problem

Perform multiple NLP tasks in a hierarchical manner: train a single multi-layer model in which different layers handle different tasks, ordered from morphology through syntax to semantics.

Key Ideas

Different layers handle different tasks.
Lower layers handle easier tasks; higher layers handle harder ones.
Tasks are stacked from bottom to top: POS tagging, chunking (CHUNK), dependency parsing (DEP), semantic relatedness, textual entailment.
Train the tasks sequentially, from easy to hard, and add a regularization term to limit catastrophic forgetting.

Model

Joint Many-Task Model

Structure
- Each task uses one BiLSTM layer.
- Layer n + 1 depends on the output of layer n.
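The layer wiring above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: `make_toy_layer` is a hypothetical stand-in that simply scales its input where a real model would run a BiLSTM, and `run_stack`, `TASK_ORDER`, and all numbers are invented for the example. The sketch also treats the input as a single vector and ignores any extra inputs a layer may receive.

```python
from typing import Callable, Dict, List

# Task order taken from the notes: lowest layer first.
TASK_ORDER = ["POS", "CHUNK", "DEP", "Relatedness", "Entailment"]

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_toy_layer(scale: float) -> Layer:
    """Hypothetical stand-in for one BiLSTM layer: it only scales its input."""
    def layer(inputs: Vector) -> Vector:
        return [scale * x for x in inputs]
    return layer


def run_stack(word_vector: Vector, layers: Dict[str, Layer]) -> Dict[str, Vector]:
    """Feed each task layer the output of the layer directly below it."""
    outputs: Dict[str, Vector] = {}
    hidden = word_vector
    for task in TASK_ORDER:
        hidden = layers[task](hidden)  # layer n + 1 consumes layer n's output
        outputs[task] = hidden         # each task reads from its own layer
    return outputs


# Assumed toy setup: every layer doubles its input.
layers = {task: make_toy_layer(2.0) for task in TASK_ORDER}
outputs = run_stack([1.0, 0.5], layers)
print(outputs["POS"])         # [2.0, 1.0]
print(outputs["Entailment"])  # [32.0, 16.0]
```

The point of the sketch is only the data flow: each task reads from its own layer, and that layer is fed by the one below it.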

Data

Each task uses its own existing labeled training data.

Training

Train the tasks in sequence: POS, CHUNK, DEP, Relatedness, Entailment (from low level to high level). Add successive regularization: while training the current layer, penalize changes to the parameters of the lower layers so that what they learned for earlier tasks does not shift too much.
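One way to picture successive regularization is as an extra penalty on how far the lower-layer parameters have moved since they were last trained. The sketch below assumes the penalty is a squared L2 distance from a saved snapshot, weighted by a coefficient `delta`. The function names, the parameter dictionaries, and all values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def successive_penalty(current: Params, snapshot: Params, delta: float) -> float:
    """Return delta * squared L2 distance between current and snapshot values.

    Only parameters present in the snapshot (the lower layers) are penalized.
    """
    total = 0.0
    for name, saved in snapshot.items():
        total += sum((c - s) ** 2 for c, s in zip(current[name], saved))
    return delta * total


def regularized_loss(task_loss: float, current: Params,
                     snapshot: Params, delta: float) -> float:
    """Loss of the task being trained plus the successive penalty."""
    return task_loss + successive_penalty(current, snapshot, delta)


# Assumed toy values: POS parameters saved after POS training ...
snapshot = {"pos": [1.0, 2.0]}
# ... and the full parameter set while the CHUNK layer is being trained.
current = {"pos": [1.5, 2.0], "chunk": [0.3]}

print(successive_penalty(current, snapshot, delta=0.5))     # 0.125
print(regularized_loss(1.0, current, snapshot, delta=0.5))  # 1.125
```

The new layer's parameters (`chunk` here) are absent from the snapshot, so they stay free to change. The POS parameters are pulled back toward their saved values, which is what limits forgetting of the earlier task.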

Performance

The jointly trained model outperforms the single-task models.
The joint model's results are comparable with existing single-task systems.
Training the tasks in the fixed low-to-high order works better than training them in shuffled order.
Regularization is useful for tasks with a small amount of training data.

Comments

Multi-task learning and a task hierarchy are both useful.
Train from the bottom layer to the top.
We could learn the task dependency structure.
We could use better techniques against catastrophic forgetting. Sampling might also help tasks with limited data.

Reference

Paper link: https://aclweb.org/anthology/D17-1206