Problem
Perform multiple NLP tasks in a hierarchical manner: train a single multi-layer model in which different layers handle different tasks, from morphology and syntax up to semantics.
Key Ideas
- Different layers handle different tasks.
- Low-level layers handle easy tasks; high-level layers handle difficult tasks.
- Tasks are stacked: POS → CHUNK → DEP → Relatedness → Entailment.
- Tasks are trained sequentially, from easy to hard; a regularization term prevents catastrophic forgetting.
Model
Structure
- Each task uses one BiLSTM layer.
- The (n + 1)-th layer depends on the n-th layer's output.
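The layered structure above can be sketched as follows. This is a toy illustration, not the paper's implementation: each task's BiLSTM is stood in for by a single linear transform with `tanh`, and all names and sizes are made up. The key point it shows is that layer n + 1 consumes layer n's output (here concatenated with the raw input, so every layer also sees the embeddings).

```python
import numpy as np

rng = np.random.default_rng(0)
TASKS = ["POS", "CHUNK", "DEP", "Relatedness", "Entailment"]
EMB, HID = 8, 16  # toy embedding / hidden sizes (illustrative)

# One weight matrix per task layer; each layer after the first sees the
# raw embeddings concatenated with the previous layer's hidden states.
weights = {}
in_dim = EMB
for task in TASKS:
    weights[task] = rng.standard_normal((in_dim, HID)) * 0.1
    in_dim = EMB + HID  # next layer's input: embeddings + this layer's output

def forward(x):
    """Run a (seq_len, EMB) input through the task stack, bottom to top."""
    outputs = {}
    h = x
    for task in TASKS:
        h_out = np.tanh(h @ weights[task])        # stand-in for a BiLSTM layer
        outputs[task] = h_out                     # this task predicts from h_out
        h = np.concatenate([x, h_out], axis=-1)   # feed embeddings + hidden upward
    return outputs

outs = forward(rng.standard_normal((5, EMB)))
for task in TASKS:
    print(task, outs[task].shape)  # every task layer yields a (5, HID) output
```

Because each higher task layer receives the lower layer's output, supervision at the top (entailment) can exploit representations shaped by the easier tasks below it.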
Data
Uses a different existing labeled dataset for each task.
Training
Train the tasks in sequence: POS, CHUNK, DEP, Relatedness, Entailment (from low level to high level). Add successive regularization: a penalty that keeps the parameters of previously trained layers from drifting too far while training the current task.
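A minimal sketch of the successive-regularization idea, with made-up names and values: the total loss for the current task adds a squared L2 penalty on the distance between the current parameters and their snapshot after the previous task finished training.

```python
import numpy as np

def successive_reg_loss(task_loss, theta, theta_prev, delta=0.01):
    """Total loss = current task loss + delta * ||theta - theta_prev||^2.

    theta_prev is a frozen snapshot of the parameters taken after the
    previous task's training epoch; delta controls how strongly the
    earlier tasks' knowledge is protected.
    """
    penalty = delta * np.sum((theta - theta_prev) ** 2)
    return task_loss + penalty

theta_prev = np.array([1.0, -2.0, 0.5])   # snapshot after the previous task
theta      = np.array([1.1, -1.8, 0.4])   # parameters during the current task
total = successive_reg_loss(0.5, theta, theta_prev, delta=0.1)
print(total)
```

A larger `delta` keeps earlier layers more stable (less forgetting) at the cost of less adaptation to the current task; the paper's observation that this matters most for small-data tasks fits this trade-off.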
Performance
- Joint model performs better than single task models.
- Joint performance is comparable with existing single-task models.
- Sequential training is better than shuffling the task order.
- Successive regularization is useful for tasks with small amounts of data.
Comments
- Multi-task learning and task hierarchies are useful.
- Training from bottom to top (easy to hard) works well.
- The task dependency structure could itself be learned rather than fixed by hand.
- Better techniques against catastrophic forgetting could be applied; in addition, sampling to augment tasks with limited data might help.
Reference
Paper: A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks (Hashimoto et al., EMNLP 2017): https://aclweb.org/anthology/D17-1206