Our data science team is always learning and experimenting, which keeps it closely connected to related research from around the world.
Bidirectional LSTM-CRF models for sequence tagging
- By: Michael Zhang
- Date: April 14, 2020
Why we love it
I enjoy this paper because it introduces a method for entity extraction called BI-LSTM-CRF. The proposed method is more accurate than previous approaches on a range of widely applicable NLP tasks. This paper inspires us to explore applying deep learning methods in our own work.
What I learnt from it
I learnt that (1) BI-LSTM-CRF outperforms previous models such as a plain CRF or a pure LSTM on POS tagging, chunking, and NER tasks, and (2) the stacked structure of BI-LSTM-CRF lets it factor in both word-level features, via the LSTM layer, and sentence-level features, via the CRF layer, so it can efficiently exploit syntactic and contextual information in the language.
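To make the division of labour concrete, here is a minimal sketch of the CRF layer's decoding step (the Viterbi algorithm). It is not the paper's implementation; the per-word `emissions` scores stand in for what a trained BiLSTM would output, and the tiny `transitions` matrix and tag set are made-up illustrative values.

```python
# Sketch of CRF decoding in a BiLSTM-CRF tagger (illustrative, not the
# paper's code). `emissions` plays the role of word-level BiLSTM scores;
# `transitions` is the CRF's sentence-level tag-transition score matrix.

def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence.

    emissions:   one list of per-tag scores per word (the "LSTM" part).
    transitions: transitions[i][j] is the score of moving from tag i
                 to tag j (the "CRF" part).
    """
    num_tags = len(emissions[0])
    # scores[j] = best score of any path ending in tag j at current word
    scores = list(emissions[0])
    backpointers = []

    for word_scores in emissions[1:]:
        new_scores, step_back = [], []
        for j in range(num_tags):
            # Best previous tag i to transition into current tag j.
            best_i = max(range(num_tags),
                         key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_i] + transitions[best_i][j]
                              + word_scores[j])
            step_back.append(best_i)
        scores = new_scores
        backpointers.append(step_back)

    # Trace backpointers from the best final tag.
    path = [max(range(num_tags), key=lambda j: scores[j])]
    for step_back in reversed(backpointers):
        path.append(step_back[path[-1]])
    path.reverse()
    return path

# Toy example: 3 words, 2 tags (0 = O, 1 = ENTITY).
emissions = [[2.0, 0.5], [0.5, 1.0], [1.8, 0.2]]
transitions = [[0.5, -1.0],   # O -> ENTITY is penalised
               [0.0, 1.0]]
print(viterbi_decode(emissions, transitions))  # → [0, 0, 0]
```

Note that a greedy per-word argmax over `emissions` would tag the second word as ENTITY (score 1.0 vs 0.5), but the transition penalty makes the whole-sentence tagging `[0, 0, 0]` score higher, which is exactly the sentence-level effect the CRF layer contributes.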
Why it’s a must
Published in 2015, this paper is one of the first works showing that LSTM-based methods can be applied to NLP tasks. According to Google Scholar, it has more than 1,000 citations.
Who should pay attention?
NLP researchers and machine learning engineers