Abstract:
It is well known that natural language processing (NLP) has three
basic tasks: tokenization, part-of-speech (POS) tagging, and named
entity recognition (NER). These tasks can be divided into two levels,
lexical and syntactic. The lexical level comprises tokenization. The
syntactic level comprises the POS tagging and NER tasks.
Recently, deep learning has been shown to
perform well in various natural language processing tasks
such as POS tagging, NER, sentiment analysis, and language
modelling. In addition, it performs well without the
need for manually designed external resources or time-consuming
feature engineering. In this study, the focus is on
using Long Short-Term Memory (LSTM), Bidirectional
Long Short-Term Memory (BLSTM), Bidirectional Long
Short-Term Memory with Conditional Random Field
(BLSTM-CRF), and Long Short-Term Memory with
Conditional Random Field (LSTM-CRF) deep learning
techniques for the syntactic-level tasks and comparing their
performance. The models are trained and tested on the
KALIMAT corpus. The results show that the BLSTM-CRF model
outperformed the other models in the NER task. In the POS
task, the BLSTM-CRF model also obtained the highest F1-score
of the four models.
Keywords:
Natural Language Processing, Deep learning, Part-of-Speech tagging, Named-Entity Recognition.