Posted on 2022-07-28, 17:53. Authored by Elias Jacob de Menezes-Neto, Marco Bruno Miranda Clementino.
The text of a court ruling is split into chunks of 512 tokens each. Each chunk is passed through a Portuguese BERT model, from which we collect the embedding of the [CLS] token. The sequence of chunk embeddings is then fed to an LSTM, which condenses them into a single document vector. This vector is passed to a classifier head to produce the final classification.
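A minimal PyTorch sketch of the LSTM-over-chunks step described above. Random tensors stand in for the per-chunk BERT [CLS] embeddings; the hidden size, number of classes, and single-layer LSTM are illustrative assumptions, not the authors' actual hyperparameters.

```python
import torch
import torch.nn as nn

class ChunkSequenceClassifier(nn.Module):
    """Condense per-chunk [CLS] embeddings with an LSTM, then classify.

    Hidden size and class count are assumptions for illustration only.
    """
    def __init__(self, emb_dim=768, hidden_dim=256, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, chunk_embeddings):
        # chunk_embeddings: (batch, num_chunks, emb_dim)
        _, (h_n, _) = self.lstm(chunk_embeddings)
        doc_vector = h_n[-1]          # last hidden state: one vector per document
        return self.head(doc_vector)  # (batch, num_classes)

# Stand-in for the [CLS] embeddings of 5 chunks of one ruling
# (in the actual pipeline these would come from a Portuguese BERT model).
embeddings = torch.randn(1, 5, 768)
logits = ChunkSequenceClassifier()(embeddings)
print(logits.shape)  # torch.Size([1, 2])
```

In the real pipeline, `embeddings` would be built by running each 512-token chunk through the BERT model and stacking the resulting [CLS] vectors in document order.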