ml
- Document context vs. CRF decoding on ModernBERT for CoNLL-2003 NER
A 2×2 ablation on CoNLL-2003: document-level context helped, but stacking a CRF on top did not.
- What is the big deal with transformers?
Jotting down thoughts while reviewing the seminal paper by Vaswani et al.