Download the huwiki corpus
The huwiki corpus is the Hungarian part of hunNERwiki. The data is divided into four gzip-compressed files.
The huwiki corpus

File                  Tokens    Size (compressed)
huwiki.1.ner.tsv.gz   7266903   41M
huwiki.2.ner.tsv.gz   4823538   27M
huwiki.3.ner.tsv.gz   3803409   22M
huwiki.4.ner.tsv.gz   3214747   18M
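
Once the files are downloaded, a quick sanity check is to count the tokens in each one and compare the result with the figures listed above. The following is a minimal sketch, not an official tool. It assumes that each file is UTF-8 text with one token per non-empty line and blank lines between sentences; the page itself does not specify the layout, so adjust the counting rule if the files differ. The names count_tokens and EXPECTED_TOKENS are hypothetical helpers introduced for illustration.

```python
import gzip
from pathlib import Path

# Token counts as listed for each file of the huwiki corpus.
EXPECTED_TOKENS = {
    "huwiki.1.ner.tsv.gz": 7266903,
    "huwiki.2.ner.tsv.gz": 4823538,
    "huwiki.3.ner.tsv.gz": 3803409,
    "huwiki.4.ner.tsv.gz": 3214747,
}


def count_tokens(path):
    """Count the non-empty lines of a gzip-compressed text file.

    Assumption: one token per non-empty line, with blank lines acting
    as sentence separators. The file is streamed, never fully loaded.
    """
    count = 0
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                count += 1
    return count


if __name__ == "__main__":
    # Assumes the files sit in the current working directory.
    for name, expected in EXPECTED_TOKENS.items():
        path = Path(name)
        if not path.exists():
            print(f"{name}: not found, skipping")
            continue
        actual = count_tokens(path)
        status = "matches" if actual == expected else "differs from"
        print(f"{name}: {actual} tokens ({status} the listed {expected})")
```

The listed counts add up to 19108597 tokens across the four files. A mismatch for a single file may point to an incomplete download, or simply to the one-token-per-line assumption not holding for this format.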