How to use LazarusNLP/NusaBERT-base with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("fill-mask", model="LazarusNLP/NusaBERT-base")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("LazarusNLP/NusaBERT-base")
    model = AutoModelForMaskedLM.from_pretrained("LazarusNLP/NusaBERT-base")
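The Transformers snippet for this model builds a `fill-mask` pipeline but never calls it. A minimal sketch of one way to query it follows. The Indonesian example sentence ("The capital of Indonesia is ...") and the helper names `top_predictions` and `demo` are hypothetical and not taken from the model card. The mask token is read from the tokenizer rather than hard-coded, and the `transformers` import is deferred so the ranking helper can be used without downloading the model.

```python
def top_predictions(predictions, k=3):
    """Return the k highest-scoring (token, score) pairs from fill-mask output.

    `predictions` is the list of dicts a fill-mask pipeline returns for a
    single-mask input; each dict has at least "token_str" and "score".
    """
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [(p["token_str"], round(p["score"], 4)) for p in ranked[:k]]


def demo():
    """Download the model and fill one mask (needs network access)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("fill-mask", model="LazarusNLP/NusaBERT-base")
    # Hypothetical example sentence; the mask token comes from the tokenizer.
    text = f"Ibu kota Indonesia adalah {pipe.tokenizer.mask_token}."
    return top_predictions(pipe(text))
```

Calling `demo()` returns up to three `(token, score)` pairs. The actual tokens and scores depend on the downloaded checkpoint, so none are shown here.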
Update README.md
README.md CHANGED

    @@ -25,7 +25,7 @@ tags:
     
     # NusaBERT Base
     
    -NusaBERT Base is a multilingual encoder-based language model based on the [BERT](https://arxiv.org/abs/1810.04805) architecture. We conducted continued pre-training on open-source corpora of [sabilmakbar/indo_wiki](https://huggingface.co/datasets/sabilmakbar/indo_wiki), [acul3/KoPI-NLLB](https://huggingface.co/datasets/acul3/KoPI-NLLB), and [uonlp/CulturaX](https://huggingface.co/datasets/uonlp/CulturaX). On a held-out subset of the corpus, our model achieved:
    +[NusaBERT](https://arxiv.org/abs/2403.01817) Base is a multilingual encoder-based language model based on the [BERT](https://arxiv.org/abs/1810.04805) architecture. We conducted continued pre-training on open-source corpora of [sabilmakbar/indo_wiki](https://huggingface.co/datasets/sabilmakbar/indo_wiki), [acul3/KoPI-NLLB](https://huggingface.co/datasets/acul3/KoPI-NLLB), and [uonlp/CulturaX](https://huggingface.co/datasets/uonlp/CulturaX). On a held-out subset of the corpus, our model achieved:
     
     - `eval_accuracy`: 0.6866
     - `eval_loss`: 1.4876
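The reported `eval_loss` can be turned into a perplexity figure, which is often easier to compare across models. The sketch below assumes the loss is a mean cross-entropy over masked tokens measured in nats (natural log). The README does not state this, so treat the result as indicative only. The helper name `perplexity` is illustrative.

```python
import math

# Value reported in the README diff; assumed to be mean cross-entropy in nats.
EVAL_LOSS = 1.4876


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean cross-entropy loss (in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


ppl = perplexity(EVAL_LOSS)
print(f"masked-token perplexity ~ {ppl:.2f}")
```

Under that assumption the perplexity comes out at roughly 4.43. Put loosely, the model's uncertainty on held-out masked tokens is comparable to a uniform choice among four or five candidates.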