
Getting started

Installing via pip

pip install tfkit
  • You can use tfkit for model training and evaluation with tfkit-train and tfkit-eval.

Running TFKit on your task

First step - prepare your dataset

The key to combining different tasks is to give every task the same data format.


  • All data is in CSV format - tfkit uses CSV for every task. A file normally has two columns: the first is the model input and the second is the model output.
  • Plain text with no tokenization - there is no need to tokenize text before training or to precompute tokenization; tfkit handles it for you.
  • No header is needed.

For example, a sentiment classification dataset looks like:

how dare you,negative
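
As a minimal sketch, a file in this format can be produced with Python's standard csv module. The file name and sample rows below are made up for illustration and are not part of tfkit. The point of using a CSV writer rather than joining strings by hand is that an input containing a comma is quoted, so every row still parses as exactly two columns.

```python
import csv
import os
import tempfile

# Hypothetical sample rows: (model input, expected output).
rows = [
    ("how dare you", "negative"),
    ("well, that was lovely", "positive"),  # the input itself contains a comma
]

def write_dataset(path, rows):
    """Write (input, output) pairs as a headerless two-column CSV."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        # The default quoting mode quotes only fields that need it,
        # such as fields containing a comma.
        csv.writer(f).writerows(rows)

def read_dataset(path):
    """Read the file back as a list of [input, output] rows."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))

# Write to a temporary directory so the sketch has no side effects.
path = os.path.join(tempfile.mkdtemp(), "sentiment_sample.csv")
write_dataset(path, rows)
print(read_dataset(path))
```

Reading the file back is a cheap way to confirm that each row has two columns before handing it to training.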


For details and example formats for the different tasks, check here


nlprep is a tool for data splitting, preprocessing, and augmentation. It can help you create ready-to-train data for tfkit; check here

Second step - model training

Use tfkit-train for model training.

Before training a model, decide on the following:

  • --model which model should handle this task? Check here for details of the available models.
  • --config which pretrained model do you want to use? You can search for available pretrained models.
  • --train and --test paths to the training and testing datasets, in CSV format.
  • --savedir directory where the model is saved; defaults to the '/checkpoints' folder.

You can leave the rest at the default configuration, or run tfkit-train -h to see more options.

An example of training a sentiment classifier:

tfkit-train \
--model clas \
--config xlm-roberta-base \
--train training_data.csv \
--test testing_data.csv \
--lr 4e-5 \
--maxlen 384 \
--epoch 10 \
--savedir roberta_sentiment_classificer
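
If you launch training from a script, the same command can be assembled as an argument list. This is only a sketch: build_train_command is a hypothetical helper, not part of tfkit, and the save directory name is made up. It uses only the flags shown above and does not check that tfkit accepts the values.

```python
import shlex

def build_train_command(model, config, train, test, savedir=None, **extra):
    """Assemble a tfkit-train argument list (hypothetical helper, not a tfkit API)."""
    cmd = ["tfkit-train", "--model", model, "--config", config,
           "--train", train, "--test", test]
    if savedir is not None:
        cmd += ["--savedir", savedir]
    # Remaining options (for example lr, maxlen, epoch) pass through as "--name value".
    for name, value in extra.items():
        cmd += [f"--{name}", str(value)]
    return cmd

cmd = build_train_command(
    "clas", "xlm-roberta-base", "training_data.csv", "testing_data.csv",
    savedir="sentiment_model",  # made-up directory name
    lr="4e-5", maxlen=384, epoch=10,
)
# Print the command as a shell-safe string for inspection.
print(shlex.join(cmd))
```

To actually start training, pass the list to subprocess.run(cmd, check=True) in an environment where tfkit is installed.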

Third step - model evaluation

Use tfkit-eval for model evaluation:

  • --model path to the saved model.
  • --metric the evaluation metric, e.g. emf1, nlg (BLEU/ROUGE), clas (confusion matrix).
  • --valid validation data, also in CSV format.
  • --panel an input panel for model-specific parameters.

For more configuration details, run tfkit-eval -h.

After evaluation, tfkit-eval prints the result to your console and also generates three reports for debugging:

  • *_score.csv the overall score, a copy of the console result.
  • *each_data_score.csv the score for each example, in 3 columns (predicted, targets, score), ranked from lowest to highest.
  • *predicted.csv a CSV file with 3 columns (input, predicted, targets).
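
The per-example report is the most useful one for error analysis. The sketch below pulls out the worst-scoring rows. It assumes the three-column layout described above (predicted, targets, score) and that the score column is numeric. Whether the file has a header row is not stated here, so the code skips any row whose score does not parse as a number. lowest_scoring is a hypothetical helper, not part of tfkit.

```python
import csv

def lowest_scoring(path, n=5):
    """Return the n lowest-scoring (score, predicted, target) rows from a per-example report."""
    rows = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if len(row) != 3:
                continue  # skip rows that do not match the assumed 3-column layout
            predicted, target, score = row
            try:
                rows.append((float(score), predicted, target))
            except ValueError:
                continue  # skip a header row, if the report has one
    # Sort explicitly instead of relying on the order of rows in the file.
    rows.sort(key=lambda r: r[0])
    return rows[:n]
```

Call it with the path to your own *each_data_score.csv file, for example lowest_scoring("my_run_each_data_score.csv", n=10), where the file name is a placeholder.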


nlp2go is a demonstration tool with CLI and RESTful interfaces; check here


Use DistilBERT to train an NER model

nlprep --dataset tag_clner  --outdir ./clner_row --util s2t
tfkit-train --batch 10 --epoch 3 --lr 5e-6 --train ./clner_row/train --test ./clner_row/test --maxlen 512 --model tag --config distilbert-base-multilingual-cased 
nlp2go --model ./checkpoints/  --cli     

Use ALBERT to train a DRCD model

nlprep --dataset qa_zh --outdir ./zhqa/   
tfkit-train --maxlen 512 --savedir ./drcd_qa_model/ --train ./zhqa/drcd-train --test ./zhqa/drcd-test --model qa --config voidful/albert_chinese_small  --cache
nlp2go --model ./drcd_qa_model/ --cli 

Use ALBERT to train both a DRCD model and an NER model

nlprep --dataset tag_clner  --outdir ./clner_row --util s2t
nlprep --dataset qa_zh --outdir ./zhqa/ 
tfkit-train --maxlen 300 --savedir ./mt-qaner --train ./clner_row/train ./zhqa/drcd-train --test ./clner_row/test ./zhqa/drcd-test --model tag qa --config voidful/albert_chinese_small
nlp2go --model ./mt-qaner/ --cli 

You can also try tfkit in Google Colab.


Thanks for your interest. There are many ways to contribute to this project. Get started here.



Icons reference

Icons modified from Freepik
Icons modified from Nikita Golubev