
Token Classification Basic Usage

Initialization

Initialize a HappyTokenClassification object to perform token classification.

Initialization Arguments:

  1. model_type (string): the model type in all caps, such as “ROBERTA” or “ALBERT”
  2. model_name (string): potential models can be found here

classify_token()

Inputs:

  1. sentence_a (string): Text you wish to classify. Be sure to provide full sentences rather than individual words so that the model has more context.

Returns: A list of objects with the following fields:

  1. word: The classified word
  2. score: The probability of the predicted entity
  3. entity: The predicted entity. Each model has its own unique set of entities.
  4. index: The index of the token within the tokenized text
  5. start: The index in the input string where the first letter of the predicted word occurs
  6. end: The index in the input string immediately after the last letter of the predicted word (19 for “Geoffrey” in Example 5.1, whose last letter is at index 18)
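
The start and end fields can be used to map a result back to the original input. The values printed in Example 5.1 (start 11, end 19 for “Geoffrey”) suggest that end works like the stop index of a Python slice. The following is a minimal sketch under that assumption. TokenResult and span_text are hypothetical names used only for illustration and are not part of the library; TokenResult stands in for the returned objects so that the snippet runs without downloading a model.

```python
from dataclasses import dataclass


@dataclass
class TokenResult:
    """Hypothetical stand-in exposing a subset of the documented fields."""
    word: str
    entity: str
    start: int
    end: int


text = "My name is Geoffrey and I live in Toronto"

# Offsets for "Geoffrey" are the ones printed in Example 5.1.
geoffrey = TokenResult(word="Geoffrey", entity="B-PER", start=11, end=19)

# Example 5.1 does not print offsets for "Toronto", so they are computed
# here from the text for illustration only; they are not model output.
toronto_start = text.index("Toronto")
toronto = TokenResult(
    word="Toronto",
    entity="B-LOC",
    start=toronto_start,
    end=toronto_start + len("Toronto"),
)


def span_text(source: str, result: TokenResult) -> str:
    """Return the slice of `source` covered by a result's start/end offsets."""
    return source[result.start:result.end]


for item in (geoffrey, toronto):
    print(item.entity, span_text(text, item))
```

With real output from classify_token(), the same slice can be applied to each returned object, assuming the offsets refer to the string that was passed in.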

Example 5.1:

    from happytransformer import HappyTokenClassification
    # --------------------------------------#
    happy_toc = HappyTokenClassification(model_type="BERT", model_name="dslim/bert-base-NER")
    result = happy_toc.classify_token("My name is Geoffrey and I live in Toronto")
    print(type(result))  # <class 'list'>
    print(result[0].word)  # Geoffrey
    print(result[0].entity)  # B-PER
    print(result[0].score)  # 0.9988969564437866
    print(result[0].index)  # 4
    print(result[0].start)  # 11
    print(result[0].end)  # 19
    print(result[1].word)  # Toronto
    print(result[1].entity)  # B-LOC
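
Because the result is a plain list, it can be post-processed with ordinary Python. The sketch below groups words by entity type and drops low-confidence predictions. It assumes that entity labels carry a "B-" or "I-" prefix, as "B-PER" and "B-LOC" do in Example 5.1; other models may use a different label scheme. The function words_by_entity is a hypothetical helper, not a library function, and the sample objects are placeholders rather than real model output.

```python
from collections import defaultdict
from types import SimpleNamespace


def words_by_entity(results, min_score=0.0):
    """Group words by entity type, dropping B-/I- prefixes and low scores."""
    grouped = defaultdict(list)
    for item in results:
        if item.score < min_score:
            continue
        # "B-PER" and "I-PER" both map to "PER"; other labels are kept as-is.
        prefix, sep, label = item.entity.partition("-")
        entity_type = label if sep and prefix in ("B", "I") else item.entity
        grouped[entity_type].append(item.word)
    return dict(grouped)


# Placeholder results: the scores below are made up for illustration only.
sample = [
    SimpleNamespace(word="Geoffrey", entity="B-PER", score=0.99),
    SimpleNamespace(word="Toronto", entity="B-LOC", score=0.99),
    SimpleNamespace(word="live", entity="B-MISC", score=0.40),
]

print(words_by_entity(sample, min_score=0.9))
```

The list returned by classify_token() could be passed in place of sample, provided each object exposes the word, entity, and score fields described above.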