Text Generation Settings

A class called GENSettings() is used to control which algorithm is used and its settings. It is passed to the “args” parameter for HappyGeneration.generate_text().

from happytransformer import GENSettings

GENSettings() contains the fields shown in Table 1.0

Table 1.0:

Parameter Default Definition
min_length 10 Minimum number of generated tokens
max_length 50 Maximum number of generated tokens
do_sample False When True, picks words based on their conditional probability
early_stopping False When True, generation finishes if the EOS token is reached
num_beams 1 Number of steps for each search path
temperature 1.0 How sensitive the algorithm is to selecting low probability options
top_k 50 How many potential answers are considered when performing sampling
top_p 1.0 Min number of tokens are selected where their probabilities add up to top_p
no_repeat_ngram_size 0 The size of an n-gram that cannot occur more than once. (0=infinity)
bad_words None List of words/phrases that cannot be generated.

Example 1.2:

from happytransformer import HappyGeneration, GENSettings

#---------------------------------------------------
happy_gen = HappyGeneration()

greedy_settings = GENSettings(no_repeat_ngram_size=2,  max_length=10)
output_greedy = happy_gen.generate_text(
    "Artificial intelligence is ",
    args=greedy_settings)

beam_settings = GENSettings(num_beams=5,  max_length=10)
output_beam_search = happy_gen.generate_text(
    "Artificial intelligence is ",
    args=beam_settings)

generic_sampling_settings = GENSettings(do_sample=True, top_k=0, temperature=0.7,  max_length=10)
output_generic_sampling = happy_gen.generate_text(
    "Artificial intelligence is ",
    args=generic_sampling_settings)

top_k_sampling_settings = GENSettings(do_sample=True, top_k=50, temperature=0.7,  max_length=10)
output_top_k_sampling = happy_gen.generate_text(
    "Artificial intelligence is ",
    args=top_k_sampling_settings)

top_p_sampling_settings = GENSettings(do_sample=True, top_k=0, top_p=0.8, temperature=0.7,  max_length=10)
output_top_p_sampling = happy_gen.generate_text(
    "Artificial intelligence is ",
    args=top_p_sampling_settings)

bad_words_settings = GENSettings(bad_words = ["new form", "social"])
output_bad_words = happy_gen.generate_text(
    "Artificial intelligence is ",
    args=bad_words_settings)
    
print("Greedy:", output_greedy.text)  # a new field of research that has been gaining
print("Beam:", output_beam_search.text) # one of the most promising areas of research in
print("Generic Sampling:", output_generic_sampling.text)  #  an area of highly promising research, and a
print("Top-k Sampling:", output_top_k_sampling.text)  # a new form of social engineering. In this
print("Top-p Sampling:", output_top_p_sampling.text)  # a new form of social engineering. In this
print("Bad Words:", output_bad_words.text) # a technology that enables us to help people deal