I’m interested in AI stuff, and the chatbots are pretty fun. I have a dream of running my own local copy to assist me in some way, though to be honest I don’t know what the options are yet.
I’m somewhat comfortable with Python because I spent the fall attempting to build a “robot”, so I figured I might as well get better at it and make some kind of chat thing.
Setup Steps
- Install Python
- Create a virtual environment
- Libraries to pip install: transformers and torch (argparse ships with the Python standard library, so it doesn’t need installing) — rough commands below
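If it helps, the setup looked roughly like this on my machine (the environment name is just an example, not anything special):

python -m venv chatbot-env
source chatbot-env/bin/activate   # on Windows: chatbot-env\Scripts\activate
pip install transformers torch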
Code Stuff
Save the following script in your environment and run it with:
python FILENAME.py "Whatever question you like?"
Python Script
import argparse
from transformers import GPTNeoForCausalLM, GPT2Tokenizer
import torch

def generate_text(input_text):
    # Load the pre-trained model and tokenizer
    model_name = "EleutherAI/gpt-neo-1.3B"
    model = GPTNeoForCausalLM.from_pretrained(model_name)
    tokenizer = GPT2Tokenizer.from_pretrained(model_name)  # GPT-Neo reuses the GPT-2 tokenizer

    # GPT-Neo has no pad token by default, so reuse the end-of-sequence token
    model.config.pad_token_id = model.config.eos_token_id

    # Encode the input prompt
    inputs = tokenizer.encode(input_text, return_tensors="pt")

    # Create the attention mask (all ones, since there's no padding involved)
    attention_mask = torch.ones(inputs.shape, device=inputs.device)

    # Generate text from the model with adjustments for more sensible output
    output = model.generate(
        inputs,
        attention_mask=attention_mask,
        max_length=100,            # Cap the combined length of prompt + generated text
        num_return_sequences=1,    # Only generate one sequence
        do_sample=True,            # Enable sampling for some variety
        temperature=0.6,           # Lower temperature for more deterministic output
        top_p=0.85,                # Nucleus sampling with a smaller top_p for more focused output
        top_k=30,                  # Limit sampling to the 30 most likely tokens
        pad_token_id=model.config.pad_token_id,  # Ensure pad_token_id is set correctly
    )

    # Decode the output
    decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)

    # Print the result
    print(decoded_output)

if __name__ == "__main__":
    # Set up argument parsing
    parser = argparse.ArgumentParser(description="Generate text with GPT-Neo")
    parser.add_argument("input", type=str, help="Input text prompt for the model")

    # Parse the arguments
    args = parser.parse_args()

    # Call the generate_text function with the user-provided input
    generate_text(args.input)
Sample Results
Q: why is the sky blue?
A: I’m wondering why the sky is blue. I’ve always wondered this, but I don’t know where to start. I’ve seen the sky blue in the past, but I can’t remember when. I know it was when I was a kid, but I don’t remember.
I have a friend who is a blue-collar worker, and he says that blue is a color that is not used in the sky.
Q: do you remember my last question?
A: I asked you to find out if the world is ending.
I asked you to find out if the world is ending.
I asked you to find out if the world is ending.
I asked you to find out if the world is ending.
Q: do you remember my last question? think harder
A: It’s a question that I’ve been asking myself for years now. I’ve been asking myself this question since I was a teenager, and I’ve been asking myself this question since I was a young man. I’ve asked myself this question a lot, and I’ve been asking myself this question a lot.
Conclusion
The results seemed a little unsettling so far, but from what I’ve read, this is to be expected. There are settings to adjust how varied the responses are, as well as models that you have to “train”. The latter I have no idea about yet, but I’ll keep tinkering.
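For example, swapping the generate() call in the script above for something like this should loosen up the sampling and make the replies more varied (these particular values are just guesses on my part, not anything I’ve tested properly):

output = model.generate(
    inputs,
    attention_mask=attention_mask,
    max_length=150,            # allow a longer reply
    num_return_sequences=1,
    do_sample=True,
    temperature=0.9,           # higher temperature = more randomness
    top_p=0.95,                # wider nucleus = more candidate words considered
    top_k=50,
    pad_token_id=model.config.pad_token_id,
)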