ChatGPT Plugin Tutorial: How to build a ChatGPT Plugin for image generation using Stable Diffusion
Introduction
ChatGPT Plugins are tools that connect ChatGPT to external applications. These plugins expand ChatGPT's abilities by allowing it to interact with APIs created by developers. With their help, ChatGPT can fetch up-to-date information such as sports scores, stock prices, or the latest news, and assist users with tasks like booking a flight or ordering food, providing helpful guidance along the way. Stable Diffusion is a text-to-image generative model that produces high-resolution images from textual prompts. It is a latent diffusion model: it runs the diffusion process in a compressed latent space and then decodes the result into a full-resolution image.
What are we going to do?
I'm going to walk you through, step by step, the simple and straightforward process of building a ChatGPT Plugin for image generation using Stable Diffusion. As a bonus, at the end of this tutorial I'll also show you how to integrate your plugin with ChatGPT and test it out. So sit back, relax, and let's get started!
Prerequisites
Download Visual Studio Code for your operating system, or use any other code editor such as IntelliJ IDEA, PyCharm, etc. To use the ChatGPT Plugins API, you need to join the plugins waitlist. To use Stable Diffusion, we need an API key: go to DreamStudio, create an account, and grab your API key.
Getting started
Step 1 - Create a new project
Let's start by creating a new folder for our project. Open Visual Studio Code and create a new folder named chatgpt-plugin-stable-diffusion by running the following commands in the terminal:
mkdir chatgpt-plugin-stable-diffusion
cd chatgpt-plugin-stable-diffusion
Quick Note: In order for the plugin to work properly, we need to go through the following steps:
- Building an API (Flask, FastAPI, etc.) that implements the OpenAPI specification
- Creating the JSON manifest file that will define relevant metadata for the plugin
- Documenting the API in the OpenAPI yaml or JSON format
Step 2 - API implementation
As a first step we should implement the API. In this tutorial, I will use Flask, a lightweight WSGI web application framework. It is designed to make getting started quick and easy, with the ability to scale up to complex applications. However, you are free to use any other framework such as FastAPI, Django, Starlette, etc.
Let's create a new file named app.py. Open the terminal and run the following command:
touch app.py
It's time to install the required dependencies. Run the following command in the terminal:
pip install flask
pip install stability-sdk
Now, we can start implementing the API. First, we need to import the required dependencies:
from flask import Flask, request, jsonify, send_file, Response
from stability_sdk import client
import stability_sdk.interfaces.gooseai.generation.generation_pb2 as generation
import base64
Next, we need to set up the Flask app and the Stable Diffusion API client:
# Set up the Flask app
app = Flask(__name__)
# Set up the Stable Diffusion API client
stability_api = client.StabilityInference(
    key='',  # Replace with your Stability API key
    verbose=True,  # Set to True to enable verbose logging
    engine="stable-diffusion-xl-beta-v2-2-2"  # Replace with the engine you want to use
)
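Hard-coding the key works for a quick local test, but it is easy to leak by accident. A safer option is to read it from an environment variable. Here is a minimal sketch, assuming you export a variable named STABILITY_KEY before starting the app (the variable name is our own convention, not something the SDK requires):
import os

# Read the Stability API key from the environment instead of hard-coding it
stability_api = client.StabilityInference(
    key=os.environ["STABILITY_KEY"],  # raises KeyError if the variable is not set
    verbose=True,
    engine="stable-diffusion-xl-beta-v2-2-2"
)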
And last but not least, we need to define the API endpoint for generating images and the code that starts a development server locally:
# Define the API endpoint for generating images
@app.route('/generate-image', methods=['POST'])
def generate_image():
    # Get the prompt and other parameters from the request
    data = request.json  # Get the request payload
    prompt = data.get('prompt')  # Prompt to generate the image from
    seed = data.get('seed', None)  # Set to None to use a random seed
    steps = data.get('steps', 30)  # Number of diffusion steps to run
    cfg_scale = data.get('cfg_scale', 8.0)  # How strongly the image should follow the prompt
    width = data.get('width', 512)  # Width of the generated image
    height = data.get('height', 512)  # Height of the generated image
    samples = data.get('samples', 1)  # Number of samples to generate

    # Generate the image using Stable Diffusion
    answers = stability_api.generate(  # Call the generate() method
        prompt=prompt,
        seed=seed,
        steps=steps,
        cfg_scale=cfg_scale,
        width=width,
        height=height,
        samples=samples,
        sampler=generation.SAMPLER_K_DPMPP_2M
    )

    # Extract and Base64-encode the generated image(s) so they can be
    # returned in the API response as a JSON object
    generated_images = []
    for resp in answers:
        for artifact in resp.artifacts:
            if artifact.type == generation.ARTIFACT_IMAGE:
                # Encode the binary image data as Base64, then decode it to a UTF-8 string
                encoded_image = base64.b64encode(artifact.binary).decode('utf-8')
                generated_images.append(encoded_image)

    # Return the generated image(s) as the API response
    return jsonify(images=generated_images)

# Run the Flask app locally
if __name__ == '__main__':
    # Set debug=True to enable auto-reloading when you make changes to the code
    app.run(debug=True, host='127.0.0.1', port=5000)
Quick Note: The debug=True argument enables automatic reloading of the server when changes are made to the code, which is useful during development. The host='127.0.0.1' argument sets the server's IP address to localhost, and the port=5000 argument sets the port number the server listens on.
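One more thing to keep in mind: the endpoint assumes the request body is valid JSON and contains a prompt. If it doesn't, stability_api.generate() will fail with an unhelpful error. If you want clearer errors, a minimal validation sketch, placed right after reading the payload inside generate_image(), could look like this (optional, not part of the original tutorial code):
    # Optional input validation inside generate_image():
    data = request.get_json(silent=True)  # returns None instead of raising on invalid JSON
    if not data or not data.get('prompt'):
        # Respond with a clear 400 error instead of letting generate() fail
        return jsonify(error='The "prompt" field is required.'), 400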
Perfect! Now we can run the Flask app and test the API endpoint:
Create a new test.py file and copy/paste the following code:
import requests
import json
import base64
# Define the API endpoint URL
url = 'http://127.0.0.1:5000/generate-image'
# Set up the request payload
payload = {
    'prompt': 'a donut',
    'seed': 992446758,
    'steps': 30,
    'cfg_scale': 8.0,
    'width': 512,
    'height': 512,
    'samples': 1
}

# Send the POST request to generate the image
response = requests.post(url, json=payload)

# Check the response status code
if response.status_code == 200:
    # Get the generated images from the response
    data = response.json()
    generated_images = data['images']

    # Process the generated images
    for i, encoded_image in enumerate(generated_images):
        # Decode the base64-encoded image
        decoded_image = base64.b64decode(encoded_image)

        # Save the image to a file
        image_filename = f'generated_image_{i}.png'
        with open(image_filename, 'wb') as image_file:
            image_file.write(decoded_image)
        print(f'Saved generated image {i+1} as {image_filename}')
else:
    print(f'Request failed with status code {response.status_code}')
Don't forget to install the requests library by running pip install requests in your terminal.
Now, run the Flask app with python app.py and make sure it starts correctly. Then open a new terminal window and run the test script with python test.py; after a couple of seconds you should see the generated image in your folder.
Congratulations! We've successfully built the API for image generation using Stable Diffusion. Feel free to explore further and experiment with different prompts and parameters to unleash the full potential of image generation using Stable Diffusion.
Step 3 - Plugin manifest
Every plugin requires an ai-plugin.json file, which will later be hosted on the API's domain. Read more in the OpenAI plugin documentation.
Create a new file named ai-plugin.json and paste the following code:
{
  "schema_version": "v1",
  "name_for_human": "Image Generation Plugin",
  "name_for_model": "image-generation",
  "description_for_human": "Plugin for generating high-quality images using Stable Diffusion.",
  "description_for_model": "Plugin for generating high-quality images using Stable Diffusion.",
  "auth": {
    "type": "none"
  },
  "api": {
    "type": "openapi",
    "url": "http://127.0.0.1:5000/openapi.yaml"
  },
  "logo_url": "http://127.0.0.1:5000/logo.png",
  "contact_email": "[email protected]",
  "legal_info_url": "https://example.com/legal"
}
Quick Note: the manifest must be valid JSON, so comments are not allowed inside it. auth.type is set to "none" because our API requires no authentication (API-key or OAuth based authentication can be configured instead), api.url points to the OpenAPI specification, and logo_url points to the plugin logo.
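For reference, if you later protect the API with a service-level token, the auth block would look roughly like the sketch below (based on the plugin manifest format; the verification token is issued by OpenAI when you register the plugin, and the value shown here is only a placeholder):
"auth": {
  "type": "service_http",
  "authorization_type": "bearer",
  "verification_tokens": {
    "openai": "<verification-token-from-openai>"
  }
}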
Step 4 - OpenAPI specification
The OpenAPI specification is a standard for describing REST APIs. It is used to define the API that the plugin will use to communicate with the model. Read more in the OpenAPI documentation.
Create a new file named openapi.yaml and copy/paste the following code:
openapi: 3.0.1
info:
  title: Image Generation Plugin
  description: A plugin that generates high-quality images using Stable Diffusion.
  version: "v1"
servers:
  - url: http://127.0.0.1:5000 # URL of the Flask app
paths:
  /generate-image:
    post:
      operationId: generateImage
      summary: Generate an image
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/generateImageRequest"
      responses:
        "200":
          description: OK
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/generateImageResponse"
components:
  schemas:
    generateImageRequest:
      type: object
      required:
        - prompt
      properties:
        prompt:
          type: string
          description: The prompt to generate the image from.
        seed:
          type: integer
          description: The seed for deterministic generation.
    generateImageResponse:
      type: object
      properties:
        images:
          type: array
          items:
            type: string
          description: The generated image(s), Base64-encoded.
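Note that the request schema above only declares prompt and seed, so ChatGPT will never send the other parameters the endpoint accepts. If you want the model to be able to control them too, you could extend the properties of generateImageRequest with the remaining fields, mirroring the defaults used in app.py, for example:
        steps:
          type: integer
          description: Number of diffusion steps to run (default 30).
        cfg_scale:
          type: number
          description: How strongly the image should follow the prompt (default 8.0).
        width:
          type: integer
          description: Width of the generated image in pixels (default 512).
        height:
          type: integer
          description: Height of the generated image in pixels (default 512).
        samples:
          type: integer
          description: Number of images to generate (default 1).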
Now, go back to the app.py file and add endpoints for the plugin's logo, manifest, and OpenAPI specification:
# Define the API endpoint for the plugin logo
@app.route('/logo.png', methods=['GET'])
def plugin_logo():
    filename = 'logo.png'
    return send_file(filename, mimetype='image/png')

# Define the API endpoint for the plugin manifest
@app.route('/ai-plugin.json', methods=['GET'])
def plugin_manifest():
    host = request.headers['Host']
    with open('./ai-plugin.json') as f:
        text = f.read()
    return Response(text, mimetype='text/json')

# Define the API endpoint for the OpenAPI specification
@app.route('/openapi.yaml', methods=['GET'])
def openapi_spec():
    host = request.headers['Host']
    with open('./openapi.yaml') as f:
        text = f.read()
    return Response(text, mimetype='text/yaml')
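Two small details worth knowing before connecting the plugin to ChatGPT. First, OpenAI's plugin flow typically looks for the manifest at /.well-known/ai-plugin.json, so it is safest to also serve it from that path. Second, requests from the plugin UI come from https://chat.openai.com, so during local development you may need to allow that origin with CORS; the flask-cors package is one option. A minimal sketch, assuming you run pip install flask-cors and add the import near the top of app.py:
from flask_cors import CORS

# Allow cross-origin requests from ChatGPT during local development
CORS(app, origins=["https://chat.openai.com"])

# Also serve the manifest from the path the plugin flow expects
@app.route('/.well-known/ai-plugin.json', methods=['GET'])
def plugin_manifest_well_known():
    with open('./ai-plugin.json') as f:
        text = f.read()
    return Response(text, mimetype='text/json')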
All done! Now that you have completed all four steps, you have a server up and running with your functionality, ready to be talked to. The next step is actually integrating it with ChatGPT.
Here is the full code of app.py
file:
from flask import Flask, request, jsonify, send_file, Response
from stability_sdk import client
import stability_sdk.interfaces.gooseai.generation.generation_pb2 as generation
import base64

# Set up the Flask app
app = Flask(__name__)

# Set up the Stable Diffusion API client
stability_api = client.StabilityInference(
    key='',  # Replace with your Stability API key
    verbose=True,  # Set to True to enable verbose logging
    engine="stable-diffusion-xl-beta-v2-2-2"  # Replace with the engine you want to use
)

# Define the API endpoint for generating images
@app.route('/generate-image', methods=['POST'])
def generate_image():
    # Get the prompt and other parameters from the request
    data = request.json  # Get the request payload
    prompt = data.get('prompt')  # Prompt to generate the image from
    seed = data.get('seed', None)  # Set to None to use a random seed
    steps = data.get('steps', 30)  # Number of diffusion steps to run
    cfg_scale = data.get('cfg_scale', 8.0)  # How strongly the image should follow the prompt
    width = data.get('width', 512)  # Width of the generated image
    height = data.get('height', 512)  # Height of the generated image
    samples = data.get('samples', 1)  # Number of samples to generate

    # Generate the image using Stable Diffusion
    answers = stability_api.generate(  # Call the generate() method
        prompt=prompt,
        seed=seed,
        steps=steps,
        cfg_scale=cfg_scale,
        width=width,
        height=height,
        samples=samples,
        sampler=generation.SAMPLER_K_DPMPP_2M
    )

    # Retrieve and Base64-encode the generated image(s) from the response
    generated_images = []
    for resp in answers:
        for artifact in resp.artifacts:
            if artifact.type == generation.ARTIFACT_IMAGE:
                encoded_image = base64.b64encode(artifact.binary).decode('utf-8')
                generated_images.append(encoded_image)

    # Return the generated image(s) as the API response
    return jsonify(images=generated_images)

# Define the API endpoint for the plugin logo
@app.route('/logo.png', methods=['GET'])
def plugin_logo():
    filename = 'logo.png'
    return send_file(filename, mimetype='image/png')

# Define the API endpoint for the plugin manifest
@app.route('/ai-plugin.json', methods=['GET'])
def plugin_manifest():
    host = request.headers['Host']
    with open('./ai-plugin.json') as f:
        text = f.read()
    return Response(text, mimetype='text/json')

# Define the API endpoint for the OpenAPI specification
@app.route('/openapi.yaml', methods=['GET'])
def openapi_spec():
    host = request.headers['Host']
    with open('./openapi.yaml') as f:
        text = f.read()
    return Response(text, mimetype='text/yaml')

# Run the Flask app locally
if __name__ == '__main__':
    # Set debug=True to enable auto-reloading when you make changes to the code
    app.run(debug=True, host='127.0.0.1', port=5000)
Open https://chat.openai.com/, go to Plugin store > Develop your own plugin, and click My manifest is ready. Copy/paste the base link to our app, which is http://127.0.0.1:5000 in our case. Then click Find manifest file > Next > Install for me > Continue, and install the plugin.
Wow! Our plugin is installed and ready to be used. Now use it however you want!
Conclusion!
In this tutorial, we have explored the process of building a ChatGPT Plugin for image generation using Stable Diffusion. By leveraging the power of Stable Diffusion, we can enhance ChatGPT's capabilities to generate realistic and diverse images from simple textual prompts.
Beyond image generation, plugins play a crucial role in expanding the functionality of ChatGPT, allowing it to interact with external applications and APIs.
Thank you for following along with this tutorial, and I hope you found it valuable.
made with 💜 by abdibrokhim for lablab.ai tutorials.