Spring AI supports text-to-image generation, building on existing model providers. The code involved is short, only about five lines, but it is still worth writing down.
DALL-E 3#
This is the text-to-image model released by OpenAI, and probably the most widely used one. It holds up well even compared to professional Stable Diffusion (SD) and Midjourney (MJ) workflows, and its key advantage is simplicity: you don't need to craft prompt keywords, just describe the image in natural language.
Since the dependencies and configuration for OpenAI were set up earlier, no changes are needed there.
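If you want defaults applied globally rather than per call, Spring AI also exposes image options as configuration properties. A sketch, assuming the `spring.ai.openai.image.options` property group (the exact property names may differ by Spring AI version, so verify against the reference docs):

```yaml
spring:
  ai:
    openai:
      image:
        options:
          model: dall-e-3   # default model for image calls
          quality: hd       # dall-e-3 only
```

Options passed at call time in the builder still override these defaults.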
First, inject an OpenAiImageModel object:
private final OpenAiImageModel openAiImageModel;
Then, just copy the official example here.
ImageResponse response = openAiImageModel.call(
        new ImagePrompt("A light cream colored mini golden doodle",
                OpenAiImageOptions.builder()
                        .withQuality("hd")
                        .withN(4)
                        .withHeight(1024)
                        .withWidth(1024).build())
);
Here are the main parameters available when building the options:

| Parameter | Explanation |
|---|---|
| Model | The model to use; defaults to DALL_E_3 (dall-e-3) |
| Quality | The quality of the generated images (hd or standard); only supported by dall-e-3 |
| N | The number of images to generate |
| Width | The width of the generated images. Together with Height, the size must be one of 256x256, 512x512, or 1024x1024 for dall-e-2, and one of 1024x1024, 1792x1024, or 1024x1792 for dall-e-3 |
| Height | The height of the generated images; see Width for the allowed combinations |
| Style | The style of the generated images, either vivid or natural: vivid leans hyper-real and dramatic, while natural produces more natural-looking results. Only supported by dall-e-3 |
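The width/height constraints above are easy to get wrong, and an invalid size wastes an API call, so it can help to validate locally first. A minimal sketch (the allowed sizes come from the table above; the helper class itself is hypothetical, not part of Spring AI):

```java
import java.util.Set;

public class ImageSizeCheck {

    // Allowed "WIDTHxHEIGHT" combinations for each model
    private static final Set<String> DALL_E_2_SIZES = Set.of("256x256", "512x512", "1024x1024");
    private static final Set<String> DALL_E_3_SIZES = Set.of("1024x1024", "1792x1024", "1024x1792");

    /** Returns true if width x height is a valid size for the given model. */
    public static boolean isValidSize(String model, int width, int height) {
        String size = width + "x" + height;
        return "dall-e-3".equals(model) ? DALL_E_3_SIZES.contains(size)
                                        : DALL_E_2_SIZES.contains(size);
    }

    public static void main(String[] args) {
        System.out.println(isValidSize("dall-e-3", 1024, 1792)); // true
        System.out.println(isValidSize("dall-e-3", 512, 512));   // false
    }
}
```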
Finally, return the image.
/**
* Call OpenAI's dall-e-3 to generate an image
* @param message The prompt
* @return ImageResponse
*/
@GetMapping(value = "/openImage", produces = "text/html")
public String openImage(@RequestParam String message) {
ImageResponse imageResponse = openAiImageModel.call(new ImagePrompt(message,
OpenAiImageOptions.builder()
.withModel(OpenAiImageApi.DEFAULT_IMAGE_MODEL)
.withQuality("hd")
.withN(1)
.withWidth(1024)
.withHeight(1024).build())
);
String url = imageResponse.getResult().getOutput().getUrl();
System.err.println(url);
return "<img src='" + url + "'/>";
}
OpenAI can return the image in two ways: either as a URL or as a BASE64-encoded string of the image data.
Stability AI#
At first, I assumed the Stability AI integration could call a local Stable Diffusion API for image generation, but it turns out that is not the case: it targets the online drawing platform hosted by Stability, which is a different thing from a self-hosted Stable Diffusion.
First, register an account on the Stability AI official website to get free credits. After completing the registration, copy the API KEY.
Then, add the dependency.
<!-- stability dependency -->
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-stability-ai-spring-boot-starter</artifactId>
</dependency>
And configure it.
spring:
ai:
stabilityai:
api-key: sk-xxx
Then, inject a StabilityAiImageModel and call it in the same way.
/**
* Call Stability AI to generate an image
* @param message The prompt
* @return ImageResponse
*/
@GetMapping(value = "/sdImage", produces = "text/html")
public String sdImage(@RequestParam String message) {
ImageResponse imageResponse = stabilityAiImageModel.call(
new ImagePrompt(message,
StabilityAiImageOptions.builder()
.withStylePreset("cinematic")
.withN(1)
.withHeight(512)
.withWidth(768).build())
);
String b64Json = imageResponse.getResult().getOutput().getB64Json();
String mimeType = "image/png";
String dataUrl = "data:" + mimeType + ";base64," + b64Json;
return "<img src='" + dataUrl + "' alt='Image'/>";
}
Please note that:
- For Stability AI, please use English prompts. Chinese prompts may not work.
- Stability AI only returns the image in base64 format, and the URL is null.
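Since only base64 comes back, the controller above has to wrap it in a data URL before an img tag can render it. That assembly step is plain Java and worth pulling into a helper; a minimal sketch (the class and method names are mine, not from Spring AI):

```java
import java.util.Base64;

public class DataUrlUtil {

    /** Wraps an already-base64-encoded image (e.g. b64Json from the response) in a data URL. */
    public static String toDataUrl(String b64Json, String mimeType) {
        return "data:" + mimeType + ";base64," + b64Json;
    }

    /** Encodes raw image bytes to a data URL, for when you have the bytes instead. */
    public static String toDataUrl(byte[] imageBytes, String mimeType) {
        return toDataUrl(Base64.getEncoder().encodeToString(imageBytes), mimeType);
    }

    public static void main(String[] args) {
        // First bytes of a PNG file, just to demonstrate the encoding
        byte[] pngHeader = {(byte) 0x89, 'P', 'N', 'G'};
        System.out.println(toDataUrl(pngHeader, "image/png")); // data:image/png;base64,iVBORw==
    }
}
```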
Zhipu#
I also tried another image-generation provider, Zhipu, but found that the free credits granted after registration only cover the chat models, not text-to-image. On top of that, trying the latest model for free requires real-name verification... That felt a bit unreasonable, so I gave up.