Crafting the Future Path of Image-Generating AI: An Exclusive Interview with Silicon Valley’s AI Visionary, Xiaobai Ji

DXPOS
Apr 14, 2024

In the realm of AI, 2023 marked a breakthrough in image and video generation, spearheaded by open-source technologies like Stable Diffusion. While large language models like ChatGPT have shown what AI can do in conversation, tools such as Stable Diffusion, Midjourney, and Sora have begun to render lifelike images and videos.

The rapid advancement of technology has sparked concerns about the commercialization of image generation AI. “When will these technological breakthroughs translate into productivity? Is the ultimate role of AI merely to create images and videos?” These questions prompted a trip to Silicon Valley for insights from image-generation AI entrepreneurs and human-computer interaction experts like Xiaobai Ji.

Xiaobai, a long-term entrepreneur in the field and an employee at a leading social media company in Silicon Valley, emphasized the nascent state of AI. “People harbor unrealistic fantasies about AI, imagining that it has human-like thinking and emotions and can replace humans in tasks like writing or even forming relationships,” Xiaobai explained. He clarified that current AI merely abstracts and recombines humanity’s collective wisdom and lacks true creativity, especially in the image domain, which is why copyright disputes frequently arise.

Unlike the text people type into ChatGPT, the image a person has in mind cannot be handed to an AI directly, which makes users especially sensitive to how AI-generated images turn out. “Humans can’t use images to communicate directly, unlike how we use words and sounds. That’s why platforms like ChatGPT gained popularity quickly; they’re easy to use without prior training, making the importance of human-machine interaction less critical,” Xiaobai stated.

For image-generating AI, creating an intuitive human-machine interaction experience that reduces barriers to use and accurately captures human requests is essential. Xiaobai also highlighted the complexity of human-machine interaction in this field, ranging from generating images through text to more sophisticated methods involving multiple interactions.

Discussing the commercial potential of image-generating AI, Xiaobai noted the shift from consumer-focused (B2C) to business-oriented (B2B) markets. “The challenge in commercializing general AI tools for image generation lies in their inability to meet specific consumer needs, which limits their market ceiling,” he remarked.

In the near term, Xiaobai advises focusing on consumer products with low average revenue per user and on high-frequency needs in specific vertical business markets. “AI is meant to replace repetitive and costly human labor, not to create luxury goods,” he commented, underscoring the importance of understanding specific industry needs for effective B2B solutions.

Looking ahead, Xiaobai sees video generation as the next frontier, predicting that it will reach commercial viability by 2025. “Video generation will likely transform the entertainment industry, offering tools for visual effects and detail correction that are currently unachievable by traditional methods,” he projected.

Concluding the interview, Xiaobai expressed optimism for the long-term collaboration between AI and human creativity, particularly in photography and visual arts. “AI will never replace the subjective expression inherent in photography, which is often a reflection of the photographer’s emotions and thoughts,” he concluded, highlighting the enduring value of human creativity in an increasingly automated world.

Follow Xiaobai for more insights on his LinkedIn: https://www.linkedin.com/in/xiaobaiji/
