Apple isn’t currently one of the top players in the AI game, but its new open-source AI model for image editing shows what it can contribute to the field. The model, called MLLM-Guided Image Editing (MGIE), uses a multimodal large language model (MLLM) to interpret text-based commands when manipulating images. In other words, the tool can edit photos based on text entered by the user. It isn’t the first tool to do so, but as the project’s paper (PDF) observes, “human instructions may be too short to be captured and followed by current methods.”
The company developed MGIE in collaboration with researchers at the University of California, Santa Barbara. The MLLM transforms simple or vague text prompts into more detailed, actionable instructions that the image editor can follow. For example, if a user wants to edit a photo of a pepperoni pizza to make it “healthier,” the MLLM interprets that as “add veggie toppings” and edits the photo accordingly.
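To make the two-stage idea concrete, here is a minimal illustrative sketch, not MGIE’s actual code: the real system uses an MLLM plus a diffusion-based editor, while the function names, the lookup table, and the string-based “editing” below are hypothetical stand-ins for those learned components.

```python
def derive_expressive_instruction(vague_prompt: str) -> str:
    """Stand-in for the MLLM step: expand a terse edit request
    into a concrete, actionable editing instruction."""
    # Hypothetical examples of the kind of expansion the MLLM learns;
    # unknown prompts fall through unchanged.
    expansions = {
        "make it healthier": "add veggie toppings such as peppers and mushrooms",
        "brighten it": "increase overall brightness and lift shadow detail",
    }
    return expansions.get(vague_prompt.lower(), vague_prompt)


def edit_image(image_path: str, instruction: str) -> str:
    """Stand-in for the diffusion editor: in the real pipeline this
    would apply the expressive instruction to the image pixels."""
    return f"edited({image_path}) per: {instruction}"


# The vague user prompt is expanded first, then passed to the editor.
instruction = derive_expressive_instruction("make it healthier")
result = edit_image("pizza.jpg", instruction)
print(result)
```

The point of the split is that the editor never sees the ambiguous prompt directly; it only receives the expanded instruction, which is what the paper argues makes the edits more faithful to user intent.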
In addition to making extensive changes to images, MGIE lets you crop, resize, and rotate photos, as well as adjust brightness, contrast, and color balance, all through text prompts. You can also edit specific areas of a photo, such as changing the hair, eyes, or clothing of people in it, or removing background elements.
As VentureBeat notes, Apple released this model through GitHub, but those interested can also try out the demo currently hosted on Hugging Face Spaces. Apple has not yet said whether it plans to use what it learns from this project in tools and features it can incorporate into its products.