I’ve spent the last six months taking a hands-on look at the intersection of AI and makeup. ACON AI’s first mobile app, Mirror Mirror, uses advanced AI techniques for skin and facial analysis, as well as combing through the latest trends, to answer users’ questions and create style recommendations. There’s also a feature where, after a look is recommended, we show the user what they might look like with the makeup applied to their face.
Today, I’ll go in-depth on that virtual makeup preview image – how it’s created, the various AI technologies we considered, and the pluses and minuses of each approach. As a general note, makeup is much harder to handle well than other virtual try-on categories such as eyeglasses, bags, or outfits. The way makeup interacts with the human face, the sheer number of makeup products (eyeliner, concealer, lipstick, etc.), and the importance we place on facial features combine to make this a particularly challenging task.
Virtual Makeup Options: The Cheap, Fast, Good Matrix
| Technology | Cheap? | Fast? | Good Quality? | Best For | Key Limitations |
|---|---|---|---|---|---|
| Virtual Try-On | ✅ Yes | ✅ Yes | ⚠️ Adequate (for simple applications) | Single products, simple applications | Looks artificial for complex styles |
| Makeup Transfer (GANs) | ⚠️ Mixed ($0.01/image) | ❌ No (20–60 seconds) | ✅ Yes (for trained looks only) | Predefined, specific makeup styles | Requires existing “from” image; uneven results with new styles |
| Generic Image Generation | ❌ No ($0.04/image) | ❌ No (20+ seconds) | ✅ Yes | Visualizing styles on generic faces | Not on the user’s own face |
| Doppelganger Generation | ❌ No (very expensive) | ❌ No (hours to train) | ✅ Yes | High-quality, highly personalized results | Extremely slow initial setup |
| Image Generation + Transfer | ❌ No ($0.07/image) | ❌ No (40+ seconds) | ⚠️ Mixed (60% great, 30% OK, 10% “zombie”) | Customizable looks with moderate quality | Inconsistent results; costly; slow |
| Doppelganger + Transfer | ❌ No (extremely expensive) | ❌ No (extremely slow) | ✅ Yes (lifelike) | Premium applications with no time/cost constraints | Prohibitively slow and costly for most applications |
| Newer AI Models (Gemini) | ⚠️ Mixed | ❌ No (15–20 seconds) | ⚠️ Mixed (inconsistent) | Experimental applications | Privacy restrictions; unreliable results |
| Mirror Mirror DIY Solution | ✅ Yes | ✅ Yes | ⚠️ Adequate | Basic visualization integrated with recommendations | “Paint by numbers” look; simplified capabilities |
Understanding the Options: Virtual Try-On vs. Makeup Transfer
It seems like every beauty website has virtual try-on for its makeup, but if you’ve tried these features, you’ve probably been underwhelmed by the results. On the other hand, sites like PhotoAI.com show what you’d look like in a series of amazingly realistic looks. These differences come down to the trade-offs each business made to meet its needs.
Virtual Try-On: Paint by Numbers
“It looks like a third grader colored it on with crayon.” – Virtual Try-on Tester
Virtual try-on is what you see on most beauty websites. You take a selfie and it shows you what the eyeshadow or lipstick would look like on you. This approach works reasonably well for single products but falls short for full makeup looks. For instance, if you’re looking for a new no-makeup makeup look, it might take a half hour or more to pick through all the individual products – blush, eyeshadow, concealer, etc. – and pull them all together. Even then, like a paint-by-number picture, all areas tend to have crisp definition between them, creating an artificial look.
This is because virtual try-on is, in fact, a sophisticated paint-by-number system. It works by mapping the face and assigning numbers to points on it. For instance, the tip of the nose might be #1, the right edge of the right eye #2, and so on. Different face mappers use different points, but one of the major ones, MediaPipe, uses machine learning (ML) to map as many as 468 facial landmarks.
Just like in a paint-by-number kit, to apply makeup to the image, the program finds which of the 468 landmarks surround the feature in question and then puts a color on top of it. You can adjust the opacity of the color so that the image underneath shows through to a greater or lesser degree, but ultimately this is just an advanced paint-by-number approach.
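To make this concrete, here’s a minimal sketch of that landmark-and-blend approach, assuming Python with the mediapipe and opencv-python packages. The lip indices are MediaPipe’s standard outer-lip contour from its 468-point face mesh; the color and opacity values are arbitrary illustrations, not a production recipe.

```python
import cv2
import mediapipe as mp
import numpy as np

# Outer-lip contour indices from MediaPipe's 468-point face mesh.
LIP_OUTLINE = [61, 185, 40, 39, 37, 0, 267, 269, 270, 409,
               291, 375, 321, 405, 314, 17, 84, 181, 91, 146]

def apply_lipstick(image_bgr, color_bgr=(60, 40, 180), opacity=0.4):
    """Find the lips via face-mesh landmarks and alpha-blend a color over them."""
    h, w = image_bgr.shape[:2]
    with mp.solutions.face_mesh.FaceMesh(static_image_mode=True) as mesh:
        result = mesh.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if not result.multi_face_landmarks:
        return image_bgr  # no face detected; return the image unchanged
    landmarks = result.multi_face_landmarks[0].landmark
    # Convert the normalized lip landmarks to pixel coordinates.
    points = np.array([(int(landmarks[i].x * w), int(landmarks[i].y * h))
                       for i in LIP_OUTLINE], dtype=np.int32)
    # "Paint" the numbered region: fill a mask, then blend color with original.
    mask = np.zeros((h, w), dtype=np.uint8)
    cv2.fillPoly(mask, [points], 255)
    overlay = image_bgr.copy()
    overlay[mask == 255] = color_bgr
    return cv2.addWeighted(overlay, opacity, image_bgr, 1 - opacity, 0)

tinted = apply_lipstick(cv2.imread("selfie.jpg"))
cv2.imwrite("selfie_lipstick.jpg", tinted)
```

Everything else – feathering the mask edges, matching lighting, handling occlusion – is refinement layered on top of this same fill-and-blend core.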
The big advantages of Virtual Try-On are that it’s well-proven, quick, cheap, and relatively easy to implement.
The main disadvantage is that it does not handle complex looks well, often resulting in that “colored with crayon” appearance.
Makeup Transfer with GANs
A completely different approach is Makeup Transfer, wherein an AI is trained using Generative Adversarial Networks (GANs) to take the makeup from one person’s face and apply it to another. GANs are a fascinating concept where one AI (the generator) tries to create convincing fakes while another AI (the discriminator) tries to distinguish real items from fake ones. Ideally, the generator becomes so good at making fakes that the discriminator can’t tell the difference between real and generated images. Once trained, this system can effectively transfer makeup between two images by leveraging its ability to create realistic, transformed versions of faces.
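To illustrate that dynamic, here’s a minimal adversarial training loop in PyTorch. Real makeup-transfer models (BeautyGAN-style architectures, for instance) add face parsing and makeup-specific losses on top of this; the tiny networks, shapes, and hyperparameters below are illustrative assumptions only.

```python
import torch
import torch.nn as nn

# Toy generator (noise -> flattened 28x28 "image") and discriminator.
generator = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
loss_fn = nn.BCEWithLogitsLoss()

def train_step(real_images):  # real_images: (batch, 784) flattened faces
    batch = real_images.size(0)
    noise = torch.randn(batch, 64)

    # 1) The discriminator learns to separate real images from generated fakes.
    fakes = generator(noise).detach()
    d_loss = (loss_fn(discriminator(real_images), torch.ones(batch, 1)) +
              loss_fn(discriminator(fakes), torch.zeros(batch, 1)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # 2) The generator learns to produce fakes the discriminator labels "real".
    g_loss = loss_fn(discriminator(generator(noise)), torch.ones(batch, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```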
Makeup Transfer can produce stunning results when using predefined looks but has several serious drawbacks. Like most trained models, it performs very well on specific tasks it was trained for while struggling with new scenarios it hasn’t encountered before.
Key limitations include:
- It requires a “from” image that already has the makeup look you want. With an almost infinite number of possible makeup styles, this restriction can be quite limiting.
- It can be quite slow and requires significant processing power. This also translates to costs approximately 100 times higher than virtual try-on.
- When used with looks that are new to the model, the results can be uneven. For instance, it would often misinterpret skin color as makeup and inadvertently change the target image’s skin tone.
Generic Image Generation
If you’re just looking to see what a makeup style looks like, and it doesn’t need to be on your own face, then using ChatGPT or similar AI tools to create an image with the desired look produces excellent results. This approach costs about $0.04 per image when used in a commercial product and can take 20 or more seconds to generate, but the quality is typically outstanding. The obvious limitation is that you’re not seeing the makeup on your own face, which reduces its value for personal decision-making.
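For a rough sense of what this looks like in practice, here’s a call to OpenAI’s image API using the official Python SDK. The prompt is an illustration, and per-image pricing depends on the model and size you choose.

```python
# A minimal sketch of generic image generation; assumes the openai package
# and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()
result = client.images.generate(
    model="dall-e-3",
    prompt=("Studio portrait of a woman wearing a 'no-makeup makeup' look: "
            "sheer foundation, neutral blush, subtle brown mascara"),
    size="1024x1024",
    n=1,
)
print(result.data[0].url)  # URL of the generated image
```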
Doppelganger Image Generation
Another option is image generation that creates a doppelganger—a digital double of the person, reminiscent of the mythological entity that mimics humans—where the AI generates an image designed to closely resemble the original person. This is one of the technologies sometimes used in “deep fakes” (although transfer techniques are more often used). Sites like PhotoAI.com do doppelgangers extremely well and create very realistic images.
These can be quite slow to produce, in some cases requiring several hours to train the model on a dozen or more pictures of the person, and there’s a reason PhotoAI.com charges $500 a year for its service. If quality is paramount and you can accept slow processing and high costs, then this is a compelling option.
I will say that I wasn’t always pleased with the results from PhotoAI.com. It turns out it’s not easy to generate an image that reliably looks like a specific person. Still, the average results far surpassed those of the other methods.
Image Generation and Transfer
This is a combination approach in which an image is generated with the desired makeup and then transferred to the user’s image. As mentioned earlier, makeup transfer requires an image to transfer makeup from, and current AI image generation models do a great job of building realistic-looking images with makeup based on text prompts.
The closer the generated image is to the user’s image in terms of features and coloring, the better the transfer results will likely be. However, the fundamental problem remains that makeup transfer models tend to work very well for makeup styles they were trained on and less well, sometimes not at all, for styles they weren’t trained with.
In our testing, where we have extremely divergent recommendations, about 60% of the images looked great, 30% looked acceptable, but 10% looked like “the zombie apocalypse had hit.” Sometimes this was the model confusing skin tone with makeup, and sometimes it was misidentifying facial features. Over time, we tried a dozen different models, including several we trained ourselves, but none were consistently reliable.
Because both image generation and transfer require substantial computational resources, the combination is slower and more costly than simple virtual try-on.
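Conceptually, the pipeline is only two stages. In this hypothetical sketch, generate_reference_image and transfer_makeup stand in for whichever image-generation API and GAN-based transfer model you use; neither is a real library call.

```python
def makeup_preview(user_photo, look_description, user_traits):
    # Stage 1: generate a reference face already wearing the recommended look.
    # Steering the prompt toward the user's features/coloring improves transfer.
    reference = generate_reference_image(
        prompt=f"Portrait photo, {look_description}, "
               f"{user_traits} skin tone and face shape"
    )
    # Stage 2: a makeup-transfer GAN moves the makeup from the reference
    # image onto the user's own photo.
    return transfer_makeup(source=reference, target=user_photo)
```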
Doppelganger + Transfer
One method that I’ve seen work extremely well is combining a doppelganger model with transfer technology. This is exceptionally computationally intensive and slow to create, but it can deliver remarkably lifelike results. The approach essentially creates a highly personalized base image and then applies precisely controlled makeup changes. The quality can be outstanding, but the processing requirements make it impractical for most consumer applications.
Newer AI Models
Some of the newer public AI models, such as Google’s Gemini 2.0 Flash Experimental model, have strong capabilities with images. The idea is simple: just give it an image, tell it what to do, and it’ll do it – sometimes. Because Google is extremely sensitive about privacy and concerned about misuse, especially with deep fakes, their models are very restrictive about what they will do with images, particularly when it comes to modifying them. Approximately 90% of my attempts to have this model make the makeup changes I wanted failed. The model didn’t explain why, but I assume it was triggering privacy protection filters.
Interestingly, when it did work, it was clear that the model was using a simplified virtual try-on approach behind the scenes. Unfortunately, even when it successfully added makeup to the picture, there were often problems. In one case, the eyeliner wings extended up to the forehead instead of out to the side. In another, the eyelashes were extremely long. It typically took 15–20 seconds to return results.
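For reference, a request along these lines can be made with Google’s google-genai Python SDK, roughly as sketched below. The model name matches the experimental release discussed above, but treat the request shape as an assumption, since both the SDK and the model are evolving quickly.

```python
# A sketch of image editing with Gemini 2.0 Flash Experimental. Expect
# frequent refusals: safety filters often block edits to faces.
from google import genai
from google.genai import types
from PIL import Image

client = genai.Client(api_key="YOUR_API_KEY")
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=["Add soft rose lipstick and subtle winged eyeliner to this face.",
              Image.open("selfie.jpg")],
    config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
)
for part in response.candidates[0].content.parts:
    if part.inline_data:  # the edited image, when the model agrees to return one
        open("edited.png", "wb").write(part.inline_data.data)
```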
We have tried numerous models to accomplish this task, and Gemini’s results, while disappointing, are actually better than most. I’m highly confident that AI will eventually handle this with a simple prompt, but the technology isn’t there yet.
Mirror Mirror’s Approach
Mirror Mirror specializes in image analysis and personalized recommendations; the revised image showing how makeup would look is a nice-to-have add-on feature for us. We needed a way to take a user’s image, combine it with recommendations based on skin analysis and current influencer trends, and generate a reasonable preview image quickly.
Our recommended looks are custom and vary widely – effectively infinite in variety – from simple lipstick to full Halloween costume makeup. As such, we needed a system that could be highly customizable on the fly without human intervention.
Cost was also a significant factor. For an amazing image quality, we might have been willing to pay a premium, but short of that, we wanted to keep costs low.
Initially, we thought that generating artificial images that looked somewhat like the user was the way to go. This approach displayed the recommended makeup very well. It was somewhat costly at $0.04 per image and slow at 30 seconds per generation, but I was personally impressed with the images. Unfortunately, our customers were not. They wanted to see the makeup on their own image, not on an idealized person who sort of looked like them (creating a more accurate likeness would have required far more computing power at a much higher cost).
So, we proceeded to evaluate what the major virtual try-on vendors had to offer. Of these, Banuba’s solution was by far the best – quick, reliable, and easy to use (although the API documentation was somewhat challenging to navigate as they offer many different products). There was a cost involved, which wasn’t cheap but also wasn’t prohibitive. However, we encountered a few major issues:
- Every product has to be preloaded into their system. Since we have over 55,000 possible individual-product recommendations, with new ones added regularly, this was logistically impractical and cost-prohibitive.
- Testers complained that it looked like “a third grader put it on with crayon.” To be fair, Banuba’s offering looked substantially better than most competitors, and there were various adjustments that could be made to improve the appearance. These modifications, however, would have required significant implementation effort.
- The technical integration wasn’t exactly aligned with our needs. We could make it work, but instead of being seamless and simple, it would have required careful customization.
Based on some promising initial trials, we ultimately went with the image generation and transfer method. Generating an image with the desired makeup and transferring it to the user’s photo had the significant advantage of being simple to implement – just generate the image, then pass it and the user’s image to the GAN.
The results were good enough that this approach was built into the Mirror Mirror app for several months. However, it suffered from some disadvantages: Because both image generation and transfer require substantial computing resources, it was taking about 30 seconds to process and costing $0.07 per image. The main problem was that about 10% of the time, the images looked terrible due to faulty transfer, as outlined earlier. We tried at least a dozen different models, including some we trained ourselves, but there was always a non-negligible percentage of photos that looked truly awful. It was concerning enough that we worried it would negatively affect users’ perception of our product.
Since these preview images are not a core feature of Mirror Mirror, we considered eliminating them entirely and focusing our resources on more critical aspects of the product. Instead, we decided to try a simplified approach.
We discovered a very basic example of a do-it-yourself virtual try-on system. It only supported eyeshadow and lipstick in one color (green), but it looked promising, and we thought we could build upon it to create something more comprehensive. Over time, we expanded it to include everything from concealer to eyelashes to lip liner. The result still has a somewhat “paint by numbers” appearance, but it’s specifically designed to work seamlessly with our recommendations and is both inexpensive and quick to process. The results don’t look as good as Banuba’s solution, but they meet our current needs. It’s cheap, fast, and good enough for our purposes.
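As a hypothetical illustration of how such a system stays in sync with recommendations, each product category can map to a face-mesh region plus blend settings, so any recommended shade can be painted on the fly – no vendor preloading required. The indices, values, and blend_region helper below (a generalization of the apply_lipstick sketch earlier) are illustrative assumptions, not Mirror Mirror’s actual configuration.

```python
REGION_STYLES = {
    # category: (face-mesh landmark indices, default opacity, feather radius)
    "lipstick":  ([61, 185, 40, 39, 37, 0, 267, 269, 270, 409,
                   291, 375, 321, 405, 314, 17, 84, 181, 91, 146], 0.45, 3),
    "blush":     ([50, 101, 118, 117, 111], 0.20, 15),  # soft, heavily feathered
    "eyeshadow": ([226, 247, 30, 29, 27, 28, 56, 190], 0.35, 7),
}

def render_look(image, recommendations):
    """Paint each recommended product's color into its mapped facial region."""
    for product in recommendations:  # e.g. {"category": "lipstick", "rgb": (178, 58, 74)}
        indices, opacity, feather = REGION_STYLES[product["category"]]
        image = blend_region(image, indices, product["rgb"], opacity, feather)
    return image
```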
Conclusion: Good, Fast, Cheap – Pick Two
There is a classic project management triangle that says, “In any endeavor, you can achieve excellence in two dimensions—quality, speed, or affordability—but must compromise on the third.” Of course, the real world is more complex: what does “fast” mean? How cheap is “cheap enough”? What constitutes “good” quality? We find it more useful to take a holistic approach to these tradeoffs.
For Mirror Mirror, if we found a solution that perfectly displayed our recommended looks and processed quickly, that would be a major advantage for our product, and we’d gladly pay a premium for it. Unfortunately, nothing currently fits that bill. Instead, we went with fast, cheap, and good enough.
The best solution for your specific needs will depend largely on what you’re trying to achieve:
- If you want exceptional quality, don’t mind the cost, and are willing to wait hours for processing, check out PhotoAI.com and its doppelganger-plus-transfer techniques. The results can be truly impressive.
- If you have a limited set of predefined looks that you want to apply to images, and don’t need lightning speed, then makeup transfer using GANs is a good approach.
- If you want to show custom looks but don’t need to display them on a specific person’s face, then image generation is straightforward and offers excellent quality. I prefer Stability AI’s tools, but Grok and Gemini also produce good results.
- If you just need basic virtual try-on functionality, there are many vendors available, but I was most impressed by Banuba’s offerings.
Of course, the technology landscape is evolving rapidly and might be completely different six months from now. I would recommend starting by exploring the latest public models from Gemini, OpenAI, Grok, and Claude to see if any of them meet your requirements. It might be somewhat more expensive, but the simplicity of having a single solution would likely pay dividends over time in terms of integration and maintenance.
If you’re aware of other effective methods for applying complex makeup looks to images, please feel free to share them in the comments. I’m always looking for new approaches to enhance our virtual makeup experience.
