A few days ago Stylegan3 was released to the public. Less than 24h later, nshepperd, integrated CLIP into Stylegan3 in this notebook, pleasing many of us that where willing to see it becoming real. I have been doing some tests, and this video above is one example of them. It was funny that the AI recognized who is Xena, and it made a decent work with it. I'll mention several AI artist around, that might be interested on this.
Hace pocos días Stylegan3 pasó a estar disponible para el público. Menos de 24h más tarde, nshepperd, integró CLIP y Stylegan3 en este notebook, complaciendo a unos cuantos de nosotros que esperábamos con muchas ganas ver esto hacerse realidad. He estado haciendo varios test, y este video es uno de ellos. Fue divertido que la IA reconozca quien es Xena, y que hiciera un buen trabajo con ella. Mencionaré algunos artistas de la IA que hay por aquí, creo que esto les puede interesar.
Testing Stylegan3+CLIP
AIs are evolving really quick, and Stylegan3 is one of the best examples. It's amazing what it does. I was missing some kind of control or interaction with it, so as I said above, felt very hyped when found this notebook, with a rudimentary implementation of CLIP. I shared the link to the original notebook, instead host a copy as I usually do. That's because I expect it gonna get many updates from its creator. When the creator will be done with it, I'll host a copy and do little improvements if need.
I ran several tests, wanted to know how the AI reacts to several text prompts, and target images as well. Here you have to choose between text or images, unlike the more evolved VQGAN and Diffusion notebooks around.
About the text prompt, is CLIP, so it works as usual. But a couple of things needs to be taken in consideration. Stlygan3 is focused in faces... so any prompt given should be keeping that in mind, although CLIP can understand whatever thing we give to it, Stylegan3 only knows how to make faces. Also, I have noticed that the AI doesn't seems to know a long collection of artists, that worked perfectly with VQGANs and diffusion models. According of what I read somewhere, it seems that this notebook is using a modified imagenet dataset, that's doesn't include everything as the original one. So I guess that is the reason for the issue I'm talking about.
About work with target images there is not much to say, the AI does a very decent work with them, more than any other AI. Of course, need to keep in mind that this AI only does faces properly. I made some tests using images from my previous portraits, and don't feel deceived.
I would like to remark how optimized is this notebook. Stylegan works with pretrained models, so to make images of 720x720 it just only consumed 5Gb of the GPU RAM. Just amazing!
For what I observed, this notebook follows two steps. The first one is just generate, almost instantly, 32 outputs and choose the one that matches better with the prompt. Then, the second one, so the AI starts to transform it to try to bring you the best result.
I have added ten results from my experiments, each one has notes/prompts below. They are as they came from the AI. Didn't want to edit them, as the goal of this post is who what the AI can do. Hope you like them, or at least they are useful as guidance. Regards, and see you in my next post!
Las IA's están evolucionando muy rápido, y Stylegan3 es uno de los mejores ejemplos. Es increíble lo que hace. Estaba echando de menos alguna forma de controlar o interactuar con ella, y cómo dije más arriba me sentí muy entusiasmado cuando encontré este notebook, con una implementación de CLIP bastante rudimentaria. He compartido el enlace al notebook original, al contrario de una copia como hago habitualmente. Esto es por que espero que el creador lo vaya actualizando. Cuando ya esté acabado, alojaré una copia y le iré haciendo pequeños mejoras, si es necesario.
Hice varios test, ya que quería saber como la IA reacciona a prompts distintos y también imágenes objetivo. hay que elegir entre texto o imágenes, al contrario que en otros notebooks que trabajan con VQGAN y Diffusion, que están más desarrollados.
Acerca de usar textos, es CLIP, así que funciona como siempre. Pero hay que tener un par de cosas en cuenta. Stylegan3 es para hacer caras... así que cualquier prompt debe ser pensado teniendo en cuenta eso, ya que aunque CLIP entiende cualquier cosa que le digas, Stylegan3 sólo sabe hacer caras. También me he dado cuenta, de que no reconoce una larga lista de artistas. Por lo que he leído en algunos sitios, parece que este notebook esta usando un versión modificada de imagenet dataset, que no incluye todo lo que trae el original. Así que supongo que esta es la razón de que suceda lo que he dicho.
No tengo mucho que decir acerca del trabajo que la IA hace con imágenes objetivo, el resultado es más que decente, mucho más que el de otras IAs. Por supuesto, hay que recordar que esta IA sólo sabe hacer caras de una forma apropiada. Hice varios test con imágenes de mis anteriores retratos, y no me siento decepcionado.
Quiero remarcar lo bien optimizado que está este notebook. Stylegan usa modelos pre entrenados, así que para hacer imágenes de 720x720, sólo ha consumido 5Gb de RAM de la GPU. Maravilloso!
Por lo que he podido ver, este notebook sigue dos pasos. El primero es generar, casi instantáneamente, 32 resultados y escoger el que se ajusta más al prompt. El segundo paso, es transformar dicha imagen, intentando dar un resultado que se ajuste lo más posible a lo que hemos pedido..
He añadido diez resultados de mis experimentos, cada uno de ellos con notas/prompts debajo. Son tal cual la AI los ha creado. No he querido editarlos, pues la misión de esta publicación es mostrar lo que la IA puede hacer. Espero que os gusten, o que al menos sean útiles como guía. Saludos,¡y nos vemos en mi siguiente publicación!.
Notebooks
Enhanced Original VQGAN+CLIP Notebook
Mse regularized Modified VQGAN+CLIP Notebook
VQGAN+CLIP with Video Features Notebook
AI Artworks
This is a "normal" face created in the first step.
Esto es una cara "normal" de las que son creadas en el primer paso.
"A guy rendered in unreal engine" is the prompt for this one.
"A guy rendered in unreal engine" es el prompt para esta imagen.
"A dark priestess" is the one I used for this.
"A dark priestess" es el que he usado para esta.
Let's try to mix the two prompts above, "A dark priestess rendered in unreal engine".
Probemos a mezclar los dos prompts que hemos usado antes, "A dark priestess rendered in unreal engine".
Here I used "A pilgrim".
Aquí he usado "A pilgrim".
And here used "A woman by Francisco de Goya".
Y aquí he usado "A woman by Francisco de Goya".
What about "Xena, the Warrior Princess"? You already met her in the header of this post.
Qué tal "Xena, the Warrior Princess"? Ya la habéis conocido en la cabecera de esta publicación..
And to finish, some images where I used some of my previous portraits as target images.
Y para acabar, algunas imágenes que he creado usando retratos que tenía hechos como imágenes objetivo.
All the content of this post is from my own.
Todo el contenido de este post es de mi autoría.
Images generated with Stylegan3+CLIP, served without anything else.
Imágenes generadas Stylegan3+CLIP, servidas sin guarnición.
100% AI free writing.
Textos 100% libres de IA.
We are on the same vibe ;-) I hope we will see your immortals moving their heads soon !
Yeah, I could do it, but at end have chose Xena this time... get her was amusing to me hehe
By the way, you mentioned Moebius in another comment directed to me... I wish I could get him working properly haha. I just get ugly stuff every time I tried him. So my portraits are currently using Enki Bilal.
Thanks for the support, take this !PIZZA!
@jotakrevs, sorry! You need more to stake more $PIZZA to use this command.
The minimum requirement is 20.0 PIZZA staked.
More $PIZZA is available from Hive-Engine or Tribaldex
Hi @jotakrevs. I too did some experiments soon after @dbddv01 had written a post about it.
You will notice that even if you increase the resolution to 1024 x 1024, there is no increase in VRAM usage or increase in iteration time. I did not increase the resolution beyond that. Thanks for mentioning us :)
Oh thanks for the info! I wonder how much resolution can bear the P100 that I'm getting usually 🤔
thanks for stop here to comment, grab this !PIZZA!
@jotakrevs, sorry! You need more to stake more $PIZZA to use this command.
The minimum requirement is 20.0 PIZZA staked.
More $PIZZA is available from Hive-Engine or Tribaldex
I wonder too🤔. Thank you for the Pizza :)
The output illustrations of the face are really good. I had some run with faces too but I can't still make it cohesive. I did follow your hints last time and read @dbddv01 approach to faces in his blog. By the way, thanks for sharing this tool. !discovery 20
This new notebook makes it very easy... I didn't need almost to work in the target images to get decent portraits from them. Thanks for the comment and the curation!
Take this !PIZZA!
@jotakrevs, sorry! You need more to stake more $PIZZA to use this command.
The minimum requirement is 20.0 PIZZA staked.
More $PIZZA is available from Hive-Engine or Tribaldex
This post was shared and voted inside the discord by the curators team of discovery-it
Join our community! hive-193212
Discovery-it is also a Witness, vote for us here
Delegate to us for passive income. Check our 80% fee-back Program
Congratulations @jotakrevs! You have completed the following achievement on the Hive blockchain and have been rewarded with new badge(s) :
Your next target is to reach 5000 upvotes.
You can view your badges on your board and compare yourself to others in the Ranking
If you no longer want to receive notifications, reply to this comment with the word
STOP
To support your work, I also upvoted your post!
I played with Stylegan3 already too. Will post my results later. Thanks for Clip notebook! !PIZZA
Awesome! I'm curious to see them!
@aemile-kh, sorry! You need more to stake more $PIZZA to use this command.
The minimum requirement is 20.0 PIZZA staked.
More $PIZZA is available from Hive-Engine or Tribaldex
I'm playing with the notebook and the format is a bit different than the usual ones. How did you make your video with all the different transitions?
Wow, awesome question haha... well I have not clue about video edition software so I copy pasted some stuff from the internet in a python script to make videos, and then used the tar files one download with the notebook. Sadly it runs locally in my machine, is not in a Colab notebook.
Que resultados tan locos, están geniales 👌👌
Muchas gracias! Toma tu !PIZZA!
PIZZA Holders sent $PIZZA tips in this post's comments:
@jotakrevs(3/10) tipped @kitzune (x1)
Learn more at https://hive.pizza.