Aleksi Sahala: “A few words about generative AI (Warning-O-Wall-O-Text). Lately I have been playing with text-to-image and image-to-video models and created a few more or less goofy background videos for my music, mostly because it’s fun, but as a positive side effect, it seems to help hold people’s attention when it comes to watching music reels. Static images get skipped within a few seconds far more often than videos do.
Using AI has evoked mixed reactions (although generally positive). People from the demoscene or with a graphic design/3D-animation background clearly have a negative sentiment. The reasons for this seem to be mostly (1) the fluctuating quality in comparison to state-of-the-art productions and (2) that creating such things is too easy and requires neither a 2000+ hour time investment nor the effort to create the actual content. The latter is in my opinion just as ridiculous an argument as telling someone that they should learn to play the French horn instead of using a synthesizer or a plugin for it in music production.
I personally feel that the excessively negative attitudes come from the fear of becoming obsolete and from the concern that the stupid masses won’t give the deserved respect to the people who do it “the proper way”. I have raised similar concerns before and whined about AI-generated music, and I’ve been pissed off when someone (who apparently had no clue about how tracker music is made) was not impressed at all by my 4-channel song written in 1990s software. He told me that it sounded muddy and amateurish in comparison to other chiptunes (that were obviously written in modern DAWs without any technical limitations). The lesson learned here was that many people only care about the end product and couldn’t care less how much skill, hacking and creativity you put into making it.
In a few years AI will rival humans in creativity (yes, creativity) and production. It will generate videos that are almost on par with what the top digital artists can create, and the same will eventually apply to music. It will become more and more difficult to recognize what is generated and what is not, and people will be able to simply hide the fact that their creations were AI-generated. This will ruin digital arts competitions, and it will obviously hurt many people whose income depends on music or graphics, as everyone with access to the right tools can create what they want without the necessary skill set or even imagination, instead of having to order it from someone with the skills to create it from scratch.
All digital arts will be heavily affected by AI, and everyone is free to either whine about it or cope with it.
After being a whiner about Suno-generated music for some years, I have changed my opinion recently. I personally think that these tools are great. They open creative avenues to people who simply do not have the time or interest to put thousands of hours into learning these arts “the proper way”, but who want to create things. I think that the fear of “uncreative” people taking over the digital arts is not a relevant concern, since why would someone with no interest in arts create art even if it was easy? (After all, using these models costs real money or requires fairly advanced technical knowledge to set up locally.) The fact is that many things have been fairly easy to do since modern computers became available. Creating a techno track from loops has literally been just a two-hour learning curve away for two decades, and anyone can throw raw eggs on a canvas and call it modern art.
I now see AI as a tool that will primarily cut corners for people who love arts but do not like certain parts of the process or want to diversify their creations. If someone’s main concern about generative AI is that people will be doing stuff “too easily”, then it is a you problem. You can do it the easy way too if you want to. If your concern is losing your job or the environmental impact of all these models generating useless content around the clock, I can understand you.
I can put 20 hours of my free time into just writing melodies that I never use, or spend a night writing a song for harpsichord that will get 500 listens in 10 years. I do it because it’s fun, and this is why most people do art. Whether you have fun doing it the traditional way or with generative tools really doesn’t matter. Personally, I would never replace music composition and writing with an AI, but if someone wants to do it because they suck at writing melodies, I can totally understand. On the other hand, I would gladly outsource the engineering, mixing and mastering of songs to an AI model if it actually did a good job, since I never really liked any of those aspects of music production.
Anyway, I’m really looking forward to image and video generators improving and giving users more control over the creative process. I believe that once these tools become more flexible, there will be a learning curve to using them efficiently, and people will begin to notice the “human skill gap” in different generated productions, just as you can notice a bad writer, cinematographer or director in movies. These days the process is fairly tedious because the models are extremely inconsistent and your credits get consumed whether or not the result is what you wanted. First you have to deal with the image generation, often Photoshop the results, use other AI tools to change the aspect ratio (which is again a random process if you expand the image rather than crop it), and then create the video itself from 5-10 second clips and stitch them together.”