Adobe says it has worked closely with professional video creators to advance the Firefly Video Model with a particular emphasis on generative editing, reports Adrian Pennington.

Adobe has advanced its generative AI video product, launching further tools in beta to enhance the editing process.

First previewed earlier this year, the Firefly Video Model gains capabilities in beta this week that are designed to help video editors fill gaps in a timeline and add new elements to existing footage.

“Video is hard,” says Alexandru Costin, Vice President, Generative AI and Sensei at Adobe. “We’ve been working on it for a while but we weren’t happy with the quality. Our research department has done wonders to handle 100 times more pixels, 100 times more data using thousands and thousands of GPUs to make sure we master the research. We’re now proud of the quality we’re delivering.”

A 17-year Adobe veteran, Costin is charged with building and integrating generative AI models into the firm’s creative tools, and building AI and ML data, training and inference technologies and infrastructure for Adobe Research. He has helped launch generative AI models for imaging, vectors and design, which are integrated into products accessible from Creative Cloud.

The company says some 13 billion images have been created using Firefly to date. Customers like toy maker Mattel are using it to refine packaging ideas. Drinks brand Gatorade just activated a marketing campaign which encourages customers to design their own virtual bottles using a version of Firefly on its website.

Now the focus is on video generation using text-to-video and image-to-video prompts. Adobe’s customers, though, want to use AI smarts to speed up and improve editing rather than purely to generate video.

“Video was a big ask from our customers since video is now a very prevalent medium for content creation,” Costin says. “The most use we get from a Firefly image is Generative Fill [in which users can add, remove, or modify images using simple text prompts inside Photoshop] because we’re serving an actual customer workflow. More than 70% of our use cases for Firefly are in editing versus pure creation. Generative editing is the most important thing our customers are telling us in terms of what they need.”

Generative Extend

Generative editing essentially means helping video creators extend and enhance the original camera footage they already have.

Costin explains: “Most video post-production is about assembling clips, making sure you match the soundtrack and the sounds with the actual clips. One big problem customers have is that sometimes they do not have the perfect shot and cannot match it up with the audio timeline.”

Generative Extend in Premiere Pro is a new tool in beta that allows users to extend any clip by several seconds to cover gaps in footage, smooth out transitions, or hold on shots longer. Not only is the video extended but so too is the audio track.

“We’re extending the ambient ‘room tone’ to smooth out audio edits. You can even extend sound effects that are cut off too early. It’s an amazing technology. We’ve already heard from customers that they’re super excited about this application.”

Generative Extend won’t create or extend spoken dialogue, so dialogue in the extended section will be muted. Music is also not supported due to potential copyright issues, but you can automatically lengthen and shorten tracks with the existing Remix tool.

Also available in beta are Firefly-powered Text-to-Video and Image-to-Video capabilities. The former includes generating video from text prompts, accessing a wide variety of camera controls such as angle, motion and zoom to fine-tune videos, and referencing images to generate b-roll that fills gaps in a timeline. With Image-to-Video, you can also use a reference image alongside a text prompt to create a complementary shot for existing content, such as a close-up, by uploading a single frame from your video. Or you could create new b-roll from existing still photography.

Costin reiterates the importance of Adobe’s “editing first” focus. “We want to make sure customers bring their own assets and use generative AI to continue editing those assets or derive new videos from images because we’ve heard this is the most important thing they need.”

Control is another important attribute that creators are asking for.

Controlled prompts

“Our customers are very picky. They want to be able to control the virtual camera and make sure that their prompt is well understood. They want to make sure we can generate high-quality videos that they can actually use not only in ideation but also in production.”

Within the Firefly video application, users can already write a detailed prompt, to which is now added a “first wave of control mechanisms” for calibrating shot size, motion and camera angle.

“Those are very important control points that will help video creators to generate new clips using image-to-video or text-to-video to basically direct their shots so they can tell the story they want. We have many more research capabilities in control to come but we’re very proud of this first step and we’re going to keep investing in it.”

Another generative editing use case is for overlays in which editors can add visual depth to existing footage by overlaying atmospheric elements like fire, smoke, dust particles and water inside Premiere Pro and After Effects.

“We’re also focusing our model to learn both imaginary worlds and the real world so that the quality [of output] of the imaginary worlds is as high as for the real world.”

You can even change the original motion or intent of your shot in some cases. For example, if your clip has a specific action and you’re an editor who wishes to pitch a reshoot to a director, you can visualise how the change would help the story while maintaining the same look.

Or if production misses a key establishing shot, you can generate an insert with camera motion, such as landscapes, plants or animals.

Generative Extend has some limitations in beta. It is limited to 1920x1080 or 1280x720 resolution, a 16:9 aspect ratio, 12-30fps, 8-bit SDR, and mono or stereo audio.
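
As a rough illustration, those beta constraints could be checked up front with a sketch like the one below. The function and field names are hypothetical, not part of any Adobe API; only the thresholds come from the limits listed above.

```python
# Hypothetical pre-flight check against the Generative Extend beta limits
# described above. Field names (width, height, fps, bit_depth, audio_channels)
# are illustrative, not an Adobe API.

SUPPORTED_RESOLUTIONS = {(1920, 1080), (1280, 720)}  # both 16:9

def clip_is_extendable(width: int, height: int, fps: float,
                       bit_depth: int, audio_channels: int) -> bool:
    """Return True if a clip falls within the documented beta limits."""
    return (
        (width, height) in SUPPORTED_RESOLUTIONS
        and 12 <= fps <= 30
        and bit_depth == 8            # 8-bit SDR only
        and audio_channels in (1, 2)  # mono or stereo
    )

# Example: a 1080p 25fps stereo clip qualifies; a 4K clip does not.
print(clip_is_extendable(1920, 1080, 25, 8, 2))   # True
print(clip_is_extendable(3840, 2160, 25, 8, 2))   # False
```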

“We’re rapidly innovating and expanding its capabilities for professional use cases with user feedback. We want to hear how it’s working or not working for you.”

Adobe advises that editors can use unique identifiers known as ‘seeds’ to quickly iterate new variations without starting from scratch.

It suggests using as many words as possible to be specific about lighting, cinematography, colour grade, mood, and style. Users should avoid ambiguity in prompts by defining actions with specific verbs and adverbs. Using lots of descriptive adjectives is a plus, as is the use of temporal elements like time of day or weather. “Bring in camera movements as necessary,” Adobe advises. “Iterate!”
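
To make that advice and the seed workflow above concrete, here is a minimal sketch of one iteration loop. generate_video is a local stand-in defined purely for illustration, not Adobe’s actual API; only the prompt-writing guidance and the idea of reusing a seed come from the article.

```python
# Illustrative only: generate_video is a local stand-in, not Adobe's API.
def generate_video(prompt: str, seed: int) -> str:
    """Stand-in for a text-to-video call; real parameters are unknown."""
    return f"<clip seed={seed}: {prompt[:48]}...>"

# A prompt structured per Adobe's advice: specific lighting, cinematography,
# colour grade, mood, temporal elements, and an explicit camera movement.
base_prompt = (
    "Golden-hour light raking across a misty pine forest at dawn, "  # time, weather
    "shallow depth of field, warm filmic colour grade, quiet mood; "  # grade, mood
    "a deer slowly lifts its head; camera dollies forward gently"     # verbs, camera
)

seed = 814203  # reusing the same seed keeps each variation close to a liked take

# Iterate: change one descriptive element per pass while holding the seed fixed.
for weather in ("at dawn", "in light rain", "under heavy fog"):
    prompt = base_prompt.replace("at dawn", weather)
    print(generate_video(prompt, seed))
```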

Content authenticity

For all the focus on editing features, Adobe is insistent that its “responsible” approach to AI differentiates it from companies like OpenAI, where few if any guardrails on copyright infringement are acknowledged.

It claims Firefly is “the first generative video model designed to be safe for commercial use” and says this is what its customers want more than anything.

“Our community has told us loud and clear that they needed first and foremost to make sure the model is usable commercially, is trained responsibly and designed to minimise human bias,” Costin says.

“The Firefly model is only trained on Adobe Stock data, and this is data we have rights to train on. We don’t train on customer data and we don’t train on data we scrape from the internet. We only train on Adobe Stock and public domain data, which gives us the confidence and comfort to assure our customers that we cannot infringe IP.”

It is why Adobe offers its largest [Enterprise] customers indemnification. “We offer them protection from any potential IP infringement.”

Costin also points to the Content Authenticity Initiative (CAI), a cross-industry community of major media and technology companies co-founded by Adobe in 2019 to promote a new kind of tamper-evident metadata. Leica, Nikon, Qualcomm, Microsoft, Pixelstream, SmartFrame, the BBC and even OpenAI are among the 2,500 members.

“We’re getting more and more companies to join the consortium. All the assets that are generated or edited with GenAI in our products are tagged with content credentials. We offer visibility in how content is created so consumers can make good decisions on what to trust on the internet.”

Plus, Content Credentials can be included on export from Adobe Premiere Pro after using Generative Extend.
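
For readers who want to check those credentials themselves, Adobe’s open-source c2patool CLI (part of the CAI tooling) prints a file’s C2PA manifest when given a path. The sketch below simply shells out to it; the tool must be installed separately, the file name is hypothetical, and the exact JSON layout may vary by version.

```python
# Minimal sketch: inspect a file's Content Credentials with the open-source
# c2patool CLI (https://github.com/contentauth/c2patool, installed separately).
# Invoked with just a file path, c2patool prints the C2PA manifest store as JSON.
import json
import subprocess

def read_content_credentials(path: str) -> dict | None:
    """Return the parsed C2PA manifest store for `path`, or None if absent."""
    result = subprocess.run(
        ["c2patool", path],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return None  # no credentials found, or the tool reported an error
    return json.loads(result.stdout)

manifest = read_content_credentials("extended_sequence.mp4")  # hypothetical file
if manifest:
    # "active_manifest" names the most recent claim in the manifest store.
    print("Content Credentials present:", manifest.get("active_manifest"))
```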

“We’re also respecting principles of accountability and transparency. In terms of accountability, we have this feedback mechanism in Firefly where we ask for and act on customer feedback. This is what helps us design and improve the guard rails that minimise bias and harm and minimise the potential of defects for our model. We’re proud of the approach we took in building AI responsibly and we know this is a key differentiator that makes our models trusted and usable in real workflows.”
