Modern Creator Network
Simone Ferretti · YouTube · 15:28

How I Clone Winning Instagram Reels Using HeyGen & ElevenLabs

A 15-minute full guide: find outlier reels, clone your voice, generate your AI avatar, and ship content you never recorded.

Posted: 2 days ago
Duration: 15:28
Format: Tutorial, educational
Channel: Simone Ferretti
§ 01 · The Hook

The bait, then the rug-pull.

Simone Ferretti opens holding his phone to camera with the caption "I Have Cracked The System", then immediately unloads the proof: zero to 167,000 Instagram followers in fourteen months and at least $15K per month, from content he never personally records.

§ · Stated Promise

What the video promised.

Stated at 00:04: "I am going to show you the exact three step process that allow me to grow my brand new AI Instagram page from zero to 167000 followers in just fourteen months." Delivered at 13:30.
§ · Chapters

Where the time goes.

00:00-01:02

01 · Hook and Proof Stack

Phone-to-camera opener, rapid number escalation 0 to 167K and 15K per month, brand deal timeline. Waitlist CTA teased before the steps are even named.

01:02-02:22

02 · Three Steps Plus Reel Selection Warning

Names the 3-step system. Warning: cloning the wrong reel wastes all setup time. Manual method: fresh account, follow 5-10 niche creators, warm up the feed.

02:22-03:20

03 · The Outlier Test

Quantifies winner threshold: 20K avg vs 400K spike. Saves/shares over views. Filter: clone only info/tutorial content that survives creator-brand transplant.

03:20-04:28

04 · Discovery Page Gold Mine

Instagram discovery page as real-time signal of what works. Warm the account toward niche so the algorithm shows relevant winners.

04:28-05:33

05 · Picking the Example Reel

Live walkthrough of his own account. Identifies the AirLLM reel as the example. Shows split-screen format.

05:33-06:46

06 · Step 2 Voice Clone Philosophy

Core principle: input quality equals output quality. Record with emotion, clarity, quiet room, close mic. One-time setup affecting every future reel.

06:46-08:27

07 · ElevenLabs UI Walkthrough

Screen share: Voices > Create Voice > Professional Voice Clone. 5-minute sample minimum. TTS flow: paste script, select voice, pick model, adjust sliders, generate.

08:27-08:55

08 · Live Voice Generation Demo

Generates and plays back cloned voice live. Downloads the file for HeyGen upload.

08:55-10:57

09 · Step 3 HeyGen Setup Philosophy

Input quality repeated. Training video becomes every future reel. Good lighting, eye contact, emotion. If you sound robotic, HeyGen will be robotic.

10:57-13:30

10 · HeyGen Walkthrough Avatar Creation and Script Upload

Shows Avatars dashboard with multiple looks. Avatar creation: real camera not webcam, 2-3 min sweet spot, consent required. Uploads ElevenLabs audio as accent workaround. Selects Avatar 5, generates.

13:30-14:08

11 · Final Output Reveal

Completed AI reel plays back. Recommends captions and pacing via CapCut AI or Captions AI.

14:08-15:28

12 · Results Proof and Waitlist CTA

Full AirLLM reel shown: 130K views, 3.5K likes, Made Almost Entirely With AI. Zero on-camera time from Simone. Full-system waitlist pitched.

§ · Storyboard

Visual structure at a glance.

hook · hook phone to camera · 00:00
promise · steps named · 01:02
value · discovery page strategy · 03:20
value · step 2 transition · 05:33
value · ElevenLabs UI · 06:46
value · step 3 transition · 08:55
value · HeyGen avatar grid · 10:57
proof · output reveal · 13:30
cta · 130K views proof · 14:08
§ · Frameworks

Named ideas worth stealing.

02:22 · concept

The Outlier Test

Compare a reel's performance against that creator's baseline, not platform averages. A 20x spike is the winner signal. Cross-check saves/shares to confirm genuine engagement.

Steal for: Any niche research phase
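
The baseline comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, using the median as the "baseline" is my choice (the video only says "average"), the 10x cutoff is a placeholder rather than a number from the video, and the view counts are invented to match the 20K-versus-400K example.

```python
from statistics import median


def outlier_multiple(reel_views: int, other_reel_views: list[int]) -> float:
    """Return how many times a reel beat the creator's own typical reel."""
    baseline = median(other_reel_views)  # assumption: median as the baseline
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return reel_views / baseline


def is_outlier(reel_views: int, other_reel_views: list[int],
               threshold: float = 10.0) -> bool:
    """Flag a reel whose multiple meets a (placeholder) threshold."""
    return outlier_multiple(reel_views, other_reel_views) >= threshold


# Hypothetical creator whose reels sit around 20K views, plus one 400K spike.
typical = [18_000, 20_000, 22_000, 19_000, 21_000]
print(outlier_multiple(400_000, typical))  # 20.0
print(is_outlier(23_000, typical))         # False
```

The median is used here so that the spike itself, or one earlier hit, does not inflate the baseline; a plain mean would work for a quick manual check.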
06:00 · concept

Input Quality Equals Output Quality

The quality ceiling of your AI clone is set by the quality of your real recording. Bad mic equals bad clone. Good emotion equals believable avatar.

Steal for: Any AI-assisted content pipeline
01:02 · list

Three-Step AI Reel Cloning System

  1. Find a winning reel: outlier test plus saves filter
  2. Clone your voice: ElevenLabs Professional Voice Clone
  3. Recreate with an AI avatar: HeyGen plus cloned voice
Steal for: Direct template for AI content production
03:20 · concept

Account Warming for Discovery

Open a fresh account, follow 5-20 niche creators, engage to train the discovery algorithm. Discovery page becomes a live feed of niche winners.

Steal for: Research setup for any platform
§ · Quotables

Lines you could clip.

01:00
I generate over $15,000 a month as a minimum from content I never even record myself.
Self-contained hook, income claim plus paradox in one line · TikTok hook
06:00
Input quality directly affects output quality.
Memorable principle, universal application, repeatable across platforms · IG reel cold open
09:50
If you sound robotic, HeyGen will be robotic.
Tight causal warning, no setup needed, immediately actionable · IG reel cold open
02:43
A reel with 100,000 views and 5,000 saves is significantly more valuable than a reel with 500,000 views and 200 saves.
Counterintuitive stat comparison, concrete numbers · newsletter pull-quote
§ · Pacing

How they spent the runtime.

Hook length: 62s
Info density: high
Filler: 8%
§ · Resources Mentioned

Things they pointed at.

00:00 · tool · HeyGen
13:33 · tool · CapCut AI
14:08 · tool · AirLLM
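
The example reel describes AirLLM as loading a model "layer by layer from your hard drive" instead of all at once. The sketch below is a toy illustration of that idea only. It is not AirLLM's code or API: the function names (`save_layers`, `run_streaming`, `run_in_memory`) and the tiny 2x2 "layers" are hypothetical, chosen so the example runs with the standard library alone.

```python
import json
import tempfile
from pathlib import Path


def matvec(weights, vec):
    """Multiply a small matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def save_layers(layers, directory):
    """Write each layer's weights to its own file on disk."""
    paths = []
    for i, weights in enumerate(layers):
        path = Path(directory) / f"layer_{i}.json"
        path.write_text(json.dumps(weights))
        paths.append(path)
    return paths


def run_streaming(layer_paths, vec):
    """Apply layers in order, holding only one layer's weights at a time."""
    for path in layer_paths:
        weights = json.loads(path.read_text())  # load just this layer
        vec = matvec(weights, vec)
        del weights  # release it before the next layer is read
    return vec


def run_in_memory(layers, vec):
    """Reference version that keeps every layer in memory at once."""
    for weights in layers:
        vec = matvec(weights, vec)
    return vec


layers = [[[1, 0], [0, 2]], [[0, 1], [1, 0]], [[2, 0], [0, 1]]]
with tempfile.TemporaryDirectory() as tmp:
    streamed = run_streaming(save_layers(layers, tmp), [1, 1])

print(streamed == run_in_memory(layers, [1, 1]))  # True
```

The trade-off this makes visible is memory for disk reads: peak memory is roughly one layer rather than the whole model, at the cost of reading every layer from storage on each pass.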
§ · CTA Breakdown

How they asked for the click.

01:40 · link
Click the link below to join the wait list.

Teased in the hook block before steps are named. Repeated at the end with full system description. No subscribe push, all traffic goes to waitlist.

§ · The Script

Word for word.

Legend: HOOK = opening / re-engagement · CTA = the pitch · analogy
00:00 · HOOK · I've cracked a system that lets me clone any viral Instagram reel using HeyGen and ElevenLabs. And in this video, I'm going to show you the exact three step process that allow me to grow my brand new AI Instagram page from zero to 167,000 followers in just fourteen months, so you can do the same. And here's how it works. Step one, find a winning reel in your niche. Step two, you wanna clone your voice using ElevenLabs. Step three, you want to recreate the video of yourself with your AI avatar using HeyGen.
00:32 · HOOK · And I'm not exaggerating when I say discovering this simple process changed everything for me. Because once I figured this out, I was able to grow a brand new AI Instagram account from zero to 30,000 followers in ninety days and use that account to charge up to 5,000 reels for a single sponsored reel. I got monetized and I managed to get the first brand deals after just four months of opening the account. And now I generate over $15,000
01:00 · HOOK · CTA · a month as a minimum from content I never even record myself. I get invited to speak about this through events, workshops, companies, and it's just grown way more than I would have expected. And by the way, what I'm about to show you in this video is one piece of a much bigger system that I've built for myself and for my clients. The full version with the exact prompts, templates, tools, workflows, niche specific frameworks, basically, everything I've spent the last twelve months refining behind the scenes. If you want the deep dive, nitty gritty version of all of this, click the link below to join the wait list. Otherwise, what I'm about to share will still get you most of the way there. Okay. Step one, find a winning reel. Quick warning. If you clone the wrong reel, none of this will work. You're literally gonna waste hours setting up your voice and your avatar to recreate a piece of content that was never going to perform anyway. Here's a manual method I want you to start off with because there is an AI tool that can help that I use, but I do believe you need to recognize it with your own eyes first. One of the best ways is to actually open a brand new Instagram account and start following all the best people in your niche. So specifically, you want creators who are already speaking to the exact audience you want to reach. Find five to ten of these accounts and just follow them, like, look at a lot of their reels. But don't look for views only. Views sometimes can be misleading. You're looking for reels that dramatically overperformed
02:22 · compared to the rest of that creator's content. If the average reel gets, let's say, 20,000 views and one specific reel got 400,000 views, that's probably a winning reel. But don't stop there. Try to look at saves, shares, and comments. A reel with 100,000 views and 5,000 saves is significantly more valuable than, let's say, a reel with 500,000 views and 200 saves. So saves and shares tell the algorithm something is genuinely engaging.
02:49 · Then one more thing, what to avoid? Don't try to clone reels that depend entirely on the brand of the original creator. If the only reason a reel worked is because of who the person is, you cannot replicate that with AI. So try to look for reels where the value is in information, what they are showing, whether they're showing a tool, whether they're showing any sort of information that is not related to the person saying it. So, again, it can be tutorials, can be breakdowns, explainers, lists, frameworks, content that someone else could deliver
03:20 · and it would still hit. So once you've found three to five reels that meet all of these criteria, then you have your starting list. And just so that you understand, once you warm up the account, which means that you follow these people, you engage with these people, you scroll through your list. Also, in the discovery page of Instagram, you will start noticing that Instagram will try to give you similar reels to those. And that Instagram discovery page is a gold mine because Instagram is showing you what's already working. And most likely, if a piece of content is there, means it's probably an outlier for that specific creator. So try to warm up the account, follow at least ten, twenty people in that specific niche, look at their reels, and then also start looking at your discovery page on Instagram. This is very important. If you're looking at cats, if you're looking at hobbies, if you're looking at your friends, that discovery page will be misleading. Now for the purpose of this video, I'm gonna pick a reel that I created. In this case, for me, this reel was the one that actually outperformed anything else. So I just wanna show you with my account. Right? You can see that my average views is not super high, so we scroll down. And then we see one that has a big hit, 125,000
04:28 · views on this one right here as you can see. And then I can see there are some good ones, let's say 18K, maybe 23K. But then when I look at the number of comments and likes here, it's actually much bigger. So I go inside and I have a look at all the stats. If you look at on mobile, you will see a bit more than this. But overall, I can tell you that we have very good numbers. 3.5K likes in just five days with a much smaller account, let's say, from my main account. And then let's have a look at this one, for example. What if you can run a 70 billion parameter AI model without any supercomputer? So we see that is a split screen. We have something that is going on at the top, and then it talks about tech and AI because, obviously, that's what I do on this channel. And then there is me in the bottom area. Now let's assume that this was actually a human and you wanted to kind of replicate it with yourself. Well, this brings us to step number two, which is all about cloning your voice using one AI tool that I love that is called ElevenLabs. And you don't wanna skip this step because if the voice sounds robotic, fake, or off in any way, the viewer will think, well, great. That's more AI slop. I'm out. So how does Eleven
05:33 · actually work? Well, you give it a sample of your real voice, and it creates a digital version of that voice that can read any text you give it, sounding like you. Then you can also modify a bunch of sliders just to adjust the voice exaggeration, how similar you want it, and a lot of different things. You just need to try a few times, make sure that sounds like you, and then you're good to go. That digital voice is what gets paired with your AI avatar later in step three. So getting this right is nonnegotiable.
06:00 · And here's the principle I want you to keep in mind here. Input quality directly affects output quality. If you record your voice sample in a noisy room with a bad microphone, your AI voice will sound bad. On the other hand, if you record your voice sample in a quiet room with a clean microphone like this one, speaking clearly and naturally, but with a bit more emotion, your AI voice will sound shockingly real. This is a one time setup that affects every single piece of content you produce going forward. So it's absolutely important that you get it right. Now let me take you inside ElevenLabs to show you how you do it properly. So this is the main screen of ElevenLabs as it looks right now. What you need to do is that you need to go into voices, then you click on create voice. Here you have the possibility
06:46 · to create a professional voice clone or an instant voice clone. I would definitely recommend to go for professional voice clone. As you can see right here for me, it's not available because I already have two professionally cloned voices. So you're gonna click on this one, then you need to feed it five minutes of your voice. Once again, it's super, super important that you record it with a high quality microphone. And if you have just your iPhone, you wanna test it, make sure that your microphone is as close as possible to your mouth. Because the farther the microphone, the lower the quality that is going to be. So very close. Try to exaggerate. Try to simulate how you'd like to be perceived on social media. So even when I record myself, I don't usually speak like this in normal life. I'm a little bit more calm even though I am quite exaggerated as well in real life. But overall, when I try to record content, I try to give my real best. And once you have your professional voice cloned, what you need to do is that you're just gonna go into text to speech, and then here, you just need to write what you wanna say. If your script, let's say, is just copied from somebody else and then you just wanna reformulate a little bit with AI, you just take what the AI gave you, you paste it here, or if you wanna just write it manually, that is totally up to you. But after you pasted the script right here, then you click on generate speech after selecting the voice. So let me just do that. I'm gonna take that script that we created for that viral video. So I'm gonna take it from this. Let's say I'm gonna copy just the first two lines just so that we have it a little bit faster, then I paste it in here. I make sure that my voice is selected. So Simone Pro, this is my voice, and I wanna make sure that I have v2 as model in this case, but maybe when you're looking at it, there's gonna be a much better version afterwards. Depends on when you're looking at this.
Then you can change the speed, stability, similarity, and style exaggeration.
08:27 · As I said earlier, you wanna test these parameters because I cannot tell you use these exactly. All depends. It's very subjective. It all depends on you. So in this case, I know that these are the best settings for me. So what I'm gonna do, I'm just gonna click on generate speech. And now you just need to wait a few seconds while it generates, and now we can hear my voice. What if you can run a 70 billion parameter AI model without any supercomputer? What if you can run a 70 billion parameter AI model without any supercomputer?
08:55 · Sounds just like me. It's amazing. So what I'm gonna do right now is that I'm gonna download this. And once your voice is cloned and the output sounds clean for you, you're ready for the final step where everything comes together, which is step three, create the video with HeyGen. Just like your ElevenLabs voice recording, your HeyGen setup is a one time investment that determines the quality of every reel you ever produce. And I promise, people generally won't be able to tell it's AI if you really put effort in this. So HeyGen is an AI tool that takes a single recording session and uses it to generate new videos of you saying anything you want. You can combine it with the voice you clone in step number two to create a fully produced talking head with your avatar that is delivering the script in your voice. This system is what allows you to create unlimited content without ever recording on camera. Again, quality of input equals quality of output. I really keep coming back to this principle because it's really the foundation. It's the entire game. The video you record to train HeyGen needs to look as professional as possible. And I'm talking about good lighting, clean background, looking directly into the camera, try to be as natural as possible with a bit more emotions than usual. And sometimes, as I said, people freeze when you're talking to camera. This is totally normal. Just try to get comfortable and try to wear something that you'd be comfortable seeing in every reel you ever produce because that recording essentially becomes you. From that point forward, every AI version of you wears that outfit, sits in that environment, and speaks with that energy, which is actually the most important one because right now, HeyGen is coming up with amazing updates that let you change clothes, let you change environment. They work pretty well, but the energy thing, well, that does not change. If you sound robotic, HeyGen will be robotic.
So let's go inside HeyGen so that I can show you how to do this properly. So this is the landing page of HeyGen right now. And then if I go on avatars, I can see that I have two avatars right now. One is Elenia, which is my fiancee, and then the other one that I have right here is actually my avatar. So this one right here. You can see here that I have different looks. It's because I generated
10:57 · several looks, that's how they call it, in several environments just because I wanted to have variety. I created these already before the latest HeyGen launch. So right now, you can even potentially change the background, change the clothing. But in my opinion, I think the most realistic result will always be when you feed it a perfect video. So perfect lighting, great movements, great emotion, and then you just generate a clone of that rather than having HeyGen change your clothes or change the background. So this is really important. Right? So in this case, I already created my avatar. But if you don't have one, you just go into avatars, and then you'll have to create a new one. So you click on create avatar, and then here you wanna clone a real person. Now you just need to follow the instruction. So you will need to upload a footage. Don't do it with your webcam because that will look horrible. Either do it with your phone if you don't have a camera, but if you have a camera, do it with your camera. And as I said, just try to be a little bit more emotional than usual. Look directly into the camera and pause after each sentence. Then you'll also need to give authorization to HeyGen, which is one of the best part about this platform because you cannot clone other people. So it needs to be yourself that you give authorization to clone it. And then you just need to follow the instruction, and you can even give it a fifteen second video. My suggestion is that you give it a two to three minute video. That's, I think, the sweet spot. Fifteen seconds will not be able to pick up among your emotions, among all your different words. Then once you have created your avatar, if you created one only, you're gonna see one right here, but then you also can create different ones, different looks. Potentially, you can change clothes, or you can test also the tools inside HeyGen in case you wanna try to change your environment.
But in this case, for example, this is one of my favorite avatars. I'm gonna go in, and I'm gonna use NVIDIA. And we're gonna click on add a script. Here, you can add a script based on text. So if you just wanna copy and paste the text here, but HeyGen is not the best at cloning the voice. That's why I went to ElevenLabs. You can try if you are an English native speaker, you can do it, and I think we'll be fine. But for me, that I have an accent, it didn't really work well, so I prefer to use ElevenLabs. Or you can upload the voice. So in this case, we can just upload the file that we just downloaded right now with ElevenLabs. So I'm taking this one, then we upload it, and that's it. Once you're ready, you can select one ATP, the avatar that you wanna choose. So either avatar five or avatar four as of right now. Avatar five, I would say, enhances emotions quite a lot, so it kinda, like, moves the gesture. I was actually talking to the HeyGen team before, and maybe they're gonna dial it down with the next avatar. So, obviously, depending on when you're looking at this video, try avatar four or avatar five. I'm just gonna keep avatar five and then making sure that I have the avatar selected here, and then we just click on generate.
13:31 · CTA · And once it's done, this is the final output that we got. Instead of loading a massive model at once, it loads it layer by layer from your hard drive. And just like that, you have an AI avatar clone. So from there, you just need to add captions, adjust the pacing if needed. If you want to have any screen recording, you can drop in some background music, or you can even use CapCut AI or Captions AI to edit the video with AI. If you wanna nail down the edit, I would highly suggest that you hire an editor or you use one of these AI apps in case you are on a budget. Now let's look at the reel start to finish again. What if you can run a 70 billion parameter AI model without any supercomputer?
14:08 · CTA · It's now possible with an open source Python library called AirLLM. Instead of loading a massive model at once, it loads it layer by layer from your hard drive, just like reading a book page by page instead of memorizing the whole thing first. There's also a feature called flash attention, which keeps memory usage almost flat even with long inputs. With it, models like Llama 3.3 70 billion
14:31 · CTA · parameter can run directly on your MacBook or gaming computer. For students, researchers, and indie developers, this completely changes the game. Comment air, and I'll send you the link. And that's it. A 130,000 views, 3.5K likes, such a winning concept without me needing to record in front of the camera. So this has been created completely by my team without me doing anything. Now what I just walked you through is one piece of a much bigger system I've built behind the scenes. The full version includes deep dives into avatar creations and Gen AI, so both making videos, making images with your face in any scenario,
15:07 · CTA · finding scripts, finding the right ideas with AI tools, templates I use for thumbnails and captions, the niche specific frameworks I've refined over the last fourteen months working fully with AI, and the workflows I used to write everything at scale. If you want the deep dive version of all of this with everything I couldn't fit inside this video, click the link in the description. I'll see you in
§ · For Joe

The whole game is the input recording.

AI content production playbook

Ferretti's system works because he treats the one real recording session as the master asset: everything downstream is an infinite derivative of that single quality investment.

  • The outlier test is the real alpha: saves/shares over views, baseline comparison over absolute numbers. Run this before touching any AI tool.
  • Fresh account plus niche warming equals free trend research. The discovery page shows what the algorithm is already distributing.
  • ElevenLabs Professional Voice Clone, not Instant, is the quality gate. 5-minute sample, quiet room, close mic, extra emotion. Record once, use forever.
  • The ElevenLabs to HeyGen handoff is the unlock: generate audio in ElevenLabs, upload to HeyGen as a voice file. Bypasses weaker TTS, critical if you have an accent.
  • Avatar 5 enhances gestures. Test both Avatar 4 and 5 and pick per-reel as versions evolve.
  • The "Made Almost Entirely With AI" label on the proof reel is itself a hook mechanic that drives shares.
  • This is a one-person media company model: the system separates ideation from execution. The trained assets do the work.
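
For anyone who would rather script the ElevenLabs half of that handoff than click through the web UI, here is a minimal sketch. The video only shows the UI, so everything here is an assumption on my part: the endpoint, the `xi-api-key` header, and the `voice_settings` fields follow ElevenLabs' public REST API as I understand it, the model ID `eleven_multilingual_v2` is my guess at what "v2" refers to, and the slider values, key, and voice ID are placeholders. Check the current API reference before relying on any of it.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(api_key, voice_id, script,
                      model_id="eleven_multilingual_v2",
                      stability=0.5, similarity_boost=0.75, style=0.0):
    """Build (but do not send) a text-to-speech request for a cloned voice."""
    payload = {
        "text": script,
        "model_id": model_id,
        "voice_settings": {  # placeholder values; tune per voice
            "stability": stability,
            "similarity_boost": similarity_boost,
            "style": style,
        },
    }
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(request, out_path):
    """Send the request and save the returned audio bytes (network call)."""
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())


request = build_tts_request(
    "YOUR_API_KEY",   # placeholder
    "YOUR_VOICE_ID",  # placeholder: the ID of your Professional Voice Clone
    "What if you can run a 70 billion parameter AI model "
    "without any supercomputer?",
)
# With real credentials, the next line would write the audio file
# that then gets uploaded to HeyGen as the voice track:
# synthesize_to_file(request, "reel_voice.mp3")
```

Building the request separately from sending it keeps the settings easy to inspect and tweak, which matches the video's advice to test the sliders until the voice sounds right.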
§ · For You

What you can actually do with this today.

If you want to build an AI content presence

You do not need to be on camera to build a following: you need one good recording session and a clear niche.

  • Open a fresh Instagram account, follow 10-20 creators in your target niche, and spend a week warming the discovery feed before touching any AI tool.
  • When evaluating reels to clone, look at saves and shares, not view count. 5,000 saves on 100K views outperforms 200 saves on 500K views.
  • Only clone content where the value is in the information, not the personality. Tutorials and frameworks transplant. Reaction content does not.
  • ElevenLabs Professional Voice Clone needs about 5 minutes of clean audio. Record in a quiet room, mic close, with more energy than normal. This sets your quality ceiling forever.
  • HeyGen avatar training: use a real camera, not a webcam, record 2-3 minutes, look directly into the lens. The energy in that session is the energy in every reel.
  • Once your voice and avatar are trained, producing new reels requires no additional on-camera time. The input cost is front-loaded; the output is unlimited.
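
The saves-versus-views comparison in the list above comes down to a save rate. A small sketch, using only the figures quoted in the video; the helper name `save_rate` is hypothetical, and treating saves divided by views as the metric is my reading of the advice, not a formula the video states.

```python
def save_rate(saves: int, views: int) -> float:
    """Fraction of views that ended in a save."""
    if views <= 0:
        raise ValueError("views must be positive")
    return saves / views


reel_a = save_rate(5_000, 100_000)  # the 100K-view reel
reel_b = save_rate(200, 500_000)    # the 500K-view reel

print(f"{reel_a:.2%} vs {reel_b:.2%}")  # 5.00% vs 0.04%
```

On these numbers the smaller reel is saved at roughly 125 times the rate of the larger one, which is the point of the comparison.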