How to Build Links Through ScrapeBox Using White Hat Techniques

The words “ScrapeBox” and “white hat SEO” may not be found together very often, but they should be. In this video I’ll walk you through a number of actionable strategies that you can use to leverage ScrapeBox for more guest posting opportunities, grab contact information and delete duplicate targets from your SEO campaigns.

Video Transcript

Hey, what’s up everybody, it’s Brian Dean from Quick Sprout. In this video I’m going to show you how to use ScrapeBox to make your link prospecting much more efficient, including some of the advanced features found within ScrapeBox.

The first thing I want to go over is the harvester area of ScrapeBox. This is the most important area of the program, because it’s where you enter the search strings that ScrapeBox will use to find results in Google.

The top is your search operator footprint area. You can enter one yourself, or press this button here to get a list of ideas. For example, if you want to limit your results to only edu sites, you would choose this option here. For every keyword that you add – like marketing, SEO, advertising or blogging – ScrapeBox will put that operator in front of the keyword. That’s an efficient way to add maybe hundreds of keywords and limit them all to edu results.

Now, you don’t have to use a footprint or an operator at the top. You could do this manually. You could limit the results for this keyword, marketing, to edu sites, while using the SEO keyword to search for guest posting opportunities, and so on. A search operator at the top applies to every keyword, so if you want to search for several different link opportunities at once, add specific search strings here and leave the operator field blank.
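Under the hood, the footprint feature is just prepending one operator to every keyword. Here’s a minimal Python sketch of that idea, using the example operator and keywords from the video (the function name is my own, not ScrapeBox’s):

```python
# Sketch of what the footprint feature does: prepend one search
# operator to every keyword in the list, producing one query each.

def build_search_strings(footprint, keywords):
    """Combine an operator footprint with each keyword into one query."""
    return [f'{footprint} "{kw}"' for kw in keywords]

keywords = ["marketing", "SEO", "advertising", "blogging"]
for query in build_search_strings("site:.edu", keywords):
    print(query)  # e.g. site:.edu "marketing"
```

Leaving the footprint blank and writing full search strings per keyword is the manual equivalent of skipping this step.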

Next, I’m going to go over the select engines and proxies area of ScrapeBox. This is really important, because the settings here can make or break your scrape. The first thing you want to do is choose the search engines that you want to scrape from. Obviously, you want to include Google as part of the list. You can also include Yahoo or Bing if you want to get more results. I typically don’t choose AOL because their results are actually identical to Google’s most of the time.

Under results, that’s where you decide how deep you want your scrape to go. If you just want to scrape the first page of results you’d limit that to 10. If you want to go to the fifteenth page you could set that to 150, and so on.
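The arithmetic behind that setting is simply results-per-page times pages, with Google showing 10 results per page:

```python
# The results depth is just pages-deep times results-per-page
# (10 per page on Google).

def results_depth(pages, per_page=10):
    """Total results to request for a given number of pages."""
    return pages * per_page

print(results_depth(1))   # 10, first page only
print(results_depth(15))  # 150, through the fifteenth page
```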

Next, I’m going to go over the use of proxies, which is very important, because if you don’t use proxies with ScrapeBox, eventually Google will blacklist your IP address and you won’t be able to access Google any more. So, whenever you do a scrape you want to make sure the use proxies box is checked. These are actual proxies right here. Don’t be intimidated. It looks kind of technical, but it’s very easy to set up and test your proxies within ScrapeBox. That’s what I’m going to show you how to do right now.

Okay. So, I’m logged into my paid proxy account. Although every provider’s dashboard is set up a bit differently, they all show you the same thing: the IP addresses of the proxies you can currently use. What you want to do is copy the list from this box, open up ScrapeBox, and paste it into this little box here. That’s it. Easy, right?

Once you have your proxies in place you want to test them before you use them, because if you use proxies that don’t work, ScrapeBox doesn’t just move on to the next working proxy in your list. It just won’t scrape. It basically breaks the program.

So, you want to make sure that 100 percent of your proxies are working before you run ScrapeBox. To do that click on manage. Then, click on test proxies. Then, click on test all proxies or test untested proxies if you’ve already tested this batch. I’m going to test all proxies because I haven’t tested these yet. After the program is finished you’ll see something like this, a mix of red and green proxies. Red proxies don’t work. Green proxies do work.

What you want to do is click on filter and choose keep Google proxies. These are proxies that aren’t blacklisted by Google and they’re functioning. As you can see, they’re all green which means we’re good to go.
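The tester’s keep-or-discard logic can be sketched in a few lines of Python. This is an illustration, not ScrapeBox’s actual code: the real tester makes a request through each proxy, so here the check is injected as a function and the proxy list is made up, which keeps the example runnable offline:

```python
# Sketch of the proxy tester's keep/discard logic: each proxy is
# checked and sorted into a working (green) or dead (red) list.
# The is_working check is injected so this runs without network access.

def filter_working_proxies(proxies, is_working):
    """Split proxies into working (green) and dead (red) lists."""
    green, red = [], []
    for proxy in proxies:
        (green if is_working(proxy) else red).append(proxy)
    return green, red

# Hypothetical proxies in the common ip:port format.
proxies = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:3128"]

# Stand-in checker: pretend only the port-8080 proxies respond.
green, red = filter_working_proxies(proxies, lambda p: p.endswith(":8080"))
print("green:", green)
print("red:", red)
```

The keep Google proxies filter is the same idea with a stricter check: the proxy must work *and* not be blacklisted by Google.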

Your next step is to copy these proxies into ScrapeBox. To do that click on save, then click on save proxies to ScrapeBox, and then close. All the proxies we determined were Google friendly and functioning are now in this box.

Now it’s time to actually start harvesting. To do that simply click on the start harvesting button and ScrapeBox will do its thing. After a while you’ll get a message that says the harvester has completed, so click okay.

ScrapeBox will tell you how the scrape went. If it’s all green then everything went well, but if you see one of them in red, that means ScrapeBox wasn’t able to scrape all the results for that keyword – typically because the proxies weren’t working. Click on close, and now take a look at your results.

All of these URLs – in this case there are 600 – are URLs that ScrapeBox scraped based on your search strings. You could just go one by one down the list and open them in your browser, but it’s typically much easier to do some prospecting within ScrapeBox to make this list a little better.

The first thing you want to do is remove any duplicate results that might come up. To do that click on remove/filter, click on remove duplicate URLs, and then click yes. In this case it only found one duplicate URL. Then, if you only want to reach out to each site once – that is, you don’t want multiple results from the same site – click on remove/filter and choose remove duplicate domains. So, if there are two results from Quick Sprout on the list, only one will remain. Click on that, then click yes.

In this case it removed 99 domains. It’s actually very common within ScrapeBox to find multiple results from the same site, and you don’t want to reach out to the same person twice.
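If you ever need to replicate those two deduplication steps outside ScrapeBox, for example on an exported list, the logic is straightforward. A minimal sketch with hypothetical URLs:

```python
# Sketch of the two dedup steps: drop exact duplicate URLs first,
# then keep only the first URL seen from each domain.

from urllib.parse import urlparse

def remove_duplicate_urls(urls):
    """Drop exact duplicates while preserving order."""
    return list(dict.fromkeys(urls))

def remove_duplicate_domains(urls):
    """Keep only the first URL seen from each domain."""
    seen, kept = set(), []
    for url in urls:
        domain = urlparse(url).netloc
        if domain not in seen:
            seen.add(domain)
            kept.append(url)
    return kept

urls = [
    "https://www.quicksprout.com/blog/post-1",
    "https://www.quicksprout.com/blog/post-2",
    "https://example.edu/resources",
    "https://example.edu/resources",
]
urls = remove_duplicate_urls(urls)     # drops the repeated edu URL
urls = remove_duplicate_domains(urls)  # keeps one Quick Sprout result
print(urls)
```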

The next thing you want to do is check the page rank. This is important for figuring out which sites on the list are quality. There are two ways to go about this. You can get the URL page rank, which is the page rank of each individual URL. This is helpful if you know ahead of time which page you’re going to get your link on, for example, a resource page or a scholarship page.

But, if you’re just looking for sites that you’ll get a link from somewhere down the line, like a guest post, you probably want to check the domain page rank, which is the page rank of the site’s home page. To do that just click on get domain page rank, and ScrapeBox will tell you when it’s done.

In this case we got the page rank for 500 URLs in 3 minutes. Not too bad. What you want to do here is sort by page rank. Click on this, and your results will be sorted with the lowest page rank at the top. You probably want to click it again so the highest page rank is at the top. This is helpful if you’re going to be doing a lot of outreach and want to limit your results to sites with, let’s say, a page rank of three or above. This makes it very easy, because you can just take all the results below three – the twos and ones – and delete them or simply not reach out to them.
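That sort-and-threshold step is easy to reproduce on an exported list. A sketch with made-up URL and page rank pairs:

```python
# Sketch of the sort-and-threshold step: sort (url, page_rank) pairs
# highest-PR first, then keep only those at the minimum PR or above.

def filter_by_pagerank(results, minimum=3):
    """Sort results highest-PR first and drop those below minimum."""
    ranked = sorted(results, key=lambda pair: pair[1], reverse=True)
    return [(url, pr) for url, pr in ranked if pr >= minimum]

results = [
    ("https://example.edu/resources", 5),
    ("https://example.com/blog", 2),
    ("https://example.org/links", 3),
    ("https://example.net/page", 1),
]
print(filter_by_pagerank(results))  # keeps only the PR 5 and PR 3 pages
```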

Finally, you want to export the list so it’s easy to work with. To do that click on import/export URLs and PR, and then choose whatever file format you want to export the list as. When you do that you’ll have a spreadsheet showing each URL and its page rank. Then you can add fields to keep track of who you’ve reached out to.
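If you’d rather build that spreadsheet yourself, the export boils down to writing a CSV with a URL column, a page rank column, and an empty tracking column. The file name and data below are hypothetical:

```python
# Sketch of the export step: write (url, page_rank) pairs to a CSV
# you can open as a spreadsheet, with an extra column for tracking
# who you've contacted.

import csv

def export_prospects(results, path):
    """Write URL/page-rank pairs plus an empty outreach column to CSV."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["URL", "PageRank", "Contacted"])
        for url, pr in results:
            writer.writerow([url, pr, ""])

export_prospects(
    [("https://example.edu/resources", 5), ("https://example.org/links", 3)],
    "prospects.csv",
)
```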

That’s all there is to white hat link prospecting using ScrapeBox. As you can see, it’s a little tricky to get started, but once you get the hang of it, it’s actually a very easy-to-use and powerful tool.

Thanks for watching this video, and I’ll see you in the next one.