
New Site Crawl: Rebuilt to Find More Issues on More Pages, Faster Than Ever!

Posted by Dr-Pete

First, the good news: as of today, all Moz Pro customers have access to the new version of Site Crawl, our entirely rebuilt deep site crawler and technical SEO auditing platform. The bad news? There isn't any. It's bigger, better, faster, and you won't pay an extra dime for it.

A moment of humility, though — if you've used our existing site crawl, you know it hasn't always lived up to your expectations. Truth is, it hasn't lived up to ours, either. Over a year ago, we set out to rebuild the back end crawler, but we realized quickly that what we wanted was an entirely re-imagined crawler, front and back, with the best features we could offer. Today, we launch the first version of that new crawler.

Code name: Aardwolf

The back end is entirely new. Our completely rebuilt "Aardwolf" engine crawls twice as fast, while digging much deeper. For larger accounts, it can support up to ten parallel crawlers, for actual speeds of up to 20X the old crawler. Aardwolf also fully supports SNI sites (including Cloudflare), correcting a major shortcoming of our old crawler.

View/search *all* URLs

One major limitation of our old crawler was that you could only see pages with known issues. Click on "All Crawled Pages" in the new crawler, and you'll be brought to a list of every URL we crawled on your site during the last crawl cycle:

You can sort this list by status code, total issues, Page Authority (PA), or crawl depth. You can also filter by URL, status codes, or whether or not the page has known issues. For example, let's say I just wanted to see all of the pages crawled for Moz.com in the "/blog" directory...

I just click the [+], select "URL," enter "/blog," and I'm on my way.

Do you prefer to slice and dice the data on your own? You can export your entire crawl to CSV, with additional data including per-page fetch times and redirect targets.
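If you do take the CSV route, the same "/blog" filter and issue-count sort can be reproduced in a few lines. This is only a minimal sketch: the column names (`URL`, `Status Code`, `Issue Count`, `Crawl Depth`) and the sample rows are assumptions for illustration, and the headers in a real export may differ.

```python
import csv
import io

# Hypothetical export layout: real Site Crawl CSV headers may differ.
SAMPLE_EXPORT = """URL,Status Code,Issue Count,Crawl Depth
https://moz.com/,200,0,0
https://moz.com/blog,200,1,1
https://moz.com/blog/old-post,301,2,3
https://moz.com/products,200,0,1
https://moz.com/blog/missing,404,1,2
"""


def filter_by_url(rows, fragment):
    """Keep rows whose URL contains the fragment, like the [+] URL filter."""
    return [row for row in rows if fragment in row["URL"]]


def sort_by_issues(rows):
    """Sort rows by issue count, highest first (ties keep their input order)."""
    return sorted(rows, key=lambda row: int(row["Issue Count"]), reverse=True)


rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
blog_rows = sort_by_issues(filter_by_url(rows, "/blog"))

for row in blog_rows:
    print(row["Status Code"], row["Issue Count"], row["URL"])
```

To run this against a real export, swap `io.StringIO(SAMPLE_EXPORT)` for an open file handle and adjust the column names to match the file.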

Recrawl your site immediately

Sometimes, you just can't wait a week for a new crawl. Maybe you relaunched your site or made major changes, and you have to know quickly if those changes are working. No problem, just click "Recrawl my site" from the top of any page in the Site Crawl section, and you'll be on your way...

Starting at our Medium tier, you’ll get 10 recrawls per month, in addition to your automatic weekly crawls. When the stakes are high or you're under tight deadlines for client reviews, we understand that waiting just isn't an option. Recrawl allows you to verify that your fixes were successful and refresh your crawl report.

Ignore individual issues

As many customers have reminded us over the years, technical SEO is not a one-size-fits-all task, and what's critical for one site is barely a nuisance for another. For example, let's say I don't care about a handful of overly dynamic URLs (for many sites, it's a minor issue). With the new Site Crawl, I can just select those issues and then "Ignore" them (see the green arrow for location):

If you make a mistake, no worries — you can manage and restore ignored issues. We'll also keep tracking any new issues that pop up over time. Just because you don't care about something today doesn't mean you won't need to know about it a month from now.

Fix duplicate content

Under "Content Issues," we've launched an entirely new duplicate content detection engine and a better, cleaner UI for navigating that content. Duplicate content is now automatically clustered, and we do our best to consistently detect the "parent" page. Here's a sample from Moz.com:

You can view duplicates by the total number of affected pages, PA, and crawl depth, and you can filter by URL. Click on the arrow (far-right column) for all of the pages in the cluster (shown in the screenshot). Click anywhere in the current table row to get a full profile, including the source page we found that link on.
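To make the idea of clustering duplicates under a "parent" page concrete, here is a rough sketch. It is not how Site Crawl detects duplicates (the post doesn't describe that); it uses plain exact-match hashing of normalized text, a much simpler technique, and the parent heuristic (highest PA, then shallowest crawl depth) is an assumption made up for this example, as are the sample pages.

```python
import hashlib
from collections import defaultdict


def fingerprint(text):
    """Hash case- and whitespace-normalized text so trivial variants match."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cluster_duplicates(pages):
    """Group pages by fingerprint and keep groups with at least two pages."""
    groups = defaultdict(list)
    for page in pages:
        groups[fingerprint(page["text"])].append(page)

    clusters = []
    for members in groups.values():
        if len(members) < 2:
            continue
        # Assumed heuristic: the parent has the highest PA, and the
        # shallowest crawl depth breaks ties.
        ordered = sorted(members, key=lambda p: (-p["pa"], p["depth"]))
        clusters.append({
            "parent": ordered[0]["url"],
            "duplicates": [p["url"] for p in ordered[1:]],
        })
    # Largest clusters first, similar to sorting by affected pages.
    return sorted(clusters, key=lambda c: len(c["duplicates"]), reverse=True)


# Hypothetical sample pages, for illustration only.
pages = [
    {"url": "/products", "text": "Our  Products", "pa": 50, "depth": 1},
    {"url": "/products?ref=nav", "text": "our products", "pa": 20, "depth": 2},
    {"url": "/products/index", "text": "Our products ", "pa": 20, "depth": 3},
    {"url": "/about", "text": "About us", "pa": 40, "depth": 1},
]
clusters = cluster_duplicates(pages)
print(clusters)
```

Exact hashing only catches copies that are identical after normalization; near-duplicate pages would need a similarity measure instead.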

Prioritize quickly & tactically

Prioritizing technical SEO problems requires deep knowledge of a site. In the past, in the interest of simplicity, I fear that we've misled some of you. We attempted to give every issue a set priority (high, medium, or low), when the difficult reality is that what's a major problem on one site may be deliberate and useful on another.

With the new Site Crawl, we decided to categorize crawl issues tactically, using five buckets:

  • Critical Crawler Issues
  • Crawler Warnings
  • Redirect Issues
  • Metadata Issues
  • Content Issues

Hopefully, you can already guess what some of these contain. Critical Crawler Issues still reflect issues that matter first to most sites, such as 5XX errors and redirects to 404s. Crawler Warnings represent issues that might be very important for some sites, but require more context, such as meta NOINDEX.

Prioritization often depends on scope, too. All else being equal, one 500 error may be more important than one duplicate page, but 10,000 duplicate pages is a different matter. Go to the bottom of the Site Crawl Overview Page, and we've attempted to balance priority and scope to target your top three issues to fix:

Moving forward, we're going to be launching more intelligent prioritization, including grouping issues by folder and adding data visualization of your known issues. Prioritization is a difficult task and one we haven't helped you do as well as we could. We're going to do our best to change that.
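The trade-off between priority and scope described above can be illustrated with a toy scoring function. To be clear, this is not the formula Site Crawl uses; the category weights, the issue names other than those mentioned in this post, and the page counts are all hypothetical. The sketch simply multiplies a per-category weight by the number of affected pages and keeps the top three.

```python
# Hypothetical severity weights per category; illustrative numbers only.
CATEGORY_WEIGHT = {
    "Critical Crawler Issues": 10.0,
    "Crawler Warnings": 4.0,
    "Redirect Issues": 3.0,
    "Metadata Issues": 2.0,
    "Content Issues": 1.0,
}


def top_issues(issues, limit=3):
    """Rank issues by category weight times the number of affected pages."""
    scored = [
        (CATEGORY_WEIGHT[issue["category"]] * issue["pages"], issue["name"])
        for issue in issues
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:limit]]


# Made-up sample data: one site's issues and how many pages each affects.
issues = [
    {"name": "5XX error", "category": "Critical Crawler Issues", "pages": 5},
    {"name": "Duplicate content", "category": "Content Issues", "pages": 10000},
    {"name": "Missing description", "category": "Metadata Issues", "pages": 40},
    {"name": "Redirect chain", "category": "Redirect Issues", "pages": 12},
    {"name": "Meta NOINDEX", "category": "Crawler Warnings", "pages": 2},
]
top_three = top_issues(issues)
print(top_three)
```

With these numbers, a single 5XX error outweighs a single duplicate page (10 versus 1), but 10,000 duplicate pages outrank everything else, which is the kind of balance the Overview page is aiming for.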

Dive in & tell us what you think!

All existing customers should have access to the new Site Crawl as of earlier this morning. Even better, we've been crawling existing campaigns with the Aardwolf engine for a couple of weeks, so you'll have history available from day one! Stay tuned for a blog post tomorrow on effectively prioritizing Site Crawl issues, and a webinar on Friday at 9am Pacific.


