Technical SEO Training – Lesson 1
Technical SEO is an invaluable asset in increasing your site’s visibility and, as you will find out throughout this course, it’s just as important as creating fresh, useful content. It is also a great tool for ensuring your site is accessible to both search engines and users, which is vital for getting users to the site in the first place.
What You Will Learn in This Lesson
Hello and welcome to the first part of SEO in Motion’s technical SEO training, a weekly series to help readers understand what technical SEO is and how to implement it. In this lesson, we’re going to go over the basic principles of technical SEO, what it entails, and a brief explanation of the elements involved. By the end of this lesson, you should have a good understanding of what technical SEO is and why it matters.
- What is technical SEO?
- Why is it important?
- Crawling and Indexing
- How to understand an SEO crawl
- Common response codes
- Core Web Vitals
What is Technical SEO?
I’m sure you’ve heard lots about it, and you can be forgiven if you’ve found it a treacherous maze of a subject, as there can be lots of different niches to this pillar of SEO. To put it simply, technical SEO is the process of optimising your website to make it more visible to search engines and easier to crawl and index. This includes optimising your website’s structure, speed, accessibility and more to improve its search engine rankings and visibility.
It’s important because it helps search engines, like Google, understand your site’s content and how it can serve users. By optimising your website’s technical aspects, you not only improve Google’s understanding of your website but also improve users’ understanding of the products and services you have to offer. You also make it easier for them to navigate the site, which can help push up your conversion rate.
Crawling and Indexing
What is Crawling?
I’ve mentioned crawling a few times now but what is it?
Everything you have seen on a search engine results page is a product of crawling. It’s simply the method that search engines, like Google and Bing, use to ‘read’ a page on a website. Now, Google and other search engines don’t read a page like you and I do. Instead, they use bots to crawl and render the content on your URL, downloading HTML, images and video.
What is Indexing?
So, search engines visit your page and get the content from it. What next? Well, it’s time for indexation. Indexing is the process of storing your content in the search engine’s database (its index), whether that’s Google, Bing or DuckDuckGo. It’s not a guaranteed process and it doesn’t necessarily mean that your page will rank for relevant queries, as you’re probably aware. But if the content isn’t getting indexed, it can never rank, and that is why crawling and indexing form the very basis of technical SEO.
The Crawling and Indexing Process
The image below is an illustration of a flow chart that demonstrates what happens when you submit a page for indexing.
It’s worth making a couple of notes here. Firstly, a site requests that a page is crawled by creating a followed link to it, adding the URL to the sitemap or using the manual inspection or instant indexing tools that search engines provide. Any URL with a 200 response code is eligible for indexing.
You may also notice that there are two possible outcomes in the flow chart above. The first is discovered – not crawled. This is a status in Google’s page coverage report (shown in Search Console as ‘Discovered – currently not indexed’). It signifies that Google is aware of these pages but they are either waiting in a queue to be crawled or another issue is blocking them from being crawled. This may be down to poor internal linking, for example.
The other is crawled – not indexed (shown in Search Console as ‘Crawled – currently not indexed’). This usually signifies poor content or structural quality and may even be down to other technical issues on site. Don’t worry if you’re finding this hard to grasp at first, as we’ll be doing a deep dive into Google Search Console in one of our lessons.
Factors that Influence Indexing
Crawling, downloading and rendering content on websites is not free for search engines. In fact, there’s heavy investment in servers to enable this process. This is why search engines monitor the resources they use to index websites and prioritise accordingly. In Google’s case, this is based on two factors: content and technical SEO. Below is a brief summary of the factors that help ensure that content gets indexed.
- Content quality: High-quality, unique content with authority is more likely to get indexed by search engines. Following Google’s E-E-A-T guidelines will work for most search engines. You may also find my roundup on Google’s Helpful Content update useful.
- Having popular content that is updated regularly is also a positive signal for getting more of your site crawled and indexed.
- If you’re targeting Bing (and not Google), social profiles are used to determine authority and popularity.
- Page speed: Being able to access content quickly is also a positive signal to search engines. Google has created specific metrics for this purpose, called Core Web Vitals.
- Mobile responsiveness: Ensuring pages can be used and displayed correctly on mobile devices.
- Website structure: Using internal links to prioritise pages, link to related content and create a logical structure.
- Sitemap: A sitemap will contain all the pages you would ideally want to be indexed.
The XML Sitemap
Your sitemap does not have to be in XML format, as search engines will read and process HTML sitemaps, but XML files are specifically designed to hold data and display it in a hierarchical format. This makes them the best tool for the job. If you’re not sure what an XML sitemap is, you can go to almost any website’s homepage and add ‘/sitemap.xml’ on the end to see one. A sitemap doesn’t have to be at this URL address, however, so don’t be disappointed if it doesn’t work on the first site you try. Just try another, or look up the site’s robots.txt file to find the sitemap addresses.
The sitemap informs search engines of every page you want to index on-site and can provide information, such as priority, when a URL was last modified and how often the content is updated. Because it’s inevitable that not every page of your site will get indexed, it’s an invaluable tool in communicating what your most important pages are to search engines.
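To make the format a little more concrete, here is a minimal Python sketch that builds a small XML sitemap using only the standard library. The URLs, dates and values are made-up placeholders and the function name `build_sitemap` is my own, so treat it as an illustration of the structure rather than how any particular CMS or plugin does it.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Return a sitemap XML string for a list of page dictionaries."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # Optional hints: last modified date, update frequency and priority.
        for field in ("lastmod", "changefreq", "priority"):
            if field in page:
                ET.SubElement(url, field).text = str(page[field])
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical pages, for illustration only.
pages = [
    {"loc": "https://www.example.com/", "lastmod": "2024-01-15",
     "changefreq": "weekly", "priority": 1.0},
    {"loc": "https://www.example.com/blog/", "lastmod": "2024-01-10"},
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Only the `loc` entry is required for each URL. The other fields are optional hints, and search engines are free to ignore them.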
How to Create a Sitemap
If you’re worried about creating a sitemap, the good news is that there are hundreds of tools to do this already and, depending on your CMS (content management system), one might already be getting created automatically. Some of the popular CMSs that include autogenerated sitemaps are Shopify, Wix and Squarespace. Other CMSs, like WordPress, have plugins that generate the sitemap for you, such as Rank Math or Yoast SEO.
There are so many CMSs to choose from that checking how your chosen CMS populates a sitemap should be one of your priorities.
What is Robots.txt?
The robots.txt is a file that tells search engine bots which pages or content they may crawl on your website. It’s important to ensure that your robots.txt file is properly configured to allow search engines to crawl your content. It can also prevent Google from crawling parts of your site, but this won’t prevent indexing on its own.
In addition, the robots.txt will also list the sitemaps.
It’s useful to note that Google will obey the allow and disallow rules in your robots.txt, whether that’s to allow crawling or to disallow crawling of a certain subfolder. Not every directive is followed by every search engine, though. A crawl rate set with crawl-delay, for example, is ignored by Google but respected by some other search engines, such as Bing.
The robots.txt can also be used to prevent certain crawlers (tools like Screaming Frog or Sitebulb) from crawling, or to ensure they don’t crawl too quickly. However, these tools can be set to ignore robots.txt, so this is only as effective as the crawlers that choose to follow it.
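If you’d like to see how a well-behaved crawler might interpret these rules, the sketch below feeds a made-up robots.txt to Python’s built-in `urllib.robotparser`. The file contents, the example.com URLs and the `ExampleBot` user agent are all hypothetical, and this shows how Python’s parser reads the file, which may differ in detail from how a particular search engine does.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Crawl-delay: 5

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a hypothetical crawler name; it falls under the "*" rules.
blog_allowed = parser.can_fetch("ExampleBot", "https://www.example.com/blog/")
checkout_allowed = parser.can_fetch("ExampleBot", "https://www.example.com/checkout/basket")
delay = parser.crawl_delay("ExampleBot")  # seconds between requests, if set
sitemaps = parser.site_maps()             # available in Python 3.8 and later

print("Blog crawlable:", blog_allowed)
print("Checkout crawlable:", checkout_allowed)
print("Crawl delay:", delay)
print("Sitemaps:", sitemaps)
```

Here the blog URL is crawlable, the checkout URL is not, and the parser also picks up the crawl delay and the sitemap address listed in the file.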
Understanding SEO Crawlers
An SEO crawler is a tool that is designed to crawl websites and extract information about them, providing a report on what’s working and what isn’t. There are many different crawlers on the market and they broadly report on the same factors. These include how many URLs are on site, how many external links there are, what file types are on site, whether pages are indexable, what pages are in the sitemap, whether indexable pages have internal links pointing to them and so on. The video below is a brief run-through of a crawl using the Screaming Frog crawler, and it will give you a basic understanding of what a crawler is and what it can be used for.
Getting to Grips with Response Codes
HTTP response codes tell us the status of a page. This can include whether it loads without issue, whether it fails to load at all, whether there are server or time-out errors and whether a page is redirected. The main response codes you will come across are 200, 404 and 500, but there are many more and the table below provides a full breakdown of all status codes. Realistically, you will only come across a handful of these when managing your website.
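As a quick illustration of how these codes group together, here is a small Python sketch that labels a status code by its range. The function name `describe_status` and the group labels are my own shorthand; the code names come from Python’s built-in `http.HTTPStatus`.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short label for an HTTP status code, grouped by its range."""
    # HTTPStatus raises ValueError for codes it does not know about.
    phrase = HTTPStatus(code).phrase
    if 200 <= code < 300:
        group = "success"
    elif 300 <= code < 400:
        group = "redirect"
    elif 400 <= code < 500:
        group = "client error"
    elif 500 <= code < 600:
        group = "server error"
    else:
        group = "informational"
    return f"{code} {phrase} ({group})"

for code in (200, 301, 404, 500):
    print(describe_status(code))
```

The first digit tells you most of what you need to know: 2xx means the page loaded, 3xx means it was redirected, 4xx means the request couldn’t be fulfilled (a missing page, for example) and 5xx means the server ran into trouble.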
What is Structured Data?
Structured data, or schema markup, is a data format that helps search engines understand the elements of your content and what it is about. It can make your pages eligible for SERP features, which may in turn help your content perform better in search.
There are numerous types of schema markup that can be used on everything from a blog or news article to products, tables, videos and more. The schema you choose to use is only limited by your content, and it can be a great way of testing what improves traffic to your site and what type of users land on pages using specific schema.
To find out what schema types exist, head over to schema.org. It’s also useful to remember that Google recommends the JSON-LD format for schema markup. We’ll do a deep dive into the different types of popular schemas and SERP features later on in this course.
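To give you an idea of what schema markup looks like in JSON-LD, here is a minimal Python sketch that builds an Article snippet. The headline, author and date are placeholders, the function name `article_json_ld` is my own, and a real page would usually include more properties than this.

```python
import json

def article_json_ld(headline, author_name, date_published):
    """Return a JSON-LD script tag describing an article (minimal sketch)."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,
    }
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(data, indent=2)
        + "\n</script>"
    )

# Placeholder values, for illustration only.
snippet = article_json_ld("Technical SEO Training", "Jane Doe", "2024-01-15")
print(snippet)
```

The resulting script tag would sit in the HTML of the page it describes. In practice, many CMS plugins generate this markup for you.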
What are Core Web Vitals?
A user’s experience on a website has always been a concern for search engines, as it is an extension of the user’s experience of the search engine itself. Making sure that users can access content quickly and without issue has therefore always been a goal. Google formalised this with three concise metrics related to page speed and experience, known as Core Web Vitals, which it announced in 2020 and made part of its ranking signals in 2021.
Largest Contentful Paint
The Largest Contentful Paint, or LCP, measures how long it takes for the largest element in the viewport to render. The viewport is the portion of the page you can see when a page first loads. The LCP element is usually an image or a large block of text, and its load time can be affected by image files, fonts, the JS resources used to display the element and even large style sheets. Reducing the LCP to under 2.5 seconds makes the main content on the page visible more quickly, leading to quicker interaction with the page overall.
First Input Delay
Have you ever gone onto a webpage and tried to click on a button, only to find it’s not responding? Then you realise it is, and by this point you’ve requested the same action two or three times because the response was too slow. The First Input Delay, or FID, measures the time between a user’s first interaction with a page and the page beginning to respond to that request. The goal is to get the FID down to less than 100 milliseconds to avoid user frustration and, ultimately, users leaving your page (bouncing). Note that Google replaced FID with Interaction to Next Paint (INP) as a Core Web Vital in March 2024, but the principle of keeping pages responsive to user input is the same.
Cumulative Layout Shift
There’s nothing worse than landing on a page and trying to interact with it, only to find the elements of the page are moving around! This is Cumulative Layout Shift, or CLS, and it does what it says on the tin: it measures the layout shift of elements on the page. This could be any element, such as images, tables or paragraphs. If dimensions aren’t set properly, elements on the page can move around as different resources are requested and become visible during page load. The goal here is to get a CLS score of 0.1 or lower.
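To pull the three targets above together, here is a small Python sketch that checks a set of measurements against them. The measurement values are invented for illustration, and the names `GOOD_THRESHOLDS` and `check_core_web_vitals` are my own; real figures would come from a tool such as PageSpeed Insights or Google Search Console.

```python
# Targets taken from the descriptions above.
GOOD_THRESHOLDS = {
    "lcp_seconds": 2.5,       # Largest Contentful Paint
    "fid_milliseconds": 100,  # First Input Delay
    "cls_score": 0.1,         # Cumulative Layout Shift
}

def check_core_web_vitals(measurements):
    """Return a dict saying whether each metric meets its target."""
    return {
        name: measurements[name] <= limit
        for name, limit in GOOD_THRESHOLDS.items()
    }

# Invented measurements for a hypothetical page.
page = {"lcp_seconds": 3.1, "fid_milliseconds": 80, "cls_score": 0.1}
results = check_core_web_vitals(page)

for metric, passed in results.items():
    print(metric, "meets target" if passed else "needs work")
```

In this made-up example the page meets the FID and CLS targets but its LCP is too slow, so that would be the metric to work on first.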
We’ll really dig into this subject in one of the upcoming lessons but for now, I want to leave you with some quick wins for improving CWV metrics.
- Use compressed image formats, such as WebP, a lightweight, high-quality format supported by most browsers.
- Resize images to display sizes before uploading to your CMS.
- Use a content delivery network to deliver media files and improve page load times.
- Set explicit dimensions (width and height) on images and other elements to prevent CLS.
- Use a cache to help reduce reliance on all page resources when loading a page; a cache effectively stores a static version of your page that can be downloaded quicker.
- Try to avoid large media files (very large images or videos) in the viewport (remember, this is the first view of a page you get).
- Avoid using enormous file types like GIFs for videos. MP4s are still resource heavy but can be compressed and served via a CDN to reduce the size of the resources required.
Remember, these are just some quick rules to go by, or best practices if you like. We’ll go into more detail on Core Web Vitals later in the series. If you have any questions about any of these suggestions, please comment below.
Ensuring Your Pages are Mobile Responsive
Almost 60% of web traffic comes from mobile devices, so ensuring that your website is mobile-friendly pays off. This includes designing pages that are easy to navigate and read on small screens, optimising images and videos for mobile devices, and ensuring that your website’s pages load quickly on mobile devices.
Most CMSs will automate this process for you by creating alternative layouts for different screen sizes, using responsive HTML code and using multiple-sized image files for different devices. However, there can still be occasions when mobile best practices slip through the net and you end up with content that is wider than the mobile screen, text that is too small or components that do not respond well to smaller screens.
Google Search Console is a really good source for picking up mistakes like this and you can use the mobile-friendly test to check any pages that have failed mobile responsiveness. If you do find there are errors to fix, you might need a website developer to help you out.
This has been the first lesson in my free technical SEO training and I hope you have enjoyed it. If you have any questions or feedback, please feel free to drop me a line or comment below.
Join me for the next part of our technical SEO training, when I will be taking a deep dive into Google Search Console and Bing Webmaster tools.