In a sea of content, and with Google's increased focus on Core Web Vitals as a ranking factor, a JavaScript-heavy website has fewer and fewer chances to be indexed to its true potential.
As developers who care about clean code and a healthy codebase, we should strive to design lean, agile infrastructures - a fundamental prerequisite for good user experience. After all, we don't just care about clean code; we care about people being able to interact with our work easily and find what they are looking for.
In this article we dive deeper into how to avoid writing cluttered JavaScript-based sites, thus improving their SEO.
Contents:
- [How Google Bot processes JavaScript](#how-google-bot-processes-javascript)
- [Server Side Rendering (SSR)](#server-side-rendering-ssr-)
- [URL Optimization, CDN and other SEO Practices](#url-optimization-cdn-and-other-seo-practices)
- [Decreasing Bundle Sizes](#decreasing-bundle-sizes)
- [Case study: Netflix](#case-study-netflix)
How Google Bot processes JavaScript
Generally, Google Bot works as a two-stage process: crawling and indexing. Google Bot first crawls the webpage; once crawled, the page's content is passed on and stored during the indexing stage, and all the links found on the page are sent back to the crawler.
With the massive rise of JavaScript for loading dynamic content, another stage has been added to this process: rendering.
Google Bot executes JavaScript while rendering the page, which means JavaScript is downloaded, parsed and executed before the content is sent on to the indexing stage. The overall cycle has become longer and more expensive, but it is necessary: for SEO, Google Bot needs to see all (or at least some) of the page's content, including content that is rendered dynamically.
It's worth noting that Single Page Applications (SPAs) initially deliver only the static parts (the shell) and let JavaScript decide which content gets displayed later.
With all this considered, we can conclude that only the content present in the initial render of the page is indexed right away. In other words, with JavaScript-heavy websites, Google Bot cannot index the rest until the page has been rendered completely!
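To make this concrete, here is a minimal, illustrative sketch of an SPA shell - the endpoint and markup are hypothetical, not tied to any particular framework. The server ships an almost empty page, and everything users and Google Bot actually read only appears after this script has run, i.e. during the rendering stage.
// Hypothetical SPA entry point. The initial HTML contains only <div id="root"></div>,
// so none of the content below exists until this JavaScript has been downloaded,
// parsed and executed.
async function renderApp() {
  const response = await fetch("/api/posts"); // assumed JSON endpoint
  const posts = await response.json();

  document.getElementById("root").innerHTML = posts
    .map((post) => `<article><h2>${post.title}</h2><p>${post.summary}</p></article>`)
    .join("");
}

renderApp();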
Server Side Rendering (SSR)
For SEO specifically, Server Side Rendering (SSR) is often preferred. Here’s why:
- Unlike Client Side Rendering, all JavaScript is executed and the page is rendered on the server.
- Because rendering happens on the server, loading time decreases, improving user experience and giving Google Bot quicker access to the page's content and links.
- You avoid the risk of partial or zero indexing. Since everything is rendered on the server, Google Bot can crawl and index all the content.
- With Server Side rendering, you can provide alt texts for images and videos in your HTML markup, which can help boost SEO.
Within the first wave of indexing, Google Bot crawls the source code entirely, so the HTML and CSS are indexed immediately. It also builds a crawling queue for the links found in that HTML.
The second wave of indexing can be delayed for an arbitrary amount of time, depending on when the page gets through the render queue. This gives SSR a huge benefit: all the source code and resources are already indexed in the first wave - a major advantage over Client Side Rendering, where resources are only revealed at the rendering stage, risking partial indexing.
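To illustrate the idea, here is a minimal server-side rendering sketch. It assumes an Express server and a plain React component exported from ./App; the file names and markup are placeholders rather than a production setup.
// Minimal SSR sketch (assumes express, react and react-dom are installed,
// and that ./App exports a regular React component).
const express = require("express");
const React = require("react");
const { renderToString } = require("react-dom/server");
const App = require("./App");

const server = express();

server.get("*", (req, res) => {
  // The full markup is generated on the server, so Google Bot receives
  // real content and real links in the first wave of indexing.
  const markup = renderToString(React.createElement(App, { url: req.url }));

  res.send(`<!DOCTYPE html>
<html lang="en">
  <head>
    <title>John Doe's Blog</title>
    <meta name="description" content="Articles about JavaScript and SEO" />
  </head>
  <body>
    <div id="root">${markup}</div>
    <script src="/client.js"></script>
  </body>
</html>`);
});

server.listen(3000);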
URL Optimization, CDN and other SEO Practices
To begin with, we have a handful of standard on-page SEO practices:
- a clear and unique page title;
- meta tags for description, content, keywords, author and title (a sketch of keeping these in sync on a JavaScript-heavy page follows this list);
- a favicon (the image file is stored with the '.ico' extension);
- mobile-friendliness;
- semantic markup: Google Bot and other crawlers rely on the semantics and layout structure of your website, so using headings, sections and paragraphs in HTML creates a structure that can be crawled and indexed easily;
- captions and alt text in video and image HTML tags;
- placing href and navigation links only on anchor tags, never on divs, sections, headings or any other element.
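On a JavaScript-heavy page where routes are rendered client-side, these basic signals still have to end up in the document. Below is a minimal, illustrative helper - all names and values are placeholders - for keeping the title, description and canonical URL in sync when a route changes; with Server Side Rendering you would simply emit the same tags in the server markup.
// Hypothetical helper called on every client-side route change.
function applyPageMeta({ title, description, canonicalUrl }) {
  document.title = title;

  let descriptionTag = document.querySelector('meta[name="description"]');
  if (!descriptionTag) {
    descriptionTag = document.createElement("meta");
    descriptionTag.setAttribute("name", "description");
    document.head.appendChild(descriptionTag);
  }
  descriptionTag.setAttribute("content", description);

  let canonicalTag = document.querySelector('link[rel="canonical"]');
  if (!canonicalTag) {
    canonicalTag = document.createElement("link");
    canonicalTag.setAttribute("rel", "canonical");
    document.head.appendChild(canonicalTag);
  }
  canonicalTag.setAttribute("href", canonicalUrl);
}

applyPageMeta({
  title: "How Google Bot processes JavaScript | John Doe",
  description: "A walkthrough of crawling, rendering and indexing for JavaScript-heavy sites.",
  canonicalUrl: "https://johndoe.com/blogs/how-google-bot-processes-javascript",
});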
Apart from these, it's worth mastering the more advanced optimizations detailed below.
Optimizing Images and Videos with CDN
A CDN (Content Delivery Network) is an effective way to optimize your images and videos. It is a network of servers distributed geographically to decrease response time for whichever region a request comes from. By decreasing load time, we improve both SEO and the user experience of the website.
CDNs are critical for page loading time. They help you optimize and transform media content and give you delivery options for it. They also cache requested content at the location nearest to the user, so the next time a page is requested, all images and media are served from that cache.
In theory, CDNs can save you 40-60% of loading time, and they support content delivery options such as quality, size and format. On an abstract level, you can think of them as APIs for manipulating, formatting and accessing your images - flexibility that lets creators build a wide range of customization into their platforms.
Because images are addressed through URLs, an example image CDN URL could look like this:
https://cdn.johndoe.com/images/dog.jpg?format=webp&quality=auto
Let's break down this example:
- https://cdn.johndoe.com/ - the CDN host. You can easily create images with CDN platforms and attach them to your own domain.
- /images/dog.jpg - the path to the source image inside the images folder.
- ?format=webp - as we discussed earlier, images can be transformed to fit the website's needs, so we can request the webp format or fall back to jpg depending on the browser.
- quality=auto - defines the quality of the image that gets delivered. The CDN is smart enough to take the request payload and the receiving device into account, including its pixel density. This helps you optimize the rendering of the page and improve SEO-related metrics such as bounce rate and conversion rate.
Although you can create and manage your own image CDN, the most popular practice is to use a third-party one. Companies like Amazon, Google and Cloudflare provide image CDNs for a service fee.
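As a small illustration of how such URLs end up in a page, here is a sketch that builds responsive variants of the example image by varying query parameters. The width parameter and the CSS selector are assumptions made for the example, not part of any particular CDN's API.
// Hypothetical helper: build a srcset of responsive variants of one CDN image.
// The "width" query parameter is assumed; real CDNs expose similar options
// under their own parameter names.
function buildCdnSrcSet(path, widths) {
  return widths
    .map((width) => `https://cdn.johndoe.com${path}?format=webp&quality=auto&width=${width} ${width}w`)
    .join(", ");
}

const heroImage = document.querySelector("img.hero"); // assumed element
heroImage.srcset = buildCdnSrcSet("/images/dog.jpg", [480, 960, 1920]);
heroImage.sizes = "(max-width: 600px) 480px, 100vw";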
URL Optimization
Optimizing URLs helps Google Bot understand your page. The more unique and human-friendly your URL is, the more likely the page is to rank higher and be preferred by Google Bot.
For example, say you have a website at https://johndoe.com with a blog section. Whenever the main blog page is rendered, the URL becomes https://johndoe.com/blogs, which lists all the blog posts that have been published. When a user opens an individual post, that's where URL optimization comes into play.
You can have two scenarios: a human-readable URL (for example, a slug based on the post's title) or one built from an opaque identifier or query string.
Case 1 - the readable link - signals to Google Bot that the page is human-friendly, thus bumping the page's signals up.
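As a quick sketch of how a readable URL can be produced, here is an illustrative slug helper; the function and the resulting URL are examples, not a prescribed implementation.
// Turn a blog post title into a readable URL segment.
function slugify(title) {
  return title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9\s-]/g, "") // drop punctuation
    .replace(/\s+/g, "-")         // spaces to hyphens
    .replace(/-+/g, "-");         // collapse repeated hyphens
}

// slugify("How Google Bot processes JavaScript")
// -> "how-google-bot-processes-javascript"
// giving a URL like https://johndoe.com/blogs/how-google-bot-processes-javascript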
Decreasing Bundle Sizes
When we talk about decreasing bundle sizes, we are basically talking about shipping less JavaScript to users.
Decreasing bundle size may not seem like a big deal, but think about your app being used by a billion users, across different platforms and internet connections. You want your web app to load and run smoothly everywhere.
This is where you need to take a critical look at your packages and their size. To see the size of each major package, we'll install a bundle analyzer.
npm install --save-dev webpack-bundle-analyzer
Find your webpack’s config file and then add the analyzer plugin.
const path = require("path");

// Webpack Bundle Analyzer
const WebpackBundleAnalyzer = require("webpack-bundle-analyzer").BundleAnalyzerPlugin;

// Are we in development mode?
const devMode = process.env.NODE_ENV !== "production";

module.exports = {
  entry: path.resolve("./app.js"),
  mode: devMode ? "development" : "production",
  output: {
    filename: "app.js",
    path: path.resolve("./dist"),
    chunkFilename: "[name].js" // used to specify a custom chunk name
  },
  resolve: {
    extensions: [".js", ".json"]
  },
  module: {
    rules: [
      {
        test: /\.jsx?$/, // use babel-loader to convert ES6+ to browser-supported JavaScript
        loader: "babel-loader",
        exclude: [/node_modules/]
      }
    ]
  },
  // Add the analyzer plugin alongside your other plugins...
  plugins: [
    new WebpackBundleAnalyzer()
  ]
};
Once you have added the plugin to webpack's config file, run the webpack build:
npm run build
A new browser window will open where you can compare each package's size against the others. You can also reach the same bundle report by visiting http://127.0.0.1:8888/ after running the build command.
Each bundle is now represented as a block - the bigger the block, the bigger the bundle, and vice versa.
Running the analyzer allows you to:
- Explore the size of each bundle.
- Analyze the tree/hierarchy of each module and its dependencies.
- Differentiate between third-party node modules and your app's own sources and scripts.
- Better understand how your web app depends on each module.
Another common example, especially in React apps, is Lodash. Lodash - or any other utility package - can hurt performance in the long run when, instead of using a single module from the package, we import the whole package and its dependencies, which makes the app slower.
To avoid that, import Lodash's find the right way:
//So, instead of importing the whole package
import * as _ from "lodash";
//...we only need the find function
import find from "lodash/find";
// (a named import like `import { find } from "lodash"` only shrinks the bundle
// when the bundler can tree-shake it, e.g. with lodash-es or babel-plugin-lodash)
You can fix dependencies and imports, then run the build again to see the difference in bundle size. Even small amounts of JavaScript can cause delays, though. For better performance, split your code efficiently and fetch content asynchronously, as sketched below.
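As a small sketch of what that can look like with webpack, here is a dynamic import that moves a rarely used feature into its own chunk, downloaded only when the user actually needs it; the ./editor module and the button id are placeholders.
// The feature's code stays out of the main bundle; webpack emits it as a
// separate chunk and fetches it on demand.
document.getElementById("open-editor").addEventListener("click", async () => {
  const { initEditor } = await import(/* webpackChunkName: "editor" */ "./editor");
  initEditor();
});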
Case Study: Netflix
Even when it is relatively small in size, JavaScript can affect your SEO through time-to-interactive and other metrics.
To see exactly what this means, let's take a look at Netflix. While analyzing why users were not signing up on mobile devices, the Netflix team discovered that many of them were on less-than-ideal internet connections. Netflix knew they had to rework the landing page for logged-out and new users.
A technical audit revealed that this landing page shipped 300kb of JavaScript. It was built on React and used libraries like Lodash for utility purposes.
All Netflix pages were served through Server Side Rendering: the generated HTML was delivered first, followed by the client-side application.
On a poor internet connection, the landing page could take up to 7 seconds to load completely and become interactive, which is far too long for a simple landing page.
After a handful of changes and a switch to an almost static landing page, Netflix achieved the following:
- Loading time and Time-To-Interactive decreased by 50% on desktop for the logged-out homepage.
- The landing page for logged-out users was switched from React and other client-side libraries to vanilla JavaScript, reducing the JavaScript bundle by 200kb.
- Prefetching HTML, CSS and JavaScript reduced time-to-interactive by 30%.
By prefetching a handful of resources (like React, future pages and CSS code) using the built-in browser API and XHR, and by optimizing the client-side code on the logged-out landing page, Netflix was able to reduce time-to-interactive by 30% - an improvement that carried over into the sign-up process that follows the landing page.
Removing React from the landing page and prefetching it instead allowed Netflix to keep leveraging client-side React throughout the rest of the single-page sign-up flow.
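A minimal sketch of that prefetching idea - the resource URLs are illustrative, not Netflix's actual assets:
// Hint the browser to fetch assets for the next step of the sign-up flow
// while the user is still reading the (now static) landing page.
function prefetch(url) {
  const link = document.createElement("link");
  link.rel = "prefetch";
  link.href = url;
  document.head.appendChild(link);
}

prefetch("/static/js/react-vendor.js"); // placeholder paths
prefetch("/static/css/signup.css");
prefetch("/signup");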
Netflix realised that most of what the landing page needed could be built with plain HTML, CSS and vanilla JavaScript. The parts that were ported from React to vanilla JavaScript - in roughly 300 fewer lines of code - are:
- Event handling;
- Adding classes;
- Cookies;
- Language switching;
- Basic interactions;
- Performance measurement and logging.
Takeaway
The Netflix case serves as a reminder that, despite React's usefulness, it is not the solution to every problem - and the same goes for many other libraries.
If your goal is to create a reliable website that performs well long after you've worked on it, be proactive and prioritize reducing your JavaScript bundle sizes.