Rating: 7.7/10.
Building Browser Extensions: Create Modern Extensions for Chrome, Safari, Firefox, and Edge by Matt Frisbie
A book about browser extensions that covers the basics, primarily for Chrome but also for other browsers. It explores the different parts of an extension and how they interact with each other, as well as the sandbox model and the various permissions needed to build an extension, making it a solid book for browser extension developers. The only downside is that some sections are a bit lazily written, as they merely list all the permissions or all parts of the extension API in alphabetical order.
Chapter 1. Browser extensions are similar to mobile apps in that they are downloaded from an app store with mandatory review and interact with well-defined APIs in a sandbox. Popular extensions include those for ad blocking, password management, writing assistance, screen capture, cryptocurrencies, developer tools, etc.
Chapter 2. In the browser model, each tab runs in its own sandbox, but pages from the same origin share things like cookies and local storage. Extensions are separate from the page and have access to additional browser APIs like bookmarks and history that regular pages do not have, and are technically not required to interact with the page itself at all. The manifest JSON contains information such as name, authors, permission requirements, and other options; V3 makes some changes from V2, such as making ad blocking more difficult. The popup page opens when the user clicks the extension button in the toolbar, and the options page is similar, except that it opens when the user chooses the options entry in the extension’s menu. The content script is injected into the web page and has fewer permissions than a popup page. Lastly, extensions can create panels in the dev tools. Of the several extensions examined, such as Honey, LastPass, and Grammarly, most rely on content scripts to inject things into the page, with a popup page for additional features.
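As a rough illustration of how these parts are declared, here is a sketch of a minimal Manifest V3 manifest, written as a TypeScript object that can be serialized to manifest.json. The extension name, file names, and match pattern are made-up placeholders, and the interface only covers the handful of keys used here.

```typescript
// Sketch of a Manifest V3 manifest; the name, file paths, and URL pattern
// are hypothetical placeholders, not taken from the book.
interface ContentScript {
  matches: string[];
  js: string[];
}

interface Manifest {
  manifest_version: 3;
  name: string;
  version: string;
  permissions: string[];
  action: { default_popup: string };        // popup opened from the toolbar button
  options_page: string;                     // options page
  background: { service_worker: string };   // background service worker
  content_scripts: ContentScript[];         // scripts injected into matching pages
  devtools_page: string;                    // page that can register DevTools panels
}

const manifest: Manifest = {
  manifest_version: 3,
  name: "Example Extension",
  version: "1.0.0",
  permissions: ["storage"],
  action: { default_popup: "popup.html" },
  options_page: "options.html",
  background: { service_worker: "background.js" },
  content_scripts: [{ matches: ["https://example.com/*"], js: ["content.js"] }],
  devtools_page: "devtools.html",
};

// The string that would be written to manifest.json.
const manifestJson: string = JSON.stringify(manifest, null, 2);
```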
Chapter 3. Crash course on how to develop a simple extension. First, enable developer mode and load an extension from the local file system. The Chrome API offers ways to send messages between different parts of an extension, such as the content script, popup, and background worker. The content script can change styles and run a script on all pages.
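The message passing mentioned here can be sketched roughly as follows. The RuntimeLike interface is a simplified, assumed slice of chrome.runtime (onMessage.addListener and sendMessage), and createFakeRuntime is a hypothetical in-memory stand-in so the sketch can run outside a browser; in a real extension the two functions would be given chrome.runtime instead, possibly with a cast if the type definitions differ.

```typescript
// Simplified slice of chrome.runtime used by this sketch.
type Listener = (
  message: unknown,
  sender: unknown,
  sendResponse: (response: unknown) => void,
) => void;

interface RuntimeLike {
  onMessage: { addListener(listener: Listener): void };
  sendMessage(message: unknown): Promise<unknown>;
}

// Background side: answer "ping" messages with "pong".
function registerBackground(runtime: RuntimeLike): void {
  runtime.onMessage.addListener((message, _sender, sendResponse) => {
    const type = (message as { type?: string } | null)?.type;
    if (type === "ping") {
      sendResponse({ type: "pong" });
    }
  });
}

// Content script or popup side: send a message and wait for the reply.
function ping(runtime: RuntimeLike): Promise<unknown> {
  return runtime.sendMessage({ type: "ping" });
}

// Hypothetical in-memory stand-in for chrome.runtime (not a real API).
function createFakeRuntime(): RuntimeLike {
  const listeners: Listener[] = [];
  return {
    onMessage: {
      addListener: (listener) => {
        listeners.push(listener);
      },
    },
    sendMessage: (message) =>
      new Promise((resolve) => {
        listeners.forEach((listener) => listener(message, {}, resolve));
      }),
  };
}
```

The same shape applies whichever part sends the message; only the side that registers the listener changes.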
Chapter 4. The browser extension architecture. The background script is a singleton instance, and the WebExtensions API connects the various parts together. A popup can have multiple instances, but only one per browser window. Extensions generally have access to their own files by making requests to the extension file server, which functions like a network request but does not actually use the network. Web page scripts only have access to extension files that are made web-accessible through the manifest.
Chapter 5: Provides an alphabetical list of all the properties in the manifest JSON. Properties such as content scripts take URL match patterns, allowing the scripts to run only on certain domains. The manifest can customize various aspects, such as the appearance of the popup icon, the default search engine, and the home page. You can also define keyboard shortcuts that trigger a command, set the content security policy, etc.
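To give a feel for how URL match patterns restrict where a content script runs, here is a deliberately simplified matcher. It is only an approximation written for illustration, not the browser's actual implementation: it handles http/https schemes, a `*` or `*.domain` host, and `*` wildcards in the path, and ignores special cases such as `<all_urls>`, ports, and other schemes. The function names are my own.

```typescript
// Escape characters that have a special meaning in a regular expression.
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Convert a match pattern like "https://*.example.com/*" into a RegExp.
// Simplified: only "*", "http", "https" schemes are accepted.
function matchPatternToRegExp(pattern: string): RegExp {
  const parts = /^(\*|https?):\/\/(\*|\*\.[^/*]+|[^/*]+)(\/.*)$/.exec(pattern);
  if (!parts) {
    throw new Error(`Unsupported pattern: ${pattern}`);
  }
  const scheme = parts[1] ?? "";
  const host = parts[2] ?? "";
  const path = parts[3] ?? "";

  // "*" as scheme means http or https.
  const schemeRe = scheme === "*" ? "https?" : scheme;

  let hostRe: string;
  if (host === "*") {
    hostRe = "[^/]+"; // any host
  } else if (host.startsWith("*.")) {
    // The bare domain and any subdomain of it.
    hostRe = "(?:[^/]+\\.)?" + escapeRegex(host.slice(2));
  } else {
    hostRe = escapeRegex(host); // exact host
  }

  // Each "*" in the path matches any run of characters.
  const pathRe = path.split("*").map(escapeRegex).join(".*");

  return new RegExp(`^${schemeRe}://${hostRe}${pathRe}$`);
}

function urlMatches(pattern: string, url: string): boolean {
  return matchPatternToRegExp(pattern).test(url);
}
```

For example, urlMatches("https://*.example.com/*", "https://www.example.com/a") is true, while the same pattern rejects an http URL.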
Chapter 6. Manifest V3 introduces some major changes. Scripts must now be included in the extension and not loaded from elsewhere. Background scripts respond to events instead of running continuously, so they cannot rely on setTimeout (you must use the Alarms API, which will wake the service worker later), and they cannot rely on global variables (the Storage API should be used instead). Network blocking must be based on static rules rather than a script that decides whether each request should be blocked, which severely limits ad blocking.
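The move from timers and globals to alarms and storage can be sketched like this. AlarmsLike and StorageAreaLike are simplified, assumed slices of chrome.alarms and chrome.storage.local; in a real extension you would pass those objects in (and declare the "alarms" and "storage" permissions). The alarm name "tick" and the stored key "count" are arbitrary choices for the example.

```typescript
// Simplified slices of chrome.alarms and chrome.storage.local.
interface AlarmsLike {
  create(name: string, info: { delayInMinutes?: number; periodInMinutes?: number }): void;
  onAlarm: { addListener(callback: (alarm: { name: string }) => void): void };
}

interface StorageAreaLike {
  get(key: string): Promise<Record<string, unknown>>;
  set(items: Record<string, unknown>): Promise<void>;
}

// Count alarm firings. The count lives in storage rather than in a global
// variable, because the service worker may be terminated between events.
function setupCounter(alarms: AlarmsLike, storage: StorageAreaLike): void {
  // Instead of setInterval: ask the browser to wake the worker every minute.
  alarms.create("tick", { periodInMinutes: 1 });

  alarms.onAlarm.addListener(async (alarm) => {
    if (alarm.name !== "tick") return;
    const { count } = await storage.get("count");
    const previous = typeof count === "number" ? count : 0;
    await storage.set({ count: previous + 1 });
  });
}

// In a background service worker this would be wired up as:
//   setupCounter(chrome.alarms, chrome.storage.local);
```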
Chapter 7. Background service workers define a variety of event handlers, and they will terminate and wake up many times. They don’t have access to the DOM, the window, or local storage. When debugging, use the service worker link on the extensions page to open DevTools. A common pattern is to handle secrets and authentication in the popup page and background scripts so that credentials never exist in the content script, which is visible to the host page.
Background scripts can perform several tasks, including injecting scripts into the page and running them, opening tabs, and exchanging messages with content scripts on different tabs. In this way, they can serve as a message broker, enabling content scripts on different tabs to communicate with each other. Generally, background service workers are not supposed to persist indefinitely, but there are some hacks if this is really needed, such as periodically sending a message to a listener registered on a content script.
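The message broker idea might look roughly like the sketch below. TabsLike is a simplified, assumed slice of chrome.tabs (query and sendMessage), and relayToOtherTabs is a hypothetical helper name; the error swallowing is there because sending to a tab without a listening content script rejects.

```typescript
// Simplified slice of chrome.tabs used by this sketch.
interface TabsLike {
  query(queryInfo: object): Promise<Array<{ id?: number }>>;
  sendMessage(tabId: number, message: unknown): Promise<unknown>;
}

// Forward a message to every tab except the one it came from.
// Returns the ids of the tabs the message was sent to.
async function relayToOtherTabs(
  tabs: TabsLike,
  fromTabId: number | undefined,
  message: unknown,
): Promise<number[]> {
  const allTabs = await tabs.query({});
  const targets = allTabs
    .map((tab) => tab.id)
    .filter((id): id is number => id !== undefined && id !== fromTabId);

  // Tabs without a listening content script reject; ignore those failures.
  await Promise.all(
    targets.map((id) => tabs.sendMessage(id, message).catch(() => undefined)),
  );
  return targets;
}

// In a background service worker this could be wired up as:
//   chrome.runtime.onMessage.addListener((message, sender) => {
//     void relayToOtherTabs(chrome.tabs, sender.tab?.id, message);
//   });
```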
Chapter 8. A popup page is similar to a normal page, except there are no browser controls. They cannot be opened programmatically, so they can only open if the user clicks the icon or uses a keyboard shortcut. Only one popup page is visible at a time in a browser window, but it’s possible to have multiple browser windows, and therefore multiple instances of the same popup page. An options page can either open in a new tab or a modal, but it’s recommended to use a tab since this is usually for configuring the extension.
Chapter 9: Content scripts are injected into the page, and they share most things with the page’s own scripts, including the DOM, cookies, and local storage, but not global variables. They interact with the host page by querying elements with selectors and triggering events on them. When using bundles and importing scripts, make sure to declare any script that a content script imports dynamically as web-accessible. Content scripts are mostly declared in the manifest JSON, but they can also be injected programmatically by the background script using the scripting API.
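Programmatic injection from the background script could be sketched as below. ScriptingLike is a simplified, assumed slice of chrome.scripting (only the files form of executeScript), and "content.js" is a placeholder file name; a real extension would also need the "scripting" permission plus access to the target page.

```typescript
// Simplified slice of chrome.scripting used by this sketch.
interface ScriptingLike {
  executeScript(injection: {
    target: { tabId: number };
    files: string[];
  }): Promise<unknown>;
}

// Try to inject a bundled script into a tab; report success as a boolean.
async function injectContentScript(
  scripting: ScriptingLike,
  tabId: number,
): Promise<boolean> {
  try {
    await scripting.executeScript({ target: { tabId }, files: ["content.js"] });
    return true;
  } catch {
    // For example: no permission for the page, or the tab no longer exists.
    return false;
  }
}

// In a background service worker:
//   await injectContentScript(chrome.scripting, tabId);
```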
Chapter 10: DevTools pages are HTML pages that render inside DevTools, either as a panel or a sidebar. The network API gives you access to network traffic in a JSON format, and you can also programmatically inspect DOM elements.
Chapter 11: This chapter gives an overview of the extensions API and all the functionality available. Content scripts and other parts of the extension can send one-off messages or open ports for sending many messages, and you can also send messages to native applications. The API allows altering the behavior of the omnibox and the right-click context menu, sending OS notifications, and managing bookmarks, history, downloads, fonts, and many other features.
Chapter 12. Permissions are declared in the manifest, and if one is missing, the corresponding API will be undefined when you try to use it. Optional permissions are not requested when the user first installs the extension but at runtime, when the code requests them; the script waits until the user accepts the permission and then continues as if it had the permission the whole time. Once granted, the permission is permanent. To see the install-time permission warning, you need to load the extension as a packed extension. If an update to the extension requests a new permission, the extension is silently disabled until the user explicitly accepts the permission. The active tab permission grants temporary scripting and network access to the current tab when the user triggers the extension, lasting until they navigate away from the page. Host permissions give access to pages at all times, and this can be specified for all websites or only some websites (partial permission).
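Requesting an optional permission at runtime might look like this sketch. PermissionsLike is a simplified, assumed slice of chrome.permissions (contains and request), and ensurePermission is a hypothetical helper; the permission would also need to be listed under optional_permissions in the manifest, and the request has to be made in response to a user gesture such as a button click.

```typescript
// Simplified slice of chrome.permissions used by this sketch.
interface PermissionsLike {
  contains(query: { permissions: string[] }): Promise<boolean>;
  request(query: { permissions: string[] }): Promise<boolean>;
}

// Resolve to true if the permission is already held or the user grants it.
async function ensurePermission(
  permissions: PermissionsLike,
  name: string,
): Promise<boolean> {
  const alreadyGranted = await permissions.contains({ permissions: [name] });
  if (alreadyGranted) return true;
  // Shows the browser's permission prompt and waits for the user's answer.
  return permissions.request({ permissions: [name] });
}

// In a popup click handler:
//   if (await ensurePermission(chrome.permissions, "bookmarks")) { /* use the API */ }
```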
Chapter 13. Content scripts can execute requests with the authentication of the host page, which is powerful for performing authenticated actions on websites that the extension wants to modify. An extension ID is generated by the Chrome Web Store, but it’s possible to pin it, which is useful for local development if you have the public key. Various authentication methods are available, but some, like cookies, don’t work well due to the permissions model. The easiest way is OAuth since API functions implement the flow. The webRequest API can inspect and modify network requests, but it will be discontinued in Manifest V3 and replaced with the declarativeNetRequest API, which can perform actions based on a list of rules either specified in a manifest or updated dynamically, but is much less powerful than the previous API.
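For a sense of what a declarativeNetRequest rule list looks like, here is a small sketch written as TypeScript objects that could be serialized into a rules JSON file. The Rule interface only covers a subset of the real schema, and the domains and rule ids are made-up examples.

```typescript
// Subset of the declarativeNetRequest rule shape, enough for this sketch.
interface Rule {
  id: number;
  priority: number;
  action: { type: "block" | "allow" };
  condition: { urlFilter: string; resourceTypes: string[] };
}

// Hypothetical rules: block scripts and images from one domain,
// but allow a specific path on it with a higher priority.
const rules: Rule[] = [
  {
    id: 1,
    priority: 1,
    action: { type: "block" },
    condition: { urlFilter: "||ads.example.com^", resourceTypes: ["script", "image"] },
  },
  {
    id: 2,
    priority: 2,
    action: { type: "allow" },
    condition: { urlFilter: "||ads.example.com/allowed/", resourceTypes: ["image"] },
  },
];

// Rule ids must be unique within a ruleset; check before writing the file.
function hasUniqueIds(list: Rule[]): boolean {
  return new Set(list.map((rule) => rule.id)).size === list.length;
}

// The string that would be written to a static rules file referenced from
// the manifest, or passed as addRules when updating dynamic rules.
const rulesJson: string = JSON.stringify(rules, null, 2);
```

The browser evaluates these rules itself, which is why the extension never sees or decides on individual requests the way it could with a blocking webRequest listener.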
Chapter 14. Various developer pages can help you inspect the state and debug Chrome extensions, but be careful with certain features like background service worker inspection, as the worker behaves differently when the tool is open. Unit tests work as in normal JavaScript; you can stub out API functions and assert that they are called correctly. For integration tests, you can use Puppeteer to launch a Chromium browser within a test. Every update requires store review, and once approved, it is automatically pushed to users, which can take from a few minutes to a few days. It is possible to specify a page to which the user will be redirected after uninstall, allowing you to collect surveys and feedback.
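The stubbing approach for unit tests can be sketched without any test framework. Here saveTheme is a hypothetical piece of extension code that takes its storage dependency as a parameter, StorageLike is an assumed slice of chrome.storage.local, and the stub records calls so the test can check them; a real project would more likely use a test runner and its mocking utilities.

```typescript
// Simplified slice of chrome.storage.local that the code under test uses.
interface StorageLike {
  set(items: Record<string, unknown>): Promise<void>;
}

// Hypothetical extension code: persist the chosen theme.
async function saveTheme(storage: StorageLike, theme: string): Promise<void> {
  await storage.set({ theme });
}

// Hand-written stub that records every call made to set().
function createStorageStub(): { stub: StorageLike; calls: Array<Record<string, unknown>> } {
  const calls: Array<Record<string, unknown>> = [];
  const stub: StorageLike = {
    set: async (items) => {
      calls.push(items);
    },
  };
  return { stub, calls };
}

// A tiny test: the API must be called exactly once with the right payload.
async function testSaveTheme(): Promise<boolean> {
  const { stub, calls } = createStorageStub();
  await saveTheme(stub, "dark");
  return calls.length === 1 && calls[0]?.theme === "dark";
}
```

Passing the API in as a parameter is a design choice made for this sketch; stubbing the global chrome object directly is the other common approach.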
Chapter 15: Chrome holds about 2/3 of the desktop browser market share, so targeting it with your extension will cover most users; Chrome extensions will work on Chromium-based browsers like Edge and Opera with minimal modification. Firefox, Edge, Safari, and Opera each have their own marketplace for extensions. Mobile extensions are much less common but are available on some platform combinations, such as Firefox on Android; Chrome extensions are not available on mobile. Safari extensions are installed via an app, which requires a bit of app code written in Swift but is otherwise similar to Chrome extensions; on iOS the only browser that allows extensions is Safari. Firefox extensions can run on desktop and Android, but there are a number of API differences, and Firefox has been notably slower in the transition to Manifest V3.
Chapter 16: Covers tooling and frameworks for extension development. If the extension only has one entry point, e.g. if it is contained entirely in the popup page, then create-react-app works well. Routing cannot change the page URL, but a hash router can put the route after the extension URL. Bundlers: Webpack is the go-to bundler for most web development, but Plasmo is specifically geared towards extension development and offers some useful features like generating multiple different manifests, environment variables, content script mounting, and publishing to marketplaces.