Rating: 8.3/10.
Web Security for Developers: Real Threats, Practical Defense by Malcolm McDonald
Book about web security, for developers looking to secure their apps. It offers best practices to prevent hacks; understanding security also sheds light on how various web architecture components are crafted and motivations for their design.
Chapter 1: Hacking a website can be straightforward. Just install Kali Linux, which comes with numerous security and hacking tools. The program Metasploit runs as a single-line command to search for vulnerabilities in a website. Of course only do this on websites you own, never on someone else’s site.
Chapter 2: This chapter delves into the basic components of the internet such as TCP/IP, DNS, and TLS. A common method to maintain sessions is for the server to send back a Set-Cookie header in the HTTP response. The browser then sends this cookie back in subsequent requests, making it a common target for hacking.
Chapter 3: how browsers work. They have a rendering pipeline that processes the HTML and CSS, integrating styling information. JavaScript is executed in a sandbox. Browsers also manage stuff like caching DNS and security certificates.
Chapter 4: overview of web server-side technologies. Websites can be categorized into static and dynamic. Static files are returned directly by the server, although the server resolves URLs itself, so they do not necessarily correspond to paths on the file system. Some sites use content management systems (CMS), others use dynamic HTML generation. Other essential technologies include databases and various caching layers like Redis and Memcached.
Chapter 5: overview of processes that professional developers use, including version control, test environments, the release process, builds and containers.
Chapter 6. Injection attacks occur when an attacker sends a maliciously crafted request that the server ends up interpreting as code. SQL injection exploits queries built by string concatenation: the input closes the intended statement, appends arbitrary SQL commands, and comments out the rest of the original query. You can mitigate this by using prepared statements and binding the data. Most ORM methods also protect against injection, but not if you build queries with string concatenation. Another line of defense is to avoid revealing debugging information to potential attackers, which turns it into a blind attack, which is harder to execute.
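To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, column names, and sample data are made up for illustration; the point is only the contrast between concatenating input into the SQL text and binding it as a parameter.

```python
import sqlite3

# In-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")
conn.execute("INSERT INTO users VALUES ('bob', 'bob@example.com')")


def find_user_unsafe(name):
    # Vulnerable: the input becomes part of the SQL text itself.
    query = "SELECT email FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(name):
    # Bound parameter: the input is always treated as data, never as SQL.
    return conn.execute(
        "SELECT email FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious = "nobody' OR 1=1 --"
leaked = find_user_unsafe(malicious)   # returns every row in the table
blocked = find_user_safe(malicious)    # returns no rows
```

With the concatenated query, the injected quote closes the string, OR 1=1 matches every row, and the trailing comment marker hides the leftover quote. With the bound parameter, the same input is just an odd-looking name that matches nobody.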
Command injection attacks convince the web server to execute user input as a shell command. To mitigate this, escape characters that can be used in shell arguments and use more secure system call functions that prevent attackers from adding extra characters as arguments to trigger new commands.
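A rough sketch of the "more secure system call" idea in Python: passing the command as an argument list to subprocess.run means no shell ever parses the input. The child command here (a Python one-liner that echoes its argument) is only a stand-in for whatever program a real app would call.

```python
import shlex
import subprocess
import sys


def echo_argument(user_input: str) -> str:
    # The command is a list and shell=False (the default), so characters
    # like ';' or '&&' in user_input are passed through as plain text.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


attack = "hello; rm -rf /"
echoed = echo_argument(attack)

# If a shell command string is truly unavoidable, shlex.quote escapes the
# argument so the shell sees it as one literal word.
quoted = shlex.quote(attack)
```

Here the whole attack string arrives in the child process as a single argument, so the part after the semicolon never runs as a command.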
Remote code execution is when the attacker tricks the server into executing arbitrary code. This often happens during the deserialization of objects from untrusted input, so ensure that you keep your dependencies updated since older versions might have vulnerabilities.
File upload vulnerabilities arise when a web app supports file uploads, like images, and the server can be tricked into executing this file. To mitigate this, disable the execution permission on uploaded files and rename the files on disk to avoid arbitrary extensions.
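The two mitigations could look something like the sketch below. The function name, the extension allowlist, and the 0o644 mode are assumptions made for this example, not something the book prescribes.

```python
import os
import secrets
from pathlib import Path

# Hypothetical allowlist; adjust to whatever the app really accepts.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}


def store_upload(upload_dir: Path, original_name: str, data: bytes) -> Path:
    """Save an upload under a random name with no execute permission."""
    extension = Path(original_name).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError("file type not allowed: " + repr(extension))
    # Ignore the client-supplied name; keep only the vetted extension.
    destination = upload_dir / (secrets.token_hex(16) + extension)
    destination.write_bytes(data)
    # Read/write for the owner, read-only for others, execute for nobody.
    os.chmod(destination, 0o644)
    return destination
```

Renaming also means an attacker can't predict where the file ends up on disk, which makes it harder to request it back by URL.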
Chapter 7. Cross-site scripting attacks (XSS) occur when malicious JavaScript is embedded on a page that’s then sent to other users. The most common XSS is stored scripting, where data enters the database and later appears on a page; a simple example is a script tag that triggers an alert. To mitigate this, escape characters that could cause a script to execute. Also, implement a content security policy in the HTTP response header that disables inline script tags and only allows imported JavaScript.
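As a small illustration of escaping, Python's standard html.escape turns the characters that matter into HTML entities. The render_comment helper and the header value are illustrative assumptions; most template engines do this escaping automatically.

```python
import html


def render_comment(comment: str) -> str:
    # Escape <, >, &, and quotes so the browser shows the text
    # instead of interpreting it as markup.
    return "<p>" + html.escape(comment) + "</p>"


stored_comment = "<script>alert(1)</script>"
page_fragment = render_comment(stored_comment)

# One possible policy: only scripts loaded from the site's own origin
# may run, which also blocks inline script tags.
CSP_HEADER = ("Content-Security-Policy", "script-src 'self'")
```

The escaped fragment is displayed as literal text, so the stored script never executes in another user's browser.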
Reflected XSS attacks happen when an attacker sends malicious code in a URL, which, when opened by the victim, executes the code. DOM-based XSS occurs in the URI fragment, often used for navigation in single-page applications. It’s tricky because it never reaches the server, making it harder to detect.
Chapter 8. Cross-site request forgery attacks (CSRF) typically occur when a GET request changes the state, such as modifying the contents of a page. The victim can be tricked into clicking a link, triggering this action. One mitigation method is to ensure GET requests only retrieve data and never change information. Another is using anti-CSRF tokens, often delivered in cookies, that are unpredictable and validated server-side. The SameSite attribute tells the browser to strip cookies from requests coming from a different domain, but this can be inconvenient if linking to a logged-in page since it might require the user to log in again; setting it to Lax allows only GET requests to send cookies from other domains. Another best practice is requiring re-authentication for sensitive actions, like changing passwords.
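One way the token check might be sketched, using only the standard library: the server signs a random nonce together with the session ID and later verifies the signature. The function names and the nonce.signature layout are assumptions for this example; web frameworks usually ship their own CSRF protection, which is preferable in practice.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real app would load it from config.
SECRET_KEY = secrets.token_bytes(32)


def _sign(session_id: str, nonce: str) -> str:
    message = (session_id + ":" + nonce).encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_csrf_token(session_id: str) -> str:
    # A fresh random nonce makes every issued token different.
    nonce = secrets.token_hex(16)
    return nonce + "." + _sign(session_id, nonce)


def is_valid_csrf_token(session_id: str, token: str) -> bool:
    nonce, separator, signature = token.partition(".")
    if not separator:
        return False
    expected = _sign(session_id, nonce)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(signature.encode(), expected.encode())
```

A page on another domain can make the victim's browser send a request, but it can't read or compute the token, so the forged request fails validation.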
Chapter 9: Authentication. There are several ways to handle authentication. The basic HTTP native authentication is rarely used because it’s not visually appealing and cannot be customized. Typically, applications use an HTML form that collects username and password credentials. The most straightforward attack is a brute force attack that attempts to guess all possible passwords. Many security threats can be reduced by using a third-party authentication service based on OAuth.
When using an email as the login username, it’s crucial to validate the email address. This can be done by sending a confirmation link to ensure users have access to the account. It’s also a good practice to disallow certain email domains for throwaway emails using a blacklist. Password reset links should expire after 30 minutes to prevent misuse of old reset links.
Passwords should never be stored in plain text. Instead, always store a cryptographic hash. Adding a unique salt to each password before hashing means attackers can’t use rainbow tables to look up commonly used passwords, and using multiple hashing rounds makes the hash slower to brute force; bcrypt does both. Multi-factor authentication (MFA) requires users to provide at least two forms of verification, like a password and a six-digit code from a smartphone. Always include a logout button that invalidates the session server-side.
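The notes mention bcrypt, which is a third-party package in Python. To keep the sketch self-contained it uses PBKDF2 from the standard library as a stand-in, which follows the same idea of a per-password salt plus many hashing rounds. The iteration count is an assumed value chosen for the example.

```python
import hashlib
import hmac
import secrets

# Assumed work factor for this sketch; tune it for real hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest); a fresh random salt is made if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    _, digest = hash_password(password, salt)
    # Constant-time comparison of the two digests.
    return hmac.compare_digest(digest, expected)
```

Because each user gets a different salt, two users with the same password end up with different stored hashes, which is what defeats precomputed lookup tables.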
Attackers often try using leaked credentials to gain access. To counter this, it’s important to prevent user enumeration: don’t let an attacker find out whether a username exists on your site. Always display the same error message whether a user enters an incorrect password or the username doesn’t exist. Additionally, ensure that server request times are consistent to prevent timing attacks, where a request might be slower when a valid user is identified.
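A possible shape for such a login check is sketched below. The in-memory USERS dictionary, the dummy record, and the iteration count are all illustrative assumptions; the idea is that an unknown username still costs one full hash computation and produces the same message.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor for this sketch
GENERIC_ERROR = "Invalid username or password"


def _digest(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


def _make_record(password: str):
    salt = secrets.token_bytes(16)
    return salt, _digest(password, salt)


# Hypothetical user store, plus a throwaway record for unknown usernames.
USERS = {"alice": _make_record("correct horse battery staple")}
DUMMY_RECORD = _make_record(secrets.token_hex(16))


def login(username: str, password: str) -> str:
    # Unknown users fall back to the dummy record, so the expensive hash
    # runs on every attempt and response times stay roughly the same.
    salt, stored = USERS.get(username, DUMMY_RECORD)
    matches = hmac.compare_digest(_digest(password, salt), stored)
    if matches and username in USERS:
        return "Login successful"
    return GENERIC_ERROR
```

A wrong password for a real user and any password for a nonexistent user take the same code path and return the same text.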
Chapter 10: Session hijacking is stealing someone’s session while it’s active. For server-side sessions, only an identifier is stored in the cookie, which maps the session to data on the server. On the other hand, client-side sessions store all the session information in the cookie, not just the session ID. This data can be either encrypted or simply signed to prevent tampering. A common method to hijack sessions is by stealing cookies, often via XSS attacks, eg, if malicious JavaScript can forward cookies to a hacker’s servers. You can counteract this by setting the cookie’s HttpOnly attribute, which makes it inaccessible from JavaScript, and the Secure attribute, which ensures the cookie is only sent via HTTPS and never HTTP; setting SameSite to Lax helps prevent CSRF attacks.
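For reference, these cookie attributes can be produced with Python's standard http.cookies module. The cookie name session_id and the helper function are assumptions for the example; frameworks normally expose the same settings as configuration options.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str) -> str:
    """Return the value for a Set-Cookie header with hardened attributes."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    morsel = cookie["session_id"]
    morsel["httponly"] = True    # not readable from JavaScript
    morsel["secure"] = True      # only sent over HTTPS
    morsel["samesite"] = "Lax"   # withheld from cross-site non-GET requests
    morsel["path"] = "/"
    return morsel.OutputString()


header_value = build_session_cookie("abc123")
```

The resulting string carries the HttpOnly, Secure, and SameSite=Lax attributes alongside the session value.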
Always transmit the session ID using cookies and never through the URL. This precaution prevents session fixation attacks, in which an attacker gets the victim to open a URL containing a session ID the attacker already knows. Furthermore, a weak session ID that isn’t adequately randomized can be exploited by attackers, who might enumerate to guess the session ID.
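For the randomness part, a minimal sketch: Python's secrets module draws from the operating system's cryptographic random source, unlike a counter or the random module. The 32-byte length is an assumed choice.

```python
import secrets


def new_session_id() -> str:
    # 32 random bytes, encoded as URL-safe text; infeasible to enumerate.
    return secrets.token_urlsafe(32)


first_id = new_session_id()
second_id = new_session_id()
```

Compare this with IDs like 1001, 1002, 1003, where seeing one valid ID tells an attacker what the neighbouring ones are.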
Chapter 11: Authorization involves determining which permissions a user has after they’ve logged in. A typical way to manage this is through roles or groups. A user becomes part of a group that’s granted specific permissions. An alternative method is ownership-based control, where each resource is owned by a user who can then grant access to others; this is frequently seen in social media. On the server side, access control can often be implemented using decorators or annotations on request handlers, as in Flask or Django.
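A framework-free sketch of the decorator approach is below. The require_role decorator, the PermissionDenied exception, and the dictionary-shaped user are all hypothetical; Flask and Django provide their own equivalents tied to their request and user objects.

```python
import functools


class PermissionDenied(Exception):
    """Raised when the current user lacks the required role."""


def require_role(role: str):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(user, *args, **kwargs):
            # Check the role before the protected function body runs.
            if role not in user.get("roles", ()):
                raise PermissionDenied("requires role: " + role)
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


@require_role("admin")
def delete_article(user, article_id):
    return "article %d deleted by %s" % (article_id, user["name"])


admin = {"name": "alice", "roles": ["admin", "editor"]}
reader = {"name": "bob", "roles": ["reader"]}
```

Keeping the check in a decorator puts the access rule right next to the handler it protects, which makes a missing check easier to spot in review.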
Don’t assume a URL is secure just because it hasn’t been publicly shared. Attackers commonly guess URLs based on patterns. Another common vulnerability is directory traversal attacks. In these, attackers might persuade the server to navigate up the folder structure and access sensitive files that shouldn’t be exposed. It’s crucial to sanitize input paths, but there are many non-obvious ways to encode relative file paths, making them hard to spot with simple regex checks. So, it’s best to rely on the secure file retrieval methods provided by your web server.
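Where an app does have to build file paths from user input itself, one commonly used check is to resolve the final path and confirm it still sits inside the intended directory, rather than pattern-matching on the input. The helper name below is an assumption for this sketch.

```python
from pathlib import Path


def resolve_inside(base_dir, requested: str) -> Path:
    """Resolve requested relative to base_dir, refusing anything outside it."""
    base = Path(base_dir).resolve()
    # resolve() collapses '..' segments and follows symlinks, so the
    # comparison is made on the real final location.
    candidate = (base / requested).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise PermissionError("path escapes the base directory") from None
    return candidate
```

Both a request like ../../etc/passwd and an absolute path like /etc/passwd end up outside the base directory after resolution, so both are refused without needing a regex for every encoding trick.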
Chapter 12: Zero-day attacks happen when a vulnerability is exposed, and within a day or so, hackers scan for vulnerable sites to exploit. To defend against this, make it hard for hackers to figure out which server-side tech you’re using. This can be done by hiding response headers that show the technology and its version, and by avoiding URLs and default cookie names that give away details about the server framework. Also, turn off client-side error reporting that sends stack traces to the user, as it can help hackers find security gaps. Often, JavaScript files are minified to boost performance; this has the beneficial side effect of stripping comments that might give away sensitive info and making the code more challenging to reverse engineer.
Chapter 13: Encryption typically uses Transport Layer Security (TLS), which powers the HTTPS protocol. This employs a mix of symmetric key encryption (where the same key both encrypts and decrypts) and asymmetric encryption methods like RSA and elliptic curve cryptography. For symmetric encryption, both parties need the decryption key. To provide it securely, asymmetric encryption comes into play: a public encryption key is shared openly, while a private decryption key remains confidential.
A TLS connection primarily uses symmetric encryption but begins with a handshake that uses slower asymmetric encryption to share the symmetric key. This handshake involves sharing information about the cipher algorithm and presenting a certificate signed by a Certificate Authority, along with the server’s public key. The browser then verifies the certificate’s authenticity, creates a random key, and encrypts it using the server’s public key, ensuring both parties have a common symmetric encryption key.
A Certificate Authority (CA) issues certificates confirming domain ownership. This process thwarts DNS attacks where DNS redirection could lead users to a fraudulent server. Such a scheme would fail because the fake server wouldn’t have a legitimate certificate signed by the CA. To get a certificate from a CA, you must prove domain ownership. You then provide a public key, which the CA signs using its private key. Let’s Encrypt is a non-profit offering this service, but there are also commercial CAs that provide extended domain ownership validation.
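On the client side, this certificate check is what a TLS library does by default. As a rough illustration, Python's ssl module builds a context that requires a CA-signed certificate and a matching hostname; the helper name and the TLS 1.2 minimum are assumptions added for the example.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    # The default context trusts the system's CA store and verifies
    # both the certificate chain and the hostname.
    context = ssl.create_default_context()
    # Assumed extra hardening: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


client_context = make_client_context()
```

A connection made with this context to a server whose certificate was not signed by a trusted CA, or was issued for a different domain, fails during the handshake, which is the protection against DNS redirection described above.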
Note the difference between web servers like nginx, which manage low-level TCP operations and serve static files quickly, and application servers that produce dynamic content, like Flask. HTTPS configuration happens at the web server level, meaning the application server only deals with unencrypted content. A recommended approach is to redirect HTTP traffic to HTTPS and always use headers instructing browsers not to transmit session data over HTTP.
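The redirect is usually configured in the web server itself, but the logic can be sketched as a small WSGI middleware. This assumes the HTTP Strict-Transport-Security header is the kind of header meant here, and that wsgi.url_scheme reflects the original request; behind a TLS-terminating proxy it would have to be derived from a forwarded header instead. The names and the max-age value are illustrative.

```python
def https_only(app):
    """Wrap a WSGI app: redirect HTTP to HTTPS and add an HSTS header."""
    def middleware(environ, start_response):
        if environ.get("wsgi.url_scheme") != "https":
            location = "https://" + environ.get("HTTP_HOST", "localhost")
            location += environ.get("PATH_INFO", "/")
            if environ.get("QUERY_STRING"):
                location += "?" + environ["QUERY_STRING"]
            start_response("301 Moved Permanently", [("Location", location)])
            return [b""]

        def add_hsts(status, headers, exc_info=None):
            # Tell the browser to use HTTPS for this host for one year.
            extra = ("Strict-Transport-Security", "max-age=31536000")
            return start_response(status, list(headers) + [extra], exc_info)

        return app(environ, add_hsts)
    return middleware


def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


wrapped_app = https_only(demo_app)
```

Session cookies should additionally carry the Secure attribute, so that the browser never attaches them to a plain HTTP request in the first place.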
Switching from HTTP to HTTPS can prevent many man-in-the-middle attacks. Examples include Wi-Fi hotspots or routers being compromised, ISPs accessing or modifying sensitive data, or unwanted ads being inserted into content.
Chapter 14: Many vulnerabilities arise from third-party code, such as software library dependencies and the operating system. Ensure you keep these dependencies updated and verify they haven’t been compromised by using checksums. Another frequent security issue is tied to configurations. Ensure your web server is set up correctly: mistakes such as exposed open directories or default admin credentials can lead to quick compromises. This also goes for admin pages and test or staging sites, which people often forget about.
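Package managers typically do checksum verification for you through lock files, but the underlying check is simple enough to sketch by hand. The function names are assumptions; the expected digest would come from the publisher of the file.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex: str) -> bool:
    # Compare against the checksum published alongside the download.
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```

If even one byte of the downloaded file differs from what the publisher released, the digests won't match and the dependency should not be installed.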
Sometimes, security issues can come up in the services you use. For example, API keys shouldn’t be sent where hackers can see them on the frontend. And if you’re using webhooks for a service to connect back to yours, check they’re authenticated correctly. When third-party services ask you to add JavaScript to your site, that script could let malware run on your site. For example, if it’s an ad script, use trusted platforms like Google and use a sandboxed iframe tag to limit what the ad JavaScript can do.
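Many services authenticate webhooks by signing the request body with a shared secret, although the exact header name and format vary by provider. The sketch below assumes a hex-encoded HMAC-SHA256 signature; the secret and payload are made up.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_authentic_webhook(secret: bytes, body: bytes, signature_hex: str) -> bool:
    # Recompute the signature over the raw body and compare in constant time.
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


# Hypothetical shared secret and payload.
shared_secret = b"example-webhook-secret"
payload = b'{"event": "payment.succeeded", "amount": 1200}'
signature = sign_payload(shared_secret, payload)
```

Without a check like this, anyone who discovers the webhook URL can post fake events to it.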
Chapter 15: XML Attacks. XML is a markup language similar to HTML that can describe arbitrary kinds of data. One insecure feature is the Document Type Definition (DTD), which is meant to let an XML document declare what kind of data is in it, but hackers can misuse it in many ways, eg: the XML bomb uses entity expansion to crash the parser, and external entities can make the server fetch arbitrary URLs. When allowing XML files to be uploaded, make sure you turn this feature off.
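How to turn DTDs off depends on the parser. As one illustrative approach with Python's standard expat bindings, a handler can reject any document as soon as a DOCTYPE declaration appears. The function and exception names are assumptions for this sketch, and a hardened XML library is usually the more convenient choice.

```python
import xml.parsers.expat


class DtdNotAllowed(ValueError):
    """Raised when an uploaded document contains a DOCTYPE declaration."""


def parse_without_dtd(xml_text: str) -> list:
    """Parse XML and return its element names, refusing any DTD."""
    element_names = []
    parser = xml.parsers.expat.ParserCreate()

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Called at the start of a DOCTYPE, before any entity is expanded.
        raise DtdNotAllowed("DOCTYPE declarations are not accepted")

    def record_element(name, attributes):
        element_names.append(name)

    parser.StartDoctypeDeclHandler = reject_doctype
    parser.StartElementHandler = record_element
    parser.Parse(xml_text, True)
    return element_names


bomb = '<!DOCTYPE lolz [<!ENTITY lol "lol">]><lolz>&lol;</lolz>'
```

Ordinary documents parse as usual, while anything that declares a DTD, including the XML bomb pattern, is rejected before expansion begins.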
Chapter 16: Malicious users can exploit your site to target other sites. Since emails don’t have solid authentication, phishers can pose as your domain. To counter this, configure your email server to use SPF and DKIM. Open redirects let your site send users to any URL, which can hide harmful links in emails, making them seem like they’re from your domain, and this can get your domain blacklisted by spam filters. Avoid this by never allowing redirects to random URLs from your domain.
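For the open redirect part, one conservative check is to accept only local paths and fall back to a default otherwise. The helper below is a sketch under that assumption; some apps instead keep an allowlist of trusted external hosts.

```python
from urllib.parse import urlparse


def safe_redirect_target(target: str, default: str = "/") -> str:
    """Return target only if it is a path on this site, else the default."""
    parsed = urlparse(target)
    # Anything with a scheme or host points away from this site.
    if parsed.scheme or parsed.netloc:
        return default
    # Require a single leading slash; '//host' and '/\host' are treated
    # by browsers as links to another host.
    if not target.startswith("/"):
        return default
    if target.startswith("//") or target.startswith("/\\"):
        return default
    return target
```

A login page that redirects to a user-supplied next parameter would pass it through a check like this before issuing the redirect.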
Clickjacking happens when your site is framed within another, tricking users into clicking. Hackers can overlay a transparent div on your iframe to grab the click. Stop this with a security policy that blocks embedding your site as an iframe elsewhere. Server-side request forgery (SSRF) tricks your server into making requests to arbitrary URLs. This can be handy for hackers building botnets using your server.
Chapter 17: Denial of Service Attacks. These can occur on several levels, like pings using the ICMP protocol, TCP, or HTTP. Essentially, any operation that costs the server more computational resources than it does the hacker can amplify the damage, like a SYN flood, which overloads the server with open connections that cost memory. If the attack comes from one source, it’s easier to counter by filtering out the offending IP. However, distributed denial of service (DDoS) is trickier; CDNs often protect against this using heuristics, as they have more data than a lone website.