How to escape JavaScript for a script tag

To add JavaScript to a web page, we use a <script> tag like this:

<script>console.log("Hello!");</script>

But what if we need to add arbitrary JavaScript to our web page? Say, a valid script like this:

if (x<!--y) { ... }

We can’t just write that in a <script> tag, because the browser will interpret the <!-- as the start of an HTML comment!

“But that’s fine,” you think. “We can just escape the string. This is how we serialize strings everywhere else in programming.”

You might reach for HTML entities, replacing < with &lt;. After all, isn’t the JavaScript just ordinary text content? No, it’s not! Once the browser sees a <script> tag, it goes into a special JavaScript parsing mode, where HTML entities are not interpreted!
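To see why entity escaping backfires, here is a minimal sketch. It assumes a hypothetical helper, escapeHtmlText, that does the usual entity replacement, and it uses new Function only as a stand-in for the browser's JavaScript parser, which receives the script text with the entities still in place.

```javascript
// Hypothetical helper: the entity escaping we'd use for ordinary HTML text.
function escapeHtmlText(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Stand-in for the browser's JS parser: does this text parse as a function body?
function parsesAsFunctionBody(body) {
  try {
    new Function("a", "b", body);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

const source = "if (a < b) { return true; } return false;";
const escaped = escapeHtmlText(source);

console.log(escaped);
console.log(parsesAsFunctionBody(source));
console.log(parsesAsFunctionBody(escaped));
```

The escaped text is no longer the program we started with: the JavaScript parser reads &lt; as the operator & followed by an identifier lt and a stray semicolon.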

In this JavaScript parsing mode, the browser is looking for one of two strings: the closing tag </script and the comment opener <!--.

To “escape” arbitrary JavaScript, we need to avoid those two substrings.
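As a first step, we can at least detect the problem. The sketch below assumes the two substrings are the closing tag </script (matched case-insensitively, like HTML tag names) and <!--. The name findUnsafeSubstrings is made up for illustration. It only reports where the substrings occur; it does not fix them.

```javascript
// Assumption: the two substrings to avoid are `</script` and `<!--`.
const UNSAFE_IN_SCRIPT = /<\/script|<!--/gi;

// Returns every unsafe substring found, with its index in the source.
function findUnsafeSubstrings(js) {
  return Array.from(js.matchAll(UNSAFE_IN_SCRIPT), (m) => ({
    index: m.index,
    match: m[0],
  }));
}

console.log(findUnsafeSubstrings('console.log("Hello!");'));
console.log(findUnsafeSubstrings("if (x<!--y) { ... }"));
console.log(findUnsafeSubstrings('const s = "</SCRIPT>";'));
```

A check like this can reject or flag a script before it is written into the page, which is safer than emitting it and hoping.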

If we find <!-- in our JavaScript, we can’t just replace it, because its meaning is context-dependent:

// This is a comment containing <!--
let foo = x <!--y; // That's valid JS operators
const s = "This is a string containing <!--";

To “escape” the above JavaScript, we’d have to write something like:

// This is a comment containing
let foo = x < !--y; // That's valid JS operators
const s = "This is a string containing <" + "!--";

This is not a simple string replacement. To do those replacements, we need to parse the JavaScript, and handle every possible context where <!-- might appear.
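One context is simple enough to handle without a parser: a script body generated from data, where every < is known to sit inside a string literal. The sketch below assumes exactly that situation and a JSON-serializable value. jsonForScript is a hypothetical name, and writing < as the \u003c escape only works inside string literals, so this is not a general answer to the problem above.

```javascript
// Hypothetical helper. Assumes the output is used as a JS expression made
// only of JSON, so any `<` can only occur inside a string literal.
function jsonForScript(value) {
  // Inside a string literal, \u003c means the same as `<`,
  // but the HTML parser never sees a literal `<`.
  return JSON.stringify(value).replace(/</g, "\\u003c");
}

const data = { s: "This is a string containing <!--", t: "</script>" };
const literal = jsonForScript(data);

console.log(literal);
// The escapes decode back to the original strings.
console.log(JSON.parse(literal).s === data.s);
```

Applying the same replacement to arbitrary JavaScript would break it, because x < y outside a string would become x \u003c y, which is a syntax error.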

The details are in the HTML spec, under its restrictions for the contents of script elements. It’s all rather horrifying.

This page copyright James Fisher 2024. Content is not associated with my employer.