Can you Syntax Highlight a code snippet on the web without overloading the DOM with a ton of `<span>` elements wrapped around the tokens?
Thanks to the Custom Highlight API, you can!
With that in place, and after tokenizing code snippets (e.g. using @prismjs), it’s only a matter of assigning the tokens to the corresponding Highlight. `CSS.highlights.get(token.type).add(range)`

The Custom Highlight API is supported in Chrome 105+ and Safari 17.2+. Firefox has experimental support.

@bramus @hi_mayank The Pen worked for me in Firefox Nightly, so it appears MDN is accurate on this. Also, I don’t understand how it works, because your script and style blocks don’t have all the `span`s around highlighted stuff, but the Prism home page’s code blocks do have all those `span`s. So how Prism helps here is completely opaque.

@hi_mayank @Meyerweb I have prism set up in manual mode, meaning it doesn't automatically kick in. I then manually call it to tokenize the code. This gives me a bunch of numbers about which token is where and what type it is. This info is then used to populate the Custom Highlight API.

@bramus @hi_mayank @Meyerweb I think a next step towards a declarative API, one that doesn’t require JS, would be some way of teaching the browser grammars using something like PEG. Then you could set an attribute on a `<code>` element to specify which grammar you want to be used for it to auto-tokenize and highlight that code block.

@knowler @hi_mayank @Meyerweb I don’t think that would work. Which languages do you include? Which versions of those languages? When do these definition files get updated? Would you be able to load your own? … Reminds me of authors requesting to put jQuery in the browser. Same questions arose. (We actually got that last thing … not by including jQuery in browsers but by having better JavaScript/DOM APIs nowadays)

@develwithoutacause @bramus @hi_mayank @Meyerweb Ya, that sort of API is kinda what got me thinking about using grammars. I do think that would still be a really nice low cost API though. Scales down alright, even though you’d probably need a tool to manage it (otherwise… lots of counting and attention to whitespace).
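To make the manual tokenize-then-highlight flow described above a bit more concrete, here is a minimal sketch. It assumes the token list has the shape that `Prism.tokenize()` returns (plain strings mixed with token objects carrying `type` and `length`), that only top-level tokens are highlighted, and that the code element holds a single text node. The names `tokenOffsets`, `applyHighlights`, and `codeBlock` are hypothetical and not taken from the original demo.

```javascript
// Flatten a Prism-style token list into { type, start, end } character offsets.
// Assumption: each entry is either a plain string or an object with `type`
// and `length`. Nested tokens are ignored; only top-level tokens are reported.
function tokenOffsets(tokens) {
  const offsets = [];
  let pos = 0;
  for (const token of tokens) {
    if (typeof token !== "string" && token.type) {
      offsets.push({ type: token.type, start: pos, end: pos + token.length });
    }
    pos += token.length; // strings and token objects both expose .length
  }
  return offsets;
}

// Browser-only: turn the offsets into Ranges and add each one to the
// Highlight registered under its token type.
// Assumption: `codeBlock` contains exactly one text node.
function applyHighlights(codeBlock, offsets) {
  const textNode = codeBlock.firstChild;
  for (const { type, start, end } of offsets) {
    const range = new Range();
    range.setStart(textNode, start);
    range.setEnd(textNode, end);
    // Token types without a registered Highlight are skipped.
    CSS.highlights.get(type)?.add(range);
  }
}

// Possible usage in a page where Prism is loaded in manual mode:
// const tokens = Prism.tokenize(codeBlock.textContent, Prism.languages.javascript);
// applyHighlights(codeBlock, tokenOffsets(tokens));
```

Keeping the offset calculation separate from the DOM work means the "bunch of numbers" can be inspected or logged on its own before any ranges are created.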
@knowler @bramus @hi_mayank @Meyerweb Yeah, definitely would need to be generated by a tool.

@bramus @hi_mayank I think I grasp what you’re saying, but is there an article or something that breaks this sort of thing down in detail?

@Meyerweb @hi_mayank Sorry, not yet. My typical flow is hack it together, put it on socials, and then maybe later write about it. What could help you better understand is to `console.log(tokens)` right after Prism has done its thing. MDN might also have some good info (I'm afk right now, so can't check)

@bramus @Meyerweb @hi_mayank I'm curious too, need to do some sleuthing. This clearly wants to be a custom element, right? No shadow dom required, and the progressive enhancement story is "just add color". A perfect use-case.

@mia @bramus @Meyerweb and then you plug the token positions into ranges (and the token types and ranges into highlights): https://developer.mozilla.org/en-US/docs/Web/API/CSS_Custom_Highlight_API#create_ranges

@hi_mayank @mia @bramus @Meyerweb just trying to get my head around it - looks like key JS requirement is creating and registering the text ranges - and to do this requires a start and end "position" - so presumably you could "tokenise" somewhere other than the client as long as you shipped this big list of number pairs and types with each code example? Would maybe assume that with a large number of code examples it is very quickly less bytes to do it all clientside though - just thinking aloud…

@mia @Meyerweb @hi_mayank Ooh, good idea. Would be possible indeed :)

@bramus @hi_mayank Seems to be working in Dev Edition with the about:config option turned on.

If you want to know the details: did a full write up on this one: https://www.bram.us/2024/02/18/custom-highlight-api-for-syntax-highlighting/ Also comes with an extra demo that syntax highlights the code in a [contenteditable] as you type.

@bramus It’s interesting that this new approach prevents you from using many CSS features.
I’m using Highlight.js to syntax-highlight code for LaTeX (to produce PDFs from Markdown). And I had to look up sequences of CSS class names in CSS files to get color, text weight, etc. That’s what you need to do here, too. It’s a shame there is no declarative (non-JS) version of this API.

@bramus This is cool. Do you have a demo, or code available somewhere to play with?

@bramus Slick! but if there's no markup it's not really a semantic document anymore, right? not standalone anyway.

@tbeseda The way these highlighters typically work is by wrapping things in spans with a bunch of classes. These spans with classes add no semantics at all. Also, sometimes - e.g. on large files with many tokens - they can cause performance issues because of the larger DOM tree.

Don't have info on how this API came to be, so don't know if and when TC-39 was consulted.

@bramus that's actually the exact thing that I want to implement (for some time). I also couldn't find any reasons except some comments confirming that inputs/textareas don't work 😢

@cheeaun @bramus I was thinking of hacking something for textarea based on https://github.com/kueblc/LDT (which overlays a transparent `<textarea>` on top of a styled `<pre>`) Not sure the highlight API has been thought out for dynamic content... I don't know if you can change the bounds of a range after it's been registered and have the output updated... We'll see 🙂

@pygy @cheeaun Perfectly possible. Here's a demo that does on-the-fly highlighting: https://www.bram.us/2024/02/18/custom-highlight-api-for-syntax-highlighting/#highlighting-contenteditable

That makes LDT redundant... Prism is ~10Kb larger, but its parsing abilities are leaps and bounds better than what LDT offers with just regexps. Prism also supports styling the content of script and style tags in HTML, out of the box...

@bramus Some random thoughts: - Making the `<style>` contentEditable in this example is fun :)

@bramus This demo is incredibly cool. I see a huge value in this API.
I am wondering more about the trade-off: more spans vs more tokenization logic. What if code blocks can be rendered on the server? Is it worth the effort? I have so many questions about this… Thanks for sharing this snippet dude!

@bramus Do CSS Highlights support bold/italic? I saw those in your code but had a play and couldn't get it to work. MDN lists a stricter set of allowed properties. Is that changing? https://developer.mozilla.org/en-US/docs/Web/CSS/::highlight#allowable_properties

@davatron5000 @bramus I originally wrote this MDN page, and my recollection is that, no, highlight() doesn't support anything that would have an impact on layout. It can only be used to alter the painting of the range.

@patrickbrosset @davatron5000 Correct. Only a limited set of styles are allowed. No changing the font-weight or the like.
As a first step, you need to define the various highlight styles in your CSS using `::highlight(x)` and also register them in the highlight registry using `CSS.highlights.set(x, new Highlight())`.
(x being the type of token: comment, property, boolean, class-name, etc.)
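A minimal sketch of that registration step follows. The helper name `registerHighlights`, the `TOKEN_TYPES` list, and the injected `registry` and `HighlightCtor` parameters are assumptions made for illustration (injecting them lets the helper run outside a browser); in a page they would simply be `CSS.highlights` and `Highlight`.

```javascript
// Register one empty Highlight per token type so ranges can be added later.
// Assumption: `registry` is Map-like (has/set/get), as CSS.highlights is,
// and `HighlightCtor` is the Highlight constructor.
function registerHighlights(tokenTypes, registry, HighlightCtor) {
  for (const type of tokenTypes) {
    if (!registry.has(type)) {
      registry.set(type, new HighlightCtor());
    }
  }
  return registry;
}

// Illustrative subset of token types. Each one needs a matching CSS rule,
// for example:  ::highlight(comment) { color: slategray; }
const TOKEN_TYPES = ["comment", "property", "boolean", "class-name"];

// Only touch the real registry where the API is available.
if (typeof CSS !== "undefined" && CSS.highlights) {
  registerHighlights(TOKEN_TYPES, CSS.highlights, Highlight);
}
```

The feature check also doubles as progressive enhancement: in browsers without the API the code stays readable, just uncolored.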