Simon Willison

How hard is it to process untrusted SVG data to strip out any potentially harmful tags or attributes (like stuff that might execute JavaScript)?

I feel like this is well-trodden ground for HTML these days. Are there robust solutions for the SVG version of this problem?

30 comments
Simon Willison

I'm wondering if I can give untrusted authors the ability to go wild with custom SVG in a framed-off fixed size area of a web page, without breaching the security of the wider page or application

Joe Crawford

@simon SVG can do a lot, but perhaps a Web Component that would only hold a single svg would sequester such an SVG and prevent it from attempting to suck in information about the enclosing page.

Simon Willison

This is great! This Cloudflare Rust library includes a detailed test suite that tells me everything I wanted to know mastodon.theorangeone.net/@jak

Simon Willison

... and it looks like that means I can do an img tag with an src that points to a base64 encoded SVG object and any nasty JavaScript etc will be disabled for me - here's an example which seems to demonstrate that working gistpreview.github.io/?03f0076

Screenshot showing three SVG examples demonstrating base64 embedding. Contains heading "SVG Base64 Embedding Demo" followed by three panels: 1) "Simple Sun SVG" showing a yellow circle with rays, labeled "A basic sun shape with rays" 2) "Pelican SVG" showing a gray stylized bird shape, labeled "A stylized pelican shape" 3) "SVG with JavaScript (ignored)" showing a pink square with text "JS Ignored", labeled "SVG with JavaScript that gets ignored when embedded as an image". Footer note states "When SVGs are embedded using img tags with base64 data URIs, any JavaScript or interactive elements are safely ignored by the browser."
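
A minimal sketch of the img-plus-base64 approach described in this post, written in Python. The function name `svg_to_img_tag` and the sample SVG are illustrative assumptions, not taken from the linked demo. The page-level protection comes from the browser treating the SVG as an image; the code only builds the markup.

```python
import base64
import html


def svg_to_img_tag(svg_text: str, alt: str = "") -> str:
    """Return an <img> tag whose src is a base64 data URI holding the SVG.

    Hypothetical helper: the SVG is not modified at all. Any script inside it
    is left in place and relies on the browser not running scripts for SVG
    loaded through an <img> element.
    """
    encoded = base64.b64encode(svg_text.encode("utf-8")).decode("ascii")
    return '<img src="data:image/svg+xml;base64,{}" alt="{}">'.format(
        encoded, html.escape(alt, quote=True)
    )


# Illustrative untrusted input containing a script element.
untrusted_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">'
    '<script>alert("xss")</script>'
    '<rect width="100" height="100" fill="pink"/></svg>'
)

img_tag = svg_to_img_tag(untrusted_svg, alt="JS Ignored")
print(img_tag)
```

Because the markup is base64 encoded, none of the SVG's own tags appear in the host page's HTML, so the host page's parser never sees them.
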
Kevin Marks

@simon right, an img tag sandboxes them. What I do for svgshare.com is both display them as img and also run the svg through python html5lib and remove any script elements. (I also inline it in the upload dialogue so anyone trying to xss me does it to themselves instead). The other approach is to do what feedparser does and whitelist svg and html elements.

sage

@simon The main downside with this approach is that it doesn't let you style the SVG paths with CSS, which may or may not be necessary depending on your use case

sayrer

@simon you can try stuff like github.com/cure53/DOMPurify. I am not sure that is the best one, but they exist (I wrote one, once upon a time).

Marco Rogers

@simon sounds like you want a sandboxed iframe?

Simon Willison

@polotek yeah probably! I'm still trying to work up my confidence in those, detailed and comprehensive documentation on exactly what the sandbox attribute does has been hard to come by

Terence Eden

@simon the JS in an SVG cannot interact with anything outside of itself.
So while an SVG can do all sorts of crazy things, it can't escape its sandbox.

Jake Archibald

@simon <iframe sandbox> is useful here. You can even allow JavaScript but have it run in an opaque origin.
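
A rough sketch of what this could look like, assuming the SVG is delivered through the iframe's `srcdoc` attribute. The helper name `sandboxed_iframe` and its parameters are hypothetical. Leaving `allow-same-origin` out of the `sandbox` value is what gives the framed content an opaque origin, even when `allow-scripts` is present.

```python
import html


def sandboxed_iframe(svg_text: str, width: int = 300, height: int = 300,
                     allow_scripts: bool = False) -> str:
    """Wrap untrusted SVG markup in a sandboxed iframe (hypothetical helper).

    An empty sandbox attribute applies every restriction. Adding only
    "allow-scripts" lets script run, but without "allow-same-origin" the
    frame keeps an opaque origin.
    """
    sandbox_value = "allow-scripts" if allow_scripts else ""
    # Escape the markup so it cannot break out of the srcdoc attribute.
    srcdoc = html.escape(svg_text, quote=True)
    return '<iframe sandbox="{}" srcdoc="{}" width="{}" height="{}"></iframe>'.format(
        sandbox_value, srcdoc, width, height
    )


# Illustrative untrusted input.
sample_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">'
    '<script>document.title = "hi"</script>'
    '<circle cx="50" cy="50" r="40" fill="gold"/></svg>'
)

frame = sandboxed_iframe(sample_svg, allow_scripts=True)
print(frame)
```
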

Simon Willison

@jaffathecake I'm desperately keen on learning the true ins and outs of that, but I've found detailed documentation (including browser support) on all of the options you can stuff in that sandbox attribute frustratingly difficult to locate

Simon Willison

@jaffathecake it's the best I've seen but it still leaves me with so many questions... how good is browser support for each of those allowX things? What do browser security experts advise in terms of using them?

I'm really paranoid :/

Jake Archibald

@simon the browser support for the various allow features is in the table at the end of the page

Frederik Braun

@simon @jaffathecake if you just want the SVG displayed, put them in an <img> tag. Otherwise, your favorite sanitizer library DOMPurify has great SVG support. (Iframe sandbox works really great too!!)

Jake Howard

@simon It's pretty niche sadly. But there is a library from Cloudflare which helps github.com/cloudflare/svg-hush

AK

@simon an html sanitizer would probably cover it because SVG can be inlined into HTML

Edit: they *should* but can't confirm they *do*

Martin Owens

@simon

It depends on what you are doing with the svg. If all you need to do is display it, then an img tag is your friend as the browsers have done all the work isolating it. If you need something a bit special like external css or javascript (rather than smil animations) then you do have to embed it.

Removing potentially harmful things is fairly easy though. Kill any script tags and js attributes, any style tag headers, nuke xlinks that don't start with a # (object id).

Good luck.
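
The removal steps listed in this post could be sketched roughly as follows, using Python's standard `xml.etree.ElementTree`. The function name `strip_risky_svg` is hypothetical, and "js attributes" is interpreted here as attributes whose names start with `on`. This is an illustration of the rules as stated, not a vetted sanitizer: the standard library parser is not hardened against maliciously constructed XML, and the list of risky constructs is not exhaustive.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Keep conventional prefixes when the tree is serialized again.
ET.register_namespace("", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)


def local_name(qualified: str) -> str:
    """Strip the '{namespace}' part ElementTree puts in front of names."""
    return qualified.rsplit("}", 1)[-1].lower()


def strip_risky_svg(svg_text: str) -> str:
    """Apply the removal rules from the post above (illustrative only)."""
    root = ET.fromstring(svg_text)

    # Rule 1: remove <script> and <style> elements wherever they appear
    # below the root element.
    for parent in list(root.iter()):
        for child in list(parent):
            if local_name(child.tag) in {"script", "style"}:
                parent.remove(child)

    # Rule 2: remove event-handler attributes (onload, onclick, ...).
    # Rule 3: remove href / xlink:href values that are not '#id' references.
    for element in root.iter():
        for name, value in list(element.attrib.items()):
            attr = local_name(name)
            if attr.startswith("on"):
                del element.attrib[name]
            elif attr == "href" and not value.strip().startswith("#"):
                del element.attrib[name]

    return ET.tostring(root, encoding="unicode")


# Illustrative input exercising each rule.
sample = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink" onload="alert(1)">'
    '<script>alert(2)</script>'
    '<style>rect { fill: red }</style>'
    '<a xlink:href="javascript:alert(3)">'
    '<rect id="r" width="10" height="10"/></a>'
    '<use xlink:href="#r"/>'
    '</svg>'
)

cleaned = strip_risky_svg(sample)
print(cleaned)
```

For anything beyond experimentation, the purpose-built tools mentioned elsewhere in this thread (DOMPurify, svg-hush) are likely a safer starting point than hand-rolled rules like these.
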

Kye Russell

@simon I haven’t read through all the replies, so apologies if this has already been said, but I believe that the BIMI folks defined a ‘safe’ SVG standard.

João S. O. Bueno

@simon based on the experience of people who tried to create a Python sandbox over the decades, I'd say it is pretty much impossible (save for a browser separated as another page box, i.e. a "Frame").

Simon Willison

@gwidion I think JavaScript sandboxes are a whole lot easier than Python, because browsers are already the most widely-deployed sandboxes in the world

João S. O. Bueno

@simon I agree that a "document" in a tab or a frame is a good sandbox. But I doubt very much one can achieve further segregation within a document. There are way too many ways of linking back to JavaScript from HTML or SVG tags, for example. And JS, on its side, has no segregation or protection whatsoever: one is free to manipulate all the DOM and beyond.

Simon Willison

@gwidion it looks to me like claude.ai has a robust solution to this, using a combination of iframes with the sandbox attribute and CSP headers, plus web workers with CSP headers and careful application of postMessage

I'm still trying to reverse engineer how their solutions work though

martin sereinig

@simon CSP is probably a good second layer of security, no matter what you end up doing.
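
As one illustration of CSP as a second layer, here is a small sketch that assembles a `Content-Security-Policy` header value in Python. The helper `build_csp` and the specific policy are assumptions chosen for the example (deny everything by default, allow images from `data:` URIs and inline styles). They are not a recommendation from anyone in this thread.

```python
from typing import Mapping, Sequence


def build_csp(directives: Mapping[str, Sequence[str]]) -> str:
    """Join CSP directives into a single header value (hypothetical helper)."""
    return "; ".join(
        name + " " + " ".join(sources) for name, sources in directives.items()
    )


# Example policy for a page that only shows SVGs via data: URIs in <img> tags.
policy = build_csp({
    "default-src": ["'none'"],         # block anything not listed below
    "img-src": ["data:"],              # permit data: URI images
    "style-src": ["'unsafe-inline'"],  # permit inline styles on the host page
})

headers = {"Content-Security-Policy": policy}
print(headers)
```
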

Pelle Wessman

@simon I wonder if you could do a similar approach as eg Figma used for their third party plugins: Make it all happen in a WASM script that’s sandboxed

So that and render to a canvas?
