Previously, I’ve written a lot about Application Protocols, a simple and popular mechanism that lets browsers send a short string of data to an external application for handling. For instance, mailto is a common example of a scheme treated as an Application Protocol; if you invoke mailto:[email protected], the browser will convert this to an OS execution of e.g.:
outlook.exe mailto:[email protected]
Application Protocols are popular because they are simple, work across most browsers on most operating systems, and can be added by third parties without changes to the browser.
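To make the hand-off concrete, here is a minimal sketch of how a browser might turn a URL into a command line. The HANDLERS table, the build_command name, and the example address are all hypothetical; a real OS keeps its scheme registrations elsewhere, and real browsers add prompts and escaping that are omitted here.

```python
from __future__ import annotations

from urllib.parse import urlsplit

# Hypothetical registry mapping a scheme to the executable that handles it.
# These entries are made up for illustration only.
HANDLERS = {
    "mailto": "outlook.exe",
}


def build_command(url: str) -> list[str]:
    """Return the argv a browser could ask the OS to launch for `url`."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in HANDLERS:
        raise LookupError(f"no application registered for scheme {scheme!r}")
    # The whole URL is passed as a single argument. Note that the return
    # value is only a command line: no data flows back to the browser.
    return [HANDLERS[scheme], url]


if __name__ == "__main__":
    print(build_command("mailto:user@example.com"))
```

The one-way nature of this design is visible in the return type: the browser launches a process and is done.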
However, Application Protocols have one crucial shortcoming– they cannot directly return any data to the browser itself. If you wanted to do something like:
<img src='myScheme://photos/mypic.png' />
… there’s no straightforward way for your application protocol to send data back into the browser to render in that image tag.
You might be thinking: “Why can’t I, a third party, simply provide a full implementation of a protocol scheme, such that my object gets a URL and returns a stream of bytes representing the data from that URL, just like HTTP and HTTPS do?”
Asynchronous Pluggable Protocols
Back in the early days of Internet Explorer (1990s), the team didn’t know what protocols would turn out to be important. So, they built a richly extensible system of “Asynchronous Pluggable Protocols” (APP) which allowed COM objects to supply a full implementation of a protocol. The browser, upon seeing a URL (“moniker”), would parse out the URL scheme, then “bind” to the APP object and send data to and receive data from that object. This allowed Internet Explorer to handle URLs in an abstract way and support a broad range of protocols (e.g. ftp, file, gopher, http, https, about, mailto, etc).
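The parse-then-bind flow can be sketched in a few lines. This is not the COM interface that IE used; it is a toy model in which a "protocol" is just a function from a URL to a byte stream. The names register_protocol, bind, and data_protocol are my own, and the data: handler ignores the media type and does no validation.

```python
from __future__ import annotations

import base64
import io
from typing import BinaryIO, Callable, Dict
from urllib.parse import unquote_to_bytes, urlsplit

# In this toy model a protocol is a callable: URL in, readable bytes out.
ProtocolHandler = Callable[[str], BinaryIO]
_registry: Dict[str, ProtocolHandler] = {}


def register_protocol(scheme: str, handler: ProtocolHandler) -> None:
    """Associate a scheme with the handler that implements it."""
    _registry[scheme.lower()] = handler


def bind(url: str) -> BinaryIO:
    """Parse the scheme out of `url` and hand the URL to its handler."""
    scheme = urlsplit(url).scheme.lower()
    try:
        handler = _registry[scheme]
    except KeyError:
        raise LookupError(f"no protocol registered for {scheme!r}") from None
    return handler(url)


def data_protocol(url: str) -> BinaryIO:
    """Toy data: handler. Splits header from payload at the first comma."""
    header, _, payload = url[len("data:"):].partition(",")
    if header.endswith(";base64"):
        body = base64.b64decode(payload)
    else:
        body = unquote_to_bytes(payload)
    return io.BytesIO(body)


register_protocol("data", data_protocol)

if __name__ == "__main__":
    print(bind("data:text/plain;base64,aGVsbG8=").read())
```

Unlike the Application Protocol case, the caller gets a stream back, which is what would let something like an image tag render the result.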
In many cases, we think only about receiving data from a protocol, but it’s important to remember that you can also send data (beyond the URL) to a protocol; consider a file upload form that uses the POST method to send a form over HTTPS, for example.
Writing an APP was extremely challenging, and very risky– because APPs are exposed to the web, a buggy APP could be exploited by any webpage, and thanks to the lack of sandboxing in early IE, would usually result in full Remote Code Execution and compromise of the system. Beyond the security concerns, there were reliability challenges as well– writing code that would properly handle the complex threading model of a browser downloading content for a web page was very difficult, and many APP implementations would crash or hang the browser when conditions weren’t as the developer expected.
Despite the complexity and risk, writing APPs provided Internet Explorer with unprecedented extensibility power. “Back in the day” I was able to do some fun things, like add support for data: URLs to IE7 before the browser itself got around to supporting such URLs.
Understanding Custom Schemes
Sending a URL into an APP object and getting bytes back from a stream is only half of the implementation challenge, however.
The other half is figuring out how the rest of the browser and web platform should handle these URLs. For Internet Explorer, we had a mechanism that allowed the host (browser) to query the protocol about how its URLs should be handled. The IInternetProtocolInfo interface allowed the APP’s code to handle the comparison and combination of URLs using its scheme, and allowed the code to answer questions about how web content returned from the URL should behave.
For instance, to fully support a scheme, the browser needs to be able to answer questions like:
- Is this scheme “standard” (allowing default canonicalization behaviors like removing \..\ sequences), or “opaque” (in which other components cannot safely touch the URL)?
- Is this scheme “Secure” (allowed in HTTPS pages without mixed content warnings, allowed to use Web APIs that require a secure context, etc.)?
- Does this scheme participate in CORS?
- Does this scheme get sent as a referrer?
- Is this scheme allowed from Sandboxed frames?
- Can top-level frames be navigated to this scheme?
- Can such navigations only occur from trusted contexts (app/omnibox) or is JavaScript allowed to invoke such navigations?
- How do navigations to these urls interact with all of the other WebNavigation/WebRequest extensibility APIs?
- How does the scheme interact with the sandbox? What process isolation is used?
- What origin is returned to web content running from the scheme?
- How does content from the scheme interact with the cookie store?
- How does it interact with CSP?
- How does it interact with WebStorage?
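One way to picture the questions above is as a per-scheme record that the browser consults. The sketch below is a loose analogy to what IInternetProtocolInfo let a protocol answer, not a rendering of that interface. SchemePolicy, POLICIES, and is_mixed_content are hypothetical names, only a handful of the questions are modeled, and the values are illustrative rather than any browser's real configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemePolicy:
    """Answers to a few of the per-scheme questions (illustrative subset)."""
    standard: bool                     # safe to canonicalize like http URLs?
    secure: bool                       # usable from HTTPS pages without warnings?
    cors_enabled: bool                 # participates in CORS?
    sent_as_referrer: bool             # may appear as a referrer?
    allowed_in_sandboxed_frames: bool  # loadable from sandboxed frames?
    top_level_navigable: bool          # can a top-level frame navigate to it?


# Illustrative values only; "myscheme" is a made-up third-party scheme.
POLICIES = {
    "https": SchemePolicy(True, True, True, True, True, True),
    "myscheme": SchemePolicy(
        standard=True,
        secure=False,
        cors_enabled=False,
        sent_as_referrer=False,
        allowed_in_sandboxed_frames=False,
        top_level_navigable=True,
    ),
}


def is_mixed_content(page_scheme: str, resource_scheme: str) -> bool:
    """True when a secure page pulls in a resource from a non-secure scheme."""
    page = POLICIES[page_scheme.lower()]
    resource = POLICIES[resource_scheme.lower()]
    return page.secure and not resource.secure


if __name__ == "__main__":
    print(is_mixed_content("https", "myscheme"))
```

Every field is a decision somebody has to make for a new scheme, and each wrong default is a potential security or compatibility bug.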
Implementing Protocols in Chromium
Unlike Internet Explorer, Chromium does not offer a mechanism for third-party extensibility of its protocols; the browser itself must have support for a new protocol compiled in.
Similarly, there’s no IInternetProtocolInfo interface for protocol implementors; instead the scheme must be manually added to each of the per-behavior lists of schemes hardcoded into Chromium.
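The shape of that arrangement can be sketched as follows. This models the idea of one list per behavior; the list names and contents are hypothetical and are not Chromium's real identifiers or values.

```python
# Hypothetical per-behavior scheme lists. Names and contents are
# illustrative only and do not mirror Chromium's source.
STANDARD_SCHEMES = {"http", "https", "file", "ftp"}
SECURE_SCHEMES = {"https"}
CORS_ENABLED_SCHEMES = {"http", "https"}
REFERRER_SCHEMES = {"http", "https"}

BEHAVIOR_LISTS = {
    "standard": STANDARD_SCHEMES,
    "secure": SECURE_SCHEMES,
    "cors_enabled": CORS_ENABLED_SCHEMES,
    "referrer": REFERRER_SCHEMES,
}


def behaviors_for(scheme: str) -> list:
    """Return the sorted names of every list that contains `scheme`."""
    scheme = scheme.lower()
    return sorted(name for name, members in BEHAVIOR_LISTS.items()
                  if scheme in members)


def missing_behaviors(scheme: str) -> list:
    """Return the sorted names of every list that lacks `scheme`."""
    scheme = scheme.lower()
    return sorted(name for name, members in BEHAVIOR_LISTS.items()
                  if scheme not in members)


if __name__ == "__main__":
    # A new scheme starts out in none of the lists.
    print(behaviors_for("myscheme"), missing_behaviors("myscheme"))
```

In this model there is no single object that describes a scheme; its behavior is the sum of its memberships, so adding a scheme means deciding on and editing each list separately.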
Impatient optimist. Dad. Author/speaker. Created Fiddler & SlickRun. PM @ MSFT '01-'12, and '18-, presently working on Microsoft Edge. My words are my own.