Friday, August 26, 2011

Why isn't SSL turned on by default for all websites?


Discussions around the Internet about Firesheep, the Firefox extension Eric Butler released to demonstrate how easy it is to hijack other people's web sessions on open networks, have been quite heated. Lots of people have thanked him for raising awareness of the security issues of modern Internet applications, while many others have blamed him for making it far too easy for anyone, even people who know close to nothing about security, to hack into other people's accounts on social networks, webmail and other web applications, provided some conditions are met. In reality, all these issues have been well known for years, so in my opinion there is very little to blame Butler for; instead, we should pay more attention to the fact that most websites are still vulnerable to these issues today. So, if the issues highlighted by Firesheep are hardly news, why has it attracted so much attention over the past few months?

Some context

Whenever you log in to a website that requires authentication, two things typically happen:

1- first, you are usually shown a page asking you to enter your credentials (typically a username and a password, unless the service uses OpenID or some other single sign-on solution, which is quite a different story). Upon submission of the form, if your credentials match those of a valid account in the system, you are authenticated and redirected to a page or area of the site whose access would otherwise be forbidden.

2- for improved usability, the website may use cookies to make logins persistent for a certain amount of time across sessions, so you won't have to log in again each time you open your browser and visit the restricted pages, unless you have previously logged out or those cookies have expired.
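
To make these two steps a bit more concrete, here is a minimal sketch in Python with Flask of a login endpoint that checks credentials and then sets a persistent authentication cookie. The route, cookie name and credential check are made up for illustration, not taken from any real site:

```python
# A minimal sketch (Python/Flask; route, cookie name and credential check are
# all hypothetical) of the two steps above: verify the credentials, then set a
# persistent cookie holding a random authentication token.
import secrets
from flask import Flask, make_response, request

app = Flask(__name__)

def valid_credentials(username, password):
    # Placeholder check; a real application would look the user up in its own store
    return username == "demo" and password == "demo"

@app.route("/login", methods=["POST"])
def login():
    username = request.form.get("username", "")
    password = request.form.get("password", "")
    if not valid_credentials(username, password):
        return "Invalid credentials", 401
    token = secrets.token_urlsafe(32)  # random token, not the password itself
    resp = make_response("Logged in")
    # max_age makes the login persist across browser restarts (30 days here)
    resp.set_cookie("session_token", token, max_age=30 * 24 * 3600, httponly=True)
    return resp
```

Note that from then on it is the token in the cookie, not the password, that keeps you logged in; that token is exactly the data we'll be worrying about below.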

During the first step, authentication requires your credentials to travel over the Internet to reach their destination, and, because of the way the Internet works, this data is likely to cross a number of different networks between your client and the destination servers. If it is transferred in the clear over an unencrypted connection, there is the potential risk that somebody may intercept this traffic, get hold of your credentials, and then log in to the target website impersonating you. Over the years, many techniques have been attempted with varying degrees of success to protect login data, but to date the only one that has proven to be effective, for the most part, is full encryption of the data.

In most cases, the encryption of data transferred back and forth between the servers hosting web applications and the clients is done using HTTPS: that is, the standard HTTP protocol, with the communication encrypted with SSL. SSL works pretty well for the most part: nowadays it is economically and computationally cheap, and it is supported by many types of clients. SSL encryption isn't perfect, though. It has some technical downsides, more or less important, and, besides these, it often gives users a false sense of security if we also take into consideration other security threats concerning today's web applications, such as Cross-Site Scripting: many people think that a website is “secure” as long as it uses SSL (and some websites even display a banner that says “this site is secure” and links to their provider of SSL certificates, which is good, cheap advertising for them), while in reality most websites may be affected by other security issues regardless of whether they use SSL encryption or not. However, if we set other security issues aside for a moment, the main problem with SSL encryption is, ironically, in the way most web applications use it, rather than in the SSL encryption itself.

As mentioned above, web applications usually make use of cookies to make logins persistent across sessions; this is necessary because HTTP is stateless. For this to work, these cookies must travel between client and server with each request, that is, with each page you visit during a session within the same web application. This way the application on the other side can recognise each request made by your client and keep you logged in for as long as the authentication cookies are available and still valid.

The biggest problem highlighted by Firesheep is that most websites only enable or enforce SSL encryption during the authentication phase, so as to protect your credentials while you log in, but then revert to standard, unencrypted HTTP transfers from that point on. This means that if the website makes logins persistent by using cookies, then, since these cookies, as said, must travel with each request, and unless the authentication tokens stored in them are themselves encrypted and thus protected in one way or another (on the subject, I suggest you read this), as soon as you have been authenticated these cookies will travel with subsequent HTTP requests in clear (unencrypted) form. So the original risk of somebody intercepting and using this information still exists; the only difference is that in this case an attacker would more likely hijack your session by replaying the stolen cookies in their browser, rather than trying to log in themselves by entering your credentials directly in the authentication form (this is because these cookies usually store authentication tokens rather than credentials). The end result, however, is pretty much the same, in that the attacker can impersonate you in the context of the application.
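
To see what this means in practice, here is a tiny sketch (hostname, path and token are invented) of what the browser effectively does on every subsequent request once it holds that cookie:

```python
# What the browser effectively does on every later request once it holds that
# cookie; hostname, path and token are invented for the example.
import http.client

conn = http.client.HTTPConnection("www.example.com")  # plain HTTP, no encryption
conn.request("GET", "/inbox", headers={"Cookie": "session_token=3fa9c2d1e8"})
response = conn.getresponse()
print(response.status, response.reason)
# Both the request line and the Cookie header travel in clear across every
# network in between, so anyone able to read the traffic can copy the token
# and replay it from their own browser.
```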

So, why don’t websites just use SSL all the time?

CPU usage, latency, memory requirements

At this point, if you are wondering why companies don't just switch SSL on by default for all their services all the time, perhaps the most common reason is that, traditionally, SSL-encrypted HTTP traffic has been known to require more resources (mainly CPU and memory) on servers than unencrypted HTTP. While this is true, with the hardware available today it really is no longer too big of an issue, as demonstrated by Google when they decided to allow SSL encryption for all requests to their services, even for their popular search engine. Here is what Google engineer Adam Langley said about this a few months ago:

“all of our users use HTTPS to secure their email between their browsers and Google, all the time. In order to do this we had to deploy no additional machines and no special hardware. On our production frontend machines, SSL/TLS accounts for less than 1% of the CPU load, less than 10KB of memory per connection and less than 2% of network overhead. Many people believe that SSL takes a lot of CPU time and we hope the above numbers (public for the first time) will help to dispel that.”

So, if SSL/HTTPS does not require a significantly larger amount of resources on servers, is it just as fine as unencrypted HTTP, only more secure? Well, more or less. In reality, SSL still introduces some latency, especially during the handshake phase (up to 3 or 4 times higher than without SSL), and it still requires some more memory; once the handshake is done, however, the latency is somewhat reduced, and Google are working on ways to improve it further. So connections are a bit slower, true, but Google (see Langley's blog post) have partially solved this issue by also caching a lot of HTTPS content. Google have also addressed the higher memory usage by patching OpenSSL to reduce the memory allocated for each connection by up to 90%.

Static content and CDNs

Besides CPU/memory requirements and increased latency, there are other issues to take into account when switching SSL on all the time for a website. For example, many websites (especially large and popular ones like Facebook and the others targeted by Firesheep) use a CDN to reduce load on their servers, as well as to improve performance for their users depending on their geographical location; CDNs are great for this since they are highly optimized to serve static content from locations that are closer to users. This often reduces latency and so helps improve the overall performance of the site for those users. In most cases, using a CDN is as easy as serving the static content from hostnames that point directly to the CDN's servers.

But what happens if a website using a CDN is adapted to use SSL all the time? First, a few general considerations on the usage of SSL encryption with static content.

By “static content” we usually mean images, style sheets, JavaScript, files available for download, and anything else that does not require server-side processing. This kind of content is not supposed to contain any sensitive information; therefore, at least in theory, we could mix SSL-encrypted, sensitive information served via HTTPS with unencrypted static content served via HTTP, for the same website, at the same time. In reality, because of the way SSL support is implemented in browsers, if a page that uses SSL also includes images and other content downloaded with normal HTTP transfers, the browser will show warnings that may look “scary” to users who do not know what SSL/HTTPS is. Here's an example with Internet Explorer:

[Screenshot: Internet Explorer's mixed content warning]



Because of this, it is clear that for a page using SSL to work correctly in browsers, all the static resources included in the page must also be served with SSL encryption. But this sounds like a waste of processing power... doesn't it? Do we really need to encrypt images, for example? So you may wonder why browsers behave that way and display those warnings. Actually, there is a very good reason for this: remember cookies? If a web page is encrypted with SSL but it also includes resources that are downloaded with standard, unencrypted HTTP transfers, then, as long as those resources are served from hostnames that can access the same cookies as the encrypted page, those cookies will also travel in clear over HTTP together with those resources (for the reasons I've already mentioned), making the SSL encryption of the page useless in the first place. If browsers didn't display those warnings, it would be possible to avoid this issue by serving the static resources from hostnames that cannot access the same cookies as the encrypted page (for example, with the page served from mydomain.com and static content served from anotherdomain.com), but it's just easier and safer to enforce full SSL encryption for everything…
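
As a small illustration of the cookie-scoping point above, here is a sketch using Python's standard http.cookies module; the domains and token are made up:

```python
# Cookie scoping with Python's standard http.cookies module; domains and token
# are made up. A cookie scoped to .mydomain.com is attached by the browser to
# requests for www.mydomain.com AND static.mydomain.com alike, so unencrypted
# image requests to the latter would leak it.
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session_token"] = "3fa9c2d1e8"
cookie["session_token"]["domain"] = ".mydomain.com"
cookie["session_token"]["path"] = "/"
print(cookie.output())  # prints a Set-Cookie header with the Domain and Path attributes
# Static content served from anotherdomain.com sits outside this scope, which
# is the workaround described above.
```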

Sounds dirty and patchy, yeah? That's the web today… a collection of technologies developed for the most part ages ago, when people just couldn't foresee all the potential issues that have been discovered over the years. And it is funny, to me, that over the past few years we have been using buzzwords like “web 2.0” not to refer to a set of new technologies that address all those issues but, instead, to refer to new ways of using the same old stuff (and perhaps not even “new”... think of AJAX) that have either introduced or highlighted more security issues than ever before.

Back to the CDNs

SSL requires a certificate for each of the hostnames used to serve static content, or a “wildcard” certificate provided that all the hostnames involved are just subdomains of the same domain name (for example, static.domain.com, images.domain.com and www.domain.com are all subdomains of domain.com). If the hostnames for the static content served by a CDN are configured as CNAME records that point directly to the CDN's servers, requests for that static content will obviously go straight to the CDN's servers rather than to the website's own servers. Therefore, although the required SSL certificates would already be available on the website's servers, those certificates must also be installed on the CDN's servers for the CDN to serve the static content under those hostnames with SSL encryption. In theory, the website's owner simply needs to provide the CDN company with the required certificates, and the CDN provider then has to install them on their servers. In reality, the SSL support provided by some CDN providers can be seriously expensive, since it requires additional setup and a larger infrastructure because of the aforementioned overhead; moreover, many CDN providers do not even offer this possibility, since traditionally they have been optimised for unencrypted HTTP traffic, at least so far.

As you can easily guess, the static content/CDN issues alone are already enough to make switching a site like Facebook to SSL all the time more challenging than expected.

“Secure-only” cookies

After all I've already said about cookies, you may think that as long as the website uses SSL by default, all should be fine. Well... not exactly. If the website uses SSL by default but still accepts requests for a page over unencrypted HTTP, it would still be possible to steal cookies containing authentication tokens or session IDs by causing an unencrypted request (http:// without the s) to be made to the target website.

That request would once again let the cookies travel unencrypted, and therefore they could still be used by an attacker to replay a victim's session and impersonate them in the context of the web application.

There are two ways to avoid this. The first is to flag the cookies as secure, which means the browser will only send them over https:// connections, so they will always travel encrypted and the problem disappears. The second is to make sure the web server hosting the web application enforces SSL by redirecting http:// requests to https://. Both methods have basically the same effect with regard to the cookies; however, I prefer the second one since it also helps prevent the mixed encrypted/unencrypted content issues we've seen above and the related browser warnings.
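
Here is a minimal sketch of both approaches in Python with Flask; the cookie name and value are placeholders, and a real deployment would terminate SSL at the web server or load balancer in front of the application:

```python
# A minimal sketch of both approaches in Python/Flask; cookie name and value
# are placeholders.
from flask import Flask, make_response, redirect, request

app = Flask(__name__)

@app.before_request
def force_https():
    # Second approach: bounce any plain http:// request to https:// before
    # the application handles it
    if not request.is_secure:
        return redirect(request.url.replace("http://", "https://", 1), code=301)

@app.route("/login", methods=["POST"])
def login():
    resp = make_response("Logged in")
    # First approach: secure=True means the browser will only send this cookie
    # back over HTTPS connections
    resp.set_cookie("session_token", "3fa9c2d1e8", secure=True, httponly=True)
    return resp
```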

Websites that use SSL but only for the “submit” action of an authentication form

I have seen SSL used in various wrong ways, but this is my favorite one. I've mentioned how Firesheep has highlighted that most websites only use SSL for the login page, and why this is a weak way of protecting users' credentials. Unfortunately, there are also websites that use SSL not for the login page itself, which simply contains the authentication form, but only for the page that form submits the user's credentials to.

Earlier I found an example of a website that, once I clicked the “Login” link, redirected me to the page at http://domain.com/login.php, so without SSL. But in the source code I could see that the form's action was set to the page https://domain.com/authenticate.php, which was using SSL. This may sound more or less right, in that the user's credentials would be submitted to the server encrypted with SSL. But there's a problem: since the login page itself is not encrypted, who can guarantee that this page will not be tampered with so that it submits the user's credentials to another page (one the attacker has control over) rather than to the authenticate.php page the website's owner intended?

See now why this is not a good idea?

Content hosted by third parties

CDNs are only part of the story when it comes to static content. The other part concerns content that may be included on a page but is served by third parties, so that you have no control over either the content itself or the way it is served. This has become an increasingly big problem nowadays with the rise of social networks, content aggregators, and services that make it very easy to add new functionality to a website. Think of all the social sharing buttons that we see on almost every website these days; it's extremely easy for a website's owner to integrate these buttons in order to help increase traffic to the site: in most cases, all you have to do is add some JavaScript code to your pages and you're done.

But what happens if you turn SSL on for your page, which then includes this kind of external content? Many of these services already support HTTPS, but not all of them, for the reasons we've already seen regarding overhead and generally higher resource demands. Moreover, for the ones that do support SSL/HTTPS, you as the website owner need to make sure you're using the right code snippet, one that automatically switches the external content to either HTTP or HTTPS depending on the protocol used by your own page. Otherwise, you may have to adapt your own pages so that this switching is done by your code, provided the external service supports SSL at all.
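
As a rough sketch of what that switching can look like, assuming a Python/Flask application and a made-up third-party widget host, something along these lines mirrors the scheme of the current page when building the embed URL:

```python
# A rough sketch of the protocol switching, assuming a Python/Flask application
# and a made-up third-party widget host: the embed URL simply mirrors the
# scheme of the page currently being served.
from flask import Flask, request

app = Flask(__name__)

def external_url(host_and_path):
    # Only meaningful while handling a request: an HTTPS page gets an HTTPS
    # embed, an HTTP page gets a plain HTTP one
    scheme = "https" if request.is_secure else "http"
    return f"{scheme}://{host_and_path}"

@app.route("/article")
def article():
    script_src = external_url("widgets.example-service.com/share-button.js")
    return f'<script src="{script_src}"></script>'
```

Protocol-relative URLs (starting with //) achieve much the same thing on the markup side, again provided the third party can actually serve the content over HTTPS.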

As for those services that make it easy to add functionality to your website, I've already mentioned Disqus, for example, as my favorite service to “outsource” comments. There are other services that do the same (Intense Debate being one of them), and there are a lot of other services that add other kinds of functionality, such as content rating, or even the possibility for the users of your website to log in to it with their Facebook, Google, etc. credentials.

All these possibilities make it easy nowadays to develop feature-rich websites in a much shorter time, and make it pretty easy to let applications interact with each other and exchange data. However, if you own a website and plan to switch it to always-on SSL, you need to make sure all of the external services the site uses already support SSL. Otherwise, the browser warnings we've seen will be back, together with some security concerns.

Issues with SSL certificates

There are a couple of other issues, perhaps less important but still worth mentioning, concerning domain names and SSL certificates, regardless of whether a CDN is used or not. The first one is that, normally, it is possible to reach a website both with and without www: for example, both vitobotta.com and www.vitobotta.com lead to this site. At the moment, since readers do not need to log in to this site (comments are outsourced to Disqus), there is no reason why I would want to switch it to always use SSL. But if I wanted to do so, I would have to take into account that both vitobotta.com and www.vitobotta.com lead to my homepage when purchasing an SSL certificate. The reason is that not all SSL certificates secure both the www and the non-www domain; even wildcard certificates often secure all subdomains (including www) but not the bare, non-www domain. This means that if you buy the wrong certificate for a site you want to use with always-on SSL encryption, you may actually need to buy a separate certificate for the non-www domain. I was looking for an example earlier and found one very quickly on the website of my favorite VPS provider, Linode. The website uses a wildcard certificate that secures all the *.linode.com subdomains, but not linode.com, so if you try to open https://linode.com in your browser you'll see a warning similar to this (Firefox in the example):

[Screenshot: Firefox certificate warning for https://linode.com]



Generally speaking, it is better to purchase a certificate that secures both the www and the non-www domain (and perhaps other subdomains, depending on the case). In case you are interested, an example of a cheap wildcard certificate that does this is the RapidSSL Wildcard certificate. An alternative is a certificate that uses the subjectAltName field, which allows you to specify all the hostnames you want to secure with a single certificate (provided you know all of them in advance).
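
If you want to check which hostnames a given certificate actually covers, its subjectAltName entries can be read with a few lines of Python; the hostname below is just an example:

```python
# Read the DNS names listed in a server certificate's subjectAltName field;
# the hostname at the bottom is just an example.
import socket
import ssl

def certificate_hostnames(host, port=443):
    context = ssl.create_default_context()
    with socket.create_connection((host, port)) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    # subjectAltName is a tuple of ('DNS', 'hostname') pairs
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]

print(certificate_hostnames("www.google.com"))
# A wildcard entry like '*.example.com' covers the subdomains but not the bare
# 'example.com' unless that name is listed separately.
```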

The other issue with certificates is that companies often register several versions of the same domain name, differing just by extension, in order to protect a website's brand. So, for example, a company may want to purchase the domains company.com, company.info, company.net, company.org, company.mobi and so on; otherwise, if they only purchased company.com, others would be able to buy the other domains and use them for their own benefit, for black-hat SEO techniques and more. Good SEO demands that a website only use a single, canonical domain, so it's best practice to redirect all requests for the alternate domain names to the “most important” one the company wants to use as the default (for example company.com). As for SSL, though, it just means that the company must spend more money when purchasing certificates.

Caching

Caching is one of the techniques most commonly used by websites to reduce load on servers and to improve performance on both the server and the client. The problem with caching, in the context of SSL encryption, is that browsers differ in the way they handle caching of SSL-encrypted content on the client. Some allow caching of this content, while others do not, or will only cache it temporarily in memory but not on disk; this means that the next time the user visits the same content, all of it must be downloaded (and decrypted) again even though it has not changed since last time, which hurts the performance of the website.

And it's not just about the browsers: ISPs and companies often use proxies to cache content with the purpose of making web surfing faster. The funny thing is that many caching proxies, by default, do not cache SSL-encrypted content…
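
On the server side, the most you can do is to be explicit about what is cacheable. Here is a hedged sketch in Python with Flask (the route and directory are made up) that marks static responses as publicly cacheable with a long max-age; clients and proxies that refuse to cache HTTPS content will simply ignore the hint:

```python
# A hedged sketch in Python/Flask (route and directory are made up): mark
# static responses as publicly cacheable with a long max-age. Clients and
# proxies that refuse to cache HTTPS content will simply ignore the hint.
from flask import Flask, send_from_directory

app = Flask(__name__)

@app.route("/assets/<path:filename>")
def assets(filename):
    resp = send_from_directory("static", filename)
    resp.headers["Cache-Control"] = "public, max-age=31536000"  # one year
    return resp
```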

So… is an SSL-only web possible or not?

It's nice to see that Facebook now gives users the option to turn SSL on. However, it is a) disappointing, because it's just an option, not the default, and most people do not even know what SSL is; and b) surprising, because this change did not come in the wake of the Firesheep hype months ago, despite Facebook being one of the highest-profile websites Firesheep targeted; instead, it came after somebody hacked into Mark Zuckerberg's own Facebook profile… Perhaps the privacy of Facebook's CEO is more important than that of the other users?

As for the many other sites targeted by Firesheep, I haven't yet read of any that have already switched to using SSL all the time by default.

So it's a slow process… but I definitely think an SSL-only web is possible in the near future. Although switching a website to SSL all the time can be technically more challenging than one would expect, the truth is that all the technical issues listed above can be overcome in one way or another. I've mentioned how Google has already adapted some of their services to use SSL by default fairly easily, thanks to research and to the optimisation of the technologies they were using for those services. So what Google shows us is that other companies really have no excuse not to use SSL for all their services, all the time, since by doing so they could dramatically improve the security of their services (and, most importantly, their users' privacy), if only they cared a bit more about the aforementioned issues.

The only problem that may be a little more difficult to overcome, depending on the web application and the available budget, is economic rather than technical: SSL-encrypted traffic still costs more money than unencrypted traffic, but that's it. In particular, I mean the cost of the required changes to a site's infrastructure and the management overhead, rather than the cost of the SSL certificate itself, which can be cheap enough these days not to be a problem even for smaller companies.

It is unlikely that we’ll see completely new and more secure technologies replacing the web as we know it today, any time soon; but it is likely that with hardware and network connections becoming faster all the time, the prices of SSL certificates also going down, and further improvements to the current technologies, HTTPS will replace the standard HTTP as the default protocol for the Internet – sooner or later.

In the meantime, as users, we can either wait for this to happen, thus exposing ourselves to the potential risks, or we can at least partially solve the problem on our end; in the next few posts we'll see the easiest and most effective ways of securing our Internet browsing on the most common operating systems, and also why I used the word “partially”.
