Evaluating Cloud Providers

15 Dec, 2019

This past weekend I wanted to spin up some personal computing infrastructure that I lost when Joyent dropped their public cloud offering. In doing so, I tested three cloud providers (DigitalOcean, Vultr, and Linode), and I want to summarize what I like and dislike about each of them here.

Warning: Before we begin, note that this post necessarily discusses features and issues in three products that are constantly changing. Everything I say here is true as of December 2019, but may be wildly out of date by the time you read this.

Requirements

My requirements likely differ from yours, so know that I was primarily choosing providers to compare based on the following:

Good support
An existing Terraform provider
Ease of spinning up a one-off server to run simple services like an XMPP server
Ability to create more complex infrastructure that requires three or four networked machines
Ability to run various Linux distros as well as FreeBSD, OpenBSD, and SmartOS

DigitalOcean

I must admit, I didn’t test DigitalOcean as thoroughly as the other two providers, so you won’t get a good comparison of DigitalOcean here. Maybe I would have loved it if I stuck with it for longer, but my troubles with it started before I had even signed up.

The very first thing I encountered is that the “Signup by email” button doesn’t work on Firefox 71. When you click it, nothing happens: no silly animation, no new page, no error in the JavaScript console, nothing. This didn’t change if I disabled adblockers or Firefox’s “enhanced tracking protection”, or if I started in safe mode (i.e. no plugins running). I couldn’t reproduce this on a nightly install, so maybe it will be fixed soon, or maybe it was something about my specific Firefox profile and how it was configured (although this didn’t make it any less of a problem for me, and a simple link to a form likely wouldn’t have been affected by whatever was breaking their JavaScript-y login process).

However, this isn’t the problem that made me not want to use DigitalOcean: the real problem was support. I emailed them the problem I had, the version of Firefox I was running, the operating system I was running, etc. A few days later they got back to me telling me that my account was already deactivated (I guess I had an old one at the email I was using). I replied including the same information, and told them I was not trying to deactivate an account, I was trying to sign up for one. They got back to me saying they were sorry I couldn’t log into my account and that they couldn’t see any login attempts on their end and that they couldn’t provide an account login code over email. At this point, I’m assuming they’re using an overzealous support bot that’s trying very hard not to forward my emails to a real person, although I have no idea what I might have said that made it think I was trying to deactivate or sign into an existing account. I replied explaining the problem again and telling them that this was not about an existing account, I was trying to register a new account. This time they replied saying the issue would be reported to their engineering team if I just provided the browser and operating system I was using, a screenshot of the issue (of the button and an empty JavaScript console, I guess?), etc. I replied quoting my previous emails where this information had been provided every time. They replied asking me for my operating system and browser version again. I gave up on DigitalOcean at this point. Even if their service was great, at some point I will need customer support and I don’t want to deal with this ever again.

To be thorough, I did sign up for an account using a GitHub account for SSO as a workaround (and then changed to a username and password once I’d signed up). However, I still wasn’t impressed.

When spinning up machines they require that you specify “Production” or “Staging”. However, that’s not the division I use for managing my machines, and it feels like something that could be handled better using a more general concept like searchable “labels” or “tags” (the docs also don’t mention this setting or whether it does anything other than show a label). I also couldn’t add SSH keys through the API: I’d just get some generic error about the API token being wrong (it wasn’t; other API requests worked fine). On top of that, the concept of segregating resources into “projects” only worked for certain resources, the docs were hard to navigate, etc.

The plus side of DigitalOcean is that they have the most services if you’re willing to pay a premium for managed infrastructure. This includes load balancers that do TLS termination, PostgreSQL, Redis, and other managed services.

All that being said, I was already mad from the bad support experience, so take my DigitalOcean dislike with a grain of salt. It’s quite possible I was looking at their product with pre-conceived biases that were tainting my assessment of it, and it’s quite possible that my support experience is uncommon and that if I tried again I wouldn’t get so unlucky. However, I don’t have unlimited time to try products over and over again after a bad first impression, so onto the next provider.

2021-12-12 Update: though I tried to be fair to DigitalOcean in this post and hedged my bad experience with statements about how I went in with pre-conceived biases and that my support experience may be uncommon, I recently tried to use them again for another project and had an even worse experience. In this case I can’t access my droplets via SSH due to extremely high network latency and packet loss on the last hop, and I can’t access the recovery console because they don’t support resetting the root password on FreeBSD droplets. However, 19 emails in, every single email from them has been unhelpful. They have told me to SSH in and run commands (at which point I have to point out that the whole problem is that what they want me to do can’t work), told me at one point that since I wasn’t the primary email on the account they can’t talk to me anymore (I am the primary email on the account, and even if I weren’t that makes no sense), and even sent me a reply with bold text saying:

Did you try using a different ISP or network and computer as suggested in the previous reply?

This sort of aggressive incompetence is astounding. I had to reply pointing out that I had in fact specifically discussed this in my previous email right below parts they had quoted and therefore obviously read. My recommendation is now just “never use DigitalOcean”.

Vultr

I also filed a customer support request with Vultr shortly after signing up and quickly got a reply that didn’t resolve the situation, but at least made me feel confident that it was on their radar and that my question had been read by a real person. I was very pleased with their customer support.

The situation in question was odd defaults in a few of their APIs. For example, creating a new domain in their DNS product sets several default records including an A record, a CNAME record mapping *.@ to the apex domain, and an MX record also mapping mail to the apex. If you don’t want any of these (and you probably don’t), you can remove them, but now what should have been one API call to create a domain is four. If, like me, you’re using Terraform, which uses declarative config files to provision infrastructure, the process is very tedious. You would have to create the domain and apply your configuration, then create resources in the config for the default records, then make three calls to terraform import for those resources, then remove them, then apply the configuration again. You could also use the web UI to remove them, but either way that’s a lot of manual work to use a tool that’s supposed to automate away the manual work and make tasks reproducible.
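To make the call count concrete, here is a minimal Python sketch of the cleanup. It does not talk to Vultr at all: the Record type, the records_to_delete helper, and the example addresses are all made up for illustration, and the three default records are only modelled on the description above.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """A hypothetical, simplified DNS record (not a real provider type)."""
    rtype: str
    name: str
    data: str


def records_to_delete(existing, desired):
    """Return the records that exist on the provider but are not wanted."""
    wanted = set(desired)
    return [record for record in existing if record not in wanted]


# Assumed defaults, modelled on the three records described above.
existing = [
    Record("A", "", "203.0.113.10"),
    Record("CNAME", "*", "example.com"),
    Record("MX", "", "example.com"),
]
desired = []  # a freshly created domain where none of the defaults are wanted

extra = records_to_delete(existing, desired)
# One call to create the domain, plus one delete per unwanted default record.
api_calls = 1 + len(extra)
print(api_calls)  # prints 4
```

With Terraform the same three records each have to be imported before they can be removed, which is where the extra manual steps come from.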

The other place where odd defaults were a bit more of a problem is in their cloud firewall offering. The first time I used it I created a firewall, added a simple rule or two on the side, and was pleased to see that the first rule was one to deny all traffic and that the rules I had added were poking holes in the deny all rule. Safelisting is a sane default for a firewall. However, then I went to create another firewall for a machine that would never accept public traffic. Since they require that all machines have a public address, I wanted to simply deny all traffic there and only allow traffic on the internal network.

The default for their firewalls is wide open, which seems odd. If I wanted that I wouldn’t attach a firewall in the first place, so imagine my confusion when there was no way to create a “deny all” rule at all. The deny rule only gets added when you create another, unrelated rule. Support’s solution was to add a rule to allow traffic from 127.0.0.1 since loopback isn’t covered by the firewall anyways and this would effectively be a no-op except that it would trigger the creation of the initial “deny all” rule. This does technically solve the problem, but definitely doesn’t feel good and is something I’d have to remember to do, and document everywhere I did it lest future me gets confused and removes what appears to be a useless rule but which is actually a very important rule that doesn’t actually do what it says on the tin. Because of this, I would never use their cloud firewall offering. I prefer to use nftables or pf myself anyways so that my firewall is more portable in case I change providers in the future, so this isn’t a dealbreaker, but it is concerning.
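The behavior is easier to see as a tiny model. The Python sketch below only encodes what I observed, under the assumption that an empty rule list means "allow everything" and any rule at all switches the firewall to "deny unless allowed". The allows function and the addresses are hypothetical; this is not how Vultr actually implements it.

```python
import ipaddress


def allows(rules, source):
    """Model the observed behaviour for a list of allowed source networks."""
    if not rules:
        # No rules at all: the firewall is wide open.
        return True
    address = ipaddress.ip_address(source)
    # At least one rule exists: an implicit "deny all" applies and only
    # sources matching an allow rule get through.
    return any(address in ipaddress.ip_network(rule) for rule in rules)


outside = "198.51.100.7"  # documentation-range address standing in for the internet

print(allows([], outside))                # True: empty firewall allows everything
print(allows(["127.0.0.1/32"], outside))  # False: the no-op rule triggers deny all
```

The loopback rule never matches real traffic, so its only effect in this model is to flip the default, which is exactly why it looks useless and is dangerous to remove.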

Otherwise, Vultr was great to work with. They support a number of operating systems out of the box including FreeBSD and OpenBSD, which I use heavily, offer smaller machines than DO or Linode, which are great for testing purposes, and even support booting with custom iPXE scripts (which made it very easy for me to get a SmartOS image up and running).

Linode

Linode was the provider I had used most heavily in the past, so I went in with the most pre-conceived biases. I left them years ago after one too many bad security practices were exposed: breaches due to out-of-date software, a lack of two-factor authentication, and a report I submitted letting them know their login page was exposed over HTTP, only to get a reply that it didn’t matter because the form submitted to an HTTPS endpoint (which still didn’t change the fact that the form itself had just been served insecurely). However, I tried to go in with a fresh mind.

They now have two-factor authentication, their site redirects to HTTPS, their customer support was great when I submitted a few small questions, and the documentation was generally well organized and complete.

I found two major problems when evaluating Linode’s offerings.

The first is that it’s hard to get custom images up and running if you don’t want to use theirs. For example, they have a guide to setting up FreeBSD that involves a lot of manual work setting up volumes, booting into rescue mode, copying data where it needs to go, associating those volumes with a cloud server, etc. whereas Vultr and DigitalOcean let you easily spin up an instance from an ISO or from another source with iPXE (in Vultr’s case).

The second big issue is that Linode doesn’t offer private networking. There is a private network that your servers can join, but it’s shared by the whole data center so you’ll have to be very careful with your firewall rules. This is the issue that is likely to be a deal breaker for me, which is sad because I loved their other offerings.
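As a rough illustration of what "very careful" means here, the Python sketch below contrasts a rule that trusts the whole shared range with one that only trusts named peers. The address range, the peer addresses, and the accept helper are all hypothetical values chosen for the example, not Linode’s actual configuration.

```python
import ipaddress

# Assumed shared private range for the data center (illustrative only).
SHARED_RANGE = ipaddress.ip_network("192.168.128.0/17")

# Hypothetical private addresses of my own machines.
PEERS = frozenset(
    ipaddress.ip_address(a) for a in ("192.168.130.10", "192.168.130.11")
)


def accept(source, peers):
    """Accept traffic only from explicitly listed peer addresses."""
    return ipaddress.ip_address(source) in peers


stranger = "192.168.130.99"  # another customer on the same shared network

# A range-wide rule would trust the stranger; the per-host allowlist does not.
print(ipaddress.ip_address(stranger) in SHARED_RANGE)  # True
print(accept(stranger, PEERS))                         # False
print(accept("192.168.130.10", PEERS))                 # True
```

In pf or nftables terms this means writing rules against individual peer addresses rather than the private subnet as a whole.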

On a related note that might be an issue for some: they don’t offer firewalls (although their website says these are “coming soon”). This is fine for the purpose of this comparison since, as I mentioned before, I wouldn’t be using Vultr’s firewall offering either and prefer to manage pf or nftables myself.

Otherwise, everything on Linode appeared to just work out of the box. One thing Linode offered that I liked over the other providers is Longview, their monitoring and metrics service. Although I likely wouldn’t use it for larger services where I’d want something under my own control (Prometheus), if I was just running a machine or two to host a blog or a mail server I’d love having some simple metrics I can look at.

Conclusions

After considering these options, I think each provider is most suited to the following:

Linode: I would likely choose Linode if I were running one-off services such as a blog, a personal mail or chat server, etc.

Vultr: This would be my pick for actual products requiring more complex infrastructure. If I were running several servers, a database, monitoring infrastructure, a firewall and load balancer, etc. that all needed private networking, I’d likely choose Vultr.

DigitalOcean: I would not use DigitalOcean unless I absolutely had to have managed PostgreSQL or some other feature not offered by the other providers.

2021-12-12 Update: as mentioned above in the DigitalOcean section, my new conclusion is “I would never use DigitalOcean under any circumstances”. If I were working on an existing project that was already using it, I would seriously consider investing the time and money to migrate if at all possible.