Continuous deployment (CD) becomes more effective when we design our infrastructure around deploying apps. In this episode of CloudUp, we share insights on how to get started with continuous deployment and, when it comes to CI/CD, what needs to be in place for a smooth transition.

Meet the Speakers

Han Kim

Principal Architect

Jeremy Pries

Director of Cloud Infrastructure

Transcript

Jeremy

CI is a little easier to picture, you know. As we transition into CD and try to deploy, you know, the name of the game in the CI/CD space is to iterate faster, release things faster, add customer value faster. So, CD’s a little bit harder, would you agree?

Han

I mean, extremely. I think the bar between CI and CD is extremely high.

Jeremy

Yeah, for sure, for sure. So, CD becomes more effective depending on how we design our infrastructure to deploy our app. Today we’re gonna talk about CI/CD as one of the top trends in DevOps. So I know you’ve done some projects in the CI space, and, like, what kinda stuff are you doing now that we’re not tied to physical machines anymore at all, even if they’re VMs? Like, we don’t buy a set of hardware anymore. We have, kinda, this seemingly limitless data center. Like, what does that change?

Han

Well, I think that, historically, it was like, you have a giant build server. And if we’re talking about things that require a build, like Android or Java, or things that require time to go through a process where there’s maybe even automated testing or some testing as part of that, we’re looking at how we change the mindset from “we have a fixed amount of compute that we can leverage all the time, ’cause we’ve prepaid for it, ’cause it’s on-prem, it’s hardware,” versus ephemeral compute, which means we can take massive machines for brief amounts of time to do things faster, right?

So, the key here is, like, on an equivalent level, if you say, “Okay, the processing of the on-prem machine has so much capacity and speed,” the equivalent in the Cloud might have, you know, a similar speed or capacity. The difference is when things start to scale. So, projects that I’m working on require the ability to keep up with a dynamically growing set of requirements, with more and more developers coming into play to build these kinds of rather large artifacts, and to do that any time, on demand. And so we spin up giant build servers that last for minutes, instead of, like, an hour or two hours of continuous running, to build these artifacts and then disappear.

And I think the part of the CI/CD methodology we use that works best is to make sure we take everything that doesn’t require consistent, constant compute, like you’re talking about, and move it onto larger, ephemeral machines, so that we get the speed gain without necessarily paying for something that’s sitting there idle most of the time, or even some of the time.

Jeremy

Oh, gotcha, so we have a pipeline that fires off a build, right, instead of having numerous people using the same build server, like a pipeline that points to a specific build server? You’re saying, like, make a bunch of copies of that?

Han

Yeah, so spin up a new one, let it do its thing, and then it dies. And everyone has their own, so if you have 500 developers and they’re all tryin’ to use one or two or three on-prem boxes, like, it gets inefficient really fast. There’s no ability to scale that really easily.
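
To make that spin-up, build, tear-down pattern concrete, here is a minimal sketch of a per-build ephemeral agent’s lifecycle. The create_build_vm, run_build, and delete_vm helpers are hypothetical placeholders standing in for whatever provider or CI-runner API you actually use; the point is just that the big machine exists only for the duration of one build.

```python
import time
import uuid


def create_build_vm(machine_type: str) -> str:
    """Hypothetical placeholder: ask the cloud provider for a large,
    short-lived VM and return its name. In practice this would map to
    your provider's instance API or your CI system's runner pool."""
    name = f"build-{uuid.uuid4().hex[:8]}"
    print(f"provisioning {machine_type} instance {name} ...")
    return name


def run_build(vm_name: str, repo: str, commit: str) -> bool:
    """Hypothetical placeholder: run the build on the remote VM
    (for example over SSH or via a CI agent) and report success."""
    print(f"{vm_name}: building {repo} at {commit}")
    return True


def delete_vm(vm_name: str) -> None:
    """Hypothetical placeholder: tear the VM down so billing stops."""
    print(f"deleting {vm_name}")


def ephemeral_build(repo: str, commit: str) -> bool:
    # One big machine per build: it exists only for the minutes it is needed.
    vm = create_build_vm(machine_type="64-vcpu-highcpu")
    start = time.time()
    try:
        return run_build(vm, repo, commit)
    finally:
        delete_vm(vm)  # always tear down, even if the build fails
        print(f"paid for roughly {time.time() - start:.0f}s of compute")


if __name__ == "__main__":
    ephemeral_build("git@example.com:acme/app.git", "a1b2c3d")
```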

Jeremy

Yeah, and I know you’re able to size the VMs then, a little different than if you were, say, running VMware upfront.

Han

Yeah, for sure, right.

Jeremy

Right? I think we might allocate a bunch more vCPUs and try to accelerate that build process.

Han

Yeah, well, we have almost, I could say, infinite, but a great deal more headroom, you know, in terms of what we can size, versus what we have on-prem, which is constrained by the machine that the VM is running on.

Jeremy

Yeah.

Han

Yeah.

Jeremy

Yeah, so this is, like, really cool for long-running builds, right? Or what were long-running builds, at least, if something took a couple hours.

Han

Yeah, or anything compute-heavy that requires a lot of processing. Image and video processing is a good example. Doing that on a single machine takes forever, as anyone in production who works with 4K or 8K video knows, but if you offload it to ephemeral machines and let them run asynchronously, they can make builds and do the processing outside of your work environment or your work time, right?
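
As a rough illustration of that fan-out, the sketch below splits a heavy job, say the segments of a video, across parallel workers instead of grinding through it serially. It uses a local process pool as a stand-in; in a real pipeline each worker would be an ephemeral cloud machine, and process_chunk is a hypothetical placeholder for the actual encode step.

```python
from concurrent.futures import ProcessPoolExecutor
import time


def process_chunk(chunk_id: int) -> str:
    """Hypothetical placeholder for one compute-heavy step, e.g. encoding
    a single segment of a 4K video. In a real pipeline each chunk would
    land on its own ephemeral machine rather than a local process."""
    time.sleep(0.1)  # stand-in for the actual encode work
    return f"chunk-{chunk_id:03d}.encoded"


def process_video(num_chunks: int = 8) -> list:
    # Fan the chunks out in parallel and collect the results, instead of
    # grinding through them one at a time on a single box.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(process_chunk, range(num_chunks)))


if __name__ == "__main__":
    print(process_video())
```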

Jeremy

Yeah.

Han

Which makes it much more efficient for people, I think.

Jeremy

Yeah, for sure. And per-second billing, does that matter in this space? It sounds like it’d be an advantage.

Han

I think it’s huge because if we look at, like, the on-prem world, we have to forecast when we do our leases for hardware, in advance what we think demand will be. So, like, HVAC, for instance. You have to kinda plan for worst-case scenario, and how do you do that in an environment where market demand, business demand, change, especially when you look at 3-year leases, or multi-year leases, right? In this case, for the per-second billing, we’re not, kinda, burdened by the inefficiency of pre-purchasing a huge amount of things that we may or may not use, or we might saturate all the way and then we’re left in a difficult situation ’cause we don’t have enough compute resources. So we only use what we need at the time, and I think architect the infrastructure as code, application as code model. You know, it’s way more efficient.

Jeremy

Yeah, for sure. So even a build that took a few minutes, if we have per-second billing, could save a bit of money, right, by paying on the second instead of rounding up to the next minute?

Han

Yeah, at any scale, for sure. I think that’s definitely the way to go.
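
To put rough numbers on that, here is a quick back-of-the-envelope comparison of billing granularities for a fleet of short builds. The $3.20-per-hour rate and the build counts are assumptions for illustration only, not any provider’s actual pricing.

```python
# Hypothetical rate for a large build VM; not any provider's real price.
HOURLY_RATE = 3.20
PER_SECOND_RATE = HOURLY_RATE / 3600


def cost(seconds: int, granularity: int) -> float:
    """Cost when usage is rounded up to the given billing granularity."""
    billed = -(-seconds // granularity) * granularity  # ceiling division
    return billed * PER_SECOND_RATE


build_seconds = 4 * 60 + 20   # a 4m20s build
builds_per_day = 500          # e.g. a large team's daily CI runs

per_second = cost(build_seconds, 1) * builds_per_day
per_minute = cost(build_seconds, 60) * builds_per_day
per_hour = cost(build_seconds, 3600) * builds_per_day

print(f"per-second billing: ${per_second:,.2f}/day")
print(f"per-minute billing: ${per_minute:,.2f}/day")
print(f"per-hour billing:   ${per_hour:,.2f}/day")
```

The difference per build is small, but rounding every short-lived machine up to the next minute or hour compounds quickly once hundreds of builds run per day.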

Jeremy

Yeah, yeah. And so global development teams can take advantage of this kind of thing too, right?

Han

Oh, especially, because, you know, depending on which cloud provider you’re on, you can stand up things anywhere in the world, in multiple regions, and you can have that multi-region deployment of all your builds and your code repositories, and then leverage, in those regions, these build servers that come and go. So, you know, the cost might increase a little bit because of the regions you’re in, but then again, the efficiency is so high, because you pay per use, that you can exchange cost for speed of deployment, in essence.

Jeremy

Yeah. So, CD’s a little bit harder, would you agree?

Han

I mean, it’s extremely. I think the bar between CI and CD is extremely high.

Jeremy

Yeah, for sure, for sure. So, CD becomes more effective depending on how we design our infrastructure to deploy our app.

Han

Yeah, well, I don’t think you can do it any other way, really, because, like, in the past, think about doing continuous deployment on the on-prem environment. Like, how would you even really go about doing that? Like, you would have to pre-prep a situation that is wildly complicated, you know?

Jeremy

Yeah, yeah.

Han

Nowadays with infrastructure as code, not so much, ’cause you can actually stand up the infrastructure as well as applications.

Jeremy

Yeah, yeah. So, our average customer is a Netflix, right?

Han

No, no.

Jeremy

Right? So, like, let’s assume we have some CI in place. How do we get started on a continuous deployment? Like, what’s the easiest spot to start from?

Han

Oh, man, that’s a tough one, because, you know, to get to CD, or even to CI, DevOps in general, you know, it’s the whole technology, you know, processing people. We have to kind of make it a mindset, an organizational mindset. But let’s say we’re saying, “Okay, we’re already deploying, you know, now and then, and we’re making updates now and then. Now we wanna allow developers to respond as quickly as possible.” I think we have to look at, okay, we’re not deploying to the same machine , you know, we’re not doing the old-school way of replace what’s there with this new code and test it because there are lots of problems with it. There’s errors, or issues, or multiple people are making changes to the code base and you’re not really aware of what other teams are doing, especially multi-national teams, et cetera. I think the better way is to actually stand up another ephemeral infrastructure, like replica, and deploy to it and do traffic-shaping from network as code piece, where we can say, “Let’s put like five or 10% of the people to the new code. Let’s just see if it’s working in the wild. Let’s see if we can handle the scale.” If not, we can roll it back, and if it does work, we can change the split from 90/10 to 100 for the new deployed code and take down the old one.

Jeremy

Oh yeah, cool, so infrastructure as code comes back into play, right? I mean, we already wrote the code to deploy that whole infrastructure, so you could replicate it for every release.

Han

Yep.

Jeremy

Wow, ultimately everything.

Han

The whole environment. Everything that supports the environment.
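
As a sketch of what “replicate the whole environment for every release” can look like, the snippet below describes an environment as data and stamps out a fresh, versioned copy per release. The apply function and the registry URL are hypothetical placeholders for whatever infrastructure-as-code tooling and image registry you actually use.

```python
def environment_spec(release: str) -> dict:
    """Describe everything the app needs, keyed by release, so a fresh
    copy can be stood up for each deployment and torn down afterwards."""
    return {
        "name": f"app-{release}",
        "network": {"name": f"net-{release}", "subnets": ["10.0.0.0/24"]},
        "services": [
            {"name": "api",
             "image": f"registry.example.com/api:{release}",
             "replicas": 3},
            {"name": "worker",
             "image": f"registry.example.com/worker:{release}",
             "replicas": 2},
        ],
    }


def apply(spec: dict) -> None:
    """Hypothetical placeholder: hand the spec to your IaC tooling,
    which creates or updates resources until they match it."""
    print(f"applying environment {spec['name']} "
          f"with {len(spec['services'])} services")


if __name__ == "__main__":
    # Every release gets its own complete environment, built from code,
    # so the old one keeps serving traffic until the new one is promoted.
    apply(environment_spec("v1.4.2"))
```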

Jeremy

Yeah. So I think, well, one thing I’ve talked to customers about is that it’s kind of ongoing. It’s like a continuous improvement process.

Han

Yeah, for sure.

Jeremy

Right? So you start with CI. You know, basic CI is something most development shops already have in place, right? But then you add on a little bit of automation at a time, and eventually, ten years from now, you might be a Netflix, or maybe the tool sets are more mature–

Han

Yeah.

Jeremy

right now to make it a little more achievable than when they started, however many years ago that was.

Han

Well I’ve seen, I think, you know, the trend is that lots of people try to offload the burden of the automation on the tool set side to, like, a web-hosting CI tool.

Jeremy

Oh, sure, yeah, like CI as a service kinda tool?

Han

Exactly, so that’s kinda been the new thing, like, I know, whole systemic earth, and Google, and other things are on the rise now, but I feel like the tool, in and of itself, is never gonna be enough. Like, there has to be a core, fundamental organizational mindset to be able to support this type of thing, and that’s why those take iteration and time, because the change-management of each component piece leading from CI to CD needs to be in play before it actually can happen in a way that’s not a disaster, you know?

Jeremy

Yeah, yeah, cool.

Jeremy

Thanks for watchin’ this episode of CloudUp!

Han

Leave your comments and questions below and win some Agosto swag.

Jeremy

Thanks, and see you next time!