Thursday, November 26, 2015

Main differences in managing public cloud-based software development projects




Nowadays more and more software development projects are based not on traditional on-premises infrastructure but on the public cloud. Other projects are still under consideration: to move or not to move. Let me briefly summarize the impacts on the management of such projects that I see as the most crucial.

First of all, if you develop cloud-native software (not just move your existing software to the cloud), the public cloud means your team can leverage cutting-edge industry approaches and solutions in a standardized, not home-brewed, way.

What does this mean? Well, it means that a lot of sophisticated options for information storage, access, and manipulation are available to your development team in the way your CSP (Cloud Solution Provider) exposes that functionality, with all the abstraction that helps to leverage it. For example, when you use PaaS-level services of your CSP, you can get such non-trivial things as DR (Disaster Recovery), FT (Fault Tolerance), security, and so on simply as attributes of your whole product/solution.

Of course design/plan/prototype activities are still required in the project plan, but the risks you need to manage here are typically lower. So some of the technical risks are in fact being mitigated for you.

One more big point is the level of infrastructure agility: you need to manage infrastructure provisioning for different stages/environments of your project, depending on the project phase. At the beginning of the project your team will mostly need prototyping environments to prove the solution concept. Later, during development, you will need integration and testing environments to support your testing activities. Finally you will need UAT (User Acceptance Testing), staging, and production environments.

Provisioning of these environments should typically be in the project plan too. For traditional projects the duration of such provisioning activities can be really significant (days or more); in the cloud it can take minutes to hours and is often automated. It is also much easier to manage and track, as you as a manager can see (and even initiate) it all yourself through the CSP's self-service portal.
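For illustration, here is a minimal sketch of how such provisioning could be scripted (assuming AWS and the boto3 Python SDK; the region, AMI ID, key pair, and tag values are placeholders, not real resources):

    # A minimal sketch of provisioning a testing environment on AWS with boto3.
    # The region, AMI ID, key pair, and tag values are placeholders, not real resources.
    import boto3

    ec2 = boto3.resource("ec2", region_name="eu-west-1")

    instances = ec2.create_instances(
        ImageId="ami-00000000",      # placeholder AMI with the project's base image
        InstanceType="t2.medium",
        MinCount=1,
        MaxCount=2,                  # e.g. one app node and one test agent
        KeyName="project-key",       # placeholder key pair name
    )

    for instance in instances:
        instance.wait_until_running()
        # Tag the node so the environment shows up clearly in usage and cost reports.
        instance.create_tags(Tags=[{"Key": "Environment", "Value": "integration-testing"}])
        print(instance.id, instance.state["Name"])

A script like this can sit next to the project plan activity it supports, so an environment that used to take days to request can be brought up (and torn down) on demand.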

If you and your team are using Agile development practices, then I think the cloud helps with one more thing: the ease of running regular demos for stakeholders. The public cloud is an ideal platform for all possible access scenarios (thanks to the "Broad network access" essential characteristic of any cloud), and this external infrastructure gives your team and stakeholders a common playground.

On the other hand, the public cloud also has implications stemming from the fact that the CSP is an external organization. You need to manage communication with CSP support from the very beginning of the project, and you have to mitigate the risks of CSP link downtime, as this link becomes a critical project resource. For a deeper analysis of cloud-related risks, please read my other post "Risk Identification in Cloud-based Software Development Projects".

Sunday, November 15, 2015

EU Personal Data: Safe Harbor vs Home Port



As you probably know, the Safe Harbor framework, which had governed the handling of European personal data in the US since 2000, was recently ruled insufficient by the ECJ (European Court of Justice). This changes the background for cloud computing consumers in Europe quite a lot.
OK, here is what this is all about, bit by bit:
  1. Europe has its own privacy laws. European citizens' personal data protection is regulated by the Data Protection Directive of 1995.
  2. There are a lot of transnational US businesses that store, aggregate, and analyze global customer data in US-based datacenters. Such businesses can be at the infrastructure level (cloud services providers, hosting services, etc.), online services (social networks, blog platforms, search engines, etc.), e-commerce players, and so on.
  3. The Safe Harbor Privacy Principles were developed starting in 1998 and enacted in 2000 to make it possible for European personal data to travel across the Atlantic and be handled there in a safe manner.
  4. Over the last several years there have been revelations of NSA activities and USA PATRIOT Act enforcement that bypass European personal data protection and privacy laws.
  5. Safe Harbor is no longer sufficient to protect European citizens' personal data, as ruled by the court in 2015.
What will be the most likely consequences? Will it help European CSPs to rise and gain market share? Will this create jobs in Europe? Will this boost cloud consulting companies?

What we can see at the moment is that the big US CSPs are opening more datacenters to keep European data (and metadata) in Europe.


On the one hand, the cloud infrastructure business is a mass market with a low-margin economy; it is only possible to compete there with global scope and huge resources. So the Safe Harbor strike-down will probably not significantly help new European players to benefit from this situation; however, the new datacenters to be opened in Europe should add more jobs in EU member countries.

On the other hand, Europe has its own strategy for cloud computing (https://ec.europa.eu/digital-agenda/en/european-cloud-computing-strategy), the C4E (Cloud-for-Europe) initiative, and the ECP (European Cloud Partnership, https://ec.europa.eu/digital-agenda/en/european-cloud-partnership) organization, so why not coordinate/implement something at the level of a pan-European CSP?

Wednesday, November 11, 2015

if(yourPublicCloud.isClosing)




With the news of HP Helion Public Cloud closing down in January 2016, we can once again reconsider the main public cloud risk factors we should always keep in mind.

Of course this is not the first time a public cloud provider goes dark, gets out of business, or just shifts its strategy so that the public cloud is discontinued (Nirvanix and Megacloud, to name a few). Since the public cloud is a mass market, it takes a lot to compete in this low-margin area.

A public cloud provider has to be a really big player to support the huge compute resources in data centers across the globe. All the leaders in the area have them: AWS, Microsoft Azure, Rackspace, Google. To some degree, relying on a leading CSP (Cloud Services Provider) that currently demonstrates vision and strategy execution in the public cloud area can be considered a significant risk mitigation (for example, you can see such CSPs positioned in the upper-right part of Gartner's Magic Quadrant).

Nevertheless, if the public cloud you are using has been announced to close down, what are the factors that can make migration from it more expensive, harder, or even impossible, so that you end up "locked in" to that cloud? I would say the main ones (but surely not all) are:

  1. Using CSP-specific functionality that can't be easily migrated. This is a typical risk of the PaaS (Platform-as-a-Service) model as opposed to IaaS (Infrastructure-as-a-Service): you are not just dealing with virtual machine images you can export/import/recreate; you are using vendor-specific services.
  2. Keeping a lot of data in the cloud. Getting data in is easy and cheap, yet moving it out is typically charged much more (see the toy calculation after this list).
  3. Using the public cloud as your primary infrastructure can be considered a risk as well. It is one thing to “burst” into the public cloud when compute elasticity is needed; it is another to be fully based in the public cloud.
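To get a feeling for point 2, here is a toy calculation (the dataset size and per-GB rates are purely illustrative assumptions, not any CSP's actual pricing) comparing the cost of getting a large dataset into a cloud versus moving it out during a migration:

    # Toy comparison of data ingress vs. egress cost for a one-time migration.
    # The dataset size and per-GB prices are illustrative assumptions, not real CSP rates.
    DATASET_GB = 50_000            # ~50 TB accumulated in the cloud

    INGRESS_PER_GB = 0.00          # inbound traffic is often free
    EGRESS_PER_GB = 0.09           # outbound traffic is typically billed per GB

    print(f"Cost to get the data in:   ${DATASET_GB * INGRESS_PER_GB:,.2f}")
    print(f"Cost to move the data out: ${DATASET_GB * EGRESS_PER_GB:,.2f}")

The asymmetry is the point: the more data accumulates inside the cloud, the bigger the one-time bill (and the longer the transfer window) when you have to leave.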

Tuesday, November 3, 2015

May the Cloud Force be with you. What the recent movie ticket services crash teaches us.

Have you heard how frustrating it was to order tickets for the upcoming Star Wars: The Force Awakens? Big online services like Fandango, MovieTickets, AMC, Regal, Cinemark, and others across the globe crashed when fans flooded them in attempts to book tickets right after the announcement.
“Cloud Force” could really help here. Since this was a significant peak in booking service consumption, it could be addressed perfectly by the cloud:

  • Rapid elasticity would allow handling a sharply increased number of consumers without noticeable degradation of the service level. Computational nodes could be added automatically and transparently, and deprovisioned when not needed any more.
  • A hybrid cloud scenario would allow borrowing the required computational power from the public cloud without any need to invest in dedicated infrastructure.

Even if this peak had been unexpected, the cloud could have handled it based on utilization metrics; however, in this case the spike was perfectly foreseen, so the anticipated load could be addressed with schedule-based elasticity triggers or even with manual provisioning.

Automatic elasticity comes extremely smoothly if you deal with the cloud at the PaaS (Platform-as-a-Service) level: everything will be handled for you mostly transparently. If you want to keep the resources under control at the IaaS (Infrastructure-as-a-Service) level, for some of the most popular public cloud providers the features that enable automatic elasticity would be (see the sketch after this list):

  • Amazon Web Services: Auto Scaling (Auto Scaling Groups, scaling policies), Elastic Load Balancing
  • Microsoft Azure: Cloud Services, Azure Load Balancer
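For a foreseen spike like this one, a scheduled scaling action is enough. Below is a minimal sketch using the AWS Auto Scaling features from the list above (assuming boto3 and an existing Auto Scaling group; the group name, capacities, and timestamps are placeholder assumptions) that grows capacity ahead of the ticket sale and shrinks it afterwards:

    # Minimal sketch: pre-scheduling capacity for a known demand spike with AWS Auto Scaling.
    # The Auto Scaling group name, capacities, and timestamps are placeholder assumptions.
    from datetime import datetime
    import boto3

    autoscaling = boto3.client("autoscaling", region_name="us-east-1")

    # Scale out shortly before ticket sales open...
    autoscaling.put_scheduled_update_group_action(
        AutoScalingGroupName="ticketing-web-asg",
        ScheduledActionName="scale-out-before-presale",
        StartTime=datetime(2015, 10, 19, 11, 0),
        MinSize=20,
        MaxSize=60,
        DesiredCapacity=40,
    )

    # ...and scale back in once the peak has passed.
    autoscaling.put_scheduled_update_group_action(
        AutoScalingGroupName="ticketing-web-asg",
        ScheduledActionName="scale-in-after-presale",
        StartTime=datetime(2015, 10, 20, 11, 0),
        MinSize=2,
        MaxSize=10,
        DesiredCapacity=2,
    )

Metric-based scaling policies could be layered on top of this for the part of the load that is not perfectly predictable.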

Basically, the cloud would not only help to handle such usage peaks and thrive, it would also do so in a really cost-effective way, without any need for statically assigned, mostly idling infrastructure.

Some of the services posted public apologies for the downtime.

Tuesday, June 2, 2015

Cost Management for Projects That Use Public Cloud



Project cost management 


Project cost is one of the biggest aspects of project management, and it is typically handled through a set of well-established approaches. According to PMI, the processes that make up project cost management are:

  1. Resource Planning
  2. Cost Estimating
  3. Cost Budgeting
  4. Cost Control

Let's consider how public cloud associated costs can be addressed within this framework (cloud costs will be only a part of all project costs, yet we will focus on them specifically).

Public cloud resource planning 


The public cloud can be considered a set of resources which depend on the cloud service model: PaaS (Platform-as-a-Service), IaaS (Infrastructure-as-a-Service), or SaaS (Software-as-a-Service). More specifically, those resources would be:

  1. IaaS: server instance usage, storage space, network bandwidth, inbound and outbound traffic, load balancing, etc.
  2. PaaS: developer accounts, technology/service/framework usage, etc.
  3. SaaS: user accounts, application/service usage, etc.

Based on the cloud service model (and the type of the cloud), the resource requirements should be defined.
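One simple way to capture the output of this planning step is a per-environment requirements table; here is a hypothetical IaaS example (all figures are illustrative planning assumptions, not recommendations):

    # Hypothetical IaaS resource requirements per project environment.
    # All figures are illustrative planning assumptions, not recommendations.
    resource_plan = {
        "prototyping": {"instances": 2,  "storage_gb": 100,  "egress_gb_per_month": 10},
        "integration": {"instances": 4,  "storage_gb": 250,  "egress_gb_per_month": 50},
        "uat_staging": {"instances": 6,  "storage_gb": 500,  "egress_gb_per_month": 100},
        "production":  {"instances": 12, "storage_gb": 2000, "egress_gb_per_month": 500},
    }

    for environment, requirements in resource_plan.items():
        print(environment, requirements)

Whatever the format, the point is to make the resource requirements explicit per environment and per project phase, so they can feed the estimating step below.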

Public cloud cost estimating 


Once the resource requirements are defined, we can use our WBS (Work Breakdown Structure), activity duration estimates, and resource rates to estimate the cloud-related costs of the project.

Resource rates are provided by the CSP (Cloud Services Provider), and sometimes the cost structure is not very easy to grasp; AWS (Amazon Web Services) EC2 pricing is a good example. You have to decide about a lot of things to be able to calculate the costs, and most likely this will require involving the project's System Architect or Tech Lead. (There are also some online calculators, yet the process is still not that simple.)
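As an illustration of how WBS activities, durations, and CSP rates combine into an estimate, here is a toy calculation (the instance counts, hours, and hourly rates are assumptions for illustration, not actual AWS prices):

    # Toy cloud cost estimate: WBS activity duration x resources x CSP rate.
    # Instance counts, hours, and hourly rates are illustrative assumptions only.
    activities = [
        # (WBS activity,             instances, hours, rate per instance-hour)
        ("Prototyping",                      2,   160, 0.07),
        ("Integration testing",              4,   320, 0.07),
        ("Load/performance testing",        20,    72, 0.28),
        ("UAT / staging",                    6,   240, 0.14),
    ]

    total = 0.0
    for name, instances, hours, rate in activities:
        cost = instances * hours * rate
        total += cost
        print(f"{name:28s} ${cost:10,.2f}")

    print(f"{'Estimated cloud compute cost':28s} ${total:10,.2f}")

A real estimate would also cover storage, traffic, load balancing, and so on, which is exactly why the architect's input is needed.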

Public cloud cost budgeting 


After the estimation of the required resources is done, we can combine it with the WBS and the project schedule to come up with the cost baseline.

The cost baseline for the whole project will include different kinds of costs; we are considering only the cloud-related ones here.

As the baseline extends over time, it will go through the different phases of the project, and each phase will require its own specific amount of cloud resources. It is pretty typical that at the beginning of the project the team will require cloud resources for prototyping, and that closer to the project end a lot of testing will be performed in the cloud. Speaking of testing activities in the cloud, it makes sense to mention that load/scalability/performance testing can be the most resource-consuming and hence can require a significant part of the whole project cloud budget.
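A cost baseline is essentially the cumulative planned spend over time; here is a minimal sketch of building one per phase (the per-phase amounts are illustrative assumptions):

    # Minimal sketch of a cloud cost baseline: cumulative planned spend per project phase.
    # The per-phase amounts are illustrative assumptions.
    phases = [
        ("Prototyping",        500.0),
        ("Development",       1200.0),
        ("Integration tests", 1800.0),
        ("Load/perf tests",   4500.0),   # usually the most resource-hungry part
        ("UAT / staging",     1500.0),
    ]

    cumulative = 0.0
    for phase, planned in phases:
        cumulative += planned
        print(f"{phase:18s} planned ${planned:8,.2f}  cumulative ${cumulative:9,.2f}")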

Public cloud cost control


This part of the whole cost management process set probably fits the current state of the art of public clouds best. Since one of the main cloud characteristics is "Measured service" (see the NIST cloud definition I mentioned in my previous post Cloud-related projects: when your backend is really based on cloud services?), you are usually in full control of the cloud costs to date. This means you can use very structured reports to see how the cloud spending is attributed, which helps a lot to revise estimates, make needed updates, or take corrective actions. This is where the public cloud typically shines.
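Because the service is measured, actuals can be compared against the baseline almost continuously. A hedged sketch of such a check (the record structure and figures are assumptions, not any particular CSP's report format):

    # Sketch of a simple cost-control check: compare measured cloud spend to the plan.
    # The record structure and numbers are assumptions, not a specific CSP report format.
    planned_to_date = 3500.00

    # Usage records as they might be exported from the CSP's billing/usage reports,
    # already attributed via resource tags (e.g. the "Environment" tag).
    usage_records = [
        {"environment": "integration-testing", "cost": 1240.50},
        {"environment": "load-testing",        "cost": 2380.10},
        {"environment": "prototyping",         "cost":  310.00},
    ]

    actual_to_date = sum(record["cost"] for record in usage_records)
    variance = actual_to_date - planned_to_date

    print(f"Actual to date:  ${actual_to_date:,.2f}")
    print(f"Planned to date: ${planned_to_date:,.2f}")
    if variance > 0:
        print(f"Over plan by ${variance:,.2f} - time to revise estimates or take corrective action.")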

Sunday, May 31, 2015

Risk Identification in Cloud-based Software Development Projects



Risk Management in Software Development Projects


Managing project risks is a usual area of responsibility for any project manager. It makes sense, anyway, to recall the main definitions of risk before discussing any specifics related to cloud software projects.

According to the ISO 31000:2009, risk is "the effect of uncertainty on objectives".

PMBOK's definition is "Risk is an uncertain event or condition that, if it occurs, has a positive or negative effect on the project's objectives."

So risk is all about uncertainty and its effect.

Does the cloud's nature add more uncertainty to a software development project?


Of course, cloud infrastructure itself has its own associated risks (the most discussed are security risks in the public cloud, for example). However, I would like to look into the software development project risks that are introduced when the software under development is intended to run in the cloud.

First of all, there are two basic types of clouds, and the risks for projects relying on them are different, so let's consider the risks by type.

Risks introduced to the project by the private cloud 


Let's go through the main (in my opinion) uncertainty sources one by one. Each of them can introduce multiple risks.

  1. Low "maturity level" of the in-house cloud infrastructure. How comfortable is the organization with its cloud infrastructure? Is this the first project to really rely on it?
  2. Insufficient cloud skill level of the internal IT staff. Will it be possible to rely on IT specialists to resolve possible problems?
  3. Unknown/weak SLA. What are the real service levels for the cloud? Will it be enough to address the project goals?
  4. Other cloud tenants' priority within the organization. Will your resource and communication needs get enough priority for the project to succeed?

Risks introduced to the project by the public cloud 


Let's proceed with the public cloud. I have three specific points (again, each can mean multiple risks):

  1. External CSP (Cloud Services Provider) dependency. Yes, after all, this is one more third party you depend upon. Public cloud blackouts are still possible, and your link to the CSP is critical too.
  2. Unclear costs, complicated calculations. Modern public clouds (like Amazon Web Services) are notorious for prices that are hard to calculate up front (yet fully transparent post factum), so your budgeting may not be easy.
  3. Some unexpected limitations can apply. This can be any technical thing, like the allowed traffic volume for your load testing or CPU steal time for some server instance types.

Summary


Of course the level of the cloud risks depends on how confident you are in the cloud-related development skills of your team and in the cloud infrastructure you use. Still, it looks like for the private cloud the risks mostly depend on the level of cloud adoption/maturity within the organization, so if this is not the first project relying on the same private cloud, the chances of success are much higher. For the public cloud, experience matters too, but part of the risks is fully external.

Once we can see the cloud uncertainty sources (and so identify the cloud development risks), we can come up with a plan for how to mitigate them. The risk management plan should address those cloud risks, and further on they should be monitored and kept under control.
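One lightweight way to keep the identified cloud risks visible is a simple risk register with probability/impact scoring; here is a minimal sketch (the entries and scores are illustrative assumptions for a hypothetical project, not a universal list):

    # Minimal cloud risk register sketch: probability x impact scoring for ranking.
    # The entries and scores are illustrative assumptions for one hypothetical project.
    risks = [
        # (risk,                                    probability 1-5, impact 1-5)
        ("CSP or CSP link outage",                          2, 5),
        ("Cloud budget overrun (complex pricing)",          3, 3),
        ("Unexpected CSP technical limitation",             2, 4),
        ("Low in-house private cloud maturity",             4, 3),
    ]

    for risk, probability, impact in sorted(risks, key=lambda r: r[1] * r[2], reverse=True):
        print(f"score {probability * impact:2d}  {risk}")

The ranking then drives which risks get explicit mitigation actions in the risk management plan and which are simply monitored.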

Wednesday, May 27, 2015

Cloud-related projects: when your backend is really based on cloud services?


A person in charge of project leadership, and ultimately project success, should understand what a cloud-based development project really means, as this is a really big point (even though this game changer is already mature enough).

Cloud-as-a-Buzzword


Due to the big hype of the last 5-7 years, we encounter the word "cloud" and projects related to "cloud services" development very often. In fact, a good deal of such cloud-based projects are not about the cloud at all.

Sometimes a backend is called "cloud services" just for marketing purposes (yes, cloud is still cool), but this can also be a lack of understanding of what the "real cloud" means. You can meet a lot of developers (even quite experienced ones) who will say something like "the cloud is actually nothing really different from the traditional client/server with the backend in some virtualized infrastructure (or even just the Internet)". It is not like this, however.

When do you really develop cloud services, and when is the project about the cloud?


First of all, to be clear, by "cloud services" I mean not the services a CSP (Cloud Service Provider, like Amazon Web Services) provides to its consumers, but the services that represent the back-end of a system which really benefits from a cloud-based architecture.

Does this mean that any services deployed to a public cloud (AWS, Microsoft Azure, etc.) or a private cloud (based on OpenStack, etc.) automatically become cloud services? Well, no. Only the services that can really use the cloud's nature are cloud services.

What is the cloud? I think NIST (National Institute of Standards and Technology) gives a very good definition for this type of infrastructure. According to this widely accepted definition, these are the essential characteristics of cloud infrastructure (with my explanations):


  1. On-demand self-service. The cloud consumer should be able to perform all cloud provisioning actions without any help from the cloud provider. This can be fully automated with different APIs or done manually through a "service catalogue" (usually a Web-based user interface).
  2. Broad network access. Stands for the ability to consume the software services in the cloud using different types of clients through standard platform-independent protocols (it is not about broadband connections).
  3. Resource pooling. This refers to the fact that the cloud infrastructure is shared among different "tenants" to optimize utilization. This makes it efficient and is usually a big supporting point for switching to cloud infrastructure.
  4. Rapid elasticity. The truly great ability of the cloud to quickly grow the amount of resources provisioned for "tenants" when needed and to shrink it when not needed any more.
  5. Measured service. In the cloud, all resources are standardized "commodities" whose consumption is measured (like electricity consumption).


The most significant enablers for service developers there are "rapid elasticity", "broad network access", and "on-demand self-service", to my mind.

OK, how to use this knowledge?


The important point is that the cloud is not just virtualized infrastructure, not just virtual machines, etc. It is a highly automated, elastic, controllable, efficient environment that should be employed in the right way to benefit from it.

You'd better ask your System Architect, Lead Developer/Expert, or another person in charge of technical decisions in your team how your backend services are compatible with cloud deployment scenarios, how they address scaling in the cloud, what the interfaces that enable service consumption are, how cloud-backed automation is leveraged, and so on.

Cloud services can make a real difference for the system/product you develop (especially in terms of scalability and automation), but only if they really use the features of the cloud infrastructure.



Monday, May 25, 2015

Lifecycle of the data analytics project



What is different about data analytics projects


How are data analytics projects (those related to building models used for predictions, decision making, classification, etc.) different from traditional software development projects? While the final deliverable of the project is still typically some form of automated software system, the project stages are different.

First of all, there is a new essential role in such projects: the Data Scientist, a specialist who possesses a skillset that is not present in a common software development project team. This person is not a Software Engineer, a Business/System Analyst, or a System Architect. This is a professional who can make sense of data arrays, apply statistical and mathematical methods to the datasets to identify the hidden relations inside, and finally validate the candidate models.

Since the success of the whole project depends greatly on the result of the Data Scientist's work, this largely determines the lifecycle of the project.

Most popular models for data analytics projects lifecycle


There are several existing project lifecycle models for data science/analytics related projects; the ones I see as most significant are:

1. CRISP-DM (CRoss Industry Standard Process for Data Mining), widely accepted by big players like IBM Corporation with its SPSS Modeler product.
2. EMC Data Analytics Lifecycle developed by EMC Corporation.
3. SAS Analytical Lifecycle developed by SAS Institute Inc.

Generic data science related project lifecycle


While the most popular models mentioned above use somewhat different terminology and propose different numbers of lifecycle phases, there are big similarities among all of them. In general the phases can be described as follows:

1. Everything begins with business domain analysis.
2. Datasets accumulated as a result of business operations are understood and prepared (extracted/transformed/normalized/cleaned up/etc.).
3. A model based on the datasets is planned and built.
4. The model is evaluated/validated (including communication of the results to upper management, as this is a business value validation).
5. The model is operationalized and deployed, including all required software development.

The lifecycle is iterative, and adjacent phases themselves can go through several iterations.

As you can see, only the operationalization phase is about software development in its traditional form, while all the preceding phases are related to data science.
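The generic lifecycle above can be read almost literally as a pipeline skeleton. A schematic sketch (the function names and bodies are placeholders, not a real project):

    # Schematic skeleton of the generic data analytics project lifecycle.
    # The function bodies are placeholders; real phases involve far more work and iteration.
    def understand_business_domain():        # 1. business domain analysis
        return {"objective": "reduce churn"}

    def prepare_data(objective):             # 2. dataset understanding and preparation (ETL, clean-up)
        return ["cleaned", "normalized", "datasets"]

    def build_model(datasets):               # 3. model planning and building (the Data Scientist's core work)
        return {"type": "classifier", "trained_on": datasets}

    def evaluate_model(model, objective):    # 4. validation, incl. communicating the business value
        return True

    def operationalize(model):               # 5. deployment and the traditional software development part
        print("deploying", model["type"])

    context = understand_business_domain()
    data = prepare_data(context["objective"])
    model = build_model(data)
    if evaluate_model(model, context["objective"]):   # iterate back to earlier phases if not acceptable
        operationalize(model)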

Sunday, May 24, 2015

Why not use Data Science for advanced project management?

Data science and big data analytics projects

These days there are a lot of ongoing data science related projects that are promising and delivering great benefits for businesses in different domains (from telco to retail and so on).
However, project management itself mostly does not benefit from the value of the data the project generates.

Project plan execution data

During the execution of each project plan there is a constantly growing dataset of accumulated day-by-day project execution metrics (current task completion percentage, actual man-hours, velocity, estimated man-hours, current resource allocation/availability, etc.).
Usually there is a more or less static WBS (work breakdown structure) that organizes the identified units of work (tasks) and their functional relationships. The relationships between the tasks and the resources assigned to them are also tracked.
There is also even more static calendar data that includes resource availability with regard to dates.
Whatever the process methodology (Agile, RUP, etc.), the key underlying factual metrics, the work structure, and the calendars are pretty much the same.

Project plan execution with data analytics

To support project management activities, it is possible to automatically monitor and support the execution with data analytics techniques that are widely considered to be part of the Data Science toolset.
The areas of project management decision-making support can be as follows:
  1. Key project metrics forecasting (with Time Series Analysis, etc.).
  2. Identifying the project data patterns and correlations based on history; proactive alarming using these patterns (Linear Regression, Logistic Regression, etc.)
  3. What-if analysis (Linear Regression, etc.)
I believe this list can be much longer if elaborated and structured sufficiently. So let's make "The shoemaker's son always goes barefoot" irrelevant to software project development itself :).
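To give a flavor of item 1 in the list above, here is a tiny sketch that fits a linear trend to day-by-day completion metrics and projects a finish date (the daily figures are made up; a real forecast would use richer metrics and proper time series methods):

    # Tiny sketch: forecast project completion from day-by-day progress with a linear trend.
    # The daily figures are made up; a real forecast would use richer metrics and models.
    import numpy as np

    days = np.arange(1, 11)                                       # project days observed so far
    completion = np.array([3, 6, 8, 12, 15, 17, 21, 24, 26, 29])  # cumulative % complete per day

    slope, intercept = np.polyfit(days, completion, 1)            # simple linear trend
    forecast_day_100_percent = (100 - intercept) / slope

    print(f"Average progress: {slope:.2f}% per day")
    print(f"Projected completion around day {forecast_day_100_percent:.0f}")

Even such a naive trend line, refreshed daily from the project tracking data, already gives an early warning when the projected finish drifts away from the plan.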