In the past, I have written critically about running ATG Oracle Commerce in “The Cloud” (here and here).
I’ve recently spent several months working on a project implementing ATG Oracle Commerce at AWS in partnership with Pivotree and wanted to share what I’ve learned. Pivotree is the only AWS and Oracle ATG partner certified to offer this solution for the AWS cloud; they have received a special designation for this through the Amazon Partner Network. I will be discussing AWS; however, virtually everything I say applies equally to GCP and may also pertain to Azure.
One of the biggest changes since I last wrote about this topic is the shift from core/CPU-based licensing to metrics-based licensing for recent ATG Oracle Commerce licenses. This means that you can now deploy on Cloud infrastructure without running afoul of your licensing terms. Metrics-based licensing means that the license size and cost are based on the number of transactions or page views, not on the hardware footprint of your servers. This enables you to deploy a large or small infrastructure without buying more licenses. Not only does it open the doors to Cloud, but it also facilitates things like warm or hot DR, active-active load balancing, and region-specific origin clusters. Previously you’d need a EULA to even think about this stuff. Now it’s all fair game! If you have older ATG Oracle Commerce licenses that are CPU or core-based, I would strongly recommend speaking with your Oracle representative to see about transitioning to the newer metrics licenses. I’m not an attorney or an Oracle license expert, so you should NOT base legal decisions or restaurant choices on anything I say. Ultimately each licensee needs to be responsible for their own compliance, and the details are in their contracts, not here or even necessarily in Oracle policy docs.
My partner Pivotree has helped customers work with Oracle to migrate from CPU-based to metrics-based licensing, so if you need help, ask the experts.
Now that the license issues are out of the way, let’s talk about the pros and cons of the technology.
The available performance for EC2 instances has increased dramatically since I wrote my earlier articles. While there’s still some performance hit from virtualization, you can provision EC2 instances that are QUITE fast. You also have a lot of flexibility in how you want to provision: more small instances or fewer large ones. The pricing is pretty flat, since it is based on overall performance, so the cost doesn’t have to drive your infrastructure approach. I/O performance has also improved. You can provision 25 Gbps network interfaces on many EC2 instance types. Latency is low, even across Availability Zones.
AWS absolutely shines with its flexibility.
While there are plenty of gotchas, odd requirements and limitations, once you have the AWS expertise, you can provision, change, de-provision, rebuild, etc. very quickly and easily. Someone without deep sysadmin/dev-ops skills can do all kinds of infrastructure work at AWS without assistance. From the developer standpoint, being able to jump in via a web console, and spin up environments, change firewalls, test, tear it all down, start over and more in minutes without any assistance is a game-changer.
While you are unlikely to want this type of manual flexibility in your production environment, once you establish your infrastructure plan, you can leverage Automation!
After you figure out what you are going to need for your environment (servers, network layout, firewalls, databases, NAT gateways, etc.), you can automate the creation and management of the infrastructure by using any one of many available tools (Terraform, SparkleFormation, etc.). That type of tooling makes managing standards and enforcing security rules easy. Automated environment build-outs leave far less room for human error, typos, and missed steps. That’s true for both a large-scale production environment and a smaller, quick development environment for a new hire. Automated environment build-outs are also FAST! With the proper investment in automation upfront, you can start from scratch and have a full set of environments running and ready for EARs and Data Import in minutes, not weeks or months.
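To make the "enforcing security rules" point a bit more concrete, here is a minimal Python sketch of the idea, not how Terraform or SparkleFormation actually work. The `IngressRule` type, the `PUBLIC_PORTS` standard, and the environment names and sizes are all made up for illustration. The point is that when every environment is built from code, a rule such as "only HTTPS may be open to the world" can be checked on every build, for production and for a new hire's dev box alike.

```python
from dataclasses import dataclass
import ipaddress


@dataclass(frozen=True)
class IngressRule:
    port: int
    cidr: str


# Hypothetical standard: only HTTPS may be open to the whole internet.
PUBLIC_PORTS = {443}


def violations(rules):
    """Return the rules that expose a non-public port to every address."""
    bad = []
    for rule in rules:
        network = ipaddress.ip_network(rule.cidr)
        open_to_world = network.prefixlen == 0
        if open_to_world and rule.port not in PUBLIC_PORTS:
            bad.append(rule)
    return bad


def build_environment(name, app_servers, rules):
    """Produce an environment definition, refusing any that break the standard."""
    bad = violations(rules)
    if bad:
        raise ValueError(f"{name}: disallowed ingress rules: {bad}")
    return {"name": name, "app_servers": app_servers, "ingress": list(rules)}


# The same function builds a large production environment and a small dev one.
prod = build_environment(
    "prod", 12, [IngressRule(443, "0.0.0.0/0"), IngressRule(22, "10.0.0.0/8")]
)
dev = build_environment("dev-new-hire", 1, [IngressRule(443, "10.0.0.0/8")])
```

A definition that opens SSH to `0.0.0.0/0` would be rejected before anything is provisioned, which is the kind of typo a manual build-out tends to let through.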
The flexibility train continues with your production environment! It’s comparatively easy to scale your production cluster up or down based on traffic, sales, holidays, and other high-traffic seasons. As mentioned above, costs are generally pretty flat for X amount of performance, and since you’re paying for minutes used, scaling up and down to be “right-sized” for your traffic can help you save a lot of money.
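As a rough illustration of the "right-sized" argument, the arithmetic can be sketched in a few lines of Python. The traffic pattern and instance counts below are invented numbers, not figures from any real cluster, and the sketch counts instance-hours only, assuming the flat per-instance pricing described above.

```python
HOURS_PER_DAY = 24


def instance_hours(schedule):
    """Sum instance-hours for a list of (days, instances) periods."""
    return sum(days * HOURS_PER_DAY * instances for days, instances in schedule)


# Hypothetical year: 30 peak days need 20 instances, the other 335 days need 8.
fixed = instance_hours([(365, 20)])            # sized for peak all year
scaled = instance_hours([(30, 20), (335, 8)])  # scaled to match traffic

savings = 1 - scaled / fixed
print(f"fixed={fixed} scaled={scaled} savings={savings:.0%}")
```

With these made-up numbers, the scaled cluster uses a bit less than half the instance-hours of one sized for peak all year. Your own ratio depends entirely on how spiky your traffic is.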
I would like to add a couple of caveats.
Cloud still isn’t necessarily cheaper than physical servers at a traditional data center. In general, it will be the same cost or slightly more expensive, once all the little things are added up. You’re buying flexibility, better availability, and access to all the services on offer at your Cloud provider. However, it IS possible to save money IF your traffic is low and you can go from a high availability cluster of redundant physical servers to smaller virtual instances. Remember, when you’re looking at moving to a Cloud infrastructure, cost savings should not be the primary driver; focus instead on business agility.
Another concern is that ATG Oracle Commerce has many architectural aspects that make cluster changes more difficult. The Oracle Commerce BCC topology is critical to site function. Changes to it must be made manually, and only while no deployments are in flight. There are also critical filesystem assets that need to be kept in sync and copied to new servers, along with other maintenance tasks. Endeca also has several cluster definitions, one for MDEX servers and one to keep track of Oracle Commerce app servers, which have to be kept in sync with the current cluster size and state. You can’t just configure an Auto Scaling Group and have your ATG Oracle Commerce cluster auto-scale. Unfortunately, it doesn’t work like that. You also need to solve more basic issues like DNS for all the in-cluster communication.
Those issues CAN be solved, but it takes a significant up-front investment in developing the tools and services to get ATG Oracle Commerce to be able to properly take advantage of things like auto-scaling or auto-healing, which in my opinion are some of the real strengths of Cloud hosting.
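To give a feel for the kind of tooling involved, here is a small Python sketch of one piece of it: working out what has to change in a cluster definition when the set of running app servers changes. It is purely illustrative. The function name, the host names, and the `deployment_in_flight` flag are all hypothetical, and it does not call any real BCC or Endeca API; a real tool would have to apply the plan through whatever interfaces your installation exposes.

```python
def plan_topology_change(registered, running, deployment_in_flight):
    """Work out which app servers to add to or remove from a cluster definition.

    registered: hosts currently listed in the definition (hypothetical input)
    running:    hosts actually running right now
    """
    if deployment_in_flight:
        # As noted above, topology changes must wait for deployments to finish.
        raise RuntimeError("deployment in flight; retry later")
    registered, running = set(registered), set(running)
    return {
        "add": sorted(running - registered),
        "remove": sorted(registered - running),
    }


plan = plan_topology_change(
    registered=["app1", "app2", "app3"],
    running=["app1", "app3", "app4"],
    deployment_in_flight=False,
)
print(plan)
```

The same comparison would need to run against each definition that tracks cluster membership, which is part of why the up-front investment is significant.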
While migrating your ATG Oracle Commerce application to AWS may not be simple or easy, there are some valuable solution features once you are there.
A properly designed Oracle Commerce production cluster will span at least three availability zones in a region. An availability zone is essentially a data center. A region will have several AZs, or data centers, in close geographic proximity, allowing for low latency, and letting you treat multiple data centers like a single data center (with a few caveats). This provides a level of high availability for your cluster in that a data center level failure or critical issue is unlikely to take down your cluster the way it would in a traditional data center hosting scenario. That said, a multi-AZ cluster does NOT provide the same level of HA you might want with a DR or Active-Active approach. It does not protect against regional crises, such as power failure, flooding, or other disasters. There can also be network/cluster issues that may impact a large geographical area. So it’s a big step up from a single data center hosting approach, but not quite a reliable DR solution.
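The value of spreading a cluster across zones can be sketched with a little Python. This is a toy model with placeholder zone names and an invented instance count, not a description of how AWS places instances. It assigns instances to zones round-robin and then asks how much capacity is left in the worst case if any single zone fails.

```python
from collections import Counter
from itertools import cycle


def spread(instance_count, zones):
    """Assign instances to zones round-robin so the zones stay evenly loaded."""
    zone_cycle = cycle(zones)
    return [next(zone_cycle) for _ in range(instance_count)]


def surviving_capacity(placement):
    """Worst-case number of instances left after losing any single zone."""
    per_zone = Counter(placement)
    return len(placement) - max(per_zone.values())


zones = ["az-a", "az-b", "az-c"]  # placeholder zone names
placement = spread(9, zones)
print(surviving_capacity(placement))
```

With nine instances over three zones, losing one zone still leaves six running, whereas the same nine in a single data center would all go down together. None of this helps if the whole region is affected, which is the DR gap described above.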
AWS offers several useful services covering things like identity and access management, firewalls, backups, log aggregation and management, monitoring and alerting, WAFs, CDNs, APM, and much, much more. You don’t have to use their offerings, but in many cases, they can be compelling options.
One important matter of note: AWS rolls out changes and new features at an impressive rate.
This is both a curse and a blessing. There is a lot of value to having new and improved features available to you at your hosting provider. It can also be difficult to keep up with the speedy transitions, and it’s easy to miss out on useful technology simply due to the rate of change. In my experience recently working on a relatively small-scale Proof of Concept project at AWS, new features were rolled out which would have been great to leverage at the beginning of my project, but at the stage they were released, would have required major re-work to be utilized.
In most cases, moving your ATG Oracle Commerce application to AWS, or another Cloud Hosting Provider, is not an easy lift-and-shift move.
There are lots of potential gotchas and changes to be made. Be sure to allow for plenty of time (and plenty of load testing and tuning) before you go live. Don’t expect to save money. You can, however, take advantage of the many available strengths and advanced services if you’re willing to invest in the technology and in growing your knowledge.
AWS is similar to ATG in that they are both complex, powerful, and sometimes fragile systems.
You will need both ATG Oracle Commerce AND AWS expertise if you want to run your ATG Oracle Commerce infrastructure at AWS. If you don’t have this experience, you will need to work with a company that has plenty of both!