Tag Archives: Vmware Hosting

Cloud Computing and the E-commerce Industry

The unprecedented growth of the Internet, and the commercial opportunity it has unveiled, has propelled e-commerce to growth rates never witnessed before. The web hosting features and resources required for e-commerce call for uninterrupted performance, reliability, and data security, among other factors. This is perhaps why the e-commerce industry has adopted cloud computing at an accelerated pace. However, not all cloud hosting service providers deliver services at the same level, so it is a good idea to assess your needs as an e-commerce enterprise and benchmark the various cloud delivery services and platforms against them. We present three important considerations.

Choosing a Cloud Hosting Company

There are several points to consider when you feel it is time to take your e-commerce enterprise to the next milestone by transitioning to the cloud. Some of these considerations may not come as a surprise because of the frequent press they receive. Others may be interesting enough to pique your curiosity. All of them are significant and deserve equitable, if not equal, attention as you plan the migration.

Access Speeds of Cloud Hosting Servers

Experts describe the speed of access to web pages as the single most popular reason why e-commerce companies pursue cloud migration. Amazon reportedly increased its overall gross revenue by 1% for every 100 milliseconds of improvement in the speed with which its flagship website served up web pages. Inferior access speeds typically lead to traffic losses, which translate into lost revenue. According to a study conducted by the Aberdeen Group, page load delays cost e-commerce enterprises up to $117 million in lost sales annually. Shopping cart abandonment has also been linked to poor server speeds. Owing to their distributed computing capability, near-instant scalability during peak times, and built-in redundancies that all but eliminate downtime, cloud servers maintain consistent access speeds regardless of how much traffic they are handling at any given time. This is even more critical during the holiday shopping season in November and December each year, when e-commerce activity is at an all-time high. Many leading-edge cloud vendors currently use Tier 3 a+ data centers to deliver optimal performance while fully supporting dedicated e-commerce applications such as shopping carts, inventory management software, CRM, live chat, and help desk solutions simultaneously, without any compromise to access speeds.

Security of Financial Data

Although traditional web hosting companies provide a level of data security that is industry compliant, cloud service providers have raised the bar through superior technology and self-governance. You will have no difficulty locating a cloud hosting company that is ISO 27001 compliant. Many cloud vendors also achieve SysTrust certification. For credit card and financial data processing in a fully secure and encrypted environment, the cloud hosting company you select should be able to design PCI-DSS compliant solutions. It should also use the highest level of SSL encryption available, which currently stands at 256-bit encryption. Most cloud hosting service providers employ several other data security measures, such as biometric screening of personnel, two-factor authentication, and intrusion detection systems (IDS), which are now industry standards in the cloud.

The Trust Factor

Customer perception has a great deal to do with e-commerce success. Suppose your enterprise can tell its customers, through newsletters and email alerts, that they will not experience downtime, will always reach your product pages at consistent speeds, and need not worry about their privacy and credit card data security, all thanks to state-of-the-art cloud technologies. All things being equal, chances are high that customer attrition will seldom become a cause for concern. Share the credentials of the cloud hosting vendor you select with your customers as yet another confidence and trust building measure. Educate them about the multiple layers of protection the cloud provides them. You are likely to receive positive feedback from your customers sooner rather than later.

Concluding Thoughts

Managed cloud hosting services provided by qualified vendors allow e-commerce enterprises to focus on their core activities related to the sale of products and services on the Internet while technology-related issues and challenges are handled by cloud vendors. Moreover, the pay-as-you-go model facilitates improved resource management and usually generates long-term savings. It is therefore no wonder that the e-commerce industry continues to experience unbridled growth worldwide. In a recent study published by ComScore, Q1 2014 saw desktop e-commerce spending rise 12 percent year-over-year to $56.1 billion, marking the eighteenth consecutive quarter of positive year-over-year growth and the fourteenth consecutive quarter of double-digit growth. mCommerce spending on smartphones and tablets added $7.3 billion for the quarter, up 23 percent from a year earlier, for a digital commerce spending total of $63.4 billion in the first quarter of 2014.

Has your e-commerce enterprise finally decided to connect with the cloud? What are some of the other factors you will take into consideration as you plan a cloud migration strategy? We would be very interested in hearing from you about your experience so far through your comments below.

Find out more about StratoGen


Deploying Hadoop in the Virtualized Cloud

Apache Hadoop is a framework built around a distributed file system for storing large amounts of data across multiple commodity servers. It is said to store both unstructured and structured data, which is true, but you can use Apache Pig and Apache Hive to write a schema around this data to give it structure. That makes it something you can query. Otherwise it would not be of much use, yes?

Hadoop data is stored in a Hadoop cluster. A Hadoop cluster is the single namenode plus the multiple datanodes that make up the Hadoop Distributed File System (HDFS). The namenode keeps track of which data is located on which virtual machine. The datanodes are responsible for storing the data blocks there. Datanodes also run the batch jobs that retrieve data from the Hadoop cluster when the user executes a query.
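To make the division of labor concrete, here is a minimal in-memory sketch of the bookkeeping a namenode does. It is an illustration only, not the real HDFS implementation: the `NameNode` class, its method names, the block ids, and the datanode names are all hypothetical.

```python
from collections import defaultdict

class NameNode:
    """Toy model: holds metadata only, never the file contents."""

    def __init__(self):
        self.file_blocks = defaultdict(list)     # file path -> ordered block ids
        self.block_locations = defaultdict(set)  # block id -> datanode names

    def add_block(self, path, block_id, datanodes):
        """Record that a block of a file is stored on the given datanodes."""
        self.file_blocks[path].append(block_id)
        self.block_locations[block_id].update(datanodes)

    def locate(self, path):
        """Return (block id, sorted datanode names) for each block of a file."""
        return [(block, sorted(self.block_locations[block]))
                for block in self.file_blocks.get(path, [])]

namenode = NameNode()
namenode.add_block("/logs/access.log", "blk_1", ["dn1", "dn2", "dn3"])
namenode.add_block("/logs/access.log", "blk_2", ["dn2", "dn3", "dn4"])
locations = namenode.locate("/logs/access.log")
```

A client would ask the namenode where the blocks live and then read the bytes from the datanodes directly.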

Hadoop queries and gathers data using batch jobs: MapReduce, Pig, Hive, plus other tools. These are Hadoop tasks that run in parallel, which is what gives a distributed storage scheme its performance boost over one big server, like some kind of UNIX mainframe.

MapReduce jobs crawl across the Hadoop Distributed File System (HDFS) in two phases: the Map phase reads the data and emits the records relevant to the query as key-value pairs, and the Reduce phase combines those pairs into the final result. Pig and Hive do the same thing at a higher level. They let the developer write this MapReduce logic in a scripting language (Pig) or in a SQL-like language (Hive), and SQL is something practically every developer already knows. To use this against unstructured data, the developer writes a schema that describes the different types of data in Hadoop (logs, database extracts, Excel files, and others). These schemas use regular expressions to split strings of text into their corresponding fields, which can then be queried using SQL.
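The idea can be sketched in plain Python without a cluster. This is a single-process illustration of the pattern, not Hadoop's API: the log format, the regular expression, and every function name below are assumptions made up for the example. It computes roughly what `SELECT status, SUM(bytes) ... GROUP BY status` would.

```python
import re
from collections import defaultdict

# Hypothetical log format, chosen only for illustration: "<ip> <status> <bytes>"
LINE_SCHEMA = re.compile(r"^(?P<ip>\S+) (?P<status>\d{3}) (?P<bytes>\d+)$")

def map_phase(line):
    """Split one line into fields and emit a (status, bytes) pair if it matches."""
    match = LINE_SCHEMA.match(line)
    if match:
        yield match.group("status"), int(match.group("bytes"))

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)

def run_job(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

sample = ["10.0.0.1 200 512", "10.0.0.2 404 128", "10.0.0.3 200 256", "not a log line"]
totals = run_job(sample)
```

In a real cluster the map calls run in parallel on the machines that hold the data, and the grouping step moves data across the network; here everything runs in one process.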

Hadoop uses replication to provide fault tolerance. But how does one use Hadoop in a virtualized cloud environment? There the vCD (vCloud Director) user might not have access to the vSphere configuration that spells out which virtual machine is assigned to which SAN LUNs and which blade chassis slot.

Why is this an issue? Hadoop by default makes 3 copies of each data block. Hadoop is also rack-aware. The Hadoop data dispersal algorithm copies these data blocks onto different storage media in a manner designed to provide data redundancy, and it takes into consideration the rack in which each physical server is located to provide additional data protection.
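As a rough sketch of what rack awareness means, the function below follows the commonly described default HDFS policy: first copy on the writer's node, second copy on a different rack, third copy on another node in the second copy's rack. It is a simplification under stated assumptions: the node and rack names are hypothetical, and the fallbacks real HDFS applies when racks are too small are omitted.

```python
import random

def place_replicas(writer, topology, rng=random):
    """Pick three datanodes for one block.

    topology maps each datanode name to its rack name.
    """
    first = writer
    # Second replica: any node on a different rack from the first.
    remote = [node for node, rack in topology.items() if rack != topology[first]]
    second = rng.choice(remote)
    # Third replica: a different node on the same rack as the second.
    same_rack = [node for node, rack in topology.items()
                 if rack == topology[second] and node != second]
    third = rng.choice(same_rack)
    return [first, second, third]

# Hypothetical four-node, two-rack cluster.
topology = {"dn1": "/rack1", "dn2": "/rack1", "dn3": "/rack2", "dn4": "/rack2"}
replicas = place_replicas("dn1", topology, random.Random(0))
```

The point to notice is that the policy only works if the rack labels are truthful. If several virtual machines report different racks but share one physical host, the three copies can still be lost together.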

With vCD riding on top of vCenter, the customer does not have direct access to the vCenter details. So, in the worst case, multiple virtual machines could all be on the same or nearly the same rack and their data stored on the same LUN (a logical unit of storage presented by the SAN). StratoGen knows about this and configures vCenter to provide the required redundancy. But part of the responsibility for doing that falls on VMware, which is what the StratoGen cloud uses.

VMware is aware of this issue and has been working since 2012 to address it and provide a tool for deploying Hadoop on VMware. First, they launched the open-source Serengeti project, a tool that makes deploying Hadoop clusters across multiple virtual machines easier. Second, VMware has dedicated programmers and architects to the Apache Hadoop community to contribute changes to Hadoop that "enhance the support for failure and locality topologies by making Hadoop virtualization-aware."

VMware summarizes what they are doing and have done with the Apache Hadoop project as follows. (I fixed their grammar mistakes. They are great engineers, but need a copy editor.)

The current Hadoop network topology (described in some previous issues like HADOOP-692) works well in classic three-tier networks… However, it does not take into account other failure models or changes in the infrastructure that can affect network bandwidth efficiency, like virtualization.

A virtualized platform has the following traits that shouldn't be ignored by the Hadoop topology when scheduling tasks, placing replicas, balancing, or fetching blocks for reading:

1. VMs on the same physical host are affected by the same hardware failure. In order to match the reliability of a physical deployment, replication of data across two virtual machines on the same host should be avoided.

2. The network between VMs on the same physical host has higher throughput and lower latency and does not consume any physical switch bandwidth.

Thus, we propose to make Hadoop network topology extendable and introduce a new level in the hierarchical topology, a node group level, which maps well onto an infrastructure that is based on a virtualized environment.

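As you can see, the goal is to make Hadoop virtualization-aware by adding a node group level to the topology. That protects replicas from sharing a physical host and lets Hadoop take advantage of the fast network between VMs on the same host.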
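One way to picture the extra level is as one more component in each node's topology path. The sketch below assumes hypothetical paths of the form `/rack/host/vm` and is not taken from the Hadoop source. It shows the two things the node group level buys: a distance measure that sees VMs on one host as closer than VMs on one rack, and a check that rejects replica sets that share a host.

```python
def distance(path_a, path_b):
    """Hops between two nodes through their closest common ancestor."""
    a = path_a.strip("/").split("/")
    b = path_b.strip("/").split("/")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)

def node_group(path):
    """Everything except the final component, i.e. /rack/host."""
    return path.rsplit("/", 1)[0]

def shares_host(replica_paths):
    """True if any two replicas sit in the same node group (physical host)."""
    groups = [node_group(path) for path in replica_paths]
    return len(groups) != len(set(groups))

# Hypothetical layout: two VMs on host1, one on host2, one on another rack.
vm1 = "/rack1/host1/vm1"
vm2 = "/rack1/host1/vm2"
vm3 = "/rack1/host2/vm3"
vm4 = "/rack2/host3/vm4"

same_host = distance(vm1, vm2)
same_rack = distance(vm1, vm3)
off_rack = distance(vm1, vm4)
```

With only rack and node levels, `vm1` and `vm2` would look like two independent machines on the same rack, which is exactly the blind spot item 1 above describes.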

VMware Hadoop Project Serengeti

Serengeti is a tool that lets Hadoop administrators deploy and set up a Hadoop cluster more easily than with the native Hadoop tools. Among other things, Serengeti can:

  • Tune Hadoop configuration
  • Define storage (i.e., local or shared)
  • Provide extensions to give Hive access to SQL databases
  • Enable VMware vMotion for moving a cluster's virtual machines between hosts
  • Provide additional control over HDFS clusters

VMware Hadoop Project Spring

Another VMware project is Spring. Spring is an open-source umbrella of projects. For example, the Spring Framework lets developers model relationships between Java classes using XML, so that objects can be instantiated from configuration files instead of being constructed explicitly in Java code. It also handles things like transactions.

The Spring Hadoop project lets programmers do various tasks, like writing Java code to run Hadoop tasks instead of using the Hadoop command line. It also extends the Spring Batch framework to manage the workflow of Hadoop batch jobs like MapReduce, Pig, and Hive. Spring also provides data access objects (think of JDBC or ODBC) for HBase data. HBase is a way to turn Hadoop into something similar to a database by providing random read/write access to the data there. Remember that Hadoop is not one file, like a database, but a collection of files, each of which could be of a different type. So HBase is an abstraction layer over that, as is Hadoop itself.

Find out more: http://www.stratogen.net/products/hadoop-hosting.html

Migrating to the Cloud – Challenges and Considerations

As organizations continue to experience vibrant growth and rapid entry into new markets, the need to architect new data environments that perform flawlessly, deliver cutting-edge technology solutions, and conserve resources has become paramount. It is often assumed that a transition from a private in-house data center to a cloud-based infrastructure is the direction most organizations should take. However, there are multiple challenges and considerations that should be addressed before you take the plunge.

Preparing for Migration across the Enterprise

The decision to transition to the cloud is by no means a purely technical one. It involves important issues such as vendor selection, strategies to handle possible service disruption during the transition, and cost considerations, to name only a few. Let us examine them briefly:

Vendor Selection

With new cloud hosting companies appearing on the horizon regularly and promoting themselves vigorously, choices may be difficult to make. Make sure you are looking at more than just the cost or the cheapest deal. Examine issues such as industry reputation, awards, and accreditations, read case studies, and ask to speak to a current customer. Find out whether telephone support is provided 24x7. Do members of your senior technical team have instant direct access to their counterparts at the cloud hosting provider, or do they have to jump through several hoops to reach them? These often overlooked factors can end up costing more money in the long run, and what appears to be a cheaper provider could end up being much more expensive.

Service Disruption

Advance planning is the key to disruption management when connecting with the cloud. If your decision to consider the cloud involves only internal corporate data, a replication model may be the right answer. In this model, your data center and your cloud operation function simultaneously until the transition is complete. However, if you have a large number of tier 1 customers who rely on you for service, as is the case with live chat, videoconferencing, and SaaS providers for instance, service disruption will have to be planned for well in advance, and your service provider should offer you a migration plan and assistance.

Resource Optimization and Costing

Cost savings are frequently mentioned as one of the main reasons why enterprises should opt for the cloud. Having a hardware-free environment can certainly save a huge amount of money and resources. Outsourcing to a cloud hosting provider also gives you the option to re-deploy your technical workforce, giving them the ability to concentrate on your core IT. Resource optimization and re-deployment options will vary depending on whether you choose the public, private, or hybrid cloud model.

Are you ready to migrate to the Cloud?

You are ready...

  • When there are frequent spikes in service usage and on-demand resources become an attractive proposition.
  • When your applications are known to perform better in the cloud (via previous testing).
  • When data privacy and regulatory compliance become top priorities because of new clients you have recently acquired.
  • When cost control is important and a pay-as-you-go model becomes viable.
  • If you are in need of a hardware refresh and want to lower your cost and optimize performance.
  • If you are moving to new premises and no longer have in-house space.
  • If you want to re-deploy technical resources and concentrate on your core IT.

Migration to the cloud, especially by the technically savvy, startups, and SaaS providers, has risen dramatically in the past few years, and for good reason. Enterprise cloud computing investment is expected to grow from $76.9B in 2010 to $210B in 2016, according to a Gartner study.

Has your organization stepped into the cloud yet? Have you finally found your silver lining? What are some of the constraints you have experienced in your decision-making process? We would love to hear from you through your comments.

For more information read the AIP Case Study.