EMC VMAX3 is a Data Services Platform

VMAX3 is now Generally Available and represents the next generation of market leading platforms from EMC that have established themselves as the most reliable and highest performing data storage platforms on the planet. Let’s examine what makes VMAX3 the first Data Services Platform in the industry and why VMAX3 is an even more revolutionary step from VMAX, as VMAX was from DMX. Along the way let’s dispel some myths about the product.

At launch VMAX3 is ready for the most mission critical workloads.  At its introduction, the original VMAX was a revolutionary platform, virtualizing the matrix interconnects between internal components and allowing for both scale-up and a scale-out expansion based entirely on Virtual Provisioning.  These were enormous leaps forward in the simplicity and scalability of Symmetrix.  EMC has developed rigorous testing, manufacturing processes, and incredible architecture designs that gave VMAX and then VMAX2 the title of “Most Reliable Symmetrix” ever produced.

VMAX3 is standing on the shoulders of the VMAX2 and is taking Reliability, Availability, and Serviceability to the next level.  Additional redundancy has been added on the backend DA connections to the DAEs.  More upgrades and service events leave the director online.  Tons of serviceability design changes for CS
have been added such as rear facing light bars (ever try to find one of these in a datacenter?), a work tray in the cabinets, etc. that set the VMAX3 as the RAS benchmark in the industry.

A Data Services Platform must be available, but it must also take functionality to the next level.  EMC’s goal for VMAX3 HyperMax OS is to make it the foundational component in our customer’s data center providing access to all of our best capabilities, FAST, SLO, FTS, Cloud, SRDF, TimeFinder, ProtectPoint, etc…).

HyperMax is our most comprehensive OS rewrite to date that takes the massively parallel preemptive multitasking kernel of VMAX3, and opens the door to running other data services apps inside the array.  The VMAX3 Data Services Platform can be extended with additional software functionality online as it comes available.  Most critically to our customers, code upgrades are done without taking a single component offline.  There is no failover/failback, one-at-a-time reboots, ports offline, etc. No other array can do this!

VMAX3 is designed for Always On operations.  Combined with SRDF, VPLEX + RecoverPoint, or ProtectPoint customer operations are always protected from outage, site failure, or any other outage situation.

We talk to our customers about “always on” Platform 3 infrastructure, and now we’ve built VMAX3 into an “always on” Data Services Platform.

VMAX3 Always On

Our phased release schedule enables us to get products to market faster.  We tell our customers how Agile development has revolutionized the way apps are brought to market and how the new normal is “fast.”   Get products out fast and iterate fast to
ramp up to provide the most important features first, prioritized by customer demand.  It’s one thing to do this in the Platform 3 web space, because the service is designed to be always on, and upgradable without degradation to the end user experience.  To do this in the hardware market, the platform must provide the same Always On availability to allow upgrades that enable the additional functionality.

In exactly this way, VMAX3 provides us a platform that is always available, and through HyperMax, additional data services can be added later online. Getting the base platform right is critical.   As previously discussed the upgradeability of VMAX3 allows us to introduce this revolutionary product now, which is crucial to maintaining market leadership position and building momentum while simultaneously creating a revolutionary new data services platform.

VMAX3 will carry customers toward Storage as a Service.  Fundamental to the value proposition of VMAX3, like VMAX Cloud Edition before it, is the idea that purchase decisions and provisioning are based on Service Level Objectives as opposed to rotational speed of the drives.  This outcome-based thinking is a wave carrying our customers toward the beachhead of Storage aaS.

Three years ago, EMC’s message for introducing ITaaS Transformation was 1) Have C-level sponsorship, 2) Pick a project and grow it, 3) Let the technology get you there.  Well, VMAX3 is getting our customers there.  Whether or not they have financial models that equate to the service levels inside VMAX3, they gain the simplicity and ability to automate processes that Service Level based provisioning provides.

The VMAX3 Data Services Platform can handle any performance Service Level Objective required by the customer.  Designed as an all-flash capable array, any of the VMAX3 family can support all-flash configurations.  Should the customer chose a higher capacity or lower cost design, spinning disk can be used in combination to provide various SLO’s of performance and capacity within the array.  Front-end and Back-end CPU cores are now pooled giving any port full line rate capability, doubling the IOPs capable on VMAX2.  PCI Gen 3 and 6Gb SAS connections to the DAE’s deliver incredible bandwidth for DSS workloads, tripling the throughput of VMAX2.  VMAX3 provides both the rich data services functionality and the performance required to process the avalanche of new data in the datacenter.

In summary, VMAX3 is an “always-on” scale-up-and-out Data Services Platform
that can be online extended via HyperMax OS to provide additional advanced software functionality within the array over its lifespan with incredibly easy to use SLO based provisioning meeting any performance requirements of SAN or NAS attached hosts. This redefinition of the storage array makes VMAX3 a larger step forward from VMAX as VMAX was from DMX.

 

How I installed OpenStack at a Service Provider in 20 mins

I’ve been playing around with my skillz in python, puppet, OpenStack, deployment methods, etc for a few weeks.  Here’s a small example of deploying OpenStack in a non-production environment (no hardening or customization of any kind) in 20 mins.  Maybe it’ll help you in your self education.

For testing purposes I use a service provider called Digital Ocean. I’m sure what they do is competitive to all things EMC, so this is not an endorsement. They do, however have VERY CHEAP costs if you’re just doing testing (on the order of < 1 penny per hour) with linux based services (again YMMV, I have no idea how good they are).

Now when I say 20 mins, that’s how long it takes for the procedures to run once you figure out what to do. I’ve spent a couple of days playing with the site, VM’s, python, devstack, etc. to get the procedure in place. That said, it took 20 mins of wall clock time to get OpenStack running from the devstack.org package on a single machine deployment (ie: not production ready).

At the end of this post is the python code to talk to the Digital Ocean API and deploy the instance. Since it’s not very sophisticated code, I just edit it directly to do what I want to do as opposed to passing command line arguments (list, deploy, destroy).

So, to get a VM I called ‘devstack’ I executed my little script with settings hard-coded:

./go.py

The script returned the following output, parsed from the JSON response:

107.170.89.157 devstack 1790811

Then I did the following to install OpenStack:

$ sudo echo "107.170.89.157 devstack" >> /etc/hosts
$ ssh root@devstack
# adduser stack
# echo "stack ALL=(ALL:ALL) ALL" >> /etc/sudoers
# apt-get install git
# su - stack
$ git clone https://github.com/openstack-dev/devstack.git
$ cd devstack && ./stack.sh
Output:lots of output from stack.sh... 
Horizon is now available at http://107.170.89.157/
Keystone is serving at http://107.170.89.157:5000/v2.0/
Examples on using novaclient command line is in exercise.sh
The default users are: admin and demo
The password: NOTFORYOUREYESTOSEE
This is your host ip: 107.170.89.157
stack.sh completed in 794 seconds.

Here is the python script I mentioned above, It’s not great, but maybe it’ll give you some idea how to talk JSON, and return results to an online API.

#!/usr/bin/python

import requests, json, pprint, time, socket, sys

################
## make the json call to the public api
################
def jsonRequest(targetUrl):
#set the headers for how we want the response
headers = {'content-type': 'application/json','accept':'application/json'}

# make the actual request
r = requests.get(targetUrl, headers=headers, verify=False)

#take the raw response text and deserialize it into a python object.
try:
responseObj = json.loads(r.text)
except:
print "Exception"
print r.text
#print json.dumps(responseObj, sort_keys=False, indent=2)
return responseObj

################
## return list of instances
################
def getDroplets(apiKey, clientId):
#set the target url for the query
targetUrl = "https://api.digitalocean.com/v1/droplets/?client_id=%s&api_key=%s" % (clientId, apiKey)

# make the actual request
resultList = jsonRequest(targetUrl)
return resultList['droplets']

################
## Deploy a new instance
################
def deployDroplet(apiKey, clientId, hostName):
sizeId = '66' #smallest; 62 = 2GB Ram
imageId = '3240036' #ubuntu 64bit
regionId = '4' #NY region
sshKey1 = '153816' #brighs ssh key
sshKey2 = ''
privateNetworking = 'true' # create private network

#set the target url for the query
targetUrl = "https://api.digitalocean.com/v1/droplets/new?client_id=%s&api_key=%s&name=%s&size_id=%s&image_id=%s®ion_id=%s&ssh_key_ids=%s,%s&private_networking=%s" % (clientId, apiKey, hostName, sizeId, imageId, regionId, sshKey1, sshKey2, privateNetworking)

# make the actual request
responseObj = jsonRequest(targetUrl)
return responseObj['droplet']

################
## destroty an instance
################
def destroyDroplet(apiKey, clientId, dropletId):
#set the target url for the query
targetUrl = "https://api.digitalocean.com/v1/droplets/%s/destroy/?client_id=%s&api_key=%s" % (dropletId, clientId, apiKey)

# make the actual request
responseObj = jsonRequest(targetUrl)
return responseObj

######################################

apiKey = "GOGETYOUROWNAPIKEY"
clientId = "GOGETYOUROWNCLIENID"

droplets = getDroplets(apiKey, clientId)
for droplet in droplets:
#for key in droplet:
# print key
print "%s %s %s" % (droplet['ip_address'], droplet['name'], droplet['id'])
#print "destroying droplet %s" % (droplet['name'])
#destroyResult = destroyDroplet(apiKey, clientId, droplet['id'])
#print destroyResult['status']

#droplet = deployDroplet(apiKey, clientId, 'devstack')
#print "%s %s" % (droplet['name'], droplet['id'])
#deployDroplet(apiKey, clientId, 'tester4')
#deployDroplet(apiKey, clientId, 'tester5')
#destroyDroplet(apiKey, clientId, '1786240')

My first python and JSON code

I’m not a developer. There I said it.

I’m a presales technologist, architect, and consultant.

But I do have a BS in Computer Science. Way back there in my hind brain, there exists the ability to lay down some LOC.

I was inspired today by a peer of mine Sean Cummins (@scummins), another Principal within EMC’s global presales organization. He posted an internal writeup of connecting to the VMAX storage array REST API to pull statistics and configuration information. Did you know there is a REST API for a Symmetrix?! Sad to say most people don’t.

He got me hankering to try something, so I plunged into Python for the first time, and as my first example project, I attached to Google’s public geocoding API to map street addresses to Lat/Lng coordinates. (Since I don’t have a VMAX in the basement)

So here it is. I think it’s a pretty good first project to learn a few new concepts. I’ll figure out the best way to parse a JSON package eventually. Anyone have any advise?

###################
# Example usage
###################
$ ./geocode.py
Enter your address:  2850 Premiere Parkway, Duluth, GA                 
2850 Premiere Parkway, Duluth, GA 30097, USA is at
lat: 34.002958
lng: -84.092877

###################
# First attempt at parsing Google's rest api
###################
#!/usr/bin/python

import requests          # module to make html calls
import json          # module to parse JSON data

addr_str = raw_input("Enter your address:  ")

maps_url = "https://maps.googleapis.com/maps/api/geocode/json"
is_sensor = "false"      # do you have a GPS sensor?

payload = {'address': addr_str, 'sensor': is_sensor}

r = requests.get(maps_url,params=payload)

# store the json object output
maps_output = r.json()

# create a string in a human readable format of the JSON output for debugging
#maps_output_str = json.dumps(maps_output, sort_keys=True, indent=2)
#print(maps_output_str)

# once you know the format of the JSON dump, you can create some custom
# list + dictionary parsing logic to get at the data you need to process

# store the top level dictionary
results_list = maps_output['results']
result_status = maps_output['status']

formatted_address = results_list[0]['formatted_address']
result_geo_lat = results_list[0]['geometry']['location']['lat']
result_geo_lng = results_list[0]['geometry']['location']['lng']

print("%s is at\nlat: %f\nlng: %f" % (formatted_address, result_geo_lat, result_geo_lng))