

Monitoring VMs FTW!

Recently I had to help figure out why some customers were getting slow performance from their VMs.  Reservations were in place but didn’t help.  What did I do to find the issue?  We are a VMware shop using vCloud Suite Enterprise, which gives us vCenter Operations Manager (it has a new name, but I will always call it vCOPS) and the ability to use custom dashboards.  Sadly, I had not set it up to use LDAP or shared dashboards.

Once I got that going so our Operations staff could log in and see the TOP-N graphs as well as the vCloud Director dashboards, I saw several VMs with high CPU Ready% right off the bat.  It wasn’t long before I could see that those VMs were hammering their vCPUs, but the default Pay-As-You-Go organization setting in vCD was limiting the vCPU speed to 1GHz.  No wonder, right?  The problem is that you have to make that change in the organization’s VDC and then restart the vApp.  Not something you can just do in the middle of the day.

I created a dashboard like the one below for our biggest customer so they can see how their environment is doing.

high-CPU-Ready

Notice that the top VM is at 31.5% CPU Ready.  The issue is that this VM is in vCloud Director with its vCPU limited to 1GHz.  I changed the limit to 4GHz, but since we cannot reboot this VM in the middle of the day, I removed the limit within vCenter.
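vCOPS already shows CPU Ready as a percentage, but the vCenter performance charts report it as a summation in milliseconds.  The sketch below shows the commonly documented conversion, assuming the standard chart intervals (20 seconds for real-time, and so on).  The function name and the 6,300 ms sample are made up for illustration and are not taken from this environment.

```python
# Sample intervals (in seconds) for the standard vCenter performance chart ranges.
CHART_INTERVAL_SECONDS = {
    "realtime": 20,
    "past_day": 300,
    "past_week": 1800,
    "past_month": 7200,
    "past_year": 86400,
}


def cpu_ready_percent(ready_ms, interval_seconds):
    """Convert a CPU Ready summation (milliseconds) into a percentage.

    ready_ms         -- summation value read from the chart for one sample
    interval_seconds -- length of the sample interval for that chart range
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Time spent ready divided by the total time in the sample, as a percent.
    return ready_ms * 100.0 / (interval_seconds * 1000.0)


if __name__ == "__main__":
    # Hypothetical real-time sample: 6,300 ms of ready time in a 20 s interval.
    sample_ms = 6300
    pct = cpu_ready_percent(sample_ms, CHART_INTERVAL_SECONDS["realtime"])
    print(f"CPU Ready: {pct:.1f}%")
```

Keep in mind that the VM-level summation covers all of the VM’s vCPUs, so a multi-vCPU VM can look worse than any single vCPU really is.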

high-CPU-Ready1

You cannot tell since I blurred out the VM names, but the one that was being limited is now off the list.  The VM now at the top is actually the 2nd VM from the first picture.

high-cpu-ready3

The above is right after I removed the limit.  Notice the spike to just above 3,750MHz for this 2 vCPU VM.  With 2 vCPUs they could only hit 2,000MHz (1GHz x 2 vCPUs), but the demand at this time was higher.  They could have added more vCPUs, but if that were the stance we could run into other issues with too many vCPUs per host.  Now, I’m not saying a high CPU Ready% is always going to be this case; it could be that we need to right-size VMs across the board, or that we are maxing out the hosts so there’s a CPU wait going on.  So use what you can and monitor as often as you can if you are providing services to customers.  This is one case where I was able to find a problem before the customer reported it.  That’s a win in my book!
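The arithmetic above can be written as a quick check.  This is only a sketch: it treats the vCD vCPU speed as a simple per-vCPU cap multiplied by the vCPU count, the same way the numbers in this post work out, and the function names are hypothetical.

```python
def vm_cpu_ceiling_mhz(vcpus, per_vcpu_limit_mhz):
    """Most CPU a VM can use when each vCPU is capped at per_vcpu_limit_mhz."""
    return vcpus * per_vcpu_limit_mhz


def is_limit_bound(demand_mhz, vcpus, per_vcpu_limit_mhz):
    """True when the VM wants more CPU than its cap allows."""
    return demand_mhz > vm_cpu_ceiling_mhz(vcpus, per_vcpu_limit_mhz)


if __name__ == "__main__":
    demand = 3750  # MHz, roughly the spike seen once the limit was removed
    for limit in (1000, 4000):  # 1GHz default vs. the 4GHz setting
        ceiling = vm_cpu_ceiling_mhz(2, limit)
        bound = is_limit_bound(demand, 2, limit)
        print(f"limit {limit}MHz/vCPU -> ceiling {ceiling}MHz, limit-bound: {bound}")
```

A VM that shows up as limit-bound here along with a high CPU Ready% is a good candidate for a raised limit before anyone reaches for more vCPUs.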

 

 

 


I had the great benefit of meeting GS last year at VMworld 2013.  He’s a great guy with a passion for getting the virtualization community out from behind our desks and monitors.  Being able to listen to what others are doing is an awesome way to learn and make contacts with those doing the same thing as you.

Episode 19 – Joey “VM” Ware – VCAC’d on what I’ve been doing here at OU|S2 for the last few years.

Check out his Virtualization User Podcast as a Service and subscribe in iTunes to get the new episodes.

OUIT Shared Services – 2013

This year has been packed with things we’ve been doing in S2 at the University of Oklahoma.  We went to production with both data centers in OKC and Norman as a stretched cluster.  We brought up vCloud Director to give our users self-provisioning and better visibility into their environment.  This meant we had to get storage, both block and file, working across both sites, and then we even upgraded the controllers for the Dell Compellent SAN.  We were able to do this with little to no downtime for our customers, which proved the environment does what we designed it to do.

Now, was it all easy and full of great-smelling flowers?  No way.  I would love to say we can engineer an active/active data center with no issues, but what we do have is enough great engineers to quickly resolve or remedy a problem.  I’d say overall phase 1 of S2 is going pretty well.  If it wasn’t, I doubt I would have been able to present on it in a vBrownBag podcast or a session at VMworld 2013 in San Francisco.  Now that was exciting; not everyone can say they spoke at a conference attended by 22,000+ people from around the world.

Phase 1 included:

  • Two Active/Active Datacenters (OKC and Norman with Tulsa being a separate data center for now)
  • 4 Dell Compellent SANs (2 each DC) with 800 TB RAW all together – largest environment in Oklahoma currently
  • 2 EMC Isilon NAS (1 each DC) with around 250 TB that can replicate to each other
  • 4 Cisco 7K’s (2 each DC) and multiple 5K’s and 2K’s that are linked together using either our dark fiber between us or OneNET’s path.
  • Juniper SRX firewalls
  • Palo Alto for IPS/IDS
  • 16 vSphere ESXi 5.1 general cluster nodes (8 each DC/Dell PowerEdge blades)
  • vCloud Director 5.5/vCOPS 5.7
  • View 5.2 (OKC only for now)

Phase 2 will be finishing the upgrade of our VMware environment to vCloud Suite 5.5 and seeing what vCAC 6 could bring to the table.  More training is already scheduled for our Operations and Design teams on vCloud Director, vCOPS, View and CommVault.  Simpana 10 backup is installed and configured, but we are rolling it out slowly so we can finish transitioning it into Shared Services as an offering to our customers.

It will be a busy year for us in 2014, but it will be a ride.  Time will tell if it’s a great one, a good one or a heck of one!

Many thanks go out to:

  • My coworkers, specifically David Stricklin (@strickfila) for backing me up when needed/keeping me in check and our VP David Horton (@hortonhearsyou) for doing all that he does for OU/OUHSC.
  • Sean O’Dell (@theseanodell) from VMware for always assisting me with the OKC VMUG, keeping us up-to-date with VMW products and co-presenting at VMworld.
  • The vBrownBag community (@vbrownbag/#vbrownbag) like Jon Harris (@thevcacguy), Damian Karlson (@sixfootdad), Cody Bunch (@cody_bunch) and many others.  Check them out at http://www.professionalvmware.com
  • The VMUG community (@myvmug) as a whole; there’s not another one like it.
  • Also all past and current vExperts; I’m thankful to say I am 1 of the 581 named in 2013.

If you do have Twitter, I highly recommend following these people as well as our Shared Services (@ouits2) and OKC VMUG (@OKCVMUG) pages to see what is going on for 2014.

There are several podcasts that I recommend besides just vBrownBag: