Tuesday 27 February 2018

vRealize Operations Musings

This post is a quick delve into the world of vRealize Operations. Let me state up front that I'm not a fan, but I thought I'd trace why and have a fresh look at the product to find some good points that might balance my perspective a bit!

I was responsible for the delivery of Microsoft System Center Operations Manager back in the day, for about a year on and off. I thought then, and still do, that it's over-engineered. You can tell which products were designed well and are intuitive to use, and which ones where nothing is how you'd expect it.

I read a book a few months ago about design theory, related to doors and things like that. A glass door with no obvious handle will confuse people as to which way it opens and on which side; cue a broken nose or broken glass everywhere. My opinion is that SCOM is like that. VMware vRealize Operations echoes that feeling for me. It's nowhere near as bad, but it's still not in a good place compared to other software products I work with, which is a shame.

To give you an example, cue how to add a vCenter into vRealize Operations:

When you first log into vRealize and the dust is settling, you see this:
Ok, I thought, where do I add my vCenter? I click on the plus and get this:

What's a PAK file?!! Do I need one? That's what I mean. You have to highlight the VMware vSphere solution on the previous screen, then click on the configure cogs button to get this:
Intuitive? Hardly! 

I've already added mine here - seems ok? Wait until you get to the credentials section, type in administrator@vsphere.local and have a laugh to yourself! You have to add a credential - see the plus beside the credential area to get this:
Ok, so it's not a big thing, but these two items were enough to trip me up for 5 minutes, figuring out where and why. I know there's documentation, but if you've worked in IT for a few years, is it too much to expect a more intuitive start to the product? This is key, as this is the impression that will stay with you. I know - it's stayed with me...!

It's that kind of structure that determines how much you're going to like playing with a product long after you've got it working the way you want. Take reports as another example. You can output basic inventory reports; I did so and got blank, zip, nada detail in them:
So I chose to get a Hardware Summary. You need to highlight the report, THEN click the play button on the taskbar (no right-click here) - again, why is it designed this way, the second-from-last icon?! WHY?! Then you get to choose from the following:
So, the defaults look good, let's go with vSphere World, right? This is the result:
There are two problems here: the lack of data and the formatting. Let's fix the data first - you need to run the report but choose THIS fecker from the non-default drop-down list:
So instead of the default "custom groups" at the top, I chose "all objects" and then pointed the dozy product at my only connected resource, my vCenter Datacenter. Not exactly hard, but completely irrational for a VMware product. This is the result:
I love the thought that went into the header, footer, first-page VMware logos and index, but there's one small problem: the 40 pages of content look like the above. Now, you do have a CSV option, but we all know that requires a little massaging. If I wanted to schedule a report to email itself weekly, this one isn't going to do it for me. I'd have to edit the template and only add sufficient fields to fill the width. There's no obvious formatting option to correct this, so it's pretty much useless - and one of the most basic reports I could think of, too. Get RVTools here by the way, much better:
Now, about scheduling useful reports... you can email them and save them to a network share, so that's something. I set up saving them to a share; you can choose weekly or monthly, but that's it - no choice over the format either?! You can choose the start hour but not the minute - try testing your settings once per hour that way, see where I'm going?! There's no way to manually test the schedule either. Still think an engineer designed this?! No idea if it works or what the format is - I gave up here...!

I did spot one nice thing when I was configuring the vCenter connection, namely checking compliance against the vSphere Hardening Guide. Brilliant!
Define Monitoring Goals is one of the optional sections when setting up the vCenter connection - it defaults to No, even after you save it (!), but I am getting alerts for my Hosts and VMs against the hardening guide settings, which is very useful. I thought I'd need Configuration Manager for that, so at least it's a bonus here.
So, that's one thing I'd definitely find useful beyond Host Profiles and the battle I have with them regularly! There's an associated report - great, let's run it!
The format is better at least, but look at the number of pages - 236!!! Lol. So not that useful then. Like Host Profiles, I'd be better off using scripts etc. to do the hardening (unless you've bought vRealize Configuration Manager) and using this report to audit compliance...

There are good growth-forecast reports, which are useful. I'm not going to be running vRealize in my lab long enough to see the benefit, but capacity planning is a strength and good to have.

One other thing: when you run a report, once it finishes the screen refreshes, so you have to scroll down AGAIN to find the damn thing unless you use a filter first. Annoying, and simple for someone to fix, but I doubt it ever will be...

Here's a report on reclaimable space:
No percentages listed. I have several templates, but it's not clear whether this means it didn't detect them or whether, as they're thin provisioned, there's nothing to reclaim from them.

I love this report:


Very graphic, clear and precise - and tells us absolutely nothing...!

Now, I think you can blame the user on this one; I'm sure with some additional knowledge, upskilling, training and time the product would deliver everything VMware raves about. But a door is a door: put a handle on it and don't waste my time searching for things that should be obvious. Defaults should work for the most common use cases, like vCenter objects in reports. Maybe they intended to use this with a much wider palette than vCenter, but if that's your core base, play to them first. This doesn't feel like a product designed by VMware to work with vCenter - doesn't that sound weird to you? It's like they wanted to connect this to all your physical servers too (there are agents for that) and suck everything in here.

I finally decided to find the best view to put on a TV in an Ops room and this was it:
Not too bad. Given time it should provide more detail than my lab shows here, and you can customise these (Advanced or Enterprise only) to your needs a bit more. There's a good blog post here on reports via the newer HTML 5 interface in 6.6:

I've not yet found a good resource on creating custom dashboards in 6.6, as all the articles cover the previous version and are no longer valid.

So, it all comes down to whether you're already bought into vROps via a licensing deal or have the flexibility to look at other solutions. I'm very fond of Veeam ONE, but whatever you choose / end up using, you need to ensure it delivers sufficient quality information and no more, otherwise a spammed inbox isn't going to get any attention.

Use the VMware Hands-on Labs to look at this product, or download a trial of the OVF like I did. I hope you find this useful and get the right solution for your organisation. Best of luck!

Tuesday 20 February 2018

Bring out the VVOLS...!!

I had a chance to play around with VVols today and to see how they've improved since my last encounter. Now, I don't have a physical SAN to play with, so how do you get VVols in a lab?! I was hoping StoreVirtual would have advanced by now to support VVols, but that hasn't happened, so I've been scouting around for a replacement that would offer VVols without requiring dedicated hardware, i.e. a virtual appliance. There are a few around but they can be hard to get. My solution was to use Nimble. Now, that appliance isn't any easier to get hold of, so good luck there! I'd recommend trying NexentaStor with a trial license if you're stuck.

Between shutting down my lab one day and starting it up the next, I ran into an issue where the previous VVol I'd created wasn't accessible. While I got a REALLY good look at the CLI and deep-dived to troubleshoot, I found two things:

First, there is no ESXi CLI command to remove a troubled VVol storage container:

esxcli storage vvol storagecontainer
Usage: esxcli storage vvol storagecontainer {cmd} [cmd options]

Available Namespaces:
  abandonedvvol         Operations on Abandoned Virtual Volumes.

Available Commands:
  list                  List the VVol StorageContainers currently known to the ESX host.

Yep, just list, darn it! I was able to list the fe*ked container but could do nothing with it! 

Second, sometimes it's better to start all over again. I deleted what I could in VMware, removed the iSCSI target to the Nimble appliance and unregistered the Nimble management interface with vCenter, which also pulls out the VASA provider with it. Then I rebooted the ESXi host as it was still showing the Protocol Endpoint. If in doubt, start over! Then I got somewhere!
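
A few esxcli commands are handy for sanity-checking the plumbing after a reset like this (a quick sketch on my part rather than the original steps; these all exist on ESXi 6.x, though the output varies by build):

esxcli storage core adapter rescan --all       # rescan all storage adapters
esxcli storage vvol vasaprovider list          # is the VASA provider registered and online?
esxcli storage vvol protocolendpoint list      # is the PE visible to the host?
esxcli storage vvol storagecontainer list      # is the storage container mounted?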

The Nimble appliance has an interface that sets up the VASA Provider and Web Plugin here:

The thick client isn't going to help you with vCenter 6.5, but it's there anyway for older versions. Just to note, I'm on Nimble OS 4.5 and they've since released 5.0.

So in vCenter you can check your VASA Provider by bringing up the vCenter configure tab as shown here:
 This is with the Lab Group selected:
This area is very important, as you may need to kick - sorry, refresh - the VASA provider when you're pulling your hair out. You can choose a rescan or a sync option. The Nimble logs are in /var/logs/nimble if you're bored and want to tail something.

Here is the iSCSI view of the PE (Protocol Endpoint) after I'd created a VVOL folder in Nimble:

You can see the 512-byte volume on LUN 0, which is the PE. The other targets are my offline StoreVirtual - ignore those!

Now you can create a VVol datastore and hope everything works as expected. If not, rescan storage, rescan adapters, etc. I created a small VVol datastore and Storage vMotioned a CentOS VM onto it:

I took a snapshot and then viewed the VM through the Nimble interface. The snap took 10 minutes to complete, but that's most likely down to my lab. The Veeam backup snapshot only took a few seconds, though! Here are the interesting pictures:

This is the contents of the VVOLS1/CentOS folder:

Here is the view via Nimble:
What's surprising to me is that there are only 5 files... even after a snapshot is taken. The memory snapshot is listed but not the disk snapshot. And what about all the other files?!

Here is the Veeam backup job:

Here are the available storage policies that Nimble makes available:

So you can choose to expose encryption and deduplication options to vCenter Admins. 

By the way, Google "vSphere Virtual Volumes Technical Overview" - there's a paper dated January 2018 worth reading. Or try this link if it works:

Next have a look at what happens after I take a second snapshot that includes memory (and it still took 10 minutes):


There are the two snaps, 1.6GB each:
A third snap, taken without the VM's memory, makes NO change to the Nimble view.

So, according to the PDF there are meant to be 5 VVol objects:

(Taken from the VMware PDF referenced earlier)
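(For reference, in case the image doesn't come across - from memory, the five object types are the Config-VVol, Data-VVol, Swap-VVol, Memory-VVol and Snapshot-VVol.)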

So what CAN we see? The Config-VVol is the first one listed in Nimble, then you have the Memory-VVol; the two Snapshot-VVols are only present when you choose to snap the memory, and the Data-VVol is the vmdk as listed. The last VVol is the cluster HA folder, so it's nice to know it can use VVols too!

Now, about those snapshots - why don't we see them in Nimble? Well, we do, kinda. Watch the last column on the right:

I can also create a local snapshot directly on the Nimble, and you can view all the snapshots here. Note it doesn't expose the Nimble snapshot up to vCenter:
Now, let's delete it and restore using Veeam! 
And we're back up and running:
Snapshot free of course! Thanks Veeam!

Now to create a clone - I can power on the CentOS2 clone and view the space usage. Nimble appears to take care of the dedup and uses a small snap:
The 7.4MB snap is shown below:
CentOS2 otherwise uses NO space! Nice!

So, I've had a little play and done the basics. Replication / DR would be the next level, but I'm not going there yet. It's more seamless than I thought, and maybe storage policies are the way to go? With everything moving to flash, and with dedup and encryption as standard, is there much left to choose apart from the protection level / replication?

Hope this gave you an insight into VVols and you found it useful. 

Thursday 15 February 2018

TPDSrm.exe

This post is about an issue I faced recently when integrating VMware Site Recovery Manager at a customer site. As part of the integration you need to get both SRM servers to trust the 3PAR certificate. There is a command to do this, but for some reason it was not working for me. Someone else out there may hit this issue, so hopefully this will save you an hour or two of troubleshooting!

The command syntax is as follows:


TPDSrm.exe validatecert -sys <ip address of 3PAR> -user username -pass password

So the symptoms were as follows:

If you just type in TPDSrm.exe and press Enter you get the help page, which shows you the various options. I then crafted the full command and entered it, but it came back with the help page again! I checked for typos - there were none. I tried the other SRM server and got the same result. I then upgraded the SRA software from 6.5 to 6.5.3 and it was no better!

I was really scratching my head at this stage. The only way I could get a different reaction was to put <> around some of the values, just to prove it was reading the command - this gave an error, of course. Other commands like showcache or listcerts worked fine! There was nothing to show, but at least it ran the command!

I then tried substituting crap into the value fields as follows:

TPDSrm.exe validatecert -sys george -user bob -pass bob

This got me nowhere; it was as if it didn't matter what commands I tried, there was something else going on here. I was about ready to call it a day and log a support call when I tried typing out the commands one stage at a time like this:

TPDSrm.exe 

TPDSrm.exe validatecert 

TPDSrm.exe validatecert -sys george

TPDSrm.exe validatecert -sys george -user bob

TPDSrm.exe validatecert -sys george -user bob -pass bob

When I typed in the final command it actually did something and came back to say it couldn't find the system. I then went back and typed in my original command and got the help screen again! Then I typed out the command in 5 stages as listed above, but with valid data (albeit with a fake password, as I wanted to rule out a "!" causing a problem). It tried to connect and failed. I then used the correct password, after updating it to remove a "!" just in case, and it connected fine. The cert was trusted and I could repeat this process on the other server.

Conclusion: either a "!" in the password or a copy/paste issue is the only explanation I could reach. I don't think it was the password; perhaps the Notepad file I used to stage the command, copied over RDP, corrupted the command in some way? I did try typing the whole command out and this failed too, but after so many attempts maybe I didn't actually try that.
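
One way to test the copy/paste theory after the fact (my own check, not part of the original troubleshooting): some editors silently turn a plain hyphen into an en dash, which a CLI won't recognise as an option prefix, and the two look nearly identical on screen. A quick PowerShell test:

# Copy the staged command to the clipboard first; True means it contains
# an en dash (U+2013) where a plain hyphen-minus should be
$cmd = Get-Clipboard
$cmd -match [char]0x2013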

Anyway, strange one, might get someone out of a hole in the future. 

Tuesday 13 February 2018

Terraform - Azure

This post follows on from the previous ones and demonstrates using Terraform to create an Azure IaaS VM, just for kicks!

Install Azure CLI 2.0, and from a command prompt or PowerShell type "az" and press Enter. You should now see Azure commands available for your enjoyment!
https://docs.microsoft.com/en-us/cli/azure/install-azure-cli-windows?view=azure-cli-latest

Now, to log in you type "az login" and follow the instructions to authenticate: you copy the URL and enter the given code to authenticate the CLI.

Next, set your subscription ID as follows:
az account set --subscription="<SUBSCRIPTION_ID>"

If you have access to multiple subscriptions this is where you spend a bit of time checking you're targeting the right one! Or don't do this on a Friday....!

Next query the following IDs:
az account show --query "{subscriptionId:id, tenantId:tenantId}"
Copy these to Notepad for later.

Next create Terraform credentials for it to use:
az ad sp create-for-rbac --role="Contributor" --scopes="/subscriptions/<SUBSCRIPTION_ID from above>"

This gives you the appId and password you'll need... copy these out to Notepad as well to make the next step easier. Let's say this is the output from the previous step:

  "appId": "7654321",
  "displayName": "azure-cli-2018-02-09-09-23-18",
  "name": "http://azure-cli-2018-02-09-09-23-18",
  "password": "abcdefg",
  "tenant": "1234567"

This is the command you need to edit:
az login --service-principal -u SP_NAME -p PASSWORD --tenant TENANT

So, open a new powershell or command prompt and in this case we would enter:
az login --service-principal -u "http://azure-cli-2018-02-09-09-23-18" -p "abcdefg" --tenant "1234567"
I've replaced the GUIDs here to show which field maps into the command. This should authenticate you within a second or two, and then you can test with a simple query as follows:

az account list-locations
or
az vm list-sizes --location northeurope

Now set up a blank folder with terraform.exe in it, and copy the following file as terraform_azure.tf into it. Replace the Azure provider section AND the ssh key with your own values (highlighted).

terraform_azure.tf

variable "resourcename" {
  default = "myResourceGroup"
}

# Configure the Microsoft Azure Provider
provider "azurerm" {
    subscription_id = "XXXXXXXX"
    client_id       = "XXXXXXXX"
    client_secret   = "XXXXXXXX"
    tenant_id       = "XXXXXXXX"
}

# Create a resource group if it doesn’t exist
resource "azurerm_resource_group" "myterraformgroup" {
    name     = "myResourceGroup"
    location = "northeurope"

    tags {
        environment = "Terraform Demo"
    }
}

# Create virtual network
resource "azurerm_virtual_network" "myterraformnetwork" {
    name                = "myVnet"
    address_space       = ["10.0.0.0/16"]
    location            = "northeurope"
    resource_group_name = "${azurerm_resource_group.myterraformgroup.name}"

    tags {
        environment = "Terraform Demo"
    }
}

# Create subnet
resource "azurerm_subnet" "myterraformsubnet" {
    name                 = "mySubnet"
    resource_group_name  = "${azurerm_resource_group.myterraformgroup.name}"
    virtual_network_name = "${azurerm_virtual_network.myterraformnetwork.name}"
    address_prefix       = "10.0.1.0/24"
}

# Create public IPs
resource "azurerm_public_ip" "myterraformpublicip" {
    name                         = "myPublicIP"
    location                     = "northeurope"
    resource_group_name          = "${azurerm_resource_group.myterraformgroup.name}"
    public_ip_address_allocation = "dynamic"

    tags {
        environment = "Terraform Demo"
    }
}

# Create Network Security Group and rule
resource "azurerm_network_security_group" "myterraformnsg" {
    name                = "myNetworkSecurityGroup"
    location            = "northeurope"
    resource_group_name = "${azurerm_resource_group.myterraformgroup.name}"

    security_rule {
        name                       = "SSH"
        priority                   = 1001
        direction                  = "Inbound"
        access                     = "Allow"
        protocol                   = "Tcp"
        source_port_range          = "*"
        destination_port_range     = "22"
        source_address_prefix      = "*"
        destination_address_prefix = "*"
    }

    tags {
        environment = "Terraform Demo"
    }
}

# Create network interface
resource "azurerm_network_interface" "myterraformnic" {
    name                      = "myNIC"
    location                  = "northeurope"
    resource_group_name       = "${azurerm_resource_group.myterraformgroup.name}"
    network_security_group_id = "${azurerm_network_security_group.myterraformnsg.id}"

    ip_configuration {
        name                          = "myNicConfiguration"
        subnet_id                     = "${azurerm_subnet.myterraformsubnet.id}"
        private_ip_address_allocation = "dynamic"
        public_ip_address_id          = "${azurerm_public_ip.myterraformpublicip.id}"
    }

    tags {
        environment = "Terraform Demo"
    }
}

# Generate random text for a unique storage account name
resource "random_id" "randomId" {
    keepers = {
        # Generate a new ID only when a new resource group is defined
        resource_group = "${azurerm_resource_group.myterraformgroup.name}"
    }

    byte_length = 8
}

# Create storage account for boot diagnostics
resource "azurerm_storage_account" "mystorageaccount" {
    name                        = "diag${random_id.randomId.hex}"
    resource_group_name         = "${azurerm_resource_group.myterraformgroup.name}"
    location                    = "northeurope"
    account_tier                = "Standard"
    account_replication_type    = "LRS"

    tags {
        environment = "Terraform Demo"
    }
}

# Create virtual machine
resource "azurerm_virtual_machine" "myterraformvm" {
    name                  = "myVM"
    location              = "northeurope"
    resource_group_name   = "${azurerm_resource_group.myterraformgroup.name}"
    network_interface_ids = ["${azurerm_network_interface.myterraformnic.id}"]
    vm_size               = "Standard_DS1_v2"

    storage_os_disk {
        name              = "myOsDisk"
        caching           = "ReadWrite"
        create_option     = "FromImage"
        managed_disk_type = "Premium_LRS"
    }

    storage_image_reference {
        publisher = "Canonical"
        offer     = "UbuntuServer"
        sku       = "16.04.0-LTS"
        version   = "latest"
    }

    os_profile {
        computer_name  = "myvm"
        admin_username = "azureuser"
    }

    os_profile_linux_config {
        disable_password_authentication = true
        ssh_keys {
            path     = "/home/azureuser/.ssh/authorized_keys"
            key_data = "ssh-rsa AAAAB3Nz{snip}hwhqT9h"
        }
    }

    boot_diagnostics {
        enabled = "true"
        storage_uri = "${azurerm_storage_account.mystorageaccount.primary_blob_endpoint}"
    }

    tags {
        environment = "Terraform Demo"
    }
}


The values you need to replace are a little confusing at first; here is a mapping:

ARM_SUBSCRIPTION_ID=your_subscription_id
ARM_CLIENT_ID=your_appId
ARM_CLIENT_SECRET=your_password
ARM_TENANT_ID=your_tenant_id

I copied the previous CLI output to Notepad and stitched together what I was after. The CLIENT_ID in particular confused me at first.

  "subscriptionId": "1020304050",
  "tenantId": "1234567"

  "appId": "7654321",
  "displayName": "azure-cli-2018-02-09-09-23-18",
  "name": "http://azure-cli-2018-02-09-09-23-18",
  "password": "abcdefg",
  "tenant": "1234567"

So from the above this is what we should have after you put in your values:

    subscription_id = "1020304050"
    client_id       = "7654321"
    client_secret   = "abcdefg"
    tenant_id       = "1234567"

You have everything you need - just take your time piecing it together. Then you can reuse this again and again later. If you have access to multiple subscriptions, be careful here!!
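
As an aside (my own sketch, not part of the walkthrough above): those ARM_* names aren't arbitrary - the azurerm provider can also read its credentials from environment variables of exactly those names, which saves hardcoding secrets into the .tf file. From a command prompt, using the fake values above:

set ARM_SUBSCRIPTION_ID=1020304050
set ARM_CLIENT_ID=7654321
set ARM_CLIENT_SECRET=abcdefg
set ARM_TENANT_ID=1234567

With those set, you can drop the four credential lines from the provider block entirely.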

One last thing - the "ssh-rsa" line needs to be updated with a valid key or you'll get an error. To do this, download puttygen.exe and click Generate with the default RSA parameter option. You can copy out the entire key field and paste it between the quotes, as follows. So from this:

key_data = "ssh-rsa AAAAB3Nz{snip}hwhqT9h"

to

key_data = "ssh-rsa AAAAB3NzaC1yc2EAAAAB<snip>y/uvk+dBZ2REP4Uatw== rsa-key-20180209"

It's a LOT longer than that, trust me! Now enter a passphrase and save the private key, to test the new VM shortly.
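
If you'd rather stay on the command line, ssh-keygen does the same job as puttygen (a hedged alternative, assuming you have OpenSSH installed; the file name here is my own):

ssh-keygen -t rsa -b 2048 -f azure_key

The contents of azure_key.pub is the "ssh-rsa AAAA..." string to paste into key_data, and azure_key is the matching private key for your ssh client.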

Run "terraform init"

Next run "terraform plan"
 Truncated here out of boredom.....
Ok, now we're ready to apply this and see what happens. This script is taken from here:
https://docs.microsoft.com/en-us/azure/virtual-machines/linux/terraform-create-complete-vm
https://docs.microsoft.com/en-us/azure/virtual-machines/linux/terraform-create-complete-vm#complete-terraform-script

We run "terraform apply"

Truncated here for boredom.....

Once finished use this to get the public IP:
az vm show --resource-group myResourceGroup --name myVM -d --query [publicIps] -o tsv

Then test by connecting with putty/ssh and that private key you used earlier!
All built and connected. Use "azureuser" as the username; you just need to enter that passphrase and you should connect as shown.

Here is the Azure Portal View:
These are all the resources just created with this script:


Finally we can destroy everything:
Truncated out of boredom...
So, where are those resources? All Gone:
There you have it!

So in this example we used the Microsoft documentation to generate an Azure VM and all its associated objects and resources via Terraform, and connected to it.

The official Terraform Azure provider documentation is here, though I didn't need to use it in the end:
https://www.terraform.io/docs/providers/azure/index.html

Like before, you can split the file we used here into multiple files to capture variables and data separately. If you want more fun, try this with Amazon, but that's where I'm going to hold it for now. I've some VVols to play with next...!!


Friday 9 February 2018

Terraform - Level 2

So here I'll continue on from my previous post and show how you "arrange" Terraform files in a better way. Some elements of the previous build.tf file will rarely, if ever, change. The connection information's password might get updated, but the rest is fairly static. You may wish to change which datacenter, cluster, datastore and network are used for each set of VMs, but the definitions themselves can remain fairly constant. You might decide tags are important and add those to your definitions, so that everyone now has to include tag data in their build.tf file.

By the way, I first came across Terraform recently via a vBrownBag session by Colin Westwater:
http://www.vgemba.net
https://www.youtube.com/watch?v=nQ7oRSi6mBU
Great work by him on this!

I'd also recommend the following book as a general guide to this whole area:
Infrastructure as Code: Managing Servers in the Cloud - Kief Morris
https://www.amazon.co.uk/Infrastructure-Code-Managing-Servers-Cloud-ebook/dp/B01GUG9ZNU

You can check the Terraform version as follows:
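
For reference, the command is simply:

terraform version

("terraform -version" works too.)
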
They have added a lot of improvements recently, but note that your code may give errors from time to time after providers are updated. Then it's back to the documentation to see what's happened!

There are three broad files you should use to start with:

build.tf - tells terraform what to build (what we used in the last post)
variables.tf - define the variables, static, rarely changes
terraform.tfvars - defines the values for the variables: usernames, passwords, the vCenter address, etc.

So I'd expect there to be a range of build.tf files (each with different names of course) that were written to perform a set of tasks. Developers can handle that. A Senior Developer might be assigned to maintain the variables and values for them so these are controlled and kept sane!

Note: if using GitHub you should exclude terraform.tfvars from being committed, or you'll leak your environment credentials! (A sample .gitignore follows below.)
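
A minimal .gitignore along these lines does the job (my own suggestion - the state files are worth excluding too, as they can also contain sensitive values):

# .gitignore
terraform.tfvars
*.tfstate
*.tfstate.backup
.terraform/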

The full range of create / clone VM options is covered in the documentation here:
https://www.terraform.io/docs/providers/vsphere/r/virtual_machine.html

So, let's create a fresh folder, put terraform.exe in it and start over. The three files we'll need to create are listed below with their contents to start you off; then we'll run the same commands as before - "terraform init / plan / apply / destroy" - and see what happens.

terraform.tfvars

# Top level variables that define the connection to the environment
vsphere_vcenter = "192.168.10.17"
vsphere_user = "administrator@vsphere.local"
vsphere_password = "Get Your Own Password"
vsphere_datacenter = "Labdc"

variables.tf

# Variables
variable "vsphere_vcenter" {}
variable "vsphere_user" {}
variable "vsphere_password" {}
variable "vsphere_datacenter" {}

build.tf

# Configure the VMware vSphere Provider
provider "vsphere" {
    vsphere_server = "${var.vsphere_vcenter}"
    user = "${var.vsphere_user}"
    password = "${var.vsphere_password}"
    allow_unverified_ssl = true
}

data "vsphere_datacenter" "dc" {
  name = "${var.vsphere_datacenter}"
}

data "vsphere_datastore" "datastore" {
  name          = "Datastore0"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

data "vsphere_resource_pool" "pool" {
  name          = "Labcl/Resources"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

data "vsphere_network" "network" {
  name          = "VM Network"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

data "vsphere_virtual_machine" "template" {
  name          = "CentOS"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

resource "vsphere_virtual_machine" "vm" {
  name             = "terraform-test"
  resource_pool_id = "${data.vsphere_resource_pool.pool.id}"
  datastore_id     = "${data.vsphere_datastore.datastore.id}"

  num_cpus = 2
  memory   = 1024
  guest_id = "${data.vsphere_virtual_machine.template.guest_id}"

  scsi_type = "${data.vsphere_virtual_machine.template.scsi_type}"

  network_interface {
    network_id   = "${data.vsphere_network.network.id}"
    adapter_type = "${data.vsphere_virtual_machine.template.network_interface_types[0]}"
  }

  disk {
    label            = "disk0"
    size             = "${data.vsphere_virtual_machine.template.disks.0.size}"
    eagerly_scrub    = "${data.vsphere_virtual_machine.template.disks.0.eagerly_scrub}"
    thin_provisioned = "${data.vsphere_virtual_machine.template.disks.0.thin_provisioned}"
  }

  clone {
    template_uuid = "${data.vsphere_virtual_machine.template.id}"

    customize {
      linux_options {
        host_name = "terraform-test"
        domain    = "lab.local"
      }

      network_interface { }

    }
  }
}


Now, I don't see people firing up Terraform for one-off builds. I see this tool being used as part of an automated strategy where servers are built, updated, destroyed and rebuilt automatically. Someone updates the template once per week, perhaps, and someone else may adjust the virtual hardware settings in the build.tf file, and the next time the automated script runs the environment takes on the new values. This doesn't address auto-scaling, which is another level entirely. Your inventory and monitoring solutions should handle these changes with ease.
Of course, not all applications will accept this approach, and it has to be seamless. But this is a journey, so read the book above and see how this approach could benefit your particular environment and help stabilise a more agile approach to IT. As a taster, see the sketch below for how the virtual hardware could move out of build.tf.
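
Here's a minimal sketch of that idea (my own, using the file layout above; the variable names are hypothetical) - lift the CPU and memory settings into variables so the numbers live in terraform.tfvars rather than in build.tf:

# variables.tf - hypothetical additions
variable "vm_num_cpus" { default = 2 }
variable "vm_memory" { default = 1024 }

# build.tf - reference them instead of hardcoding
#   num_cpus = "${var.vm_num_cpus}"
#   memory   = "${var.vm_memory}"

# terraform.tfvars - where the numbers now get changed
vm_num_cpus = 2
vm_memory = 2048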

In a later post I'll show you Azure in action as that will help coalesce how this tool is more powerful than one which just speaks to a single environment. 

Tuesday 6 February 2018

Terraform - The Basics

This post is about using Terraform to build up and tear down VMware VMs. I've used PowerCLI for various operations in the past, but Terraform is broader in that you can use it with Amazon, Azure, etc. - more than just VMware. The full list is here:

https://www.terraform.io/docs/providers/index.html

Now, I doubt the options go as deep as PowerCLI, but having a common tool you can use across all these platforms lets a team develop a standardised approach and limits the use of different tools for different platforms for common tasks.

This article focuses on VMware as it gives you a chance to test this in a lab environment. You can use Terraform to work with more than VMs too - port groups, datastores, tags, snapshots etc!

So, let's get started and download the latest version of Terraform from the link below:

https://www.terraform.io/downloads.html

I've grabbed the 64-bit Windows edition and extracted the contents to a folder on my PC. All the zip contains is a single EXE file! How's that for light?! You can put this file in your PATH so you can invoke it from anywhere, or just open a command prompt, change into that directory and you're ready to go. I'd recommend Notepad++ or similar, as we'll be working with a few text files that feed Terraform the necessary instructions about what you want it to do.


Ok, you see that build.tf file? This is a very basic intro - normally you split things out differently, but to get started this will work fine. Create a new text file called "build.tf" in the same folder as Terraform, containing the following:

provider "vsphere" {
  user           = "${var.vsphere_user}"
  password       = "${var.vsphere_password}"
  vsphere_server = "${var.vsphere_server}"

  # if you have a self-signed cert
  allow_unverified_ssl = true
}

variable "vsphere_user" {
  default = "administrator@vsphere.local"
}

variable "vsphere_password" {
  default = "YOUR PASSWORD HERE"
}

variable "vsphere_server" {
  default = "labvc.lab.local"
}

data "vsphere_datacenter" "dc" {
  name = "Labdc"
}

data "vsphere_datastore" "datastore" {
  name          = "Datastore0"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

data "vsphere_resource_pool" "pool" {
  name          = "Labcl/Resources"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

data "vsphere_network" "network" {
  name          = "VM Network"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

resource "vsphere_virtual_machine" "vm" {
  name             = "terraform-test"
  resource_pool_id = "${data.vsphere_resource_pool.pool.id}"
  datastore_id     = "${data.vsphere_datastore.datastore.id}"

  num_cpus = 2
  memory   = 1024
  guest_id = "other3xLinux64Guest"
  wait_for_guest_net_timeout = 0
  network_interface {
    network_id = "${data.vsphere_network.network.id}"
  }

  disk {
    label = "disk0"
    size  = 20
  }
}

You'll need to edit the script to match your environment. There are 3 variables:

  • vsphere_user (use a UPN for this - e.g. user@domain.com)
  • vsphere_password
  • vsphere_server

and 4 data fields:

  • vsphere_datacenter
  • vsphere_datastore
  • vsphere_resource_pool
  • vsphere_network

Update these 7 fields, otherwise you'll get errors. Yes, you can use datastore clusters and get all fancy, but for now we're creating a single VM in a fixed environment!

Now you're ready to initialise Terraform and create your first VM. Run the command "terraform init" and it will download the vSphere provider for you - internet access is a requirement here:


Now run the command "terraform plan" and you will see the following:
 Truncated here to save space........
It shows it's going to add 1 VM. Now we're ready to apply the change:

Enter the command "terraform apply" and type "yes" at the prompt:
  Truncated here to save space........
  Truncated here to save space........

This is just to demonstrate that creating a VM via Terraform's API calls works - and it only took you, what, 10 minutes to set up?!

If we check VMware we'll see the following:
 The VM has been created and powered on.

The code used here is taken from the example usage in this URL:
https://www.terraform.io/docs/providers/vsphere/index.html

If you try to run it raw you'll get hit with missing variables - these are the user/password/vCenter variable fields I've added into my code above.
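
As an aside (my own example, not from the provider page above), you can also supply variable values at run time with -var instead of embedding defaults in the file:

terraform plan -var "vsphere_user=administrator@vsphere.local" -var "vsphere_password=YOUR PASSWORD HERE" -var "vsphere_server=labvc.lab.local"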

Now we're ready to tear back down the VM we've created. If you have a quick look at the Terraform folder you'll see it has a new file and folder in it:
You can open the terraform.tfstate file to see it's tracking what we've just built. We just need to execute the command "terraform destroy"; it shows this will destroy 1 VM. Enter "yes" to confirm:
This triggers the following tasks in vCenter:
If you reload terraform.tfstate you'll see the data has shrunk and the VM it previously recorded has been removed, reflecting the current state.

So what's the big deal with creating a single VM with a script? Well, it's repeatable: you can create hundreds of VMs this way, over and over again, building them up and tearing them down - see the sketch below. If your devs update a template daily, this allows them to update a platform overnight to run the new code. It does mean treating the VMs as essentially stateless, however, so config files and databases are held off those VMs.
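
As an illustration (a sketch of my own, not from the code above), the resource block from build.tf only needs a count and a unique name to stamp out several copies:

resource "vsphere_virtual_machine" "vm" {
  count            = 3
  name             = "terraform-test-${count.index}"
  resource_pool_id = "${data.vsphere_resource_pool.pool.id}"
  datastore_id     = "${data.vsphere_datastore.datastore.id}"

  num_cpus = 2
  memory   = 1024
  guest_id = "other3xLinux64Guest"
  wait_for_guest_net_timeout = 0

  network_interface {
    network_id = "${data.vsphere_network.network.id}"
  }

  disk {
    label = "disk0"
    size  = 20
  }
}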

Have a look at the following book for a good read about DevOps, presented as the story of a company trying to move to DevOps but meeting resistance along the way:
The Phoenix Project
https://www.amazon.co.uk/Phoenix-Project-DevOps-Helping-Business-ebook/dp/B00AZRBLHO
This is an older book based around a manufacturing plant faced with imminent closure in 3 months unless they improve their productivity:
The Goal
https://www.amazon.co.uk/Goal-Process-Ongoing-Improvement-ebook/dp/B002LHRM2O
Both are easily the best business reads I've enjoyed in recent years, well worth checking out!

We'll leave it there. In my next post I'll show you how to split up the build.tf file in a better way, so you can align more with a DevOps approach.