Wednesday, 14 October 2015

Installing HP Cloudsystem 9.0 - Part 6



So, you need to allow about an hour to start up your lab, following the Admin Guide instructions starting on page 55.

Management Appliances:

Start by powering on the Compute and Management Hosts and then the OVSVAPP VM on the Compute Host.

Next, power on the ma1 appliance (in my lab it's called cs-mgmt1). If you look at the console it will sit on a screen listing network adapters etc. for a few minutes; wait until this clears to a logon screen, then SSH on and prep the VM as follows:

sudo -i
service mysql bootstrap-pxc
service mysql status

Power on ma2 / cs-mgmt2
Wait for the logon prompt on the console and check mysql on ma2 from ma1:
ssh cloudadmin@ma2 sudo service mysql status

Power on ma3 / cs-mgmt3
Wait for the logon prompt on the console and check mysql on ma3 from ma1:
ssh cloudadmin@ma3 sudo service mysql status
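Rather than eyeballing the status output for each appliance, the checks can be wrapped in a small polling helper. This is just a sketch, run from ma1; the "running" keyword is an assumption about the exact wording of the `service mysql status` output.

```shell
#!/bin/sh
# Hypothetical helper (a sketch): poll 'service mysql status' on an appliance
# from ma1 until it reports running, so the next node isn't started too early.
# The "running" keyword is an assumption about the status message wording.
mysql_is_running() {
  echo "$1" | grep -qi "running"
}

wait_for_mysql() {
  node="$1"
  until mysql_is_running "$(ssh "cloudadmin@$node" sudo service mysql status 2>/dev/null)"; do
    echo "waiting for mysql on $node ..."
    sleep 10
  done
  echo "mysql on $node is running"
}

# Usage from ma1:
# wait_for_mysql ma2
# wait_for_mysql ma3
```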

Perform the following on each of the VMs in turn, starting with ma1, then ma2, and finally ma3:
os-refresh-config
This spins through hundreds of pages of checks and takes 2-3 minutes to complete.

Wait until each is completed before doing the next appliance:
os-refresh-config
ssh cloudadmin@ma2 sudo os-refresh-config
ssh cloudadmin@ma3 sudo os-refresh-config
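The three refreshes above can be sketched as a loop that builds each command in order (local on ma1, over SSH for ma2 and ma3); appliance names are the ones used in this lab.

```shell
#!/bin/sh
# Sketch of the ordering above: build the refresh command for each management
# appliance and run them strictly one at a time, ma1 first.
refresh_cmd() {
  if [ "$1" = "ma1" ]; then
    echo "os-refresh-config"
  else
    echo "ssh cloudadmin@$1 sudo os-refresh-config"
  fi
}

for node in ma1 ma2 ma3; do
  refresh_cmd "$node"   # prints each command; execute them in turn on ma1
done
```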

The Healthcheck involves logging into the Management Portal, I use the VIP link as follows:
http://192.168.10.80
Then go to General, Monitoring & Launch Monitoring Dashboard
(I get the odd error "Unable to list alarms - connection aborted", etc.)
This didn't show any useful information at this point.
The HAProxy status page shows everything up in green except monasca:
http://192.168.10.80:1993
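If you'd rather check HAProxy from the command line, its stats page can usually be exported as CSV via the standard `;csv` suffix. This is a sketch only: it assumes the stats URI is the root path on port 1993, as the links in this post suggest, and filters on the status column (field 18 in HAProxy's CSV layout).

```shell
#!/bin/sh
# Hypothetical check (a sketch): list any HAProxy proxy/server whose status
# column (CSV field 18) is not UP/OPEN. Comment lines are skipped.
haproxy_down() {
  echo "$1" | awk -F, '$1 !~ /^#/ && $18 != "" && $18 !~ /UP|OPEN/ { print $1 "/" $2 " " $18 }'
}

# csv=$(curl -s "http://192.168.10.80:1993/;csv")
# haproxy_down "$csv"
```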

The Cloud Controllers are next:

Power on cmc or in my case cs-cloud1
Wait a few minutes for it to get to a logon prompt
Go back to the ma1 / cs-mgmt1 appliance and execute the following commands:
ssh cloudadmin@cmc sudo service mysql bootstrap-pxc
ssh cloudadmin@cmc sudo service mysql status

Power on cc1 & cc2 and wait for a logon prompt
(Hit Enter; console messages may make it appear paused when it's actually ready)
From ma1, run the following commands against the cc1 & cc2 appliances:
ssh cloudadmin@cc1 sudo service mysql status
ssh cloudadmin@cc2 sudo service mysql status

Now perform an os-refresh-config on each appliance:
ssh cloudadmin@cmc sudo os-refresh-config
ssh cloudadmin@cc1 sudo os-refresh-config
ssh cloudadmin@cc2 sudo os-refresh-config
The os-refresh-config completes much faster than it did for the management appliances.

Log into the Openstack Console
https://192.168.12.200/admin/info/
User: admin, <password as specified during install>
Check the Admin\System\System Information on the left and inspect each of the sections: Services, Compute Services, Block Storage Services and Network Agents to ensure all are Enabled and operating
The Cloud Controller HA Proxy should also be checked:
http://192.168.10.81:1993

Now for the Enterprise Appliances:

Power on ea1 / cs-enterprise1
Wait for a logon prompt (and yes, it DOES reference a cattleprod!!!)

ssh cloudadmin@ea1
sudo -i
service mysql bootstrap-pxc
service mysql status
sudo -u csauser /usr/local/hp/csa/scripts/elasticsearch start
sudo -u csauser /usr/local/hp/csa/scripts/msvc start
service csa restart
service mpp restart
service HPOOCentral restart
exit
exit

Power on ea2 / cs-enterprise2 & wait for the cattleprod / logon prompt
Power on ea3 / cs-enterprise3 & wait for the cattleprod / logon prompt

ssh cloudadmin@ea2
sudo -i
service mysql status
sudo -u csauser /usr/local/hp/csa/scripts/elasticsearch start
sudo -u csauser /usr/local/hp/csa/scripts/msvc start
service csa restart
service mpp restart
service HPOOCentral restart
exit
exit

ssh cloudadmin@ea3
sudo -i
service mysql status
sudo -u csauser /usr/local/hp/csa/scripts/elasticsearch start
sudo -u csauser /usr/local/hp/csa/scripts/msvc start
service csa restart
service mpp restart
service HPOOCentral restart
exit
exit
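The per-appliance sequence above only differs in the mysql bootstrap on ea1, so it can be captured once. This is a sketch: the function prints the ordered commands to run as root on each appliance, ea1 first with the "bootstrap" argument.

```shell
#!/bin/sh
# Sketch: print the ordered service commands for an enterprise appliance.
# Pass "bootstrap" on the first node (ea1) so mysql bootstraps the cluster;
# omit it on ea2/ea3, where only a status check is needed.
ea_start_sequence() {
  [ "$1" = "bootstrap" ] && echo "service mysql bootstrap-pxc"
  cat <<'EOF'
service mysql status
sudo -u csauser /usr/local/hp/csa/scripts/elasticsearch start
sudo -u csauser /usr/local/hp/csa/scripts/msvc start
service csa restart
service mpp restart
service HPOOCentral restart
EOF
}

ea_start_sequence bootstrap   # commands for ea1
ea_start_sequence             # commands for ea2 and ea3
```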

At this stage there is no useful information in the Openstack user dashboard or the Operations Console. The HAProxy status page is the most useful check here:
http://192.168.10.82:1993/

In one scenario my ea2 showed CSA errors, so I ran through the steps above again on that appliance, which resolved them and everything turned green!

The Co-Stop only blipped to 2/4/6 milliseconds a few times over this process, so I'm happy this is less stressful on the environment!

You can check the Enterprise interfaces such as Marketplace, CSA & OO if you wish here.

Next are the monitoring appliances:

Power on mona1 / cs-monitor1 and wait for the cattleprod / logon screen

Now, I had trouble in one instance shutting down these appliances, so on power-up I saw a LOT of messages and thought it was shafted. I've since noticed that even after a clean power-down it's not happy: you typically see mona1 cycling through "monasca-notification main process ended" messages a LOT, and you may spot a logon prompt on the others hidden among the messages. There is a fix below, but run through the mysql steps below first anyway, as SSH works despite the warnings.

On the ma1 appliance connect to mona1:

ssh cloudadmin@mona1
sudo -i
service mysql bootstrap-pxc
service mysql status
exit
exit

Power on mona2 / cs-monitor2 & wait for a logon prompt
Power on mona3 / cs-monitor3 & wait for a logon prompt
ssh cloudadmin@mona2 sudo service mysql status
ssh cloudadmin@mona3 sudo service mysql status


Start the Update Appliance:

Power on ua1 / cs-update1
From the ma1 appliance, run the refresh on it over SSH:
ssh cloudadmin@ua1 sudo os-refresh-config

That's it - you should be getting health information on the Operations Console & Openstack Console.

Monitor Appliances Fix:

My 3 monitoring appliances are constantly cycling through a pair of errors:

monasca-api main process ended, respawning
monasca-monitoring main process ended, respawning

I've tried power cycling, and bringing up either the first or third one on its own, to no avail. More knowledgeable colleagues pointed me to Admin Guide pages 61 & 62 - "Unexpected Shutdown recovery options".

So, even though the console is going Nuts (!) you can still SSH onto these appliances and see what's wrong.

ssh cloudadmin@mona1
sudo -i
cat /mnt/state/var/lib/mysql/grastate.dat
Check the seqno value; mine was -1 on all 3 appliances, but I ran the following command on mona1 anyway:
service mysql bootstrap-pxc
service mysql status
Then I exited twice, SSH'd into each of the other 2 appliances, and ran:
service mysql restart
service mysql status
I then restarted the 3 appliances, one at a time:
ssh cloudadmin@mona1 sudo shutdown -r now
ssh cloudadmin@mona2 sudo shutdown -r now
ssh cloudadmin@mona3 sudo shutdown -r now
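Checking the seqno across all three nodes can be scripted instead of cat'ing the file on each one. This is a hypothetical helper: it assumes the usual `key: value` layout of grastate.dat shown above; a seqno of -1 on every node means none of them shut down cleanly.

```shell
#!/bin/sh
# Hypothetical helper (a sketch): pull the seqno out of a grastate.dat file so
# the values on all 3 nodes can be compared, and the node with the highest
# (most recent) seqno bootstrapped first.
galera_seqno() {
  awk -F': *' '$1 == "seqno" { print $2 }' "$1"
}

# On each mona appliance:
# galera_seqno /mnt/state/var/lib/mysql/grastate.dat
```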

So far, no difference; it was the next set of commands that fixed this for me.
ssh cloudadmin@mona1
sudo -s
export PYTHONPATH=/opt/vertica/oss/python/lib/python2.7/site-packages
su dbadmin -c 'python /opt/vertica/bin/admintools -t view_cluster -d mon';
At first, while mona3 was restarting, this command showed the following:

 DB  | Host         | State
-----+--------------+--------------
 mon | 192.168.0.33 | INITIALIZING
 mon | 192.168.0.34 | INITIALIZING
 mon | 192.168.0.35 | DOWN

Then it changed to this a few moments later:

 DB  | Host | State
-----+------+-------
 mon | ALL  | DOWN

As all 3 nodes were down, we need to restart Vertica from the last known good state. Copy the vertica_admin_password="XXXXX" value out of /home/cloudadmin/hosts:
vi /home/cloudadmin/hosts
If you're using PuTTY you can copy the whole file out of the SSH session, paste it into Notepad, and extract the exact password. Then run the following command:
su dbadmin -c 'python /opt/vertica/bin/admintools -t restart_db -d mon -e last -p <vertica_admin_password>';
This should do the trick, or there is a further command to force the issue:
su dbadmin -c 'python /opt/vertica/bin/admintools -t restart_node -s 192.168.0.33,192.168.0.34,192.168.0.35 -d mon -p <vertica_admin_password> -F'
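Extracting the password can also be scripted rather than copy-pasted. This is a sketch that assumes the hosts file contains a line of the form vertica_admin_password="XXXXX", as described above.

```shell
#!/bin/sh
# Hypothetical helper (a sketch): pull the Vertica admin password out of the
# hosts file directly, assuming a vertica_admin_password="XXXXX" line exists.
vertica_password() {
  sed -n 's/.*vertica_admin_password="\([^"]*\)".*/\1/p' "$1"
}

# PW=$(vertica_password /home/cloudadmin/hosts)
# su dbadmin -c "python /opt/vertica/bin/admintools -t restart_db -d mon -e last -p $PW"
```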

Hopefully you can repeat the view_cluster command used earlier to confirm the status as follows:

 DB  | Host | State
-----+------+-------
 mon | ALL  | UP

All good!