Monday, September 24, 2012

When too much RAM hurts: CentOS/Red Hat 5 and writes

Had an odd troubleshooting session on Linux.  DB server, over 200G of RAM, having problems with DB queries every 30 seconds.

The symptom looked like blocking on I/O, but there was not much actual I/O happening.  The DB was on NFS without an async mount, so every write had to be acknowledged.  The problem was not really apparent on local disk that had write cache on the RAID.  A 1G NFS mount caused more problems than 10G.  So it kinda looked like write acknowledgement was the problem.  The DB was on a beefy NetApp, so performance should have been awesome.  Checked all the best-practice stuff (NFS window size, NetApp options, MySQL .conf options, etc.).  Had not gone to async on the mount yet.

Finally, looking at what MySQL was waiting for and how much bandwidth was going to the NFS mount on the NetApp (nmon, iptraf, vmstat, iostat, top, etc.), we found it waiting on a write, but with hardly any iowait showing.

Then found this doc:

CentOS 5/Red Hat 5 sets aside cache as a % of memory.  Good.  It flushes that cache once it reaches a certain amount of memory or after 30 seconds.  Also good.  Unless you have something like 28G of cache because it is taking 10% of physical memory.  Then flushing 28G can be a problem.

So with a write-intensive load, turning that down to something the interface can keep up with makes a huge difference.  We went down to 1% of RAM and a flush basically every 3 seconds, although I'm considering tighter tuning on the flush interval.  Problem gone.  Writes stream to disk no problem.
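To put rough numbers on why a 28G flush hurts, here is a back-of-the-envelope sketch.  The 280G RAM figure is only what "28G at 10%" implies, and the 110 MB/s usable rate for a 1G NFS link is my assumption, not something measured here.  The function name is made up for illustration.

```shell
#!/bin/sh
# Rough estimate of how long a full cache flush takes over a given link.
# All inputs are illustrative assumptions, not measurements from this server.

# flush_seconds RAM_GB CACHE_PCT LINK_MB_PER_SEC
flush_seconds() {
    ram_gb=$1
    cache_pct=$2
    link_mb_s=$3
    # Size of the cache in MB when it holds CACHE_PCT of RAM
    cache_mb=$(( ram_gb * 1024 * cache_pct / 100 ))
    # Whole seconds to push that much data at the given rate
    echo $(( cache_mb / link_mb_s ))
}

# 280G RAM (assumed), 110 MB/s usable on a 1G link (assumed)
echo "10% cache over 1G: $(flush_seconds 280 10 110) s"
echo " 1% cache over 1G: $(flush_seconds 280 1 110) s"
```

With those assumed numbers, the 10% case works out to about 260 seconds of writing per flush and the 1% case to about 26.  On Linux the knobs that usually control this are the vm.dirty_background_ratio, vm.dirty_ratio, vm.dirty_expire_centisecs and vm.dirty_writeback_centisecs sysctls; which of them to change, and to what values, depends on your kernel and workload.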

pSCP and getting Cisco IOS onto Cisco routers

I had a great time with pSCP failing to copy an image to an ASR1000 Cisco router recently.

The router was set up to allow SCP: basically set up aaa so the local or remote user has exec and enable etc., and turn on the SCP server with the command "ip scp server enable".  Full document here:

I was trapped on a Windows jump box with pSCP instead of scp from the OpenSSH tools you get on Linux or OS X.  Which is fine, but maybe the logging would have been easier there.  My pscp kept failing.  Turned on logging on the router and the client and only got this obscure message:

SSH-4-SSH2_UNEXPECTED_MSG: Unexpected message type has arrived. Terminating the connection

Hmmm... so the secret is pSCP did not default to actual SCP.  Here is the command line that got me over:

pscp -scp -2 c2951-universalk9-mz.SPA.152-4.M1.bin username@

Seriously, I had to force SCP and SSH v2 with the two switches at the beginning.  The rest is as you'd expect: source, then target as username@host:location.  Crazy.  Apparently it was trying to use SFTP.  Good to know that the utility can do that as well, but it is called pscp, so wouldn't the default be scp?  Maybe something in the negotiation.  I don't know.  That was an extra hour of my day that I hope I'm saving you next time.

Tuesday, May 19, 2009

Netapp ASIS Dedupe and VMWare

So back here I told you about ASIS dedupe volume limits:  Netapp asis dedupe

Why would I want tier 1 block based dedupe?  Well it doesn't work on unstructured data that well, I mean you might get some dedupe, but who knows how much.  

Where it makes crazy money is VMWare.  This isn't specific to VDI.  Snapshots and dedupe allow you to quickly back up your VMWare datastores while using almost no storage.  It also allows you to put 50 VMs in 100GB of space (well, it depends on your VMs, but that is what I have right now).  Normally 50 VMs x 30GB for host OS = 1.5TB of space.  Yeah.  That makes some sense and cents.  You can usually dedupe down to about half the space, and with very similar machines, like clones, it works far better than that.

So what is the downside?  Well, dedupe uses CPU on your NetApp, so if your NetApp is pegged, don't use it.  And there is the vol size thing linked above (but that isn't usually a problem).  Another interesting thing is that you no longer have the data spread across many spindles, which means fewer IOPS.  But there is an interesting fix for that: the PAM card.  Yup, a cache card for the NetApp that keeps those commonly used deduped blocks (which are basically the OS data) in cache and gets you tons of IOPS.  It is only available on the 3040 or faster (31xx or 6xxx shipping product).  So I've run a pretty good sized storage backend for VMWare off of 1 shelf and a PAM card.  If it were a lab, you could even do SATA and a PAM card.
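The arithmetic behind that, as a quick sketch.  The function name is made up for illustration; the 50 VMs, 30GB each and 100GB used are the numbers from my datastore above.

```shell
#!/bin/sh
# savings_pct LOGICAL_GB PHYSICAL_GB -> whole-percent space saved
savings_pct() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

vms=50          # number of VMs
gb_per_vm=30    # host OS footprint per VM
used_gb=100     # space actually consumed after dedupe

logical_gb=$(( vms * gb_per_vm ))
echo "logical ${logical_gb}GB, used ${used_gb}GB, saved $(savings_pct "$logical_gb" "$used_gb")%"
```

Swap in your own VM count and sizes; the more typical "about half" case is just savings_pct with the physical number at half the logical one.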

This thing is like peanut butter and jelly- ASIS dedupe and VMWare.

Saturday, May 16, 2009

Platforms for OS based Appliances

Sometimes I need a box for snort sensors or linux appliances, or BSD or even firewall distributions:  Has some cool security oriented appliances- the Teak series.  

With iBase and Win Enterprise as competition in this sector.  Some have SSL acceleration or other cool security chips.

Here are some really cool boxes either way though:

The first is a full fledged tiny PC- also with bargain price:

Fit-PC2  It is a full PC even though it is small enough to stick to the back of your monitor.  Power consumption is a ridiculously low 5-8W.

The second is Hero Logic, in competition with Soekris, but it seems like a great polished product.  I'd love to try one out.

Friday, May 15, 2009

ESXi- VMWare's lightest and easiest Hypervisor

ESXi is lightweight and has almost all the features of ESX.  It is super easy to setup.

But did you know you can run it or install it from a USB key?  Well not at the same time, but check this out:

They are a little Windows-centric.  If you are on OS X or Linux it is easier (as you have dd and the Linux boot tools on there).  But you get the point.

PC to Thin client Conversion

Want a terminal box no moving parts, centrally controllable, that speaks all the major terminal dialects (like citrix, VDI, rdp) made from your current old PCs?

PC to Thin Client Conversion to the rescue:,navigation_id,1297,_psmand,9.html

Apparently they are third in market share for terminals (HP and Wyse being in the lead)- and the conversion product is pretty cool.  Trying it in a pilot right now.   It isn't for everyone, but it is a compelling product.

Tuesday, May 05, 2009

Storage VMotion (with a gui)

Ever put some VMs on local disk and wanted to move them to your shared storage?  Have you wanted to do that without outage?

Maybe you wanted to patch an ESX box that has local VMs without bringing them down?

Maybe you didn't listen to me and used a block protocol instead of NFS for your VMs, and now you have to resize your LUNs and you didn't use something that resizes gracefully?

Then use storage vmotion:

For free with a little gui in case you are not a CLI junky.

Thursday, April 23, 2009

Does my Mac support VT instructions?

1) open a terminal
2) sysctl -a | egrep -i "vmx|svm"

Output will include a line like this with vmx in it:


Unless you are on an AMD somehow.  svm is the CPU flag for AMD's virtualization extensions, vmx is Intel's.  You really only have to look for vmx, but I include both for completeness.

On Linux you do:

egrep -i "vmx|svm" /proc/cpuinfo
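If you want an answer instead of a wall of flags, here is a minimal sketch that wraps the same check.  The function name vt_support is made up for illustration; it just looks for the vmx or svm flag in whatever file you hand it.

```shell
#!/bin/sh
# vt_support FILE -> says which virtualization flag, if any, appears in FILE.
# FILE is expected to hold CPU flags, e.g. /proc/cpuinfo on Linux.
vt_support() {
    if grep -qiw 'vmx' "$1"; then
        echo "Intel VT-x (vmx)"
    elif grep -qiw 'svm' "$1"; then
        echo "AMD-V (svm)"
    else
        echo "no vmx/svm flag found"
    fi
}

# Only run against /proc/cpuinfo where it exists (Linux)
if [ -r /proc/cpuinfo ]; then
    vt_support /proc/cpuinfo
fi
```

On a Mac you could save the sysctl output to a file first and point the function at that, since the match is case-insensitive.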

Monday, April 13, 2009

Riverbed Steelhead Password Recovery

If you lock yourself out of your Steelhead box (tested on 4.0 or later), you can recover the password much like on a router:

Get console access.

Boot the appliance (or reboot).

When you see the word grub, immediately press E.

Another GRUB menu appears, with options similar to these:
0: root (hd0,1)
1: kernel /vmlinuz ro root=/dev/sda5 console=tty0 console=ttyS0,9600n8

Select the line with kernel in it using the up and down arrow keys.

Press E to edit the kernel boot parameters.

Append " single fastboot" at the end of this line.  Note the space before 'single'; it is very important.  (And do not enter the quotes.)  Press Enter.

Press the B key to continue booting.

After the system starts, at the command prompt, type "/sbin/" and press Enter.

The password will be blank.

Type "reboot" and press Enter to reboot the appliance.