John Allspaw of Flickr Ops fame presented his “Operational Efficiency Hacks” talk at Web 2.0 Expo (terrible name!). His slides make some very fine points and include cool little tricks, such as using IM bots for operational changes and logging! I especially love his last slide :-)
John’s slides also contain a link to the all-powerful Jprall’s 85-operations-rules-to-live-by, which every OpsMgr should have stapled to his forehead and be forced to memorize and recite at regular intervals throughout the day!
For those of you interested in a more structured way of measuring your Ops dept’s service management capabilities and perhaps (uh-oh, scary!) related performance, Ops Scoreboard, on the Google Code site, is a welcome newcomer on the free ITSM tools scene.
Microsoft has released a Business Intelligence framework for all the ITSM-aware Operations Managers out there. It has some very nice tools inside to visualize the BI metrics lurking in the murky depths of your OpsMgr data warehouse database!
Downloadable Whitepaper and example reports and dashboards here.
Was recently beating my head against the wall trying to troubleshoot some COM+/MSDTC related timeouts on one of our production SQL servers.
The non-transactional queries behind the website’s public frontends were running beautifully, but in the admin backend some transactions were simply timing out before the data could be committed to the DBs. This occurred on random webservers at different times throughout the day. A very frustrating issue, needless to say.
Our issue was eventually resolved by increasing the DTC transaction timeout from the default value of 60 seconds to 540 seconds. No more timeouts and happy customers once more!
Got some great pointers and tips from a site that I wanted to share: http://vyaskn.tripod.com/watch_your_timeouts.htm has some very useful info on troubleshooting the components involved in COM+/MSDTC, SQL, IIS, ASP.NET etc. Worth a look!
Trying out the new ESXi 4.0 hypervisor on a 64-bit Dell SC1420 server, I ran into this issue:
“ESXi Unsupported BIOS setting CPUID is limited” – and the ESXi kernel exited to a debugger.
Hmm …
A bit of Googling around, and apparently Dell stops the ESXi kernel from reading the CPUID properly, but the solution is simple:
At the initial bootloader screen, press TAB to edit the boot options
Hold down the left-arrow key to move the cursor back to the beginning of the boot options, and then add “nocheckCPUIDLimit” after “vmkernel.gz”, so the whole thing becomes “mboot.c32 vmkernel.gz nocheckCPUIDLimit ---”
Hit ENTER.
Thanks to this guy for clearing that up. Lots of good ESX info at that site – check it out!
UPDATE: In the new ESXi 4.0 bootloader, more modules have been added to the list, including a vmkboot.gz, which now loads before the vmkernel.gz module.
This means the correct loader syntax would actually read: mboot.c32 vmkboot.gz nocheckCPUIDLimit --- vmkernel.gz --- [etc]. Hope this clears up any minor issues.
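If you want to sanity-check the edited line before typing it at the boot prompt, here is a rough Python sketch of the rule from both versions of the fix: the flag goes right after the first module that mboot.c32 loads. The function name `add_boot_flag` is made up for illustration and is not part of ESXi. It is plain string handling, and it assumes the modules are separated by three hyphens.

```python
# Hypothetical helper, not part of ESXi: plain string handling on the boot line.
FLAG = "nocheckCPUIDLimit"


def add_boot_flag(boot_line: str, flag: str = FLAG) -> str:
    """Insert `flag` right after the first module loaded by mboot.c32."""
    tokens = boot_line.split()
    if len(tokens) < 2 or tokens[0] != "mboot.c32":
        raise ValueError("expected 'mboot.c32 <first module> ...'")
    if flag in tokens:
        # Flag already present: return the line unchanged (whitespace normalized).
        return " ".join(tokens)
    # tokens[1] is the first module (vmkernel.gz originally, vmkboot.gz in the
    # newer bootloader); the flag is placed immediately after it.
    return " ".join(tokens[:2] + [flag] + tokens[2:])


if __name__ == "__main__":
    print(add_boot_flag("mboot.c32 vmkernel.gz ---"))
    print(add_boot_flag("mboot.c32 vmkboot.gz --- vmkernel.gz ---"))
```

The same function covers the original case and the UPDATE case, because in both the flag simply follows whichever module comes first.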
Simply the best introduction/manual for implementing proper process control in your IT department – read it first and then read the ITSMF ITIL intro, it’s the best 5-10 hours you will spend in your whole IT career – seriously!