The problem: you cannot boot a paravirtualised machine from a CD-ROM for the purposes of installing a virtual machine. You may also be on a wireless link set up by NetworkManager, where wlan0 isn’t a bridged interface.

Here’s the solution:

    1. Download the ISO of your favourite distro and burn it to DVD, then mount it on your machine (this will probably happen just by inserting the disc into your drive).  If a window opens on your desktop, highlight the path in the address bar and copy it to the system clipboard (CTRL-C).
    2. Install Apache and start the apache/httpd service.
    3. In /var/www/html (/var/www on Debian, I believe), simply create a symbolic link to the directory where the DVD is mounted.  In this example, I am using CentOS:
       #  ln -s /media/CentOS_5.4_Final/ centos
    4. Now create the virtual machine by starting up virt-manager, ensuring that it can connect to Dom0, and select New…
    5. In the Installation Source section of the virtual machine creation dialog,  specify the following parameter: Installation media URL: http://localhost/centos (the path to the installer repository)
    6. In the “type of network” selection, select Virtual Interface.
    7. Click through the rest of the set up – but BEFORE YOU COMPLETE IT, GET READY TO PAUSE THE VM. The virtual machine will start up automatically when you finish the set-up steps.
    8. As soon as you start the VM, the initial bootstrapping files should load and the distribution’s kernel should start up.   Only when the console window opens should you pause it!
    9. If you are using CentOS, you now need to modify the configuration file that’s been created, following these steps:
      1. Download the Xen kernel and initial ramdisk from here: http://mirror.centos.org/centos/5/os/x86_64/images/xen/ (change the path if you’re using an i386 host)
      2. Save them somewhere sensible: I made /var/lib/xen/boot and put them in there.
      3. Un-pause and shut down the virtual machine.
      4. Modify the config file to include the paths to the Xen-aware kernel and initrd (put these entries at the top, adjusting for your path as necessary):
        kernel = "/var/lib/xen/boot/vmlinuz"
        ramdisk = "/var/lib/xen/boot/initrd.img"
      5. IMPORTANT – also comment out the line for pygrub, like so: #bootloader = "/usr/bin/pygrub"
      6. Save the config and run the virtual machine. Nearly there!  Now open up the console to the virtual machine…
    10. If you are prompted for a network address or DHCP, try DHCP.
    11. If you are prompted for an installation path, stick to http. In a network interface dialog that may appear, choose a manual address that doesn’t conflict with other hosts on your real network (but make sure it’s valid for your network!).
    12. Because the VM now has a virtual network interface, http://localhost/centos is a meaningless path.  If the installer identifies this and prompts for an alternative path to the stage2.img file [true in CentOS, at least], then run the following on your host (real) machine:
      # ifconfig wlan0
      (substitute eth0 for wlan0 if you’re using a wired ethernet connection)
    13. Paste/type the IP address from the output of ifconfig into the path dialog of the halted installer, but keep the /centos/ directory.
    14. The installer should then run through the rest of the motions and voilà – a paravirtualised virtual machine installed from local CD/DVD-ROM.

      When the installer has finished running, uncomment the pygrub line in the config file.
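      For reference, here’s a minimal sketch of the relevant part of the domU config file (usually kept under /etc/xen/) – the install-time state first, then how it should look once the installer has finished. Treat it as illustrative; your paths may differ:

        # During installation: boot the downloaded Xen-aware kernel directly
        kernel = "/var/lib/xen/boot/vmlinuz"
        ramdisk = "/var/lib/xen/boot/initrd.img"
        #bootloader = "/usr/bin/pygrub"

        # After installation: let pygrub find the kernel installed inside the guest
        #kernel = "/var/lib/xen/boot/vmlinuz"
        #ramdisk = "/var/lib/xen/boot/initrd.img"
        bootloader = "/usr/bin/pygrub"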

      If you spot any errors with this process, please let me know so I can correct the procedure.

      Happy Christmas!   *<(:-##

      I recently came across this slightly bizarre issue.  I was trying to mount an NFS share from one server to another, using very loose permissions (I was basically sharing a DVD to a machine which had no DVD-ROM drive).

      So, what was happening?  Well, basically nothing.  On the NFS server (the machine with the DVD exported) I ran tcpdump to see what traffic was being received (the server was IP 192.168.10.200):

        # tcpdump -nn | grep 192.168.10.1

      No output was displayed when I was trying to mount the share on the client. None at all.  Well, almost none.  The one bit of output that got me wondering was a broadcast packet which was received from the client.


        10:14:45.651572 arp who-has 192.168.10.201 tell 192.168.10.1

      The IP address 192.168.10.201 was a typo made by me the day before.  I’d meant to type in .200 in my mount string.  My incorrect mount command thus read:
       
        # mount -t nfs 192.168.10.201:/mnt/share /mnt/dvd

      It seemed strange that an incorrect mount command that I’d typed in yesterday (and then hit CTRL-C to) might still be working in the background.

      Back on the client, I realised that the mount command perhaps worked in a queue/serial-like way: each mount command would have to complete – either successfully or not, so long as it finally returned – before the next one was attempted.  Checking out this theory, I investigated local processes:

        # ps ax | grep mount

      Sure enough, there were lots of mount entries pointing to the wrong IP address.  These were all my attempts to mount a non-existent server’s share to a local directory.  Dumb mistake, eh.  Still, CTRL-C didn’t cancel the mount request, which continued to run in the background.

      The easiest solution was to reboot the machine, but in situations where that’s not practical, killing the rogue processes should suffice.
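      For example – a sketch, assuming the stuck mounts are the ones pointing at the typo’d address:

        # ps ax | grep 'mount.*192.168.10.201'   # list the rogue mount processes and their PIDs
        # kill <PID>                             # a polite TERM first, per PID
        # pkill -f 192.168.10.201                # or match them all in one go

      One caveat: a process blocked in uninterruptible I/O (state D in ps) won’t die even with kill -9, and then a reboot really is the only way out. The NFS retry= mount option, which limits how many minutes mount keeps retrying before giving up, would also have stopped my typo’d command hanging around for a day.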

      3 Nov 2009
      I have recently been conducting a little research into hosting companies/ISPs/data centres to understand more about their speed.

      One hosting provider in the UK, UKFast, has recently been marketing the advantages of speed as a prime factor.  Consequently, they have allegedly invested 25% of their profits year on year into improving their internet connectivity, while at the same time ensuring that they never exceed [by that I infer “sell more than”] 40% of total bandwidth available*.  Fair play – we all like stuff to be faster.  I was also pointed to a third-party web site that provides speed measurements of UK-based web hosting providers – WebCop.
      * I was told this by a UKFast sales representative.

      I was interested by WebCop’s claims, namely that by distributing their testing servers across UK-based hosting centres, they eliminate bias towards any one datacentre and concentrate instead on the actual, average throughput delivered by them.  It’s a fair claim, but there could be issues.  Today, I sent them this message:

      Hi,

      I’m interested by your web hosting speed statistics, for two main reasons.

      Firstly, there isn’t much info on your site about how you conduct tests – e.g. which web sites are used to measure the hosting companies’ relative speed.  This concerns me, as hosting companies can easily make the most prominent web sites the fastest, putting them on the front line of the data centre while allocating less bandwidth to smaller web sites.

      Secondly, you don’t mention from where you test each hosting company’s sites/servers.  So, for example, you could be testing a London-based server using servers in Manchester and Leeds, but the contention in one direction may be significantly higher than in the other direction.  Therefore, you could have skewed results.  In addition to this, if one hosting provider/ISP has a faster network, how can you prove this by testing on their competitors’ slower networks?

      I’m looking forward to hearing back from them.  Currently UKFast appears to have leapt ahead in terms of the speed ratings, according to WebCop.

      Whois WebCop?

      Good question.  I ran a #whois on webcop.co.uk and found that the domain is registered to a company in the Netherlands that has a PO Box address in Gibraltar!  Because whois output is subject to Nominet copyright, I cannot redistribute it here.  But if you want to see it, try www.123-reg.co.uk.

      I have tried to dig a little deeper; the web is very unrevealing of a company that seemingly wants to stay hidden. I did find out that UKFast’s sister brand, GraphiteRack.com, registered their domain name through ENom, the same registrar that WebCop used, but nothing more.

      The public-facing WebCop server seems to be hosted by Tagadab.com, a Clara.net Group company. Interesting that a company (WebCop) with testing servers distributed across the UK uses a London-based ISP with only 6 IP addresses allocated from IANA and some very “competitive” prices.  Perhaps they want to keep their web traffic well away from testing servers…

      Stay tuned…

       5 Nov 2009
      Not heard anything from WebCop yet…

       9 Nov 2009
      I got a reply from WebCop:

      Our testing servers are located on many different networks and effort has been taken to ensure that they are also geographically evenly located throughout the country. This means that if we test a server located in London it will be tested from all over the country and the average result given. This allows us to display results that are averaged not only across different provider’s networks but also  across different geographical locations.

      As for your first point, we are currently addressing this and looking to find the best way to ensure that providers don’t cheat in the same way we know they do for Webperf testing. Currently for the larger providers we test a server located in the standard customer space and not their main website, and for smaller providers we test their company website server. We are looking for a way to make this fairer and are working with some of the larger providers to do this.

      On the surface this is a fair strategy. However, it’s very, very easy for a data centre to prioritise traffic to/from a particular machine.  My feeling is that this could be happening already although, of course, I can prove nothing.

      My gut instinct tells me that if the majority of datacentres in the UK felt they could increase sales by claiming the fastest network connectivity, they would.

      However, every UK datacentre (apart from one) seems to hover around the same speed of connectivity, which suggests that either the system of tests is not respected amongst the datacentre community (in other words, it isn’t perceived as being particularly accurate), or the service provided by one is much faster than the bigger ISPs with which it peers… which seems rather unlikely.

      I respect the WebCop team for this endeavour, but strongly feel that until the testing methodology is properly published for the networking and datacentre community, there can be little value in its findings.

      Currently having a couple of issues with laptop and server.  Hmm… scratching head, thinking cap on, etc.

      Problem 1 – laptop swap
      On my laptop, I recently resized my root partition (an LV) and removed/recreated my swap partition (also a logical volume).  On my new swap, I used mkswap, added an entry in fstab and turned it on.  All rudimentary stuff.
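      For reference, the recreation went something like this (the volume group name and size are illustrative):

        # lvcreate -L 2G -n lv_swap VolGroup00    # recreate the swap LV
        # mkswap /dev/VolGroup00/lv_swap          # write the swap signature
        # swapon /dev/VolGroup00/lv_swap          # enable it immediately

        # ...and the corresponding /etc/fstab entry:
        /dev/VolGroup00/lv_swap   swap   swap   defaults   0 0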

      But when the system came to using it, it hung.  No response in X whatsoever, although there seemed to be disk polling going on, suggesting the kernel was still operational.  I couldn’t flip to another console or SSH in to find out, though.

      I created a new partition directly on the disk, not using LVM, and made that a swap too. Same set-up procedure as before, then activated it.  This time, when the system needed to swap, it did – as you would expect it to.  Bizarre.  I can’t think why this might be happening, apart from something going wrong in the device mapper, maybe.
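      If I revisit it, the first things I’d check – a sketch:

        # swapon -s                    # confirm which swap devices are actually active
        # dmsetup table | grep swap    # inspect the device-mapper mapping behind the swap LV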

      Problem 2 – server dump
      The second problem I’m having is backing up the server using dump.  In short, when I dumped out a level 0 backup, not all my files were copied.  Strangely, directory sizes on the tape, and when restored, seemed padded/boundary-aligned – e.g. 4 KB, 8 KB or 16 KB.  I’m trying to solve this one too, and am using tar in the meantime (which, if testing proves positive, I may stick with).
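      For what it’s worth, the tar commands I’m testing with look something like this (tape device and paths are illustrative). GNU tar’s --listed-incremental option can even mimic dump’s levels via a snapshot file:

        # tar -cvpf /dev/st0 --one-file-system /home                      # full ("level 0") backup, preserving permissions
        # tar -cvpf /dev/st0 --listed-incremental=/root/home.snar /home   # subsequent runs with the same .snar are incremental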

      The full title of this blog should really be ‘SELinux is preventing mysqld (mysqld_t) “search” to ./tmp (public_content_rw_t)’ as that is the problem I’ve been having with CentOS recently (and hence my searches on the web for a solution).

      The cause of the problem

      I use SugarCRM for customer and project management data – and very good it is too! (Gratuitous plug – I can help your company install and use this fine software :-) ). Except that recently, when listing my Accounts within Sugar, I would not see all of the account context. Only the account data itself would be displayed and none of the subpanels/links.

      The query to retrieve more data was failing, with this error message displayed in the browser window:
      mysqld: Can't create/write to file '/tmp/#08y2jw' (Errcode: 13)

      In my system log (/var/log/messages), I also got multiple SELinux errors like this:
      Oct 13 09:07:50 server setroubleshoot: SELinux is preventing mysqld (mysqld_t) "read" to ./tmp (public_content_rw_t). For complete SELinux messages. run sealert -l 1762c478-f3a2-4eeb-be09-bd3dc037d945

      Clearly, the reason for “Errcode: 13” was SELinux.

      Incidentally, if you have seen a similar error on your web site but with (Errcode: 28) instead, this is likely due to a shortage of disk space. A great way of decoding operating system error numbers like this is to use ‘perror’, thus:
      # perror 28
      OS error code 28: No space left on device

      # perror 13
      OS error code 13: Permission denied

      So there we are – two distinct and different issues.

      With SELinux, resolving the permission issue can be difficult. By issuing # sealert -l 1762c478-f3a2-4eeb-be09-bd3dc037d945, as suggested above, I got the following output (trimmed and highlighted for clarity):

      Summary:
      SELinux is preventing mysqld (mysqld_t) "search" to ./tmp (public_content_rw_t).
      Allowing Access:
      Sometimes labeling problems can cause SELinux denials. You could try to restore
      the default system file context for ./tmp,
      restorecon -v './tmp'
      Additional Information:
      Source Context root:system_r:mysqld_t
      Target Context system_u:object_r:public_content_rw_t

      First things first: issuing # restorecon -v './tmp' didn’t fix it for me. I was also surprised to see that the path to /tmp was relative to the current working directory, so I tried a slightly modified # restorecon -v '/tmp', but to no avail. After restarting mysqld, the problem persisted: MySQL was simply being refused access to /tmp. Somewhere, a policy is disallowing this.

      It’s a mistake to assume that the source context and target context should be the same; they don’t have to be, as it’s entirely policy-driven.  I highlighted the file Types above to flag this incorrect assumption (one that I previously held).

      Find and fix a policy?

      Although finding the troublesome policy and analysing it is a Good Thing, it’s also time-consuming and requires significant knowledge of SELinux, chiefly to avoid creating security holes. A better way, I found, was simply to relocate where mysqld tries to store temporary data.

      Thanks to Surachart Opun’s blog, I learned that you can specify a new location for temporary files. In /etc/my.cnf, add or edit the following:

      [mysqld]
      #tmpdir=/tmp                   # the default
      tmpdir=/var/lib/mysql/tmp      # e.g. relocate temporary files here

      Now do the legwork to set up the directory properly:

      First, create the directory with appropriate permissions:
      # cd /var/lib/mysql
      # mkdir tmp
      # chown mysql:mysql tmp
      # chmod 1750 tmp

      Now set the SELinux context up:
      # chcon --reference /var/lib/mysql tmp

      and make the SELinux context permanent:
      # semanage fcontext -a -t mysql_db_t "/var/lib/mysql/tmp(/.*)?"

      Finally, restart mysql:
      # service mysqld restart
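      To double-check the result, list the directory’s security context – the type field should now read mysql_db_t:

      # ls -dZ /var/lib/mysql/tmp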

      Closing thoughts: optimisation

      The methods above fixed the particular problem I was having. They didn’t, however, actually pinpoint the cause. This is one of the good things about Linux and SELinux in particular: you are forced to rethink what the system is doing and work out a solution that sits within the predefined security context – or learn how to write SELinux policies. Personally, I prefer the former ;-)

      There is an additional benefit to the solution above – namely, optimisation. Because we have specified the security context with semanage, we are free to mount an external file system at that location and use it for MySQL’s temporary files. In other words, we can maintain the security but increase the performance.  One such filesystem is tmpfs: an in-memory filesystem which is capped at a fixed maximum size but only consumes as much RAM as the files it actually holds. It is much quicker than an on-disk filesystem and thus perfectly suited to temporary, caching data. There are many resources about tmpfs on the web; a good introduction can be found at Planet Admon.
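      As a sketch, an /etc/fstab entry for that might look like the line below. The 256 MB cap is an arbitrary example, and the context= option – which labels the whole mounted filesystem at mount time – is my assumption for keeping the SELinux type correct on a freshly mounted tmpfs:

      tmpfs   /var/lib/mysql/tmp   tmpfs   size=256m,mode=1750,context=system_u:object_r:mysql_db_t:s0   0 0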

      I recently found myself having the need to revoke an old certificate. The steps are actually quite straightforward, but you do need to have your old revocation certificate to hand.

      For more info, visit the GNU Privacy Guard site: http://www.gnupg.org/gph/en/manual.html

      Simply follow these steps. In a terminal, issue:

      • gpg --import rev.asc   (rev.asc being the saved revocation certificate for my-old-key@mydomain.com, key ID 0x712AC328)
      • gpg --keyserver certserver.pgp.com --send-key 712AC328

      That’s it!
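      One extra tip: if you don’t have a revocation certificate for your current key, generate one now, while you still hold the private key and remember its passphrase, and store it somewhere safe:

      • gpg --output rev.asc --gen-revoke 712AC328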

      My eyes have been opened to yet another case of foul play by a megacorp.

      As if their package management isn’t disgusting enough (www.theregister.co.uk/2008/07/23/enormouse/), their BIOS configuration on laptops of the recent few years leaves MUCH to be desired.

      On the bottom of my HP Compaq 6715b laptop is a removable panel which covers a memory slot and a Mini-PCI-E connector. “Great”, I thought, having a non-functioning Broadcom card in there, “I’m going to install an Intel 4965 AGN wireless card here because it’s supported by the firmware/kernel I use (CentOS 5.3) – and I’m loath to build a new kernel when I can just plug in a new card ;-)”.

      My card, £15 off eBay, arrived this morning and I carefully fitted it. Booted the machine, went into the BIOS settings to ensure it was enabled, and… wait a minute, it isn’t listed. Perhaps it’s broken… or… perhaps HP have imposed a blacklist of vendors/subsystems which THEY don’t allow to be recognised in MY computer. Not listed in lspci, nor dmesg… basically nowhere.

      Is this legal? Did I ever see any restriction declared ANYWHERE before buying this machine that stated “HP retains sole right to how this machine is used and with what”..?

      What point is there putting this restriction in?! Someone buying a budget laptop isn’t going to source their over-priced parts from the OEM! Why, darn it, why?!

      If you have – or rather, are thinking of buying – an HP, Dell or IBM laptop, I’d suggest reading these first:

      http://www.engadget.com/2005/02/22/the-hp-bios-that-locks-non-whitelisted-mini-pci-cards-out/
      http://www.aigarius.com/blog/2008/02/07/sneaky-blacklisting-of-wifi-in-hp-laptops/
      http://www.richud.com/HP-Pavilion-104-Bios-Fix/
      http://www.paul.sladen.org/thinkpad-r31/wifi-card-pci-ids.html

      I’ve thought about actually re-flashing my BIOS with modified code, partly out of sheer bloody-mindedness towards HP (oh, and I would publish, intricately, the solution), and partly just out of the practical need for wireless networking. But now, I’m just baffled by the whole thing.

      Hilariously, as a final insult, the latest BIOS update from HP for my machine “updates the Computrace OPTION ROM to version 866”. So… you’re telling me I have this “Computrace OPTION ROM” installed, huh?

      HP – http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=321957&prodSeriesId=3368540&swItem=ob-67085-1&prodNameId=3356624&swEnvOID=2094&swLang=8&taskId=135&mode=4&idx=3

      BlackHat (deactivate the rootkit) – http://www.blackhat.com/presentations/bh-usa-09/ORTEGA/BHUSA09-Ortega-DeactivateRootkit-PAPER.pdf

      Is there no goodness left in the world? Can old ladies not be helped across the road any more? MUST we buy battery hen eggs instead of free range?! Like, where’s the love, man..?

      Yes, it seems too good to be true.

      Well, guess what?! It IS!!

      That’s right. Your old tat (or, you could say, my old tat) is just about as worthless to everyone else as it is to me. I’ve spent ages on eBay and sold almost nothing. And what I have sold, I sold for 99p.

      Give to charity instead, that’s what I should have done! Pah!

      🙂

      No doubt open-source proponents will rejoice over this news: The British government has decided to increase its use of open-source software in the public services field. It will be adopted over Windows whenever it delivers the best value for money. Schools, government offices and public agencies will all give open source a new look.


      It’s worry-time on the server:

      # tail -20 /var/log/messages
      Feb 25 10:09:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
      Feb 25 10:39:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
      Feb 25 10:39:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
      Feb 25 11:09:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
      Feb 25 11:09:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
      Feb 25 11:39:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
      Feb 25 11:39:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
      Feb 25 12:09:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
      Feb 25 12:09:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
      Feb 25 12:39:31 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
      Feb 25 12:39:31 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
      Feb 25 13:09:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
      Feb 25 13:09:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
      Feb 25 13:39:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
      Feb 25 13:39:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
      Feb 25 14:09:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
      Feb 25 14:09:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors
      Feb 25 14:39:32 myserver smartd[2785]: Device: /dev/sdc, 9 Currently unreadable (pending) sectors
      Feb 25 14:39:32 myserver smartd[2785]: Device: /dev/sdc, 3 Offline uncorrectable sectors

      .. and so it goes on. So, I’ll check it out by pulling up the drive’s SMART data with smartctl:

      # smartctl -a -d ata /dev/sdc
      smartctl version 5.36 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
      Home page is http://smartmontools.sourceforge.net/

      === START OF INFORMATION SECTION ===
      Device Model: Hitachi HDP725040GLA360
      Serial Number: GEB430RE15UEVF
      Firmware Version: GMDOA52A
      User Capacity: 400,088,457,216 bytes
      Device is: Not in smartctl database [for details use: -P showall]
      ATA Version is: 8
      ATA Standard is: Not recognized. Minor revision code: 0x29
      Local Time is: Wed Feb 25 14:55:30 2009 GMT
      SMART support is: Available – device has SMART capability.
      SMART support is: Enabled

      === START OF READ SMART DATA SECTION ===
      SMART overall-health self-assessment test result: PASSED

      General SMART Values:
      Offline data collection status:  (0x82) Offline data collection activity
                                              was completed without error.
                                              Auto Offline Data Collection: Enabled.
      Self-test execution status:      (   0) The previous self-test routine completed
                                              without error or no self-test has ever
                                              been run.
      Total time to complete Offline
      data collection:                 (7840) seconds.
      Offline data collection
      capabilities:                    (0x5b) SMART execute Offline immediate.
                                              Auto Offline data collection on/off support.
                                              Suspend Offline collection upon new
                                              command.
                                              Offline surface scan supported.
                                              Self-test supported.
                                              No Conveyance Self-test supported.
                                              Selective Self-test supported.
      SMART capabilities:            (0x0003) Saves SMART data before entering
                                              power-saving mode.
                                              Supports SMART auto save timer.
      Error logging capability:        (0x01) Error logging supported.
                                              General Purpose Logging supported.
      Short self-test routine
      recommended polling time:        (   1) minutes.
      Extended self-test routine
      recommended polling time:        ( 130) minutes.

      [snip]
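      To actually exercise the drive rather than just read back its attributes, a self-test can be queued and its results inspected – for example:

      # smartctl -t short -d ata /dev/sdc     # queue a short self-test (~1 minute, per the output above)
      # smartctl -l selftest -d ata /dev/sdc  # read the self-test log once it completes
      # smartctl -A -d ata /dev/sdc           # watch Reallocated_Sector_Ct and Current_Pending_Sector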

      I’m not sure what to make of a disk that reports unreadable sectors to smartd but an overall “PASSED” to a userspace tool.

      One thing’s for certain – it’s being replaced!