Beep! Beep! Beep! Beep! Beep! Beep!

Facts:
A fileserver based on a Tyan Thunder K8SD Pro (S2882-D), equipped with a 3Ware 9500S-8 SATA RAID card and 6 Western Digital 250GB disks configured as RAID5, housed in 2 enclosures: a Proware MS-324A and a Proware MS-223A. The front door of the server case is locked with a key that the owner has misplaced, and no buttons are reachable outside the door (no power/reset/etc. buttons). The server is running Gentoo Linux.

The day starts:
08:55 – Mobile phone rings. I wake up but I don’t pick it up since I am unable to speak on the phone due to sleepiness. I am thinking that there is absolutely NO way something good is ever going to come out of a phone ringing that early.
09:05 – I wake up, check the phone and CallerID says that it is one of the customers that I do tech support for. I call them and ask what the problem is. Conversation follows:

Me: Hello, good day, what’s the problem that has you calling me so early ?
Customer: Oh sorry, did I wake you up ?
Me: It’s ok, I was just about to wake up (HUGE LIE)…
Customer: The fileserver keeps beeping today as it did last night.
Me: Beeping ? Why didn’t you tell me yesterday ?
Customer: I didn’t think it was of any importance, so last night I pulled the plug to make it stop beeping.
(That’s when my lower jaw reached my desk. Remember that the front door is locked, he has no access to buttons and of course he has no Linux knowledge in order to ssh and power it off. Why didn’t he call me last night to do it though ???)
Me: You did what ? You pulled the plug ? And today you put it back online ? And it beeps again ?
Customer: That’s right. Do you know what is the problem ?
Me: No, but let me login remotely to the machine and I’ll take a look. I’ll call you back soon to tell you what’s going on.

First checks:
I ssh to the machine and start checking /var/log/messages. After some searching I find this:
3w-9xxx: scsi0: AEN: WARNING (0x04:0x0042): Primary DCB read error occurred:port=2, error=0x208.
I google for it and at the same time log in to IRC to ask some friends if they know anything about that error. No one seems to have met it before. Some websites say this error is of no importance. Others say it is very important and that I should call the vendor. I go to 3ware’s site and start searching the knowledgebase. I find these pages:
a)http://www.3ware.com/KB/article.aspx?id=14335
b)http://www.3ware.com/KB/article.aspx?id=14687
c)http://www.3ware.com/KB/Article.aspx?id=12072
I also check the status of the array using tw_cli (3Ware Command Line Utility). It says that it is verifying the array, probably due to the plug pulling the customer did.

I call the customer and tell him that the array is being verified and that I will call him back as soon as it finishes.

11:30 – The verify process ends. All is fine with the array.
11:32 – I call the customer and ask him if the beeping has stopped. He tells me that the beeping keeps on.
11:34 – I reboot the server and check the messages again. I now get
c0 [Fri Aug 31 09:42:53 2007] WARNING (0x04:0x0042): Primary DCB read error occurred: port=3, error=0x208

But no verifying process starts. I manually start a verifying process while examining various commands that the tw_cli provides. I ask another friend on IRC and he suggests that some disks might be failing.

Time for some face to face contact:
11:50 – I call the customer again and tell him that I am taking a taxi to go there in order to take a “closer” look.
11:55 – I am waiting for a taxi.
12:10 – Still waiting
12:15 – A taxi comes, I argue with an old man who is trying to take my turn for the taxi. I tell him that I have to go to a hospital immediately so he steps back.
12:25 – I arrive at the customer. The beeping sound can be heard all over the place and even though the server is in a separate closed room one can hear it from 2 rooms away.

I take a monitor and a keyboard from another PC and plug them into the fileserver, reboot it and enter 3Ware’s BIOS. No alarms and no errors are shown. I reboot again and check the motherboard’s BIOS. PC Health Status looks fine (the room is air-conditioned with a stable temperature of 21 degrees Celsius). I boot into Linux again. No errors in /var/log/messages or through tw_cli, but the server keeps beeping. By then I am totally puzzled. I go to 3Ware’s site to create a customer account and open a trouble ticket. To fill in the needed details I use the output of tw_cli’s show diag command, the errors I posted above and various data from the machine. I know that I won’t have an answer for at least 4-5 hours due to the time difference with the US, so I start messing around with the controller through tw_cli, trying to find any clues.

13:30 – Since it’s Friday and the RAID5 array has no spare drive I decide to order one drive like the others from an online shop. Even if no drive has a problem at the moment, it won’t hurt to have a spare drive for the future.

I am also trying to help people continue to do their jobs without the company’s fileserver while messing around with the controller. I run smartctl for every disk to check their SMART attributes, using something like: smartctl -a -d 3ware,2 /dev/sda. No errors at all from any of the disks. Temperatures are normal. Then I run “-t short” SMART tests: no errors.
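The per-disk check can be sketched as a small loop. This is only a sketch under assumptions: that the six disks sit on controller ports 0 to 5 (only ports 2 and 3 show up in the logs above), that /dev/sda is the device node of the array as in the command above, and that smart_cmds is a made-up helper name. It only prints the commands, so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Print one smartctl command per controller port.
# Assumption: ports are numbered 0..(count-1); adjust to your own setup.
smart_cmds() {
    count=$1    # number of disks behind the 3ware controller
    dev=$2      # device node of the array, e.g. /dev/sda
    port=0
    while [ "$port" -lt "$count" ]; do
        echo "smartctl -a -d 3ware,$port $dev"
        port=$((port + 1))
    done
}

# Review the list first, then run it as root with: smart_cmds 6 /dev/sda | sh
smart_cmds 6 /dev/sda
```

Replacing -a with "-t short" in the echo line would queue the short self-tests in the same way.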

A strange idea:
14:30 – People have started leaving the company for noon break. I stay.
14:40 – A strange idea comes to mind. What if I remove the 3ware card ? Will the beeping stop ?
14:45 – I start to unscrew the box to pull the 3Ware card out of it. No success. The beeping continues.
14:55 – I pull the power plugs off the first enclosure, the Proware MS-324A. No success. The beeping continues.
15:00 – I pull the power plugs off the second enclosure, the Proware MS-223A. THE BEEPING STOPS!
15:05 – I put back on the power plugs of the MS-324A. NO BEEPING.

So I have found out whose fault is the beeping, right? I try to take the MS-223A out of the server box. The process is rather tricky due to faulty screws or screws improperly screwed (don’t laugh!) by the company who assembled the server (not me! NOT ME!!). I finally manage to take the enclosure away from the box and blow the dust away from it. While doing that I notice that one fan is not acting like the other 2 while I blow air at it. It doesn’t “turn” as fast as the others do. I put some plugs to the enclosure and I start the machine again. The beeping starts but what is clear is that one fan has a spinning problem, I guess it’s due to dust. I try to find the manual of MS-223A on the web. That’s where I notice this:
When a fan's rotation speed is lower than 1000rpm the buzzer will sound.

Trying to fix the problem:
I am now certain of who’s to blame. I try to unplug the fan from the enclosure and put back the enclosure to the server box. It keeps beeping.

16:00 – I start searching for a spare 60mm fan with a 3pin molex. Of course I can’t find any at the customer’s place. I go out and search the neighborhood for a computer store. I am lucky (you can laugh here) and I see a guy just enter his computer store, I go inside and ask him if he has any of the fans that I want. He doesn’t.
16:30 – I am back at the customer’s place. I order 3 60mm fans with a 3 pin molex from the net. Having some spare fans in the future sounds very very good to me.
17:30 – The customer and his employees come back to the company and I explain to him what has happened. I am shocked to learn from other employees that they had often heard it beeping in the past but nobody cared to tell me.
18:00 – I read my emails and 3ware’s support has replied to my case. They propose to download some other diagnostics and do some tests.

I was too tired to test the controller with the new diagnostics. Since it’s Friday and the company closes for the weekend I will run the tests once I have replaced the 60mm fan. Until then (which could easily be tomorrow if the fans arrive), I’ve shut the server down, just to be on the safe side in case there is something wrong with the controller or any of the disks.

Conclusion:
I am almost sure that if it hadn’t been for the beeping sound I wouldn’t even have noticed 3ware’s “errors” which were probably caused by the pulling of the main plug of the PSU. It might sound a bit strange, but I don’t actually worry about the diagnostics test that 3ware’s customer support proposed. I am very impressed by 3ware’s customer support and responsiveness. I don’t know how all this will end yet, but I think it will all be fine by the time I replace the fan.

DAMN FAN! YOU RUINED MY DAY.

I still hear this “Beep! Beep! Beep! Beep! Beep! Beep!” sound inside my ears.

simple shell script to download the frontpage of major greek newspapers

www.in.gr has a very useful feature on their site: the front pages of all major Greek newspapers are scanned and posted in a place called kiosk.

Even though this is very nice, it doesn’t fit my viewing needs. I want all newspapers on my local drive every morning so I can view them with my favorite image viewer. In order to do so I created a small shell script.

The script:

#!/bin/bash
# Simple shell script to download the front pages of major Greek newspapers from www.in.gr/kiosk/
# It uses arrays, so it needs bash rather than a plain POSIX sh.
# Feel free to modify it as you wish :)
YEAR=$(date +%Y)
MONTH=$(date +%m)
DAY=$(date +%d)
URL="http://assets.in.gr/dGenesis/assets/Content60/Issue/${YEAR}/${MONTH}/${DAY}"
mkdir -p ~/in.gr/"${YEAR}/${MONTH}/${DAY}"/
cd ~/in.gr/"${YEAR}/${MONTH}/${DAY}"/ || exit 1
i=0
j=0
k=0
# IDs below 80 to skip; keep them in ascending order
exclude=(30 31 32 33 34 58 60 61 69 78 79)
# IDs of 80 and above to fetch as well
include=()
# Main loop: try every ID from 0 to 79 except the excluded ones
while [ "$i" -lt 80 ]; do
    if [ "$i" = "${exclude[$j]}" ]; then
        echo "excluding $i"
        j=$((j+1))
    else
        wget -q -nc -c "${URL}/${i}_h.jpg"
    fi
    i=$((i+1))
done
# Second loop: fetch the extra IDs listed in the include array
include_len=${#include[@]}
while [ "$k" -lt "$include_len" ]; do
    wget -q -nc "${URL}/${include[$k]}_h.jpg"
    k=$((k+1))
done

Just add the script to your user’s crontab and you are ready. Since not all newspapers come out at the same time in the morning, you can set the script to run from your crontab every hour from 7 o’clock until 12 o’clock.
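As a sketch of what such a crontab entry could look like: the helper below only prints the line, so nothing is installed by running it. The path ~/bin/kiosk.sh is hypothetical, use wherever you saved the script.

```shell
#!/bin/sh
# Build a crontab entry that runs a script at minute 0 of every hour from 07:00 to 12:00.
kiosk_cron_line() {
    echo "0 7-12 * * * $1"
}

# Assumption: the script was saved as ~/bin/kiosk.sh (hypothetical path).
kiosk_cron_line "$HOME/bin/kiosk.sh"
```

Paste the printed line into your crontab with crontab -e. Because the script calls wget with -nc, front pages that were already fetched in an earlier run are not downloaded again.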

Some details:
The kiosk has an interesting and weird “feature”. To find a newspaper’s ID-url you can go to www.in.gr/kiosk/ and click on the newspaper you want. A window with a thumbnail will appear; click on the thumbnail and a new pop-up window with a bigger image will come forward. Now right click on the image and select copy image location. It should be something like: http://assets.in.gr/dGenesis/assets/Content60/Issue/2007/08/29/3_h.jpg. Even though most newspapers feature sequential numbering up to number 34, some come with a higher number like 53, 60, 61, 78, 79. So while one might think that it’s safe to iterate up to 80 to catch them all, that’s not the case. Some sports and all local newspapers have ID numbers like 69389! In order to cope with these, for anyone who might want them, I added another loop in the script that uses an “include” array. Put any numbers of 80 and above inside the include array (separated by whitespace) and the script will download them. Since I don’t like reading sports and gossip newspapers I have added an exclude array to the main loop in order to avoid downloading them. If you want to download all newspapers simply remove the numbers I have in my exclude list.
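To check a single ID before adding it to the include array, the URL pattern described above can be rebuilt by hand. A minimal sketch, where kiosk_url is a made-up helper name and the pattern is simply the one from the example URL:

```shell
#!/bin/sh
# Build the image URL for one newspaper ID on a given date.
# $1 = newspaper ID, $2 = year, $3 = month, $4 = day
kiosk_url() {
    echo "http://assets.in.gr/dGenesis/assets/Content60/Issue/$2/$3/$4/${1}_h.jpg"
}

# Rebuilds the example URL from the text above
kiosk_url 3 2007 08 29
```

Feeding the result to wget -q (or opening it in a browser) shows whether an ID exists for that day.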

I don’t understand what the purpose is of having both small sequential numbers and bigger “random” ones as IDs. Do you ?

iloog-7.06 on Sony Vaio PCG-SR21K

A new old Laptop to test iloog:

A few days ago my friend Dimitris gave me his old Sony Vaio PCG-SR21K since he didn’t need it any more.

Specs:

Mobile Pentium III/650 with SpeedStep technology and 256KB of on-die Level 2 cache, 64MB of PC100 SDRAM, Intel 440Z motherboard chipset, fixed 10GB IBM Travelstar DJSA-210 hard disk, external 16x CD-ROM, 8MB S3 Savage/IX graphics, 10.4in XGA TFT screen, Yamaha DS-XG audio, integrated stereo speakers, integrated V.90 modem, one Type II PC Card slot, expansion port, plus ports for USB, IEEE-1394 and Sony Memory Stick, Windows 2000 Professional (nooooooooooooot!), Sony video-editing suite (crap!). Dimensions: 259 x 209 x 32mm (W x D x H). Weight: 1.4kg.

It is a perfect laptop for iloog testing.

Boot Process:

When I tried to boot iloog on it from its external pcmcia cdrom I faced a problem though: the iloog kernel does not support (yet ?) cdrom drives on pcmcia (I had never thought of booting from such devices when creating the iloog kernel), so iloog’s initrd couldn’t operate as it should. The iloog kernel started and the initrd scripts ran, but they couldn’t find a bootable device since no cdrom device was found; only the hard disk had been recognized at that point. This laptop is pretty old, so there wasn’t any option to boot from usb either. The good news was that the laptop already had an old slackware (version 10 or 11, I can’t really remember) running on it, with two ext3 partitions (hda2 and hda3) and another one for swap (hda1). Dimitris and I had installed that slackware version following the advice posted on this forum http://www.debianforum.de/forum/viewtopic.php?p=9781. One has to add this:

linux ide2=0x180,0x386

to the boot prompt.

Since there was already another linux OS installed on the laptop I didn’t need to put that command on iloog’s boot prompt and decided to take another path. I started slackware normally, put the iloog-7.06 CD in the pcmcia cdrom drive, mounted it under /mnt/cdrom and then copied the contents of /mnt/cdrom to /mnt/hda3 (where /dev/hda3 was already mounted). I rebooted the laptop and made it boot again from the pcmcia cdrom drive. The iloog kernel and initrd scripts started and could now find the files they needed on /dev/hda3, so the boot process continued just fine. Since nothing but the kernel and initrd scripts was read from the external cdrom, the boot process was a lot faster than it would have been running entirely from the old external 16x cdrom drive. This process of storing iloog’s files on a hard disk partition should provide enough info for those who want to run iloog as a livecd from their hard disk for testing.
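The copy step can be sketched like this. It is a minimal sketch, assuming the CD is already mounted on /mnt/cdrom and the target partition on /mnt/hda3 as described above; copy_livecd is a made-up helper name.

```shell
#!/bin/sh
# Copy everything from a mounted livecd to a mounted hard disk partition.
copy_livecd() {
    src=$1    # where the CD is mounted, e.g. /mnt/cdrom
    dest=$2   # where the partition is mounted, e.g. /mnt/hda3
    [ -d "$src" ] && [ -d "$dest" ] || { echo "usage: copy_livecd SRC DEST" >&2; return 1; }
    # "src/." copies the contents (hidden files included) rather than the directory itself
    cp -a "$src"/. "$dest"/
}

# Run as root once both are mounted:
# copy_livecd /mnt/cdrom /mnt/hda3
```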

Local Install:

The next thing I wanted to do was to install iloog locally on the hard disk rather than just having it boot as a livecd from the disk. The process was exactly the same as I have described in a previous post about installing an older iloog version on another old laptop. The only difference was that I installed grub instead of lilo, but that’s more of a preference than a necessity.

Incompatibilities:

Even though the laptop has an S3 savage graphics chipset, the only working xorg driver for me is vesa, but it does work just fine. If you don’t know how to change your xorg driver to vesa just use the iloog-vesachange.sh script (found under /usr/local/bin).

Install New Applications:

Since the laptop has only 64MB of ram, using Firefox is extremely difficult. Fuzz proposed installing Opera. In order to install anything from portage on iloog one must run the iloog-db.sh script first. This script fetches the database of all installed packages on iloog (/var/db/). Normally these are not on the livecd because a) they take too much space b) who would install new apps on a livecd, and why ? c) they take way too much space :). When the iloog-db.sh script finishes, one has to run an:
emerge --sync
in order to fetch the latest portage.

Then an emerge -avt opera to install the latest opera 🙂

So, after running iloog-db.sh and emerge --sync, you can install any applications you want on iloog.

Kernel:
I wanted to build a newer stable kernel (emerge -avt sys-kernel/gentoo-sources, gentoo-sources-2.6.21-r4 at the current time) to test the performance and remove unused stuff from the kernel. This newer kernel has also the sony_laptop module included. This is the config I used:

kernel 2.6.21 config for PCG-SR21K

I’ve built sonypi and sony_laptop as external modules, and since udev does not load them automagically I needed to edit /etc/modules.autoload.d/kernel-2.6 and add the modules to that file.
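Adding the modules can be done with any editor; as a sketch, the helper below appends a module name only if it is not already listed, so running it twice does no harm. add_autoload_module is a made-up name.

```shell
#!/bin/sh
# Append a module name to an autoload file unless it is already listed there.
add_autoload_module() {
    file=$1
    module=$2
    # -x matches the whole line, -F takes the name literally, -q keeps grep quiet
    grep -qxF "$module" "$file" 2>/dev/null || echo "$module" >> "$file"
}

# As root:
# add_autoload_module /etc/modules.autoload.d/kernel-2.6 sonypi
# add_autoload_module /etc/modules.autoload.d/kernel-2.6 sony_laptop
```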

Extra controls
To control the brightness of the screen, check/set the fan speed, check the temperature and see the remaining battery I ran an emerge app-laptop/spicctrl. spicctrl uses the sonypi module, so make sure you have already loaded it. This module also makes the jog dial act as extra mouse buttons. iloog’s xorg file already has support for 5 button mice, so I was good to go. I was now able to scroll up and down using the jog dial. When the jog dial is pressed it acts as the middle mouse button, so I can use it as that too.

In order to make use of the extra function keys I needed to emerge app-misc/sonypid. With the help of a brilliant perl script called sonypidd I could assign various functions/scripts to the function keys.
I’ve made some modifications to the original one though, the original script uses aumix to change the sound settings but I like amixer more, so I used that one. I’ve even changed the program it used for playing a click.wav upon pressing the function keys (it used sox, I used play), and the click.wav itself (I’ve used the click.wav that comes with the game gweled).

Here’s my sonypidd: sonypidd for iloog
and click.wav: sonypidd for iloog
To make it all work, extract sonypidd.gz to /usr/local/bin/sonypidd and click.tar.gz to /usr/share/sounds/click.wav. Then, edit /etc/conf.d/local.start and add:
/usr/local/bin/sonypidd >/dev/null 2>&1 &
to the end of the file.

I also wanted to make the jog dial button bring up the fluxbox menu, so I’ve edited the ~/.fluxbox/keys file and changed the line that referred to Mouse2 from:
OnDesktop Mouse2 :workspaceMenu
to
None Mouse2 :rootMenu

Now I can open a fluxbox menu even when I have other applications on top of the desktop by pressing the jog dial button 🙂

I’d be glad if sjog or rsjog worked so I could do more with the jog dial, but I haven’t managed to get them running yet. Their development has stopped and their code no longer builds with the current libraries. Any good programmers out there to revive these programs ? 🙂

Speeding up compilations:
To speed up package compilation I’ve set up distcc on the laptop and on my desktop pc. The problem I faced though was that iloog is compiled for the i586 arch and my desktop for i686, so I had to use distcc with crossdev. Even though crossdev appeared to save the day, it had a minor problem: it always installs the latest testing packages for crosscompiling, and since iloog uses stable packages there were some conflicts while installing packages. These were solved by a brutal hack: editing the /usr/sbin/crossdev script (line 472) and commenting out the “if [[ -f package.keywords ]]” block. Then I could put the versions I wanted inside /etc/portage/package.keywords like this:

=cross-i586-pc-linux-gnu/gcc-4.1.2 x86 ~x86
=cross-i586-pc-linux-gnu/glibc-2.5-r4 x86 ~x86

TODO:
Since distcc with crossdev is now installed I will try to install xorg-server with the kdrive use flag. I think that Xvesa will be a loooot lighter than the full X server.

References:
Pages I’ve read during the process:

http://www.linux.it/~malattia/wiki/index.php/Main_Page
http://tjworld.net/snc/
http://freenet-homepage.de/obauer/gentoo.html
http://linuxbrit.co.uk/rsjog/
http://sjog.sourceforge.net/
http://www.popies.net/sonypi/
http://www.comp.lancs.ac.uk/~fittond/gentoo-install.txt
http://www.boulder.swri.edu/~deforest/sonypidd
http://www.siglost.org/vgna197vp/sonypidd
http://www.gentoo.org/doc/en/distcc.xml
http://www.gentoo.org/doc/en/cross-compiling-distcc.xml

Oneliner: text to image using imagemagick

$ convert -size 200x30 xc:transparent -font /usr/share/fonts/dejavu/DejaVuSansMono.ttf -fill black -pointsize 12 -draw "text 5,15 'this is just a test'" test.png

The result:

With this oneliner it’s very easy to create images of e-mail addresses for anti-spam purposes (pretty old-fashioned though).

Sample of Exam Papers from a Greek University

A delightful sample of the high level…

Department of Materials Science and Technology

(Whatever is inside a frame is what the students wrote, the rest are the professor’s comments)

Read more at: Καφές και Τσιγάρο – ΖΗΤΩ ΤΟ ΕΛΛΗΝΙΚΟ ΠΑΝΕΠΙΣΤΗΜΙΟ (“Long live the Greek university”)

more netroute2 hacks – new traffic shaper

In my previous post, more netroute2 hacks – high availability, one of the changed files was the dial_conn file. At the end of the diff there was a line with a # in front:
+ sleep 5
+ #/etc/bin/wshaper ppp0 192 1024

Inside netroute2 one can find the /etc-ro/ppp/wshaper file which is the traffic shaping script of the modem/router. Unfortunately it resides in the read-only section of the router so you can’t make changes directly to it. What I did was to make a copy of it on the writable /etc/bin/ and change a line in my /etc/bin/dial_conn to call it from there, right after (5 seconds later) the connection with the ISP has been established.

If you have followed the previous post about high availability, the only thing you need to do is edit your /etc/bin/dial_conn file and remove the # from the line above. Else…read the previous post 🙂

The first argument of the script is the device the rules will apply to, the second argument is the upload speed and the third is the download speed. Netroute2’s own traffic shaping script gets the 3 arguments while syncing with the dslam. The problem with adsl lines here in Greece, and I guess in many other countries as well, is that the speed at which the modem syncs with the dslam has nothing to do with the real speed you actually get. So shaping for 256kbit upload while never reaching more than 200 is a bit foolish imho. What I did was lower the upload value so that I am always (or almost always) sure that I can really reach it. I can now create rules based on the assumption that my upload speed is 192kbit. If the upload speed your modem syncs at is 192kbit I would advise you not to put more than 128kbit as the second argument. It’s a trial and error situation.
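As a starting point for the trial and error, the shaped value can be derived from the sync speed. This is only a sketch: shaped_rate is a made-up helper, and the 75% figure is an assumption picked because it reproduces my own 256 to 192 choice; the 192 to 128 advice above is closer to 67%, so tune it for your line.

```shell
#!/bin/sh
# Integer percentage of a link speed.
# $1 = synced upload speed in kbit, $2 = percentage of it to shape for
shaped_rate() {
    echo $(( $1 * $2 / 100 ))
}

shaped_rate 256 75    # prints 192

# The result would then go in as the second argument, e.g.:
# /etc/bin/wshaper ppp0 "$(shaped_rate 256 75)" 1024
```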

Lowering my shaped upload speed while keeping the rest of the script intact had already made a difference, but I knew that I could do some more tweaking.
The first thing one has to learn before creating any traffic shaping script is what the TOS field is:

# TOS field
# 0x10 (minimize delay)
# 0x08 (maximize throughput)
# 0x04 (maximize reliability)
# 0x02 (minimize cost)
# 0x00 (best effort)

You can then create rules with iptables to change the TOS field of certain packets, for example:
$IPTABLES -t mangle -A POSTROUTING -o $DEV -p tcp --syn -m length --length 40:68 -j TOS --set-tos 0x10
$IPTABLES -t mangle -A POSTROUTING -o $DEV -p tcp --tcp-flags ALL ACK,FIN -j TOS --set-tos 0x10

A great rule to add to any of your scripts is one that speeds up ACK packets by adding them to the highest priority class (on netroute2 that’s 1:10):
$TC filter add dev $DEV parent 1: protocol ip prio 1 u32 \
match ip protocol 6 0xff \
match u8 0x05 0x0f at 0 \
match u16 0x0000 0xffc0 at 2 \
match u8 0x10 0xff at 33 \
flowid 1:10

What is also very very helpful is to specify the port your torrent client uses (eg 17777) and add it to the lowest priority class (on netroute2 that’s 1:30):
$TC filter add dev $DEV parent 1:0 protocol ip prio 3 u32 match ip sport 17777 0xffff flowid 1:30
$TC filter add dev $DEV parent 1:0 protocol ip prio 3 u32 match ip dport 17777 0xffff flowid 1:30

Of course you can create your own classes inside /etc/bin/wshaper. If you are careful enough with the rules you add you will be more than happy with the result 🙂

To monitor how your traffic shaping is going you can download a great perl script called qos.pl from here: http://qos.kallenberg.dk/. This script reads a machine’s qos classes and priorities and creates graphs like the ones on the site. The problem with netroute2 is that it doesn’t have perl included, so one has to modify qos.pl to make it read netroute2’s qos performance while running from another machine. This is done by making the script run its commands on netroute2 through ssh, using public key auth. If you don’t know how to enable this on netroute2 please read part F of my older post: Intracom netroute2 hacks.

What you need to change on the qos.pl script is:
a) change the $tc line with something like this:
$tc = "ssh root\@NETROUTE2.IP.GOES.HERE /usr/sbin/tc";
b) Find any occurrences of “eth2” and replace them with “ppp0” (there should be only 2).
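Both changes can also be applied with sed. This is a rough sketch under a loud assumption: that the line to change in your copy of qos.pl starts with `$tc =` (check this first, I am not quoting the original script here). patch_qos is a made-up helper name.

```shell
#!/bin/sh
# Point qos.pl at a remote tc over ssh and switch the interface to ppp0.
# Assumption: the tc line in qos.pl starts with "$tc =".
patch_qos() {
    file=$1
    host=$2    # IP of the netroute2
    tmp="$file.tmp"
    sed -e 's|^\$tc *=.*|$tc = "ssh root\\@'"$host"' /usr/sbin/tc";|' \
        -e 's|eth2|ppp0|g' "$file" > "$tmp" &&
        cat "$tmp" > "$file" &&    # write back in place so file permissions are kept
        rm -f "$tmp"
}

# patch_qos ./qos.pl NETROUTE2.IP.GOES.HERE
```

Keep a backup of the original qos.pl and diff the result afterwards to confirm that exactly the tc line and the two interface names changed.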

Now run the qos.pl script and it will start creating some graphs (png files) and an index.html in the directory from which you executed it. qos.pl depends on gnuplot, so you must install that before you run it.

The graphs are a great visual aid for tweaking your new traffic shaping script more and more.

more netroute2 hacks – high availability

The following post is going to be one in a series of 2-3 posts regarding netroute2 (the link is in Greek) and some of my hacks/modifications on it. All hacks refer to netroute2 firmware 577 that I have previously posted on my blog. For those who haven’t noticed yet, firmware 577 is unlocked, so you can now connect to any ISP you like.

Netroute2 has a strange bug and sometimes (not always) cannot reconnect to the ISP when the connection for some strange reason goes down. To cope with that, the netroute2 developers at Intracom have created a script named high_avail that runs every 5 minutes from crontab. For some even stranger reason this script did not work for me as it should, so I patched it to make it _always_ work.

The problem I faced at the very beginning was that the “high_avail” script resided in the read-only section of netroute2’s flash (/usr/bin/high_avail). My solution to that problem was to create a directory named /etc/bin/ and store all my new scripts and changes there, since the /etc dir is writable.

My changes to the high_avail script are these:
--- usr/bin/high_avail 2007-07-03 20:59:21.000000000 +0300
+++ etc/bin/high_avail 2007-07-04 03:31:54.000000000 +0300
@@ -15,25 +15,32 @@
if [ -s /var/run/dial ]; then
act_conn=`$CAT /var/run/dial`
fi
-adsl_iface=$ADSL_BASE
+if [ -z "$act_conn" ]; then
+ act_conn="/etc/wan/current/CHANGEME"
+fi
+
+adsl_iface=$ADSL_BASE
+echo "$act_conn"
reload_module() {
/bin/hangup
+ killall -9 pppd
+ ifconfig eth2 down
/sbin/rmmod $loaded_mod
if [ $? -eq 0 ]; then
$ECHO "done"
else
$ECHO "failed"
$ECHO "ERROR: high_avail: Failed to unload $loaded_mod"
- exit 13
+# exit 13
fi
- [ -n "$act_conn" ] && /usr/bin/dial $act_conn
+ [ -n "$act_conn" ] && /etc/bin/dial $act_conn
}
HIGH_AVAIL_IP=`$GREP HIGH_AVAIL_IP /etc/net.conf | $CUT -d'=' -f2`
-
#Check Current Modem status
if [ -z "$loaded_mod" ]; then
$ECHO "high_avail: No Module found loaded."
+ reload_module
exit 1
elif [ "`/usr/bin/modem_wrap halt`" = "yes" ]; then
$ECHO "high_avail: Module $loaded_mod found in HALTED state"
@@ -41,6 +48,7 @@
exit 11
elif [ -z "$act_conn" ]; then
$ECHO "high_avail: No WAN Connection dialed ..."
+ reload_module
exit 2
elif [ -z "$HIGH_AVAIL_IP" ]; then
$ECHO "high_avail: No Ping Target IP Found ..."

and the whole new script resides here: /etc/bin/high_avail. (You need to gunzip it).

What you need to change for your connection is the part that says CHANGEME. You can replace that with what you can find inside the /etc/wan/current/ directory.
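If you would rather not edit the file by hand, the replacement can be sketched as below. Assumptions: your sed behaves like a standard one (I have not checked what the netroute2 itself ships, so you may want to do this on another machine before copying the file over), and "myisp" stands for whatever file name you actually see inside /etc/wan/current/. set_default_conn is a made-up helper name.

```shell
#!/bin/sh
# Replace the CHANGEME placeholder with a real connection name.
set_default_conn() {
    script=$1   # e.g. /etc/bin/high_avail
    name=$2     # a file name found inside /etc/wan/current/
    tmp="$script.tmp"
    sed "s|/etc/wan/current/CHANGEME|/etc/wan/current/$name|" "$script" > "$tmp" &&
        cat "$tmp" > "$script" &&    # write back in place so the exec bit is kept
        rm -f "$tmp"
}

# ls /etc/wan/current/
# set_default_conn /etc/bin/high_avail myisp    # "myisp" is a hypothetical name
```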

I noticed that when the module for the modem was loaded then the modem was unable to reconnect to the ISP, but upon unloading and reloading of the module, and then trying to connect again, all came back to normal. So what I changed in the high_avail script was making sure the module gets unloaded properly and reloaded when there’s no connection active.

One might notice that inside high_avail I’ve also changed a path from /usr/bin/dial to /etc/bin/dial.
This script is used to call another script that actually makes the call to the isp.

--- usr/bin/dial 2007-07-03 21:00:13.000000000 +0300
+++ etc/bin/dial 2007-07-04 03:39:45.000000000 +0300
@@ -1,25 +1,16 @@
#!/bin/sh
ECHO=/bin/echo
-
conn="$1"
#ATM encapsulation mode for modem
encmode=0
-
$ECHO "Dialing $conn ...."
-
if [ $# -lt 1 -o ! -e $conn ]; then
-
$ECHO "Usage: dial <connection> [ppp_option]"
$ECHO "connection: connection name"
$ECHO "[ppp_option]: optional argument passed to PPPD"
-
exit 1
-
else
-
#Bring down previous processes
/bin/hangup
-
- /usr/bin/dial_conn $conn primary_conn $2
-
+ /etc/bin/dial_conn $conn primary_conn $2
fi

These are my changes to the /usr/bin/dial script, which is now placed under /etc/bin/dial.
The whole script resides here: /etc/bin/dial. (You need to gunzip it).

As said before, this script in turn calls another one, dial_conn which is used to actually make the call. My changes to /usr/bin/dial_conn which now becomes /etc/bin/dial_conn:


--- usr/bin/dial_conn 2007-07-03 21:00:13.000000000 +0300
+++ etc/bin/dial_conn 2007-07-04 03:43:11.000000000 +0300
@@ -154,14 +154,13 @@
exit 1
fi
done
-
fi
-
if [ "$2" = "primary_conn" ]; then
#Start the high-availability service
- $ECHO "*/5 * * * * root $PIDOF high_avail > /dev/null 2>&1 || /usr/bin/high_avail > /var/run/high_avail 2>&1" > /etc/cron.d/cron_high_avail
+ $ECHO "*/5 * * * * root $PIDOF high_avail > /dev/null 2>&1 || /etc/bin/high_avail > /var/run/high_avail 2>&1" > /etc/cron.d/cron_high_avail
$CHMOD 755 /etc/cron.d/cron_high_avail
fi
-
+ sleep 5
+ #/etc/bin/wshaper ppp0 192 1024
exit 0
fi

The whole file resides here: /etc/bin/dial_conn. (You need to gunzip it).

What I’ve changed here is the line that gets stored on crontab and calls the high_avail script every 5 minutes to check whether our connection is active or not. The rest of the changes will be the subject of the next post about netroute2 on this blog.

What is left now is to make netroute2 call these new scripts from /etc/bin/ on boot instead of the ones from /usr/bin.

a) Copy /bin/dial_current to /etc/bin/dial_current, edit it with vi, go to line 5 and replace /usr/bin/dial with /etc/bin/dial.
b) Edit /etc/init.d/rc-run, go to line 243 and replace all occurrences of /bin/dial_current with /etc/bin/dial_current. There should be 2.
c) Edit /etc/rc.d/rc.dialcurrent with vi, go to line 8 and replace /usr/bin/dial with /etc/bin/dial.

So, if you have done it right, you should now have 4 scripts inside your netroute2’s /etc/bin:
a) /etc/bin/high_avail
b) /etc/bin/dial
c) /etc/bin/dial_conn
d) /etc/bin/dial_current
and you should have also changed 2 scripts, /etc/init.d/rc-run and /etc/rc.d/rc.dialcurrent
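A quick sanity check before saving can be sketched like this. check_scripts is a made-up helper; it only tests that the four files exist and are executable, not that their contents are right.

```shell
#!/bin/sh
# Report any of the four replacement scripts that is missing or not executable.
check_scripts() {
    dir=$1
    missing=0
    for name in high_avail dial dial_conn dial_current; do
        if [ ! -x "$dir/$name" ]; then
            echo "missing or not executable: $dir/$name"
            missing=1
        fi
    done
    return "$missing"
}

# check_scripts /etc/bin && echo "all four scripts are in place"
```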

That’s all. Now save your changes with /etc/init.d/checkpoint and upon reboot your modem will have a nice new high_avail script that will (hopefully) always work.

iputils manpages and docbook-sgml-utils dependency on gentoo

Can you man ping on gentoo ? If not it’s probably because of this: https://bugs.gentoo.org/show_bug.cgi?id=158660

I wanted to install iputils man pages:
# echo "net-misc/iputils doc" >> /etc/portage/package.use
# emerge -avt iputils
[ebuild R ] net-misc/iputils-20060512 USE="doc* -ipv6 -static" 0 kB
[ebuild N ] app-text/docbook-sgml-utils-0.6.14 USE="tetex" 123 kB
[ebuild N ] dev-perl/SGMLSpm-1.03-r5 92 kB

The SGMLSpm installation went fine but docbook-sgml-utils-0.6.14 did not complete. Here are some of the last error messages:
jade:/etc/sgml/sgml-docbook-4.3.cat:1:8:E: cannot open "/usr/share/sgml/docbook/sgml-dtd-4.3/catalog" (No such file or directory)
jade:/etc/sgml/sgml-docbook-4.3.cat:1:8:E: cannot open "/usr/share/sgml/docbook/sgml-dtd-4.3/catalog" (No such file or directory)
jade:/etc/sgml/sgml-docbook-4.3.cat:1:8:E: cannot open "/usr/share/sgml/docbook/sgml-dtd-4.3/catalog" (No such file or directory)
make[2]: *** [api.html] Error 1
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [backend-spec.html] Error 1
make[2]: Leaving directory `/var/tmp/portage/app-text/docbook-sgml-utils-0.6.14/work/docbook-utils-0.6.14/doc/HTML'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/var/tmp/portage/app-text/docbook-sgml-utils-0.6.14/work/docbook-utils-0.6.14/doc'
make: *** [all-recursive] Error 1

I issued an eix sgml:

...snip...
[I] app-text/docbook-sgml-dtd
Available versions:
(3.0) 3.0-r3
(3.1) 3.1-r3
(4.0) 4.0-r3
(4.1) 4.1-r3
(4.2) 4.2-r2
(4.3) 4.3-r2
(4.4) 4.4
Installed versions: 3.0-r3(3.0)(17:06:32 09/19/06) 3.1-r3(3.1)(17:06:38 09/19/06) 4.0-r3(4.0)(17:06:16 09/19/06) 4.1-r3(4.1)(17:06:10 09/19/06) 4.4(4.4)(23:51:33 09/18/06)
Homepage: http://www.docbook.org/sgml/
Description: Docbook SGML DTD 4.4
...snip...
...snip...
* app-text/docbook-sgml-utils
Available versions: 0.6.14
Homepage: http://sources.redhat.com/docbook-tools/
Description: Shell scripts to manage DocBook documents
...snip...

Version 4.3 of docbook-sgml-dtd was not installed (only 3.0, 3.1, 4.0, 4.1 and 4.4 were), so what I did was:
# emerge -avt =app-text/docbook-sgml-dtd-4.3-r2

and then:
# emerge -avt iputils
[ebuild R ] net-misc/iputils-20060512 USE="doc* -ipv6 -static" 0 kB
[ebuild N ] app-text/docbook-sgml-utils-0.6.14 USE="tetex" 0 kB

I can now man ping 🙂
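
The diagnosis above boils down to reading the DTD version out of the path that jade could not open. Here is a minimal sketch of that step in Python, assuming the build log contains lines like the ones quoted earlier. The function name missing_dtd_versions is made up for this example, and the emerge hint it prints is only a suggestion, not something portage produces:

```python
import re

# Matches the catalog path jade failed to open, for example
# /usr/share/sgml/docbook/sgml-dtd-4.3/catalog, and captures the "4.3" part.
MISSING_DTD = re.compile(
    r'cannot open "/usr/share/sgml/docbook/sgml-dtd-([0-9.]+)/catalog"'
)


def missing_dtd_versions(build_log: str) -> list:
    """Return the distinct DTD versions jade could not open, in first-seen order."""
    versions = []
    for match in MISSING_DTD.finditer(build_log):
        version = match.group(1)
        if version not in versions:
            versions.append(version)
    return versions


if __name__ == "__main__":
    # Hypothetical sample: one of the error lines from the failed build above.
    sample = (
        'jade:/etc/sgml/sgml-docbook-4.3.cat:1:8:E: cannot open '
        '"/usr/share/sgml/docbook/sgml-dtd-4.3/catalog" (No such file or directory)\n'
    )
    for version in missing_dtd_versions(sample):
        print(f"missing DTD {version}: check whether app-text/docbook-sgml-dtd {version} is installed")
```

Feeding it the saved build output gives the list of DTD versions to compare against what eix reports as installed.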

useless tip of the day – clockdiff

How much time difference is there between your box and another host on the net?

~# clockdiff www.gentoo-wiki.com
...................................................
host=www.gentoo-wiki.com rtt=215(0)ms/206ms delta=508000ms/508000ms Thu Jun 28 16:25:31 2007
~# clockdiff www.ntua.gr
..................................................
host=achilles.noc.ntua.gr rtt=54(1)ms/49ms delta=-77ms/-76ms Thu Jun 28 16:25:47 2007

clockdiff is part of the iputils package (at least on Gentoo) and can only be executed as root.
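
If you want to log those numbers instead of eyeballing them, the result line is easy to pick apart. This is a small Python sketch that assumes the output format shown above; the function name parse_clockdiff and the dictionary keys are my own choices, and I do not try to interpret the figures after the slashes:

```python
import re

# Pattern for a clockdiff result line such as:
# host=achilles.noc.ntua.gr rtt=54(1)ms/49ms delta=-77ms/-76ms Thu Jun 28 16:25:47 2007
CLOCKDIFF_LINE = re.compile(
    r"host=(?P<host>\S+)\s+"
    r"rtt=(?P<rtt>\d+)\(\d+\)ms/\d+ms\s+"
    r"delta=(?P<delta>-?\d+)ms/(?P<delta_second>-?\d+)ms"
)


def parse_clockdiff(line: str):
    """Return host, first rtt and both delta figures (in ms), or None if the line does not match."""
    match = CLOCKDIFF_LINE.search(line)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "rtt_ms": int(match.group("rtt")),
        "delta_ms": int(match.group("delta")),
        "delta_second_ms": int(match.group("delta_second")),
    }


if __name__ == "__main__":
    line = ("host=www.gentoo-wiki.com rtt=215(0)ms/206ms "
            "delta=508000ms/508000ms Thu Jun 28 16:25:31 2007")
    result = parse_clockdiff(line)
    print(result["host"], result["delta_ms"] / 1000, "seconds")
```

The first sample above shows a difference of 508000 ms, which is roughly eight and a half minutes, while the second host is within a tenth of a second of my box.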

P.S. exams suck bigtime….

gnomad2 – usb_set_configuration: operation not permitted – fix

Gnomad2 is a GTK+ music manager and swiss army knife for the Creative Labs NOMAD and Zen range plus the Dell DJ devices using the Portable Digital Entertainment (PDE) protocol.

Creative does not support these devices under Microsoft Windows Vista; there are no Vista-compatible drivers for them “yet”. So the only hope for owners of Creative Zen devices who want to use what they have bought under a recent operating system is to use them with Linux (or to go back to Windows XP, if they are happy with something not so current). In practice, Vista users cannot upload mp3s to the Zen and can barely charge it: without working drivers the device charges very, very slowly.

Installation on Linux is distro dependent and should not be a problem. Just find the gnomad/gnomad2 package in the package manager of your distro and install it. On Gentoo/Sabayon an emerge -avt gnomad should be enough.

If you get the “usb_set_configuration: operation not permitted” error when gnomad starts, there is a simple fix. Put the following in a file named 99-gnomad.rules and save it under /etc/udev/rules.d/ (you MUST be root to do that).

SUBSYSTEM!="usb_device", ACTION!="add", GOTO="libnjb_rules_end"
# Creative Nomad Jukebox
SYSFS{idVendor}=="0471", SYSFS{idProduct}=="0222", GROUP="plugdev", MODE="0660"
# Creative Nomad Jukebox 2
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="4100", GROUP="plugdev", MODE="0660"
# Creative Nomad Jukebox 3
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="4101", GROUP="plugdev", MODE="0660"
# Creative Nomad Jukebox Zen
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="4108", GROUP="plugdev", MODE="0660"
# Creative Nomad Jukebox Zen NX
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="4109", GROUP="plugdev", MODE="0660"
# Creative Nomad Jukebox Zen USB 2.0
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="410b", GROUP="plugdev", MODE="0660"
# Creative Nomad Jukebox Zen Xtra
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="4110", GROUP="plugdev", MODE="0660"
# Dell Digital Jukebox
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="4111", GROUP="plugdev", MODE="0660"
# Creative Nomad Jukebox Zen Touch
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="411b", GROUP="plugdev", MODE="0660"
# Creative Zen (Zen Micro variant)
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="411d", GROUP="plugdev", MODE="0660"
# Creative Nomad Jukebox Zen Micro
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="411e", GROUP="plugdev", MODE="0660"
# Second Generation Dell Digital Jukebox
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="4126", GROUP="plugdev", MODE="0660"
# Dell Pocket DJ
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="4127", GROUP="plugdev", MODE="0660"
# Third Generation Dell Digital Jukebox
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="412f", GROUP="plugdev", MODE="0660"
# Creative Zen Sleek
SYSFS{idVendor}=="041e", SYSFS{idProduct}=="4136", GROUP="plugdev", MODE="0660"
LABEL="libnjb_rules_end"

Then restart udev (or reboot your computer if you don’t know how to restart udev), start gnomad2 and your Zen should work flawlessly under Linux.

If you still get errors, check that your user belongs to the plugdev group with the command groups username. If plugdev does not appear in the output, add the user to the group with gpasswd:
gpasswd -a username plugdev
replacing username with your username on the box.
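
The same membership check can be scripted. Below is a minimal Python sketch using the standard grp and pwd modules; the function name user_in_group is made up for this example, and it only reads the system's user and group databases, it does not change anything:

```python
import grp
import pwd


def user_in_group(username: str, groupname: str) -> bool:
    """Return True if the user is listed in the group or has it as primary group."""
    try:
        group = grp.getgrnam(groupname)
    except KeyError:
        # The group does not exist on this box at all.
        return False
    if username in group.gr_mem:
        return True
    try:
        # A user's primary group is not listed in gr_mem, so compare GIDs too.
        return pwd.getpwnam(username).pw_gid == group.gr_gid
    except KeyError:
        # Unknown user.
        return False


if __name__ == "__main__":
    import getpass

    user = getpass.getuser()
    if user_in_group(user, "plugdev"):
        print(f"{user} is in plugdev")
    else:
        print(f"{user} is NOT in plugdev, run: gpasswd -a {user} plugdev")
```

Keep in mind that a new group membership only applies to sessions started after the change, so log out and back in before trying gnomad2 again.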

References: http://bugs.gentoo.org/show_bug.cgi?id=137728

The ILUG event makes the front page

It is hard to describe how successful the ILUG event held on the 9th of June was. I will try to write a separate post about it in the next few days. For now, here is a front page from one of the 2 biggest local newspapers:

ILUG in Ipirotikos Agonas (Ηπειρωτικός Αγώνας)

And the article: “Linux is also… from Ioannina. A very good one-day event by the Ioannina Linux user group”.

The event is also mentioned in other local newspapers.

Best Spam Message on my blog

Just…enjoy 🙂

Best Spam message

Hands-on OLPC

Today I was woken up by a courier who brought me a very interesting box. The box had an OLPC and some CDs inside.
The package was kindly sent to me by Mr. Karounos so that I could present it at our local LUG event on the 9th of June in Ioannina. (Visit the website for more information).

My first thought when I took it out of the bag: ”it’s very small but it’s kinda pretty in a way too”. I put it on my desk and tried to open it. Well, it took me more than 1 minute to find out how to open it. I plugged it into the mains and pushed the power button. I was greeted by the Openboot bios and after 3 seconds it started to boot. It takes more than 1 minute from the time you press the power button until the sugar interface comes up.

When the interface had finished starting up I was a bit puzzled. I didn’t know what exactly to do. If you have used any kind of computer before, sugar will certainly stun you, for better or for worse. I started checking out the applications it comes with. The “Paint” application is really nice for kids and so is “BlockParty”, which is a tetris clone. Next was the “Camera”. I really liked the camera’s resolution, I didn’t expect it to be that good. “Calculator” is yet another calculator, with scientific functions as well if you enable them. “Write” is an abiword clone. It’s very easy to use and you can import pictures taken with the camera and put them inside the document you are writing. What I didn’t like though was that the default “save as” format was “Microsoft Word .doc”. Why? Anyway, continuing with the applications, next came the “News Reader”, which looks like a minimal version of liferea, but since I didn’t have any networking yet I could not test it any further. “Web” is a web browser that takes you directly to your local Library of e-books when you open it. Very, very useful. At that time I couldn’t test any web sites due to the lack of network connectivity. “Read” is a stripped down version of evince for reading various documents. Absolutely necessary for the kind of job this laptop must do. “TamTam” is a music creation tool for kids. And finally Etoys. Etoys is something that needs a lot of studying. It’s a creativity suite for kids. I won’t go into this any deeper for now.

Then it was time to connect the OLPC to my access point at home. I tried various things from the interface but nothing made “Web” connect to any sites. I couldn’t resist any more…I had to get access to the linux console somehow. I tried ctrl+alt+ various keys until one got me to the console. There are no F-keys on the OLPC keyboard so it wasn’t as straightforward as one might think. I was very lucky because no password is asked when you log in as root. You are immediately given a shell. I tried the usual iwlist, iwconfig, dhclient commands and …tada! they worked! ifconfig showed that I had been given an IP by the access point. Back to the sugar interface with ctrl+alt+ another key and “Web” was finally working. The browsing experience was quite good, I can say. Four buttons (up, down, left, right) next to the OLPC’s monitor make browsing a bit easier, because the keys on the OLPC keyboard are small. NO, they are not just smaller than on a normal keyboard…they are so small that only a five year old kid can press them with ease.

I have been playing with the OLPC for more than 7 hours today and did various interesting (at least for me) stuff on it. Even this post is written through the OLPC’s “Web” browser (and believe me it’s very, very difficult to type, but hey…this laptop is not for me, it’s for kids 🙂 ), but I feel that I need to spend a lot more time to fully understand the “sugar interface”. I will try to write more about the stuff I’ll be doing on the OLPC in the next few days.

Feelings? Mixed, both good and bad. There were some things I liked a lot, for example the monitor, and some things that I didn’t. For example, in sugar’s network manager there’s no “interaction” when you choose/click between mesh networking and a normal access point (more on how, where, etc. in another post), so there were times when I couldn’t really figure out what was going on and whether it was trying to connect to my access point or not.

That’s all for today. I hope I can write a bit more about the OLPC in the next few days, but I have to finish my presentation for ILUG’s event on time too.

I have some pictures from the OLPC on my flickr.

Thanks again to Mr. Karounos for being so helpful.

P.S. If you want to type Greek characters, edit /etc/X11/xorg.conf and add these 2 lines inside the Section “InputDevice” whose Identifier is “ATKbd”:

Option "XkbLayout" "us,el"
Option "XkbOptions" "grp:alt_shift_toggle"

then you can change to greek with alt+shift. e.g. “Ένα Λάπτοπ για Κάθε παιδί”

apt-get install sucks so much sometimes

I was given an old Debian machine to do some stuff on. I had some networking problems, so I thought I would install tcpdump to see what was happening. Take a _good_ look at the result…

# apt-get update
Get:1 http://security.debian.org stable/updates/main Packages [62.2kB]
Get:2 ftp://ftp.ntua.gr stable/main Packages [5638kB]
Get:3 http://security.debian.org stable/updates/main Release [97B]
Get:4 ftp://ftp.ntua.gr stable/main Release [95B]
Get:5 ftp://ftp.ntua.gr stable/main Sources [1653kB]
Get:6 ftp://ftp.ntua.gr stable/main Release [97B]
Fetched 7353kB in 38s (191kB/s)
Reading Package Lists... Done
# apt-get install tcpdump
Reading Package Lists... Done
Building Dependency Tree... Done
The following extra packages will be installed:
libc6 libc6-dev libpcap0.8 libssl0.9.8 locales tzdata
Suggested packages:
glibc-doc
The following packages will be REMOVED:
base-config initrd-tools kernel-image-2.4.27-2-386
The following NEW packages will be installed:
libpcap0.8 libssl0.9.8 tcpdump tzdata
The following packages will be upgraded:
libc6 libc6-dev locales
3 upgraded, 4 newly installed, 3 to remove and 266 not upgraded.
Need to get 14.9MB of archives.
After unpacking 21.5MB disk space will be freed.
Do you want to continue? [Y/n] n

# uname -a
Linux XXXXXX 2.4.27-2-386 #1 Wed Aug 17 09:33:35 UTC 2005 i686 GNU/Linux

Does it want to remove the kernel I am using??? Why ?

D0h!
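
A removal like this is easy to miss in a long transcript, so here is a minimal Python sketch that pulls the “will be REMOVED” list out of captured apt-get output and flags the running kernel. It only parses text in the format shown above (for example output saved from a simulated run with apt-get -s install); the function names and the kernel-image-/linux-image- prefixes it looks for are assumptions made for this example:

```python
import re

# Lines that end the package list of a section in the transcript format above.
SECTION_END = re.compile(
    r"^(The following |Suggested packages|Recommended packages|\d+ upgraded)"
)


def packages_to_remove(apt_output: str) -> list:
    """Return the package names listed under 'The following packages will be REMOVED'."""
    removed = []
    in_section = False
    for line in apt_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("The following packages will be REMOVED"):
            in_section = True
        elif in_section:
            if not stripped or SECTION_END.match(stripped):
                in_section = False
            else:
                removed.extend(stripped.split())
    return removed


def removes_running_kernel(removed: list, kernel_release: str) -> bool:
    """True if a kernel image package for the given release (uname -r) is in the list."""
    prefixes = ("kernel-image-", "linux-image-")
    return any(
        pkg.startswith(prefixes) and pkg.endswith(kernel_release) for pkg in removed
    )


if __name__ == "__main__":
    transcript = """The following packages will be REMOVED:
base-config initrd-tools kernel-image-2.4.27-2-386
The following NEW packages will be installed:
libpcap0.8 libssl0.9.8 tcpdump tzdata
"""
    removed = packages_to_remove(transcript)
    print(removed)
    print(removes_running_kernel(removed, "2.4.27-2-386"))
```

On the transcript above it reports the three packages and confirms that the kernel I was running is among them.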

Yet another ati-drivers+xorg problem

The ati-drivers saga will never ever end 🙁

The gentoo fellas updated the stable xorg server to version 7.2, so I decided to give it a shot. The process was smooth, no errors.
Upon reboot though, the problems started. No 3D acceleration! I re-emerged the drivers, I used eselect opengl set ati, no result. Damn! I am still using kernel version 2.6.18 and my ati drivers were version 8.32.5.
My first thought was to update the ati drivers to the latest available version, 8.35.5. The 50MB download took more than 30 minutes on my 1Mbit super-duper-extra fast adsl!

<Yet Another Rant>
During the last couple of weeks I can’t even get more than 50kb/sec during the night. During the morning there are times when I can hardly browse websites at a reasonable speed. Streaming videos from youtube in real time is out of the question of course.
</Yet Another Rant>

The drivers compiled fine but I lost my 1280×1024 mode! Out of sync errors and no image displayed on my tft. I couldn’t even make it work with a Modeline…The highest mode that worked was 1024×768. I even tried removing the ddc module from the xorg modules directory, without any luck though. I am not willing to go back to 1024×768, so I had to download some other, older ati-drivers version.
This time I chose version 8.33.6. Another 50MB of download and another 30 minutes of waiting. The emerge was smooth, no errors. X windows started…but my fonts were truly messed up! Something had made them very, very small in menus and input boxes. ARGHHHHHHHHH!! Grepping through the Xorg.0.log I found out that the DPI was set to 75×75. Another easy way to check that is to run:
% xdpyinfo | grep resolution

Something had gone wrong. Then I remembered that I had removed the ddc module. I put it back in and I got:
% xdpyinfo | grep resolution
resolution: 81x86 dots per inch

A bit better but certainly not very good. I had to change the DPI somehow…but how ?

I googled and googled …and I found out that I could put something like this:
Option "DPI" "96 x 96"
inside my xorg.conf in the Screen section. But that didn’t do the trick. That used to work on Xorg 7.1 but not on 7.2. Tough luck.

I kept googling until I found out that I could start another X server like this:
% startx -- :1 -dpi 96

get the screen dimensions like this:
% xdpyinfo | grep dimensions
dimensions: 1280x1024 pixels (XXXxYYY millimeters)

and then paste the output of the previous command inside the Monitor section of the xorg.conf file like this:
DisplaySize XXX YYY

That did the trick. I can now enjoy 96×96 DPI fonts.
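
The reason this works is plain arithmetic: X derives the DPI from the pixel dimensions and the physical size in millimeters, so a DisplaySize that matches 96 DPI gives 96 DPI fonts. A small Python sketch of the conversion (1 inch = 25.4 mm) follows. The function names are made up for this example, and the numbers it prints are simply what the formula gives for 1280×1024, not necessarily the exact values xdpyinfo printed on my box:

```python
MM_PER_INCH = 25.4


def display_size_mm(width_px: int, height_px: int, dpi: float = 96.0):
    """Physical size in whole millimeters that yields the wanted DPI for a given mode."""
    width_mm = round(width_px * MM_PER_INCH / dpi)
    height_mm = round(height_px * MM_PER_INCH / dpi)
    return width_mm, height_mm


def dpi_from_size(pixels: int, millimeters: float) -> float:
    """DPI along one axis for a given pixel count and physical length in millimeters."""
    return pixels * MM_PER_INCH / millimeters


if __name__ == "__main__":
    width_mm, height_mm = display_size_mm(1280, 1024, dpi=96)
    # Line in the same form as the DisplaySize entry of the Monitor section.
    print(f"DisplaySize {width_mm} {height_mm}")
    print(round(dpi_from_size(1280, width_mm), 1), round(dpi_from_size(1024, height_mm), 1))
```

Because DisplaySize takes whole millimeters, the resulting DPI lands very close to 96 rather than exactly on it.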

During the googling I found out that I can also have something like this inside the .Xdefaults file:
Xft.dpi: 96

Now, after more than 2 hours of messing around with ati-drivers and xorg.conf I am able to enjoy my beautiful desktop…pfffff.

Reminder to myself: Don’t ever ever ever ever buy an ati card again.

P.S. I really hope that the 8.35.5 drivers are somehow fixed in the future so that I can use 1280×1024 if I ever need to update to that version.