Glock Talk
Old 05-08-2012, 16:08   #1
Bushflyr
ʇno uıƃuɐɥ ʇsnɾ
 
Join Date: Mar 1999
Location: Western WA
Posts: 4,465
Bizarre System Crash... Any Ideas?

I noticed the HDD light on my server (Ubuntu 11.10 Server) was on when nothing was accessing it. I already had an SSH window open on my Mac, so I tried a few commands to see what was going on.

Code:
[/usr/local/sbin/hourly.active]: htop
-bash: /usr/bin/htop: Input/output error
[/usr/local/sbin/hourly.active]: sudo cat /proc/mdstat
-bash: /usr/bin/sudo: Input/output error
[/usr/local/sbin/hourly.active]: ls
/usr/local/sbin/ls: line 50: 19495 Bus error               /bin/ls $@ 1>&1
[/usr/local/sbin/hourly.active]: la
/usr/local/sbin/ls: line 50: 19500 Bus error               /bin/ls $@ 1>&1
[/usr/local/sbin/hourly.active]: cd
[~]: ls
/usr/local/sbin/ls: line 50: 19507 Bus error               /bin/ls $@ 1>&1
[~]: top
Segmentation fault
htop, sudo, ls, and top all gave errors, but cd worked fine.
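
One plausible reason cd kept working while htop, sudo, ls and top failed: cd is a shell builtin, so the shell runs it without loading an executable from disk, whereas the others are separate files that have to be read from the filesystem first. A minimal sketch for checking which kind a command is, using the standard `command -v` (the function name `how_resolved` is made up for illustration):

```shell
# how_resolved is a hypothetical helper name used only for this illustration.
# It asks the shell how it would resolve a command name via `command -v`:
# an absolute path means an external file, anything else is handled in-shell.
how_resolved() {
    resolved=$(command -v "$1" 2>/dev/null) || resolved=""
    case "$resolved" in
        "")  echo "$1: not found" ;;
        /*)  echo "$1: external binary at $resolved (has to be read from disk)" ;;
        *)   echo "$1: handled inside the shell (builtin, keyword, function, or alias)" ;;
    esac
}
```

Running `how_resolved cd` and `how_resolved top` on a normal system should show the difference. It says nothing about why the disk reads failed, only why cd was not affected by them.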

I tried a reboot, but wound up with a "no operating system found" sort of error. I had to go to work, so I shut down and switched off the PSU. After coming home I booted into recovery, powered the system down normally (halt -p), and rebooted. It came up fine except for a failed sdb in the RAID, which rebuilt fine on the spare. I'm currently running a SMART test (smartctl -t long /dev/sdb), but I don't expect any errors, as the RAID has dropped disks before and they checked out fine.

It seems odd, though, that just failing a RAID disk (the OS is on a separate drive) would take the whole system down.

Any thoughts?
__________________
...the secret is to bang the rocks together, guys.

That which does not kill you has made a tactical error. --Tayler
Old 05-08-2012, 22:30   #2
gemeinschaft
AKA Fluffy316
 
Join Date: Feb 2004
Location: Houston, TX
Posts: 4,617
I am not sure, but I would consider setting up a cron job to check your disks daily, so you can see whether you just had a bad drive or something else is going on.
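
A daily check along those lines could be built around smartctl's exit status, which the smartctl man page documents as a bitmask. The sketch below only decodes that status. The function name `explain_smartctl_status` and the idea of calling `smartctl -H` from cron are illustrative assumptions, not anything taken from the server in this thread:

```shell
# explain_smartctl_status is a hypothetical helper; the bit meanings are
# paraphrased from the EXIT STATUS section of the smartctl man page.
explain_smartctl_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no problems reported"
        return 0
    fi
    bit=0
    for meaning in \
        "command line did not parse" \
        "device open failed" \
        "a SMART or ATA command failed" \
        "SMART status: DISK FAILING" \
        "prefail attributes at or below threshold" \
        "attributes were at or below threshold in the past" \
        "device error log contains errors" \
        "self-test log contains errors"
    do
        # Print the meaning of every bit that is set in the status value.
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $meaning"
        fi
        bit=$((bit + 1))
    done
}
```

A cron script might run `smartctl -H /dev/sdb`, capture `$?`, and mail the decoded lines whenever the status is non-zero. The smartd daemon that ships with smartmontools covers the same ground without a hand-rolled script.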
Old 05-08-2012, 22:44   #3
Linux3
Senior Member
 
Join Date: Dec 2008
Posts: 1,399
Not enough info about your system, but:
cd /var/log
ls -al
Look at the time stamps on dmesg and syslog.
grep sdb dmesg
grep sdb /var/log/syslog

Or use dmesg.0 and syslog.1, or whichever rotated files match the time of the problems.
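
The same search can be sketched as one loop over the current and rotated logs, so the file that matches the time of the problem doesn't have to be guessed. The function name `scan_logs` is made up for illustration, and the sketch assumes that rotated files ending in .gz are gzip-compressed:

```shell
# scan_logs is a hypothetical helper: scan_logs DEVICE [LOGDIR]
# Prints every line mentioning DEVICE, prefixed with the file it came from.
scan_logs() {
    dev=$1
    dir=${2:-/var/log}
    for f in "$dir"/dmesg* "$dir"/syslog*; do
        [ -f "$f" ] || continue      # skip patterns that matched nothing
        case "$f" in
            *.gz) gzip -dc "$f" ;;   # compressed rotation
            *)    cat "$f" ;;        # plain-text log
        esac | awk -v file="$f" -v dev="$dev" \
            'index($0, dev) { print file ": " $0 }'
    done
}
```

For example, `scan_logs sdb` would cover dmesg, dmesg.0, syslog, syslog.1 and any compressed older copies in one pass. Reading /var/log/syslog normally needs root or membership in the adm group.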

Any errors?
__________________
If it's not on fire,
It's a software problem.

Old 05-09-2012, 11:50   #4
Bushflyr
ʇno uıƃuɐɥ ʇsnɾ
 
Thanks for the ideas. I've gone through all the log files and there's nothing there. I'll try a smartctl cron job, but I don't expect much from it. The drives are all new, and I've run a long test after each failure with no errors. Different drives have dropped out at different points, but the array had been running reliably for a few weeks with no probs.
Old 05-10-2012, 00:09   #5
Detectorist
Senior Member
 
Join Date: Jul 2008
Location: Missouri
Posts: 8,198
Windows 7 would have prevented that.


__________________
NASM-Certified Personal Trainer

The single biggest problem in communication is the illusion that it has taken place. George Bernard Shaw
Old 05-10-2012, 23:24   #6
Bushflyr
ʇno uıƃuɐɥ ʇsnɾ
 
If by "would have prevented that" you mean "would have prevented my even installing a RAID since Win7 wouldn't know a RAID if it bit it on the ASSH," then yes, you are correct.

Oh, wait, it doesn't do ASSH either.

Last edited by Bushflyr; 05-10-2012 at 23:27..
Old 05-11-2012, 22:06   #7
Detectorist
Senior Member
 
Quote:
Originally Posted by Bushflyr
If by "would have prevented that" you mean "would have prevented my even installing a RAID since Win7 wouldn't know a RAID if it bit it on the ASSH," then yes, you are correct.

Oh, wait, it doesn't do ASSH either.
Win 7 Professional Ultimate supports the mirrored type of RAID.
Old 05-11-2012, 22:36   #8
Bushflyr
ʇno uıƃuɐɥ ʇsnɾ
 
I know. But adding the exception in ruined the lyrical flow.

And the intent is still correct since Windows 7 Professional Ultimate Super Duper Apex Pinnacle etc etc still doesn't do RAID 5 (which is what I'm using), RAID 6, or any sort of nested RAID. It does RAID 1. And I'm purposely leaving out "RAID" 0 because it's not really RAID as there is no Redundant in it.

Last edited by Bushflyr; 05-11-2012 at 22:37..
Old 05-12-2012, 23:07   #9
jarubla
Dos Pistolas
 
Join Date: Feb 2010
Location: UT
Posts: 377
RAID 5 is single parity, right? Can you ID which disk failed or hiccuped? Any chance that more than one disk reported an issue, either before or during the rebuild? Smells like a possible RAID rebuild issue to me, and disks sometimes do funny things at the worst possible times. That's one of the main reasons why I am a RAID 6 guy. The extra disk costs more, but it lets the array survive a dual-disk failure.
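
The cost side of that trade-off can be put in numbers: with n equal-sized disks, RAID 5 gives up one disk's worth of capacity to parity and RAID 6 gives up two. A small sketch of the arithmetic (the function name `raid_usable` and the 4 x 2000 GB figures are assumptions for illustration, not the actual array in this thread):

```shell
# raid_usable is a hypothetical helper: raid_usable LEVEL DISKS SIZE
# Prints usable capacity in the same unit as SIZE. Hot spares are not counted.
raid_usable() {
    level=$1; disks=$2; size=$3
    case "$level" in
        1) min=2; lost=$((disks - 1)) ;;   # mirror: one disk's worth is usable
        5) min=3; lost=1 ;;                # single parity
        6) min=4; lost=2 ;;                # double parity
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
    if [ "$disks" -lt "$min" ]; then
        echo "RAID $level needs at least $min disks" >&2
        return 1
    fi
    echo $(( (disks - lost) * size ))
}
```

With four 2000 GB disks, `raid_usable 5 4 2000` prints 6000 and `raid_usable 6 4 2000` prints 4000, so on an array that small the second parity disk costs a third of the RAID 5 capacity.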

Are you able to parse through any log files to pinpoint when the issue occurred? Hoping an error message can be pulled so we can run it through the Ubuntu bug tracker:

https://bugs.launchpad.net/ubuntu

Also, as a side note, I just saw your thread over on http://ubuntuforums.org. I'm following this one, as I am curious now as to the outcome.

-Jay
__________________
Give a man a fish, and you'll feed him for a day. Teach a man to fish, and you'll never see him on the weekends.
Old 05-13-2012, 13:15   #10
Bushflyr
ʇno uıƃuɐɥ ʇsnɾ
 
RAID 5 is single parity, but I'm also running a hot spare, so there is some extra safety there. I've lost sdb and sde at one point or another, but no errors ever showed up when scanning the drives afterward, and I re-added them to the RAID without issue.
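
For keeping track of which member dropped out each time, /proc/mdstat is the place to look: a failed member is tagged (F), a spare (S), and a missing slot shows up as an underscore in the [UU_] status field. A rough sketch that pulls those out (both function names are hypothetical, and the optional file argument is only there so it can be tried against a saved copy of mdstat):

```shell
# failed_members and degraded_arrays are hypothetical helper names.
# Both read /proc/mdstat unless another file is passed as the first argument.

# Prints member names tagged (F), e.g. "sdb1" for an entry like sdb1[1](F).
failed_members() {
    grep -o '[[:alnum:]]*\[[0-9]*\](F)' "${1:-/proc/mdstat}" | sed 's/\[.*//'
}

# Prints the name of every md array whose status field contains an underscore.
degraded_arrays() {
    awk '/^md/ { name = $1 }
         /\[[U_]*_[U_]*\]/ { print name }' "${1:-/proc/mdstat}"
}
```

Logging the output of both with a timestamp from cron would at least show whether it is always the same slot or port that drops, which might help separate a cabling or power problem from a drive problem.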

The first couple of times it happened I was thinking maybe cables, but there were no IO errors, and nothing was listed in any log files. At this point I'm wondering if it's possibly flaky power in my house. (I haven't gotten the UPS yet, but it's on the list.) I recall reading somewhere that RAIDs are particularly sensitive to power fluctuations. And all my lights dim for a second when the wife turns on the hair dryer.

Also previous RAID failures never took down the OS. Everything is back up and running fine, so at this point

 
  
All times are GMT -6.



