tcm bot crashing

-wassup-
Posts: 103
Joined: Wed Aug 13, 2003 8:25 pm
Location: Middle East

Postby -wassup- » Wed Sep 17, 2003 7:11 pm

I am running a hybrid 6.4.1 server with tcm-hybrid-CURRENT-20030817_0.tgz. Every time I rehash the server, the bot crashes instead of doing its normal stats requests. Here is the log:
server_link_closed(0x818adb0)
09/17/2003 20:13
server_link_closed(0x0)
09/17/2003 20:13
server_link_closed(0x818adb0)
09/17/2003 20:14
server_link_closed(0x0)
09/17/2003 20:14
server_link_closed(0x818adb0)
That's from the error log.
Hwy
Posts: 66
Joined: Wed Jul 16, 2003 12:27 pm

Postby Hwy » Thu Sep 18, 2003 12:51 am

Sending a gdb bt to the authors would likely get more of a result. Send it to ircd-hybrid@the-project.org

gdb /path/to/tcm-binary /path/to/tcm.core

When you get to a "(gdb)" prompt, issue the command "bt"

Paste that command's output into the e-mail.
-wassup-
Posts: 103
Joined: Wed Aug 13, 2003 8:25 pm
Location: Middle East

Postby -wassup- » Thu Sep 18, 2003 3:47 am

The tcm hasn't left a core file; I just found that in the error log. I'll email the gdb output.
Hardy
Site Admin
Posts: 394
Joined: Wed Jul 02, 2003 4:54 pm
Location: Oslo, Norway

Postby Hardy » Thu Sep 18, 2003 11:15 am

TCM does a lot of stats requests to start with. Are you sure your sendq is high enough that it simply doesn't flood itself out? When avalonworks gained some thousand clients, I noticed I had it too low, so it died in the middle of /trace. The error I got was much the same as yours: rubbish, not really saying anything :)
-- Hardy
Administrator: irc.underworld.no
Services Administrator
http://www.efnet.org admin/staff
-wassup-
Posts: 103
Joined: Wed Aug 13, 2003 8:25 pm
Location: Middle East

Postby -wassup- » Thu Sep 18, 2003 1:10 pm

Well, the last tcm bot I used did it fine, and I haven't changed the sendq lengths. I haven't touched any of the Y or I lines since I upgraded.
-wassup-
Posts: 103
Joined: Wed Aug 13, 2003 8:25 pm
Location: Middle East

Postby -wassup- » Thu Sep 18, 2003 1:15 pm

Yeah, I tried increasing the sendq length and it still had that error.
Hwy
Posts: 66
Joined: Wed Jul 16, 2003 12:27 pm

Postby Hwy » Thu Sep 18, 2003 1:21 pm

You're using current cvs, which can be (and often is) broken. That is designed for developers ONLY.

Anyway, what OS are you running? Some operating systems (stupidly) disable dumping core files by default. Show us the output of this shell command:
ulimit -c

(or if you are using some form of csh, where the builtin is limit rather than ulimit)
limit coredumpsize
-wassup-
Posts: 103
Joined: Wed Aug 13, 2003 8:25 pm
Location: Middle East

Postby -wassup- » Thu Sep 18, 2003 2:58 pm

It's an updated Red Hat 7.2 system with the 2.4.20 kernel.
[irc@ns irc]$ ulimit -a
core file size (blocks) 0
data seg size (kbytes) unlimited
file size (blocks) unlimited
max locked memory (kbytes) unlimited
max memory size (kbytes) unlimited
open files 1024
pipe size (512 bytes) 8
stack size (kbytes) 8192
cpu time (seconds) unlimited
max user processes 4096
virtual memory (kbytes) unlimited

I take it I need to change the core file size? Or is 0 unlimited? I have had core files appear with this set at 0 before.

Here is my gdb session:
[irc@ns bin]$ gdb ./tcm
GNU gdb Red Hat Linux (5.2-2)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) run -n
Starting program: /home/irc/tcm/bin/tcm -n

Program received signal SIGSEGV, Segmentation fault.
0x4019169c in chunk_free (ar_ptr=0x40245300, p=0x818e8d8) at malloc.c:3228
3228 malloc.c: No such file or directory.
in malloc.c
(gdb)

I got that segfault when I rehashed, although the tcm does not crash when running through gdb. Maybe I have set the fd limit too high in config.h?
/* Maximum connections;
* this is now shared with squid, socks, wingate checks
* XXX could make this FD_SETSIZE, but that might be a tad too large
*/
#define MAXDCCCONNS 1024

The box I am using does have 1024 as its fd limit.
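
For anyone wanting to compare the two numbers from inside a program rather than by eye, here is a minimal sketch in C. It is not taken from tcm; the function names fd_soft_limit and maxconns_exceeds_limit are made up for illustration, and only the MAXDCCCONNS value comes from the config.h excerpt above. It reads the soft open-file limit with getrlimit() and reports whether MAXDCCCONNS plus a few reserved descriptors would go past it.

```c
#include <sys/resource.h>

#define MAXDCCCONNS 1024  /* value from the config.h excerpt above */

/* Hypothetical helper: returns the soft open-file limit,
 * or -1 if it cannot be read or is unlimited. */
long fd_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return -1;
    return (long)rl.rlim_cur;
}

/* Hypothetical helper: nonzero when MAXDCCCONNS plus `reserved`
 * descriptors (stdio, log files, the server link) exceeds the limit. */
int maxconns_exceeds_limit(long reserved)
{
    long limit = fd_soft_limit();

    return limit >= 0 && MAXDCCCONNS + reserved > limit;
}
```

With both values at 1024, any nonzero reserve would be flagged, since the process already holds a few descriptors of its own. Whether that has anything to do with this crash is a separate question.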
-wassup-
Posts: 103
Joined: Wed Aug 13, 2003 8:25 pm
Location: Middle East

Postby -wassup- » Thu Sep 18, 2003 5:19 pm

I just remembered that I did get errors during compiling, having to do with socket.h.

gcc -I../include -g -O2 -Wall  -c util.c
In file included from /usr/include/sys/socket.h:35,
                 from util.c:11:
/usr/include/bits/socket.h:36: warning: empty declaration
gcc -I../include -g -O2 -Wall  -c actions.c
In file included from /usr/include/sys/socket.h:35,
                 from actions.c:12:
/usr/include/bits/socket.h:36: warning: empty declaration
gcc -I../include -g -O2 -Wall  -c dcc_commands.c
In file included from dcc_commands.c:7:
/usr/include/unistd.h:247: warning: empty declaration
gcc -I../include -g -O2 -Wall  -c logging.c
In file included from logging.c:16:
/usr/include/unistd.h:247: warning: empty declaration
gcc -I../include -g -O2 -Wall  -c main.c
In file included from /usr/include/sys/socket.h:35,
                 from main.c:16:
/usr/include/bits/socket.h:36: warning: empty declaration
gcc -I../include -g -O2 -Wall  -c respond.c
In file included from respond.c:27:
/usr/include/unistd.h:247: warning: empty declaration
gcc -I../include -g -O2 -Wall  -c serv_commands.c
In file included from serv_commands.c:7:
/usr/include/unistd.h:247: warning: empty declaration
gcc -I../include -g -O2 -Wall  -c stdcmds.c
In file included from /usr/include/sys/socket.h:35,
                 from stdcmds.c:23:
/usr/include/bits/socket.h:36: warning: empty declaration
I removed all the lines that didn't have any errors before them. tcm did compile, though.
Hwy
Posts: 66
Joined: Wed Jul 16, 2003 12:27 pm

Postby Hwy » Thu Sep 18, 2003 9:13 pm

-wassup- wrote: I just remembered that I did get errors during compiling, having to do with socket.h.
That's a system header; you can't do anything about it, so you can ignore the warning.
Hwy
Posts: 66
Joined: Wed Jul 16, 2003 12:27 pm

Postby Hwy » Thu Sep 18, 2003 9:18 pm

-wassup- wrote: It's an updated Red Hat 7.2 system with the 2.4.20 kernel.
[irc@ns irc]$ ulimit -a
core file size (blocks) 0

...

I take it I need to change the core file size? Or is 0 unlimited? I have had core files appear with this set at 0 before.
ulimit -c unlimited

Some programs (hybrid, hybserv, and Sentinel in particular) will programmatically change that value to unlimited, even if your system sets it to 0 (disabled) by default (as most Linux distributions do, for some unknown reason).
-wassup- wrote: Here is my gdb session:
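
As a rough illustration of what "programmatically" means here, the following C sketch raises the soft core-file limit at startup. It is not copied from hybrid, hybserv, or Sentinel; the function name enable_core_dumps is hypothetical. It uses the standard getrlimit()/setrlimit() calls with RLIMIT_CORE. Note that an unprivileged process can only raise the soft limit as far as the hard limit, so this is "unlimited" only when the hard limit allows it.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit to the
 * hard limit. Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }

    /* The soft limit may be raised up to, but not beyond, the hard limit. */
    rl.rlim_cur = rl.rlim_max;

    if (setrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("setrlimit");
        return -1;
    }
    return 0;
}
```

A daemon would call something like this once, early in main(), before anything that might crash. If the hard limit is itself 0, no core file will appear and the shell's limit has to be changed by an administrator.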
[irc@ns bin]$ gdb ./tcm
GNU gdb Red Hat Linux (5.2-2)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) run -n
Starting program: /home/irc/tcm/bin/tcm -n

Program received signal SIGSEGV, Segmentation fault.
0x4019169c in chunk_free (ar_ptr=0x40245300, p=0x818e8d8) at malloc.c:3228
3228 malloc.c: No such file or directory.
in malloc.c
(gdb)
Here is where you would issue the "bt" command, to see what function calls were run prior to the core happening. chunk_free() looks like it may be a double-free bug, but I can't be sure without seeing a complete backtrace.
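
For reference, the usual defensive pattern against a double free looks roughly like the sketch below. This is NOT tcm's code: struct dyn_entry and clear_entry are hypothetical stand-ins for whatever a rehash tears down. The idea is only that a pointer set to NULL after free() makes a second cleanup pass harmless, because free(NULL) is defined to do nothing.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a record that is freed on rehash. */
struct dyn_entry {
    char *host;
};

/* Free the entry's string and clear the pointer so that calling
 * this again on the same entry does not free the block twice. */
void clear_entry(struct dyn_entry *e)
{
    free(e->host);   /* free(NULL) is a no-op */
    e->host = NULL;  /* a second call now frees nothing */
}
```

If a cleanup routine frees a pointer and leaves the stale address in place, the next rehash hands the same address to free() again, which is one way to end up crashing inside the allocator.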
-wassup-
Posts: 103
Joined: Wed Aug 13, 2003 8:25 pm
Location: Middle East

Postby -wassup- » Thu Sep 18, 2003 10:27 pm

A lot of Linux distros set the core file size to zero for security reasons; daemons sometimes leave world-readable core files containing important info.

[irc@ns tcm]$ gdb bin/tcm ./core
GNU gdb Red Hat Linux (5.2-2)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
Core was generated by `tcm/bin/tcm'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libcrypto.so.2...done.
Loaded symbols for /lib/libcrypto.so.2
Reading symbols from /lib/i686/libc.so.6...done.
Loaded symbols for /lib/i686/libc.so.6
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /lib/libnss_nisplus.so.2...done.
Loaded symbols for /lib/libnss_nisplus.so.2
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libnss_dns.so.2...done.
Loaded symbols for /lib/libnss_dns.so.2
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
#0  0x4019169c in chunk_free (ar_ptr=0x40245300, p=0x818e8d8) at malloc.c:3228
3228    malloc.c: No such file or directory.
        in malloc.c
(gdb) bt
#0  0x4019169c in chunk_free (ar_ptr=0x40245300, p=0x818e8d8) at malloc.c:3228
#1  0x401913f4 in __libc_free (mem=0x818e8e0) at malloc.c:3154
#2  0x08058dba in clear_dynamic_info () at skline.c:106
#3  0x0805b861 in reload_userlist () at userlist.c:1170
#4  0x0804dfbd in on_server_notice (source_p=0xbfffd8d0, argc=4,
    argv=0xbfffd750) at bothunt.c:544
#5  0x08058b51 in ms_notice (source_p=0xbfffd8d0, argc=4, argv=0xbfffd750)
    at serv_commands.c:73
#6  0x08057a80 in process_server (source_p=0xbfffd8d0,
    function=0x818adc3 "NOTICE", param=0x818adca "BCtcm") at parse.c:306
#7  0x0805776e in parse_server (uplink_p=0x818adb0) at parse.c:116
#8  0x08059626 in read_packet () at tcm_io.c:207
#9  0x08056ccb in main (argc=1, argv=0xbfffddc4) at main.c:279
#10 0x4012c657 in __libc_start_main (main=0x80568ac <main>, argc=1,
    ubp_av=0xbfffddc4, init=0x804bcb0 <_init>, fini=0x805c5a0 <_fini>,
    rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xbfffddbc)
    at ../sysdeps/generic/libc-start.c:129
(gdb)
That's the gdb output... I hope this is what you wanted.
Hwy
Posts: 66
Joined: Wed Jul 16, 2003 12:27 pm

Postby Hwy » Fri Sep 19, 2003 12:36 pm

-wassup- wrote: A lot of Linux distros set the core file size to zero for security reasons; daemons sometimes leave world-readable core files containing important info.
Then they shouldn't tout themselves as a "developer's platform".
-wassup- wrote: That's the gdb output... I hope this is what you wanted.
That information is what you'd send to ircd-hybrid@the-project.org.
