Server crash
It is now over 60 days since the last post. This thread is closed.
Posted by: Kronos, USA (35 posts)
Date: Tue 30 Dec 2008 07:46 PM (UTC)
Message:
Hello, I'm running a modified version of SWR... I cleared up a bunch of bugs and it compiled fine, but I get this when I run the startup script:
*** glibc detected *** ../src/swreality: double free or corruption (!prev): 0x090f4660 ***
======= Backtrace: =========
/lib/libc.so.6[0x460b16]
/lib/libc.so.6(cfree+0x90)[0x464030]
/lib/libc.so.6(fclose+0x136)[0x44f866]
../src/swreality[0x814bed2]
../src/swreality[0x80c681b]
../src/swreality[0x80b25c6]
/lib/libc.so.6(__libc_start_main+0xdc)[0x40ddec]
../src/swreality[0x8049621]
======= Memory map: ========
003da000-003f4000 r-xp 00000000 08:03 65538 /lib/ld-2.5.so
003f4000-003f5000 r-xp 00019000 08:03 65538 /lib/ld-2.5.so
003f5000-003f6000 rwxp 0001a000 08:03 65538 /lib/ld-2.5.so
003f8000-00535000 r-xp 00000000 08:03 65554 /lib/libc-2.5.so
00535000-00537000 r-xp 0013d000 08:03 65554 /lib/libc-2.5.so
00537000-00538000 rwxp 0013f000 08:03 65554 /lib/libc-2.5.so
00538000-0053b000 rwxp 00538000 00:00 0
00543000-00568000 r-xp 00000000 08:03 65571 /lib/libm-2.5.so
00568000-00569000 r-xp 00024000 08:03 65571 /lib/libm-2.5.so
00569000-0056a000 rwxp 00025000 08:03 65571 /lib/libm-2.5.so
006c0000-006cb000 r-xp 00000000 08:03 65743 /lib/libgcc_s-4.1.2-20080102.so.1
006cb000-006cc000 rwxp 0000a000 08:03 65743 /lib/libgcc_s-4.1.2-20080102.so.1
08048000-0824e000 r-xp 00000000 08:03 14812408 /home/sentient/test/src/swreality
0824e000-0824f000 rwxp 00205000 08:03 14812408 /home/sentient/test/src/swreality
0824f000-08277000 rwxp 0824f000 00:00 0
090f4000-091db000 rwxp 090f4000 00:00 0
40000000-40001000 r-xp 40000000 00:00 0 [vdso]
40001000-40002000 rw-p 40001000 00:00 0
4000a000-4000c000 rw-p 4000a000 00:00 0
40100000-40121000 rw-p 40100000 00:00 0
40121000-40200000 ---p 40121000 00:00 0
bfeb3000-bfec8000 rw-p bfeb3000 00:00 0 [stack]
Abort (core dumped)
Any ideas?
-Kronos

Posted by: Kronos, USA (35 posts)
Date: Reply #2 on Wed 31 Dec 2008 05:56 PM (UTC), amended on Wed 31 Dec 2008 07:19 PM (UTC) by Kronos
Message:
Okay, I made the best of gdb and here is some info...
Quote:
[sentient@lix area]$ gdb ../src/swreality core.27294
GNU gdb Red Hat Linux (6.5-37.el5_2.2rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Core was generated by `../src/swreality 2009'.
Program terminated with signal 6, Aborted.
#0 0x40000402 in __kernel_vsyscall ()
(gdb) bt
#0 0x40000402 in __kernel_vsyscall ()
#1 0x00420d10 in raise () from /lib/libc.so.6
#2 0x00422621 in abort () from /lib/libc.so.6
#3 0x00458e5b in __libc_message () from /lib/libc.so.6
#4 0x00460b16 in _int_free () from /lib/libc.so.6
#5 0x00464030 in free () from /lib/libc.so.6
#6 0x0044f866 in fclose@@GLIBC_2.1 () from /lib/libc.so.6
#7 0x0814bed2 in load_structures () at structure.c:283
#8 0x080c681b in boot_db (fCpyOver=0 '\0') at db.c:894
#9 0x080b25c6 in main (argc=2, argv=0xbfcbef24) at comm.c:283
(gdb) list
157 #endif
158
159 MEMORY_INIT();
160
161
162
163
164
165
166 emergency_copy = FALSE;
(gdb)
Now, based on this, I can't tell whether MEMORY_INIT() or setting emergency_copy to FALSE is causing the problem. I did a little testing and found that no matter how much space I put between MEMORY_INIT() and emergency_copy, the listing ALWAYS shows emergency_copy at the bottom, and MEMORY_INIT() won't even be visible when I type list. What are your thoughts?
UPDATE: Okay, I looked into it more and noticed that emergency_copy is defined as a global variable. I tried a couple of things with no luck, and decided to just see what would happen if I commented it out. What happens is that it goes to the next line instead, which looks like this:
Quote:
161
162 //emergency_copy = 0; //false
163 num_descriptors = 0;
I know the numbers don't make sense if you look at the last listing I posted, but I removed some empty lines. Does all this shed any light on my problem?
-Kronos

Posted by: Nick Gammon, Australia (23,158 posts), Forum Administrator
Date: Reply #3 on Wed 31 Dec 2008 07:25 PM (UTC)
Message:
What I deduce from what you have shown is this:
- You have called main -> boot_db -> load_structures.
- At line 283 in structure.c (i.e. in function load_structures) you are calling fclose.
- The fclose is crashing.
The function fclose takes a FILE* argument, so your likely reason for the crash is one of:
- The file is already closed or was never opened.
- The FILE* argument is NULL (probably because the file was not opened).
- The FILE* argument has become corrupted somehow.
I would investigate along those lines.
One thing to do is print a warning if the file cannot be opened by fopen, and then not try to close it (that is, test if the fopen returns NULL).
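A minimal sketch of that test in standalone C (not taken from the SWR source). The function name can_open_for_reading is made up for illustration; the only point is that fclose is reached solely on the path where fopen succeeded.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of SWR: try to open 'path' for reading.
 * Prints a warning via perror and returns 0 if fopen fails.
 * Closes the stream and returns 1 if fopen succeeds. */
int can_open_for_reading(const char *path)
{
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
    {
        perror(path);   /* warn, and do NOT call fclose on a NULL stream */
        return 0;
    }
    fclose(fp);         /* only reached when the open succeeded */
    return 1;
}
```

Calling it on "/dev/null" should succeed on Linux, while a path in a directory that does not exist prints a warning and returns 0 without touching fclose.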
- Nick Gammon
www.gammon.com.au, www.mushclient.com

Posted by: Kronos, USA (35 posts)
Date: Reply #4 on Wed 31 Dec 2008 08:19 PM (UTC), amended on Wed 31 Dec 2008 08:41 PM (UTC) by Kronos
Message:
Ooooh, very insightful! Thank you! The file stream fpReserve is declared extern in mud.h and is used to open a file before boot_db is called. Here's the section of code for that, with the important bits in bold:
/*
 * Reserve two channels for our use.
 */
if ( ( fpReserve = fopen( NULL_FILE, "r" ) ) == NULL )
{
    perror( NULL_FILE );
    exit( 1 );
}
if ( ( fpLOG = fopen( NULL_FILE, "r" ) ) == NULL )
{
    perror( NULL_FILE );
    exit( 1 );
}

/*
 * Get the port number.
 */
port = 8000;
if ( argc > 1 )
{
    if ( !is_number( argv[1] ) )
    {
        fprintf( stderr, "Usage: %s [port #]\n", argv[0] );
        exit( 1 );
    }
    else if ( ( port = atoi( argv[1] ) ) <= 1024 )
    {
        fprintf( stderr, "Port number must be above 1024.\n" );
        exit( 1 );
    }
}
if (argv[2] && argv[2][0])
{
    fCopyOver = TRUE;
    control = atoi(argv[3]);
    control2 = atoi(argv[4]);
}
else
{
    fCopyOver = FALSE;
}

/*
 * Run the game.
 */
#ifdef WIN32
{
    /* Initialise Windows sockets library */
    unsigned short wVersionRequested = MAKEWORD(1, 1);
    WSADATA wsadata;
    int err;

    /* Need to include library: wsock32.lib for Windows Sockets */
    err = WSAStartup(wVersionRequested, &wsadata);
    if (err)
    {
        fprintf(stderr, "Error %i on WSAStartup\n", err);
        exit(1);
    }

    /* standard termination signals */
    signal(SIGINT, (void *) bailout);
    signal(SIGTERM, (void *) bailout);
}
#endif /* WIN32 */
log_string("Booting Database");
boot_db( fCopyOver );
There seems to be a check that exits the program if the file can't be opened, so where is it going wrong?
Here is the load_structures() function for good measure:
void load_structures( )
{
    FILE *fpList;
    char *filename;
    char structurelist[256];
    char buf[MAX_STRING_LENGTH];

    first_structure = NULL;
    last_structure = NULL;
    sprintf( structurelist, "%s%s", STRUCTURE_DIR, STRUCTURE_LIST );
    fclose( fpReserve );
    if ( ( fpList = fopen( structurelist, "r" ) ) == NULL )
    {
        perror( structurelist );
        exit( 1 );
    }
    for ( ; ; )
    {
        filename = feof( fpList ) ? "$" : fread_word( fpList );
        if ( filename[0] == '$' )
            break;
        if ( !load_structure( filename ) )
        {
            sprintf( buf, "Cannot load structure file: %s", filename );
            bug( buf, 0 );
        }
    }
    fclose( fpList );
    fpReserve = fopen( NULL_FILE, "r" );
    return;
}
The offending line is the fclose( fpReserve ); call near the top. Any ideas?
UPDATE: I looked at the function that is called just before load_structures, and at the end of it is this:
fpReserve = fopen( NULL_FILE, "r" );
return;
So there IS a file open to be closed. Why might it crash?
-Kronos

Posted by: Nick Gammon, Australia (23,158 posts), Forum Administrator
Date: Reply #5 on Wed 31 Dec 2008 08:58 PM (UTC)
Message:
Which one is line 283?
Anyway, you have identified that there is an open there without a test for success. I would replace:
fpReserve = fopen( NULL_FILE, "r" );
with the other code:
if ( ( fpReserve = fopen( NULL_FILE, "r" ) ) == NULL )
{
    perror( NULL_FILE );
    exit( 1 );
}
Check what NULL_FILE is defined as, and check that the file exists. After all, if that file is in an invalid path, or does not exist, then that fopen will return NULL, and the fclose at the start of load_structures will fail.
- Nick Gammon
www.gammon.com.au, www.mushclient.com

Posted by: Kronos, USA (35 posts)
Date: Reply #6 on Wed 31 Dec 2008 09:44 PM (UTC), amended on Wed 31 Dec 2008 10:04 PM (UTC) by Kronos
Message:
I changed every instance of this to the code you suggested... also I know the NULL_FILE exists because it's called many times in the code prior to this crash... but anyway here it is:
#define NULL_FILE "/dev/null" /* To reserve one stream*/
It still crashes in exactly the same spot. However, when I commented out that line in load_structures, the MUD runs. I would still like to understand what is going on here so I can make sure it's fixed properly. What do you think is the cause now?

Posted by: Nick Gammon, Australia (23,158 posts), Forum Administrator
Date: Reply #7 on Wed 31 Dec 2008 10:06 PM (UTC)
Message:
Is this Linux or Cygwin?
- Nick Gammon
www.gammon.com.au, www.mushclient.com

Posted by: Kronos, USA (35 posts)
Date: Reply #8 on Wed 31 Dec 2008 10:57 PM (UTC)
Message:

Posted by: Nick Gammon, Australia (23,158 posts), Forum Administrator
Date: Reply #9 on Wed 31 Dec 2008 10:58 PM (UTC)
Message:
The purpose of the fpReserve file in the first place is to keep a file descriptor available to the MUD. In older versions of Unix / Linux, there was a limit to the number of files that could be open - this may or may not still be the case.
The problem they are trying to solve here is that, a network connection is effectively a file (as the socket uses a file descriptor). Thus if the limit of open files is reached (say, with 1,000 players on), then when it came time to save a player's data to disk, there was no free descriptor for opening the player file.
Thus, the MUD opened a file itself (fpReserve) that it didn't really need, using "/dev/null" as the device (ie. the null device, or no device). Then every time it needs to open a file it first closes fpReserve (so now it has at least one free descriptor), opens the file it really wants, closes it, and then re-opens fpReserve, so it is still reserving a file for next time.
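The close / open / close / re-open cycle described above can be sketched in standalone C as follows. This is an illustration under assumptions, not the SWR code: reserve_open, reserve_release and count_bytes are hypothetical names, and the static fpReserve here merely stands in for the MUD's global.

```c
#include <assert.h>
#include <stdio.h>

#define NULL_FILE "/dev/null"       /* same value as the define quoted in this thread */

static FILE *fpReserve = NULL;      /* stand-in for the MUD's global stream */

/* Take the placeholder descriptor. Returns 0 on success, -1 on failure. */
int reserve_open(void)
{
    assert(fpReserve == NULL);      /* must not already be held */
    fpReserve = fopen(NULL_FILE, "r");
    return fpReserve != NULL ? 0 : -1;
}

/* Give the placeholder descriptor back so one descriptor is free. */
void reserve_release(void)
{
    assert(fpReserve != NULL);      /* releasing twice is a logic error */
    fclose(fpReserve);
    fpReserve = NULL;
}

/* Hypothetical file operation using the pattern: count the bytes in a file.
 * Precondition: the reserve is currently held.
 * Returns the byte count, or -1 if the file could not be opened. */
long count_bytes(const char *path)
{
    long n = 0;
    FILE *fp;

    reserve_release();              /* 1. free a descriptor */
    fp = fopen(path, "r");          /* 2. open the file actually wanted */
    if (fp != NULL)
    {
        while (fgetc(fp) != EOF)
            n++;
        fclose(fp);                 /* 3. close it again */
    }
    else
    {
        perror(path);
        n = -1;
    }
    if (reserve_open() != 0)        /* 4. re-reserve on every path, even on failure */
        perror(NULL_FILE);
    return n;
}
```

The important property is step 4: the reserve is taken again on both the success path and the failure path, so the next caller always finds fpReserve open.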
Quote:
I know the NULL_FILE exists because it's called many times in the code prior to this crash ...
Well I am guessing that somewhere in the code, it closes fpReserve, opens and closes another file, and then does not re-open fpReserve. Thus the file is not open, and it crashes closing it, on the line that is causing you the problem.
I suggest you check that, everywhere it closes fpReserve and opens and closes another file, it wraps up by re-opening fpReserve again.
- Nick Gammon
www.gammon.com.au, www.mushclient.com

Posted by: Kronos, USA (35 posts)
Date: Reply #10 on Thu 01 Jan 2009 02:02 AM (UTC)
Message:
I went through every instance where it closes fpReserve and noted that it then opens another file, eventually closes it, and then reopens fpReserve, and it did just that every time... except where I commented out the one fclose( fpReserve ) in order to stop it from crashing in load_structures()... curious, eh?
-Kronos

Posted by: Nick Gammon, Australia (23,158 posts), Forum Administrator
Date: Reply #11 on Thu 01 Jan 2009 02:34 AM (UTC), amended on Thu 01 Jan 2009 02:35 AM (UTC) by Nick Gammon
Message:
Hmm, when it crashes in gdb, type "f 7" (frame 7) to get to the stack frame which is where the fclose is, and then "p fpReserve" (print that variable). It would be interesting to see if it is NULL or not.
It would also probably help to add:
fpReserve = NULL;
directly after each time you do a fclose (fpReserve).
That way, you know that the fpReserve file is definitely closed.
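One way to make a closed stream recognisable is to reset the pointer to NULL in the same place the fclose happens, so a second close can be reported instead of handing a stale pointer to fclose. The helper below is a sketch under that assumption; close_and_clear is a made-up name and is not part of SMAUG or SWR.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: close the stream that *fpp refers to, then reset
 * *fpp to NULL. 'where' is a label for the call site, used in the warning.
 * Returns 0 if a stream was closed, -1 if *fpp was already NULL. */
int close_and_clear(FILE **fpp, const char *where)
{
    assert(fpp != NULL && where != NULL);
    if (*fpp == NULL)
    {
        fprintf(stderr, "%s: stream already closed, skipping fclose\n", where);
        return -1;
    }
    fclose(*fpp);
    *fpp = NULL;    /* a repeated close is now detectable instead of undefined */
    return 0;
}
```

With something like this, each fclose( fpReserve ) would become close_and_clear( &fpReserve, "load_structures" ) or similar, and the warning would name the call site that found the stream already closed.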
- Nick Gammon
www.gammon.com.au, www.mushclient.com

The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).