On Fri, 17 Jan 1997 BuckFloyd@aol.com wrote:

- Umm.. I'm having a NASTY crash bug that seems to
- occur most often after zone_reset.. I've got most of the
- functions in heartbeat() in comm.c flagged to log()
- where they are at in the code.. no avail.. I can't track
- it down.. so, I'm going to compile with gdb enabled.
-
- My problem? I'm not on UNIX. I'm on OS/2, running the emx
- ports of the more-common unix utils, including gcc and gdb.
- I don't have any man pages for gdb, so i'm kinda stuck.
-
- Any pages/docs available on the net, or should I go avail
- myself of a UNIX book from the store? (My unix skills are
- limited to "ls, cd, ps, and grep").

GDB has some online help, tho it's not the best. It does at least give a
summary of commands and what they're supposed to do. I just mailed someone
on some of my tricks. Here's a short intro to gdb and then my note on
expanded bughunting.

If you've got a core file, go to your top circle directory and type:

> gdb bin/circle lib/core

If you want to hunt bugs in real time (causing bugs to find the cause as
opposed to checking a core to see why the mud crashed earlier) use:

> gdb bin/circle

If you're working with a core, gdb should show you where the crash
occurred. If you get an actual line that failed, you've got it made. If
not, the included message should help. If you're working in real time,
type "run" at the gdb prompt to start the mud, then crash it so you can see
what gdb catches.

When you've got the crash info, you can type "where" to see which function
called the crash function, which function called that one, and so on all
the way up to main().

I should explain about "context". You may type "print ch", which you would
expect to show you the ch variable, but if you're in a function that
doesn't get a ch passed to it (real_mobile, etc), you can't see ch because
it's not in that context. To change contexts (the function levels you saw
with where) type "up" to go up.
You start at the bottom, but once you go up, and up, and up, you can
always go back "down". You may be able to go up a couple functions to see a
function with ch in it, if finding out who caused the crash is useful (it
normally isn't).

The "print" command is probably the single most useful command, and lets
you print any variable, and arithmetic expressions (makes a nice calculator
if you know C math). Any of the following are valid and sometimes useful:

print ch                         (fast way to see if ch is NULL; prints 0 if it is)
print *ch                        (prints the contents of ch, rather than the pointer address)
print ch->player.name            (same as GET_NAME(ch))
print world[ch->in_room].number  (vnum of the room the char is in)
etc..

Note that you can't use macros (all those handy pseudo functions like
GET_NAME and GET_MAX_HIT), so you'll have to look up the full structure
path of variables you need.

Type "list" to see the source before and after the line you're currently
looking at. There are other list options but I'm unfamiliar with them.

Hmm, that's all I can think of for basics. Usually when you get a crash,
you'll see the line where the crash occurred, and printing all the
variables on that line will come up with at least one null pointer, which
means you probably forgot a check for null. I think I'm getting into some
of the topics in the attached message so I'll move on to that now :)

Sam

- I am having a lot of crashes recently, and anything you can tell me in
- how to track them down with gdb would be helpful. One problem I am
- encountering with gdb a lot is that it doesn't know where it crashed
- (just gives empty brackets) and gives a ?? for the files. Any idea what
- this could be? I have no idea how to use gdb really effectively, and
- trying to track down these crash bugs is a pain in the butt. Or, if you
- can just point me in the direction of some manuals or something that
- would be useful to learn from, I would be most appreciative. Tanks man.
There's only a couple of commands I use in gdb, though with some patience
they can be very powerful. The only commands I've ever used are:

run                well, duh :P
print <variable>   also duh, tho it does more than you might think
list               shows you the source code in context
break <function>   set a breakpoint at a function
clear <function>   remove a breakpoint
step               execute one line of code
cont               continue running after a break or ctrl-c

I've run into those nasty problems you mentioned quite a few times. The
cause is a memory problem, usually with pointers. I think the most common
cause is pointers to nonexistent memory. If you free a structure, or a
string or something, the pointer isn't set to NULL for you, so you may have
code that checks for a NULL pointer and thinks the pointer is ok since it's
not NULL. You should make sure you always set pointers to NULL after
freeing them.

Ok, now for the hard part. If I remember right, this was a problem with
medit, right? If so, then you can probably duplicate it by using a specific
sequence of actions. That makes things much easier. What you'll have to do
is pick a function to "break" at. The ideal place to break is immediately
before the crash. If I remember right, the crash was when you saved mobs,
so you might be able to "break mobs_to_file". Try that one out first. When
you medit save, the mud will hang. GDB will either give you segfault info,
or it will be stopped at the beginning of mobs_to_file. If it segfaulted,
pick an earlier function, like copy_mobile, or even do_medit.

When you hit a breakpoint, print the variables that are passed to the
function to make sure they look ok. Note that printing the contents of
pointers is possible with a little playing around. For example, if you
"print ch", you get a hex number that shows you the memory location where
ch is at. It's a little helpful, but try "print *ch" and you'll notice that
it prints the contents of the ch structure, which is usually more useful.
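Before going on, here's a quick sketch of the "NULL it after you free it"
habit from a few paragraphs up. The struct and the new_extra()/free_extra()
helpers are made up for illustration; they are not CircleMUD code. The idea
is just to free through the address of the pointer, so the caller's copy
gets reset and a later NULL check means something.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a mud structure; not a real CircleMUD type. */
struct extra_data {
  char *keyword;
  char *description;
};

/* Copy a string onto the heap. */
static char *copy_string(const char *src)
{
  char *dst = malloc(strlen(src) + 1);
  if (dst)
    strcpy(dst, src);
  return dst;
}

/* Allocate a structure with its own copies of both strings. */
static struct extra_data *new_extra(const char *keyword, const char *description)
{
  struct extra_data *e = calloc(1, sizeof(*e));
  if (!e)
    return NULL;
  e->keyword = copy_string(keyword);
  e->description = copy_string(description);
  return e;
}

/*
 * Free the structure and NULL the caller's pointer, so a later
 * "if (ptr)" check really tells you whether the memory is still there.
 * Takes the address of the pointer so it can be reset.
 */
static void free_extra(struct extra_data **pp)
{
  if (pp == NULL || *pp == NULL)
    return;               /* nothing to free, and safe to call twice */
  free((*pp)->keyword);   /* free(NULL) is harmless if a copy failed */
  free((*pp)->description);
  free(*pp);
  *pp = NULL;             /* the important part */
}
```

Calling free_extra(&ptr) leaves ptr at 0, so "print ptr" in gdb shows
right away that it's gone. Note this only resets the one pointer you pass
in; any other copies of it elsewhere in the code are still dangling.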
"print ch->player" will give you the name of the person who entered the command you're looking at, and some other info. If you get a "no ch in this context" it's because the ch variable wasn't passed to the function you're currently looking at. Ok so now you're ready to start stepping. When GDB hit your breakpoint, it showed you the first line of executable code in your function, which will sometimes be in your variable declarations if you initialized any variables (ex: int i = 0). As you're stepping through lines of code, you'll see one line at a time. Note that the line you see hasn't been run yet. It's actually the _next_ line to be executed. So if the line is "a = b + c;", printing a will show you what a was before this line, not the sum of b and c. If you have an idea of where the crash is occurring, you can keep stepping till you get to that part of the code (tip: pressing return will repeat the last GDB command, so you can type step once, then keep pressing return to step quickly). If you have no idea where the problem is, the quick and dirty way to find your crash is to keep pressing return rapidly (don't hold the eturn key or you'll probably miss it). When you get the seg fault, you can't step any more, so it should be obvious when that happens. Now that you've found the exact line where you get the crash, you should start the mud over and step more slowly this time. What I've found that works really well to save time is to create a dummy function. THis one will work just fine: void dummy(void){} Put that somewhere in the file you're working on. Then, right before the crash, put a call to dummy in the code (ex: "dummy();"). Then set your breakpoint at dummy, andwhen you hit the breakpoint, step once to get back to the crashing code. Now you're in total control. You should be looking at the exact line that gave you the crash last time. Print *every* variable on this line. Chances are one of them will be a pointer to an unaccessable memory location. 
For example, printing ch->player.name may give you an error. If it does,
work your way back and print ch->player to make sure that one's valid, and
if it isn't, try printing ch. Somewhere in there you're going to have an
invalid pointer. Once you know which one it is, it's up to you to figure
out why it's invalid. You may have to move dummy() up higher in the code
and step slowly, checking your pointer all the way to see where it changes
from valid to invalid. You may just need to NULL a free'd pointer, or you
may have to add a check for a NULL pointer, or you may have screwed up a
loop. I've done all that and more :)

Well, that's it in a nutshell. There's a lot more to GDB that I haven't
even begun to learn, but if you get comfortable with print and stepping you
can fix just about any bug. I spent hours on the above procedure trying to
get my ascii object and mail saving working right, but it could have taken
weeks without gdb. The only other suggestion I have is to check out the
online gdb help. It's not very helpful for learning, but you can see what
commands are available and play around with them to see if you can find any
new tools.

+-----------------------------------------------------------+
| Ensure that you have read the CircleMUD Mailing List FAQ: |
|   http://cspo.queensu.ca/~fletcher/Circle/list_faq.html   |
+-----------------------------------------------------------+
This archive was generated by hypermail 2b30 : 12/18/00 PST