Podcast 020

Revision #16, 9/3/2008 10:17 PM
80.101.130.8: "Fixed typos"
Tags: (None)

Ads, intro

[1:00]

Spolsky: Today is the day that we did not launch, although we planned to. But then... We'll wait for another week.

Atwood: Yeah, well, the good news on that is that we did actually figure out what that problem was.    

Spolsky: Oh, oh, I want to hear, I want to hear, I want to hear.

Atwood: So, it was a third-party library. Indirectly, I mean. It's the third-party library, and our particular use of it. It was log4net.

Spolsky: Oh!

Atwood: We were logging in such a way that, during the log call, it was triggering another log call. Which is normally okay, but with the load that we have, eventually they would happen so close together that... there's also a lock. So, there's two locks going on there. There's a lock on, like, disposing of the database stuff that's going on. Then there's a lock on actually writing to a file...

Spolsky: Hm!

Atwood: And... they happen in the opposite order, so it's like a classic deadlock, right. So, you release the lock on the database, then you release the lock on the file. And then the other call was doing it in the other order. And they were happening so fast that... it was deadlocking eventually. And it was one of those things that would happen... like, it was very [intermittent [??]], right.
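
[Transcript note] The opposite-order locking described above can be illustrated with a minimal Python sketch. The lock names `db_lock` and `file_lock` and both demo functions are hypothetical stand-ins, not log4net's actual internals, and the sketch models two threads acquiring the locks in opposite order. A timeout on the second lock stands in for the hang so that the demo terminates.

```python
import threading

# Hypothetical locks standing in for "database cleanup" and "log file write".
db_lock = threading.Lock()
file_lock = threading.Lock()


def opposite_order_demo(timeout=0.2):
    """Two threads take the same two locks in opposite order.

    Returns {thread_name: True/False}, where the value says whether the
    thread managed to get its second lock before the timeout.
    """
    both_hold_first = threading.Barrier(2)    # each thread holds one lock
    both_tried_second = threading.Barrier(2)  # each thread finished its attempt
    results = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()
            # In real code this would block forever; the timeout makes
            # the deadlock observable instead of hanging the demo.
            got_second = second.acquire(timeout=timeout)
            if got_second:
                second.release()
            results[name] = got_second
            # Keep holding `first` until the other thread has also tried.
            both_tried_second.wait()

    threads = [
        threading.Thread(target=worker, args=("a", db_lock, file_lock)),
        threading.Thread(target=worker, args=("b", file_lock, db_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def same_order_demo(timeout=0.2):
    """Both threads take db_lock first, then file_lock: no deadlock."""
    results = {}

    def worker(name):
        with db_lock:
            got_second = file_lock.acquire(timeout=timeout)
            if got_second:
                file_lock.release()
            results[name] = got_second

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print("opposite order:", opposite_order_demo())
    print("same order:    ", same_order_demo())
```

In the first demo each thread ends up holding the lock the other one needs, which is the situation the stuck logging threads were in. Agreeing on a single acquisition order, as in the second demo, is one common way to avoid it.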

Spolsky: Right.

Atwood: So we had to dust off Win Debug.

Spolsky: How on Earth do you find things like that?

Atwood: Well, you bust out Win Debug. One nice feature in Windows 2008, and I think this is in Vista as well: in Task Manager, you can right-click a task and take a dump of it.

Spolsky: Yeah!

Atwood: Like right there.

Spolsky: Aha.

Atwood: So we took a dump of the W3Service process, and...

Spolsky: [Then [??]] take a dump.

Atwood: Yeah, I know, any time you do this it's like... it's the territory for jokes. It's just...

Spolsky: [giggles]

Atwood: [laughs]. And then we loaded up a... Win... Debug.

Spolsky: WinDbg! Yeah.

Atwood: ...and then there are some .NET managed extensions you can sort of load. You need, like, a cheat sheet to figure out what the commands are. And then you load the dump, and you load the managed tools. And then you can sort of just investigate all of the threads. You can say, "Show me all the managed threads." And then say, "Show me what the call stack was for that thread." And what we saw was tons and tons of threads that were all going, "Hey, I would like to log something..." And it was like, "Hmmmm... [laughs]... Interesting!" Right, you have like 80 threads that are all trying to write something to the log. So right then we kind of knew where the problem was.

And then somebody on Twitter actually volunteered to help us diagnose the dump. So I put it up on our server, and he nailed it... he had a great description of it, like line by line, blow by blow, of exactly what was happening. I mean, I'm competent enough to sort of figure out roughly what was going on, but he really knew this stuff and really helped us out, and I do appreciate that.

[3:33]

Spolsky: That's really awesome.

Atwood: So yeah. We ripped out all the logging. I was never a huge fan of logging. I mean, I guess there's a couple of philosophies on this. Like, what is your philosophy with logging? As you sit down to write a function, would you add logging to it? I mean, what's your philosophy? I'm curious. [[??]]

Spolsky: No, I never do... But, you know, I don't think I've ever worked on code that is sort of [operational] in the same way.

Atwood: ah hm.

Spolsky: Because we definitely put a lot more... oh, [you know], I did; at Juno we used to have all kinds of logging.
The trouble is that... my philosophy has always been that you have a tendency to want to log everything. But then you just get logs that are, you know, a hundred megabytes per user, and you get thirty of them a minute, and it can't possibly be analyzed or stored in any reasonable way. So the next thing you have to do is start culling your logs, or just have different levels of debugging, where in high debug mode everything is logged and in low debug mode nothing is logged. And it's kind of hard to figure out what you really want in a log. I think of the logging that we did at Juno, where people would call with a complaint and you'd try to figure out where the program was crashing. And obviously a log of the crash, that's easy. But then there's some line above the crash which hopefully gives you a lot of information about where it happened. And there's some line you don't see that should have been after that, after the crash, but it never got there because it crashed sometime before there. And essentially what you're doing as you're adding logging is a binary search, right, where you're [sticking in] like, "Well gosh, I got to here and then got to there. But there's an awful lot of code between point A and point B. So let's make a, you know, halfway-from-A-to-B log point of some sort." Then you put that in, and then you eliminate 50 percent of the possible places to look for your crash.
Um, but I've never really been able to...
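
[Transcript note] The "binary search with log points" idea above can be sketched in a few lines of Python. Everything here is illustrative: `find_crash_step` and the `reached` probe are hypothetical names, and the sketch assumes the crash is deterministic and that exactly one step fails.

```python
def find_crash_step(num_steps, reached):
    """Locate the crashing step by halving the search range each time.

    `reached(i)` is a hypothetical probe: it returns True if a log line
    placed just after step i shows up in the log (step i completed),
    and False if the program crashed before getting there.

    Returns (index_of_crashing_step, number_of_probes_used).
    """
    lo, hi = 0, num_steps - 1  # invariant: the crashing step is in [lo, hi]
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2   # the "halfway from A to B" log point
        probes += 1
        if reached(mid):
            lo = mid + 1       # step mid completed, so the crash is later
        else:
            hi = mid           # never got past mid, so the crash is at or before it
    return lo, probes


if __name__ == "__main__":
    # Simulate a program of 1000 steps that crashes inside step 637.
    crash_at = 637
    step, probes = find_crash_step(1000, lambda i: i < crash_at)
    print(f"crash is in step {step}, found with {probes} log points")
```

Each added log point removes about half of the remaining candidates, so a thousand steps need around ten log points rather than a thousand.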

Atwood: I mean, ironically, to troubleshoot this hang, which turned out to be because of logging, we were adding more logging.

Spolsky: [laughs]

Atwood: The joke just writes itself! The joke just writes itself, right...

Spolsky: It does... How many third-party tools are a part of the StackOverflow code base?

Atwood: Well, okay, so... [chuckles] Uh, Dare [pronounces it as the English word "dare"] Obasanjo [pronounces it "oh-bih-san-ho"]... I don't know if I'm pronouncing it correctly.

Spolsky: Okay, "Dare" [pronounces it "daray"]... Obasanjo [pronounces it "oh-bih-san-ja"]... It's "Dare."

Atwood: Is it "Dare"?

Spolsky: Yep.

Atwood: Really... Okay, I didn't know that. Well, I've learned something. But he had a whole blog entry about how, you know, I had chosen to write my own sanitizer, and that was a very deliberate choice for me...

Spolsky: Mm hmm.

Atwood: ...for a number of reasons that I won't get into. But he was very critical of this, because, of course there were bugs in the sanitizer...

Spolsky: Mm hmm.

Atwood: ...which there were going to be. And to me, it's about your velocity; it's not about where you are, it's about where you're going. We're gonna fix that stuff, right? And I'm making the sanitizer public as well, so other people can have a sanitizer that's not ten thousand lines of code, and ridiculous. So there's a philosophy there of building something that's reusable for everyone. But I thought it was ironic, because he was talking about how developers should just pick a third-party library and go with it, and I think it's a balancing act, because we picked this logging library, right, which kind of caused a problem for us. I mean, partially it was the way we were using it, but the way it was logging the files was a design issue in terms of the way it logged for networks.

Spolsky: Right.

Atwood: So I... I think it's a trade-off. I don't think it's always as clear-cut as "you should always pick a library" or "you should never pick a library," right? I think there's always some in-between there. So, for us, I'm definitely a minimalist—I don't like third-party libraries; I feel like we have a giant third-party library called "Windows," called ".NET"... huh... ASP.NET MVC is technically a third-party library. Um, but these are, you know, major vendor stacks. And I do feel like—as much as we talk about open source and stuff—there's a certain level of quality you associate with these major first-party stacks, right, whether it's from Apple or Microsoft or Sun or whoever. That may or may not be true, but hopefully usually is true: that these things are really heavily tested.

Spolsky: There is definitely, yeah, there is definitely... I mean, there's something I've learned over the years, and, you know, I started out with, uh, working on the Excel team, um... The developers on that team had a motto, which was "Find the dependencies and eliminate them." You know, they had their own compiler; they would not use untested libraries from other groups at Microsoft even... Uh...

Atwood: I love that they had their own compiler. That is so hardcore. I can't even, like, I could not even hang out with those guys... right... that hardcore.

[7:50]

[To be transcribed]

[1:02:50]

Atwood: Two things. We have a wiki, for people who can't listen to this, where people can contribute transcriptions of our incredibly boring podcasts, and we thank you very much for that. Although I do have one request for the transcriptionists, and the ironic thing is you're going to transcribe this, which I think is hilarious. When you transcribe, don't write down every time I say 'uh' or pause or 'yeah'. Make me sound awesome; that's my one request for the transcriptionists.

Spolsky: It doesn't have to be word for word. It doesn't necessarily read as well when it's word for word; you can leave [out] 'uhms' and 'uhs'.

Atwood: In fact, leave out whole. If you think it reads better a certain way, just make me say whatever makes the transcriptionists sound the most awesome.

Spolsky: And it's a wiki, go ahead and edit it.

Atwood: People edit anyway, you're right, it's hilarious. I've been reading the [revisions], it's very funny.

Atwood: The other thing is, if you do contribute to the wiki, since our beta has been pushed back a week, this will get you in, the same day, to the StackOverflow beta. If you want to be in, just email me after you've done a little bit of transcription, one minute or whatever you're comfortable with. If you want to get your question answered on the air, send a recording of less than 90 seconds to podcast@stackoverflow.com; we will put it in the queue and hopefully answer it on the next podcast.

Spolsky: Alright, that's it. Thank you very much, see you next week!

Atwood: See you next week.

[1:04:04]

[Outro]
