   Revision #14 - 6/11/2009 3:22 PM     

Podcast 039

[1:15]

Atwood: Oh yes, I love QuickBooks!

Spolsky: He he.

Atwood: Very exciting bit of software, very fun to use.

Spolsky: Yeah, uhh Michael and I were having a little laugh about it today

Atwood: Yeah, it's necessary, I appreciate the necessity of it.  But, I gotta tell ya, I really feel that QuickBooks is a good example of software that just doesn't sort of .. hasn't adapted to sort of the new paradigm that we have for software

Spolsky: Well, neither has accounting so he he he get used to it.  You know what I mean?  Like it's .. the thing about QuickBooks is that it has evolved over many years to be the perfect accounting application.  Or I should say bookkeeping -- bookkeeping and accounting.

Atwood:  Really?  Perfect?  There's nothing?

Spolsky:  It's perfect, yes, it, yes, you'll see, it's perfect, it doesn't gather any data that you don't - yes, it takes you a long time to learn that it's doing things the way they are because you know - you have to have gone through about 5 or 6 years of filling out every possible tax form before you finally appreciate the craziness that's in there?

Atwood:  Right, well no, I appreciate the wisdom you guys are bestowing upon me because you've done this stuff and I do I do understand that it's necessary and I apologize for being very balky about it umm I know it's necessary but I do want to put this to our listeners, I'm kind of curious because Joel maintained and I can see your point on this Joel, to be fair I totally see your point on this that some of the emerging online stuff like I think, uh gosh, I can't really remember the names but there's a couple of really popular web based sort of accounting solutions.

Spolsky: Yeah, like Net Ledger is that one of them?

Atwood: uhhh, Wesabe I think is one of them..

Spolsky: Oh, those are .. Net Suite that was the one I was thinking of Net Suite, there was something called Net Ledger I'm pretty sure and uhh I don't know what Wesabe is but I think those are umm those are like personal accounting systems not suitable for business accounting..

Atwood: But, you can't use those for small business at all?

Spolsky: No, you have issues when you're a small business of of that that are different from .. yeah..

Atwood: Ok, that's fair, but one thing that you had warned about was the tendency for these guys to hold your data hostage which I think particularly as a business would be really dangerous right

Spolsky: Oh yeah yeah yeah, the web based ones definitely did that and I don't even think I'm giving away the pooch if I can say that I think it was Net Suite, ahhhh, you know I should check ahhh definitely some of them, some of them ran for a few years and then raised their price a few times.

Atwood:  Yeah, no, I hear where you're coming from and I think as a small business that's a serious concern, I mean from a personal level it's kind of a concern but .. I mean, I don't know

Spolsky: Even QuickBooks every once in a while requires you to upgrade, um, you know because something changes in tax laws or something and they just - I don't know why - and they get - people get furious because they're forced to buy the upgrade after you know every three years or something but it's a lot more reasonable than the online ones usually and the other real reason to use QuickBooks is every accountant in the world knows how to use it because it's just become such a standard in the way that Word is.

Atwood: Right, I still think, you know that said, I still think there's an opportunity for someone to be QuickBooks compatible but be a little less painful to use. Because I posted on Twitter to voice my dissatisfaction and you know who, uh, concurred? Richard White of UserVoice, saying "I challenge anyone to find out a software with a worse start-up experience than QuickBooks," and he was using the Mac version. So apparently it's...

Spolsky: It's sort of hard to disentangle the start-up experience of QuickBooks with the fact you are doing business-level bookkeeping for the first time.

Atwood: Um-hmm.

Spolsky: Like if, it's really... once you know business-level bookkeeping, you know, the, the, the trouble is that, that the way you do bookkeeping for a business is.. kind of hard, that double-entry bookkeeping and stuff. Uh, it isn't for all our listeners.

Atwood: Uh, OK. That's enough of that.

Spolsky: We are going to leave all that.

Atwood: It is reality, though. I mean, if you are to be an entrepreneur, then this is something that you have to do unfortunately, believe me.

Spolsky: You do.

Atwood: I'm not happy about it but you have to do it

Spolsky: It's not that hard. ... You see, usually what they recommend and this is still what I'd recommend to a random person: um, get, wait, when you start your business, hire your accountant, the guy who is going to do your taxes at the end of the year, go to his office, have him spend an hour setting up QuickBooks for you and showing you how to enter all the things that happen and how to enter them. And just have him, you know, check over every, you know, check it over every once in a while. Uh, because you, you, you'll be presented with -- it doesn't add up to that much time and true to the history of Fog Creek, you know, I do not know ANYTHING about double-entry bookkeeping or anything like that, and I read one book about how double-entry bookkeeping should work, uh, which is very helpful. And I read the manual -- does QuickBooks still come with a printed manual?

Atwood: Uh... I bought it online, so... no.

Spolsky: Oooh. See now the printed book that used to come with QuickBooks was quite good in that it wasn't just teaching you how to use the application, it was actually teaching you how to track money for your small company. Which is, um, above and beyond what you'd expect from a user manual that was pretty well written in the sense that it tells you what you need to know and doesn't confuse you with the things that you don't need to know.

[6:15]

Atwood: Right. The good news is that, from the business viability standpoint, I mean, one of my goals is to make enough from advertising that I could, you know, pay other programmers each.. and ideally myself would be nice as well.

Spolsky: Yeah.

Atwood: And that's doing really well. I haven't given you guys the latest data but that's looking pretty good. So..

Spolsky: Awesome.

Atwood: ..that's encouraging..

Spolsky: Woo-hoo.

Atwood: 'cos I want this thing to be sustaining. And, the reason I want this to be sustaining, I'm going to segue into my next topic, is we are continuing to build out new parts of the site and we wanted to continue to make it better.. and, you know, more complete.

Spolsky: Yeah.

Atwood: And to that end, now that Jarrod is full-time and I brought Geoff on as well, we were able to complete two major features last week.

Spolsky: I saw them.

Atwood: Yeah, so it's pretty exciting. So one is question bounty, that is actually live and working now. Although we did actually defer... so there's something that has to happen after seven days because there's a timeout for the bounty, we haven't actually done that seven day part yet [laughs] 'cause it, we figured out that we have seven days to get this working.

Spolsky: [laughs] I don't know how many times I, I have pulled that stunt in my time.

Atwood: Oh, that's funny. But yeah, Jarrod is going to finish obviously before the seven day timeline we are going to finish the end state. But the, the actual core part of it is working. If you put up a bounty on a question, you put up some of your own reputation, it has to be at least two days old... and we've seen it work. I have actually, we have a little admin page that shows, uh, closed bounties.
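
[Editor's sketch: the bounty rules described so far could be modelled roughly as below. Only the two-day minimum question age and the seven-day timeout come from the conversation; the `Bounty` class, `start_bounty`, and their fields are hypothetical names, and what happens when a bounty expires is left out because it is not described here.]

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Figures taken from the conversation; everything else here is hypothetical.
MIN_QUESTION_AGE = timedelta(days=2)   # question must be at least two days old
BOUNTY_LIFETIME = timedelta(days=7)    # the bounty times out after seven days


@dataclass
class Bounty:
    question_id: int
    amount: int            # reputation staked by the asker
    started_at: datetime

    def expires_at(self) -> datetime:
        return self.started_at + BOUNTY_LIFETIME

    def is_open(self, now: datetime) -> bool:
        return now < self.expires_at()


def start_bounty(question_id, asked_at, asker_rep, amount, now):
    """Return (bounty, remaining reputation), or raise ValueError if not allowed."""
    if now - asked_at < MIN_QUESTION_AGE:
        raise ValueError("question must be at least two days old")
    if amount <= 0 or amount > asker_rep:
        raise ValueError("asker must stake reputation they actually have")
    return Bounty(question_id, amount, now), asker_rep - amount
```

[Staking the reputation up front is what makes the later "tipping" and "gambling" comparisons apt: the asker gives it up when the bounty starts, before knowing whether a good answer will arrive.]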

Spolsky: Jarrod, I would say Jarrod is practically the definition of get things done.

Atwood: Oh, Jarrod is great.

Spolsky: Like there's no risk whatsoever that it isn't going to get done.

Atwood: Oh, no no no. These guys are great, you know, these are, I have talked about this before, but these are hand-picked people I have worked with before so, you know, I have total confidence in their abilities. But bounty is working.

Spolsky: Bounty's working?

Atwood: We've actually seen people get really good answers from it. I mean, nothing works every time. It's a bit of a gamble, I think people are upset a little bit because like they'll go "How can you GUARANTEE we are going to get an answer?" or "How can you GUARANTEE...", I mean, there's no guarantee, when you're born there's not like a form that says "OK, we will guarantee x". I mean this is an illusion, I mean, you don't know what you're going to get. But the good news is...

Spolsky: Yeah.

Atwood: It definitely increases the interest in your question, and... I see many of them work. Like, really work.

Spolsky: Someone placed a bounty and they get a good answer

Atwood: Yes!

[8:20]

[Garbled]

Spolsky: I like it not being a guarantee because if you know that you're asking a question... like if a lot of times, there are these questions that are like how do I do... the following, and it's just not possible. Right? Like let's say I wanna make -- here's something I could post a question for. I wanna make it so that Chrome, you know, the new Google Web browser, does not animate animated gifs. 'Cos it gives me a frigging headache. And, and, that's all I want. And I don't - I don't even need ad - I'll look at as many ads as you want as long as they don't flash. Umm, but there's no way to do that. There's just not - they did not expose the functionality to turn off the animated gif-ness.

Atwood: Well there's ...

Spolsky: I guess you have the source code, right? So you could ... edit the source code. But I just want a -- I want a dialog box. Check off, it'll... turn off animated gifs. And they don't have it. So, if somebody's asking a question like that, and the answer is, "You can't do that. There is no way to do that, I'm sorry, tough it out," they're probably not gonna place a bounty, because there's a very good [laughing] risk that they're not gonna get an answer, and they're gonna lose their money at that point.

Atwood: Yeah, I have seen some like that. I've also seen some like, "I want to do this really complicated thing. How should I do it?" You know, and it's kind of like, "It's complicated! I mean, what do you want us to do, do it for you?" 

Spolsky: Right, right, right.

Atwood: So, it varies. Obviously.

Spolsky: Is there like a continuum from Stack Overflow to, like, those rent-a-guru sites that just write code for you? Or is there... I think there's a big enough gap in the middle. 

Atwood: Yeah, there could be. But there's definitely a gambling aspect to it. Now that I think about it. People have also complained that it theoretically turns reputation into a form of currency. I don't entirely agree with that, because --

Spolsky: [gasps]

Atwood: Because I feel like it's something you earn, 'cos, you know, you had that whole rant about --

Spolsky: Yeah.

Atwood: About how, "Don't pay people!" But this doesn't, this doesn't feel like payment to me. This feels like --

Spolsky: Tipping.

Atwood: -- something nice that you're doing... yes, tipping. Thank you. Perfect. And a little bit like gambling, honestly. [Laughing] 'Cos you're saying, Ooh, put 150 rep on the line, spin it around, and see what happens. But you know, gambling is fun, and there's definitely a game-like aspect to Stack Overflow and part of it is intentional, I mean you know we don't want to go too far with it because it gets ridiculous, but I'm comfortable with that.

Spolsky: So this is -- I was, ah, randomly, uh, yesterday sitting in the office -- sorry, the day before yesterday -- sitting in the offices of Google in Munich, um, with a bunch of developers there, and I was like "So how many of you have ever heard of Stack Overflow?" And it was all of them.

Atwood: Wow.

Spolsky: I mean people use it now. I just thought I'd mention that at random.

Atwood: No, that's great. Speaking of which, let's talk about it. How was your European vacation?

Spolsky: Well it wasn't a vacation... 

Atwood: I'm just going to pretend like it's the movie, because that's more exciting...

Spolsky: European Vacation

Atwood: Yes.

Spolsky: It was more like The Bourne Identity. I was running around European capitals, first-class lounges, taxis... oh my God, they go sooo fast on the road in Germany. When they picked me up from the airport, the speedometer was at 190. How much is that in dollars? Hehe, how much is that in dollars? 

[Laughter]

Spolsky: 190 kilometers per hour in miles per hour... 118. Oh  my God. It was so fast - and it was like a Mercedes limo, too, and it was passing all the other cars on the road, and you didn't even feel it moving. Practically. Because it was -- [shudders]. Anyway. That was cool. That was the first cool thing. But the conference was not so good. You know if you're ever involved at all in any kind of conferences, you don't want to ever be on a panel or in any way going to any kind of conference that has panels, or... Nothing with panels. Panels: bad.

Atwood: Well, they do those at Mix, and I think if you don't have too many of them they're OK. If everything's a panel you're in trouble. I agree with that, but... 

Spolsky: Yeah.

Atwood: A handful of panels, not too bad.

Spolsky: It depends how many people. If you and I were on a panel, it would be fun. But if there were four other random people there, also talking, you just -- you don't get enough airtime, so it's very hard to have -- maybe 3 people on a panel is okay.

Atwood: You also need a good moderator. Which, I'll tell you is a real skill. Moderating people on a panel. You have to have someone in charge that knows what they're doing. Otherwise it's gonna go south.

Spolsky: It depends also who's on it. I was on a software p- this is not a software conference, this is Digital Life Design, so this is just a very generic conference about everything, hosted by Burda which is a big publishing company over there in Germany. And, um, my panel was called "Software", and the -- the people on it, uh, were a very very wide selection of people that really did not have -- we didn't have that much in common, I guess, and, um, and so the, uh, moderator tried very hard, Marissa Mayer, from Google, tried very hard to come up with interesting topics that would be interesting to anybody --

Atwood: Mm-hm.

Spolsky: But the trouble is, everything had to be so vague and so generalised that it was very hard to come up with anything, uh, interesting. 

Atwood: Mm.

Spolsky: And so I could see that this was a conference that was actively trying to take the very very smart people that they had on their panels, and reduce them to saying trite and stupid things. 

Atwood: [Laughing] You're not just saying that because that's what happened to you, here?

Spolsky: No.

Atwood: You're blaming the conference now.

Spolsky: No. Max Levchin was on my panel, you know him? He's kind of cool. He's one of the founders of PayPal in the past. Now he runs a start-up called Slide, which does slideshows on Myspace or something like that. Anyway, uh, [laughing] I would have been bored out of my mind except that Max Lev... Levchin challenged me to sneak into one of my answers the words "the last of the Mohicans". Which was really hard.

Atwood: [Laughing]

Spolsky: I snuck it in only at the last minute. And I told him he had to say "righteous indignation" at some point.

Atwood: Well that's an easy one. Righteous indignation? Come on, who doesn't have that every day?

Spolsky: I know but he had to say those words. He didn't just have to be it. And that was the only way I had to amuse myself and the thing I was speaking on. But it's a shame, right, because if I had been given the equivalent amount of time, like fifteen minutes to talk, whatever my slice of the time was, ten minutes, even, I could have gotten up and said something interesting in ten minutes that that audience would have learned something and taken something home. But... instead I didn't. Anyway.

Atwood: No, that's a good point, it's like a pitfall of a panel-type discussion versus, you know, more free-form.

Spolsky: Yeah.

Atwood: Little "grok" talks, is what I've ... little ten-minute sessions are really what I refer to as "grok" talks where somebody gets up and talks on some topic for a very narrow amount of time.

Spolsky: If they prepare, those can be awesome. We did, uh, we did this thing at The Business Of Software last year called Pecha-Kucha which is this Japanese thing -- you put up some PowerPoint slides, and you have to come prepared with 20 PowerPoint slides, and they advance for you automatically every twenty seconds. You don't get to advance them yourselves.

Atwood: [Laughing] That's awesome. I remember reading about that. 

Spolsky: Six minutes and forty seconds and you have to be so well prepared to have these slides handy at the right time, and we had a contest, we had, I dunno, six or eight people doing these, and the winner was Alexis Ohanian, the guy who draws the cute aliens on Reddit, and he had prepared his so well, that like the slide changed at like the perfect time, and provided the perfect punchline for what he had just said, and it was just an awesome six minutes and forty seconds. And, and most of the other pecha-kuchas were pretty good too. People just prepared a lot more and they tried to... tried to distil their idea down to six minutes and forty seconds. Uh, and make it clear and punchy. It really worked well, and the ones that didn't prepare as well or didn't have something as interesting to say, uh, fortunately were mercifully short. [Laughing] You only had to wait a couple of minutes before the next person was all done.

Atwood: Well, you know, Geoff Dalgas, the adjunct member of the team, recently gave a presentation on Stack Overflow in Corvallis, which is where he's from, to a user group...

Spolsky: Oh, cool.

Atwood: And the key piece of advice I gave him on presentation was like "Always end early. Don't ever ever go over. If you do anything else --"

Spolsky: Yeah.

Atwood: "-- end on time. Ideally end early. Because nobody says, at the end of a presentation, 'Wow! I wish that had been much, much longer.'" I mean, right? So, to me, that's the key piece of advice. It sounds like what you're describing is where you've taken that, you've made it part of a ruleset like a [garbled]

Spolsky: It depends, I mean there are people that have excellent one-hour presentation. Seth Godin, for example. There's a lot of people who will do a fantastic hour. Ah. It's not that you want it to be longer, but if you did ask them to do it in half an hour, uh, you'd miss out a lot. You just wouldn't learn as much. 

Atwood: Right. But it takes a lot of skill to pull off those long -- it's just like writing. If you can write something really long, pull it off --

Spolsky: -- It's a lot of preparation --

Atwood: -- extremely good.

Spolsky: You have to write probably twenty pages of writing, to equal an hour of speaking. If you were just to read off of those pages.

Atwood: Well to me it's almost like programming. In all cases try to err on keeping your code, like, small, because large code is just -- gonna have more bugs, more problems, more things that can go wrong with it. So if you keep it short you're going to be doing better. On the whole. So, uh, it's a good generalised piece of advice to give people on making presentations I think. Stay short, stay small, stay punchy.

Spolsky: So if you're doing an hour then you wanna, and you know you're gonna need about 18-20 pages for that hour, you wanna write forty pages and you wanna cut, and delete, and consolidate, 'till you've got it down to about 18 pages. And then that's going to be an awesome speech. Hey, ah, speaking about conferences and such, is that all you have to say about the Mix, uh, thing? 

Atwood: Er, not yet, we're in discussions to be, uh.

Spolsky: It's not official.

Atwood: To be part of Mix in some way, but yeah. We'll talk about that more, later I think, on [garbled]. But I did want to get to the next feature, because we actually had two feature roll-outs

Spolsky: You had the bounty.

Atwood: The other ---

Spolsky: And the news.

Atwood: The bounty is one. The other one is, um, one thing that people complained about, and I empathise with too, is that you can't really tell when people have replied to you on Stack Overflow. There's nothing poking you and saying "Hey look! This guy answered your question!" Or "These people commented on your stuff!" So, now we do. There's a little envelope icon. It's sort of cribbed from Reddit actually, that lights up next to your name, and actually it's lighting up --

Spolsky: It's the sluttiest envelope icon in the world. Cribbed from Reddit. It's probably a font of a font somewhere. It's just everywhere, that little envelope icon. 

Atwood: Yes. In fact, mine was lit up, and I just clicked on it, and I had a response two hours ago. So, uh, it's common.

Spolsky: This is awesome, you can give it like a range of dates...

Atwood: Yeah. Yeah! It's fun. It's something that Geoff worked on, that was a major thing...

Spolsky: Woah! 

Atwood: This is going to fold into our email eventually. For people that want email notifications of changes.

Spolsky: What if I go all the way back to the day I was born?

Atwood: You can't.

[Laughter]

[18:42]

 

Atwood: Another thing I can actually talk about is we have a bit of a [garbled], because the way we chose to store what we call "posts", which is questions and answers, and then revisions-- so we have two tables, we have posts and revisions.  For viewing a post, the revisions we usually care about are the first revision and the current revision.  So, there's basically pointers in the post table to records in the revision table.  And this sounds fine on paper.  It has a few downsides, in that for example if you retag something, due to the way we're storing the data, we have to store duplicates of the post and everything else, so it wasn't optimal from that perspective.  What turns out to be the big killer with this approach is just the fact that we're joining, all the time.  Like, in order to talk about a question, I need to go get the current revision, right?  In any case.  'Cause I need to figure out who the author is, and stuff like that, and that's all stored in the revision, because there could be ten revisions to a question.  You could have a revision, I could have a revision, Jon Skeet could have a revision.  So, it doesn't sound like much, this is like relational databases in a nutshell, right?  Just go do a join.  But it turns out these joins are unbelievably expensive.  I mean, if you want to do things --

Spolsky: Really?  Is it just not indexed right?

Atwood: No, believe me, it's indexed --

Spolsky: Oh, it's those memo fields, that's why.

Atwood: Uh, well, that's right.  There's some large strings attached to revisions.

Spolsky: Yeah.  Those always take longer, because they're not in line with the rest of the table, they're in some big blob storage place, and even the APIs never get them out directly, they always read them out like 64kb at a time.

Atwood: Yeah.  Right.  No, there's definitely some semi-large fields there, depending on the size of the post.  They're variable-sized fields, of course, by definition.  But we find that even when you're dealing with just regular tables, joins are not free, by any stretch of the imagination.  Every time that you're doing a join, that's a very real cost to the query.  So, we're going to have to do a massive refactoring to fix this.  We're going to move a lot of data up into the post table, so that like 99% of the time when we're talking about the current revision of a question or answer, we'll just have to look at that one record in posts.
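
[Editor's sketch: the two-table layout and the planned refactoring can be illustrated with an in-memory SQLite database. This is not the actual Stack Overflow schema; every table and column name below, and the `add_revision` helper, is an assumption made for the example. It shows the read that needs a join, and the same read once the current revision's fields are copied up into `posts`.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE revisions (
    id      INTEGER PRIMARY KEY,
    post_id INTEGER NOT NULL,
    author  TEXT NOT NULL,
    body    TEXT NOT NULL
);
CREATE TABLE posts (
    id                  INTEGER PRIMARY KEY,
    first_revision_id   INTEGER REFERENCES revisions(id),
    current_revision_id INTEGER REFERENCES revisions(id),
    -- Denormalized copies of the current revision (the planned refactoring):
    current_author      TEXT,
    current_body        TEXT
);
""")


def add_revision(post_id, author, body):
    """Store a new revision and copy its fields up into the posts row."""
    cur = conn.execute(
        "INSERT INTO revisions (post_id, author, body) VALUES (?, ?, ?)",
        (post_id, author, body))
    rev_id = cur.lastrowid
    conn.execute(
        """UPDATE posts
           SET first_revision_id = COALESCE(first_revision_id, ?),
               current_revision_id = ?, current_author = ?, current_body = ?
           WHERE id = ?""",
        (rev_id, rev_id, author, body, post_id))
    return rev_id


conn.execute("INSERT INTO posts (id) VALUES (1)")
add_revision(1, "alice", "original question text")
add_revision(1, "bob", "edited question text")

# Before: every read of a post joins to its current revision.
joined = conn.execute(
    """SELECT r.author, r.body FROM posts p
       JOIN revisions r ON r.id = p.current_revision_id
       WHERE p.id = ?""", (1,)).fetchone()

# After: the same read touches only the posts row.
flat = conn.execute(
    "SELECT current_author, current_body FROM posts WHERE id = ?",
    (1,)).fetchone()

print(joined, flat)
```

[The cost moves to write time: every new revision now also updates the `posts` row, which is usually a good trade when reads far outnumber edits.]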

Spolsky: This brings up sort of a more general problem, that probably a lot of people have been thinking about.  Because if you look at those...  What am I talking about here?  A lot of people have developed these libraries that are less than relational databases for use on very very large, highly-scalable websites.  So, for example, I think Google's thing is called BigTable.  A lot of times they just have some big, gigantic, ultra-super-duper scalable name-value pair storage.  Think of something like Berkeley DB, where, I can store things for you, I can do it pretty fast, but I'm not going to give you full relational capabilities, and in particular you don't get joins.
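
[Editor's sketch: a plain Python dict can stand in for a name-value store to show what "you don't get joins" means in practice. The `put`, `get`, and `post_with_author` helpers and the key naming scheme are made up for this example and do not correspond to any particular product's API.]

```python
import json

# A plain dict stands in for a key-value store: get and put by key, nothing else.
store = {}


def put(key, value):
    store[key] = json.dumps(value)   # values are opaque blobs to the store


def get(key):
    return json.loads(store[key])


put("user:7", {"name": "alice"})
put("post:1", {"title": "How do I ...?", "author_key": "user:7"})


def post_with_author(post_key):
    """The 'join' is done by hand: one lookup per related record."""
    post = get(post_key)
    author = get(post["author_key"])
    return {"title": post["title"], "author": author["name"]}


print(post_with_author("post:1"))
```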

Atwood: Right.  I can appreciate why they do that, because I've worked with databases for a long long time, going all the way back to db4 or whatever it was called, dBASE IV, and then FoxPro... I've done this for a long time, so I really understand how databases work -- not that I don't make mistakes, believe me, I do -- but it's just interesting to me that we built up the new database server, which is now ready to go, by the way, all this stuff is ready to ship finally, and I was doing some just basic benchmarking, so I loaded the Stack Overflow database on our new database server.  So this thing is basically 50% faster than our old server, roughly.  It has 50% faster CPU, a lot more memory, faster bus, more level 2 cache, all that good stuff, so you'd expect it to do about 50% better, and--

Spolsky: Well, not if it was already maxing out, sort of.

Atwood: But it's not.  I already noticed that it's not even close to maxing out.

Spolsky: Well, wait, that's not what I meant.  What I meant was, you wouldn't expect it to be any faster.  For example, if you had more memory, it would be faster, but only if that means you could keep more things in memory, whereas if you weren't already using up all your memory in the first place, then that additional memory isn't going to help.

Atwood: Right.  You're always playing a game of "where's the bottleneck."  This is, hopefully, as a programmer, this is why I like programmers to mess with hardware, because you learn to play this shell game of like "oh, now the bottleneck's the disk, now the bottleneck's the memory, now the bottleneck's the CPU."  And you're really just trading off bottlenecks, at some level, right?  You can do compression, for example, compression's a classic example of trading CPU bottleneck for memory, which is usually a good deal, based on CPU speed.  But anyway, when I loaded up the Stack Overflow database, and was running some comparison queries, I'd run them on my local machine, I'd run them on the new database server, and I'd run them on our current, live database server.  It's not really a fair comparison, because the live database server is live, although, in all honesty, our load is really -- our CPU graph is almost null.  We've gotten to the point where we're not using CPU at all.  We're running out of memory a little tiny bit, but it shouldn't have a dramatic impact on our results.
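
[Editor's sketch: the compression trade-off can be seen with Python's standard `zlib` module. The payload below is invented and deliberately repetitive, so the ratio is far better than typical data would give; the point is only that bytes are saved by spending CPU time on both ends.]

```python
import time
import zlib

# Repetitive text compresses well; real data will vary.
payload = b"SELECT title, body FROM posts WHERE id = 42;\n" * 20000

start = time.perf_counter()
packed = zlib.compress(payload, 6)
compress_seconds = time.perf_counter() - start

start = time.perf_counter()
unpacked = zlib.decompress(packed)
decompress_seconds = time.perf_counter() - start

print(f"raw bytes:        {len(payload)}")
print(f"compressed bytes: {len(packed)}")
print(f"CPU time spent:   {compress_seconds + decompress_seconds:.4f} s")
```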

[23:32]

Spolsky: And this is... er, and this is serving ten million hits a month... ten million pages a month?

Atwood: Oh yeah. Yeah yeah yeah. It's doing it no problem. What I found was, so this is the interesting... the main difference between these machines is really CPU. Like, my desktop has a 3.5 gigahertz dual-core.

Spolsky: Mm-hm.

Atwood: The new server has 2.5, and then the current database server has 1.8. So you got the continuum of 3.5, 2.5, 1.86, [garbled] CPU, and I've found that for a lot of queries, unless you're just massively exceeding the amount of memory, like you're doing some query that's pulling, like, gigabytes of data...

Spolsky: Mm-hm.

Atwood: Um, it scaled really linearly with CPU. Like, I'm talking like, you go from like 100 milliseconds, to, to, like, 60 milliseconds on the new server, and then on my machine it'd be like 40 milliseconds.

Spolsky: Huh. Wait, why don't we shove your machine in the data center then?

Atwood: [Laughs] Well the shocking thing is people, when you talk about, you know..

Spolsky: How did you... wait, how do you even have a machine that's faster than the one in the data center?

Atwood: Well, it's just CPUs. Because server CPUs are not, you know, they're not like top end, like they go with the conservative...

Spolsky: Are they?

Atwood: ... process model. So Intel... the fastest CPUs aren't necessarily the server-room CPUs. In terms of clock speed. Um. But this is contrary to what you're told. You're told, "Oh, CPU's not that important on a database server. What you really want is lots and lots of memory, and super, super-fast disks."

Spolsky: Well that's 'cos that used to be the problem. 

Atwood: But I think for our database, because, first of all we have 24 gigs of memory now which is just a ton, ah that's six times more than what we have now...

Spolsky: How big is the database? If we have 24 gigs of memory how big is the database?

Atwood: Ah, gosh. I mean, on disk? 

Spolsky: Yeah.

Atwood: As a backup? It's like, 3.5 gig...

Spolsky: OK, so its all in RAM.

Atwood: It's not quite all in RAM any more. One thing I've noticed is that certain queries, when I run 'em, like, they'll be just slow the first time. And the second time...

Spolsky: Because they... they have to page-fault in, eventually. But under most operating conditions, eventually... everything just sits in memory. 

Atwood: Yeah. Right. No, for the most part it's in memory although I am seeing some cases where, uh, we're definitely paging now, in some form. Um, but I just wanted to talk a little bit about that because, a, I want to make sure that new database server is faster, and the good news is it is, it's not fifty percent faster as I'd hoped, but it's more like thirty-three percent faster, like across the board. Like in general, anything you're doing, any query on this new server will be about thirty-three percent faster. But also that, you know, CPU matters. 
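
[Editor's sketch: "percent faster" can mean either less time per query or more queries per second, and the two give different numbers for the same measurement. The helper names below are made up; the 100 ms and 60 ms figures are the rough timings quoted earlier in the conversation, and the 1.5x case is an idealised CPU-bound query, not a measurement.]

```python
def time_saved(old_ms, new_ms):
    """Fraction of the old query time that was eliminated."""
    return (old_ms - new_ms) / old_ms


def speedup(old_ms, new_ms):
    """How many times more queries fit in the same CPU time."""
    return old_ms / new_ms


# Rough timings quoted earlier: 100 ms on the current server, 60 ms on the new one.
print(time_saved(100, 60))   # 0.4
print(speedup(100, 60))      # about 1.67

# A purely CPU-bound query on a CPU with 1.5x the throughput takes 1/1.5 of the time.
ideal_new_ms = 100 / 1.5
print(time_saved(100, ideal_new_ms))  # about 0.333
```

[Read this way, a CPU that does 50 percent more work per second would be expected to cut a CPU-bound query's time by about a third, so the two figures mentioned may not be as far apart as they sound. Which sense was meant in the recording is not clear.]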

Spolsky: Yeah.

Atwood: If your query's in memory, it's all about the CPU...

[talking over each other]

Spolsky: Don't forget that thirty percent... a thirty percent improvement doesn't mean the average person gets their reply thirty percent faster. Because it's possible that several people have all submitted a query at the same time and you're waiting for all of them. So it might mean that you're... you know, a bunch of people have submitted and you gotta wait like five seconds, and now you're only going to have to wait like three seconds instead of five seconds while their things'll finish. And that might mean that the computer gets to idle faster, which may mean that an increasing number of people don't have to wait in line at all.

Atwood: Sure.

Spolsky: Did you ever take a, uh, queuing theory class or study queuing theory in any way?

Atwood: I don't think so.

[26:30]

Spolsky: It's kind of interesting. There's a whole mathematics of, like, people arrive at the bank and they wait in line, and how long does it take them to get served and so forth. And, um, one of the interesting rules of thumb that I remember as being, you know, not a formal result of queuing theory but something that you can kind of keep in mind if you're ever, you know, setting up a restaurant or, like, a, uh, line of people to get coffee at your coffee shop, or whatever, any kind of situation where there's a line serving people. Um, is that there's various measurements that you use in queuing theory. One of them is called "utilisation". And that's the percentage of the time that the people serving the queue are busy. So if you've got tellers at a bank, what percentage of the time are they working, what percentage of the time are they idle, or, you know, telephone operators, people to pick up the phone at [Unknown]. And um, ah, the total amount of, of time that they spend working divided by the total amount of time that they have available, that they're sitting at their desk ready, is called the utilisation. And one of the common results in queuing theory is that at a utilisation of 80 percent all kinds of things start to go wrong. And the... the length of the lines, the average amount of time that somebody waits in lines, tends to get really really bad if the utilisation goes above approximately 80 percent. And so obviously, umm, the number will be a higher percentage if there's a smaller number of... a larger number of small transactions, and it would be a lower percentage if there's a... etc, but the point is that, uh, kinda as a rule of thumb, and this is something you can measure for your own scenario, but as a rule of thumb, um, typically if you have a bunch of bank tellers and they're working 79 percent of the time, most customers will come in and not have to wait in line at all, but if you just get up to like 81 percent utilisation, that they're busy, then the average customer might come in and wait fifteen minutes in line.
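
The utilisation rule of thumb can be checked against the standard M/M/c queuing model (the Erlang C formula). What follows is a minimal sketch, assuming Poisson arrivals, exponentially distributed service times and a hypothetical three-minute average transaction; the function name `erlang_c_wait` and the teller counts are illustrative and do not come from the discussion. In this textbook model the wait grows smoothly but ever more steeply as utilisation approaches 100 percent, rather than jumping at exactly 80 percent, and adding tellers pushes the steep part of the curve further out.

```python
from math import factorial


def erlang_c_wait(servers: int, utilisation: float, service_minutes: float) -> tuple[float, float]:
    """Return (probability of having to wait, mean wait in minutes) for an M/M/c queue."""
    if servers < 1:
        raise ValueError("need at least one server")
    if not 0 < utilisation < 1:
        raise ValueError("utilisation must be strictly between 0 and 1")

    # Offered load a = arrival rate / service rate, so utilisation = a / servers.
    offered = servers * utilisation

    # Erlang C: probability that an arriving customer finds every server busy.
    top = offered ** servers / factorial(servers) / (1 - utilisation)
    bottom = sum(offered ** k / factorial(k) for k in range(servers)) + top
    p_wait = top / bottom

    # Mean time spent queuing (excluding the service itself).
    mean_wait = p_wait * service_minutes / (servers * (1 - utilisation))
    return p_wait, mean_wait


# Hypothetical scenarios: one teller and five tellers, 3-minute transactions.
for tellers in (1, 5):
    for utilisation in (0.50, 0.79, 0.81, 0.95):
        p_wait, mean_wait = erlang_c_wait(tellers, utilisation, service_minutes=3.0)
        print(f"{tellers} teller(s), {utilisation:.0%} busy: "
              f"P(wait) = {p_wait:.2f}, mean wait = {mean_wait:.1f} min")
```

As the transcript says, the exact threshold depends on the scenario, so the numbers printed here describe only the assumed model, not any real bank.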

[pause]

Atwood: Wow.

Spolsky: Because there's a certain point at which... um, you know there's a certain... you can do this sort of mathematical simulation when you say "people are arriving with a certain probability at any given time" and that's something called the Poisson distribution... the probability that people will arrive if everybody arrives independently, and, uh, they tend to distribute according to a certain curve, and so there's a certain probability that a big chunk of them will arrive all at the same time. And, um, the... the, if the utilisation goes too high, you get to the point where, maybe you're serving... imagine a utilisation of 100 percent. You are serving everybody. But, you know, the fourth person to arrive in the day is going to have to wait for the first three people to be done, and the last person in the day is probably just going to get dealt with right away, but, but during the day you're gonna have all kinds of people who arrive and have to wait for hours. To - to wait their turn, basically. 
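
The "mathematical simulation" described here can be sketched in a few lines. This is a minimal illustration, assuming a single teller, Poisson arrivals (exponentially distributed gaps between customers) and exponentially distributed service times; the function name, the three-minute average transaction and the customer count are all assumptions made for the example. It tracks each customer's wait with the Lindley recursion: the next customer waits for whatever backlog is left after the gap until their arrival.

```python
import random


def simulate_mean_wait(utilisation: float,
                       service_minutes: float = 3.0,
                       customers: int = 50_000,
                       seed: int = 1) -> float:
    """Simulate a single-teller queue and return the mean wait in minutes."""
    rng = random.Random(seed)               # fixed seed so runs are repeatable
    arrival_rate = utilisation / service_minutes  # customers per minute

    wait = 0.0         # wait of the current customer (the first one waits 0)
    total_wait = 0.0
    for _ in range(customers):
        total_wait += wait
        service = rng.expovariate(1.0 / service_minutes)  # this customer's service time
        gap = rng.expovariate(arrival_rate)               # time until the next arrival
        # Lindley recursion: backlog left over when the next customer arrives.
        wait = max(0.0, wait + service - gap)
    return total_wait / customers


# Includes the 100 percent case mentioned above, where the backlog never clears.
for utilisation in (0.50, 0.79, 0.81, 0.95, 1.00):
    print(f"{utilisation:.0%} busy: mean wait about "
          f"{simulate_mean_wait(utilisation):.1f} min")
```

Because arrivals are random, bursts of customers show up close together even when the teller is idle much of the time on average; the higher the utilisation, the longer it takes to work off each burst.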

Atwood: Hm.

Spolsky: So, uh...

Atwood: I remember reading [garbled]. This has a lot of operating system implications, 'cos isn't the way the scheduler works in the operating system really critical to overall performance? Like...

Spolsky: Kind of, yeah.

Atwood: I remember even in Vista, like they improved some things about, like, you could get priority scheduling, like if you're doing multimedia playback so you can't get kicked out of the queue, and have stuttering, and, and...

Spolsky: Right, right...

Atwood: This is really the art of divin... designing an operating system, isn't it? This making sure that... it's just playing the shell game with bottlenecks, right? IO, disk, CPU, and making sure that nobody's, like, starving, unless...

Spolsky: Right, right...

Atwood: ... something catastrophic is happening, in which case you're just... you're just hosed.

Spolsky: One thing I've noticed over the years is whenever something's taking too long, I'll switch to another window, open a browser, and try to do something else! So you're kind of punishing the CPU for being slow or, your computer in general. You're actually making it a little bit worse, which is... you're launching other apps to keep you entertained while your computer works on your first process and making it even more overloaded at precisely the moment where it needs some extra CPU power.

Atwood: Well one thing I like about, you know, Vista introduced this new system performance metric -- which is a reasonable set of benchmarks, but what I liked about it that I thought they really got right -- so pay attention, something that Vista actually did correct, Joel, you may want to write this down...

Spolsky: I'm getting my fountain pen.

Atwood: ... it actually does the benchmark, and it, it actually takes the lowest number, is your score. In other words, wherever the bottleneck is, that number determines your entire score.

Spolsky: Mm-hm.

Atwood: So if you have a really slow disk, say you get a one-point-oh on your disk score...

Spolsky: Mm-hm.

Atwood: That's your overall score.

Spolsky: Oh this is the, uh, the...

Atwood: Which makes total sense because that's really how you should look at it, like, "Where is my bottleneck?" That's the thing I need to improve, so it really incentivises you, so... "Wow, one-point-oh on disk. I can have a four-point-three, all my other stuff is at four point three, if I just get rid of that stupid slow disk." Um, and that was really smart and really the correct way, um, to look at performance.
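
The scoring rule described here, where the overall score is simply the lowest subscore, can be sketched as follows. The component names and the function `base_score` are hypothetical, chosen only to mirror the numbers in the example above (1.0 for disk, 4.3 for everything else); this is not the actual Windows implementation.

```python
def base_score(subscores: dict[str, float]) -> tuple[str, float]:
    """Return the bottleneck component and its score (the lowest subscore)."""
    if not subscores:
        raise ValueError("need at least one subscore")
    # The weakest component determines the overall score.
    bottleneck = min(subscores, key=subscores.get)
    return bottleneck, subscores[bottleneck]


# Hypothetical machine matching the example in the conversation.
machine = {"processor": 4.3, "memory": 4.3, "graphics": 4.3, "disk": 1.0}
component, score = base_score(machine)
print(f"Overall score {score} (limited by {component})")

# Replacing the slow disk lifts the whole score to the next-lowest subscore.
machine["disk"] = 4.3
print(f"After upgrade: {base_score(machine)[1]}")
```

Taking the minimum rather than an average is what points the user at the bottleneck: improving any other component leaves the score unchanged.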

Spolsky: Does anybody else... uh, I put Windows 7 on a laptop here, and it just freezes. The whole system just freezes. The mouse won't move, it's just like hard frozen. 

Atwood: Ooh.

Spolsky: Is that the laptop, or is that Windows 7?

Atwood: I haven't -- I have not heard of that. All I've heard is, like, people liking Windows 7. 

Spolsky: Yeah, that's what I heard.

[30:57]

Atwood: [Laughing]

Spolsky: But I don't like it 'cos it makes the whole laptop freeze. But the laptop may just suck, you know. It's a Dell.

Atwood:  Huh. Interesting. No, I dunno. Could be a hardware problem. It's beta software.

Spolsky: I- I mean... yeah.

Atwood: I'm not a big beta operating system guy. I don't really enjoy--

Spolsky: Well it's a laptop that I hardly use for anything but it's the laptop that we use here at the office for demos, when somebody has to put on a demo on the main, uh, on a main stage.

Atwood: Well it'd be nice -- one thing I am looking forward to is I view this as just a polished version of Vista. 'Cos Vista just lacks so much polish, so it sounds like they're going to get the polish right this time, from what I'm hearing, which is nice. 

Spolsky: Yeah.

Atwood: So finally people can get off of XP. I think my concern was that XP is freakin' ancient. I just am -- I deeply am concerned with people who are comfortable running a 2001-era operating system. And granted it's been patched--

Spolsky: It's secure. Er, it's... it's still faster than Windows 7. 

Atwood: Well, it was, ha -- you know what the mem -- you know what the minimum memory requirement was for, for XP, do you remember?

Spolsky: 512?

Atwood: 64. 

Spolsky: Ha! Awesome.

Atwood: Yeah. 

Spolsky: That's not true, I -- that may be the official requirement; you really needed probably 256 or 512.

Atwood: Actually no, I take that back, it was 128k I believe.

Spolsky: Ah, OK. 128. So you probably needed 512 to be really, actually, happy. But let's put it this way. I have this little laptop that I got a year ago, it's a little dinky... it's the Thinkpad X61? It's like the, the super-lightweight, tiny little low-end Lenovo Thinkpad. And it's awesome, I take it with me on a plane, I take it with me everywhere. I love it. It's running XP. Do I really want to put Windows 7 on that, or is that just going to bog that poor thing down?

Atwood:  Uh, supposedly it does better on like, netbooks and stuff. And by the way, when I said kilobytes earlier I meant megabytes. My brain is malfunctioning.

Spolsky: That's OK, we -- hopefully --

Atwood: You guys just mentally translated --

Spolsky: Ah, nobody can keep track of any of that stuff anyway, anymore. Megabytes, giga-, tera-, zettabytes... ah, who cares?

Atwood: Do we have any listener questions this week?

Spolsky: Ah, yeah, I uh, I messed up again and didn't prepare them because I was sort of busy right before th... but let's see. Let's pop something up and see what comes up. [Laughing] It worked pretty well last week. Although I did get an email. I got an email saying -- did you see that email?

Atwood: I did. But I let you read those. I figure you're in charge of this part of the podcast.

Spolsky: OK. Um. Let's see. Ah, there's a -- there's a whole bunch. Here's one that I'm interested... comments on the SOLID discussion.

[33:29]

Atwood: Yeah, let's do that, because there were a lot of -- there's a lot of feedback on that.

Spolsky: Yeah, I wasn't... I didn't really have a... the truth is, let me be completely fair here, uum, I don't -- ah, what's the word I'm looking for -- know anything about anything. [Laughs] Sorry, I really -- I did not do my research on what SOLID is, what it talks about, and I didn't really want to overly criticise any particular principle of object oriented design. Or whatever. I just do often have a feeling when I'm listening to those people, especially based on the examples that they give, which now this week I was listening to Hanselman again and now for two weeks in a row they've brought up the stupid issue about whether a, uh, square is a subclass of a rectangle. But who the hell has classes for squares and rectangles? What kinda application is this, where you have classes for squares and rectangles? And so it made me think, you know I really do feel like, ah, and this is really the only point I wanted to make, is that a lot of the strong object oriented design kind of stuff that you hear about, ah, and that you hear from, is, um, doesn't seem like it's people who are writing a lot of code. So for example, um, there's this business of test-driven development, which, you know, you write the test first, and, and, not necessarily a test in order to provide QA, so to speak, but a test in order to say "I'm about to write some code, and I think it's going to do the following thing", and before I even write the code I write the test, and the test is going to fail because the code isn't written and then when I write the code it's gonna instantly pass because, uh, and that'll... allow me to always have a little unit test for that little piece of code that I wrote. And I thought about, um, some little piece of code that we were thinking of adding to Copilot. And this is just one tiny example and I can give you a million. But Copilot is this remote desktop application.
And in order to make it work really, really well, under very low bandwidth conditions, it uses JPEG compression on the screen, which makes it a lot faster. And, uh, we use really strong JPEG. So you can't quite, I mean, all the text is blurry, but that's OK because it's a good trade-off, you're just trying to do quick tech support over the Internet and with somebody who probably has a really crappy Internet connection...

Atwood: Mm-hm.

Spolsky: ... and if the text is blurry it doesn't really reduce your ability to do tech support. But if the... it takes forever to display the page that does reduce your ability to do tech support over the Internet. So that's a good trade-off. But I sort of thought that, um, it would be cool, I entered a little bug into the bug database here, FogBugz, Fog Creek, umm suggesting that maybe we give the user a switch to turn off the compression or to reduce the compression if they happen to be on a high-bandwidth connection and they'd rather have it show up clearly rather than blurry. And, um, so, ah, we'd probably still use JPEG compression, we'd probably just use like a lower level of JPEG compression. And, um, ah, and I thought about that and I thought about how you've got this screen that's showing basically, effectively real-time video, a real-time screen image from somewhere else over the Internet, and it's got all these JPEG artefacts on it. And I now need to write a function and a little button on the toolbar that's going to reduce the JPEG scaling. So whatever -- think about how much code you have to write, to make a button on the toolbar, and it's just going to change a parameter to the JPEG compression library, from a 37 to a 10, let's say. Right? That's all it's going to do. And so this is, you know, five, ten, twenty lines of code to implement this feature, let's say. But to implement the test, once, you have to somehow create a JPEG that is the same as this other JPEG that you have, but compressed at a different level, using some ... there is no way to actually construct this test in advance of actually running it.
Or if you did it would be extremely hard, I mean it would take a lot of work to write some kind of test that's gonna know what that other machine you're connecting to, which would have to be some kind of simulator machine that generated certain kinds of simulated experiences, and then you'd have to, I guess, get your own JPEG library and hope that it's the same as the JPEG library that we're using, and let it do both kinds of compression and this is just... it would take way more work to write this test than to write the code. And this is I think a classic example of "You Ain't Gonna Need It" which is all this work that you put into that test, and yet this code is only doing ten lines of code. It's going to work. It's not, you know whatever, changing a parameter in some function call from 37 to a 9 is not really gonna fail, you're not really gonna have a bug there, necessarily. So, um...

Atwood: I always got the impression that these unit tests were written more for, like, encryption libraries, and, you know, more like core libraries...

Spolsky: Oh yeah, that's fine, and I'm all for it. When you say it's a core library. Anything that's basically doing manipulation on data directly, there's nothing real-time, there's nothing video, there's nothing GUI, there's nothing Webby, there's nothing HTML-ey... all those kind of things it makes a lot of sense but I mean who works on... that's a very finite number of apps that are like that, and, um, you know, a lot of times it's so much harder to construct the test than to... and I, I think it's OK that people then... I think it's great if you use test-driven development for your encryption library or your compiler or something like that. But when you start to get kinda religious about it because you're listening to the object oriented design gurus writing their books, and you start to try to do this all the time because you feel dirty or smelly... I think they use the word "code smell" to describe what you're doing if you're not doing exactly what they tell you to do [laughing] then you start to feel kinda guilty about that and sort of ... and maybe not for the real reasons. So, so I can definitely imagine going back ... hey, do you listen to Hanselman's podcast? Scott Hanselman?

Atwood: I have a few times.

Spolsky: Well, anybody that listens to... if this is the only podcast you listen to and you want a second podcast, that might be the second one to listen to 'cos that's a really ... I think a lot of developers will really enjoy that. And, um, but the second time he was ... Scott was talking about some person he worked with that was just driven to have 100% code coverage. Or 100% test coverage of all their code. Which, um, which he thought was a fine thing and which I think is probably a real waste of time and to me it sounds almost like, uh, like a mental illness. Right? Like if you were mentally, like compulsive, like obsessive-compulsive behaviour that's causing you to not think, but instead just make sure that you have 100% code coverage, 'cos that's not free, that 100% code coverage and all these tests that you wrote, you don't get that for free, you got that because you decided to spend time doing that instead of something else. And the time that you spent doing that may or may not be something that really adds a lot of value, you know what I mean? Like you may be able to add to the quality of the product... [Question starts] Sorry I started playing the thing too soon.

Atwood: [Laughing] Yeah, I think with Stack Overflow, certainly, I'm pro-testing, I'm pro- anything that gives you good quality product, but I think, again, it's like that shell game. You're playing a shell game, you're balancing resources, and like you said this stuff is not for free, so if you feel like you want to put your effort in the unit-testing basket then by all means do that but I think for me the way we --

Spolsky: No! I disagree! I say, you should pick the things that are most gonna benefit from unit testing or test-driven development, which are different things. And pick the things that are most going to benefit from that, and by all means then go ahead and do it. But there's a lot of other stuff where the bang for the buck you're going to get off of that is nothing compared with the bang for the buck out of other stuff you could be spending that time on.

Atwood: Well I tend to agree and that's where I was going to... I like to do things that result in a better experience of the site, whether it's answering support emails, following up on UserVoice, polishing some feature on Stack Overflow... there's very few cases where I'm like, okay, I'm going to sit down and write a unit test, and this is going to result in a measurably better experience for the average user that comes to Stack Overflow.

[ Unfinished -- 41:07 ]


Last Modified: 1/6/2012 8:50 PM
