A few days ago I was waiting for something or other to get sorted before I could proceed with the current round of bugfixing (the latest project is very nearly done) when I got into a discussion with channelpenguin about the syntax of the for statement in C#. She was complaining that it felt wrong: it didn't really fit with any other statement in C#, what with having semicolons inside brackets. In fact, the for statement was the one I kept having to check the syntax of when I first started using C#, so I could sympathise with her.
The syntax is:
for (initialisers; expression; iterators)
{
    Stuff();
}
for instance:
for (int myInt = 0; myInt < 5; myInt++)
{
    Console.WriteLine(myInt);
}
meaning "Create an integer. Set it to zero. While said integer is less than 5, output it to the console and then increment it by one."
Now, I thought about how you could write the for statement differently, and I couldn't think of one that was any clearer without expanding it out entirely into its equivalent while statement. Which looks like this:
int myInt = 0;
while (myInt < 5)
{
    Console.WriteLine(myInt);
    myInt++;
}
In fact, the more I thought about it, the more it seemed likely that the for statement was just a macro - and at compile time the for statement was translated into the equivalent while statement, and _that_ was compiled - so that the original syntax statement above would become:
{
    initialisers;
    while (expression)
    {
        Stuff();
        iterators;
    }
}
(the braces around the whole of the expanded statement are to make sure that any variables created in the initialisers are local to that code segment)
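As a rough sketch of that expansion (the class and method names here are made up for illustration, and the values are collected into a list rather than printed so the two forms can be compared), the pair might look like this:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, not from the original post.
public static class LoopForms
{
    // The for loop as written in the post, collecting values instead of printing.
    public static List<int> WithFor()
    {
        var seen = new List<int>();
        for (int myInt = 0; myInt < 5; myInt++)
        {
            seen.Add(myInt);
        }
        return seen;
    }

    // The hand-expanded form. The extra braces keep myInt local to the loop.
    public static List<int> WithWhile()
    {
        var seen = new List<int>();
        {
            int myInt = 0;
            while (myInt < 5)
            {
                seen.Add(myInt);
                myInt++;
            }
        }
        // myInt is out of scope here, just as it would be after the for loop.
        return seen;
    }
}
```

Both methods should return the values 0 to 4 in order.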
Fortunately, Visual Studio comes with a neat little disassembler (ILDASM), so I compiled the two above bits of code and they both came out exactly the same, as:
.locals init ([0] int32 myInt)
IL_0000: ldc.i4.0
IL_0001: stloc.0
IL_0002: br.s IL_000e
IL_0004: ldloc.0
IL_0005: call void [mscorlib]System.Console::WriteLine(int32)
IL_000a: ldloc.0
IL_000b: ldc.i4.1
IL_000c: add
IL_000d: stloc.0
IL_000e: ldloc.0
IL_000f: ldc.i4.5
IL_0010: blt.s IL_0004
IL_0012: ret
which gave me a warm glow inside. It's nice to be right about something, especially when you're just stabbing in the dark.
Of course, I'm now curious about .NET's Intermediate Language (which is what that stuff at the end is). So possibly a look into _that_ is in order next.
no subject
Date: 2006-04-06 06:49 pm (UTC)
no subject
Date: 2006-04-06 07:16 pm (UTC)
no subject
Date: 2006-04-06 07:00 pm (UTC)
Does C# have C-style break and continue? If so, I think the translation no longer works, since "continue" in a for loop still runs the iterators before the next test, but in a while loop they will be skipped. (The fix is to translate "continue;" to "{iterators;continue;}".)
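C# does have both break and continue. Here is a small sketch of the suggested fix, with the iterator copied in front of the continue in the while version. The class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class illustrating the "{iterators; continue;}" fix.
public static class ContinueTranslation
{
    public static List<int> WithFor()
    {
        var seen = new List<int>();
        for (int i = 0; i < 5; i++)
        {
            if (i == 3) continue;   // jumps to i++, then re-tests i < 5
            seen.Add(i);
        }
        return seen;
    }

    public static List<int> WithWhile()
    {
        var seen = new List<int>();
        int i = 0;
        while (i < 5)
        {
            // A bare "continue" here would skip i++ and loop forever on 3,
            // so the iterator is repeated immediately before it.
            if (i == 3) { i++; continue; }
            seen.Add(i);
            i++;
        }
        return seen;
    }
}
```

Both versions should skip 3 and collect 0, 1, 2, 4.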
no subject
Date: 2006-04-06 07:15 pm (UTC)
I have to admit that the majority of the time I use foreach rather than for, which is much more intuitive. But sometimes for is necessary.
no subject
Date: 2006-04-06 07:31 pm (UTC)
Also, the reverse translator for the Psion 3a (REVTRAN) had an option to turn p-code back into while loops or for loops.
no subject
Date: 2006-04-06 08:01 pm (UTC)
To cut down on words:
#0,#1,... refer to the local variables 0,1,...
$0,$1,... refer to the first,second,third items in the stack

.locals init ([0] int32 myInt)

# Initialisation: set up counter in #0 at 0.
IL_0000: ldc.i4.0       # Push 0 (int32) onto the stack
IL_0001: stloc.0        # Pop $0 off stack (0) and store in #0
IL_0002: br.s IL_000e   # Jump to TEST

# Loop body: call WriteLine, then increment #0 by 1.
LOOP:
IL_0004: ldloc.0        # Push #0 onto the stack
IL_0005: call void [mscorlib]System.Console::WriteLine(int32)   # Write stuff!
IL_000a: ldloc.0        # Push #0 onto the stack
IL_000b: ldc.i4.1       # Push 1 (int32) onto the stack
IL_000c: add            # Pop two numbers, add, push result
IL_000d: stloc.0        # Store result (top of stack) in #0

# Test for exit condition.
TEST:
IL_000e: ldloc.0        # Push #0 (counter) onto stack
IL_000f: ldc.i4.5       # Push 5 (int32) onto stack
IL_0010: blt.s IL_0004  # If counter ($1) < 5 ($0), jump to LOOP
IL_0012: ret            # Return to caller
no subject
Date: 2006-04-06 09:24 pm (UTC)
In this case it's not quite a macro, for reasons pointed out elsewhere. Unless C# happens to have Perl-style 'continue' blocks, in which case it'd map to a macro that spans the entire for loop and places a 'continue' block at the end.
The reality of the situation is more likely to be that the compiler recognises the syntax of the for (;;) loop independently of the syntax of the while() loop and translates them both to equivalent structures in the compiler's intermediate representation.
The real thing that's abhorrent about the C 'for' construct, from my point of view, is that (syntactically) there are three statements, with entirely different usage semantics, but visually entirely undistinguished from each other except by position. You can remove the code inside the '()' and drop it elsewhere in a program and it'll be a syntactically valid sequence of statements. Same code, entirely different semantics. A sensible syntax would identify them independently in some way.
The only thing that 'for(;;)' has going for it over a plain and simple 'while ()' is that it lets you put that S3 up at the top of the loop body rather than at the end. If it wasn't for that... its days would be numbered.
no subject
Date: 2006-04-06 11:31 pm (UTC)
no subject
Date: 2006-04-07 12:30 am (UTC)
In a while, you have to put S3 (the increment) at the end.
no subject
Date: 2006-04-07 01:34 pm (UTC)
no subject
Date: 2006-04-06 10:17 pm (UTC)
no subject
Date: 2006-04-06 11:33 pm (UTC)
no subject
Date: 2006-04-07 08:27 am (UTC)
(IIRC, continue in a for loop means 'skip to the iterator part' but in a while loop it means 'skip to the test-expression part' - I imagine it compiles to the equivalent of 'goto the closing curly bracket of the loop'... but I hardly ever use continue anyway.)
Interesting to see that for and while loops compile exactly the same in C#, as they should. Fancy a quick test of continue too to see if I'm right?
* Kernighan & Ritchie, "The C Programming Language", the definitive tome on C. Although it is a mere pamphlet by comparison with definitive tomes for other languages.
no subject
Date: 2006-04-09 09:56 pm (UTC)
I can, however, make them behave exactly the same by wrapping the iterator in the "while" in a finally block, like so:
int counter = 0;
while (counter < 5)
{
    try
    {
        if (counter == 3) continue;
        Console.WriteLine(counter);
    }
    finally
    {
        counter++;
    }
}
which doesn't compile to the same code as the equivalent "for" but seems to have the same effect.
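One place where the try/finally version seems to part ways with a real for loop is break: a for loop skips its iterator when you break out, but a finally block still runs. A minimal sketch of that difference, with made-up class and method names:

```csharp
using System;

// Hypothetical helper class comparing "break" in the two loop forms.
public static class FinallyIterator
{
    // A plain for loop: break leaves without running i++.
    public static int ForBreakAtTwo()
    {
        int i;
        for (i = 0; i < 5; i++)
        {
            if (i == 2) break;
        }
        return i;
    }

    // The while/try/finally form: the finally block runs even on break,
    // so the counter is incremented once more on the way out.
    public static int WhileFinallyBreakAtTwo()
    {
        int counter = 0;
        while (counter < 5)
        {
            try
            {
                if (counter == 2) break;
            }
            finally
            {
                counter++;
            }
        }
        return counter;
    }
}
```

ForBreakAtTwo returns 2, while WhileFinallyBreakAtTwo returns 3. That only matters if the counter is read after the loop, or if the body can throw, but it does mean the two forms are not strictly interchangeable.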
no subject
Date: 2006-04-10 08:21 am (UTC)
no subject
Date: 2006-04-10 08:25 am (UTC)
no subject
Date: 2006-04-10 09:05 am (UTC)
I'm not entirely averse to goto. I've seen some hairy deeply-nested loops where a judicious goto to just get the hell out of there would not be as bad as the morass of continues, breaks and flag variables you need to get round it. Mind you, probably better to turn that section into a function and just 'return' from the depths, which would transform a mess into what's widely accepted as perfectly good style - which I think is what I'd tend to do when others might use continue.
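A small sketch of the "turn it into a function and return" idea, using a nested search over a 2D array as a stand-in for the hairy loops. GridSearch and TryFind are invented names for this example.

```csharp
using System;

// Hypothetical helper: the nested loops live in their own method, so
// "return" does the job a goto or a set of flag variables would otherwise do.
public static class GridSearch
{
    // Returns true and the position of the first cell equal to target,
    // or false with (-1, -1) if there is no such cell.
    public static bool TryFind(int[,] grid, int target, out int row, out int col)
    {
        for (int r = 0; r < grid.GetLength(0); r++)
        {
            for (int c = 0; c < grid.GetLength(1); c++)
            {
                if (grid[r, c] == target)
                {
                    row = r;
                    col = c;
                    return true;   // leaves both loops at once
                }
            }
        }
        row = -1;
        col = -1;
        return false;
    }
}
```

The caller gets a single yes/no answer plus the coordinates, and nothing inside the loops needs a break or a flag.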
no subject
Date: 2006-04-10 09:26 am (UTC)
bool passedValidation = true;
try
{
    DoSomeValidation();
}
catch (ValidationException ex)
{
    MessageBox.Show(ex.Message);
    passedValidation = false;
}
and then I can have twisty-turny mazes of validation nested as deep as I like, and jump back to this level instantly (well, with a 0.1 second lag for exception throwing) as soon as I hit something that's invalid.
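For anyone wanting to try the pattern without a form to hand, here is a self-contained sketch. It assumes ValidationException is a custom exception type (defined below), hands the message back to the caller instead of calling MessageBox.Show, and invents the OrderValidator class and its checks purely for illustration.

```csharp
using System;

// Assumed to be a custom exception type; defined here so the sketch compiles.
public class ValidationException : Exception
{
    public ValidationException(string message) : base(message) { }
}

// Hypothetical validator with a couple of nested checks.
public static class OrderValidator
{
    static void CheckName(string name)
    {
        if (string.IsNullOrEmpty(name))
            throw new ValidationException("Name is required.");
    }

    static void CheckQuantity(int quantity)
    {
        if (quantity <= 0)
            throw new ValidationException("Quantity must be positive.");
    }

    // However deep the checks go, the first failure unwinds to the catch below.
    static void DoSomeValidation(string name, int quantity)
    {
        CheckName(name);
        CheckQuantity(quantity);
    }

    public static bool Validate(string name, int quantity, out string error)
    {
        try
        {
            DoSomeValidation(name, quantity);
            error = "";
            return true;
        }
        catch (ValidationException ex)
        {
            error = ex.Message;   // the post shows this in a MessageBox instead
            return false;
        }
    }
}
```

Returning the message rather than displaying it is just a choice for the example; it keeps the validation code free of any UI dependency.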
no subject
Date: 2006-04-10 09:33 am (UTC)
no subject
Date: 2006-04-10 10:34 am (UTC)
Should you be worried about that kind of performance loss then you will want to optimise for it. Personally, I'm not running that kind of app - I suspect I've spent more time typing this reply than I'll ever save in my life through optimising away function calls.
no subject
Date: 2006-04-10 11:03 am (UTC)
So if, say, that code was in the main loop of your Shiny New Brute-Force Sudoku Solver, an order of magnitude improvement would be well worth having. (Although you still have another half-dozen or so orders of magnitude to somehow knock off the computation time to get it complete before you die. Ah well.)
The key, as ever, is recognising the one situation from t'other.
The right sort of optimisation can be an astonishing thing to behold, particularly when the function call you're optimising away is to an API. I once wrote some code to display large 2D datasets on screen. My initial attempt simply handed each data point (suitably scaled) to the drawing API in turn and told it to draw a line to there from the previous point. It took more than 10 minutes to refresh the screen on a smallish test dataset - not good when a key point of the app was to be able to fly around in the data. So I optimised by tracking the highest and lowest point for each horizontal pixel before handing it to the API, and bingo - pretty much instant, even for the larger datasets I was using.
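The min/max-per-pixel trick described above might look something like the following. This is only a sketch under assumptions: the data is a flat array of samples spread evenly across the screen width, and ColumnDecimator and MinMaxPerColumn are names invented here, not the original code.

```csharp
using System;

// Hypothetical helper: reduce a large dataset to one (low, high) pair per
// horizontal pixel, so the drawing API gets "width" calls instead of one
// call per data point.
public static class ColumnDecimator
{
    public static void MinMaxPerColumn(double[] samples, int width,
                                       out double[] lows, out double[] highs)
    {
        lows = new double[width];
        highs = new double[width];
        for (int x = 0; x < width; x++)
        {
            // Columns that receive no samples keep these sentinel values.
            lows[x] = double.PositiveInfinity;
            highs[x] = double.NegativeInfinity;
        }

        for (int i = 0; i < samples.Length; i++)
        {
            // Map the sample index onto a pixel column (long avoids overflow).
            int x = (int)((long)i * width / samples.Length);
            if (samples[i] < lows[x]) lows[x] = samples[i];
            if (samples[i] > highs[x]) highs[x] = samples[i];
        }
    }
}
```

Each column can then be drawn as a single vertical line from its low to its high, which on screen looks the same as drawing every point that fell inside that pixel.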
no subject
Date: 2006-04-10 06:55 pm (UTC)
But that was then, etc.
I know what you mean though, optimising the right things in the right way can dramatically speed things up. I sped up some code a while back by two orders of magnitude by storing trimmed values for text rather than trimming it when it was being compared. Who knew that trimming text took so long?
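A sketch of the general idea, trimming once when the value is stored rather than on every comparison. The TrimmedKey and TrimmedLookup types are invented for illustration and are not the code being described.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper that trims once, at construction time.
public sealed class TrimmedKey
{
    public string Raw { get; private set; }
    public string Trimmed { get; private set; }

    public TrimmedKey(string raw)
    {
        Raw = raw;
        Trimmed = raw.Trim();   // paid once per value, not once per comparison
    }
}

public static class TrimmedLookup
{
    // Counts the keys whose trimmed text equals the trimmed needle.
    public static int CountMatches(List<TrimmedKey> keys, string needle)
    {
        string target = needle.Trim();   // trimmed once, outside the loop
        int count = 0;
        foreach (TrimmedKey key in keys)
        {
            if (string.Equals(key.Trimmed, target, StringComparison.Ordinal))
                count++;
        }
        return count;
    }
}
```

The saving comes from the comparison loop doing no allocation at all: every call to Trim that changes the text creates a new string, which adds up quickly when the comparison runs many times.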
no subject
Date: 2006-04-07 06:06 pm (UTC)
If you compile a for loop "as is", you have to have (and this is in English I'm afraid, as my x86 assembler consists of 'NULL'...) basically [initialisers] [check condition] [do something] [increment] [check condition] [do something] [increment] ... [check condition] [end instruction block], which is functionally the same as what you've just discovered :). The -funroll-loops option (GCC) "funrolls" the loops, so you get [initialisers] [do] [advance] [do] [advance]....[end], which can be a lot faster. Or something like that.
At any rate, for, and while are just GOTO in disguise :).
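To make the unrolling idea concrete, here is a hand-unrolled loop written in C# to match the rest of the post. It is an illustration of the transformation only, not what GCC actually emits, and the Unrolling class is a made-up name. When the number of iterations isn't known to be a multiple of the unroll factor, a short ordinary loop mops up the leftovers.

```csharp
using System;

// Hypothetical illustration of loop unrolling done by hand.
public static class Unrolling
{
    // The ordinary loop: one condition check per element.
    public static long SumRolled(int[] data)
    {
        long total = 0;
        for (int i = 0; i < data.Length; i++)
        {
            total += data[i];
        }
        return total;
    }

    // Unrolled by four: one condition check per four elements.
    public static long SumUnrolledBy4(int[] data)
    {
        long total = 0;
        int i = 0;
        int limit = data.Length - (data.Length % 4);   // largest multiple of 4

        for (; i < limit; i += 4)
        {
            total += data[i];
            total += data[i + 1];
            total += data[i + 2];
            total += data[i + 3];
        }

        // Remainder loop for the last 0 to 3 elements.
        for (; i < data.Length; i++)
        {
            total += data[i];
        }
        return total;
    }
}
```

Both methods should give the same total for any input; the unrolled one trades a larger body for fewer tests and branches.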
no subject
Date: 2006-04-07 10:32 pm (UTC)
To be precise, they're IF and GOTO in disguise.
But then pretty much all programs break down mostly into MOVE, BRANCH and COMPARE commands.
Oh, and how does funroll-loops work with loops that can work different numbers of times?
no subject
Date: 2006-04-07 10:38 pm (UTC)
Hence, http://www.funroll-loops.org. It has a nasty tendency to break lots of stuff. Me, I stick to -O2 (second level of optimisations) in 'home' code, and -O1 for important stuff. It's fast enough that you notice the difference, and yet not so insanely optimised that you notice the need to punch in assembler from the front panel.....