This morning I started re-reading the very first pages of the great book “Beautiful Code” (by Andy Oram, Greg Wilson) as a refresher, an indirect result of Scott Hanselman’s list of basic must-read books (which didn’t include this book; BTW, I read parts of it before, and I don’t remember why I stopped), and I had to write this post.
To all of you guys thinking in DDD, TDD, MVC, ASP.NET, shiny AJAX and RIA (Flash/Silverlight) controls, GC, SharePoint, Rails, Python, ORMs (NHibernate, SubSonic, Linq2SQL, …), etc.: please get back to basics and read the PLAIN OLD C CODE I’m quoting in this post. Hopefully it’s not illegal to quote it!
The code is a VERY simple RegEx (Regular Expression) matcher. You pass it a pattern and a text to match; it returns 1 if the text matches the pattern and 0 otherwise. The pattern language is a very stripped-down version of RegEx that only includes:
- c => (Any character) matches any literal character c
- . => (Period) matches any single character
- ^ => matches the beginning of the string (meaning the rest of the pattern must match starting at the very first character)
- $ => matches the end of the string (meaning when tested, there should be no more characters in the string)
- * => (appearing after a character) matches zero or more occurrences of that previous character in the string
The full story of this code is told in the book “Beautiful Code” (it first appeared in another great book called “The Practice Of Programming”). Do yourself a favor and get a copy of both if you can afford them! (Or search your company library; otherwise ask them to include both in the next batch of books.)
int matchhere(char *regexp, char *text);
int matchstar(int c, char *regexp, char *text);

/* match: search for regexp anywhere in text */
int match(char *regexp, char *text)
{
    if (regexp[0] == '^')
        return matchhere(regexp+1, text);
    do {    /* must look even if string is empty */
        if (matchhere(regexp, text))
            return 1;
    } while (*text++ != '\0');
    return 0;
}

/* matchhere: search for regexp at beginning of text */
int matchhere(char *regexp, char *text)
{
    if (regexp[0] == '\0')
        return 1;
    if (regexp[1] == '*')
        return matchstar(regexp[0], regexp+2, text);
    if (regexp[0] == '$' && regexp[1] == '\0')
        return *text == '\0';
    if (*text!='\0' && (regexp[0]=='.' || regexp[0]==*text))
        return matchhere(regexp+1, text+1);
    return 0;
}

/* matchstar: search for c*regexp at beginning of text */
int matchstar(int c, char *regexp, char *text)
{
    do {    /* a * matches zero or more instances */
        if (matchhere(regexp, text))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}
Now, I am not going to discuss the code. Not because it’s hard to analyze (although, hey, I didn’t get it at first look either), but because the discussion in the book itself is very comprehensive, and I’d advise you again to get it (also, copying that much from the book surely would not be allowed by the authors).
Instead of this, I’ll give you a very simple example to trace:
match("c*xyz", "cxyz" );
Clearly this matches. The character c occurs once in the string and the part xyz as well. It should return 1, but tracing it is the most interesting part.
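If you'd rather let the machine show you the call sequence, here is a minimal sketch of an instrumented copy of the matcher. It is not from the book: the `trace_` names, the `depth` parameter and the `trace_line` helper are my own assumptions, added only to print each call with indentation. The matching logic is meant to be the same as in the code quoted above.

```c
#include <stdio.h>

/* Hypothetical instrumented copy of the matcher quoted above.
   The depth argument exists only to indent the printed trace. */
static int trace_matchhere(const char *regexp, const char *text, int depth);

/* Print one call, indented by two spaces per nesting level. */
static void trace_line(int depth, const char *fn,
                       const char *regexp, const char *text)
{
    printf("%*s%s(\"%s\", \"%s\")\n", depth * 2, "", fn, regexp, text);
}

/* Same loop as matchstar: try zero occurrences of c first, then one more each time. */
static int trace_matchstar(int c, const char *regexp, const char *text, int depth)
{
    printf("%*smatchstar('%c', \"%s\", \"%s\")\n", depth * 2, "", c, regexp, text);
    do {
        if (trace_matchhere(regexp, text, depth + 1))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}

/* Same cases as matchhere, in the same order. */
static int trace_matchhere(const char *regexp, const char *text, int depth)
{
    trace_line(depth, "matchhere", regexp, text);
    if (regexp[0] == '\0')
        return 1;
    if (regexp[1] == '*')
        return trace_matchstar(regexp[0], regexp + 2, text, depth + 1);
    if (regexp[0] == '$' && regexp[1] == '\0')
        return *text == '\0';
    if (*text != '\0' && (regexp[0] == '.' || regexp[0] == *text))
        return trace_matchhere(regexp + 1, text + 1, depth + 1);
    return 0;
}

/* Same as match, but prints every call made along the way. */
int trace_match(const char *regexp, const char *text)
{
    trace_line(0, "match", regexp, text);
    if (regexp[0] == '^')
        return trace_matchhere(regexp + 1, text, 1);
    do {
        if (trace_matchhere(regexp, text, 1))
            return 1;
    } while (*text++ != '\0');
    return 0;
}
```

Calling `trace_match("c*xyz", "cxyz")` from a `main` of your own prints one line per call. Compare the output with your hand trace, and look especially at what matchstar tries first.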
If you are interested in discussing your trace of this example, or an example of your own, just drop a comment here. I’ll be very excited to join the discussion, as it won’t be quoting or repeating the book’s notes :).
NOTE: This exact code is not the only way to do it; you can read about the alternatives in the book.
So what? I’ve seen A LOT OF WAY BETTER CODE
It’s not about this specific example, but about CODE READING
OK, if you search in algorithm-specific books, with their more interesting uses of recursion, pointers and/or advanced data structures, or even in the later chapters of this same book, you are likely to see WAY MORE IMPRESSIVE EXAMPLES of interesting code. It’s just that this is the one I was reading this morning when I thought: we need many more examples of this kind of old code, as we’re kept far away from all that, and we really should treat ourselves by reading such code.
I’d even go further and say: read code in languages you don’t even know, at least when it is demonstrated in blogs with clear discussion. Reading posts like “Doing {..the very big task..} in less than …. lines of code with {..any language..}” is really a very important practice to keep up from time to time.
More Modern Code: Reading .NET Open Source Code With Discussions Provided
Scott Hanselman has a series he calls “Weekly Source Code”. In this series he discusses many .NET open source projects that are used in practice, like the managed API wrappers for DIGG, Facebook, etc., some Visual Studio and Live Writer plugins, and many other examples. He demonstrates some of the interesting parts of the code and discusses them in great detail. If you don’t have Hanselman’s feed in your feed reader list yet, you really should!
I’m not sure if I’ll keep posting such examples (from different sources of course), but it really sounds like a good idea! :D :D