
Copying & Repetition

You ever hear parents complain about their kids’ TV habits?  “Oh my god! If I hear Blues Clues one more time! Timmy plays that damn video over and over and over.”  What Timmy is doing is learning.  Timmy probably also mimics his parents’ and siblings’ actions, copies their speech patterns, observes their habits, and repeats them over and over.

Copying other people and repetitive training are the foundation of education, but in today’s education this has been thrown out in favor of “conceptual learning”.  The idea of conceptual learning is that if you expose someone to the concept of a subject then they’ll have a higher, more refined understanding of the topic than simple copying and repetition (what they call “rote learning”) would give them.  The reality is conceptual models of education simply find students lucky enough to naturally know the topic, and then leave the rest to fail and flounder.

In the United States, there is even a slight racist tinge to the attitude of conceptual vs. rote education.  I’ve heard many people say that “Asians really can only copy others because they use rote education in school.”  If you’ve spent any time studying Asian art and culture you know this isn’t true at all, and is a very racist attitude.  Whether it’s the Ruby programming language, or BABYMETAL, or Old Boy, it’s entirely wrong to think that Asians are unoriginal little robots because they learned by rote.

There’s also a strange fear associated with rote learning that says if you learn rote you’ll somehow be less “creative”.  The problem with this is that nearly every creative thing you do requires rote practice.  The idea that I’m going to learn the major scale on a guitar by just learning the concept of a major scale is laughable.  Nobody who teaches music thinks that.  I learned guitar from repetition and copying other guitarists.

Painting might be the next discipline someone who believes in “concepts” puts forward as an example of avoiding rote learning.  Again, there’s a very long history of repetitively copying the works of other artists. There’s even a term for it: “Master Copy”.  Every great artist and almost all art schools have copying other artists as a way to learn to paint or draw.

If doing rote copying turned painters or musicians into unoriginal robots then all of them would be that way.  Painters and musicians are frequently put forward as the pinnacle of creativity, so clearly rote copying doesn’t impact your originality.  In fact, the dividing line between amateur and professional is how much they practice, and practice is repetition. Artists do small studies in a formal way. Musicians play scales their whole life, again repetitively copying.

How about writing?  Again, you learn to write by first copying the alphabet, then small stories, then trying to write on your own, and reading and trying to emulate your favorite authors.  Copying and repetition is all there.  Memorizing a poem is copying and repetition.  Reading and pulling out quotes and phrase structures is also copying and repetition.  Every author who is any good copies other authors and repeatedly writes almost obsessively.

Martial Arts, Dance, Singing, even Mathematics are full of copying and repetition.  Denying the role of these two practices in education denies what is a foundational aspect of human learning.  This is even the foundation of non-human learning, so why is it that people in the computer science field think there is no role for copying and repetition?

Rote in Computer Science Education

Copying and repetition is necessary in education because it builds instinctual basic skills someone needs to understand the more abstract conceptual parts of a discipline.  Nobody thinks you can memorize all of Jazz, but they definitely know that if you can’t instinctively play a scale then you’re probably not going to be able to play Jazz.  Nobody thinks you can memorize all of art, but if your drawing or color isn’t instinctual then you are going to struggle.

I believe Computer Science education could benefit greatly from copying and repetition at the beginner level and possibly later.  Copying is how a vast majority of programmers learned to code, but many CS educators deny this fact.  If you’re imagining yourself at 12 trying to learn to code, then I’m betting you had either a book or website with code that you copied and made work.  This should just be how we start people in programming, and not the current method of conceptual “weed out” classes.

Repetition is a mostly un-researched aspect of CS education that I’d like to explore more.  I believe that repetition happens naturally if you have copying as a base part of the educational experience.  However, I feel that drilling and repeating aspects of a language that need to be instinctual would improve retention.  For example, students could memorize all the lexemes and syntax structures of a language while they’re copying small working programs.

I think the main reason why this is ignored or vilified in CS is the same reason that most programmers simply can’t teach:  They are so far removed from their beginner experience that they forget that they actually learned to code via rote learning.  We see it all the time when a programmer attempts to teach a non-developer and immediately tries to get them to use Vim and write C code.

The experienced programmer has completely forgotten the nights they spent repeatedly copying other people’s code and writing and rewriting buggy code to make it work.  To them this isn’t “rote” because they were so deep in it that they can’t see all the implied rote work actually being done.  They were also 10, so their brains were very bad at meta-cognition and they can’t really say why they thought anything, so how can their recollection of their self-education possibly be accurate?

Hopefully Computer Science will adopt the educational style I’ve found in music for beginners, and in painting for intermediate developers.  I believe an early training that involves a mixture of rote (scales, chords, ear training) followed by copying and modifying (learn a song and try to improvise) will benefit beginners.  For intermediate programmers I think the painting style of education would work well:  copy master works and create your own studies of simple subjects.

Adopting these two models would make CS accessible to more people, and make it easier for beginners to transition to intermediate and then advanced skills.

Killing Magic

I’m sitting with a friend who is an accomplished musician.  Record deals, multiple albums, and you’ve probably heard her songs on a TV show or commercial or two. She tells me that she doesn’t want to teach music because she’s afraid it would lose its magic.  There’s a mystical mystery about how she makes music and she’s afraid she’ll ruin that special quality if she has to figure out how she does it.  It won’t flow the same.

My response was something that I’ve believed my whole life:  “Magic just hides something’s true beauty.  It’s a con.  A trick that makes you love the magic rather than the real thing.  Once you actually learn how it really works, sure, the magic goes away, but then you get to fall in love with the beauty of the real thing. Real things are always simpler and more beautiful than the magic hiding them.”

Or something like that.  I probably actually sounded a lot less cool than that, but that was the idea.  I’ve found that magic just obfuscates and blurs what I’m really seeing.  Whether that magic is an accident of my perception of reality–or an actual sleight of hand by someone else–doesn’t matter.  What does matter is once I strip the magic away, and find the real simple principles hidden by the wizard, I see the real thing is better.

Of course sometimes I strip the magic away and find that the real thing is an ugly turd hiding in a golden box.  A lot of programming languages and technology are like this.  There’s all this bluster and flourish pushing a magical view of their benefits.  Then I dig a little and this magic simply hides a terrible design, poor implementation, and random warts.  It seems everyone in technology aspires to nothing more than creating enough of a code mannequin to hold up an invisible emperor’s gown.

One of the reasons people resent my opinions on technology is I have an ability to crush their fantastical magical views of technology.  It’s hard to be an Apple fan when there’s a guy pointing out that they frequently allow developers to invade their customers’ privacy, stole wages from employees, and make shitty hardware that crashes and reboots if you don’t log in fast enough.  You can’t be enamored with Python if someone points out that its APIs are constantly asymmetrical and that Python 3 has a shitty UTF-8 strings implementation.

My mission in life has been to illuminate magic to expose the ugliness or beauty it hides because I believe magic enslaves people to others.  With magic you can convince them of almost anything, and even change the magic and they’ll keep following the wizard’s edicts.  Stripping the magic away gives people the freedom to choose what their reality will be, rather than rely on someone else to define it for them.

A key element of this mission is education.  I proved with my books that there really is no magic to learning to code.  The people who could do it weren’t special geniuses. Almost anyone could learn to do it given enough time and the right learning material.  Once it was clear that programmers aren’t special, it freed others from the magical aura surrounding programming and opened the practice up to a much wider range of people.

Education then becomes the practice of breaking magic to expose reality.  I study a topic and figure out how people are really doing it.  I find all the tricks they use, strip away the things that are just bluster and showmanship, find the lies they use to puff up their personas, and then teach the simplest real version of the topic.  This then opens the topic to a much wider range of people who can now enjoy it and improve their own lives.

Many times the practitioners aren’t purposefully trying to hide what they do because they don’t even know how they do it.  Most practitioners simply cargo cult a set of random practices they’re sure are the secret sauce.  Usually these secret practices are nothing more than extraneous rituals getting in the way of the real task at hand.  This educational acetone sometimes embarrasses these practitioners since nobody wants to be seen as believing in pointless rituals and magic.  That’s fine, but really they should be happy to find another path to what they love.  One that’s not full of obfuscation and rituals that only serve to enslave them to a limited palette of skills.

 

Learn More Python Rough Draft Up

My move to Miami has pushed out the deadlines for most of my books by a month, so May is when the majority of the content for Learn Python 3 The Hard Way will drop.  I’m done with the editing round with my publisher so the PDF will drop later today.  I’m also toying with doing an ePub but I swear if one person using a janky Linux ePub reader complains about the meta-data being wrong before telling the project to fix their meta-data I’ll pull it down.  Life is too short to convince angry Linux ePub developers to fix their code.

I’m also going to try my future book writing process starting today.  I’ve wanted to incorporate chat into my book publishing process but haven’t really found a chat I liked.  The Gitter chat seems like it’d work pretty well so I’m going to try that on the rough draft of the Learn More Python The Hard Way book.  You just have to go to https://gitter.im/lcthw/more-python-help  from the top of the book and you’ll be able to chat with me and everyone else.

If this works out then future books will be released this way:

  1. I hack on the idea until I’ve got a rough draft going.
  2. I post the rough draft, and put a room for the book into the LCTHW Gitter.
  3. I’ll hang out in there while I work on the book, answer questions, and change the rough draft based on feedback.

My goal is to get earlier feedback from people on how my exercises work and also give people free access to early releases.

 

The End Of Coder Influence

I get an email from someone who tells me that Reddit has decided to remove my book from their list of suggested readings for Python until I update the book to Python 3.  They made this decision about two weeks prior to when I received the email, so I went to look at my traffic and sales to see if there was an impact.  Weirdly, my sales were up and my traffic was about the same.  It had no impact.

Once a year I go through my Python book and I try to convert all the code to Python 3 as a test.  I do this with the eye of a total beginner, looking for things that will trip them up and cause problems.  Bad error messages, confusing syntax, broken libraries, and inconsistencies.  Every year I run into nearly the same problems:  strings are difficult to use, error messages don’t have variable names, libraries don’t really help with strings, and there are too many inconsistent string formatting systems.  So I decided to see again what it would take to make my book Python 3 and ran into the same issues all over again.

To put it bluntly, the reddit community responsible for teaching beginners to code censored my book as a power play to get me to force Python 3 on unsuspecting beginners.  The language does not work for them, and they were attempting to use their influence to enact change in my books, rather than use that influence to improve Python for beginners.

And it didn’t work.  I still had the same sales and the same traffic.  I actually think if all Programming Reddit rose up and demanded Python 3 have better error messages regarding strings (a minimum usability bar) they would be ignored too.  In fact, I kept seeing over and over people pointing out blog posts, reddit threads, HN threads, and tweet storms as if these were highly influential which then did nothing.

A few days ago I went through another test of Python 3 and ran into the same problems.  I get enough people emailing me about Python 3 that I decided I needed to work out a list of reasons why Python 3 is broken for beginners as of today.  Originally I was going to write it fairly simply and not worry about appeasing the coders, out of fear they would retaliate like they always do and boycott my book even more.  But, I remembered that after countless blog posts about how terrible of a person I am and how terrible my books are, I still end up helping millions of people a year and still have the same sales.

I decided to just write what I felt and fuck whatever programmers think.  I wrote it, put in a couple of jokes and trolls, and then posted it.  Fuck it, I have a cold and don’t give a fuck.

Immediately people started insulting me, telling me I’m wrong (yet not reading the post, LOL). Then the HN posts started, then Reddit.  I don’t read those so people shove them into my email and Twitter stream.  I was tired and not into defending myself so I just deleted Twitter off my phone and went to sleep some more.  Enjoyed the sun.  Did some painting.  Hung out with friends.  Who gives a fuck about what a bunch of angry lonely coders think about my thoughts?

Yet, here’s where everyone I know becomes deathly afraid of the coders.  These groups of programmers used to have large sway over what was successful and chosen, but at the same time were horribly uninformed about basic computer science.  They ran to Node.js because of “events are better than threads” and had no idea Hoare or coroutines existed.  They set out to hand convert all Python 2 code to Python 3 code, rather than just asking why the Python 3 VM can’t just…run Python 2 code too.  Then they believed the mega load of bullshit that this is impossible despite all proofs and evidence stating otherwise.  For all their claims of superiority for having once bought a copy of The Art of Computer Programming, the previous generation of programmers are sadly uninformed about basic shit.

We all feared them, because their incredibly uninformed opinions and complete lack of humor or human decency could make or break entire companies.  Get slagged on HN and you’re done for.  I’ve heard of VCs actually threatening to strip away funding over bad HN reactions like HN is on the same level as the food critic of the NYT.  So what was going to happen to me?

Honestly, I’ve been trying to get out of the technology industry since 2008.  This industry sucks, and largely because of the abusive previous generation of programmers.  My goal has been to just make their influence on my life as small as possible so I can go on doing things I love like painting.  Fuck them.  But, a man’s gotta eat so I keep doing my work so I can make enough of a living to keep helping folks and doing what I love.

What are the results of their insane hatred of my latest stance against Python 3?  Am I doomed to never have any more sales again?

Nope.  Same traffic.  Same sales.

I believe that the influence of the previous generation of programmers is largely gone.  I can’t exactly say why, but I think it’s because they consistently back terrible ideas over and over.   They also tend to have no idea what will be successful or not.  The reason is they base their opinion of a technology on superficial things related more to whether the tech fits their tribe than its actual merits.  When my book first came out the HN crowd and other “professionals” said it wouldn’t work.  Same for many successful startups, technology, and ideas.  Meanwhile, the things they do back end up being terrible and we all regret following their hive mind.  Can anyone say OpenSSL?

I also believe the newer generation of programmers are more well rounded and have a general distaste of this kind of tribal fascist bullshit we have in open source.  I can’t really prove that, but it’s a feeling I’ve been having for a couple years now.  This next generation is different. I just can’t quite say how other than they seem to not believe the same things as the previous generations.

About a year ago I stopped reading HN and Programming Reddit because of this.  I don’t worry about the vindictive assholes out there who feel any questioning of their tribal beliefs is an affront to their person.  I now think the actual influence of the hive mind on anything outside of the tiny little set of Silicon Valley Programmers Who Read HN bubble is nothing.  If you think their influence matters then either you’re working on something as insignificant as they are, or it really doesn’t matter and you should just ignore them and move on.

Keep making cool stuff and speaking your mind counter to the hive.  I think that’s the future generation’s take on programming, and I fully endorse that message.

Taking Down Tim Hentenaar

There is a blog post by Tim Hentenaar that says that people should not read my book, Learn C The Hard Way. It has the title “Don’t Learn C The Wrong Way” and it asserts that I am teaching C the wrong way, with a few examples as to why. The problem with Tim’s post is that Tim actually doesn’t know how to teach much of anything, and is completely uninformed about the security defects that his own code has. In fact, Tim successfully demonstrates that he is actually a beginning coder who has no business telling others how to code. In this blog post I will simply take down Tim’s supposedly expert opinion by using his replies to me in an email exchange where he demonstrates his lack of understanding, and then tries to cover for it in the most laughable way.

First, let’s establish how much of an expert Tim thinks he is, and what he’s advising you, my reader to do:

“Recently, I came across an e-book written by Zed A. Shaw entitled Learn C The Hard Way, and while I can commend the author for spending the time and energy to write it, I would NOT recomend (sic) it to anyone seriously interested in learning the C programming language. In fact, if you were one of the unlucky souls who happened to have purchased it. Go right now and at least try to get your money back!”

That’s a very serious condemnation of my book, especially from someone who has never taught C, never written a book, can’t even spell “recommend”, and later demonstrates that he doesn’t have a clue about security defects inherent in C. So what are Tim’s complaints about my book?

Tim Has No Teaching Experience

The majority of his complaints about my book, Learn C The Hard Way, stem from a lack of understanding of my (very successful) teaching method. To Tim, and most old school programmers, the way to teach something is to teach all of the topic at once in one huge chunk. You teach Make by writing a chapter on Make that tells the reader every single little thing about Make possible, and then demonstrate with some code. Here’s Tim’s statement to that effect:

“At this point, the only thing I can think is, “I’d just love for you to show me a damn working Makefile!” A novice will be thinking, “What the hell’s a Makefile?” as the concept of a Makefile has not yet been introduced.”

Then later he says:

“I don’t know how to set-up my environment, this “Makefile” thing pulled a Jimmy Hoffa, and now I have to use this Valgrind thing, after I go download it and build it from source. Great…”

The problem is, Tim didn’t read far enough to where I do explain how to make an environment, and misunderstood my purpose at that point in my book. I’m not teaching the reader to write a Makefile and start a project. I’m teaching them to quickly get their very simple C code to compile. My target readers are people who have a language like Python or Ruby but haven’t dealt with a compiled language before. But to Tim, this is insufficient because he thinks a beginner is like him and needs to know all of Make to be able to use it.

This lack of understanding of an actual beginner is exactly why so many programmers are so terrible at teaching, or even writing basic software for non-developers. It’s not that a programmer is somehow emotionless or a “robot” like obnoxious nerd haters say. It’s that the majority of programmers have a far more advanced understanding of computing, and specifically the software they create. Through their path to that understanding they have forgotten what it was like to be a beginner. This leads them to assume many things that just aren’t true. Such as, “Unless a beginner is taught every single aspect of Makefile construction they cannot use Make’s implicit build rules to build a basic C file.”

This means that Tim’s statements about how I teach are mostly invalid because he doesn’t understand how people learn to code. He’s never had to teach someone who’s just starting out so he thinks blasting them with a treatise on Makefiles is what they need 4 exercises into a course of study. By contrast, I actually sit with real people and have them go through my books, and then adapt the exercises based on where they get stuck. I also used to have comment sections on every page to gather information on how to improve exercises. Tim basically read K&R and wrote some crappy C code, which we’ll see shortly.

However, Tim’s rabid and obnoxious condemnation of my book isn’t his actual opinion. In private emails he says this:

“I don’t doubt the seriousness of your offer. In fact, one of my colleagues also read my article, and he and I were discussing it this evening, and he told me that he’s a fan of your writing style, and would love to see you write a really good book on C.”

Tim doesn’t believe my book is entirely irreparable and a failure as he states, and in private he says there are only a few problems with it. He even offered to help me make it better despite his lack of experience writing or teaching. What he actually thinks is I should write it the way he would write it, then it’d be a good book for you to buy. Despite Tim’s complete lack of qualifications in programming, writing, education, or anything other than having a blog, he thinks that his opinion is so superior that I should rewrite my book to fit his ideas of education, not a student’s model of learning based on actually sitting with readers and helping them.

This kind of arrogance and hubris leads me to Tim’s largest failing in his post, this code right here:

void copy(char from[], char to[], size_t n)
{
    size_t i = 0;

    if (!from || !to) return;
    while (i < n && (to[i] = from[i]) != '\0')
        i++;

    to[n] = '\0';
}

Tim’s claim is that this function here is superior to a function I had written called “safercopy”, but it has a critical buffer overflow that he actually attempts to defend in the most laughable way.

To understand Tim’s failure you need to see my original “safercopy”:

int safercopy(int from_len, char *from, int to_len, char *to)
{
    int i = 0;
    int max = from_len > to_len - 1 ? to_len - 1 : from_len;

    // to_len must have at least 1 byte
    if(from_len < 0 || to_len <= 0) return -1;

    for(i = 0; i < max; i++) {
        to[i] = from[i];
    }

    to[to_len - 1] = '\0';
    return i;
}

What sends most C coders into a tizzy about this code is it came from a thought experiment I was doing where I did code analysis on the K&R C book (the book by the authors of C). Many programmers took this as an offense to them (so rational), and so they would focus on how I said this function here (safercopy) was better than a similar string copy function in the K&R C book. The problem is, to discredit my claims that mine is better, they would play this little semantic shell game:

  1. “Your function is vulnerable to Undefined Behavior (UB) just like the K&R function.”
  2. They then write some example that uses a totally different UB from the hundreds available, not the buffer overflow UB from a malformed C string.
  3. Then proclaim that, since both functions are vulnerable to UB, my claim of mine being safe (notice, not safER) is invalid.

This is a lot like you buying a new lock for your front door that’s really great, so you tell your friend about it. Your friend goes, “Pfft, your lock is no better than leaving your door open, I could totally break into it.” Your friend then shows up with a SWAT team battering ram and smashes the door in like butter and says, “See? Your lock is pointless. Just leave your door open.” You, and I, aren’t saying a better lock is completely foolproof and perfect. We are saying it is safer, not totally safe. Doors are easily bashed in using countless methods, right down to setting your house on fire. When we talk safety of the lock, we mean against lock picking compared to the other lock. To say I should leave my door open because there’s a thousand ways to get into my house is insane.

However, my function is more resistant to a common externally accessible vulnerability. This is something I would love to research, but UB has different levels of exploit surface that is accessible to an attacker from outside the running process. A C string is fairly trivial to clobber so that it is missing the ‘\0’ terminator. It’s a bit more difficult to make random pointers go wherever you want, but still possible. It’s nearly impossible to rewrite the C code for a running process to cause a math error and make a compiler skip a portion that was considered UB. When studying the security of C code we tend to just assume all UB is the same and don’t make this distinction of accessibility to an attacker. Bad C coders then use this UB to simultaneously defend bad code (“All code is breakable with UB”) and condemn others’ code (“Haha, you’re triggering UB”).
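
As a small illustration of how easily a C string ends up without its terminator, here is a minimal sketch. The helper name has_terminator is hypothetical and comes from neither book; it assumes the caller knows the real size of the buffer and uses the standard memchr to look for a '\0' inside that size instead of trusting the string to end on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if buf holds a '\0' within its first len bytes.
 * memchr never reads past len, so this is safe on an unterminated buffer. */
bool has_terminator(const char *buf, size_t len)
{
    return buf != NULL && memchr(buf, '\0', len) != NULL;
}

/* Usage sketch: the expectations are written as asserts. */
void terminator_examples(void)
{
    char good[] = "AAAA";                /* 5 bytes: four 'A's plus '\0' */
    char bad[4] = {'A', 'A', 'A', 'A'};  /* 4 bytes: no room for a '\0'  */

    assert(has_terminator(good, sizeof good));
    assert(!has_terminator(bad, sizeof bad));
}
```

A copy loop that runs "until it sees '\0'" will walk off the end of the second array, which is exactly the kind of malformed input being discussed here.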

When I say my function is safER, I do not mean it is totally invincible. That is impossible in C, and one of the reasons I tell people to not use C anymore. I now firmly believe that C is impossible to write securely and is designed with flaws that are irreparable, mostly because of the huge number of UB that can easily be triggered externally.

I mean that the code in this simple function protects against this one buffer overflow that is often externally exploited, while the original K&R code does not. That’s all.
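
A minimal sketch of that one case, assuming a source array that is deliberately missing its terminator and a destination that is smaller than the source. The safercopy function is repeated unchanged from above so the block compiles on its own; the example function name and the buffer sizes are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* safercopy exactly as shown earlier in this post. */
int safercopy(int from_len, char *from, int to_len, char *to)
{
    int i = 0;
    int max = from_len > to_len - 1 ? to_len - 1 : from_len;

    // to_len must have at least 1 byte
    if(from_len < 0 || to_len <= 0) return -1;

    for(i = 0; i < max; i++) {
        to[i] = from[i];
    }

    to[to_len - 1] = '\0';
    return i;
}

/* Hypothetical usage: both sizes are passed, so the loop is bounded
 * by the lengths and never goes looking for a '\0' in the source. */
void safercopy_examples(void)
{
    char from[4] = {'X', 'X', 'X', 'X'};  /* unterminated on purpose */
    char to[3];

    int copied = safercopy((int)sizeof from, from, (int)sizeof to, to);

    assert(copied == 2);            /* only 2 bytes fit before the '\0' */
    assert(strcmp(to, "XX") == 0);  /* the result is always terminated  */

    /* A destination with no room at all is rejected. */
    assert(safercopy((int)sizeof from, from, 0, to) == -1);
}
```

The copy is truncated rather than overflowing, and the missing terminator in the source never matters because nothing reads past from_len.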

Which leads me to Tim’s lack of understanding of his own code. Clearly, he thinks his code is even safer than mine, but if you look at it again:

void copy(char from[], char to[], size_t n)
{
    size_t i = 0;

    if (!from || !to) return;
    while (i < n && (to[i] = from[i]) != '\0')
        i++;

    to[n] = '\0';
}

You’ll see that he only has one size, so if that size is invalid for the to variable then you get a buffer overflow. Here’s a trivial demonstration of it:

#include <stdio.h>

void copy(char from[], char to[], size_t n) { 
    size_t i = 0;

    if (!from || !to) return;
    while (i < n && (to[i] = from[i]) != '\0') {
        printf("to[i]=%c, i=%zu\n", to[i], i);
        i++;
    }

    printf("i=%zu, n=%zu\n", i, n);
    to[n] = '\0';
}

int main(int argc, char *argv[])
{
    // thanks to @mistahzip for pointing out this 
    // is a better demonstration code
    char to[] = {'A','A','A','A'};
    char from[] = "XXXXXX";

    copy(to, from, 6);

    printf("Final byte is: %x\n", to[3]);
}

UPDATE: I had my original analysis wrong and I apologize for that. This is a better demonstration of the problem, and a new analysis showing the buffer overflow.  Thanks to @mistahzip for setting me straight and putting up with me being an asshole.  Just goes to show you, this shit is hard.

Tim’s code works as long as the strings are valid, however it’s incredibly common for C strings to be invalid, and that’s how you get the buffer overflows from C strings.  In this example, I’ve added printing so you can see what’s going on.  I use a malformed to array so that you can see, if it’s wrong then it gets overwritten with garbage.  In addition, he does to[n] which will always set the wrong byte if from is larger than to. Any C coder worth their salt would realize this, and in many ways this is worse than even the K&R version since it is more complicated.
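
The to[n] write can also be shown without actually running off the end of an array. The following is a sketch under one stated assumption: the caller passes the capacity of the destination as n, which is the natural thing to pass when there is only one size. The copy function is the one quoted above, unchanged; guard_after_copy and the guard byte are hypothetical and exist only so every write lands inside real memory.

```c
#include <assert.h>
#include <stddef.h>

/* The single-size copy quoted above, unchanged. */
void copy(char from[], char to[], size_t n)
{
    size_t i = 0;

    if (!from || !to) return;
    while (i < n && (to[i] = from[i]) != '\0')
        i++;

    to[n] = '\0';
}

/* Hypothetical demonstration: region is 5 bytes, but we pretend the
 * destination is only its first 4 bytes and keep region[4] as a guard.
 * Returns the guard byte after the call. */
char guard_after_copy(void)
{
    char from[] = "XXXXXX";
    char region[5] = {'A', 'A', 'A', 'A', 'G'};

    copy(from, region, 4);  /* n is the 4-byte "capacity" of the destination */
    return region[4];       /* 'G' if untouched, '\0' if overwritten */
}

/* Usage sketch: the guard byte does not survive. */
void copy_examples(void)
{
    assert(guard_after_copy() == '\0');
}
```

The loop itself stops at n, but the final to[n] lands one byte past a destination that is exactly n bytes long. Here that byte is the guard; in a real program it is whatever happens to sit next to the buffer.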

When you do this on many systems you just get a bus error of some sort, but not all. Many times you’ll have the end of one string still be inside a valid region of memory, and operating systems aren’t even close to foolproof on protecting against buffer overflows. If you’re using a system that allocates stacks on the heap (such as in green threads), then you’ll typically blast right past this variable and into another function’s code. That’s very dangerous and creates remote code execution vulnerabilities.

You may be thinking, “Yeah but I could write code that breaks your safercopy too!” Yes, like I said, C has so much UB it’s an entirely unsafe language and you can destroy anything. The point though is that this is an insanely common and trivial programming error that is just bad math for one parameter. With mine you have the size for both so you don’t make this error as easily. You can still make the error, but it’s harder than with Tim’s. With Tim’s you’ll make this error all the time.

Arrogance and Hubris

I told Tim about this really silly error in his blog post and did he do the right thing and at least admit publicly that I demonstrated a trivial error in his code? Nope, not only has he not updated his code, further demonstrating that he doesn’t know what he’s talking about at all, but he proceeded to defend his code with the most asinine of defenses:

“That’s why strncpy() / strlcpy() were written, but of course with all such things, there’s a performance penalty to pay. Even with length checking, it’s still possible to trigger UB, for example via integer promotion (i.e. strncpy() with a negative length, which I did point out) or having dest and src overlap. … It’s much harder to carry out a buffer overflow attack with SSP, DEP, and ASLR these days. Although there are always ways around the best intentioned restrictions.”

His function, in his own words, isn’t wrong because, again, you can use a totally different set of UB to cause problems so this easily externally accessible one isn’t a problem. And there’s also strncpy/strlcpy, so his function is still valid (what?). Oh, and also there’s, like, uhhh oh SSP and DEP that totally protect against these problems (even though they don’t and we see it all the time). These are the words of someone stumbling to still be right to protect their ego, and they demonstrate Tim’s lack of intellectual honesty and integrity.

Tim Is An Unqualified Beginner

This is your classic defense from an arrogant programmer who refuses to admit that he actually doesn’t know what he’s talking about. When I receive complaints that my code isn’t working, even if it’s been run through the wringer over and over, I still go and double and triple check that it’s working. If Tim had sent me this kind of trivial defect I would have fixed my code and worked to find out why I caused the error. To programmers like Tim, who think they know C but are totally clueless about computer security, it’s inconceivable that his code could be wrong.

This is a sign of a beginner. A beginning programmer assumes his code is right even in the face of all evidence to the contrary, like Tim does here. They defend it to the end, because they are personally attached to their creation and not objective. An expert assumes his code could be wrong at any moment and adds as many defenses as possible. This shows that you should not listen to Tim about C coding, and definitely not learn anything from him. He is entirely unqualified and should be ignored.

Conclusion

Tim Hentenaar wrote a confused screed about my book being terrible and claiming nobody should buy it. However, his expertise is completely lacking to make that determination, his code has defects in it, and he arrogantly refused to admit that it had problems. He also defends his security defects with confused logic about UB and the existence of other functions that have nothing to do with his own code. Listening to Tim about how to learn C is therefore a dangerous thing to do. No book is perfect, and let me tell you that first printing of mine had loads of problems, but until Tim writes a better C book you’d do well to ignore his advice and him.

In fact, this is the problem with the majority of the detractors from my book. None of them have written books, and many of them don’t even code C or have C in production. Writing books and teaching people is incredibly difficult, much more difficult than hanging out in IRC yelling at beginners about Undefined Behavior or writing blog posts. Over this next week I’m going to systematically take down more of my detractors as I’ve collected a large amount of information on them, their actual skill levels, and how they treat beginners. Stay tuned for more.

Random Code Editor Idea

When I teach people to code I give them this simple procedure to follow:

1. Write the skeleton of the function.
2. Write comments in English describing what that function should do.
3. Under each comment fill in the code necessary to make it work.

This procedure works for early programmers because they typically know how to write code, and know what they want it to do, but the gap between what they want and what to write is fairly large. They don’t have enough experience to close the gap, but since they can describe what they want the function to do then that’s their start.

I find that starting with desired results works best for beginners and early coders. Everyone uses a computer these days and knows how software should work. They can describe what they want their software to do much more easily than they can write code, so starting there gets them going. Eventually after coding for a while they switch to thinking entirely in code, but even to this day when I can’t quite think of the code to write, I start with the comments and fill them in.
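As a rough illustration of the three steps, here is what a finished function might look like with the English comments left in place. The function count_vowels is a made-up example, not something from my books. The skeleton came first, then every comment, then the code under each comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: count the vowels in a string.
 * Each comment inside was written before the code beneath it. */
int count_vowels(const char *text)
{
    int count = 0;
    size_t i = 0;

    /* the caller has to give us a real string */
    assert(text != NULL);

    /* go through each character until the end of the string */
    for (i = 0; text[i] != '\0'; i++) {
        /* if the character is a, e, i, o, or u (either case), add one */
        switch (text[i]) {
            case 'a': case 'e': case 'i': case 'o': case 'u':
            case 'A': case 'E': case 'I': case 'O': case 'U':
                count++;
                break;
            default:
                break;
        }
    }

    /* give back the total */
    return count;
}
```

Each comment describes a result, and the code under it is one tiny problem instead of one big one.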

If I throw testing into the teaching (usually when they’re more capable), then the procedure becomes a little more complex:

1. Write the skeleton of the function.
2. Write the test and first call of the function making it fail.
3. Write the comments in the function for what it should do.
4. Fill in the comments with code and keep expanding and running the test.
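Here is a small sketch of where those four steps might end up in C. Both clamp and test_clamp are hypothetical names chosen for the example. The skeleton of clamp was written first, returning nothing useful, so that test_clamp compiled and failed. Then the comments went in, and then the code under each comment until the test passed.

```c
#include <assert.h>

/* Steps 1, 3, and 4: skeleton first, comments next, code last.
 * Hypothetical function: force value into the range [low, high]. */
int clamp(int value, int low, int high)
{
    /* if the value is below the range, use the bottom of the range */
    if (value < low) return low;

    /* if the value is above the range, use the top of the range */
    if (value > high) return high;

    /* otherwise the value is already fine */
    return value;
}

/* Step 2: written right after the skeleton, and re-run after every change. */
void test_clamp(void)
{
    assert(clamp(5, 0, 10) == 5);    /* inside the range */
    assert(clamp(-3, 0, 10) == 0);   /* below the range */
    assert(clamp(42, 0, 10) == 10);  /* above the range */
}
```

Call test_clamp() from main() and run it each time you fill in another comment.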

Yet, the process is still the same and focuses on describing what I want and then filling in the blanks. In writing this is the same process I tell beginning writers. Just talk out loud and say what you want to say, writing those as little notes, then fill in the paragraphs. Or, create an outline then fill it in. Same for painting, where I tell people to make a rough outline of what they want to paint, then figure out each piece of the outline.

In general, the way you can solve a complex problem that’s difficult to visualize in any medium (code, words, paint, music, etc.) is to convert the problem to a paint-by-numbers problem. Instead of just trying to do it all at once right in your head and get it right, you break it down into tiny problems, then solve each one.

What if code editors helped with this process specifically? What I mean is, imagine your process becomes this:

1. Write the test or the function skeleton, doesn’t matter, and the editor makes the other one.
2. Go into the function, and start writing comments.
3. Editor guesses at what should go there and puts it under your comment, and it keeps running the test as you type.
4. You then edit the code as it pops in, or maybe alternate through what comes up, and it keeps running and working the test to bust your function.
5. Eventually the test passes and it knows to move on to the next comment.

It’s difficult to describe, but a way to think of it is a hyper embedded version of what programmers seem to do these days anyway, which is just search through Stack Overflow, documentation, APIs, and GitHub using most of the words you’d put into a comment to find code. Why not have the editor use fancy machine learning algorithms and a vast catalog of existing curated code to do this for you?

In addition to that, it seems possible to auto-generate enough test code to fuzz through most of what you write, especially if the language is more modern. Maybe it’s something like AFL generating tests that hammer your function finding things, and since it’s generating the code in the function it’s possible it could be smarter at this.
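To give a feel for what generated tests hammering a function could mean, here is a deliberately naive sketch. It is nowhere near what AFL does (AFL is coverage-guided; this just throws random strings), and upcase_ascii, hammer_upcase, and check_upcase are all hypothetical names invented for the example. The idea is only that a tool can check invariants on inputs nobody wrote by hand.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical function under test: uppercase ASCII letters in place. */
void upcase_ascii(char *text)
{
    size_t i = 0;

    for (i = 0; text[i] != '\0'; i++) {
        if (text[i] >= 'a' && text[i] <= 'z') {
            text[i] = (char)(text[i] - 'a' + 'A');
        }
    }
}

/* Naive random tester: returns how many inputs broke an invariant. */
int hammer_upcase(unsigned int seed, int rounds)
{
    char input[32];
    int failures = 0;
    int r = 0;

    srand(seed);

    for (r = 0; r < rounds; r++) {
        size_t len = (size_t)(rand() % (int)sizeof(input)); /* 0 to 31 */
        size_t i = 0;
        int broken = 0;

        /* fill with random printable ASCII, then terminate */
        for (i = 0; i < len; i++) {
            input[i] = (char)(' ' + rand() % 95);
        }
        input[len] = '\0';

        upcase_ascii(input);

        /* invariant 1: the length must not change */
        if (strlen(input) != len) broken = 1;

        /* invariant 2: no lowercase letters may survive */
        for (i = 0; i < len; i++) {
            if (input[i] >= 'a' && input[i] <= 'z') broken = 1;
        }

        failures += broken;
    }

    return failures;
}

/* The kind of check an editor could keep re-running as you type. */
void check_upcase(void)
{
    assert(hammer_upcase(1234u, 1000) == 0);
}
```

A real tool would pick inputs far more cleverly, but even this dumb loop is a test nobody had to write by hand.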

Just a random idea, but could be an interesting thing to research. Call it “Comment Driven Coding” for lack of a better name.

Early vs. Beginning Coders

When I was working on Learn Python The Hard Way I was frustrated by how often I’d have to explain that the book is for a total beginner. The problem is that most of the technology world considers someone with about two programming languages under their belt a “beginner”, but learning two programming languages would take you about 4-6 months. After 6 months you can’t really say someone is a beginner since, well, 6 months later is not the beginning. The beginning of something is…I mean why do I have to say this…at the beginning. Not 6 months later.

It seems pedantic but this is a constant problem in the technology education world. When you look at the categories for technology book publishers they only have categories for “beginner” that fit the model of a person who’s not really a beginner. My book actually didn’t fit into many publishers’ categories since it was targeted at an audience that was before this level. This showed a completely ignored group of people, and it’s a very good sign that most technologists simply have no concept that there are non-programmers who want to learn programming.

To me this inability to visualize a person who is a total beginner is a symptom of most programmers being terrible at teaching programming. They frequently have bizarre ideas about teaching programming because they can’t visualize a person who knows absolutely nothing. My favorite is how they think you should teach programming without teaching “coding”, as if that’s how they learned it. They’ll have this imagined idea that they learned programming in their first discrete mathematics course, when really they were probably typing the code out of a book when they were 11 and simply don’t consider that where they learned to code. Or, they didn’t really learn programming in that class and only actually learned it when they sat down and went through a book that taught them code. Their arrogance simply makes them think they did, but I don’t know anyone who took an abstract “no coding” class and then went and wrote Java or Lisp without going through at least one book teaching how to code.

I have no idea why these people have such a hard time visualizing someone with zero knowledge, but I think a simple change to the nomenclature of software developers would help to at least talk about it.

The Beginning Is At the Beginning

What I propose is we have beginning coders and early coders. I got this idea from a painting teacher who kept referring to students who had never painted as “beginners”, but those who had painted for about one class as “early”. The reasoning is that you need a way to differentiate people who don’t know a damn thing vs. people who know the basics but just simply suck at them. Teaching a beginner is very different from teaching someone who’s already been doing it for a bit and just needs more training.

For example, a beginning coder doesn’t know how to type the | (pipe) character. They don’t even know it’s called a “pipe”. I’m not joking about this. Professionals actually don’t believe me when I tell them this, but it’s true. Beginners have zero experience so simple things like making a text file, opening terminal, and even the idea that you can type words at a computer and it will do stuff, are simply unknown to them. To teach a beginner effectively requires this level of information slowly fed to them in reasonable chunks.

The best analogy I have for this comes from either music or martial arts. In those disciplines you have a set of things that beginners need to get through repetition before they can start the process of actually learning. In music this is simple things like names of notes, ear training, scales, where notes are, and harmony training. In martial arts this is things like building strength, flexibility, how to stand, the names of techniques, and blocking. Without this initial basic repetitive training to get these core skills deeply ingrained the beginner will simply flounder trying to learn at the early stage and have a difficult time progressing to deeper understanding.

My current method for training up beginners is to make them learn the basics of 4 programming languages. I’m not sure why 4 seems to be the magic number, but after they’ve gone through 4 programming books and learned to make tiny little programs plus all the syntax, they seem to have a firm grasp of the basics. This phase is all about learning concrete simple things, but also understanding the idea that the concrete things are just standing in for abstract concepts. In one language || (two pipe symbols) might mean “or” and another language will use the actual word “or” but this is the same concept and the symbol doesn’t matter. After their fourth language they get this and can then move on to being an early coder.
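You can even see the symbol-versus-word point inside a single language. Standard C ships a header, iso646.h, that defines or as a macro for ||, so the two functions below compile to the same test. The function names are made up for the example.

```c
#include <assert.h>
#include <iso646.h> /* standard C header: defines `or` as a macro for || */

/* Hypothetical helper using the symbol. Days run 0 (Sunday) to 6 (Saturday). */
int is_weekend_symbol(int day)
{
    return day == 0 || day == 6;
}

/* The same concept spelled with the word. */
int is_weekend_word(int day)
{
    return day == 0 or day == 6;
}

/* Both spellings must agree for every day of the week. */
void check_same_concept(void)
{
    int day = 0;

    for (day = 0; day < 7; day++) {
        assert(is_weekend_symbol(day) == is_weekend_word(day));
    }
}
```

A beginner sees two different things here. Someone who has been through a few languages sees one idea written two ways.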

Early Is After The Beginning

An early programmer is different from a beginner because they have the basic skills understood, but have a hard time applying them to problems. The early coder’s next challenge is problem solving, which is a much more abstract skill that takes longer to master. A beginner’s hurdle is training their brain to grasp the concrete problem of using syntax to create computation and then understanding that the syntax is just a proxy for how computation works. The early coder is past this, but now has to work up the abstraction stack to convert ideas and fuzzy descriptions into concrete solutions. It’s this traversing of abstraction and concrete implementation that I believe takes someone past the early stage and into the junior programmer world.

The best analogy for this would be with creative writing. First, a student has to spend time learning the alphabet, then words, reading, writing, and other concrete things. Even before that they have to learn to comprehend their native language(s) or else it’s difficult to teach them reading and writing. After they’ve learned this concrete task of reading and writing, through lots of mechanical repetition, they move on to the task of conceptual writing. They’re given problems of writing stories or essays and then figuring out how to express these ideas in concrete words.

I’m not quite sure what takes someone from early to junior other than just attempting lots of projects with guidance. Similar to writing, painting, and wood carving, I think given a lot of projects to complete and then being critiqued on the results is probably the best way to build them up. With that in mind I started a new blog Projects The Hard Way which will feature a sequence of projects in varying levels of difficulty. I’ll see what ends up working best and how to work with early coders using this format.

Significance

My idea isn’t new of course, but now that I have a word for who I’m trying to teach, I can focus exactly on that person. By saying, “This is for early coders,” I’m able to craft exercises that will build their skill level up, take them out of the early stages, and make them able to create things. I’m thinking that it won’t matter what kinds of projects they do, just that they do a bunch of them.

My only question is how many projects end up being the breaking point for most people? Is it 10, 20? How much variation is there between them? Or, is it more a question of time and not quantity of effort?

Either way, I’ll be hammering this divide between beginner and early so that we can properly place educational efforts and materials where they belong. Now if I can only get the few people writing books for beginners to stop assuming every beginner is a little kid or a total idiot I’d be set!