Why I Don’t Care About Swift

Swift is a new programming language created by Apple for use on OS X and iOS devices. The programming world is agog. Apple’s fantastic new language apparently solves all their problems, as evidenced, they say, by the fact that some programmer ported Flappy Bird to it in a few hours.

I’ve been around long enough to see languages come and go. Each claimed to solve all the problems introduced by its predecessors, yet each was replaced by a language that solved all its problems. In some cases, the new language surpassed the success of the language it replaced (C++ and Java); in other cases, the new language faded into obscurity (Modula-2 and Ada).

Lately the motivations for new languages have been dubious. There is a big emphasis on making a language easy to learn and having it hide nasty issues related to memory management and type safety. One review of Swift I read stated, “Apple hopes to make the language more approachable, and hence encourage a new group of self-taught programmers”. While that sounds great, it means that those of us who have mastered our craft after 30 years or more of practice are saddled with the training wheels and water wings that are written into these languages for the noobs.

A classic example is the lack of unsigned integers in Java. The motivation for this was to simplify the language for “new and self-taught programmers” by avoiding errors caused by a lack of understanding of sign-extension. However, for those of us who showed up for class the day that sign extension was taught (that would be day two), we’re left with a language that unnecessarily limits the range of positive integers and requires us to actually have mastered sign extension in order to understand what is happening when we directly manipulate the bits in our integer variables.
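
For what it’s worth, here is a small Swift sketch of the detail in question; unlike Java, Swift gives you unsigned types, and the bit-level behavior is easy to see:

let b: Int8 = -1                 // bit pattern 1111_1111
let widened = Int(b)             // sign-extended into the wider type: still -1
let bits = UInt8(bitPattern: b)  // the same eight bits reinterpreted as unsigned: 255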

Explicit vs. Implicit Typing

One of the simplifications Swift makes is that it infers the types of variables from the values assigned to them rather than requiring the programmer to declare the type of each variable explicitly. If this were true “weak typing” like I’m familiar with in VBScript, it would be great (though it would come with its own set of problems). But all Swift does is infer the type of the variable from the first value you assign to it.

This actually introduces problems, because it’s not always possible to unequivocally determine the type of a literal value. So Swift gives you ways to force it to interpret a literal value as a given type. Rather than removing the necessity of the programmer understanding types, Swift thus requires “new and self-taught” programmers to have a mastery of types so that they can understand how Swift is working behind the scenes and make sure that their variables have the desired type.
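
For example, here is a minimal sketch of how the inference and the explicit-typing escape hatches interact (the variable names are just illustrative):

let count = 42                // inferred as Int
let ratio = 3.0               // inferred as Double
let small: Int8 = 42          // a type annotation forces the literal to be an Int8
let precise = Float(3.0)      // initializer syntax forces a Float
// let sum = small + count   // error: Int8 and Int don't mix without a conversion
let sum = Int(small) + count  // the programmer still has to understand the types involved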

Strings

Swift is said to improve string-handling over Objective-C (the current language used on OS X and iOS). There is certainly room for improvement there. When I first started programming in Objective-C, one of the first things I did was bring over my own C++ string class, as I found NSString to be overly complicated and muddled. Over the years I’ve gotten better with NSString.

I would argue, however, that some of the so-called “improvements” in Swift with respect to strings are distinctions without a difference. So instead of this in Objective-C:

[NSString stringWithFormat:@"The value of num is %d", num]

you say this in Swift:

"The value of num is \(num)"

The Swift version is obviously more concise, but it is also less powerful. To add more complex format specifications to Swift you actually have to invoke the functionality of the underlying NSString class, which means the “new and self-taught” programmer, again, needs to understand the details of the implementation in order to do anything beyond the simplest strings.
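
For example, to get field widths, padding, or precision you end up back in NSString territory; a rough sketch (num is the same illustrative variable as above, and Foundation must be imported):

import Foundation

let num = 42
let simple = "The value of num is \(num)"                        // interpolation alone: no width, padding, or precision
let padded = String(format: "The value of num is %05d", num)     // NSString-style format string, bridged through Foundation
let wrapped = NSString(format: "The value of num is %05d", num)  // or drop down to NSString directly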

One of the stated benefits of string handling in Swift is that “all strings are mutable”. One need not worry about whether the string is declared as an NSString (immutable) or NSMutableString (mutable). Well, you don’t have to worry unless you do have to worry — strings assigned to constants are immutable in Swift. So:

var myString1 = "Mutable string"
let myString2 = "Immutable string"
myString1 += myString2    // perfectly legal
myString2 += myString1    // compile-time error

Switch Statements

Swift eliminates the “fall-through” behavior of switch statements, which is said to eliminate bugs caused by omitting the break at the end of each case block. But, oops, sometimes the fall-through behavior is exactly what you want. So Swift adds the fallthrough keyword. It could be argued that Swift eliminates a line of code (the break) while giving the behavior one normally desires. But at the same time, it adds a keyword (fallthrough) that does the opposite. This requires “new and self-taught” programmers to have the same thorough understanding of switch behavior that Objective-C and C++ programmers do.
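
A quick sketch of both behaviors (grade and bonus are just illustrative names):

let grade = 2
var bonus = 0

switch grade {
case 3:
    bonus += 100
    fallthrough   // explicitly opt back in to C-style fall-through
case 2:
    bonus += 50   // no break needed; execution stops here by default
case 1:
    bonus += 10
default:
    break         // a case can't be empty, so an idle default still needs a statement
}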

Single-line Blocks

The Swift compiler requires braces around the body of every block (such as the body of an if) and will not accept a brace-less, single-statement body, thus avoiding this error:

if (x < 0)
    goto fail;
    goto fail;

The code above will always execute one or the other of the goto statements in Objective-C or C++. Even though the second goto is indented, it is not part of the if-block and will be executed if the condition is false.

Swift will reject the missing braces and force you to write this:

if (x < 0)
    {
    goto fail;
    }
goto fail;

Or, for those of you who don’t do your braces the right way, this…

if (x < 0) {
   goto fail;
}
goto fail;

This is fine, and hard to argue with. The supposition is that the programmer will immediately recognize the flaw or won’t make the mistake in the first place. On the other hand, I would argue that the same C++ programmer who wrote the erroneous code will write this in Swift:

if (x < 0) {
    goto fail; }
    goto fail;

I always put braces around my blocks, even if they are one-line, so this doesn’t affect me. It’s ironic, however, that while Swift prides itself on eliminating the unnecessary break statement at the end of a case block, it requires two to four additional lines (braces) in if, for, and while statements, which are more numerous.
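
For what it’s worth, here is roughly what the Swift compiler accepts (fail() is a hypothetical stand-in for the goto, which Swift doesn’t have):

func fail() { }       // hypothetical error handler

let x = -1

// if x < 0
//     fail()         // error: the body of an 'if' must be wrapped in braces

if x < 0 {
    fail()            // braces are required even for a single statement
}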

PocketBible and Swift

I will be more than happy to learn and use Swift for programming on iOS and OS X. I just don’t believe the hype and won’t convert just for the sake of doing something new.

I am a strong proponent of platform-independent languages like C, C++, Java, and, to a lesser extent, Objective-C (the latter is primarily an Apple language, though it has its origins outside of Apple). Such languages allow me to develop code on one platform and re-use it on another. One of the promises of C++ and Java was that you could develop the code for one platform and use it on many others. Swift is an Apple language (the same way C# is a Microsoft language). It only works on Apple devices. While those are numerous, they’re not the only devices out there. So rather than moving toward the “write once, run anywhere” model promised by Java, we’re back to “write everywhere,” as each platform requires its own language.

I don’t mind learning a new language. I already jump from C++ to C# to Java to VBScript to Javascript to MS-SQL on a daily basis. For those of us who write code for a living, being multilingual is a job requirement. This is precisely why I care so very little about the supposed advantages of Swift; this isn’t a religious war for me, it’s just a tool. When someone comes out with a new kind of screwdriver, I may or may not buy it until I need it. And then I’ll just buy one and use it — I won’t try to convert all my screwdriver-toting friends.

So will PocketBible for OS X and/or iOS be re-written in Swift? Probably not today, and probably not until Apple requires it. But Swift depends on Objective-C under the hood, so my guess is that Apple will continue to support Objective-C apps for a long time.

Braces and Indenting: You’re Doing it Wrong

Java, C++, Objective-C, and C# all use braces ( { and } ) to delineate the beginning and end of blocks of code. Over the years, several styles have evolved, with the worst of them dominating the literature. Once you see The Light you’ll wonder how we ever let this get out of hand.

Before we begin, let’s remind ourselves what braces are for: They mark the beginning and end of blocks of code. In many contexts a block stands in place of a single statement. It allows us to put two or more statements in a place where a single statement is called for in the grammar. In those contexts a block is functionally equivalent to the single statement it replaces. This will be important in our understanding of the One Right Way to indent.

In other contexts, such as the bodies of functions, surrounding the cases in a switch statement, and surrounding the declarations in a class definition, braces demarcate the contents of the function, switch, and class. For convenience, I’ll refer to any group of lines of code surrounded by braces as a “block”, even though the language definition may not always use that term in every context in which braces are used.

So the first question to ask is: “To what do the braces belong: the block they surround or the syntactical element (if, for, switch, class, etc.) to which the block belongs?”

When braces are used to surround a true block (the else clause of an if statement, for example), it’s clear the braces belong to the block. Together with the lines of code they contain, they replace a single statement.

The implication is that the braces should be indented at the same level as the lines of code in the block they surround, for they are part of that block.

The second question we need to ask is: “Should braces share a line with any other code, whether a statement from the block they surround or the statement the block belongs to?”

Clearly we would not format code like this:

    x
    = y
    +
    z
    ;

We might break a very long line into two or more lines, but a short statement should always be on one line. Similarly, we try to avoid code like this:

    x = y + z; if ( x > 10 ) foo(x); bar(z); switch (y) {case 1: x = 2 * y; break; case 2: default: foo(x); break;}

The commonly accepted practice is to put one statement on each line. (There are exceptions but they are rare.) Similarly, I would argue that braces belong on a line by themselves. They are not “inline operators” like + or ==. They do not belong to the statement to their right or left; they surround those statements.

The reason we don’t put two or more statements on one line is that it is more difficult to read. It’s why we break up our thoughts into sentences and paragraphs. It aids in comprehension. The same is true of code. Consider the following:

    if ( x < 10 ) { foo(x);
        bar(y); }

The call to foo(x) belongs to the then-clause because it is inside the braces, but it would be easy to glance at the code and assume bar(y) is the only statement executed when the if-condition is true, because the call to foo(x) is “hidden” at the end of the if statement.

For this reason I would argue that braces belong on a line by themselves. It is too easy to miss them when they are “hidden” at the end of another line of code. So unless you’re in the habit of writing a dozen statements on one line, it doesn’t make sense to put a brace on the same line as another line of code.

With these two rules (i.e. braces belong to the block they surround and braces belong on a line by themselves), there’s only One Right Way to indent your code:

    if ( x < 10 )
        {
        foo(x);
        bar();
        }
    else
        {
        x += 10;
        foo(x);
        }

Now we can see why the predominant indenting style is so, so wrong:

    if ( x < 10 ) {    // should not be on same line as "if"; should be indented with block
        foo(x);
        bar();
    } else {              // neither brace should share the else's line; each should be indented with its block
        x += 10;
        foo(x);
    }                     // should be indented like block above

I realize those of you who grew up doing this wrong and reading all the literature from others who do it wrong will find the One Right Way more difficult to read. But it can be argued that you only find it difficult to read because you’re not accustomed to doing things the One Right Way, while the wrong style as illustrated above is difficult for me to read because it makes no attempt to be logically consistent. This makes it objectively wrong, not just a matter of personal preference.


Postscript
In the spirit of unity and the cause of world peace, practitioners of the One Right Way will accept the following style, with the hope that those practicing it will see the one small error in their ways and, with proper mentoring and encouragement, will correct it:

    if ( x < 10 )
    {
        foo(x);
        bar();
    }
    else
    {
        x += 10;
        foo(x);
    }

Objective-C Memory Management

Perhaps I’m showing my age, but I’m getting awful tired of language designers trying to improve on C/C++ memory management.

Just for review, here’s how memory management should work:

void foo()
  {
  // x is created on the stack. It is deallocated at the end of
  // the block/function and therefore its lifetime matches its
  // scope with no further effort. 

  int x;

  // pX is a pointer to an int that the programmer creates with
  // new. By using "new", the programmer is taking responsibility
  // for freeing the memory used by pX before it goes out of scope.

  int *pX = new int(0);

  // ... interesting code goes here ...

  // The obligatory delete before we exit the block/function.

  delete pX;

  }

Everything else in C/C++ is a variation on this. You can put pointers and variables in structures and classes/objects but they follow the same rules: If you allocate with new, you must free with delete before you lose track of the memory (i.e. the one and only (or last remaining) pointer goes out of scope).

When we started coding for iOS, we ran into “manual retain/release,” which is a variation on the C/C++ technique (or rather, a manual counterpart to the automatic garbage collection Objective-C offered on the Mac):

#import <Foundation/Foundation.h>

@interface bar : NSObject
  {
  // Like C++, when the pointer is a member (instance) variable, 
  // someone else is responsible for allocating memory for it.

  NSString * memberString;
  NSString * anotherString;
  }

// But if the instance variable is accessible from the outside
// world we can say it's a "property" and some of this is 
// managed for us. 

@property (retain) NSString * memberString;

// Unless we don't specify "retain". Now we're responsible for
// making sure the memory for anotherString is allocated and
// freed.

@property (assign) NSString * anotherString;

@end

@implementation bar

- (void)foo
  {
  // These are the same. They're on the stack and are automatically 
  // released when you exit the method/block.

  int x;

  // This is the equivalent of C/C++ "new", kind of. We can't just
  // do memory allocation without also initializing the object 
  // (handled by new and the constructor in C++, but that's the 
  // subject of a different article). The result is a pointer that
  // we're obligated to release before string goes out of scope.

  NSString * string = [[NSString alloc] init];

  // Another way of doing the same thing, but this time the 
  // resulting pointer is automatically released sometime in
  // the future that we don't care about.

  NSString * arString = [[[NSString alloc] init] autorelease];

  // Yet another way of doing the same thing, but the autorelease
  // is done for us. We can tell because the method name starts
  // with something that looks like the name of the class but
  // without the prefix. Intuitively obvious, right?

  NSString * autoString = [NSString stringWithUTF8String:"Automatically released"];

  // Required release

  [string release];
  }

@end

And autorelease isn’t as automatic as you might think. You need to think about whether or not you need to create your own autorelease pool. This matters if you’re going to create a large number of autoreleased objects before returning to the run loop; in that case you may want to manage your own pool so you can free memory at more convenient times.

If that’s not ridiculous enough, along comes Automatic Reference Counting (ARC) to “simplify” memory management.

@interface bar : NSObject
  {
  // Like C++, when the pointer is a member (instance) variable, 
  // someone else is responsible for allocating memory for it.

  NSString * memberString;
  NSString * anotherString;
  }

// Instead of "retain", we create a "strong" reference. The memory
// is freed when the last strong reference to the object goes away
// (for example, when this instance variable is reassigned or its
// owning object is deallocated).

@property (strong) NSString * memberString;

// We use "weak" instead of "assign" to mean that we understand
// someone else is in control of when this memory gets freed.

@property (weak) NSString * anotherString;

@end

@implementation bar

- (void)foo
  {
  // These are the same. They're on the stack and are automatically 
  // released when you exit the method/block. In reality, the object
  // pointer is qualified with __strong by default.

  int x = 10;
  NSString * string = [[NSString alloc] init]; // could add __strong for clarity

  // You can also create weak pointers for some reason:

  NSString * __weak weakString;

  // Unfortunately, that introduces a bug into these lines of code:

  weakString = [[NSString alloc] initWithFormat:@"x = %d", x];
  NSLog(@"weakString is '%@'", weakString);

  // In the code above, weakString is immediately deallocated after
  // it is created because there is no strong reference to it. See
  // how this is getting easier?

  // Not to mention:

  NSString * __autoreleasing arString;
  NSString * __unsafe_unretained uuString;

  // Now we don't have to do this:
  // [string release];
  // And that's really all we saved by introducing "Automatic Reference Counting".
  // At the same time, we created a new way to introduce a bug by failing
  // to have any strong pointers to an object from one line of code to
  // the next.
  }

@end

So we’ve gone from:

new / delete

to

retain / release (or autorelease with no release)

to

strong/__strong/weak/__weak/__autoreleasing/__unsafe_unretained

all in the interest of “simplification” and avoiding having to delete/release the memory we allocate. I frankly don’t see the benefits.

Implementing Interprocess Locking with SQL Server

I suppose everyone does this and I just haven’t heard about it. I don’t get out much, so it seems cool to me.

When we redesigned our company website (www.laridian.com) a couple years back, I needed a way to automatically update best-seller lists, new releases, and other dynamic data on the site without relying on an employee to do it every week/month/quarter. Initially, I considered writing a script that did this kind of thing and was launched by the OS on a schedule every so often, but I try to stay away from creating yet another little thing I’ll have to remember if we ever move the site or are forced to recreate it on another server.

So it occurred to me that I could keep track of when I had last created a particular list or other piece of dynamic content on the site, and the first user who requests it after some time period (say, once a month for “best sellers” and once a week for “new releases”) would cause the site to notice the content was old and regenerate it. That’s a cool idea on its own, but it isn’t the subject of this article.

One of the problems I wanted to avoid was having two or three users who happened to show up at about the same time all trigger the process. I was concerned that it might be time-intensive and while I don’t mind delaying one customer while the data is created, I didn’t want to delay everyone who visits the site during those few seconds. So I came up with the idea of using SQL Server to implement a generic “lock” or “semaphore” capability I could use anywhere on the site.

The idea is to have a simple table with a Name field and a SetTime field. The Name field is given the UNIQUE constraint, so that duplicate records with the same Name field are not allowed. The first customer session that discovers it needs to rebuild the best-sellers list tries to INSERT a record with Name = ‘Best Sellers’ and SetTime = GETDATE(). If the INSERT succeeds, the process “owns the lock” and can do what it needs to do. If someone else comes along shortly thereafter and discovers it, too, needs to update the best-sellers list, it will try to do the same INSERT and will fail due to the existence of a record with the same Name field. This second process does not own the lock, and cannot update the best-sellers list. Instead, it uses the old list.

Once the first session has updated the list, it simply DELETEs the record, thus releasing its lock on the best-sellers list.

Since the INSERT is an atomic operation, there’s no possibility that two sessions will both believe they wrote the record.

Since the web is a flaky place, it’s necessary to allow for the possibility that a lock obtained a long time ago was never released. So every request for a lock checks the SetTime field. If the existing record is “too old” it is deleted before the attempt is made to INSERT the record.
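
Here’s a minimal T-SQL sketch of the idea (the table name, column types, lock name, and ten-minute timeout are all illustrative):

-- One row per currently held lock; the UNIQUE constraint does the real work.
CREATE TABLE Locks (
    Name    VARCHAR(100) NOT NULL UNIQUE,
    SetTime DATETIME     NOT NULL
);

-- Clear a stale lock (anything older than ten minutes), then try to take it.
DELETE FROM Locks
WHERE Name = 'Best Sellers' AND SetTime < DATEADD(MINUTE, -10, GETDATE());

-- If this INSERT succeeds, this session owns the lock. If it fails on the
-- UNIQUE constraint, another session owns it, and we serve the old list instead.
INSERT INTO Locks (Name, SetTime) VALUES ('Best Sellers', GETDATE());

-- ... regenerate the best-sellers list ...

-- Release the lock.
DELETE FROM Locks WHERE Name = 'Best Sellers';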

This allows a certain amount of interprocess cooperation and communication between my Classic ASP pages with very little effort.

One of the side effects is that the locks span not only all the processes running on the server, but can be made to span processes running on user devices. A recent use case for this capability was the need to keep a user from synchronizing his notes, highlights, or bookmarks from two (or more) devices with the Laridian “cloud” at the same time. The result can be unexpected loss of data on one or both of the devices.

The solution to this potential problem was for the synchronization process to request a lock that contains both the name of the table being synchronized and the customer ID. That way, many customers can synchronize, say, Bible bookmarks at the same time, but any one user can only synchronize one device at a time. This is a little more complicated than it seems, since PocketBible for Windows and PocketBible for iOS each have their own synchronization script on the server, while our newer clients (PocketBible for Android, Windows RT, and Windows Phone) use our new TCP-based synchronization server. The scripts for the older clients are written in Classic ASP and are invoked through HTTP POST operations from the client, while the new TCP server is written in C# and runs as a Windows Service. All have access to the same SQL Server database, and all implement the same locking strategy, which is working well.

In addition, during the debug process the TCP server runs on my local machine and connects via VPN to SQL Server. I can use and test the locking mechanism in this way before it goes live.

The combination of a very simple implementation using technology (SQL Server) that is well-known and well-tested, and the ability to implement locking across platforms makes this an interesting and (I would argue) elegant solution to a large number of problems.