F# for game development: oop

Showing posts with label oop. Show all posts

Tuesday, August 2, 2011

Is it worth it?

In my previous post, I explained why inheritance for code reuse is a bad idea. The blog was a bit on the theoretical side, and I wanted to know if I wasn't sitting in an ivory tower.

For this reason, I have decided to rewrite one of my uses of inheritance by composition. The piece of code in question is the BaseScreen class in XnaUtils.

Before I go any further, I need to give you a bit of context. The user interface of video games is usually composed of screens. The library provides a class called ScreenManager which is responsible for managing stacks of screens, notifying each screen when to load/unload resources such as textures...

I have created an interface named "Screen" which serves as the interface between the framework and the screens in a game. The framework also provides a few screens which can be used in simple games, namely TextScreen, MenuScreen and PressStartScreen.

These screens all share common functionality which I needed to put somewhere to avoid code duplication. In my first version, BaseScreen was an abstract class from which screen implementations inherited, overriding a few key methods to customize rendering.

Differences in ScreenBase

The changes are not very extensive. Abstract methods were replaced by fields. Instead of defining overrides, inheritors set properties using lambdas.

+    let post_drawer : ('T -> unit) ref = ref (fun _ -> ())
...
-    abstract member EndDraw : 'T -> unit
-    default this.EndDraw(_) = ()
-
+    member this.PostDrawer
+        with get() = !post_drawer
+        and set(d) = post_drawer := d

Differences in inheritors

A few "this" had to be replaced by "impl", the name of the variable I have used to hold the instance of ScreenBase.

type PressStartScreen(sys : Environment, fade_in, fade_out, blink) =
-    inherit ScreenManager.ScreenBase<unit>()
+    let impl = new ScreenManager.ScreenBase<_>()
+    let impl_screen = impl :> ScreenManager.Screen

// Task to check if a button is pressed. Sets player when that happens.
let player : PlayerIndex option ref = ref None
while (!player).IsNone do
-            do! sys.WaitUntil(fun () -> this.IsActive)
+            do! sys.WaitUntil(fun () -> impl.IsActive)
for p in all_players do
let state = GamePad.GetState(p)
if state.IsConnected

Boiler-plate code had to be written to implement interfaces Screen and IDisposable, forwarding calls to "impl". As ScreenBase implements these interfaces explicitly, I had to introduce an extra variable "impl_screen" to refer to impl's implementation of Screen. Otherwise, I would have had to use casts in all forwarding code, as shown in the implementation of IDisposable.

+    interface ScreenManager.Screen with
+        member this.ClearScreenManager() = impl_screen.ClearScreenManager()
+        member this.Draw() = impl_screen.Draw()
+        member this.LoadContent() =
+            impl_screen.LoadContent()

+
+    interface System.IDisposable with
+        member this.Dispose() = (impl :> System.IDisposable).Dispose()
+

Advantages of composition and delegation over inheritance

1. Inheritors can reuse code from multiple classes (F# does not have multiple inheritance).
2. The code for ScreenBase is a bit shorter.
3. Inheritors have more control.

Advantages of inheritance over composition and delegation:

1. The "IDisposability" of ScreenBase is implicitly propagated to its inheritors.
2. No risk of errors in delegation.
3. The code for inheritors is shorter.

Conclusion

Although I believe most of the weaknesses of the approach using composition and delegation could be lifted by additions to the F# language, in its current shape, I would not dare to force users of my library to use composition.

The ideal solution is to write my library code to allow users to use the approach they prefer, which I expect would be inheritance in most cases. Providing this approach as the only one is not acceptable, as it would prevent users to reuse features from multiple classes.

Sunday, July 17, 2011

Why inheritance for code reuse should be avoided

The question of when to use inheritance and when not to has been asked many times. I have noticed answers can be put in one of three categories:

1. Use as much as you can. It's a convenient way to maximize code reuse with very few keystrokes. Actually, this answer is seldom explicitly given, but if I judge by the code I see, this is a wide-spread opinion.

2. There are situations when inheritance is appropriate, and situations when it isn't. Namely, inheritance is appropriate to express IS-A relationships.

3. Never use it for the purpose of code reuse. Some take it even further and say never use it at all. Inheritance is flawed by design.

Throughout my life as a programmer, I have gone through all of these stages, in order. I think most of the controversy focuses on inheritance for the purpose of code reuse, and I'll start with that.

Inheritance for the purpose of code reuse
The first answer is easy to discard, as explained in this answer on the programmers stackexchange. Systematically using inheritance quickly leads to excessively large objects. The author of the code probably won't see this as a problem at first, and may indeed be oblivious to the problem. Unfortunately, this makes it very easy to fall into this trap.

The second answer begs for an answer to the question "what is an IS-A relationship?". I find that all IS-A relationships can be expressed as HAS-A, and vice-versa. Taking the usual example of cars, we can say a car has four wheels, and it is a four-wheeled vehicle. A car has an engine, and it is an engine-powered vehicle. Which is the correct way of expressing relationships between cars and their components?

Instead of relying on IS-A vs HAS-A, one can use the Liskov Substitution Principle (LSP), as pointed out in this other answer.

Informally, the principle states that if B inherits from A, then B must be usable in all places where A is used.

This is a bit of a loose definition, and even the more formal definition by Barbara Liskov and Jeanette Wing stated below invites further questions.

Let q(x) be a property provable about objects x of type T. Then q(y) should be provable for objects y of type S where S is a subtype of T.

What is the proof system assumed here? What kind of sentences can be expressed? Do the axioms include knowledge about the program?

Informally, how should "usable" be interpreted in the informal statement of the principle?

I don't think there is a single fit-all answer to this question. There are a number of aspects to consider, which all affect the initial development time and degree of maintainability of a software project.

Language Semantics.
Note that a language which allows to query the concrete type of an object makes it impossible to meet the LSP.

let f (x : A) =
    match x with
    | :? B -> failwith "Failing!"
    | _ -> x.DoSomething()

Obviously, an instance of B cannot be used with f, as it would cause the program to behave differently as when provided with instances of A (or some other subtype of A).

Notion of equivalence.
That instances of B can be used instead of A means that the behavior of the software when B is used is "observably equivalent" to that of the software when A is used.

There are different levels of strictness regarding equivalence. Including internal state in the observable behavior is a bad idea, as this forbids method overriding and polymorphism.

Even limiting oneself to "user-visible events" may be too limiting, as this closes the door to plug-ins that extend or modify functionality.

In the end, I think a popular but very weak notion is whether the program crashes or not. And even that may be too strong, as it's probably acceptable for instances of B to work where instances of A crash (OK, that's nit-picking).

Scope.
Does "all places where A is used" refer to the places as they exist in the current form of the software, in the current and expected future forms, or in all possible forms?

There is a bit of overlap with the first aspect. When assuming the scope is limited to the current state, and if no concrete type tests on instances of A are made anywhere, then meeting the LSP seems feasible.

Assuming the second alternative (current and expected future forms) requires either a clear idea of what the future looks like, or a coding standard prohibiting the use of all language constructs potentially incompatible with the LSP.

If you are designing a library or framework to be used by other parties you don't control, you are condemned to the last option, "all possible forms". Unless you have a low level of ambition on the notion of equivalence, the LSP is impossible to meet in this setting.

There may well be other aspects I haven't thought of, but I hope I made it clear that the LSP isn't quite as simple as it may seem at a first glance. Using inheritance only "when it conforms to the LSP" often equates to "never" if one really cares about the LSP. Moreover, it requires an understanding of the principle itself as well as clearly identifiable decisions on all the points mentioned above, two conditions which are not met that often out-there-in-the-real-world.

Note that it's possible and indeed desirable to specify limitations on language features, equivalence and scope on a per-type basis. Doing it on a project-wide level would probably make it impossible to respect the LSP.

Inheritance limited to interfaces
My standing on the question of inheritance when limited to interfaces is not quite clear yet. I can't quite put my finger on the exact problem yet, so I'll leave it to another day to express my thoughts in details on the subject.

Shortly, it seems to me this kind of inheritance brings little, if anything at all, over composition. The argument of the "keystroke minimality" applies only to subtyping between interfaces, which I have had little practical use for so far.

Alternatives to inheritance
Composition is often cited as the right way to take advantage of code reuse. To put it mildly, mentioning it often fails to trigger the enthusiasm of proponents of inheritance for code reuse. It's easy to understand why:

type A =
   ...
   member this.DoSomething1() = ...
   ...
   member this.DoSomething15() = ...

type OhForGodsSake_B =
   let an_a : A = ...
   member this.DoSomething1() = an_a.DoSomething1()
   ...
   member this.DoSomething15() = an_a.DoSomething15()

type WhatABreeze_B() =
   inherit A()

Indeed, there is something to be said for the second approach. I wish F# had some way to achieve the conciseness of inheritance through composition, but that's not the case. This may be an oversight, or may be by design. As Matthieu M. writes in his answer, forcing verbosity can be beneficial:

[It] means that at least a couple keystrokes are needed for each method, and suddenly people begin to think about whether or not it's such a great idea to reuse this "fat" class for just a small piece of it.

I would add that this would also force the author of B to think about whether all 15 DoSomething() methods really are needed in B, or if it's enough to provide just a few of them.

By the way, should the methods in B have the same name as their counter-parts in A? If A is "VehicleWithWeels", and "B" is "Airplane", it makes sense to expose "VehicleWithWeels.Brake" as "Airplane.BrakeWithWheels", as airplanes have different means of braking. It's easy to miss that subtle point when using inheritance.

All-in-all, I don't think I should value my laziness and comfort over cleaner code, at least not in a professional setting. Still, I can't help being lazy... Programming isn't all about writing clean code, it's about having fun too.

When dealing with implementation relationships between concrete types and interfaces, I am happy with seeing it as "HAS-A" kind of relationship, or more precisely (and verbosely) "OFFERS-FUNCTIONALITY-AS-SPECIFIED-BY-INTERFACE". This can be achieved in two ways in F#:

// Explicit implementation
type B1 =
   ...
   interface A with
      method this.DoSomething() = ...

// Composition using object expressions
type B2 =
   ...
   member this.AsA() =
      { new A with
           member x.DoSomething() = ... // Note "x" vs "this". "this" refers to B2.
      }

The first approach can be used when it's clear B1 can only provide A in one single sensible way at all times, the second approach is more general and flexible. One might wonder if the second approach shouldn't always be preferred. Where do extremes end? ;)

Friday, July 15, 2011

OOP without classes in F#

Did you know that F# allows to use some object-oriented features with types that are not classes? This includes methods, dynamic dispatch, open recursion and encapsulation.

Methods are functions or procedures which are tightly coupled to a type of object.

Dynamic dispatch is the ability to execute different implementations of an operation depending on the concrete types of arguments. A special case which often has dedicated syntax is when the operation is a method, and the argument controlling the dispatch is the object itself.

Open recursion makes the object available to its methods through an identifier, typically called this or self.

Encapsulation restricts access to the internal data and operations of an object to a restricted area of the code. Typically this area is the methods of the object itself, F# works at the module level instead.

The code below shows how a vector type can be defined using records.

[< CustomComparison >]
[< CustomEquality >]
type Vector2 =
    private { x : float
              y : float }
with
    // Constructors
    static member New(x, y) = { x = x; y = y }
    static member Zero = { x = 0.0; y = 0.0 }

    // Accessors
    member this.X = this.x
    member this.Y = this.y
    member this.Length = let sq x = x * x in sqrt(sq this.x + sq this.y)

    // Methods
    member this.CompareTo(other : obj) =
        match other with
        | null -> nullArg "other"
        | :? Vector2 as v -> this.CompareTo(v)
        | _ -> invalidArg "other" "Must be an instance of Vector2"

    member this.CompareTo(other : Vector2) =
        let dx = lazy (this.x - other.x)
        let dy = lazy (this.y - other.y)

        if dx.Value < 0.0 then -1
        elif dx.Value > 0.0 then +1
        elif dy.Value < 0.0 then -1
        elif dy.Value > 0.0 then +1
        else 0
        
    // Overrides for System.Object
    override this.ToString() = sprintf "(%f, %f)" this.x this.y
    override this.Equals(other) = this.CompareTo(other) = 0
    override this.GetHashCode() = (this.x, this.y).GetHashCode()

    // Explicit interface implementations
    interface System.IComparable<Vector2> with
        member this.CompareTo(other) = this.CompareTo(other)

    interface System.IComparable with
        member this.CompareTo(other) = this.CompareTo(other)

    interface System.IEquatable<Vector2> with
        member this.Equals(other) = this.CompareTo(other) = 0

The use of private on line 4 makes the representation (not the type itself or its methods) private to the enclosing module (Thanks to Anton for pointing that out in the comments).
Members can access the object through an identifer (this), and open recursion is covered.
Vector2 can be used as one would expect to be able to use an object, as illustrated below.
Dynamic dispatch is exercised by Array.sortInPlace (which calls CompareTo).

let playWithIt() =
    let v = Vector2.New(0.0, 2.0)
    let v2 = Vector2.New(v.y, -v.x)
    printfn "%s and %s have %s lengths"
        (v.ToString())
        (v2.ToString())
        (if (let epsilon = 1e-6 in abs(v.Length - v2.Length) < epsilon)
            then
            "close"
            else
            "different")
        
    let rnd = System.Random()
    let vs = [| for i in 1..10 -> Vector2.New(rnd.NextDouble(), rnd.NextDouble()) |]
    Array.sortInPlace vs
    printfn "%A" vs

I cannot write an entire article about OOP and not mention inheritance. F#-specific datatypes do not support inheritance, but I will try to show in a later post why this is not as big a problem as one might think.

Monday, April 18, 2011

"I'll Whack Your Fingers"-Oriented Programming

Every now and then, a question about how to transfer data between game screens pops up on the App Hub forums. The source of the problem is that many try to solve this problem using synchronous programming. This does not work because game screens require interaction with users, which cannot be handled in a synchronous manner without freezing the game.

Although F# gives you tools to solve that problem using e.g. async workflows, most people are using C#.

How do you solve that problem in this language? Usually, people use a combination of events and public fields. This solution is fine, but I just don't like events much. They make bugs tricky to track, because I lose track of where handlers fit in the expected flow of execution of my code. Moreover, the only practical way that I know of to pass data from an event handler to the main code is through a shared variable (which I call "I'll Whack My Own Fingers"-oriented programming).

For this reason, I suggested to use Begin-End-style asynchronous programming. This requires an implementation of IAsyncResult if you want to do it in a nice .NET-ish kind of way. I am not aware of any such implementation (EDIT: Here is one, thanks mausch), so I wrote my own:

///  
  /// An implementation of IAsyncResult that wraps result data. 
  ///  
  /// Type of the result data. 
  class AsyncResult<T> : IAsyncResult, IDisposable { 
    public AutoResetEvent signal = new AutoResetEvent(false); 
    public bool isDone = false; 
    public T result = default(T); 
 
    public object AsyncState { 
      get { return state; } 
    } 
 
    public WaitHandle AsyncWaitHandle { 
      get { return signal; } 
    } 
 
    public bool CompletedSynchronously { 
      get { throw new NoneOfYourBusinessException(); } 
    } 
 
    public bool IsCompleted { 
      get { return isDone; } 
    } 
 
    public void Dispose() { 
      signal.Dispose(); 
    } 
  }

Did you notice how I made all fields public? I think many a modern programmer would wince when reading this. What if some careless programmer accessed isDone and modified it? Bad things can follow, and surely we should restrict access to these fields using the private keyword.

I like to call this "I'll whack your fingers if you touch this"-oriented programming. Beside "private" and "protected", which were probably invented to tempt stressed programmers to replace them with "public", enthusiasts of this approach enjoy using "sealed" to punish users who might be tempted to reuse your code.

There are better ways to prevent accidental tempering with an object's internal state: access objects via interfaces.

For instance, here is the code for the Begin-End pair of methods that use my AsyncResult:

///  
        /// Open a MessageBoxScreen asynchronously. 
        ///  
        /// The screen manager to which the screen is to be added. 
        /// The index of the player who has control, or null. 
        /// The text of the message to show. 
        /// An object which can be used to wait for the request to complete and retrieve the result. 
        public static IAsyncResult BeginShowMessage(ScreenManager sm, PlayerIndex? player, string msg) { 
          var result = new AsyncResult(); 
 
          var screen = new MessageBoxScreen(msg); 
          screen.Accepted += (src, args) => { 
            result.isDone = true; 
            result.result = true; // User accepted the message box. 
            result.signal.Set(); 
          }; 
          screen.Cancelled += (src, args) => { 
            result.isDone = true; 
            result.result = false; // User cancelled the message box. 
            result.signal.Set(); 
          }; 
 
          sm.AddScreen(screen, player); 
 
          return result; 
        } 
 
        ///  
        /// Wait for the user to complete interaction with a MessageBoxScreen opened with BeginShowMessage. 
        ///  
        /// The object returned by BeginShowMessage. 
        /// Whether the user accepted or cancelled the message screen. 
        public static bool EndShowMessage(IAsyncResult r) { 
          var result = r as AsyncResult; 
          if (result == null) 
            throw new ArgumentException("Wrong type or null", "r"); 
 
          result.signal.WaitOne(); 
          result.Dispose(); 
          return result.result; 
        }

Note how BeginShowMessage and EndShowMessage return and take respectively an IAsyncResult. Unless users of these methods really want to take risks getting intimate with AsyncResult, there is no risk of tampering. And if they really want to work directly with the inner parts of AsyncResult, why prevent them to do so?

I wonder, what's the point with "private" and "protected" anyway?

UPDATE:

Thinking a bit more on the subject, I think another overused feature resides in non-virtual public methods (and fields). Anything public in the API of a class should belong to an interface, with the exception of constructors and construction helpers (which maybe should be internal to assemblies).

I would however not advise to use abstract data types in all situations. Navigating in the call tree is a good way to learn a new code base, and typically abstract methods defeat this. One could say abstraction through interfaces is a form of obfuscation, as it effectively hides implementations.

F# for game development