C# and .NET
John Reeve
Another Language is Born
In mid-2000, a new programming language called C# was submitted by
Microsoft to the ECMA standards group. On July 11, 2000, Microsoft
issued a press release announcing the .NET Framework to unite
programming languages for Web-based uses. Whilst many languages have
come and gone, C# has many features that could lead to its wide
adoption in financial applications. This article takes a brief look at
some of these features.
Overview
The syntax of C# borrows very heavily from C++. However, whilst object
oriented programming was bolted onto C as an afterthought, C# has been
designed from the outset as a pure object oriented language where
everything is derived from object.
The second major departure from C++ is that C# runs in a managed
environment, the common language run-time (CLR), in much the same way
as Java does. Code is compiled into an intermediate language (MSIL)
which is JIT compiled into native processor code at run-time. In
theory, this allows C# to be used cross-platform. Memory management and
object destruction are handled automatically by the run-time framework,
freeing the programmer from this task, saving time and eliminating most
of the problems of memory leaks. Objects created from classes are
addressed by reference, allowing the framework to handle the allocation
of the memory on the heap.
The .NET part of C#.NET refers to a very comprehensive and flexible
class library that is supplied with the CLR. This is where many of the
most useful features of C#.NET can be found but, unfortunately, it is
also where the ability to use applications cross-platform tends to end.
New Features
There are a few features of the language that are worthy of note and
the most useful of these is the ability to define objects with events.
In C++ it is generally necessary to pass a pointer to an object at the
time of object construction in order to call out to other objects.
However, in C# an object can be declared with an event. The event can
only be fired by code within the object, but any external object can
subscribe to receive the event provided it has a receiving method of
the correct format. The format of receiving methods is defined by a
delegate, which specifies the parameters and return type of the method.
Objects can subscribe to and unsubscribe from the event dynamically
whilst the application is executing, allowing a great deal of
flexibility in distributing event based information around the
application. Multiple objects can also subscribe to a single event, in
which case the methods on the receiving objects are called in the order
that the objects subscribed to the event. For example, if an
application is being developed that requires a price update event to be
distributed to a dozen processing objects, it is simply necessary to
write a data source object with an event and then get each processing
object to subscribe to the event.
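The price update example can be sketched roughly as follows. This is
only an illustration: the names PriceUpdateHandler, PriceSource and
PriceProcessor are invented for the example and do not come from any
library.

```csharp
using System;

// The delegate fixes the parameters and return type every handler must match.
public delegate void PriceUpdateHandler(string symbol, double price);

// Hypothetical data source object that owns the event.
public class PriceSource
{
    public event PriceUpdateHandler PriceUpdated;

    // Only code inside PriceSource is able to fire the event.
    public void Publish(string symbol, double price)
    {
        PriceUpdateHandler handler = PriceUpdated;
        if (handler != null)
        {
            handler(symbol, price);
        }
    }
}

// Hypothetical processing object that counts the updates it receives.
public class PriceProcessor
{
    public double LastPrice;
    public int Updates;

    public void OnPriceUpdated(string symbol, double price)
    {
        LastPrice = price;
        Updates++;
    }
}

public static class EventDemo
{
    public static void Main()
    {
        PriceSource source = new PriceSource();
        PriceProcessor first = new PriceProcessor();
        PriceProcessor second = new PriceProcessor();

        // Both processors subscribe; handlers run in subscription order.
        source.PriceUpdated += first.OnPriceUpdated;
        source.PriceUpdated += second.OnPriceUpdated;
        source.Publish("ABC", 101.5);

        // Subscriptions can be removed whilst the application is running.
        source.PriceUpdated -= second.OnPriceUpdated;
        source.Publish("ABC", 102.0);

        Console.WriteLine(first.Updates);   // 2
        Console.WriteLine(second.Updates);  // 1
    }
}
```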
A second new language feature is the addition of properties. Each
property allows data within an object to be accessed externally via get
and set accessor methods. Properties can be defined with only get
or set or both get and set accessors. Important private variables
within an object can be made accessible using a property defined with
only a get accessor. This allows external objects access to the
variable whilst ensuring it cannot be modified externally. Whilst this
is also possible using methods in C++, the addition of properties to C#
makes the resulting code tidier and easier to understand.
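A minimal sketch of both kinds of property is shown below. The Position
class and its members are hypothetical and chosen only to illustrate a
get-only property alongside a get/set property.

```csharp
using System;

// Hypothetical class used only to illustrate properties.
public class Position
{
    private readonly string symbol;
    private int quantity;

    public Position(string symbol, int quantity)
    {
        this.symbol = symbol;
        this.quantity = quantity;
    }

    // Get-only property: readable from outside, never modifiable from outside.
    public string Symbol
    {
        get { return symbol; }
    }

    // Get/set property: the set accessor can validate the incoming value.
    public int Quantity
    {
        get { return quantity; }
        set
        {
            if (value < 0)
            {
                throw new ArgumentOutOfRangeException("value");
            }
            quantity = value;
        }
    }
}

public static class PropertyDemo
{
    public static void Main()
    {
        Position position = new Position("ABC", 100);
        position.Quantity = 250;            // calls the set accessor
        Console.WriteLine(position.Symbol); // calls the get accessor
        Console.WriteLine(position.Quantity);
        // position.Symbol = "XYZ";         // would not compile: no set accessor
    }
}
```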
Finally, multiple inheritance is not supported in C# but interfaces
have been introduced as an alternative. Interfaces are basically object
templates that define the properties, methods and events that are
supported by an object. An object that implements an interface must
have all the features defined by the interface but otherwise there are
no restrictions. Objects can also implement multiple interfaces, and
any object reference may be cast to any interface it implements. Hence,
an application may have three different datafeed objects with entirely
different implementations that all implement a common datafeed
interface. The datafeed objects can be cast to the common
interface and accessed from within the application using an identical
set of methods and properties. Whilst standard inheritance could be
used for this example by defining a base datafeed class, if there is
nothing in common between the datafeed objects then using an interface
provides a neater, more logical solution.
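The datafeed example might look something like the sketch below. The
interface IDataFeed and the two feed classes are hypothetical, and the
prices they return are placeholders rather than real data.

```csharp
using System;

// Hypothetical common interface shared by otherwise unrelated feeds.
public interface IDataFeed
{
    string Name { get; }
    double GetLastPrice(string symbol);
}

// One implementation: pretends to read prices from a file.
public class FileDataFeed : IDataFeed
{
    public string Name { get { return "file"; } }

    public double GetLastPrice(string symbol)
    {
        return 100.0; // placeholder value
    }
}

// A completely different implementation: generates prices itself.
public class SimulatedDataFeed : IDataFeed
{
    private double price = 50.0;

    public string Name { get { return "simulated"; } }

    public double GetLastPrice(string symbol)
    {
        price += 0.25;
        return price;
    }
}

public static class FeedDemo
{
    public static void Main()
    {
        // Both objects are handled purely through the common interface.
        IDataFeed[] feeds = new IDataFeed[]
        {
            new FileDataFeed(),
            new SimulatedDataFeed()
        };

        foreach (IDataFeed feed in feeds)
        {
            Console.WriteLine(feed.Name + ": " + feed.GetLastPrice("ABC"));
        }
    }
}
```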
Memory Management
Although everything in C# is an object, the handling of each object in
memory depends on its type. Simple integral types, floats and structs
are called "value types" and by default are handled on the stack in the
same way as in C++. Classes and arrays are called "reference types" and
are always assigned to heap memory. As everything is an object, there
are cases where a value type might be passed to a method which is
expecting a reference type. In this case a process called boxing is
used, where the value type is copied from the stack to the heap. The
reverse process is unboxing, where a value type stored on the heap is
copied back to the stack. Boxing and unboxing both reduce execution
speed, so they should be avoided where possible.
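As a small illustration of boxing and unboxing, the sketch below
assigns an int to a variable of type object and then casts it back. The
class and method names are invented for the example.

```csharp
using System;

public static class BoxingDemo
{
    // Boxes the value into a heap object, then unboxes it again.
    public static int BoxAndUnbox(int value)
    {
        object boxed = value;   // boxing: the int is copied onto the heap
        return (int)boxed;      // unboxing: the value is copied back out
    }

    public static void Main()
    {
        int price = 42;
        object boxed = price;   // boxed holds its own copy of the value
        price = 43;             // changing the original leaves the copy alone

        Console.WriteLine((int)boxed);        // 42
        Console.WriteLine(BoxAndUnbox(price)); // 43
    }
}
```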
Memory management of the heap is performed by a garbage collector. This
runs on a background thread and identifies objects that are no longer
referenced. Memory is reclaimed automatically as objects become
obsolete; however, the exact time at which any given object is
destroyed is not under the control of the programmer. In cases where a
large amount of memory has become free, it is nevertheless possible to
force garbage collection.
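Forcing a collection can be done with the GC class, roughly as
sketched below. The buffer size and the method name are arbitrary
choices for the example, and the amount of memory reported afterwards
will vary from run to run.

```csharp
using System;

public static class GcDemo
{
    // Allocates a large temporary buffer, releases it and forces a
    // collection. Returns the managed heap size reported afterwards.
    public static long AllocateAndCollect()
    {
        byte[] buffer = new byte[50 * 1024 * 1024];
        buffer[0] = 1;   // touch the buffer so the allocation is used
        buffer = null;   // drop the only reference to it

        GC.Collect();                  // request an immediate collection
        GC.WaitForPendingFinalizers(); // let any pending finalizers finish

        return GC.GetTotalMemory(false);
    }

    public static void Main()
    {
        Console.WriteLine("Heap bytes after collection: " + AllocateAndCollect());
    }
}
```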
As a general rule the garbage collector does an efficient job of
managing memory resources without having a significant impact on
program performance. However, there are a few cases where caution is
needed. If an object has subscribed to an event but all other
references to it have been destroyed, it will not be garbage collected
because the event still holds a reference to it. This reference is
easily overlooked as it is effectively hidden within the event. Hence,
event handler code may continue to execute in an object that otherwise
appears to have fallen out of reference. To ensure this does not happen
it is important to unsubscribe redundant objects from events before
they go out of reference. A second issue occurs if objects are being
created very rapidly but have a short lifetime. Under these
circumstances, the garbage collector can fail to keep up and memory
usage can grow far beyond what is really needed. The use of objects in
this way is not efficient and the problem is easily solved either by
re-using objects or by storing data in structs that are handled more
efficiently on the stack.
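The hidden event reference can be demonstrated with a sketch like the
one below. TickSource and TickListener are hypothetical names; the
static counter exists only so that handler calls can be observed after
the local references have been dropped.

```csharp
using System;

// Hypothetical event source.
public class TickSource
{
    public event EventHandler Tick;

    public void RaiseTick()
    {
        EventHandler handler = Tick;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }
}

// Hypothetical subscriber; the static counter records every handler call.
public class TickListener
{
    public static int TotalHandled;

    public void OnTick(object sender, EventArgs e)
    {
        TotalHandled++;
    }
}

public static class SubscriptionDemo
{
    public static int Run()
    {
        TickListener.TotalHandled = 0;
        TickSource source = new TickSource();

        // Subscribed and then "forgotten": the event still references it.
        TickListener forgotten = new TickListener();
        source.Tick += forgotten.OnTick;
        forgotten = null;
        GC.Collect();
        source.RaiseTick();   // the forgotten listener still runs

        // Unsubscribed before being dropped: no hidden reference remains.
        TickListener tidy = new TickListener();
        source.Tick += tidy.OnTick;
        source.Tick -= tidy.OnTick;
        tidy = null;
        source.RaiseTick();   // only the forgotten listener runs

        return TickListener.TotalHandled;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // 2: both calls came from the forgotten listener
    }
}
```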
Pointers
Although C# runs in a managed environment, it also provides facilities
for the limited use of pointers. Pointers have to be declared in code
marked as unsafe and can be used to point to floats, integral types
and structs. Obtaining pointers to strings and arrays is also possible.
As these are managed on the heap by the garbage collector, the fixed
statement is provided to pin the memory position whilst the program is
executing within a specified portion of code. Arrays accessed using
pointers avoid the array boundary checking that is used on standard
arrays, which can increase the speed of data access.
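A minimal sketch of pointer access to an array is given below. It
assumes the code is compiled with unsafe code enabled (for example the
/unsafe compiler option); the class and method names are invented for
the example.

```csharp
using System;

public static class UnsafeDemo
{
    // Sums an array through a pointer, bypassing array bounds checks.
    public static unsafe double Sum(double[] values)
    {
        double total = 0.0;

        // fixed pins the array so the garbage collector cannot move it
        // whilst the pointer is in use.
        fixed (double* start = values)
        {
            double* current = start;
            double* end = start + values.Length;
            while (current < end)
            {
                total += *current;
                current++;
            }
        }

        return total;
    }

    public static void Main()
    {
        double[] prices = new double[] { 1.5, 2.5, 3.0 };
        Console.WriteLine(Sum(prices)); // 7
    }
}
```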
Microsoft and .NET
The .NET portion of C#.NET refers to the class library that comes with
the run-time environment. The class library provides all the
functionality that is available with VC++ MFC and a great deal more.
The most important new additions are substantial web support, expanded
database support and the remoting infrastructure. Classes for extensive
maths support, hashtables and linked lists are also provided, along
with classes to support multi-threading.
Remoting
Perhaps the most interesting part of the .NET class library is the new
remoting infrastructure. This basically replaces older technology such
as DCOM and supports remote objects within the C# environment. Remote
objects can be created across application boundaries and even between
applications running on different machines. The remoting infrastructure
effectively acts as native middleware and allows communication between
multiple applications forming complicated distributed systems.
Remote objects are either client activated or server activated
depending on whether the client or server has control of the lifetime
of the object. To complicate matters a little further the server
objects can be single call or singleton. Single call objects are
stateless and created and destroyed for every single call of the remote
method. Singleton objects maintain state information between calls.
The remoting infrastructure generally works extremely well but has a
few areas where problems can arise. Remote methods that return remote
classes can experience problems with security settings; however, with
the appropriate workaround, they will work reliably between processes
on a single machine. Remote objects returning client activated objects
between different computers can run into problems with network settings
and are best avoided. For reliable remoting between different computers
it is best to stick to single-call, server-activated objects that
return native C# types such as ints, floats or arrays. Data in objects
can be serialised into an array for transmission and deserialised at
the receiving end.
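The article does not say how the serialisation should be done; one
possible approach, using BinaryWriter and BinaryReader over a
MemoryStream, is sketched below. The Tick struct and TickCodec class
are hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical data item to be sent between machines.
public struct Tick
{
    public long Time;
    public double Price;
    public int Volume;
}

public static class TickCodec
{
    // Packs a tick into a byte array suitable for returning from a remote call.
    public static byte[] ToBytes(Tick tick)
    {
        using (MemoryStream stream = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(stream))
        {
            writer.Write(tick.Time);
            writer.Write(tick.Price);
            writer.Write(tick.Volume);
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Rebuilds the tick at the receiving end, reading fields in the same order.
    public static Tick FromBytes(byte[] data)
    {
        using (MemoryStream stream = new MemoryStream(data))
        using (BinaryReader reader = new BinaryReader(stream))
        {
            Tick tick = new Tick();
            tick.Time = reader.ReadInt64();
            tick.Price = reader.ReadDouble();
            tick.Volume = reader.ReadInt32();
            return tick;
        }
    }
}

public static class CodecDemo
{
    public static void Main()
    {
        Tick original = new Tick();
        original.Time = 1234567890L;
        original.Price = 101.25;
        original.Volume = 500;

        byte[] wire = TickCodec.ToBytes(original);
        Tick copy = TickCodec.FromBytes(wire);

        Console.WriteLine(wire.Length); // 20 bytes: 8 + 8 + 4
        Console.WriteLine(copy.Volume); // 500
    }
}
```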
Multi-threading
The .NET library also contains a number of classes to help with the
implementation of multi-threaded applications, including mutex and
monitor classes. A new thread is easily created by creating a thread
object, passing it a thread-start delegate and then calling its start
method. The thread class provides simple methods for controlling thread
priority, pausing, resuming and killing a thread. The language includes
a lock statement that can be used to control thread access to shared
data within a class.
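Thread creation and the lock statement can be sketched as follows. The
SharedCounter class is hypothetical; without the lock, the two threads
could interleave their updates and lose increments.

```csharp
using System;
using System.Threading;

// Hypothetical class whose state is shared between two threads.
public class SharedCounter
{
    private readonly object sync = new object();
    private int count;

    // Thread entry point: matches the ThreadStart delegate (no parameters, void).
    public void AddMany()
    {
        for (int i = 0; i < 100000; i++)
        {
            lock (sync)   // only one thread at a time may enter this block
            {
                count++;
            }
        }
    }

    public int Count
    {
        get { lock (sync) { return count; } }
    }
}

public static class ThreadDemo
{
    public static void Main()
    {
        SharedCounter counter = new SharedCounter();

        Thread first = new Thread(new ThreadStart(counter.AddMany));
        Thread second = new Thread(new ThreadStart(counter.AddMany));

        first.Start();
        second.Start();
        first.Join();    // wait for both threads to finish
        second.Join();

        Console.WriteLine(counter.Count); // 200000
    }
}
```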
External Components
Support for external components implemented in unmanaged code is
provided within the framework by the interop services. Simple DLLs can
be accessed by declaring the external functions within C#. These can
then be called from managed code without any noticeable performance
overhead.
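Declaring an external function looks roughly like the sketch below. It
assumes a Windows machine, since it calls the Win32 function
GetTickCount in kernel32.dll; the wrapper class name is invented, and
the platform check is there only so the example does nothing harmful
elsewhere.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeDemo
{
    // Declaration of an unmanaged function exported by a Windows system DLL.
    [DllImport("kernel32.dll")]
    private static extern uint GetTickCount();

    // Returns milliseconds since system start, or 0 when not on Windows.
    public static uint MillisecondsSinceStart()
    {
        if (Environment.OSVersion.Platform != PlatformID.Win32NT)
        {
            return 0; // assumption: kernel32.dll is only available on Windows
        }
        return GetTickCount();
    }

    public static void Main()
    {
        Console.WriteLine(MillisecondsSinceStart());
    }
}
```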
Support for external COM components is more complicated and involves
creating an interoperability wrapper DLL for the COM component. The
development environment contains tools to generate the wrapper DLLs
automatically, so it is only necessary to point at the COM component
and click. Once the wrapper DLL is produced, the application can talk
to the wrapper, which handles all the communications with the unmanaged
COM component.
Execution Speed
As with any program, speed of execution depends far more on how well
the application is coded than on the underlying language. From
experience, the performance of a well designed C# application is very
similar to that of an equivalent C++ application. However, this does
depend on avoiding some of the speed pitfalls such as boxing/unboxing
and the boundary checks made on all array accesses. Unsafe C# code that
makes use of the pointer facilities seems to execute at exactly the
same speed as a C++ equivalent. This gives the option of writing code
that is not performance critical using standard safe C# and then
optimising the critical parts using pointers in unsafe code. A
historical data server implemented like this was capable of reading 5
million ticks from disk, compressing this into 100,000 bars and
returning the results over the network in less than a second on a
fairly standard 2GHz PC.
Conclusion
C# is a modern language with clean syntax and development tools to
allow rapid application development. The use of a managed environment
makes it easier to write applications and eliminates the potential
problems of bad pointers and memory leaks that can cause unreliability
in applications developed in C++. The .NET class library provides many
features, such as support for events, multi-threading and remote
objects, that make it ideal for writing real-time financial
applications. Applications developed in C# can offer most of the speed
provided by C++.