GIS

Posted by David Glassborow Tue, 23 May 2006 15:09:00 GMT

I’m writing this post for two reasons. Firstly, my company is currently working with a client on getting more of their data onto their intranet-based mapping tool, so I’ve spent the last three days researching the industry and getting back up to speed.

Secondly, my brother has just got a new job as sales manager for a company providing GIS data, so I thought an introduction and overview of the subject might be of use to him. Apologies if I get too technical, Mike.

Geographic Information Systems

A GIS is a system that stores and displays maps electronically. It’s a smart tool that helps us understand the world around us better and get our work done more efficiently. I’ve been interested in GIS since one of my first jobs, where I was in charge of writing a routing and optimising algorithm for solving a variant of the Travelling Salesman Problem. Essentially our product figured out routes that let workmen get their work done in the most efficient way possible, and sent those routes to their mobile devices so they didn’t get lost. It was really interesting work: we worked with providers of GIS data about the road network, and with the UK government for access to electronic versions of their maps.

I think part of my interest also comes from travelling. When I first go to a new city, I’m always keen to find a map, to give me a feel for the layout. I always choose Lonely Planet over Rough Guide, because their maps are better. It was very strange when I was in the jungle in Guyana: the only maps we had were made from satellite images taken over 10 years earlier, at a very low resolution. Most of the time we only had a vague idea of where we were. River courses flood and move, and the tree cover prevents identification of key features. It made me realise how much we take accurate maps for granted.

GIS is everywhere at the moment. TomTom are advertising their navigation products on TV. Google Local can show me a map of my local city in the time it takes to click a mouse, and Google Earth lets me zoom in from space to see my parents’ house and the roof of my dad’s Land Rover. NASA lets me look at the surface of Venus.

Raster and Vector

The most important division in GIS to understand is the difference between raster and vector information. Raster information is pictures: it’s the pixels that show us what our cities look like from space, the kind of images that digital cameras give us. It’s resolution dependent. Vector information is mathematical in nature: it describes things in terms of lines, circles and polygons.

Google Earth is predominantly raster based: as you zoom in, it loads images with more detail.

Google Earth

Google Local is vector based: it shows you your local high street as a series of straight lines, and the nearby river as a polygon.

Google Local

The real power comes when we use raster and vector together. When we turn on the country boundaries in NASA’s World Wind, we see them overlaid on the raster images of the Earth:

WindWalker

Raster information is an approximation of the real world: it’s pictures of the world. Vectors are mathematical: they are ‘idealised’ versions of the world. The mathematical nature of vectors allows us to use maths to answer questions like “How many roads are there within 2 miles of the river?” Or this more relevant example, “Show me all the pubs within 2 miles of where I live”, in MapPoint 2004:

Mappoint
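To make the “pubs within 2 miles” idea concrete, here is a minimal Delphi sketch of that kind of query over vector point data. It assumes planar coordinates in metres measured from my house, and the pub names and positions are made up purely for illustration.

```pascal
program PubsNearby;
{$APPTYPE CONSOLE}

type
  // A named point feature in planar coordinates (metres east/north of home).
  TPointFeature = record
    Name: string;
    X, Y: Double;
  end;

const
  MetresPerMile = 1609.344;
  // Made-up sample data, purely for illustration.
  Pubs: array[0..2] of TPointFeature = (
    (Name: 'The Red Lion'; X: 1200; Y: 800),
    (Name: 'The Old Ship'; X: 2500; Y: 2600),
    (Name: 'The Crown';    X: -900; Y: 300));

// True when the feature lies within Radius metres of (HomeX, HomeY).
// Squared distances are compared, so no square root is needed.
function WithinRadius(const F: TPointFeature;
  HomeX, HomeY, Radius: Double): Boolean;
begin
  Result := Sqr(F.X - HomeX) + Sqr(F.Y - HomeY) <= Sqr(Radius);
end;

var
  I: Integer;
begin
  for I := Low(Pubs) to High(Pubs) do
    if WithinRadius(Pubs[I], 0, 0, 2 * MetresPerMile) then
      Writeln(Pubs[I].Name);
end.
```

With this sample data The Red Lion and The Crown are listed, while The Old Ship (about 3.6 km away) is not. A real GIS would run the same test against an indexed data set rather than looping over every feature.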

Raster examples:

  • Pictures of landscapes, from space or from aircraft
  • Pictures of clouds
  • Infrared pictures showing heat usage

Vector examples:

  • Countries, Regions, Postal boundaries
  • Streets, Motorways, Garages
  • Wind speeds, wave heights

Organisations

OpenGIS

The main organisation dealing with GIS standards appears to be the “Open Geospatial Consortium”, a group of 300 organisations worldwide. It has prepared various specifications for open standards:

  • GML: A geographical data format based on XML
  • Simple Features (SFSQL): A standard way to store data within standard SQL Databases
  • Various web interfaces for providing both vector and raster information in agreed formats (GML)

Companies

The leaders in the industry for GIS software seem to be ESRI, MapInfo and Intergraph. The two largest data providers are NAVTEQ, used by Google Maps, and Tele Atlas.

Representation

An important thing to understand in GIS is that there is no universal ‘best’ representation for coordinates on Earth. The Earth is not an exact sphere, and it’s not flat, so various different schemes have been invented. Which one is used is, as ever, dependent on what you’re trying to achieve.

Longitude and Latitude

The standard way of representing points on the Earth is, of course, longitude and latitude. Lines of longitude run from the North Pole to the South Pole, with 0 degrees going through Greenwich, London. Lines of latitude run around the ‘waist’ of the Earth, with 0 degrees at the equator and +90 degrees at the North Pole.

LongLat

Longitude and Latitude are measured against a known datum. This is the reference model used, i.e. the mathematical figures used for calculating the best fit shape of the Earth. The most common currently in use is the WGS84 datum (WGS stands for ‘World Geodetic System’). It is possible to convert from one datum to another, but this can require some quite complex mathematics.
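As a small illustration of working with long/lat figures, here is a sketch of the haversine formula for the distance between two points. Note the assumption: it treats the Earth as a perfect sphere with a mean radius of 6371 km, which is simpler than (and slightly less accurate than) a calculation on the WGS84 ellipsoid. The function and constant names are my own.

```pascal
program GreatCircle;
{$APPTYPE CONSOLE}

uses
  Math;

const
  EarthRadiusKm = 6371.0; // mean radius of a spherical Earth (an assumption)

// Great-circle distance in kilometres between two points given in degrees.
function HaversineKm(Lat1, Lon1, Lat2, Lon2: Double): Double;
var
  DLat, DLon, A: Double;
begin
  DLat := DegToRad(Lat2 - Lat1);
  DLon := DegToRad(Lon2 - Lon1);
  A := Sqr(Sin(DLat / 2)) +
       Cos(DegToRad(Lat1)) * Cos(DegToRad(Lat2)) * Sqr(Sin(DLon / 2));
  // Min guards against rounding pushing the value just above 1.
  Result := 2 * EarthRadiusKm * ArcSin(Min(1.0, Sqrt(A)));
end;

begin
  // Roughly Greenwich to roughly the centre of Paris.
  Writeln(HaversineKm(51.48, 0.0, 48.86, 2.35):0:1, ' km');
end.
```

For short distances within one country the spherical error is usually small, but anything needing survey-grade accuracy should use the datum’s ellipsoid instead.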

GPS

The Global Positioning System is where a lot of GIS data is coming from. It is a set of satellites orbiting the Earth, transmitting signals that allow a receiver to calculate its position very accurately. GPS positions are long/lat figures based on the WGS84 datum.

Grid References

Rather than capture coordinates as spherical coordinates (like long/lat), coordinates can be stored as a reference on a grid. As the Earth is obviously not flat, this is an approximation, but it can be easier to deal with when mapping smaller areas. The UK uses a grid reference system for its ‘Ordnance Survey’ (OS) maps. In this coordinate system, points are measured in eastings and northings. The grid uses a different datum, OSGB36, which is based on a slightly different ellipsoid representation of the Earth. A ‘Helmert’ datum transformation can be used to convert to WGS84, but with an error of approximately 7m (depending on location). The image below shows where the eastings and northings are measured from. The letters in the grid allow another way of specifying location, called National Grid References (NGRs), with further digits specifying easting and northing offsets within each 100km grid square. More digits give more accuracy, i.e. specify a smaller area. An example of an NGR is ‘NN 166 712’, which identifies a 100m square in Scotland. The good thing about NGRs is that they are a useful shorthand for giving an approximate position, or for capturing an ‘area’.

OS GridRef
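The letters-plus-digits scheme is easy to unpick in code. The sketch below converts an NGR into full eastings and northings in metres, using the usual layout of the lettered squares (a 5 x 5 arrangement of letters with ‘I’ skipped). The function names are my own, and the result is the south-west corner of the square the reference names.

```pascal
program ParseNGR;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Position of a grid letter in the sequence A..Z with 'I' left out (0..24).
function LetterIndex(C: Char): Integer;
begin
  Result := Ord(C) - Ord('A');
  if Result > 7 then
    Dec(Result);
end;

// Converts e.g. 'NN 166 712' to eastings/northings in metres.
// Returns False for malformed references or squares outside the GB grid.
function NGRToEastNorth(const NGR: string;
  out Easting, Northing: Integer): Boolean;
var
  S, Digits: string;
  L1, L2, E100, N100, Half, Scale, I: Integer;
begin
  Result := False;
  Easting := 0;
  Northing := 0;
  S := UpperCase(StringReplace(NGR, ' ', '', [rfReplaceAll]));
  if Length(S) < 2 then Exit;
  if not (S[1] in ['A'..'H', 'J'..'Z']) then Exit;
  if not (S[2] in ['A'..'H', 'J'..'Z']) then Exit;
  Digits := Copy(S, 3, MaxInt);
  if Odd(Length(Digits)) or (Length(Digits) > 10) then Exit;
  for I := 1 to Length(Digits) do
    if not (Digits[I] in ['0'..'9']) then Exit;

  // First letter picks a 500km square, second a 100km square inside it.
  L1 := LetterIndex(S[1]);
  L2 := LetterIndex(S[2]);
  E100 := ((L1 - 2) mod 5) * 5 + (L2 mod 5);
  N100 := 19 - (L1 div 5) * 5 - (L2 div 5);
  // Only the 7 x 13 block of 100km squares covering Great Britain is valid.
  if (E100 < 0) or (E100 > 6) or (N100 < 0) or (N100 > 12) then Exit;

  // Fewer digits means a coarser square, so scale them up to metres.
  Half := Length(Digits) div 2;
  Scale := 1;
  for I := Half + 1 to 5 do
    Scale := Scale * 10;
  Easting := E100 * 100000 + StrToIntDef(Copy(Digits, 1, Half), 0) * Scale;
  Northing := N100 * 100000 +
    StrToIntDef(Copy(Digits, Half + 1, Half), 0) * Scale;
  Result := True;
end;

var
  E, N: Integer;
begin
  if NGRToEastNorth('NN 166 712', E, N) then
    Writeln(E, ' ', N)
  else
    Writeln('Invalid grid reference');
end.
```

For ‘NN 166 712’ this prints 216600 771200. Converting those OSGB36 grid coordinates on to WGS84 long/lat is a separate (and much longer) calculation that is not attempted here.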

The UK government can provide digital maps of the country in GML, currently with 440 million distinct features taking up 660 gigabytes of storage. The data is called OS MasterMap.

Formats

GML and SVG

GML and SVG are both XML-based standards that are used within GIS. GML stands for Geography Markup Language, and is a format designed to hold geographical features, geometry (vector), raster information, as well as units, coordinate representation systems, and map styles.

SVG is actually just a generic format for holding vector graphic information, and stands for Scalable Vector Graphics. It’s like an open standard version of Adobe’s Flash files, and supports both raster and vector information of any kind, not just geographic info. The customer we are doing the GIS work for currently uses SVG to display GIS data inside a web browser, using Adobe’s free SVG plugin for Internet Explorer (note: Firefox now supports a subset of SVG natively). SVG has well-defined JavaScript bindings, so it’s easy to create interactive displays.

To me GML seems a better storage format and SVG a better display format.
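As a rough illustration of the “store in one format, display in another” idea, here is a sketch that turns a stored polygon into an SVG element for display. The record type, function name and sample coordinates are my own inventions. The only real subtlety is that map northings increase upwards while SVG’s Y axis increases down the screen.

```pascal
program PolygonToSvgDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  TMapPoint = record
    X, Y: Integer; // map units, with Y increasing northwards
  end;

const
  MapSize = 100;
  // Made-up outline, purely for illustration.
  Lake: array[0..3] of TMapPoint = (
    (X: 10; Y: 10), (X: 90; Y: 20), (X: 80; Y: 60), (X: 20; Y: 50));

// Renders the outline as an SVG <polygon>, flipping Y for the screen.
function PolygonToSvg(const Pts: array of TMapPoint;
  MapHeight: Integer): string;
var
  I: Integer;
begin
  Result := '';
  for I := Low(Pts) to High(Pts) do
  begin
    if I > Low(Pts) then
      Result := Result + ' ';
    Result := Result + IntToStr(Pts[I].X) + ',' +
      IntToStr(MapHeight - Pts[I].Y);
  end;
  Result := '<polygon points="' + Result +
    '" fill="none" stroke="blue"/>';
end;

begin
  Writeln('<svg xmlns="http://www.w3.org/2000/svg" width="', MapSize,
    '" height="', MapSize, '">');
  Writeln('  ', PolygonToSvg(Lake, MapSize));
  Writeln('</svg>');
end.
```

Redirecting the output to a .svg file gives something a browser with SVG support can display. A real pipeline would read the geometry from GML or a database rather than a constant.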

Other

There are various file formats in use within the industry. Although the OpenGIS consortium is pushing GML, most data seems to be available in either ESRI’s shapefile format or MapInfo’s standard file format. Both are proprietary but easy-to-parse formats that most tools support.

Databases

There seems to be a big push towards moving ‘spatial’ capabilities into databases at the moment. This makes sense, as geometry is really just another attribute of an entity. Moving the geometry into database tables allows this data to be kept alongside other attributes, and allows the database to carry out geospatial queries.

The market leader in geospatial databases is Oracle. All versions of their database support what they call Locator features. These allow the storage of complex vectors in tables, as well as support for complex queries (e.g. find me all objects that lie within the boundary of this polygon, or find me the nearest 5 customers to our store). The top-end product also has Spatial features that allow very complex geometric calculations to be done. The nice thing is that the free Oracle Express (10g at the moment) supports the Locator features. IBM’s DB2 seems to have some sort of spatial extender, but I couldn’t find any information on the web about it.
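To give a feel for what a “within the boundary of this polygon” query has to evaluate for each candidate row, here is a sketch of the classic ray-casting point-in-polygon test. This is the textbook technique, not a description of how Oracle implements it, and the type and function names are my own.

```pascal
program PointInPolygonDemo;
{$APPTYPE CONSOLE}

type
  TPt = record
    X, Y: Double;
  end;

const
  // A simple 10 x 10 square used as the query boundary.
  Boundary: array[0..3] of TPt = (
    (X: 0; Y: 0), (X: 10; Y: 0), (X: 10; Y: 10), (X: 0; Y: 10));

// Counts how many polygon edges a horizontal ray from P crosses;
// an odd number of crossings means P is inside.
function PointInPolygon(const P: TPt; const Poly: array of TPt): Boolean;
var
  I, J: Integer;
begin
  Result := False;
  J := High(Poly);
  for I := Low(Poly) to High(Poly) do
  begin
    // The first test ensures the edge spans P.Y, so the division is safe
    // (relies on Delphi's default short-circuit evaluation, {$B-}).
    if ((Poly[I].Y > P.Y) <> (Poly[J].Y > P.Y)) and
       (P.X < (Poly[J].X - Poly[I].X) * (P.Y - Poly[I].Y) /
              (Poly[J].Y - Poly[I].Y) + Poly[I].X) then
      Result := not Result;
    J := I;
  end;
end;

var
  Inside, Outside: TPt;
begin
  Inside.X := 5;   Inside.Y := 5;
  Outside.X := 15; Outside.Y := 5;
  Writeln(PointInPolygon(Inside, Boundary));  // point inside the square
  Writeln(PointInPolygon(Outside, Boundary)); // point to the east of it
end.
```

The first call reports true and the second false. The value of doing this in the database is that a spatial index can rule out most rows before a test like this ever runs.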

Some of the GIS software companies provide software to help with database storage, like ESRI’s ArcSDE, but these are just ‘layers’ that sit between software and the database, and are not open formats, tying you into using a particular vendor.

Two of the open source database engines have recently added spatial features. PostgreSQL has the add-on PostGIS, which supports the SFSQL specification mentioned above (as does Oracle). MySQL also has spatial extensions based on the SFSQL spec, but the current version (5.0) seems only to support planar (flat) and not spherical coordinate systems. The mathematics used for distances, areas, etc. is different for planar and spherical representations.

It was hoped that Microsoft SQL Server would add some spatial features to the new 2005 release, but this was not the case. There is however an interesting MSDN article showing how spatial features can be added to SQL Server, but this does not support open standards (surprise surprise).

Spatial Indexes

When reading about GIS formats, one of the key issues is how the formats are ‘indexed’ to make retrieval efficient. Oracle uses Helical Hyperspatial Codes (HHCodes), a data structure developed by the Canadian Hydrographic Service. Despite the fancy name, it seems to be a binary tree, dividing each dimension in turn. It’s designed to support as many dimensions as desired, to an arbitrary level of detail. The MSDN article discusses the use of Hierarchical Triangular Meshes, a technique for covering a globe with a mesh of subdivided triangles. This seems to work well for spherical mapping, but does not seem as general as HHCodes. PostGIS and MySQL seem to use extensions of standard database indexes, rather than specific spatial indexing.
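The “divide each dimension in turn” idea can be sketched in a few lines. To be clear about the assumption: this is a generic interleaved-bisection key (a Z-order or Morton-style code), which is my reading of the general principle, not Oracle’s actual HHCode format. The function name and sample points are my own.

```pascal
program BisectionKey;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Builds a sortable key by halving the longitude range, then the latitude
// range, over and over, appending one bit per split (1 = upper half).
// Levels must be 16 or less so the 2 * Levels bits fit in a Cardinal.
function SpatialKey(Lon, Lat: Double; Levels: Integer): Cardinal;
var
  LonMin, LonMax, LatMin, LatMax, Mid: Double;
  I: Integer;
begin
  Result := 0;
  LonMin := -180; LonMax := 180;
  LatMin := -90;  LatMax := 90;
  for I := 1 to Levels do
  begin
    Mid := (LonMin + LonMax) / 2;
    Result := Result shl 1;
    if Lon >= Mid then
    begin
      Result := Result or 1;
      LonMin := Mid;
    end
    else
      LonMax := Mid;

    Mid := (LatMin + LatMax) / 2;
    Result := Result shl 1;
    if Lat >= Mid then
    begin
      Result := Result or 1;
      LatMin := Mid;
    end
    else
      LatMax := Mid;
  end;
end;

begin
  // Two points in London and one in Sydney: the London keys share a long
  // run of leading bits, the Sydney key differs from the very first bit.
  Writeln(IntToHex(SpatialKey(-0.1276, 51.5072, 16), 8));
  Writeln(IntToHex(SpatialKey(-0.0015, 51.4779, 16), 8));
  Writeln(IntToHex(SpatialKey(151.2093, -33.8688, 16), 8));
end.
```

Because the key is just a number, it can be stored in an ordinary sorted index, and points that share a cell at some level share a key prefix. The catch is that some points that are close on the ground fall either side of a split and end up with quite different keys.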

Summary

The GIS industry is becoming more and more important, and as developers I can see us having to deal with spatial information more and more. If I get time, I’ll try to write an article on how to access SFSQL spatial data from within Delphi. For more information see Wikipedia’s entry.

Comments (originally on blogger.com)

Paul Ramsey said…

PostGIS and MySQL use R-Tree indexes, which are rather specifically for spatial data (well, any data type which can be decomposed to a range rectangle). So does Oracle Spatial, for that matter. Space-filling curves, which map multi-dimensional data to something sortable into a B-Tree, are ways of leveraging non-spatial database technology to be more useful for spatial data. Space-filling techniques tend not to be as balanced as R-Trees, however. The hierarchical grid is the final trick, and can be quite performant, but needs to be carefully tuned to the characteristics of the data, unlike R-Trees, which automagically work for data of uneven scales.


Class RTTI

Posted by David Glassborow Mon, 22 May 2006 15:09:00 GMT

This post follows up my previous one about RTTI in Delphi, inspired by Hallvard’s 2 posts here and here, and covers some advanced RTTI features in Delphi that I haven’t seen mentioned anywhere else.

$METHODINFO

While playing around with Websnap in Delphi, trying to extend some of the objects available for scripting, I came across the compiler directive $METHODINFO.

The online documentation says:

The $METHODINFO switch directive is only effective when runtime type information (RTTI) has been turned on with the {$TYPEINFO ON} switch. In the {$TYPEINFO ON} state, the $METHODINFO directive controls the generation of more detailed method descriptors in the RTTI for methods in an interface. Though {$TYPEINFO ON} will cause some RTTI to be generated for published methods, the level of information is limited. The $METHODINFO directive generates much more detailed (and much larger) RTTI for methods, which describes how the parameters of the method should be passed on the stack and/or in registers. There is seldom, if ever, any need for an application to directly use the $METHODINFO compiler switch. The method information adds considerable size to the executable file, and is not recommended for general use.

My previous article showed this isn’t completely accurate: detailed RTTI is available for any interface which has $TYPEINFO or $M around it. $METHODINFO seems to affect classes; in particular, it stores detailed RTTI not only for published methods, but also for public ones.

Searching for this compiler directive in the Delphi Win32 source code gives us only one instance, in WebSnapObjs.pas.

{$METHODINFO ON}
TScriptableObject = class(TObjectDispatch)
private
  FLookupList: TStringList;
  FLookupValues: TInterfaceList;
protected
  FPreferChild: Boolean;
  function DispatchOfName(const AName: string): IDispatch; virtual;
  function FindObject(const AName: string): TObject; virtual;
public
  constructor Create;
  destructor Destroy; override;
  class function DispatchOfObject(const AObject: TObject): IDispatch;
  function GetIDsOfNames(const IID: TGUID; Names: Pointer;
    NameCount: Integer; LocaleID: Integer; DispIDs: Pointer): HRESULT;
    override;
  function Invoke(DispID: Integer; const IID: TGUID; LocaleID: Integer;
    Flags: Word; var Params; VarResult: Pointer; ExcepInfo: Pointer;
    ArgErr: Pointer): HRESULT; override;
end;
{$METHODINFO OFF}

Websnap

Websnap is the poor cousin in the web framework world for Delphi. It’s never had much support, and now seems to be overshadowed by ASP.NET and IntraWeb. I personally quite like it, although I code my own templates in VBScript or JavaScript rather than use any of the design-time web page design stuff.

Under the hood, Websnap uses the ActiveScript engines provided in Windows. ActiveScript is a scripting host that can support many different COM-based scripting languages, and Windows comes with VBScript and JScript (which is basically JavaScript). Other ActiveScript languages are available, including Python and Perl.

The original ASP by Microsoft uses the ActiveScripting engine to do its work. The ASP template is turned into a VBScript or JScript program containing the HTML to output as well as the logic of the page. This is fed into the ActiveScripting engine and compiled ready for running. The ActiveScripting engine then has ‘objects’ added to it so the program can do useful work. The most obvious one is the Response object, but there are others like the Session object, etc. The program is then run and the page rendered.

Websnap pages, at least those using a TPageProducer, use this same process to produce HTML pages. The problem for the Delphi designers was how to link arbitrary Delphi objects up to the ActiveScripting engine, which uses late-bound IDispatch COM for communication. The IDispatch interface, one of the main underpinnings of the COM framework in Windows, uses a single call, Invoke, for all method calls. This is where $METHODINFO comes in: the rich method RTTI is provided to allow a single procedure entry, Invoke, to call arbitrary Delphi methods.

The VBScript or JavaScript script running in the scripting engine of the Websnap page needs to talk to Delphi objects (e.g. Page, Session), and it uses this rich RTTI to achieve this. To see the Websnap objects that are exposed to the script, have a look in WebSnapObjs.pas, where TResponseObj, TProducerObj, etc. are defined.

The unit ObjAuto contains the code and header for retrieving the RTTI information using the following function:

function GetMethodInfo(Instance: TObject; const MethodName: ShortString): PMethodInfoHeader;

In turn, the base class of TScriptableObject (marked with $METHODINFO) uses the RTTI to find methods, and call them, at run time.

ObjAuto.pas

This contains the code to search for a method’s RTTI. Looking at GetMethodInfo, you can see it uses the system.pas vmtMethodTable offset to get hold of the method table for the class. It then uses a search to find the correct entry. It also contains the code that allows an arbitrary call to an object supporting RTTI to jump to the correct routine:

function ObjectInvoke(Instance: TObject; MethodHeader: PMethodInfoHeader;
  const ParamIndexes: array of Integer; const Params: array of Variant): Variant;

As you can see, you just pass it the parameters as variants, and it packages them into the correct types and does the call. The source code for this call shows all the complexity of packaging up the parameters according to the different calling conventions, etc. This is ultimately how VBScript objects call methods on Delphi objects inside Websnap.
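Putting the two routines together, a late-bound call might look something like the sketch below. The class TCalc and its method are hypothetical, made up for this example. I’m assuming that GetMethodInfo returns nil when no RTTI is found, and that an empty ParamIndexes array means “no named parameters”. A single-parameter method is used deliberately, so the example doesn’t depend on the order in which ObjectInvoke expects its arguments. Treat it as a starting point to check against ObjAuto.pas in your Delphi version.

```pascal
program LateBoundCall;
{$APPTYPE CONSOLE}

uses
  SysUtils, Variants, ObjAuto;

type
  {$M+}
  {$METHODINFO ON}
  // Hypothetical class; public methods get rich RTTI because of $METHODINFO.
  TCalc = class
  public
    function Twice(Value: Integer): Integer;
  end;
  {$METHODINFO OFF}
  {$M-}

function TCalc.Twice(Value: Integer): Integer;
begin
  Result := Value * 2;
end;

var
  Calc: TCalc;
  Info: PMethodInfoHeader;
  Answer: Variant;
begin
  Calc := TCalc.Create;
  try
    // Look the method up by name at run time.
    Info := GetMethodInfo(Calc, 'Twice');
    if Info = nil then
      Writeln('No method RTTI found for Twice')
    else
    begin
      // No named parameters, one positional parameter passed as a Variant.
      Answer := ObjectInvoke(Calc, Info, [], [21]);
      Writeln(VarToStr(Answer));
    end;
  finally
    Calc.Free;
  end;
end.
```

Nothing in the calling code refers to TCalc.Twice at compile time other than by its name as a string, which is exactly the property a scripting host needs.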

DetailedRTTI.pas

While playing with the metadata, I coded a few helper classes to aid exploration. You can download the code if you want to have somewhere to start. Just call .RTTIMethodsAsString() on any object to get a list of its methods and their parameters. It’s a bit rough and ready, but you’re welcome to use it for whatever.

Summary

This article and the previous one have shown that rich metadata for methods is available in Delphi, with supporting routines for accessing it. Interface metadata allows the VCL to support SOAP, with multiple methods multiplexed to a single call. The rich class metadata allows the VCL to route a single function automatically to other methods, allowing Websnap to expose objects to COM IDispatch automatically.

Comments (originally on blogger.com)

Hallvard Vassbotn said…

Great posts, David!

I reference them here


Interface RTTI

Posted by David Glassborow Thu, 11 May 2006 15:07:00 GMT

Reading an article and its follow-up by Hallvard about RTTI inspired me to put together a couple of posts about two related areas of RTTI in Delphi. In particular, one of the comments on Hallvard’s blog asked about using this RTTI to call objects in some late-bound fashion. This post and the next cover some of the advanced RTTI that I haven’t seen covered in other places. This post covers some of the possibilities for interface metadata, and the next one will contain details about richer class RTTI for methods.

Interface Metadata

Delphi actually has richer metadata support for methods in an interface than in a normal class. It looks like this was added to support the SOAP features of the VCL. I’m not sure which version of Delphi it was added in, so your mileage may vary if you’re not using 2006.

IInvokable

To use SOAP, you use a WSDL file to specify the method calls, parameters, etc. If you import a WSDL file in Delphi, you will notice that all interfaces in the generated file are derived from IInvokable. A quick peek in the System unit will show that IInvokable is:

{$M+}
  IInvokable = interface(IInterface)
  end;
{$M-}

I.e. just a standard interface, but with RTTI metadata compiled in.

The help in BDS 2006 for {$TYPEINFO ON} mentions this:

Note: The IInvokable interface defined in the System unit is declared in the {$M+} state, so any interface derived from IInvokable will have RTTI generated. The routines in the IntfInfo unit can be used to retrieved the RTTI.

IntfInfo.pas

The main procedure of interest in IntfInfo is:

procedure GetIntfMetaData(Info: PTypeInfo; var IntfMD: TIntfMetaData; IncludeAllAncMethods: Boolean = False);

This will give us a series of records describing the methods on the interface and the parameters needed for these methods, as well as the unit it was defined within, the ancestor interface and the interface’s GUID. All the names are available, both of the functions / procedures and of their parameters. Calling this procedure with an interface that has no RTTI will raise an exception; calling it with a class’s typeinfo will just cause an a/v :-)
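As a starting point for exploring, something like the sketch below should list the methods of an invokable interface. ICalculator and its GUID are made up for the example. The record field names used here (MDA, Name, Params, UnitName) are as I remember them from IntfInfo.pas, so check them against the unit in your version of Delphi.

```pascal
program ListInterfaceMethods;
{$APPTYPE CONSOLE}

uses
  SysUtils, TypInfo, IntfInfo;

type
  // Hypothetical interface; deriving from IInvokable switches on its RTTI.
  ICalculator = interface(IInvokable)
    ['{8F3A1B2C-4D5E-4F60-9A7B-1C2D3E4F5A6B}']
    function Add(A, B: Integer): Integer; stdcall;
    procedure Reset; stdcall;
  end;

var
  MD: TIntfMetaData;
  I, J: Integer;
begin
  // Third parameter left at its default, so ancestor methods are skipped.
  GetIntfMetaData(TypeInfo(ICalculator), MD);
  Writeln(MD.UnitName, '.', MD.Name, ' ', GUIDToString(MD.IID));
  for I := 0 to Length(MD.MDA) - 1 do
  begin
    Writeln('  method: ', MD.MDA[I].Name);
    // Print whatever parameter entries the metadata holds for this method.
    for J := 0 to Length(MD.MDA[I].Params) - 1 do
      Writeln('    param: ', MD.MDA[I].Params[J].Name);
  end;
end.
```

Only names are printed here. Each entry also carries type information, which is what makes it possible to package the arguments up generically.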

When doing SOAP calls, the developer just uses the defined interface like a normal interface. Behind the scenes, Delphi packages up the parameters and sends them via a SOAP envelope to the remote server. How Delphi does this shows us some of the potential of this RTTI in Delphi, and respect for the Voodoo that is TRIO.

RIO.pas

Located in the soap folder of Delphi’s source code, RIO.pas contains the class TRIO. TRIO is an object that represents a remote object; presumably it stands for Remote Interfaced Object.

When an application casts a TRIO descendant to a registered invokable interface, it dynamically generates an in-memory method table, providing an implementation to that invokable interface.

Looking at the source for TRIO, I’ve come to the conclusion that:

MyRioObject as IMyInvokableInterface

Will cause the TRIO object to:

  1. Get the metadata for IMyInvokableInterface (from the InvRegistry registry object defined in InvokeRegistry.pas)
  2. Allocate memory for a vtable for the interface
  3. Allocate memory for ‘stub’ routines, and mark it as containing executable code
  4. Write machine code stubs that take the parameters and package them up, then call TRIO.Generic

This is a very crude representation I knocked up in Visio:

image

When you then make a call on the ‘generated’ interface, Delphi calls through the vtable, and the vtable holds the address of the generated machine code. The generated machine code pushes the parameters, then calls the Generic function. This packages up the parameters, and then uses a SOAP call to call the remote service. The return value is then packaged up and returned in a similar way, back through the generated stub. If you are interested in how the actual machine code is generated (taking into account the 5 different calling conventions, etc.), take a look at the TRIO.GenVTable function.

I don’t know which of the Delphi team wrote this code, but it’s very, very impressive.

Anyway, I hope this has given you a feel for some of the advanced metadata available with interfaces. The RIO approach would allow you to write interface proxies for any interface with metadata, for security, logging and indeed other forms of RPC remoting. Let me know if anybody succeeds in such a thing!
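To show the shape of what those generated stubs do, here is a hand-written equivalent for a single method. It is only an analogy: TRIO builds its stubs at run time from the metadata, whereas this proxy is written by hand for one hypothetical interface (ICalculator), and its Generic routine just logs the call and computes the answer locally instead of making a SOAP request.

```pascal
program HandWrittenProxy;
{$APPTYPE CONSOLE}

uses
  SysUtils, Variants;

type
  // Hypothetical invokable interface, made up for this example.
  ICalculator = interface(IInvokable)
    ['{6B1F0C52-3A7E-4D19-9C44-2E8A5D7F1B30}']
    function Add(A, B: Integer): Integer;
  end;

  TLoggingProxy = class(TInterfacedObject, ICalculator)
  private
    // Single funnel for every call, in the spirit of TRIO.Generic.
    function Generic(const MethodName: string;
      const Args: array of Variant): Variant;
  public
    // The 'stub': packages its parameters and hands them to Generic.
    function Add(A, B: Integer): Integer;
  end;

function TLoggingProxy.Generic(const MethodName: string;
  const Args: array of Variant): Variant;
var
  I, Sum: Integer;
begin
  Write('call ', MethodName, '(');
  for I := Low(Args) to High(Args) do
  begin
    if I > Low(Args) then
      Write(', ');
    Write(VarToStr(Args[I]));
  end;
  Writeln(')');
  // A real proxy would forward the call somewhere (SOAP, another RPC
  // mechanism, a wrapped object). Summing locally keeps the sketch runnable.
  Sum := 0;
  for I := Low(Args) to High(Args) do
    Sum := Sum + Args[I];
  Result := Sum;
end;

function TLoggingProxy.Add(A, B: Integer): Integer;
begin
  Result := Generic('Add', [A, B]);
end;

var
  Calc: ICalculator;
begin
  Calc := TLoggingProxy.Create;
  Writeln(Calc.Add(2, 3));
end.
```

The caller only ever sees ICalculator. What makes TRIO clever is that it produces the equivalent of the Add stub for every method of any registered interface, without anyone writing it.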

My followup article on class RTTI.
