Archive for GSoC09

GSoC 09 Final Report: 07/08 – 17/08

Work done

This last week, I fixed the most recently found errors in the existing code. I had planned to implement more pending features but, after talking with my mentors, I decided to review, fix and test the existing code instead. So, after manually fixing some easy-to-find bugs, I exported the GDAL autotest code from https://svn.osgeo.org/gdal/trunk/autotest and wrote my own test script in Python.

With this script, I successfully tested:

  • Get GDAL driver by name
  • Try to open a dataset using wrong connection strings
  • Compare, pixel by pixel, 2 WKT Rasters (1 band and 3 bands) with the original files
  • Open 2 WKT Rasters (1 band and 3 bands) and compare projection references and geotransform arrays with the original files

There are 9 tests. The necessary data for the tests has been added to the gdrivers/data directory. You’ll need:

I also successfully used GDAL tools like gdalinfo, gdal_translate and gdalwarp to test the code. I tried to use gdal2tiles as well, but its code needs a few modifications to accept a connection string as a valid dataset. Oh, and I improved my version of the gdal2wktraster script to add outdb support. I’ve updated the patch in the official OSGeo ticket, so you can download and test this new version. It still needs improvements, I’m sure.

There are two features that I couldn’t test, for different reasons, and these will be two of my first tasks after GSoC. These features are:

  • Outdb support: After modifying and testing the gdal2wktraster script to add outdb support, I noticed a bug in my code: I was trying to read the outdb raster data from the IReadBlock method. But the outdb raster doesn’t necessarily have the same block arrangement as the indb raster. So, the block offsets given to IReadBlock may not be valid for the outdb raster, and normally they aren’t. I noticed this error too late, so I preferred to add a comment and continue developing this feature after GSoC. In addition, you have to hack the WKT Raster code to allow outdb raster support at the WKT Raster extension level, because this feature is still under development. I chose the short path: commenting out the outdb check in the rt_pg/rtpostgis.sql file, lines 532 – 535.
  • Endianness: I wrote the code for adding raster data to a test table with a different endianness from my machine’s (a little-endian machine). But when I tried to execute the “INSERT” statement, the data was automatically converted to the correct endianness by the WKT Raster extension. I tried to hack the code, but I failed this time…
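The outdb block mismatch can be illustrated with a small sketch. This is not the driver's code: the function names and the block sizes (100x100 for the indb raster, 256x256 for the outdb file) are assumptions chosen for illustration. The idea is to turn the block offset received by IReadBlock into a pixel window first, and only then work out which blocks of the other arrangement cover that window.

```python
def block_window(x_off, y_off, block_w, block_h, raster_w, raster_h):
    """Pixel window (x, y, width, height) of a block, clipped to the raster size."""
    x = x_off * block_w
    y = y_off * block_h
    width = min(block_w, raster_w - x)
    height = min(block_h, raster_h - y)
    return x, y, width, height


def overlapping_blocks(window, block_w, block_h):
    """Offsets (col, row) of the blocks of another arrangement that touch the window."""
    x, y, width, height = window
    first_col, last_col = x // block_w, (x + width - 1) // block_w
    first_row, last_row = y // block_h, (y + height - 1) // block_h
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# Hypothetical case: indb blocks of 100x100 pixels, outdb file tiled as 256x256.
window = block_window(2, 1, 100, 100, 512, 512)
outdb_blocks = overlapping_blocks(window, 256, 256)
```

In this example the indb block (2, 1) does not correspond to a single outdb block: its pixel window straddles two of them, which is why the same offsets cannot simply be reused.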

So, I have some untested features that are finished or partially finished (in the case of outdb support). I also finished other features, like setting raster data, or setting the projection, geotransform array and nodata values. These features will be used for creating new WKT Rasters from the GDAL driver.

Anyway, if you want to see the complete list of TODO tasks (always under revision), check the project page. I think the driver, in its current state, is a stable prototype GDAL driver, and it will be a complete one by the release of GDAL 1.7.0.

Here, you can check out the GDAL code, with the WKT Raster prototype code.

Here, you can check out the GDAL autotest code, which includes the testing code for GDAL WKT Raster.

GSoC 09 conclusions

I think GSoC 09 has been one of the greatest experiences of my student life. For the first time, I feel I’ve developed a really useful tool, within the framework of an excellent piece of software like GDAL, an essential part of the Open Source GIS world. I was a GDAL user, and now, with the permission of the GDAL team, I feel like a newbie GDAL developer. An incredible feeling :-)

During this summer of code, I’ve acquired more programming skills than in the last year. I’ve learned:

  • How to do some things in a smarter way, without “reinventing the wheel”.
  • How to organize a big software project using C/C++.
  • How to write good makefiles.
  • How to implement a driver, meaning how to take advantage of basic OO features like inheritance, information hiding, polymorphism, data abstraction, modularity and encapsulation. The GDAL driver architecture is an excellent way to see these concepts at work.
  • How to ask questions. When you post a message to a developer list, it’s important to provide a thorough description of the error, an example output, the versions of the libraries you’re using, the configuration of your working environment… Obvious? Of course, but you need to be in the situation to really understand how to ask for help.
  • Related to the previous point, how to feel totally stupid when you find the error just after clicking on the “Send” button. A unique feeling :-)
  • How to organize myself. Yes, it may sound stupid, but you don’t know how organized you are until you have a chance like this.
  • How to sleep less than 4 hours and write (working) code :-) . This leads us to an eternal discussion: tea or coffee? Tea works better for me…

The future…

In the future, I plan to finish the driver code, of course. I’m now personally committed to it. I should finish the driver’s code in time for the release of GDAL v1.7.0, and I’m going to work towards that. As I said, you can check the todo tasks and the planned schedule (always under revision) on the project page.

Last, but not least, I’d like to thank my mentors, and all the members of the GDAL and PostGIS development teams, for their patience in answering all my questions. Of course, thanks to OSGeo for giving me this chance to learn and to help the Open Source community.

Oh, and this is for Wolf Bergenheim: I think I escaped punishment of the Knights of Ni!, the keepers of the sacred words: Ni, Peng, and Neee-Wom (“Those who hear them seldom live to tell the tale!”). Will I have my T-Shirt? :-) :-) :-)


GSoC 09 Weekly Report #11: 31/07 – 07/08

Work done

This week:

  • I’ve fixed a bug with overviews’ dimensions.
  • I’ve successfully tested the overviews with the dumpoverviews utility. I had to introduce small changes in its code to accept connections to PostgreSQL as input datasets. I created a patch for the original code, but it makes no sense to use it without the rest of the WKT Raster driver.
  • I’ve added support for outdb rasters. It still has to be tested.
  • I’ve changed the way the driver gets the data from the database. To minimize the round trips to the server, I’ve created a kind of “cache” system at the Dataset level. After opening the dataset (basically, establishing a database connection), the driver tries to fetch all the blocks covered by the whole raster’s extent. If that succeeds, it stores all the blocks as WKTRasterWrapper objects in an array in the Dataset. Then, the WKTRasterRasterBand::IReadBlock method gets the blocks from this array. If that fails, it executes a query to fetch the block from the database. I’m testing this cache system right now.
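The caching idea from the last point can be sketched roughly as follows. This is only an illustration in Python, not the C++ driver code: the class name, the two fetch callbacks and the in-memory stand-ins for the database queries are all hypothetical.

```python
class DatasetBlockCache:
    """Dataset-level cache keyed by (x_off, y_off) block offsets."""

    def __init__(self, fetch_all_blocks, fetch_block):
        self._fetch_block = fetch_block
        try:
            # One query for every block covering the raster's extent.
            self._blocks = dict(fetch_all_blocks())
        except Exception:
            # Bulk fetch failed: start empty and fall back to per-block queries.
            self._blocks = {}

    def read_block(self, x_off, y_off):
        """Return the block from the cache, querying for it only on a miss."""
        key = (x_off, y_off)
        if key not in self._blocks:
            self._blocks[key] = self._fetch_block(x_off, y_off)
        return self._blocks[key]


# Hypothetical in-memory stand-ins for the database queries.
queries = []


def fetch_all_blocks():
    queries.append("all")
    return [((0, 0), b"block-00"), ((1, 0), b"block-10")]


def fetch_block(x_off, y_off):
    queries.append((x_off, y_off))
    return b"fetched"


cache = DatasetBlockCache(fetch_all_blocks, fetch_block)
first = cache.read_block(1, 0)    # served from the preloaded blocks
missing = cache.read_block(5, 5)  # not preloaded: one extra query
```

The point of the design is that most reads cost no round trip at all; only blocks that were not part of the initial fetch trigger a query.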

Planned work for next week

Next week is the last one. I’ll finish testing this week’s tasks and I’ll write the final documentation and some testing scripts.

There are still todo tasks, of course. But I plan to continue with the project after GSoC, and my mentors think it’s a good idea.

Problems found

The main problem was deciding how to implement the cache system. One option was to use the cache at the RasterBand level (GDALBlocks), but one of my mentors suggested another idea, and I adopted it.


GSoC 09 Weekly Report #10: 24/07 – 31/07

Work done

This week I was out for 3 days due to an unexpected personal issue. So, I didn’t finish all my planned tasks. Now, I’m making up for those days. Anyway, the work done was:

  • Start the RASTER_COLUMNS table update with the information fetched from database.
  • Create a patch for the gdal2wktraster script and send it to the PostGIS Trac as a new ticket, instead of sending the modified script to the list. I’ve delayed the out-db raster support in GDAL because it is under development in WKT Raster, and I tried to help with this task.
  • Correct errors. I’ve introduced some changes, like erasing the overviews from the SUBDATASETS metadata domain, fixing some memory leaks due to sharing objects between Datasets and overviews, adding NBITS metadata information for each RasterBand, and some other small fixes.
  • Test overviews support with gdal_translate.
  • Start block caching

Thanks to people from the gdal and postgis lists (like Frank, Tamas, Mateusz, Even…), I resolved the two doubts I had last week:

DOUBT: What happens if the bounding box of a block offset in IReadBlock doesn’t match any real block? A raster with missing tiles?
RESPONSE: In this case, the buffer must be filled with nodata values.

DOUBT: Where should I fit the WKT Raster “16BF” datatype? In GDT_UInt16? In a single float?
RESPONSE: It fits in a 32-bit float. Anyway, this datatype is proposed for removal from WKT Raster: http://trac.osgeo.org/postgis/ticket/226
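The first answer (fill the buffer with nodata when no block matches) could look something like this sketch. The function name, the dictionary of blocks and the default 8-bit pixel format are assumptions made for illustration; the real driver works on the buffer passed to IReadBlock.

```python
import struct


def read_block_or_nodata(blocks, x_off, y_off, block_w, block_h, nodata, fmt="B"):
    """Return the stored block bytes, or a nodata-filled buffer if the tile is missing.

    `blocks` maps (x_off, y_off) to raw bytes; `fmt` is a struct type code
    ("B" = unsigned 8-bit, "f" = 32-bit float, and so on).
    """
    data = blocks.get((x_off, y_off))
    if data is not None:
        return data
    npixels = block_w * block_h
    # Little-endian buffer with every pixel set to the nodata value.
    return struct.pack("<%d%s" % (npixels, fmt), *([nodata] * npixels))


# Hypothetical raster with only one tile stored.
stored = {(0, 0): b"\x01\x02\x03\x04"}
present = read_block_or_nodata(stored, 0, 0, 2, 2, 255)
absent = read_block_or_nodata(stored, 3, 7, 2, 2, 255)
```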

UPDATE: I forgot to add the graphical proof of overviews running:

Here is the file utm.tif, used for testing (reduced to fit into the blog’s space). Look at the dimensions, in the bottom left corner.

utm

Here is the result after loading the image into PostgreSQL with gdal2wktraster, using a block size of 100×100px, and converting the image to TIFF format again using gdal_translate with an output size of 50%.

utm_ov2

Planned work for next week

This must be the week to finish the tasks related to regular_blocking support, and to start with non-regular blocking arrangements.

Problems found

One important problem was sharing a PGconn object between a dataset and its overviews (which are datasets too). The connection is released in the class destructor, but only the general dataset should do it, as the connection owner. Finally, I solved the situation thanks to suggestions from Even Rouault and Mateusz Loskot, by using a boolean variable to detect whether the object is the dataset or one of its overviews.
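The ownership flag can be sketched like this. It is a Python illustration of the pattern only; the class and attribute names are hypothetical, and in the real driver the release happens in the C++ destructor rather than in an explicit close method.

```python
class FakeConnection:
    """Stand-in for a PGconn; counts how many times it is released."""

    def __init__(self):
        self.close_calls = 0

    def close(self):
        self.close_calls += 1


class Dataset:
    def __init__(self, conn, owns_connection=True):
        self.conn = conn
        # Only the general dataset owns the connection; overviews borrow it.
        self.owns_connection = owns_connection
        self.overviews = []

    def add_overview(self):
        overview = Dataset(self.conn, owns_connection=False)
        self.overviews.append(overview)
        return overview

    def close(self):
        for overview in self.overviews:
            overview.close()
        if self.owns_connection:
            self.conn.close()


conn = FakeConnection()
main = Dataset(conn)
ov1 = main.add_overview()
ov2 = main.add_overview()
ov1.close()                          # borrowed connection: nothing released
calls_after_overview = conn.close_calls
main.close()                         # owner releases the connection once
```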


GSoC 09 Weekly Report #9: 17/07 – 24/07

Work done

This week:

  • I implemented access to overviews, adding the overview tables as subdatasets of the general dataset, and adding metadata info to the SUBDATASETS domain.
  • I implemented raster in-place update, by overriding the IWriteBlock method of RasterBand.
  • I reorganized some parts of the code by creating RasterWrapper and RasterBandWrapper classes. In these classes, I do all the tasks related to the raster format: parsing the hexwkb format, swapping words when needed, getting and setting raster attributes, etc. The WKTRasterWrapper takes the hexwkb string fetched from the database and fills in the raster properties by parsing it. The most useful thing, in my opinion, is that at any moment you can get the updated hexwkb representation of the raster again. The WKTRasterBandWrapper allows setting new data for the band.
  • I implemented a TODO task of the gdal2wktraster script: support for out-db rasters. My version of the script has been sent for review to the PostGIS devel list.

The last two tasks will help me implement support for out-db rasters in the driver.

Planned work for next week

  • Finish support for out-db rasters. I asked Mateusz about this option in the loader script, and he told me to implement it if I wanted. I need this option for adding out-db support in the driver.
  • Correct errors. This is a cross-cutting task. This week I found a couple of important errors in several parts of the code.
  • Update the RASTER_COLUMNS table with the information fetched in IReadBlock, if needed. This was an old doubt, and I think that’s the correct place, thanks to a comment from Frank.
  • Start with block caching. Currently, in each call to IReadBlock, I fetch the whole raster from the database.

Problems found

  • The spatial query seems to work, but I’m not sure if it is the most appropriate:
    • With GIST index:
      SELECT rast  FROM table WHERE rast ~ ST_SetSRID(ST_MakeBox2D(
      ST_Point(lowerLeftX, lowerLeftY) ,ST_Point(upperRightX, upperRightY)),
      srid);
    • Without GIST index:
      SELECT rast  FROM table WHERE _ST_Contains(rast, ST_SetSRID(ST_MakeBox2D(
      ST_Point(lowerLeftX, lowerLeftY) ,ST_Point(upperRightX, upperRightY)),
      srid));
  • When calculating a block’s bounding box based on a block offset and the block size in IReadBlock, I swapped two values (X and Y) and, of course, it failed for non-square blocks. I spent a couple of days on this error.
  • DOUBT: What happens if the bounding box of a block offset in IReadBlock doesn’t match any real block? A raster with missing tiles?
  • DOUBT: Where should I fit the WKT Raster “16BF” datatype? In GDT_UInt16? In a single float?
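The bounding box calculation that went wrong for non-square blocks can be sketched as below. This is a minimal Python illustration, assuming a north-up raster (no rotation terms) and a GDAL-style six-element geotransform; the function name and the sample values are made up for the example.

```python
def block_bbox(x_off, y_off, block_w, block_h, geotransform):
    """Bounding box (min_x, min_y, max_x, max_y) of a block in georeferenced units.

    Assumes a north-up raster, so the rotation terms are ignored. Width goes
    with X and height goes with Y; mixing them up only shows for non-square blocks.
    """
    origin_x, pixel_w, _, origin_y, _, pixel_h = geotransform
    left = origin_x + x_off * block_w * pixel_w
    top = origin_y + y_off * block_h * pixel_h
    right = left + block_w * pixel_w
    bottom = top + block_h * pixel_h
    return (min(left, right), min(top, bottom),
            max(left, right), max(top, bottom))


# Hypothetical raster: 10 m pixels, origin at (1000, 5000), blocks of 100x50 pixels.
gt = (1000.0, 10.0, 0.0, 5000.0, 0.0, -10.0)
bbox = block_bbox(1, 2, 100, 50, gt)
```

The four resulting values are the ones that would be plugged into lowerLeftX, lowerLeftY, upperRightX and upperRightY in the spatial queries above.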


GSoC 09 Weekly Report #8: 10/07 – 17/07

Work done

This week I finished extending the basic GDAL WKT Raster driver code, and continued working with overviews and out-db rasters.

More specifically:

  • I solved the “Invalid angle” problem by changing the srid of the data to 26711, the correct one.
  • I used a spatial query to fetch the correct block from the database.
  • I created my own PQ-style array parser, to extract each element as a string.
  • I allowed rasters with several bands and all pixel types, taking endianness into account.
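As an illustration of the array parsing mentioned above, here is a minimal sketch of a parser for PostgreSQL's text representation of a one-dimensional array (for example {a,b,"c d"}). It is written in Python rather than in the driver's C++, the function name is hypothetical, and nested arrays are deliberately not handled.

```python
def parse_pg_array(text):
    """Split a one-dimensional PostgreSQL array literal into a list of strings.

    Handles double-quoted elements and backslash escapes; an unquoted NULL
    becomes None. Nested arrays are not supported in this sketch.
    """
    if len(text) < 2 or text[0] != "{" or text[-1] != "}":
        raise ValueError("not an array literal: %r" % text)

    def finish(chars, quoted):
        value = "".join(chars)
        return None if (value == "NULL" and not quoted) else value

    items, current = [], []
    in_quotes = was_quoted = False
    i, end = 1, len(text) - 1
    while i < end:
        ch = text[i]
        if in_quotes:
            if ch == "\\" and i + 1 < end:
                current.append(text[i + 1])  # keep the escaped character
                i += 1
            elif ch == '"':
                in_quotes = False
            else:
                current.append(ch)
        elif ch == '"':
            in_quotes = was_quoted = True
        elif ch == ",":
            items.append(finish(current, was_quoted))
            current, was_quoted = [], False
        else:
            current.append(ch)
        i += 1
    if end > 1:  # a non-empty body always has a final element
        items.append(finish(current, was_quoted))
    return items


pixel_types = parse_pg_array('{8BUI,16BSI,"a,b"}')
```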

After this, I started working on adding support for reading overviews. This is the command I executed to load the raster overviews:

gdal2wktraster.py -r utm.tif -t table -s 26711 -b 1 -k 100x100 -l 2 -V -I -M -v

Where table is the name of my raster table.

Right now, I’m finishing the overviews code and working on the support for out-db rasters.

Planned work for next week

Finish the support for overviews and out-db rasters, and try to implement the raster inplace update.

Problems found

  • The spatial query seems to work, but I’m not sure if it is the most appropriate:
    • With GIST index:
      SELECT rast  FROM table WHERE rast ~ ST_SetSRID(ST_MakeBox2D(
      ST_Point(lowerLeftX, lowerLeftY) ,ST_Point(upperRightX, upperRightY)),
      srid);
    • Without GIST index:
      SELECT rast  FROM table WHERE _ST_Contains(rast, ST_SetSRID(ST_MakeBox2D(
      ST_Point(lowerLeftX, lowerLeftY) ,ST_Point(upperRightX, upperRightY)),
      srid));
  • I spent a lot of time implementing two functions that already existed in GDAL :-( . These are CPLHexToBinary, in cpl_string.cpp, and GDALSwapWords, from rasterio.cpp. Actually, I changed my approach several times before finding these useful functions. It was a really stupid waste of time…
  • In previous versions, I converted to binary only the raster data fetched from the database, instead of the complete raster representation. This caused problems detecting the end of the data with several bands. Once changed, it worked.
  • Each band has its own pixel type, but I was assuming the same pixel type for all the bands of a given raster. It worked because all the bands actually had the same pixel type, but I noticed the error while testing the code and I changed it.
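To show what the two rediscovered helpers do, here is a rough Python equivalent of the hex-to-binary conversion and the word swapping. It is not GDAL's implementation, just an illustration; the function names are made up, and the sample payload assumes a WKB-style leading byte that flags the byte order (1 for little endian, 0 for big endian) followed by 16-bit values.

```python
import struct


def hex_to_binary(hex_string):
    """Convert a hex string such as '0A1B' into raw bytes."""
    return bytes.fromhex(hex_string)


def swap_words(data, word_size):
    """Reverse the byte order of every word of `word_size` bytes."""
    if len(data) % word_size:
        raise ValueError("data length is not a multiple of the word size")
    return b"".join(data[i:i + word_size][::-1]
                    for i in range(0, len(data), word_size))


def read_uint16_values(hex_string):
    """Decode a toy payload: one byte-order flag, then unsigned 16-bit values."""
    raw = hex_to_binary(hex_string)
    byte_order = "<" if raw[0] == 1 else ">"
    payload = raw[1:]
    count = len(payload) // 2
    return list(struct.unpack("%s%dH" % (byte_order, count), payload))


little = read_uint16_values("0101000200")  # flag 01: little endian
big = read_uint16_values("0000010002")     # flag 00: big endian
```

Both payloads decode to the same numbers, which is the whole point of honouring the byte-order flag instead of assuming the machine's own endianness.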


GSoC 09 Weekly Report #7: 03/07 – 10/07

Work done

This week I’ve written the code of an almost-finished basic version of the read-only driver.

Why almost? Because I still have to perform a spatial query, looking for all raster blocks that contain one point (the center of each block). In the case of regularly-blocked rasters, I’ll get only one block. Instead of this spatial query, I have a test one, which always reads the same block of data.

And why basic? Because the driver only works for rasters with 8 bits/pixel and one band of grey colour (the colour is statically returned, for now).

Apart from this, the driver seems to work. Here is the output of the command:

gdalinfo -mm -stats -checksum "PG:host='localhost' dbname='gsoc09_test'
user='postgres' password='postgres' table='usa_mountain_one_band'
where='rid > 0'"

captura_wktraster2

Yes, there is an error. It’s related to the fact that the Coordinate System description doesn’t have a part with UNIT["metre",1,AUTHORITY["EPSG","9001"]]. So, gdalinfo thinks that the coordinate units are degrees, and takes the UTM coordinates (the original image coordinates) as invalid angle values. I suppose the mistake was choosing an invalid srid (4267) when loading the data into PostGIS, but I’m not sure yet.

Oh, I sent the midterm evaluation this week. I almost forgot it…

Planned work for next week

  1. Fix the problem with “Invalid angle”.
  2. Change the testing query for a real one.
  3. Continue working to extend the basic driver to a more general one.

Problems found

Basically, some segmentation faults and memory leaks when moving over the HEXWKB representation of the raster. Solved, for now.


GSoC 09 Weekly Report #6: 26/06 – 03/07

Work done

I have updated the project trac page by gathering all the mails, post comments, related docs and new ideas together: http://trac.osgeo.org/gdal/wiki/WKTRasterDriver. So, I defined a new project plan, available for comments, of course.

I looked for more testing data, and I found ftp://ftp.remotesensing.org/geotiff/samples/. Useful.

Then, I continued working on the Dataset/RasterBand to achieve Objective 1 (see the project plan in trac). Still working…

Planned work for next week

Finish the basic version of the read-only driver: reading support for one-band rasters, without overviews or outdb support, for now. This is the first objective.

Problems found

I loaded this tif file for testing: ftp://ftp.remotesensing.org/geotiff/samples/gdal_eg/cea.tif

gdal2wktraster.py -r cea.tif -t mountain_one_band -s 4267 -b 1 -k -I -O -V

And I got an assertion error:

File "/home/jorge/gsoc09/src/wktraster/scripts/gdal2wktraster.py", line 855, in wkblify_raster
 assert band_ov is not None

This error is in the wkblify_raster function, when trying to get Overview 0 from RasterBand 1. The function “calculate_overviews” returns 0 overviews, but the loop is executed anyway (look at the commented part):

for nov in range(0, 1): # noverviews):

So, I decided to skip the overview creation (it’s not in my objective 1), load the raster without this option and investigate a bit more. My first check was whether I was using an old version of the gdal2wktraster script. Yes, I was. Time to svn update.
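The effect of that hard-coded loop bound is easy to reproduce in isolation. The two helper functions below are hypothetical and exist only to contrast the bounds; they are not part of gdal2wktraster.

```python
def overview_indices_hardcoded(noverviews):
    # Mirrors the loop quoted above: the bound is fixed at 1, so index 0
    # is visited even when there are no overviews at all.
    return list(range(0, 1))


def overview_indices_from_count(noverviews):
    # Uses the real overview count as the upper bound instead.
    return list(range(0, noverviews))


with_bug = overview_indices_hardcoded(0)
without_bug = overview_indices_from_count(0)
```

With zero overviews, the first version still tries index 0, which is exactly the situation where the assertion on band_ov fails.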

Once updated, I noticed some syntax changes. Basically:

  • The “-O” option no longer exists. A new option “-l OVERVIEW_LEVEL” is used instead (how do you know the limit for OVERVIEW_LEVEL?)
  • There is a new “-M” option, to execute the VACUUM command against the created tables. Good idea :-)
  • The “-k” option now takes an additional parameter, the desired block size. So, we don’t use the block size given by GDAL (why?)
  • As a consequence of the previous change, the “-m blocksize” option has been removed.


GSoC 09 Weekly Report #5: 19/06 – 26/06

Work done

This week my work was, principally, to create a testing environment: I need to see what I’m doing, and debug it, if needed. So, I have wktrastertest.c for testing purposes. Basically, I check that the connection is made correctly. Of course, I’ve also written the code for connecting to the database, and the basic methods for georeferencing: GetGeoTransform and GetProjectionRef.

Another important task was reading documentation:

  • Libtool: I knew of it, but I had never used it. Great. But compilation is slower. Frank suggested that I configure gdal without libtool (--without-libtool when executing the configure script).
  • Makefiles: OK, I’ve written makefiles before, but only now do I realize how powerful the make tool is. After reading the code of GDAL’s GNUMakefiles and make’s documentation[1], I feel like a real programmer :-)
  • PostgreSQL documentation about system catalogs: All the important information about the tables of a given PostgreSQL database is itself stored in tables. So, I had a look at the documentation to check which tables I needed in order to get info about my own tables (pg_class, pg_attribute and pg_type were enough).

Apart from the official documentation on these areas, I found these books really useful. From now on, “must-have” books:

  1. Linux programming unleashed (amazon.com)
  2. Beginning Linux programming (amazon.com)

Oh, I forgot this. I saw an old friend again…


I missed him ;-)

Problems encountered

The first problem was a physical one: a hardware crash :-( . My hard disk died, and I had to buy a new one. I lost part of my work but, thanks to Subversion, I could get back most of the code.

Anyway, for a few days, I became absorbed in the GDAL build system, learning how powerful the UNIX build system can be. At first, I was using the GNUMakefile from the PGCHIP driver, adapted for mine. Two things confused me:

  • $(OBJ:.o=.$(OBJ_EXT))
  • O_OBJ = $(foreach file,$(OBJ),../o/$(file))

First one: substitution reference[2]. As I understand it, it basically means that the extensions of all my objects will be replaced by the value of OBJ_EXT. I found that variable in GDALMake.opt (a key file). If my system has libtool (yes, it has), this extension is .lo. So, time for reading about libtool[3].

Second one: foreach function[4]. O_OBJ will have the value: ../o/firstobject.o ../o/secondobject.o, etc
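For readers more at home in Python than in make, the two constructs can be mimicked with a few lines. This only emulates the text transformations that make performs; the object file names are hypothetical, and the real work is of course done by make itself.

```python
def substitution_ref(words, old_suffix, new_suffix):
    """Emulate $(VAR:old=new): replace a trailing suffix on each word."""
    result = []
    for word in words:
        if word.endswith(old_suffix):
            word = word[:len(word) - len(old_suffix)] + new_suffix
        result.append(word)
    return result


def foreach_prefix(words, prefix):
    """Emulate $(foreach file,$(VAR),prefix$(file)): prepend a prefix to each word."""
    return [prefix + word for word in words]


# Hypothetical object list and the libtool extension mentioned in the text.
obj = ["wktrasterdataset.o", "wktrasterrasterband.o"]
obj_ext = "lo"

libtool_objects = substitution_ref(obj, ".o", "." + obj_ext)
o_obj = foreach_prefix(obj, "../o/")
```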

Now, as I said, I really know the power of the UNIX build system.

Another problem was finding my own way of testing code. Tamas, Mateusz and Frank gave me some useful advice. I think I can apply most of it to improve my work cycle.

The last problem was some doubts about whether or not to enable sequential scans in the database, and how useful it can be to know whether the table with the raster column has an index. I’m reading about these issues.

Planned work for next week

Next week I’ll try to finish the code of WKTRasterDataset’s Open method. I’ll also start coding the WKTRasterRasterBand class.

—–
1: http://www.gnu.org/software/make/manual
2: http://www.gnu.org/software/make/manual/html_node/Text-Functions.html#Text-Functions
3: http://www.gnu.org/software/libtool/manual/
4: http://www.gnu.org/software/make/manual/html_node/Foreach-Function.html#Foreach-Function


GSoC 09 Weekly Report #4: 12/06 – 19/06

First week of “real” coding. First of all, I decided which GDAL version to work with, and I took a snapshot. It was 1.7.0. In the first weekly report, I said that I had checked out the latest version of GDAL from the repository, but I needed to export a version (not check it out) in order to import it into my own repository.

The second thing was to create my own repository. Here it is: http://www.gis4free.org/gdal_wktraster.

Once I had a repository, I created the skeleton of the driver. This is:

  • WKTRaster.h: Main header file for the driver, with class definitions
  • WKTRasterDataset.cpp : Driver’s dataset
  • WKTRasterRasterband.cpp : Driver’s raster band
  • GNUMakefile: Makefile to build the driver under GNU environments
  • makefile.vc: Makefile to build the driver under Windows environments, with Visual C
  • install, todo, readme: Typical information files

The code is still only a skeleton, but it’s enough to test a configure-make-make install cycle. So, I did it. Following the instructions of the install file (taken from the PGCHIP driver), I added my driver to the supported ones and executed

gdalinfo --formats

getting this:

captura_wktraster

UPDATE 2009/09/16:

If you get the message

gdalinfo: error while loading shared libraries: libgdal.so: cannot open shared
object file: No such file or directory

while executing gdalinfo, the fastest solution is to add the libgdal.so directory (/usr/local/lib) to /etc/ld.so.conf and execute

sudo ldconfig

After this, you should be able to execute any gdal-related program without problems.

I’m currently working on writing code, of course.

Things to remark

  1. The connection string syntax. I chose this:
    PG:"dbname='db_name' host='host' port='port' user='user' password='passwd'
    table='table_name' where='sql_where'"

    Apart from the table and where options, the rest of the string is prepared to be passed directly to the PQconnectdb function. The sql_where part will be passed to the server as it is. No SQL processing here. Anyway, this decision is open to comments.

  2. I’ve installed my own trac software to manage the bugs and tasks of this project, without interfering with the official GDAL trac.
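The split described in the first point (driver-specific options on one side, everything else passed on to libpq) can be sketched like this. It is a Python illustration rather than the driver's C++ code: the function name and the regular expression are assumptions, and values containing single quotes are not handled.

```python
import re

# key='value' pairs; values containing single quotes are out of scope here.
_PAIR = re.compile(r"(\w+)='([^']*)'")

# Options consumed by the driver itself instead of being sent to libpq.
_DRIVER_KEYS = ("table", "where")


def split_connection_string(conn_string):
    """Return (libpq connection string, dict of driver-specific options)."""
    if not conn_string.startswith("PG:"):
        raise ValueError("connection string must start with 'PG:'")
    pairs = _PAIR.findall(conn_string[3:])
    driver_options = {key: value for key, value in pairs if key in _DRIVER_KEYS}
    libpq_part = " ".join("%s='%s'" % (key, value)
                          for key, value in pairs if key not in _DRIVER_KEYS)
    return libpq_part, driver_options


libpq_part, options = split_connection_string(
    "PG:host='localhost' dbname='gsoc09_test' user='postgres' "
    "password='postgres' table='usa_mountain_one_band' where='rid > 0'"
)
```

Here libpq_part is what would be handed to the connection function, while options carries the table name and the where clause, which goes to the server untouched.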

Doubts and problems

  • What should the “perfect” method header comment look like? I couldn’t find a recommendation for this. When I start a method, I usually put this information in the header comment:
    • Name of the method
    • What does the method do
    • Inputs
    • Outputs

    Is that enough? Maybe I could list the external libraries used, for example. Suggestions?

  • What exactly is a sibling list? I read about this concept in the OGR datasource code, but I didn’t find further information, and I think it may be important.
  • CPL/VSI functions. Used for portability and for allowing virtual file systems. Really interesting. I would like to have more information on them, because they are widely used in the GDAL code, and maybe I should learn how to use them properly.
  • Ignore list. I created an ignore list with the command svn propedit svn:ignore. Then, I added patterns to be ignored, one per line. Is this the correct way?
  • Compiling. I only compile the WKTRaster code. Is that correct? When I have working code, I compile the whole library again.

Next week

I will focus on the code to connect to the database and to check whether the selected table has a raster column. I will write code for testing, of course.

Related to this, I will need some test data. Where could I find it?


GSoC 09 weekly report #3: 05/06 – 12/06

After my exams, I’m back to work. I’ve prepared a proposal for the driver’s implementation. It is totally in Request-For-Comments state, of course.

EDIT 2009-06-17:  Some errors corrected. New version of the document

GDAL WKT Raster driver specification proposal v0.2 (PDF 629KB)
GDAL WKT Raster driver specification proposal v0.1 (PDF 633KB)

This week, I’ll really start coding following the principles of the proposal.
