Opened 12 years ago
Closed 11 years ago
#267 closed task (fixed)
Tiling with rasimport
Reported by: | Dimitar Misev | Owned by: | Alexander Herzig |
---|---|---|---|
Priority: | critical | Milestone: | 9.0 |
Component: | rasgeo | Version: | 8.3 |
Keywords: | Cc: | a.beccati@…, Peter Baumann, joachim.ungar@…, HerzigA@…, Piero Campalani | |
Complexity: | Trivial |
Description
Check which tiling strategy rasimport uses. Is it fixed to a certain tile size and configuration, or does it adapt to the input?
Check whether it's easily possible to make it flexible (i.e. allow using the rasql storage layout sublanguage).
Update the documentation to include the tiling parameter, which has been implemented as a flexible way to specify the tiling strategy.
Attachments (4)
Change History (35)
comment:1 by , 12 years ago
Cc: | added |
---|---|
Description: | modified (diff) |
comment:2 by , 12 years ago
follow-up: 5 comment:3 by , 12 years ago
Replying to dmisev:
Check what is the tiling strategy that rasimport uses.
rasimport uses rasql's 'insert into COLLNAME values … ' statement without specifying any tiling scheme at all. BTW, is there a default tiling scheme for this case?
Check whether it's easily possible to make it flexible (i.e. allow using the rasql storage layout sublanguage).
As Alan has suggested, easiest would be to have an option like
--tiling '<here comes the tiling spec as string parameter>'
and if it's specified, it just goes at the end of the 'insert into COLLNAME values …' statement.
Would that work?
comment:4 by , 12 years ago
Alex, that should work. The only inconvenience is that the string has to be properly enclosed in quotes to make it one shell word, such as:
$ rasimport … --tiling "area of interest [blabla]"
…which seems acceptable. So I favor Alan's suggestion, too.
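The quoting requirement can be illustrated with Python's standard `shlex` module, which splits a string the way a POSIX shell would. This is only a sketch: the option is written as `--tiling`, which is an assumption about its spelling, and the tiling spec is the placeholder text from the comment above, not a valid rasql spec.

```python
import shlex

# Hypothetical command line; the spec text is a placeholder, not real rasql.
quoted = 'rasimport --tiling "area of interest [blabla]"'
unquoted = 'rasimport --tiling area of interest [blabla]'

# With quotes, the whole spec arrives as one argument.
argv = shlex.split(quoted)

# Without quotes, the spec is broken up at every space.
argv_unquoted = shlex.split(unquoted)

print(argv)
print(argv_unquoted)
```

With the quotes the spec stays a single word after the option; without them it is split into four separate arguments.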
follow-up: 6 comment:5 by , 12 years ago
Replying to herziga@…:
rasimport uses rasql's 'insert into COLLNAME values … ' statement without specifying any tiling scheme at all. BTW, is there a default tiling scheme for this case?
If I understood correctly, rasimport imports data by manually partitioning it into chunks of variable size (the chunk size is computed from some parameters)? So the tiles in the object are equivalent to the chunks that rasimport commits.
By default there's no tiling; I still need to change this to the most meaningful generic tiling spec.
comment:6 by , 12 years ago
Replying to dmisev:
If I understood correctly, rasimport imports data by manually partitioning it into chunks of variable size (the chunk size is computed from some parameters)? So the tiles in the object are equivalent to the chunks that rasimport commits.
Yes, that's correct. rasimport first creates (insert into COLLNAME …) an initial one-pixel image (e.g. [0:0,0:0] for 2D) and then updates it successively by chunks of rows (e.g. update <COLLNAME> as m set m assign shift(<MDD>, <r_Point>) where oid(m) = <OID>). If I now specified a tiling scheme with the initial insert statement, would those incoming chunks (update statements) be adjusted to that scheme automatically, or would they have to be inserted in appropriate chunks (tiles) according to the scheme?
follow-up: 15 comment:7 by , 12 years ago
Yes it will automatically partition the update chunks according to the tiling scheme, but the problem is that it won't automatically accommodate the existing tiles.
To give you an example, suppose the tiling is regular 512x512 tiles, but rasimport commits 750x512 chunks. Then the resulting tiles in rasdaman after inserting two chunks with rasimport will be as [image not preserved]
but it should be as [image not preserved]
But this is a general issue of partial updates, not of rasimport I'd say. So as long as we use some fixed larger chunk size in rasimport (e.g. 100MB) I think issues like this will be minimized.
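The 512 vs. 750 example can be worked through along the chunked axis. The sketch below is one reading of the comment above, not confirmed server behaviour: each update chunk is cut at the regular tile boundaries, but pieces from different chunks are never merged. The function names are hypothetical.

```python
def split_at_tile_borders(start, end, tile):
    """Split the inclusive interval [start, end] at multiples of `tile`."""
    pieces = []
    lo = start
    while lo <= end:
        # Last index of the tile that contains `lo`, clipped to the interval.
        hi = min(end, (lo // tile + 1) * tile - 1)
        pieces.append((lo, hi))
        lo = hi + 1
    return pieces


def tiles_after_partial_updates(chunk, tile, n_chunks):
    """Tiles along one axis when each chunk is partitioned on its own."""
    tiles = []
    for i in range(n_chunks):
        tiles.extend(split_at_tile_borders(i * chunk, (i + 1) * chunk - 1, tile))
    return tiles


# Two chunks of extent 750 with tile extent 512 (numbers from the comment).
actual = tiles_after_partial_updates(750, 512, 2)
# Layout if the whole extent 0..1499 had been tiled in one go.
ideal = split_at_tile_borders(0, 2 * 750 - 1, 512)

print(actual)  # [(0, 511), (512, 749), (750, 1023), (1024, 1499)]
print(ideal)   # [(0, 511), (512, 1023), (1024, 1499)]
```

Under this reading, the tile that should cover 512..1023 ends up as two fragments, 512..749 and 750..1023, because the second update does not extend the partial tile left by the first.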
follow-up: 9 comment:8 by , 12 years ago
So in conclusion: I think the --tiling "tiling_spec" which will be passed verbatim at the end of the first insert statement as "tiling tiling_spec" is a pretty good solution.
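As a minimal sketch of this proposal (not rasimport's actual code), the statement could be assembled as follows. `build_insert` and its parameters are hypothetical names, and `<MDD>` is an opaque placeholder for the values expression, as in the update statement quoted earlier in this ticket.

```python
def build_insert(collection, values_expr, tiling_spec=None):
    """Build the initial insert statement, optionally with a tiling clause.

    The tiling spec is appended verbatim; no parsing or validation is done.
    """
    stmt = f"insert into {collection} values {values_expr}"
    if tiling_spec:
        stmt += f" tiling {tiling_spec}"
    return stmt


plain = build_insert("COLLNAME", "<MDD>")
tiled = build_insert("COLLNAME", "<MDD>", "aligned [0:499,0:499] tile size 250000")

print(plain)
print(tiled)
```

Passing the spec through untouched keeps the client independent of the storage layout sublanguage; the server remains the only place where the spec is interpreted.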
comment:9 by , 12 years ago
Replying to dmisev:
So in conclusion: I think the --tiling "tiling_spec" which will be passed verbatim at the end of the first insert statement as "tiling tiling_spec" is a pretty good solution.
Sweet! BTW, rasimport uses 128MiB as chunksize
follow-up: 11 comment:10 by , 12 years ago
To me it seemed like it's variable, because rasimport in certain cases was creating a huge number of small tiles in my experience.
comment:11 by , 12 years ago
Replying to dmisev:
To me it seemed like it's variable, because rasimport in certain cases was creating a huge number of small tiles in my experience.
Mmmh, that's interesting. rasimport uses a fixed number of rows (nrows = maxMem_bytes / (numColumns * pixelsize_bytes)) for each iteration step, only the last chunk may be smaller. Perhaps I'm missing something in my own code?? BTW it's in rasimport's importImage(…) function.
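The row formula can be tried out with a short sketch. It assumes integer division and at least one row per chunk, neither of which is stated above, and the function names and image dimensions are made up for illustration.

```python
def rows_per_chunk(max_mem_bytes, num_columns, pixel_size_bytes):
    """nrows = maxMem_bytes / (numColumns * pixelsize_bytes), rounded down.

    Clamping to at least one row is an assumption for very wide images.
    """
    return max(1, max_mem_bytes // (num_columns * pixel_size_bytes))


def chunk_row_ranges(num_rows, rows):
    """Inclusive (first_row, last_row) pairs; only the last may be shorter."""
    return [(r, min(r + rows, num_rows) - 1) for r in range(0, num_rows, rows)]


MAX_MEM = 128 * 1024 * 1024  # the 128 MiB limit mentioned in this ticket

# Hypothetical image: 10000 columns, 4 bytes per pixel.
rows = rows_per_chunk(MAX_MEM, 10000, 4)
print(rows)  # 3355

# Hypothetical image with 8000 rows: two full chunks and a shorter last one.
print(chunk_row_ranges(8000, rows))  # [(0, 3354), (3355, 6709), (6710, 7999)]
```

The number of rows per chunk depends on the image width and pixel size, so the byte limit is fixed while the chunk shape is not.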
comment:12 by , 12 years ago
Yes, this formula is pretty much what I ended up with when I investigated, but I didn't have time to try to understand the reason for it. So apparently the chunk size is not fixed, but perhaps what you mean is that it's limited to 128MB?
comment:13 by , 12 years ago
Sorry, you're right, that's what I meant. The reason is to be able to process images which don't fit into RAM; 128MiB is just an arbitrary choice. We could turn it into a parameter though?
comment:14 by , 12 years ago
Milestone: | → 8.4 |
---|
comment:15 by , 12 years ago
Replying to dmisev:
I just made an initial test with rasgeo and the new tiling option. Unfortunately, it doesn't seem to work with the current rasgeo workflow logic of importing an image as chunks of rows by partial updates:
rasimport -f t1.img -coll t1tiled1 -tiling "tiling regular [0:499,0:499] index rc_index"
ERROR - rimport::main, l. 1371: Exception: The tile configuration is incompatible to the marray domain.
I assume 'compatible' means that the chunk size must not be smaller than the tile size (for any or all dimensions?). If that's the case, we would have to adjust rasgeo so that the chunk size is made compatible with the tile size. This would involve revising the whole logic for partitioning the data, as well as adding the capability to parse the tiling specification in the first place. Since you mentioned earlier that partial updates and accommodating existing tiles is more of a server than a client problem, I was wondering whether you've got any ideas on how to proceed in this case. Is this something you're going to address in the future, or do we have to implement 'tiling upon import' for large data on the client side?
follow-up: 17 comment:16 by , 12 years ago
The regular tiling is a bit constrained: it has to divide the image domain evenly, and even then there may still be a problem with the chunks, I'm not sure.
Can you maybe try aligned tiling and leave out the index? E.g.
tiling aligned [0:499,0:499] tile size 250000
(multiply the tile size by the type size)
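Both points from this comment can be checked with a small sketch. It reads "multiply the tile size by the type size" as cell count times bytes per cell, which is an interpretation, and the function names and domain extents are hypothetical.

```python
def tile_size_bytes(tile_extents, cell_size_bytes):
    """Number of cells in the tile multiplied by the size of one cell."""
    cells = 1
    for extent in tile_extents:
        cells *= extent
    return cells * cell_size_bytes


def divides_evenly(domain_extents, tile_extents):
    """True if the tile extent divides the domain extent on every axis."""
    return all(d % t == 0 for d, t in zip(domain_extents, tile_extents))


# [0:499,0:499] is 500 x 500 cells.
print(tile_size_bytes((500, 500), 1))  # 250000, as in the spec above
print(tile_size_bytes((500, 500), 4))  # 1000000 for a 4-byte cell type

# Hypothetical image domains checked against 500 x 500 regular tiles.
print(divides_evenly((1000, 1500), (500, 500)))  # True
print(divides_evenly((1200, 1500), (500, 500)))  # False
```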
comment:17 by , 12 years ago
Replying to dmisev:
Aligned tiling seems to work with rasimport; at least it doesn't throw any exceptions and the image is imported correctly. However, I don't know how to check whether the tiling is correct.
Strangely enough, I couldn't import an image using partial updates and aligned tiling on the command line (see tilingtest_2.txt). I also tried directional tiling on the command line, but it didn't work either using the 'partial update' workflow (and hence failed with rasimport). So it seems only aligned tiling works with partial updates and therefore with rasimport. See tilingtest_2.txt for the few tests I did.
by , 12 years ago
Attachment: | tilingtest_2.txt added |
---|
comment:18 by , 12 years ago
Yes, with directional tiling it won't work, because it expects the limits you give when you insert the array to match the domain of the inserted array, unless the dimension is marked as *.
Maybe you can attach a patch here and I'll check if the aligned tiling worked well.
comment:19 by , 12 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
by , 12 years ago
Attachment: | 0001-provisional-patch-adding-tiling-support-to-rasimport.patch added |
---|
comment:21 by , 12 years ago
Oh I didn't notice a patch was uploaded, trac doesn't seem to send notifications for attachment uploads.
The patch is fine; it's just missing a README update for the new parameter. It can be applied, and we can fix the README later.
comment:23 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
follow-up: 25 comment:24 by , 12 years ago
Complexity: | → Medium |
---|
Alex, thanks a lot for patching! However, I agree with Dimitar that a README file or help entry would be useful for us. Should we reopen the ticket?
comment:25 by , 12 years ago
Cc: | added |
---|
Replying to ungarj:
Very good point, Joachim, we shouldn't forget about that. Not quite sure how to handle this: shall we re-open this one or open a new one?
comment:26 by , 12 years ago
Complexity: | Medium → Trivial |
---|---|
Description: | modified (diff) |
Priority: | major → minor |
Resolution: | fixed |
Status: | closed → reopened |
Reopened and updated accordingly.
comment:27 by , 12 years ago
Milestone: | 8.4 → 8.5 |
---|
comment:28 by , 12 years ago
Description: | modified (diff) |
---|---|
Priority: | minor → critical |
We got some feedback from users about the missing documentation, so I'm raising the priority.
comment:29 by , 12 years ago
Status: | reopened → assigned |
---|
comment:30 by , 11 years ago
Milestone: | 8.5 → 9.0 |
---|
A command-line option carrying the tiling substring, which rasgeo then passes on to the insert statement, would probably be the most flexible.