xmerge - a program for merging and straightening images
Copyright (C) 2003 Johan Borg

xmerge is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your
option) any later version.

xmerge is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License along
with xmerge; if not, write to the Free Software Foundation, Inc., 59
Temple Place, Suite 330, Boston, MA 02111-1307 USA

Included in this distribution is a modified version of the SQP solver
hqp, by Ruediger Franke. hqp in turn includes parts of the meschach
library by David E. Stewart & Zbigniew Leyk.


Using xmerge

Things you should know:

* reading the whole documentation is probably quite essential
* xmerge only operates on ppm-files with 24bpp color in binary form (P6)
* xmerge only supports X color depths of 16, 24 and 32bpp, not all of
  them heavily tested.

Introduction to xmerge:

When the user has provided a set of rules for how the image(s) should
look (which points should overlap, horizontal and vertical references,
etc.), xmerge uses an SQP solver to calculate how the images should be
placed for these rules to be fulfilled, while minimizing the more
severe types of distortion of the images.

The SQP solver used is HQP, by R. Franke. The original implementation
is intended for optimal control problems, and is quite bloated for
simpler applications where only the good SQP solver is used. As a
result, I decided to strip it down a bit. Unfortunately, I don't have
the time or interest to complete this effort, so the contents of hqp/
are in a somewhat sorry shape (while usable, many options and
alternative parts of the algorithm are not included).
In any case, if you wish to use hqp for anything else, I strongly
suggest you take a look at the original sources and their
documentation. Also, using SQP for this problem is probably quite a
bit of overkill; it is quite possible that some heuristic could be
used to transform the problem into a simpler type of optimization.

When the "optimal" mapping of the images has been determined, the user
is supposed to select edges, internal to the mapped images, which are
used for calculating the weighting of the individual images where two
images overlap. (Areas of an image that lie outside these edges are
not used at all.) The working image used at this stage is an
un-interpolated version of the final image, as the interpolation takes
considerable time, and even without interpolation generating the image
is quite slow.

The preview and final images are generated using interpolation by
Elliptically Weighted Averages, which seems to give superior results
compared to most other methods, while being considerably slower. A
truncated Gaussian function is used for filtering, but sin(x)/x may
give superior results if a large enough window is used. This option
might be implemented in future versions.

The mappings used for the images are of the form

    X = x*(a + b*y) + c*y + d
    Y = y*(e + f*x) + g*x + h

which in retrospect may be a suboptimal choice, as the inverse
coordinate mapping is really messy. Furthermore, other mappings should
give better results when applied to specific uses, like merging
photographs. There are some plans for implementing more mappings at a
later time. A problem with inverse mappings for photographs is that a
surface in 3D only has 6 degrees of freedom, while parts of the
current version assume 8 degrees. One rather attractive option may be
to implement 2 degrees of 2nd order distortion, which could compensate
for imperfect optical components. (Any opinions on whether this is a
good idea? Any ideas about better ways of modeling distortion from
optics?
Something with circular symmetry perhaps?)

A more generic approach, where mappings can have arbitrary freedom and
different mappings may be combined (Ex: correcting for optical errors
before inverse Z mapping, and then mapping the result on a sphere)
would be really nice, but far more work.


Command line options ("xmerge -h"):

  -h   this message
  -o   set output file for final image, if written (default test.ppm)
  -O   set output file for mapping data, if written
  -l   load saved mapping data, mutually exclusive to specifying ppm
       files
  -m   magnification applied when solution is calculated (default 1.0)
  -M   magnification applied if mapping data are loaded
  -b   background color (default: 255,0,0 (red))
  -r   reference-color for gain calculation (gui: b)
       (default 255,255,255)
  -R   color gain applied if mapping data are loaded (default 0,0,0)
  -B   black level used when final image is created (default: 0,0,0)
       (color gain or reference level must be adjusted accordingly)

  options implying non interactive operation: (in order of execution)
  -s   solve loaded problem
  -W   write current mapping data
  -w   write final image

Input file(s) can be either -l OR a list of ppm (P6) files.

When image files are specified, the images are placed on rows from
left to right. The special file name / causes the remaining images to
be placed on a new row one position down. Empty arguments ("" in most
shells) can be used to create empty locations.

Example:

    $ xmerge -o foo.ppm -O foo.txt foo00.ppm foo01.ppm foo02.ppm / foo10.ppm foo11.ppm foo12.ppm

will start xmerge with six images in two rows and three columns, write
the final image (when requested in the gui, by pressing "w") to the
file foo.ppm, and write the current state of the variables used in the
mapping to the file foo.txt (when "W" is used in the gui). This file
can later be loaded into xmerge using "xmerge -l foo.txt".
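
The placement rule for the file list can be sketched in a few lines of
Python. This is only an illustration of the rule as described above,
not code taken from xmerge; the function name layout_grid and the use
of None for empty locations are assumptions made for the example.

```python
def layout_grid(args):
    """Arrange a file-name list into rows, following the rule above.

    Hypothetical helper, not part of xmerge:
    "/" starts a new row, "" leaves an empty location (stored as None).
    """
    rows = [[]]
    for name in args:
        if name == "/":
            rows.append([])          # remaining images go one row down
        elif name == "":
            rows[-1].append(None)    # empty location in the current row
        else:
            rows[-1].append(name)    # next position, left to right
    return rows


if __name__ == "__main__":
    files = ["foo00.ppm", "foo01.ppm", "foo02.ppm", "/",
             "foo10.ppm", "foo11.ppm", "foo12.ppm"]
    for row in layout_grid(files):
        print(row)
```

For the six-file example above this gives two rows of three names
each.
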
Using the command line option "-w" together with "-l" it is possible
to write the final image corresponding to the saved data, without
using the gui.


User Manual for the graphical user interface:

There are two main modes of operation, referred to as Source and
Target. Generic navigation in both modes is performed by clicking with
the 2nd (center/middle) button of the pointing device at the point
about which the image is to be centered. This is the only means of
moving around on an image zoomed to a size larger than the program
window.

keyboard commands valid in both modes:

  m   switch between source and target mode
  +   zoom in
  -   zoom out
  w   write the final image to disk
  W   write a loadable listfile describing the current state of the
      problem
  s   solve the problem specified by the relations set in the Source
      mode
  r   reset the solution to its initial state; useful if solving for
      impossible relations has been attempted, leaving a solution from
      which the solver is unable to continue even if the relations are
      corrected
  a   auto-improve the matching of the regions specifying
      point-relations, by using the current solution when mapping
      pixels. This command usually messes things up more than it
      improves them, and all point relations are affected, even those
      manually matched.
  q   exit xmerge

In Source mode an array of input images is displayed. This mode is
used for specifying the relations used for calculating the merging of
the images. There are two types of relations available in this
version:

Point relations, specifying two points in two different images which
are supposed to map to the same point when the final image is
produced. This is done by selecting an area of one image, moving the
resulting sub-image to another input image, and clicking again when
the regions of both images match. By default, xmerge tries to improve
the match, so only a rough match is required. This feature can be
turned off and on by pressing "i".
Additionally, when the automatic improvement is turned off, it is
possible to force the matching regions to stay at integer pixels. This
is toggled with "I". The center points of these rectangles are used as
the actual point-relations. This type of relation can be removed by
clicking in the blue/green rectangle specifying one side of a
point-pair, or moved by clicking in the red/yellow rectangle
specifying the other side.

Directional relations, forcing a line to take a certain direction in
the final image. When the source mode is entered, creation of point
relations is the default operation. By typing "1".."4", horizontal,
vertical, 45deg and -45deg directional relations, respectively, are
created. These relations are removed by clicking near the start of the
line (featuring a small line indicating the desired direction), or
moved by clicking near the end of the line.
is used to return to creating point relations.

Additionally, brightness adjust operations are performed by typing "b"
in source mode. When active, areas of the source images can be
selected for calculating the gain of each color required to make the
average color in these areas equal to the reference color specified at
the command line. When an additional area is specified, the values
from all areas selected within the image are combined. The gains can
be set to 1 (with weight 0 when new areas are specified) by typing
"B".
is used to return to creating point relations.

It should be remembered that each image has exactly 8 degrees of
freedom, that a point constraint shares 2 relations between 2 images,
while a direction constraint contributes only one relation, possibly
shared between two images. Specifying too many relations for an image
or a set of images will invariably result in something useless,
usually collapsing all images to cover one point.
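
As a rough aid, the counting above can be written down as a small
Python sketch. The function name and the idea of a single global tally
are assumptions made for illustration; xmerge does not necessarily
check anything like this, and a positive result only means the
relations are not obviously too many, not that they can be satisfied.

```python
def remaining_freedom(n_images, n_point_relations, n_direction_relations):
    """Tally degrees of freedom left after the given relations.

    Hypothetical helper, not part of xmerge. Uses the figures stated
    above: 8 degrees per image, 2 relations per point constraint and
    1 relation per direction constraint.
    """
    available = 8 * n_images
    used = 2 * n_point_relations + 1 * n_direction_relations
    return available - used


if __name__ == "__main__":
    # Four images, eight point relations, no direction relations.
    left = remaining_freedom(4, 8, 0)
    print("degrees of freedom left:", left)
    if left < 0:
        print("too many relations specified")
```

For example, four images with eight point relations and no direction
relations leave 32 - 16 = 16 degrees in this tally.
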
Even if the total number of constraints is not exceeded, it is still
possible to achieve fucked up results if the relations cannot possibly
be satisfied (example: 3 horizontal relations in one image). Bad
results also arise when badly placed relations are used. Example: it
is possible to solve a case where 2x2 images are used, each sharing
two points, but in order to match all 4 point relations where all the
images meet at the center, the images are (except for special cases)
stretched far from their desired shapes. The correct thing is to use
only 3 point relations where 4 images meet. Finally, 2 degrees of
freedom of the whole problem are used for the origin of the target
(which is later discarded) and 1 degree for the scaling of the image.

When all relations have been specified, "s" is used for solving the
problem. An SQP optimizer is applied to a problem where the relations
form equality constraints and a function describing absolute and
relative scaling, skewing, stretching and rotation of the images is
minimized. When the optimizer is finished (or has given up after
failing to find a solution), xmerge automatically switches to target
mode.

Keyboard commands in Source mode (summary):

  1..4  angle relation submode
  b     brightness adjust
  B     reset brightness (for all images)
  i     toggle automatic improvement of point-relations
  I     toggle integer placement of point relations in manual mode
        abort operation in progress, or return to point relation
        creation, if no operation in progress

In Source mode, pressing and holding the 3rd (usually right) pointing
device button hides the swapped images covering point relations,
useful when manually matching images or for verifying the performance
of the automatic match improvement.

Target mode displays a "fast" but inaccurate rendering of the final
image.
In its normal mode of operation, the outer edges of each image used
when the images are merged (yellow/black lines) are controlled. These
edges are modified either by clicking on a corner and moving it, or by
clicking on an edge to insert a new corner to be placed. Corners can
be removed by pressing "d" while a corner is being moved. When two
images overlap, the ratio of the products of the distances to the two
closest edges (of this type, not the actual edge of the image) of each
image covering the pixel is used for weighting their individual
contributions.

By typing "c", the crop rectangle for the final image (red/black
lines) is selected and can be modified (click on an edge and drag it
to a new position).

By typing "p", preview images of selected regions can be generated
without writing the final image to disk. These preview images are
displayed by executing execvp "xv -" (so you better make sure you have
xv somewhere in your PATH, if you wish to use this feature).

keyboard commands in Target mode: (summary)

  c   select crop rectangle (final image size) selection mode
  p   select preview mode
  d   delete current corner (when moving image-edge)
      abort operation in progress, or return to image edge
      modifications, if no operation in progress

In Target mode, clicking the 3rd (usually right) pointing device
button changes the order in which overlapping images are drawn, by
placing the bottom image at the top.

xmerge prints rather large amounts of debugging information to stdout,
most of which is probably rather incomprehensible, but occasional
messages about the progress of the current operation may still be
useful.

That should be enough, hopefully...