LibXDiff(3)                File Differential Library               LibXDiff(3)



NAME
       xdl_set_allocator, xdl_malloc, xdl_free, xdl_realloc, xdl_init_mmfile,
       xdl_free_mmfile, xdl_mmfile_iscompact, xdl_seek_mmfile,
       xdl_read_mmfile, xdl_write_mmfile, xdl_writem_mmfile,
       xdl_mmfile_writeallocate, xdl_mmfile_first, xdl_mmfile_next,
       xdl_mmfile_size, xdl_mmfile_cmp, xdl_mmfile_compact, xdl_diff,
       xdl_patch, xdl_merge3, xdl_bdiff, xdl_bdiff_tgsize, xdl_bpatch - File
       Differential Library support functions


SYNOPSIS
       #include <xdiff.h>

       int xdl_set_allocator(memallocator_t const *malt);
       void *xdl_malloc(unsigned int size);
       void xdl_free(void *ptr);
       void *xdl_realloc(void *ptr, unsigned int nsize);
       int xdl_init_mmfile(mmfile_t *mmf, long bsize, unsigned long flags);
       void xdl_free_mmfile(mmfile_t *mmf);
       int xdl_mmfile_iscompact(mmfile_t *mmf);
       int xdl_seek_mmfile(mmfile_t *mmf, long off);
       long xdl_read_mmfile(mmfile_t *mmf, void *data, long size);
       long xdl_write_mmfile(mmfile_t *mmf, void const *data, long size);
       long xdl_writem_mmfile(mmfile_t *mmf, mmbuffer_t *mb, int nbuf);
       void *xdl_mmfile_writeallocate(mmfile_t *mmf, long size);
       void *xdl_mmfile_first(mmfile_t *mmf, long *size);
       void *xdl_mmfile_next(mmfile_t *mmf, long *size);
       long xdl_mmfile_size(mmfile_t *mmf);
       int xdl_mmfile_cmp(mmfile_t *mmf1, mmfile_t *mmf2);
       int xdl_mmfile_compact(mmfile_t *mmfo, mmfile_t *mmfc, long bsize, unsigned long flags);
       int xdl_diff(mmfile_t *mmf1, mmfile_t *mmf2, xpparam_t const *xpp, xdemitconf_t const *xecfg, xdemitcb_t *ecb);
       int xdl_patch(mmfile_t *mmf, mmfile_t *mmfp, int mode, xdemitcb_t *ecb, xdemitcb_t *rjecb);
       int xdl_merge3(mmfile_t *mmfo, mmfile_t *mmf1, mmfile_t *mmf2, xdemitcb_t *ecb, xdemitcb_t *rjecb);
       int xdl_bdiff(mmfile_t *mmf1, mmfile_t *mmf2, bdiffparam_t const *bdp, xdemitcb_t *ecb);
       long xdl_bdiff_tgsize(mmfile_t *mmfp);
       int xdl_bpatch(mmfile_t *mmf, mmfile_t *mmfp, xdemitcb_t *ecb);


DESCRIPTION
       The LibXDiff library implements basic and yet complete functionalities
       to create file differences/patches to both binary and text files. The
       library uses memory files as file abstraction to achieve both perfor-
       mance and portability. For binary files, LibXDiff implements (with some
       modification) the algorithm described in File System Support for Delta
       Compression by Joshua P. MacDonald, while for text files it follows
       directives described in An O(ND) Difference Algorithm and Its Varia-
       tions by Eugene W. Myers.  Memory files used by the library are basi-
       cally a collection of buffers that store the file content. There are
       two different requirements for memory files when passed to diff/patch
       functions. Text files for diff/patch functions require that a single
       line do not have to spawn across two different memory file blocks.
       Binary diff/patch functions require memory files to be compact.  A com-
       pact memory files is a file whose content is stored inside a single
       block.  Functionalities inside the library are available to satisfy
       these rules.  Using the XDL_MMF_ATOMIC memory file flag it is possible
       to make writes to not split the written record across different blocks,
       while the functions xdl_mmfile_iscompact() , xdl_mmfile_compact() and
       xdl_mmfile_writeallocate() are usefull to test if the file is compact
       and to create a compacted version of the file itself. The text file
       differential output uses the raw unified output format, by omitting the
       file header since the result is always relative to a single compare
       operation (between two files). The output format of the binary patch
       file is proprietary (and binary) and it is basically a collection of
       copy and insert commands, like described inside the MacDonald paper.



   Functions
       The following functions are defined:

       int xdl_set_allocator(memallocator_t const *malt);

              The LibXDiff library enable the user to set its own memory allo-
              cator, that will be used for all the following memory requests.
              The allocator must be set before to start calling the LibXDiff
              library with a call to xdl_set_allocator().  The memory alloca-
              tor structure contains the following members:

                   typedef struct s_memallocator {
                        void *(*malloc)(unsigned int size);
                        void (*free)(void * ptr);
                        void *(*realloc)(void *ptr, unsigned int nsize);
                   } memallocator_t;

              The malloc() function pointer will be used by LibXDiff to
              request a memory block of size bytes. The free() function
              pointer will be called to free a previously allocated block ptr
              , while the realloc() will be used to resize the ptr to a new
              nsize size in bytes. The LibXDiff user must call xdl_set_alloca-
              tor() before starting using the library, otherwise LibXDiff
              functions will fail due to the lack of memory allocation sup-
              port.  A typical initialization sequence for POSIX systems will
              use the standard malloc(3), free(3), realloc(3) and will look
              like:

                   memallocator_t malt;

                   malt.malloc = malloc;
                   malt.free = free;
                   malt.realloc = realloc;
                   xdl_set_allocator(&malt);



       void *xdl_malloc(unsigned int size);

              Allocates a memory block of size bytes using the LibXDiff memory
              allocator. The user can specify its own allocator using the
              xdl_set_allocator() function. The xdl_malloc() return a pointer
              to the newly allocated block, or NULL in case of failure.


       void xdl_free(void *ptr);

              Free a previously allocated memory block pointed by ptr.  The
              ptr block must has been allocated using either xdl_malloc() or
              xdl_realloc().


       void *xdl_realloc(void *ptr, unsigned int nsize);

              Resizes the memory block pointed by ptr to a new size nsize.
              Return the resized block if successful, or NULL in case the
              reallocation fails. After a successful reallocation, the old ptr
              block is to be considered no more valid.


       int xdl_init_mmfile(mmfile_t *mmf, long bsize, unsigned long flags);

              Initialize the memory file mmf by requiring an internal block
              size of bsize.  The flags parameter is a combination of the fol-
              lowing flags :

              XDL_MMF_ATOMIC Writes on the memory file will be atomic. That
              is, the data will not be split on two or more different blocks.

              Once an xdl_init_mmfile() succeeded, a matching
              xdl_free_mmfile() must be called when the user has done using
              the memory file, otherwise serious memory leaks will happen.
              The function return 0 if succeed or -1 if an error is encoun-
              tered.


       void xdl_free_mmfile(mmfile_t *mmf);

              Free all the data associated with the mmf memory file.


       int xdl_mmfile_iscompact(mmfile_t *mmf);

              Returns an integer different from 0 if the mmf memory file is
              compact, 0 otherwise. A compact memory file is one that have the
              whole content stored inside a single block.


       int xdl_seek_mmfile(mmfile_t *mmf, long off);

              Set the current data pointer of the memory file mmf to the spec-
              ified offset off from the beginning of the file itself. Returns
              0 if successful or -1 if an error happened.


       long xdl_read_mmfile(mmfile_t *mmf, void *data, long size);

              Request to read size bytes from the memory file mmf by storing
              the data inside the data buffer. Returns the number of bytes
              read into the data buffer. The amount of data read can be lower
              than the specified size.  The function returns -1 if an error
              happened.


       long xdl_write_mmfile(mmfile_t *mmf, void const *data, long size);

              Request to write size bytes from the specified buffer data into
              the memory file mmf.  If the memory file has been created using
              the XDL_MMF_ATOMIC flag, the write request will not be split
              across different blocks.  Note that all write operations done on
              memory files do append data at the end the file, and writes in
              the middle of it are allowed. This is because the library memory
              file abstraction does not need this functionality to be avail-
              able.  The function returns the number of bytes written or a
              number lower than size if an error happened.


       long xdl_writem_mmfile(mmfile_t *mmf, mmbuffer_t *mb, int nbuf);

              Request to sequentially write nbuf memory buffers passed inside
              the array mb into the memory file mmf.  The memory buffer struc-
              ture is defined as :

                   typedef struct s_mmbuffer {
                        char *ptr;
                        long size;
                   } mmbuffer_t;

              The ptr field is a pointer to the user data, whose size is spec-
              ified inside the size structure field. The function returns the
              total number of bytes written or a lower number if an error
              happened.


       void *xdl_mmfile_writeallocate(mmfile_t *mmf, long size);

              The function request to allocate a write buffer of size bytes in
              the mmf memory file and returns the pointer to the allocated
              buffer. The user will have the responsibility to store size
              bytes (no more, no less) inside the memory region pointed to by
              the returned pointer.  The files size will grow of size bytes as
              a consequence of this operation. The function will return NULL
              if an error happened.


       void *xdl_mmfile_first(mmfile_t *mmf, long *size);

              The function is used to return the first block of the mmf memory
              file block chain. The size parameter will receive the size of
              the block, while the function will return the pointer the the
              first byte of the block itself. The function returns NULL if the
              file is empty.


       void *xdl_mmfile_next(mmfile_t *mmf, long *size);

              The function is used to return the next block of the mmf memory
              file block chain. The size parameter will receive the size of
              the block, while the function will return the pointer the the
              first byte of the block itself. The function returns NULL if the
              current block is the last one of the chain.


       long xdl_mmfile_size(mmfile_t *mmf);

              The function returns the size of the specified memory file mmf.


       int xdl_mmfile_cmp(mmfile_t *mmf1, mmfile_t *mmf2);

              Request to compare two memory files mmf1 and mmf2 and returns 0
              if files are identical, or a value different from 0 if files are
              different.


       int xdl_mmfile_compact(mmfile_t *mmfo, mmfile_t *mmfc, long bsize,
       unsigned long flags);

              Request to create a compact version of the memory file mmfo into
              the (uninitialized) memory file mmfc.  The bsize parameter spec-
              ify the requested block size and flags specify flags to be used
              to create the new mmfc memory file (see xdl_init_mmfile() ). The
              function returns 0 if succedded or -1 if an error happened.


       int xdl_diff(mmfile_t *mmf1, mmfile_t *mmf2, xpparam_t const *xpp,
       xdemitconf_t const *xecfg, xdemitcb_t *ecb);

              Request to create the difference between the two text memory
              files mmf1 and mmf2.  The mmf1 memory files is considered the
              "old" file while mmf2 is considered the "new" file. So the func-
              tion will create a patch file that once applied to mmf1 will
              give mmf2 as result. Files mmf1 and mmf2 must be atomic from a
              line point of view (or, as an extreme, compact), that means that
              a single test line cannot spread among different memory file
              blocks.  The xpp parameter is a pointer to a structure :

                   typedef struct s_xpparam {
                        unsigned long flags;
                   } xpparam_t;

              that is used to specify parameters to be used by the file dif-
              ferential algorithm.  The flags field is a combination of the
              following flags :


              XDF_NEED_MINIMAL Requires the minimal edit script to be found by
              the algorithm (may be slow).

              The xecfg parameter point to a structure :

                   typedef struct s_xdemitconf {
                        long ctxlen;
                   } xdemitconf_t;

              that is used to configure the algorithm responsible of the cre-
              ation the the differential file from an edit script. The ctxlen
              field is used to specify the amount of context to be emitted
              inside the differential file (the value 3 is suggested for nor-
              mal operations).  The parameter ecb is a pointer to a structure
              :

                   typedef struct s_xdemitcb {
                        void *priv;
                        int (*outf)(void *, mmbuffer_t *, int);
                   } xdemitcb_t;

              that is used by the differential file creation algorithm to emit
              the created data. The priv field is an opaque pointer to a user
              specified data, while the outf field point to a callback func-
              tion that is called internally to emit algorithm generated data
              rappresenting the differential file. The first parameter of the
              callback is the same priv field specified inside the xdemitcb_t
              structure. The second parameter point to an array of mmbuffer_t
              (see above for a definition of the structure) whose element
              count is specified inside the last parameter of the callback
              itself. The callback will always be called with entire records
              (lines) and never a record (line) will be emitted using two dif-
              ferent callback calls. This is important because if the called
              will use another memory file to store the result, by creating
              the target memory file with XDL_MMF_ATOMIC will guarantee the
              "atomicity" of the memory file itself.  The function returns 0
              if succeeded or -1 if an error occurred.


       int xdl_patch(mmfile_t *mmf, mmfile_t *mmfp, int mode, xdemitcb_t *ecb,
       xdemitcb_t *rjecb);

              Request to patch the memory file mmf using the patch file stored
              in mmfp.  The mmf memory file is not changed during the opera-
              tion and can be considered as read only.  The mode parameter can
              be one of the following values :


              XDL_PATCH_NORMAL Perform standard patching like if the patch
              memory file mmfp has been created using mmf as "old" file.

              XDL_PATCH_REVERSE Apply the reverse patch. That means that the
              mmf memory file has to be considered as if it was specified as
              "new" file during the differential operation ( xdl_diff() ). The
              result of the operation will then be the file content that was
              used as "old" file during the differential operation.

              The ecb will be used by the patch algorithm to create the result
              file while the rjecb will be used to emit all differential
              chunks that cannot be applied.  Like explained above, callbacks
              are always called with entire records to guarantee atomicity of
              the resulting output.  The function returns 0 if succeeded with-
              out performing any fuzzy hunk detection, a positive value if it
              secceeded with fuzzy hunk detection or -1 if an error occurred
              during the patch operation.


       int xdl_merge3(mmfile_t *mmfo, mmfile_t *mmf1, mmfile_t *mmf2,
       xdemitcb_t *ecb, xdemitcb_t *rjecb);

              Merges three files together. The mmfo file is the original one,
              while mmf1 and mmf2 are two modified versions of mmfo.  The
              function works by creating a differential between mmfo and mmf2
              and by applying the resulting patch to mmf1.  Because of this
              sequence, mmf1 changes will be privileged against the ones of
              mmf2.  The ecb will be used by the patch algorithm to create the
              result file while the rjecb will be used to emit all differen-
              tial chunks that cannot be applied.  Like explained above, call-
              backs are always called with entire records to guarantee atomic-
              ity of the resulting output.  The function returns 0 if suc-
              ceeded or -1 if an error occurred during the patch operation.


       int xdl_bdiff(mmfile_t *mmf1, mmfile_t *mmf2, bdiffparam_t const *bdp,
       xdemitcb_t *ecb);

              Request to create the difference between the two text memory
              files mmf1 and mmf2.  The mmf1 memory files is considered the
              "old" file while mmf2 is considered the "new" file. So the func-
              tion will create a patch file that once applied to mmf1 will
              give mmf2 as result. Files mmf1 and mmf2 must be compact to make
              it easy and faster to perform the difference operation.  Func-
              tions are available to check for compactness ( xdl_mmfile_iscom-
              pact() ) and to make compact a non-compact file (
              xdl_mmfile_compact() ). An example of how to create a compact
              memory file (described inside the test subdirectory) is :

                   int xdlt_load_mmfile(char const *fname, mmfile_t *mf, int binmode) {
                        char cc;
                        int fd;
                        long size, bsize;
                        char *blk;

                        if (xdl_init_mmfile(mf, XDLT_STD_BLKSIZE, XDL_MMF_ATOMIC) < 0)
                             return -1;
                        if ((fd = open(fname, O_RDONLY)) == -1) {
                             perror(fname);
                             xdl_free_mmfile(mf);
                             return -1;
                        }
                        if ((size = bsize = lseek(fd, 0, SEEK_END)) > 0 && !binmode) {
                             if (lseek(fd, -1, SEEK_END) != (off_t) -1 &&
                                 read(fd, &cc, 1) && cc != '\n')
                                  bsize++;
                        }
                        lseek(fd, 0, SEEK_SET);
                        if (!(blk = (char *) xdl_mmfile_writeallocate(mf, bsize))) {
                             xdl_free_mmfile(mf);
                             close(fd);
                             return -1;
                        }
                        if (read(fd, blk, (size_t) size) != (size_t) size) {
                             perror(fname);
                             xdl_free_mmfile(mf);
                             close(fd);
                             return -1;
                        }
                        close(fd);
                        if (bsize > size)
                             blk[size] = '\n';
                        return 0;
                   }

              The bdp parameter points to a structure :

                   typedef struct s_bdiffparam {
                        long bsize;
                   } bdiffparam_t;

              that is used to pass information to the binary file differential
              algorithm.  The bsize parameter specify the size of the block
              that will be used to decompose mmf1 during the block classifica-
              tion phase of the algorithm (see MacDonald paper).  Suggested
              values go from 16 to 64, with a preferred power of two charac-
              teristic.  The ecb parameter is used to pass the emission call-
              back to the algorithm responsible of the output file creation.
              The function returns 0 if succeede or -1 if an error is
              occurred.


       long xdl_bdiff_tgsize(mmfile_t *mmfp);

              Given a binary memory file patch, it returns the size that the
              result file will have once the patch is applied to the target
              file. It can be used to pre-allocate (or write-allocate) a mem-
              ory block to store the patch result so that a compact file will
              be available at the end of the operation. The function returns
              the requested size, or -1 if an error occurred during the opera-
              tion.


       int xdl_bpatch(mmfile_t *mmf, mmfile_t *mmfp, xdemitcb_t *ecb);

              Request to patch the binary memory file mmf using the binary
              patch file stored in mmfp.  The mmf memory file is not changed
              during the operation and can be considered as read only. The
              binary patch algorithm has no notion of context, so the patch
              operation cannot be partial (either success or failure). The ecb
              parameter contain the callabck (see above for description) used
              by the binary patch algorithm to emit the result file. The func-
              tion returns 0 if succeeded or -1 if an error occurred during
              the patch operation.


SEE ALSO
       Two papers drove the content of this library and these are :

       o      File System Support for Delta Compression by Joshua P. MacDonald

       o      An O(ND) Difference Algorithm and Its Variations by Eugene W.
              Myers.


       Also usefull information can be looked up inside the diffutil GNU pack-
       age :

       http://www.gnu.org/software/diffutils/diffutils.html


LICENSE
       This library is free software; you can redistribute it and/or modify it
       under the terms of the GNU Lesser General Public License as published
       by the Free Software Foundation; either version 2.1 of the License, or
       (at your option) any later version.  A copy of the license is available
       at :

       http://www.gnu.org/copyleft/lesser.html


AUTHOR
       Developed by Davide Libenzi <davidel@xmailserver.org>


AVAILABILITY
       The latest version of LibXDiff can be found at :

       http://www.xmailserver.org/xdiff-lib.html


BUGS
       There are no known bugs.  Bug reports and comments to Davide Libenzi
       <davidel@xmailserver.org>




GNU                                   0.7                          LibXDiff(3)
