Global Refinement of Positions and Uncertainties 
                        (H.L. McCallon 981012)

The following outline describes a plan to refine 2MASS positions and
position uncertainties taking advantage of global overlap information as
contained in the database.  The position corrections are obtained by
"Martinizing" the position data.  The name comes from the fact that
Martin Weinberg originally proposed this type of approach a number of
years ago. Prior to that step, however, the original position uncertainties
must be adjusted to compensate for the fact that frame position uncertainty
contribution is currently set to a constant in the pipeline processing.
The described approach to achieve the needed uncertainty adjustments
resulted from a number of discussions between myself and John Fowler
and has elements in common with the position "Martinizing".

Before getting into the position refinement itself, a method for 
extracting the needed data from the database and setting it up for
efficient access is described.  This is quite important given the huge
volume of data involved.  The technique steps through the database a
tile at a time, computing what it needs and storing the results in
memory.  After all the needed data is accumulated, the refinement
is done using arrays in memory.

The position uncertainties of point sources coming out of the pipeline 
will be adjusted to make them reflect actual discrepancy dispersions
by multiplying their error variances by a factor denoted "variance
factor". These variance factors will be determined for a dozen Dec 
segments in each scan.

Positions at the scan segment level will be adjusted by a global 
minimization of scan overlap discrepancies weighted against
ACT reference star match information.

This refinement technique could be applied at any point in the survey,
up to and including full sky coverage. The greater the contiguous
coverage, the better the results. A test case, using prototype code and
data from only a single night (971116n), showed a significant improvement.
The proposed steps are as follows:

  I. Step through tiles and, for the primary scan covering each, compute
     difference information w.r.t. ACT's and overlapping scans.

     A. Pull positions of the four corners for the primary scan from
        the database.

     B. Pull all sources from any scan within the database which have
        high quality extractions and are enclosed by the primary-scan 
        four corner points.

     C. Compute average position variances (RA and Dec) for primary
        scan and load into the arrays VarRA and VarDec indexed by
        tile. In most instances each will be the average of a group
        of constants.  This occurs because:

          1) Only "high quality extractions" (I-B) are used.
          2) PROPHOT applies a floor to the position uncertainties
             which generally clips the high quality extractions
             at a minimum allowed value.
          3) The contribution of position reconstruction of the
             frames to the original variance is constant.  This,
             of course, is what gives rise to the need for the 
             variance factors.

     D. Match primary-scan sources to overlapping-scan sources and
        to ACT's.

     E. Divide matches from step I-D into 12 segments by Dec. These
        will henceforth be referred to as "tile segments" and will
        be the basic unit of position adjustment.

     F. For each tile-segment overlap check existing entries in the 
        idtsl and idtsh arrays to see if the overlap combination has
        already been processed. Note that each overlap combination
        will show up twice but need only be processed once. Assuming
        the combination has not already been processed, compute the
        following variables, increment the overlap index "iol" and load 
        into the indicated arrays (otherwise skip to G):

          1) Number of matches => nMCHo(iol)
          2) Sums of match differences in RA => SdRAo(iol)
          3) Sums of match differences in Dec => SdDECo(iol)
          4) Sums of match difference squares in RA => SdRA2o(iol)
          5) Sums of match difference squares in Dec => SdDC2o(iol)

        In order to provide easy access to the overlap data two additional
        arrays with the same indexing are maintained.  These are

          6) Unique ID of lower tile segment in overlap => idtsl(iol)
          7) Unique ID of higher tile segment in overlap => idtsh(iol)

        The terms "lower" and "higher" here simply refer to the tile-segment
        ID's themselves.  The sign of the difference sums (SdRAo and SdDECo)
        should always reflect the position of the tile-segment with the
        larger ID minus that with the smaller ID. 

     G. For each tile-segment identify the ACT matches, compute the
        following variables and load to tile-segment indexed (its) arrays
        as indicated:

          1) Number of ACT matches => nMCHa(its)
          2) Sums of ACT match differences in RA => SdRAa(its)
          3) Sums of ACT match differences in DEC => SdDECa(its)

  II. Prepare difference data for efficient access.

     A. Generate two sets of keys allowing indexed access to the
        overlap arrays (I-F) in:

          1) ascending idtsl order => array "idxtsl"
          2) ascending idtsh order => array "idxtsh"

     B. Step through the idtsl overlap array in ascending order
        using the idxtsl keys and build a tile-segment indexed
        array "tsl1st" giving the location of the first occurrence
        of that tile-segment in idts1 order.  Repeat to build a
        tile-segment indexed array "tsh1st" for first occurrence in 
        idtsh order.

     Note: One can now access tile-segment unique data directly
           via tile-segment index.  The shared overlap data can be
           obtained by first using the tsl1st key to jump into the
           idxtsl array at the first occurrence of that tile-segment
           then pulling in all data until idtsl changes.  The 
           remaining overlap data can be obtained in the same manner
           using the tsh1st key to jump into idxtsh array at the first
           occurrence of that tile-segment then pulling in all data until
           idtsh changes.

  III. Compute variance factors for each tile segment.  The RA and Dec
       factors can be computed independently. The steps described are
       applied to RA and then Dec to generate separate sets of variance
       factors. Since the purpose of the variance factors is to allow
       the contribution of the frame position uncertainties to vary,
       only that portion of the original variance should be multiplied
       by the variance factor.  This can be achieved by representing
       the adjusted variance as follows:
                     var_adj = var_orig + V0*(vfac-1)

       where: var_adj  = adjusted variance
              var_orig = original variance
              V0       = constant frame variance used in pipeline
              vfac     = variance factor

       Note: A variance factor of one indicates no adjustment. A 
             variance factor of zero would reduce the contribution     
             of the frame uncertainty to zero. 
     A. The summation of chi-squares over n in each overlap can be
        written as:

         sum_chisq/n = (sum_dx^2)/{N*[Var1+V0*(vf1-1) +Var2+V0*(vf2-1)]}

         where: sum_dx^2 = sum of overlap differences squared
                N        = number of matched points in overlap
                vf1      = variance factor of first tile segment 
                vf2      = variance factor of 2nd tile segment
                Var1     = average original variance from first tile (I-C)
                Var2     = average original variance from 2nd tile (I-C)
                V0       = constant frame variance used in pipeline

     B. Form sum "S" of differences which, if driven to zero, would drive
        the sum_chisq/n values to one.  It will not, of course, be
        possible to drive "S" to zero, but it can be minimized. Each term  
        in "S" is associated with an overlap and is weighted by the
        number of matches in that overlap.

        S =SUM{Nij*[(sum_dx^2)ij -Nij*(Vari+V0*(vfi-1) +Varj+V0*(vfj-1))]^2}

           where i ranges over the tile segments and j over the overlaps

     C. Form a set of simultaneous linear equations by taking the
        partial derivative of S w.r.t. each of the tile-segment variance 
        factors in turn and setting each partial equal to zero.

     D. Solve simultaneous linear equations above for the desired variance
        factors which minimize the sum_chisq/n differences from one.
        This step is complicated by the fact that for the entire sky
        (limiting the northern survey to > -12 deg and the southern 
        to < +18 deg) there are 893364 tile-segments.  Although the
        matrix to be inverted is extremely large (893364x893364)
        it is also extremely sparse with typically only about 2-3 non-zero
        elements per row. The number of non-zero elements goes up near the
        poles but still remains very small.

        1. The matrix will have to be stored and solved using sparse
           matrix techniques.  IMSL has some routines which look
           promising assuming they can handle a problem this
           large. Need to check with Martin Weinberg on the upper
           limits of the sparse matrix approach.

        2. Should the matrix prove too large even for sparse matrix
           techniques, an alternate approach might be to solve the
           problem as described but with manageable subsets of the
           entire sky. Each subset could be made up of a number of
           adjacent Dec bands. This should leave things close enough
           to finish off with iteration.

     E. Iteration could proceed as follows:

        1. Determine what variance factor for the primary tile segment
           would make each of the overlap chisq/n values go to one.
           Let vfvij represent the vote of overlap j on primary variance 
           factor i; then:

            vfvij = [(sum_dx^2)ij/Nij -Vari -Varj]/V0 -vfj_lst +2

           where vfj_lst is the variance factor for tile-segment j from 
           the last iteration

        2. Compute a weighted average of the vfvij overlap votes to
           obtain what a full-step value (vfi_fs) for the new iteration
           of vfi would be.

               vfi_fs = (sum Nij*vfvij)/(sum Nij)

           where the sums are over the j overlaps.

        3. Take a portion of the indicated full step to get the new value
           for vfi:

               vfi_new = vfi_lst +alf*(vfi_fs -vfi_lst)

           where alf is the factor which specifies the fraction of the
           full step to use. It usually helps to start alf out low
           and let it approach one as the iteration converges.

     F. Clip variance factors to prevent using unreasonably low variances 
        as weights in the position correction (step IV).  Define VarMin to 
        be the minimum reasonable frame variance. If vfi*V0 adjusted positions and associated uncertainties
                         2 => adjusted uncertainties for non-adjusted positions
                 nTile = tile number             
                 RA    = uncorrected RA
                 DEC   = uncorrected DEC
                 SigRA = original RA sigma
                 SigDEC= original Dec sigma

         Outputs: (for nflag=1)
                  RAc   = corrected RA
                  DECc  = corrected Dec
                  SigRAa= adjusted sigma for corrected RA
                  SgDECa= adjusted sigma for corrected Dec

                  (for nflag=2)
                  RAc   = uncorrected RA
                  DECc  = uncorrected Dec
                  SigRAa= adjusted sigma for uncorrected RA
                  SgDECa= adjusted sigma for uncorrected Dec