NYU-VAGC: a galaxy catalog based on new public surveys

Here we present the New York University Value-Added Galaxy Catalog (NYU-VAGC), a catalog of local galaxies (mostly below a redshift of about 0.3) based on a set of publicly-released surveys (including the 2dFGRS, 2MASS, PSCz, FIRST, and RC3) matched to the Sloan Digital Sky Survey (SDSS) Data Release 2. Excluding areas masked by bright stars, the photometric sample covers 3514 square degrees and the spectroscopic sample covers 2627 square degrees (with about 85% completeness). Earlier, proprietary versions of this catalog have formed the basis of many SDSS investigations of the power spectrum, correlation function, and luminosity function of galaxies. We calculate and compile derived quantities (for example, K-corrections and structural parameters for galaxies). The SDSS catalog presented here is photometrically recalibrated, reducing systematic calibration errors across the sky from about 2% to about 1%. We include an explicit description of the geometry of the catalog, including all imaging and targeting information as a function of sky position. Finally, we have performed eyeball quality checks on a large number of objects in the catalog in order to flag deblending and other errors. This catalog is complementary to the SDSS Archive Servers, in that NYU-VAGC's calibration, geometrical description, and conveniently small size are specifically designed for studying galaxy properties and large-scale structure statistics using the SDSS spectroscopic catalog.


Motivation
New, large galaxy datasets such as the Sloan Digital Sky Survey (SDSS; York et al. 2000) and the Two-Micron All Sky Survey (2MASS; Skrutskie et al. 1997) give astronomy a view of the local universe with an unprecedented combination of completeness and detail. These surveys promise to refine our understanding of properties of galaxies, their relationships with environment, and their evolution.
However, these catalogs consist of terabytes of data and are therefore currently too unwieldy to simply download onto a small workstation and investigate directly. Furthermore, simple matched catalogs between many surveys exist only in the form of web interfaces such as NED. As indispensable as such interfaces are for studying a small number of individual objects on which one desires all the knowledge in the literature, they are not ideal for huge batch jobs designed to run automatically on relatively homogeneous data sets. Finally, none of the databases interfacing to these huge catalogs yields any expression of the window function on the sky of the included surveys, leaving this critical determination to the user.
For these reasons, we have created a galaxy redshift catalog designed to aid in the study of the local universe. Early versions of this catalog have proven invaluable to the SDSS team, as they form the basis for the work studying the luminosity function, power spectrum, correlation function, and a number of other galaxy property and galaxy clustering statistics; e.g. Blanton et al. (2003b); Tegmark et al. (2004); Zehavi et al. (2002); Hogg et al. (2003); Hoyle et al. (2003); Shen et al. (2003); Baldry et al. (2004); Pope et al. (2004). Non-SDSS investigators have also used public data extracted from this catalog to investigate galaxy properties and their evolution; e.g. Trujillo et al. (2004); Rudnick et al. (2003). The catalog is small enough (tens of Gbytes) to easily store on any modern machine, it contains matches between many major public catalogs, and it contains an explicit description of its window function. The current version of the catalog uses the SDSS imaging survey as its basis, matching sources in other catalogs to that master list, and only tracking the SDSS geometrical information. In the future we plan on incorporating other surveys more fully by explicitly including their geometrical information.
For most of the catalogs in the sample, we simply use the official public releases. However, we give the SDSS (of which the authors are participants) a special status in this catalog. First, it is the only survey for which we track detailed information about the window function. Second, for the SDSS we use an independent set of reductions of the public data (Schlegel et al. in preparation). These reductions are significantly improved over the reductions made available through the Data Archive Server (DAS) or Catalog Archive Server (CAS) on the SDSS Data Release 2 (DR2; Abazajian et al. 2004) web site. The essential improvement is that errors in the large-scale relative photometric calibration of the survey have a lower variance and less structure, making the catalog more appropriate for studies of large-scale structure (as we demonstrate below). In addition, these reductions include an explicit expression for the survey geometry, which the Archive Servers do not. Secondarily, the naming and unit conventions in our catalog are different, as we describe in this paper. For these reasons, expect the quantities for objects in this catalog and the corresponding objects in the Archive Server catalogs to be very similar but not identical.
This paper describes only the outline and general principles of the catalog. We leave the detailed documentation to a set of regularly updated web pages listed in Table 1. The paper contains: in Section 2, a short description of the geometrical expressions used in the catalog and how to use them; in Section 3, a description of the constituent catalogs and how they are matched, focusing on the SDSS; in Section 4, a description of other derived quantities; in Section 5, the presentation of a low redshift galaxy catalog based on this data; in Section 6, a short description of some of the tools we have used to create this dataset which might be useful to the user as well; and in Section 7, a summary.
Note that throughout we use the environment variable $VAGC_REDUX to denote the root location of the NYU-VAGC; see the online documentation listed in Table 1 for the actual root location. Similarly, for the large-scale structure samples we describe below, we use the environment variable $LSS_REDUX.

Survey window geometries
In order to use galaxy surveys in a statistically meaningful way, we must describe their geometry on the sky. Doing so is important in order to determine the size of the volume surveyed as well as to determine in what area of sky a particular type of object could have been observed. However, astronomers have no standard way of expressing such geometrical information. Here we describe briefly the extremely general and compact system we use, introduced by Hamilton & Tegmark (2004). Currently, we have described the SDSS geometry using this method, but have not included geometrical information for the other catalogs in the NYU-VAGC.
Following Hamilton & Tegmark (2004), we store the geometry on the sky as a set of disjoint convex spherical polygons. Spherical polygons have several advantages: they are easy to express, it is easy to determine whether a point is inside or outside, and there exist relatively simple methods to transform them into spherical harmonic components (Tegmark et al. 2002). Furthermore, they can express a variety of shapes on the sky. Since the window functions of surveys, in particular of the SDSS survey as described below, are generally complex combinations of (for example) rectangles corresponding to imaging coverage and (in the case of the SDSS) circles corresponding to the spectroscopic coverage, a flexible expression of the geometry is extremely useful. The polygons are defined as the intersection of a set of "caps" in the manner of Andrew Hamilton's product mangle (see Table 1). Each cap is defined as a part of a spherical surface separated out by slicing of the sphere by a plane. Such a slicing cuts the sphere into two parts, and therefore choosing a cap means picking one of these parts.
In practice, we can fully specify a cap using four numbers (three independent ones): the unit vector $\hat{x}$ perpendicular to the plane slicing the sphere and the quantity $c_m = \pm(1 - \cos\theta)$, where $\theta$ is the polar angle defining the angular cap radius. A positive $c_m$ indicates that a direction $\hat{a}$ is inside the cap if
$$1 - \hat{x}\cdot\hat{a} < |c_m| .$$
A negative $c_m$ indicates that a point is inside the cap if
$$1 - \hat{x}\cdot\hat{a} > |c_m| .$$
For the value-added catalog we define the Cartesian coordinates $x_i$ such that $x_0 = \cos\delta\cos\alpha$, $x_1 = \cos\delta\sin\alpha$, and $x_2 = \sin\delta$, where $\alpha$ and $\delta$ refer to the J2000 right ascension and declination.
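As an illustration of how a cap description can be used, the following minimal Python sketch tests whether a direction lies inside a cap or a polygon. It assumes the usual mangle convention that a direction $\hat{a}$ is inside a cap with positive $c_m$ when $1 - \hat{x}\cdot\hat{a} < |c_m|$ and inside a cap with negative $c_m$ when $1 - \hat{x}\cdot\hat{a} > |c_m|$, and it assumes $x_2 = \sin\delta$. The function names are hypothetical and are not part of mangle or idlutils.

```python
import math

def radec_to_xyz(ra_deg, dec_deg):
    """Unit vector (x0, x1, x2) for J2000 RA and Dec given in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))

def in_cap(point, cap_axis, cm):
    """True if the unit vector `point` lies inside the cap (cap_axis, cm)."""
    dot = sum(p * x for p, x in zip(point, cap_axis))
    if cm >= 0:
        return 1.0 - dot < abs(cm)   # inside the cap of radius theta
    return 1.0 - dot > abs(cm)       # negative cm: the complement of that cap

def in_polygon(point, caps):
    """A polygon is the intersection of its caps, given as (axis, cm) pairs."""
    return all(in_cap(point, axis, cm) for axis, cm in caps)

# Illustrative polygon: a single 1.49 deg cap centered at RA=180, Dec=0.
center = radec_to_xyz(180.0, 0.0)
cm = 1.0 - math.cos(math.radians(1.49))
tile = [(center, cm)]

inside = in_polygon(radec_to_xyz(180.5, 0.5), tile)          # about 0.7 deg away
outside = in_polygon(radec_to_xyz(182.0, 0.0), tile)         # 2 deg away
complement = in_cap(radec_to_xyz(182.0, 0.0), center, -cm)   # sign-flipped cap
```

Because a polygon is an intersection, the test short-circuits on the first cap that excludes the point, which keeps membership tests cheap even for polygons with many caps.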
The polygons are described in FITS files of the form:

survey_geometry.fits

where "survey" indicates which survey we are describing. We also usually include ASCII versions in the standard polygon format of Hamilton & Tegmark (2004) in an ascii subdirectory wherever the FITS version is found:

ascii/survey_geometry.ply
ascii/survey_geometry_info.dat

where the "info" file contains auxiliary columns describing properties associated with each area of sky; in practice what these properties are varies depending on the survey in question. These files contain the geometrical description of each polygon. The format is described on the mangle homepage (see Table 1). The description of how these structures are stored in the FITS files is on the vagc web page (see Table 1), and tools exist in idlutils to read them into IDL structures.
It is often necessary to combine two sets of polygons describing two different surveys: for example, the SDSS imaging survey and the SDSS spectroscopic survey. In order to do so, we use a procedure known as "balkanization" (Hamilton & Tegmark 2004). The concept is simple, though the implementation is not. Imagine plotting both sets of polygons on the same page. The lines plotted now define a new set of disjoint polygons bounded by caps, which we refer to as balkans. Each balkan is either entirely inside both surveys or only inside one survey. We can express the intersection of the two catalogs as the set of balkans entirely inside both surveys. The code presented by Hamilton & Tegmark (2004) (see the web site listed in Table 1) is capable of finding this set of balkans.

Constituent catalogs
Here we describe the set of catalogs included in the NYU-VAGC: the SDSS, FIRST, 2MASS, 2dFGRS, IRAS PSCz, and the RC3.
The NYU-VAGC catalog corresponds to public data up to the SDSS DR2. We include three separate catalogs from SDSS DR2: the SDSS imaging catalog, the SDSS tiling catalog (a description of the targets in the imaging catalog for which we have a well-defined completeness), and the SDSS spectroscopic catalog. The top panel of Figure 2 shows the geometry of the portion of the SDSS imaging survey (DR2) released here, which includes 3514 sq deg of imaging. The green points in Figure 2 show the distribution on the sky of SDSS spectra.
This list of objects includes stars, QSOs, and galaxies together. Since the typical user of this catalog will be interested in galaxies, we recommend using the VAGC_SELECT bitmask described below to select Main sample type galaxies with $m_r < 18$. Note that the criteria on which these are selected are slightly broader than those used to target galaxies in the SDSS spectroscopy, so some of these galaxies will not have spectra for that reason. In addition, some of these "galaxies" will in fact be misidentified stars or other sources (since our criteria will necessarily be less reliable than those used by the SDSS target selection process). For maximum reliability, the user can always trim our catalog to obey the same criteria as the SDSS target selection described in Strauss et al. (2002).

SDSS imaging catalog
The SDSS imaging catalog presented here includes photometric reductions using PHOTO v5_4 (Lupton et al. in preparation), and is described in Abazajian et al. (2004). However, unlike the data set Abazajian et al. (2004) present, it is calibrated using overlaps of SDSS runs (Schlegel et al. in preparation). This procedure results in a more consistent large-scale calibration of the survey. In addition, as we describe below, the primary area of the survey is defined differently (and thus includes a somewhat larger fraction of the total area of the DR2 imaging).
We use some SDSS jargon below to discuss the organization of the data. As described in Stoughton et al. (2002), a "run" is a sequence number assigned to a particular drift scan observation. A "camcol" (between 1 and 6) indicates a set of ugriz CCDs in the focal plane. In each run and camcol, the imaging data are divided in the scan direction into ∼10 arcmin "fields" for convenience.
Only a small number of the objects in the full SDSS catalog are included here. Specifically, we include only:
1. A sample of objects similar to the SDSS Main galaxy sample described by Strauss et al. (2002) selected from the most recent (v5_4) version of the imaging reductions. We have changed some parameters to be slightly more inclusive:
(a) We extend the extinction-corrected Petrosian magnitude limit from $r = 17.77$ to $r = 18$, in order to include spectroscopic objects which would otherwise scatter outside the original flux limits due to changes in calibration since targeting.
(b) We extend the star-galaxy separation from 0.3 mag to 0.2 mag (expressed as $m_{\rm PSF} - m_{\rm model}$ in the r-band; see Strauss et al. 2002 for details). This change includes a small number of galaxies not included in the targeting, but also introduces a number of stellar sources into the catalog.
(c) We extend the bright fiber magnitude limit from $g, r, i > 15, 15, 14.5$ to $g, r, i > 12$ (effectively turning it off).
(d) We turn off the exclusion of small, bright objects (with $R_{50} < 2''$ and $m_P < 15$). This change results in the inclusion of a small number of binary stars.
2. The closest object within 2′′ of the position of each SDSS spectrum. In this match, we in fact give priority to objects that pass the previous criterion to be a spectroscopic galaxy target in the latest reductions. One might worry that this would introduce some sort of flux bias in the sample, since these targets are bright relative to the typical object, but in fact any pairs of objects with separations less than 2′′ are likely to be spurious in any case. Thus, taking the brightest one is usually the correct choice.
3. The closest object within 2′′ of the position of each Main sample, QSO, or Luminous Red Galaxy spectroscopic target from the target version of the reductions which was fed to the tiling software (a "tiled target"), whether or not they had spectra taken. These targets were selected based on earlier reductions of the imaging which differ in significant ways from the latest reductions.
We have retained the Petrosian half-light r-band surface brightness limit at $\mu_{50} = 24.5$. Note that the new objects included by the extensions described above are unlikely to have spectra.
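The positional matching used in items 2 and 3 can be sketched as a nearest-neighbor search within a fixed radius. The following is a minimal brute-force illustration in Python, not the code used to build the catalog: the function names are hypothetical, the positions are made up, and the priority given to galaxy targets in item 2 is deliberately ignored.

```python
import math

def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec (haversine form); inputs in degrees."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2.0) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(h))) * 3600.0

def nearest_within(target, catalog, radius_arcsec=2.0):
    """Index of the closest (ra, dec) entry within the radius, or -1 if none."""
    best, best_sep = -1, radius_arcsec
    for i, (ra, dec) in enumerate(catalog):
        sep = angular_sep_arcsec(target[0], target[1], ra, dec)
        if sep <= best_sep:
            best, best_sep = i, sep
    return best

# Illustrative imaging positions (degrees); the second is 1 arcsec north of the first.
imaging = [(150.0, 2.0), (150.0, 2.0 + 1.0 / 3600.0), (150.1, 2.0)]

match = nearest_within((150.0, 2.0 + 0.8 / 3600.0), imaging)   # closest is entry 1
no_match = nearest_within((151.0, 2.0), imaging)                # nothing within 2 arcsec
```

A production matcher would use a spatial index rather than this O(N) scan per target, but the acceptance rule (closest object inside the radius, otherwise no match) is the same.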
In fact, because of catalog errors, there remain a handful of pairs of objects within 2′′ of each other (12 out of 693,331). After removing these duplicates, we include one observation of each object in the file:

$VAGC_REDUX/object_sdss_imaging.fits

which contains a trimmed set of the photometric measurements (the exact list is described on the web page listed in Table 1). We use the environment variable $VAGC_REDUX to denote the root location of the NYU-VAGC; see the online documentation for the actual root location. In addition, we store all of the photometric information for this set of objects in another set of files of the form:

$VAGC_REDUX/sdss/parameters/calibObj-$run-$camcol.fits

where, as indicated, objects in separate "runs" and "camcols" are kept in separate files. The object_sdss_imaging file contains for each object its run and camcol, as well as its zero-indexed position in the corresponding calibObj file, allowing quick access to the full photometric information.
The geometry of the SDSS imaging survey is in a file:

$VAGC_REDUX/sdss/sdss_imaging_geometry.fits

stored in the form of spherical polygons. We create a list of polygons that describe the primary area of each imaging field. For each polygon, we give the SDSS imaging field (that is, its run, camcol, and field) which we consider "primary" for the purpose of resolving duplicate observations.
We should note here some points about the quantities in the SDSS data we present (some of which differ from data distributed by the Archive Servers):
1. All fluxes are given in "nanomaggies" $f$, which represent the flux (multiplied by $10^9$, as the prefix "nano" implies) relative to that of the AB standard source with $f_\nu = 3631$ Jy (Oke & Sandage 1968). They are related to standard astronomical magnitudes $m$ by the formula:
$$m = 22.5 - 2.5 \log_{10} f .$$
3. Uncertainties are expressed in terms of the inverse variance (usually used for calibrated quantities, with the suffix IVAR) or in terms of the standard deviation (usually used with uncalibrated quantities, with the suffix ERR).
4. The column VAGC_SELECT is a bitmask that yields information on how each imaging object was selected, with the following bits:
0: near tiled target
1: near spectrum
2: pass the Main sample galaxy criteria (with the adjustments listed above)
Thus, an object which passed the galaxy criteria, and was near a tiled target, and was near a spectrum, would have bits 0, 1, and 2 all set, resulting in a numerical value of 7 for VAGC_SELECT.
5. In addition to the local sky determination (the 100′′ by 100′′ median smoothed sky in skyflux) we provide the sky estimate for the current 9.8′ by 13.5′ field as a whole ("global" sky) in the parameter psp_skyflux (in nanomaggies per arcsec$^2$).
6. A crude bulge-to-disk decomposition exists for each object processed by PHOTO, which simply consists of taking the best fit de Vaucouleurs model, the best fit exponential model, linearly combining them and refitting for the amplitudes of the models (see Abazajian et al. 2004 for details). The fraction of the flux assigned to the de Vaucouleurs model (the "bulge fraction," if you will allow it) is put in the column called fracpsf.
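The unit and bitmask conventions in this list can be illustrated with a short Python sketch. It assumes the standard nanomaggie relation $m = 22.5 - 2.5\log_{10} f$ (valid only for positive flux) and a first-order propagation of the inverse variance into a magnitude uncertainty. The function names and the example values are hypothetical; only the bit meanings are taken from the text.

```python
import math

def nmgy_to_mag(flux_nmgy):
    """AB magnitude from a positive flux in nanomaggies."""
    return 22.5 - 2.5 * math.log10(flux_nmgy)

def mag_err_from_ivar(flux_nmgy, flux_ivar):
    """First-order magnitude uncertainty from the flux inverse variance."""
    sigma_f = 1.0 / math.sqrt(flux_ivar)
    return 2.5 / math.log(10.0) * sigma_f / flux_nmgy

# Bit meanings of VAGC_SELECT as listed in the text.
SELECT_BITS = {
    0: "near tiled target",
    1: "near spectrum",
    2: "passes Main sample galaxy criteria",
}

def decode_vagc_select(value):
    """Names of the bits set in a VAGC_SELECT value, lowest bit first."""
    return [name for bit, name in sorted(SELECT_BITS.items())
            if value & (1 << bit)]

mag = nmgy_to_mag(100.0)                 # 100 nanomaggies
err = mag_err_from_ivar(100.0, 1.0)      # sigma_f = 1 nanomaggy
all_bits = decode_vagc_select(7)         # bits 0, 1, and 2 set
galaxy_only = decode_vagc_select(4)      # only bit 2 set
```

Selecting Main-like galaxies then amounts to keeping rows for which `value & 4` is nonzero.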
In addition to these changes of form, there is a fundamental difference between the Princeton reductions used here and the data available on the SDSS Archive Servers: the relative photometric calibration (Schlegel et al. in preparation). The Archive Server reductions use the calibration procedure described in Abazajian et al. (2003), which involves comparing counts measured on the 2.5m telescope to those measured on a nearby 0.5m photometric telescope on a slightly different filter system, and to the fluxes of a set of primary standard stars (on a yet different filter system). Instead of this procedure, Schlegel et al. (in preparation) take advantage of the wide angular baseline provided by the drift scanning and of the large number of overlapping observations. This combination results in many exposures of the same stars taken on different nights. One can then use these multiple observations to fit for the calibration parameters (system response, airmass term, flat fields) as a function of time by minimizing the differences between the resulting inferred fluxes of the multiply-observed stars. This procedure, denoted "ubercalibration," takes advantage of the fact that the system is photometrically stable within each drift scan run, and uses that to tie all the runs together using their overlapping observations.
As a demonstration that the procedure results in a lower variance in large-scale errors in the calibration, consider Figure 1. The greyscale in the top two panels shows the $r - i$ color of the bluest stars in the magnitude range $16 < m_r < 18.5$ in each contiguous set of twenty fields in each run of the SDSS (all magnitudes extinction-corrected according to the dust maps of Schlegel et al. 1998). We only show one section of the SDSS coverage on the Northern Equator, but similar results hold elsewhere. This blue-tip color varies smoothly across the sky due to metallicity gradients in the Galactic stellar halo, but has little small scale structure.
The top panel shows this quantity for data we have calibrated to the SDSS standard system using the photometric telescope (though these results are not identical to those in the SDSS Archive). The bottom panel shows the same for the ubercalibrated data. Both panels reveal the dependence of stellar metallicity on Galactic latitude, as well as some large-scale features in the stellar distribution. Clearly, the stripy variations in the top panel, which are errors in the SDSS calibration, are greatly reduced in the bottom panel (though not eliminated). The 5-sigma clipped standard deviation in $r - i$ color over the whole SDSS area is reduced from 0.02 to 0.01 mag between the PT calibration and the ubercalibration (these numbers include the variation of the stellar populations). The rms variation of $r - i$ within patches several degrees in scale is about 0.007 mag. These results suggest that ubercalibration is a significant improvement over the standard calibration and that the calibration is good to about 1%. The SDSS is taking long scans across the entire survey in the Northern Galactic Cap, as well as scans that connect the three separated stripes in the Southern Galactic Cap, that will reduce the systematic errors even further.
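The idea of tying runs together through repeated observations of the same stars can be illustrated with a deliberately simplified toy model. The Python sketch below is not the ubercalibration code of Schlegel et al.: it assumes that each run has only a single constant zero-point offset (no airmass term, flat field, or time dependence), that observations are noise-free, and it solves the least-squares problem by alternating between star magnitudes and run zero points. All names and values are hypothetical. Because a common offset to all runs cannot be determined from overlaps alone, the zero points are constrained to have zero mean.

```python
def ubercal_toy(observations, n_runs, n_iter=50):
    """Fit per-run zero points from repeat observations.

    observations: list of (star_id, run_id, observed_mag), modeled as
    observed_mag = true_mag[star] + zp[run]. Returns zp with zero mean.
    """
    zp = [0.0] * n_runs
    for _ in range(n_iter):
        # Best star magnitudes given the current zero points.
        sums, counts = {}, {}
        for star, run, mag in observations:
            sums[star] = sums.get(star, 0.0) + (mag - zp[run])
            counts[star] = counts.get(star, 0) + 1
        star_mag = {s: sums[s] / counts[s] for s in sums}
        # Best zero points given those star magnitudes.
        zsum, zcnt = [0.0] * n_runs, [0] * n_runs
        for star, run, mag in observations:
            zsum[run] += mag - star_mag[star]
            zcnt[run] += 1
        zp = [zsum[r] / zcnt[r] if zcnt[r] else 0.0 for r in range(n_runs)]
        # Remove the undetermined overall offset.
        mean_zp = sum(zp) / n_runs
        zp = [z - mean_zp for z in zp]
    return zp

# Three runs with made-up offsets; each star is seen in two different runs.
true_zp = [0.02, -0.01, -0.01]
true_mag = {"A": 17.0, "B": 16.5, "C": 18.0}
coverage = [("A", 0), ("A", 1), ("B", 1), ("B", 2), ("C", 0), ("C", 2)]
obs = [(s, r, true_mag[s] + true_zp[r]) for s, r in coverage]

fitted_zp = ubercal_toy(obs, n_runs=3)
```

The fit only works if the overlaps connect all runs into a single network; a run that shares no stars with any other run has an unconstrained zero point.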
Our web site (see Table 1) has full documentation of the structure of the files described in this and subsequent sections and of all of the parameters they contain.

SDSS tiling catalog
The primary SDSS spectroscopic program proceeds in the following manner. Based on a set of images for which we have selected targets (using the algorithms in Strauss et al. 2002, Eisenstein et al. 2001, Richards et al. 2002, and Stoughton et al. 2002), the SDSS defines a "tiling region." For example, for Tiling Region 7 in the SDSS, Figure 3 shows the region we defined. Given the tiling region and the set of targets within it, we determine the location of spectroscopic tiles of radius 1.49 deg and decide to which targets to assign fibers (Blanton et al. 2003a). This procedure defines a set of 1.49 deg radius circles on the sky corresponding to the tiles. The intersection of the rectangles describing the tiling region and the circles describing the tiles defines the geometry of the tiling region. In this geometry we define "sectors," each of which consists of a set of spherical polygons which could have been observed by a unique set of tiles. These sectors are the appropriate regions on which to define the completeness of the survey. See the DR2 web site for more complete documentation on tiling. In Figure 3, we have given each sector a different shade of grey.
The union of the tiling regions defines the geometry of the spectroscopic survey as a whole. Note that this geometry is not, generally, as simple as the total area covered by the tiles. For example, some regions are within 1.49 deg of the center of a tile, but the tile was created before spectroscopic targets for that region had been selected. This fact of life results in gaps in the survey which we track in our geometrical description of the survey.
The set of polygons describing the tiling geometry is in:

$VAGC_REDUX/sdss/sdss_tiling_geometry.fits
This file yields the sector to which each polygon belongs. The properties of the sectors are given in $VAGC_REDUX/sdss/sdss_sectorList.par, which specifies the tile or tiles to which each sector belongs. Finally, the centers of each tile are given in $VAGC_REDUX/sdss/tileFull.par.
We match the set of objects which we input into the tiling program to the nearest imaging object within 2′′ in the object_sdss_imaging catalog and put the results in the file:

$VAGC_REDUX/object_sdss_tiling.fits

Each entry in this file refers to the corresponding entry in the object_sdss_imaging file; e.g. row number 3 in one file refers to the same object as row number 3 in the other. The full set of tiled objects (including those that do not match any of the imaging objects) is included in the file:

$VAGC_REDUX/sdss/sdss_tiling_catalog.fits

The object_sdss_tiling.fits file has the column sdss_tiling_tag_primary, which gives the zero-indexed row number of the object in the sdss_tiling_catalog.fits file. Unmatched objects have sdss_tiling_tag_primary == -1 (and the rest of the columns for such rows are set to appropriate null values).
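The row-matched layout with a -1 sentinel can be illustrated as follows. This Python sketch uses plain lists as stand-ins for the FITS tables; the helper name and the example rows are hypothetical, and only the indexing convention (zero-indexed row number, -1 for no match) is taken from the text.

```python
def lookup_matches(tag_primary, external_rows):
    """For each imaging row, return the matched external row, or None if unmatched."""
    return [external_rows[i] if i >= 0 else None for i in tag_primary]

# Stand-in for the full tiling catalog (one dict per tiled object).
tiling_catalog = [{"id": "t0"}, {"id": "t1"}, {"id": "t2"}]

# Stand-in for the tag column, one entry per imaging object (same row order
# as the imaging file): imaging row 0 matches tiling row 2, row 1 is unmatched.
sdss_tiling_tag_primary = [2, -1, 0]

matched = lookup_matches(sdss_tiling_tag_primary, tiling_catalog)
```

Checking for the sentinel explicitly matters in languages such as Python, where indexing with -1 silently returns the last row instead of failing.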
The reader may wonder why there would be any unmatched objects. The reason is that the photometric reduction code has changed over time; in particular, the deblending algorithm has changed. For this reason, there are occasionally objects found in an old reduction which have no corresponding object within 2′′ in the new reductions, because the set of detected pixels in that region has been deblended differently by the two versions of the code.
Again, the structure and contents of these files are described on the web site.

SDSS spectroscopic catalog
For this catalog we use the reductions of the SDSS spectroscopic data performed by Schlegel et al. (in prep) using their reduction code idlspec2d, which extracts the spectra and finds the redshifts. The redshifts found by idlspec2d are almost always (over 99% of the time for Main galaxy sample targets) identical to the redshifts found by an alternative pipeline used for the SDSS Archive Servers (SubbaRao et al. in prep).
We match the set of objects for which we have SDSS spectra to the nearest imaging object within 2′′ in the object_sdss_imaging catalog and put the results in the file:

$VAGC_REDUX/object_sdss_spectro.fits
Each entry in this file refers to the corresponding entry in the object_sdss_imaging file; e.g. row number 3 in one file refers to the same object as row number 3 in the other. The full set of spectra (including those that do not match any of the imaging objects) is included in the file:

$VAGC_REDUX/sdss/sdss_spectro_catalog.fits

The object_sdss_spectro.fits file has the column sdss_spectro_tag_primary, which gives the zero-indexed row number of the object in the sdss_spectro_catalog.fits file. Unmatched objects have sdss_spectro_tag_primary == -1 (and the rest of the columns for such rows are set to appropriate null values). Note that in addition to the issues regarding deblending in different reductions mentioned in the previous subsection, a number of the spectra are sky spectra, which are placed randomly on the sky and will never correspond across reductions.
We do not provide any geometrical description of the catalog beyond the locations of each fiber (each one is 3 ′′ diameter).
The quantities in these files are documented at the spectroscopic reduction web page (see Table 1).

FIRST
Using the Very Large Array, the Faint Images of the Radio Sky at Twenty-centimeters (FIRST; Becker et al. 1995) survey has mapped 10,000 square degrees of the Northern Sky overlapping the SDSS, with a detection limit of about 1 mJy and a resolution of 5′′. For each object in object_sdss_imaging we find the matching object in the FIRST catalogs within 3 arcsec. The columns with the prefix FIRST in the files:

$VAGC_REDUX/sdss/parameters/calibObj-$run-$camcol.fits

contain the FIRST results. The columns are described in detail on the Princeton photometric reduction web site listed in Table 1.

2MASS
2MASS (Cutri et al. 2000) is an all-sky map in $J$, $H$, and $K_s$. Two catalogs have been developed from this map: the Point Source Catalog (PSC, complete to roughly $K_s \sim 15$, Vega-relative) and the Extended Source Catalog, that is, the galaxy catalog (XSC, complete to roughly $K_s \sim 13.5$). For each object in object_sdss_imaging we find the matching object within 3 arcsec. The columns with the prefix TMASS in the files:

$VAGC_REDUX/sdss/parameters/calibObj-$run-$camcol.fits

contain the 2MASS PSC data. The columns are described in detail on the Princeton photometric reduction web site listed in Table 1.
In addition, we match the 2MASS Extended Source Catalog (described in the Explanatory Supplement to the 2MASS All Sky Data Release; see Table 1; Cutri et al. 2000) to objects within 3′′ in object_sdss_imaging and put the results in the file:

$VAGC_REDUX/object_twomass.fits
Each entry in this file refers to the corresponding entry in the object_sdss_imaging file; e.g. row number 3 in one file refers to the same object as row number 3 in the other. The full set of 2MASS XSC objects (including those that do not match any of the imaging objects) is included in the files:

$VAGC_REDUX/twomass/twomass_catalog_000.fits
$VAGC_REDUX/twomass/twomass_catalog_001.fits
$VAGC_REDUX/twomass/twomass_catalog_002.fits
$VAGC_REDUX/twomass/twomass_catalog_003.fits

These files contain a subset of the columns listed by Cutri et al. (2000). Most notably, below we will use the "extrapolated" galaxy magnitudes from these files, as described by Jarrett et al. (2003).
We have not converted the numbers in these files from their original Vega-relative meaning. Where necessary below, we will use the conversions to AB calculated by the kcorrect v3_2 code presented by Blanton et al. (2003), using the filter curves of Cutri et al. (2000) and the theoretical Vega flux presented by Kurucz (1991). Figure 4 shows the distribution of match distances between the SDSS and 2MASS catalogs, demonstrating the agreement in the astrometry between these two catalogs (Pier et al. 2003; Finlator et al. 2000).

2dFGRS
The 2dFGRS (Colless et al. 2001) is a galaxy redshift survey using the 2dF multi-object spectrograph, targeted off of the APM survey (Maddox et al. 1990). We match the 2dFGRS Final Data Release to objects within 4′′ in object_sdss_imaging and put the results in a file in which each entry refers to the corresponding entry in the object_sdss_imaging file; e.g. row number 3 in one file refers to the same object as row number 3 in the other. The full set of 2dFGRS objects (including those that do not match any of the imaging objects) is included in separate catalog files.
The top panel of Figure 5 shows the distribution of angular separations of our matches, and the bottom panel shows (for objects with redshifts in both catalogs) the absolute difference between the SDSS and 2dFGRS redshifts versus the angular distance. We limit our comparison to 2dFGRS redshifts with QUALITY ≥ 3 (the recommended criterion for a reliable redshift). There are around 27000 objects with redshifts in both catalogs; 94 of these are large redshift discrepancies ($|\delta z| > 0.01$). For about 83 of these discrepancies, the SDSS redshift is clearly correct based on an eyeball inspection of the extracted spectrum. For the remaining 11, the SDSS redshift is flagged as poor by the spectroscopic reduction software (using the ZWARNING flag described on the spectroscopic reduction web page listed in Table 1).

IRAS PSCz
The PSCz is a redshift catalog of point sources in the IRAS survey (Saunders et al. 2000). Given the resolution of the IRAS survey, we have matched each source to the nearest object within 40′′ in the object_sdss_imaging file, putting the results in the file:

$VAGC_REDUX/object_pscz.fits
Each entry in this file refers to the corresponding entry in the object_sdss_imaging file; e.g. row number 3 in one file refers to the same object as row number 3 in the other. The full set of PSCz objects (including those that do not match any of the imaging objects) is included in the file:

$VAGC_REDUX/pscz/pscz_catalog.fits

RC3
The Third Reference Catalog of Galaxies (RC3) is a catalog of nearby galaxies developed by de Vaucouleurs et al. (1991). Because of the size of these galaxies (and the fact that locations in the RC3 are occasionally only listed to the nearest arcminute), we have matched each source to the nearest object within 45′′ in the object_sdss_imaging file, putting the results in the file:

$VAGC_REDUX/object_rc3.fits
Each entry in this file refers to the corresponding entry in the object_sdss_imaging file; e.g. row number 3 in one file refers to the same object as row number 3 in the other. The full set of RC3 objects (including those that do not match any of the imaging objects) is included in the file:

$VAGC_REDUX/rc3/rc3_catalog.fits

Additional quantities
In addition to matches to external catalogs, we provide some extra quantities measured from the NYU-VAGC catalog.

Collision "corrections"
For the purposes of large-scale structure statistics with the SDSS, it is necessary to account for the fact that some galaxies are missing from the spectroscopic sample due to fiber collisions (no two fibers on the same tile can be placed more closely than 55′′). We can do so using the following procedure:
1. Group the galaxies according to a friends-of-friends procedure with a 55′′ linking length.
2. For each galaxy which does not have a redshift in the SDSS data, ask whether there is a galaxy (or galaxies) in its group with a redshift.
3. If so, assign to the galaxy without a redshift that of the galaxy in the group which is closest on the sky and which has a redshift.
We put the results of this procedure into the file:

$VAGC REDUX/collisions/collisions.nearest.fits
About 5-6% of galaxies brighter than the flux limit need to be and can be assigned a redshift using this criterion, as found previously by Zehavi et al. (2002). Judging from the cases which could have been corrected but in fact had a redshift, about 60% of the corrected cases are within 10 h−1 Mpc of the correct redshift. Figure 8 shows the distribution of redshift separations and angular separations of such galaxies in the top panel, and the histogram of redshift separations in the bottom panel.
We have also implemented a slight variant of this procedure, in which we additionally require that the photometric redshift of the collided galaxy (determined using kcorrect v3 2; Blanton et al. 2003) be within 0.05 of the spectroscopic redshift which we want to assign to it. There is very little difference in the results; we include them in the file:

$VAGC REDUX/collisions/collisions.photoz.fits
About 9% of the objects are corrected, about 71% of which are likely to be within 10 h−1 Mpc of the correct redshift (based on the cases which could have been corrected but in fact had a redshift).
Finally, for completeness we include corresponding files without any corrections at all: $VAGC REDUX/collisions/collisions.none.fits
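The collision-correction procedure can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the implementation used for the catalog: the function names are hypothetical, separations use a flat small-angle approximation without RA wrap-around, the pair search is brute force, and the optional `photoz` argument encodes one reading of the photometric-redshift variant (reject the assignment when the photometric redshift differs from the candidate redshift by more than 0.05).

```python
import math

LINK_ARCSEC = 55.0  # fiber collision scale and linking length quoted in the text

def _sep_arcsec(ra1, dec1, ra2, dec2):
    """Small-angle separation in arcsec; inputs in degrees."""
    dra = (ra2 - ra1) * math.cos(math.radians(0.5 * (dec1 + dec2)))
    return math.hypot(dra, dec2 - dec1) * 3600.0

def assign_collided_redshifts(ra, dec, z, photoz=None, max_dz=0.05):
    """Return a copy of z in which galaxies lacking a redshift (None) take the
    redshift of the nearest group member that has one."""
    n = len(ra)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Step 1: friends-of-friends grouping.
    for i in range(n):
        for j in range(i + 1, n):
            if _sep_arcsec(ra[i], dec[i], ra[j], dec[j]) <= LINK_ARCSEC:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    # Steps 2 and 3: borrow the redshift of the nearest member with a redshift.
    z_out = list(z)
    for members in groups.values():
        with_z = [i for i in members if z[i] is not None]
        for i in members:
            if z[i] is not None or not with_z:
                continue
            nearest = min(with_z, key=lambda j: _sep_arcsec(ra[i], dec[i], ra[j], dec[j]))
            if photoz is not None and photoz[i] is not None:
                # Photometric-redshift variant: skip inconsistent assignments.
                if abs(photoz[i] - z[nearest]) > max_dz:
                    continue
            z_out[i] = z[nearest]
    return z_out
```

Calling the function without `photoz` corresponds to the "nearest" correction, and passing photometric redshifts corresponds to the "photoz" variant.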

K-corrections
We use the K-correction software kcorrect v3 2  to determine K-corrections for all of the objects in the NYU-VAGC. We treat them all as if they are normal galaxies; thus, the K-corrections are not going to be appropriate for QSOs.
In the directory: $VAGC REDUX/kcorrect we provide these estimates of the ugrizJHKs K-corrections and absolute magnitudes of each object (using an Ω0 = 0.3, ΩΛ = 0.7 cosmology with H0 = 100 h km s−1 Mpc−1 for h = 1). The JHKs fluxes all come from the 2MASS XSC extrapolated magnitudes (Jarrett et al. 2003) and have been converted from the Vega system to the AB system as described in Section 3.3. The files also contain the Galactic-extinction corrected AB nanomaggies for each object. We provide these for each set of collision corrections, for different types of SDSS flux measurements, and for different rest-frame bandpasses.
They are in files of the form: where $collision refers to the type of collision correction (that is none, nearest, or photoz), $flux refers to the type of flux used for the SDSS data (based on the prefix used in the calibObj files), and $bandshift refers to the blueshift of the bandpasses we are correcting to. As an example: $VAGC REDUX/kcorrect/kcorrect.none.petro.z0.10.fits contains corrections for galaxies using no collision corrections, Petrosian fluxes for the SDSS observations, and shifted to the equivalent bandpass shapes at z = 0.10.
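Because the K-correction files report fluxes in AB nanomaggies, a short conversion sketch may be useful. It assumes the usual definition in which a flux of 1 nanomaggie corresponds to an AB magnitude of 22.5 (consistent with the statement in the Sérsic section that f = 100 corresponds to m = 17.5) and the standard relation M = m − DM − K between apparent and absolute magnitude. The function names are hypothetical and are not part of kcorrect.

```python
import math

def nanomaggies_to_mag(flux_nmgy):
    """AB magnitude for a flux in nanomaggies (1 nanomaggie <-> m = 22.5)."""
    if flux_nmgy <= 0.0:
        raise ValueError("magnitude is undefined for non-positive flux")
    return 22.5 - 2.5 * math.log10(flux_nmgy)

def mag_to_nanomaggies(mag):
    """Inverse of nanomaggies_to_mag."""
    return 10.0 ** (-0.4 * (mag - 22.5))

def absolute_magnitude(mag, distance_modulus, k_correction):
    """Standard relation M = m - DM - K; the distance modulus and
    K-correction are supplied by the caller."""
    return mag - distance_modulus - k_correction
```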

Sersic profile fits
In the file $VAGC REDUX/sersic/sersic catalog.fits we provide Sérsic fits to the azimuthally averaged radial profiles of each object (Sérsic 1968). Here we provide a description of the fitting procedure; a more detailed description can be found in the Appendix of Blanton et al. (2005).
For each galaxy, we fit an axisymmetric Sérsic model of the form

$$ I(r) = A \exp\left[ -(r/r_0)^{1/n} \right] \qquad (6) $$

to the mean fluxes in annuli output by the SDSS photometric pipeline PHOTO in the quantities profMean and profErr (Stoughton et al. 2002 list the radii of these annuli). In Equation 6, n is referred to as the Sérsic index. PHOTO outputs these quantities only out to the annulus which extends beyond twice the Petrosian radius, or to the first negative value of the mean flux, whichever is larger. In any case, we never consider data past the 12th annulus, whose outer radius is 68.3′′. For the median galaxy we have data and perform the fit out to 27.9′′ (the median half-light radius from the fits is 2′′). We minimize

$$ \chi^2 = \sum_i \frac{\left[ \mathrm{profMean}_i - \mathrm{sersicMean}_i(A, n, r_0) \right]^2}{\mathrm{profErr}_i^2} $$

where sersicMean_i(A, n, r_0) is the mean flux in annulus i for the Sérsic model convolved with a three-gaussian seeing model for the given field.
We have evaluated the performance of this algorithm in the following way. Taking a sampling of the parameters of our fits from the Main galaxy sample, we have generated about 1200 axisymmetric fake galaxy images following Equation 6 exactly, which we refer to as "fake stamps." In order to simulate the performance of PHOTO, we have distributed the fake stamps among SDSS fields. For each band, we convert the fake stamps to SDSS raw data units, convolve with the estimated seeing from the photometric pipelines, and add Poisson noise using the estimates of the gain. We add the resulting image to the SDSS raw data at a random location on the frame, including the tiny effects of nonlinearity in the response and the less tiny flat-field variation as a function of column on the chip. We run PHOTO on the resulting set of images to extract and measure objects and then run our Sérsic fitting code. This procedure thus includes the effects of seeing, noise, and sky subtraction. We have tested that our results remain the same if we insert images using an alternative estimate of the seeing based simply on stacking nearby stars (still fitting using our three-gaussian fit to the PSP seeing estimate). Figure 9 displays the distribution of fit parameters in the r-band (converting A and r 0 to total flux f and half-light radius of the profile r 50 ), as a function of the input parameters. Each panel shows the conditional distribution of the quantity on the y-axis as a function of quantity on the x-axis. The fluxes f are expressed in nanomaggies, such that f = 100 corresponds to m = 17.5, near the flux limit of the Main galaxy sample. The lines show the quartiles of the distribution. At all Sérsic indices, sizes, and fluxes, the performance is good.
For larger sizes, sizes and fluxes are underestimated by about 10% and 15% respectively, while the Sérsic index is constant over a factor of ten in size. For high Sérsic indices, the sizes and fluxes are slightly underestimated (again by about 10% and 15%) while the Sérsic index itself is underestimated by (typically) 0.5 for a de Vaucouleurs galaxy, meaning that a true de Vaucouleurs (n = 4) galaxy yields n ∼ 3.5 in our fits. This remaining bias is not much larger than the uncertainty itself and is comparable to the bias one expects (in the opposite direction) from neglecting non-axisymmetry.
The bias is partly due to our approximate treatment of the seeing, but mostly due to small errors in the local sky level (at the level of 1% or less of the sky surface brightness) determined by the photometric software. If one fits for the sky level, one can recover the Sérsic indices (and fluxes and sizes) of the fake galaxies far more accurately. However, because the Sérsic model is not a perfect model for galaxy profiles, for real data the fits apply unrealistically high changes to the sky level to attain slight decreases in χ 2 . The resulting sizes and fluxes of the largest and brightest galaxies are obviously wrong. Thus, we satisfy ourselves that the measurements we obtain with the fixed sky level yield approximately the right answer for galaxies which are actually Sérsic shaped, and for other galaxies merely supply a seeing-corrected estimate of size and concentration.
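The conversion from the fit parameters A and r_0 to the total flux f and half-light radius r_50 follows analytically from the Sérsic form of Equation 6: integrating over the plane gives f = 2π A r_0² n Γ(2n), and r_50 = r_0 t_50^n, where t_50 solves P(2n, t_50) = 1/2 for the regularized lower incomplete gamma function P. The sketch below evaluates these expressions with the standard library only; the function names are hypothetical, seeing is ignored, and the series and bisection are chosen for transparency rather than speed.

```python
import math

def regularized_lower_gamma(a, x, tol=1e-14, max_terms=10000):
    """P(a, x) from the power series of the lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    k = 0
    while abs(term) > tol * abs(total) and k < max_terms:
        k += 1
        term *= x / (a + k)
        total += term
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def sersic_total_flux(amplitude, r0, n):
    """Total flux of I(r) = A exp[-(r/r0)^(1/n)] integrated over the plane."""
    return 2.0 * math.pi * amplitude * r0 ** 2 * n * math.gamma(2.0 * n)

def sersic_r50(r0, n):
    """Half-light radius, found by bisection on P(2n, t) = 1/2."""
    a = 2.0 * n
    lo, hi = 0.0, 4.0 * n + 10.0  # the median of a Gamma(2n) variate lies below this
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if regularized_lower_gamma(a, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return r0 * (0.5 * (lo + hi)) ** n
```

For an exponential profile (n = 1) this gives r_50 ≈ 1.678 r_0, and for a de Vaucouleurs profile (n = 4) it gives t_50 ≈ 7.669, the familiar constants for those profiles.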

Distances to low redshift galaxies
At very low redshift we must take care in using the redshift as an estimate of the distance. First, we convert the heliocentric redshift provided by idlspec2d into the frame of the Local Group barycenter using the Local Group heliocentric velocity determination of Yahil et al. (1977).
Then, we use a model of the local velocity field based on the IRAS 1.2 Jy redshift survey determined by Willick et al. (1997) (using β = 0.5) in order to find the most likely distance of the given galaxy. Along the sightline x̂ to each galaxy we maximize the likelihood density expressed by: where v(r, x̂) is the outward radial peculiar velocity at distance r (expressed in km s−1) in direction x̂. The fit of Willick et al. (1997) extends to 64 h−1 Mpc. Outside that radius we neglect peculiar velocities and assume the Hubble Law is exact. We set σ_v = 150 km s−1 independent of local density. We taper the peculiar velocities v(r, x̂) to zero between 50 and 64 h−1 Mpc in order to provide a smooth transition between these two regimes. The typical corrections are of the order of 200-300 km s−1.
We report errors in the distance based on the following calculation: we find the furthest distances above and below the best-fit distance at which the probability in Equation 8 is equal to exp(−2) of its peak value (the 2σ point in Gaussian statistics) and report 1/4 of the difference as the standard deviation in the distance. Near the edges of the volume for which we have an estimate of the velocity field (r = 0 and r = 6,400 km s−1) we use 1/2 of the difference between the best-fit distance and the inner distance satisfying the above criterion. Outside that volume, we simply use the velocity dispersion σ_v = 150 km s−1.
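The error estimate described above can be illustrated on a sampled likelihood curve. This is a minimal sketch assuming the likelihood has already been evaluated on a grid of distances (in km s−1); the function name is hypothetical, and the special treatment near the edges of the modeled volume is not included.

```python
import math

def distance_with_error(r_grid, likelihood):
    """Return (best-fit distance, standard deviation) from a sampled likelihood.

    The standard deviation is 1/4 of the span between the furthest grid points
    on either side of the peak whose likelihood is at least exp(-2) of the
    peak value (the 2-sigma points for a Gaussian)."""
    peak_index = max(range(len(r_grid)), key=likelihood.__getitem__)
    threshold = math.exp(-2.0) * likelihood[peak_index]
    inside = [i for i, p in enumerate(likelihood) if p >= threshold]
    r_low, r_high = r_grid[inside[0]], r_grid[inside[-1]]
    return r_grid[peak_index], 0.25 * (r_high - r_low)
```

For a Gaussian likelihood of width σ the exp(−2) points sit at ±2σ, so the span is 4σ and the rule returns σ, which is why the factor of 1/4 is used.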
In order to test our method, we have compared our version of distances to a set of galaxies in common with the Mark III catalog of Willick et al. (1997). Our results are consistent with the IRAS-predicted velocity field distances (dist iras) in that catalog. On the other hand, there are significant disagreements (at the few Mpc level) with the Tully-Fisher corrected distances (dist tfc) of that catalog, in the sense that the Tully-Fisher distances tend to be higher. These differences reflect the inability of the IRAS density field to predict velocities perfectly in the directions probed by our sample.
The results of the procedure are in the file: $VAGC REDUX/velmod distance/distance sigv150.fits which lists the coordinates, heliocentric (ZACT), Local Group relative (ZLG), and peculiar-velocity corrected (ZDIST) redshifts for each object.

Matching spectra for badly deblended targets
When we matched the SDSS tiling catalog (sdss tiling catalog.fits) and the SDSS spectroscopic catalog (sdss spectro catalog.fits) to the SDSS imaging catalog data, we used a match length of 2 ′′ . However, for low surface brightness or complex galaxies, the behavior of the deblender has changed as the photometric software has changed over time. Thus, the spectrum may reflect that of an object in the latest SDSS imaging catalog, but not be near the nominal center of that object. We would like to make sure that we can recover the redshifts of objects in these cases.
In order to do so, we take all objects in the object sdss imaging file that have no spectroscopic matches, and compare them to all spectra in the SDSS spectroscopic catalog that are not already matched to object sdss imaging galaxies and which are within 2r P,90 (twice the Petrosian 90% light radius) of the center of the object. There are about 3000 spectra with such a candidate match. For each object we take a 3 ′′ radius aperture around the center of each nearby spectrum, and measure the flux contributed by the object in question according to the deblender (or its parent, if the quality flag USE PARENT is set) as well as the total flux in that exact same aperture. If the flux in that aperture from the object is at least half of the total flux in that aperture then we consider the given spectrum to match the given object. We perform the same operation for the tiling catalog in order to match it to the object sdss imaging catalog.
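The flux-fraction criterion can be sketched as follows. This is an illustration only: images are plain lists of rows, pixel centers are taken to lie at integer coordinates (unlike the half-integer SDSS convention), and the aperture radius must be supplied in pixels (a 3′′ radius is roughly 7.6 pixels if one assumes pixels of about 0.4′′). The function names are hypothetical.

```python
def aperture_flux(image, x0, y0, radius_pix):
    """Sum the pixels whose centers lie within radius_pix of (x0, y0)."""
    total = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if (x - x0) ** 2 + (y - y0) ** 2 <= radius_pix ** 2:
                total += value
    return total

def spectrum_matches_object(object_image, full_image, x0, y0, radius_pix,
                            min_fraction=0.5):
    """True if the object contributes at least min_fraction of the total flux
    in the aperture centered on the spectrum position (x0, y0)."""
    object_flux = aperture_flux(object_image, x0, y0, radius_pix)
    total_flux = aperture_flux(full_image, x0, y0, radius_pix)
    return total_flux > 0.0 and object_flux / total_flux >= min_fraction
```

Here `object_image` stands for the deblender's attribution of flux to the object (or to its parent), and `full_image` for the total flux in the same region.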
We store the results in the files:

$VAGC REDUX/matchspec/matchspec.fits
$VAGC REDUX/matchspec/matchtiled.fits
Each file contains an entry for the closest spectrum within 2r P,90 of the center of each unmatched object (the column OBJECT POSITION indicates which object is under consideration, and the column ISP or ITI indicates the corresponding entry of the sdss spectro catalog or sdss tiling catalog). The column SPMATCHED or TIMATCHED indicates whether the entry satisfies the criterion above. The catalog entries from sdss spectro catalog or sdss tiling catalog are also included for convenience.
The results of this operation are included when we build the LSS sample (Section 4.8), the low-redshift catalog (Section 5), the K-corrections (Section 4.2), and the collision corrections (Section 4.1).

Double stars
At very low redshifts, many of the photometrically defined "galaxies" are not galaxies at all, but instead are double stars which the photometric software failed to deblend. Typically these double stars are flagged as galaxies by the photometric pipelines because they are resolved, bright (m r < 16), and small (θ 50 < 2′′).
In order to find many of these double stars, we have post-processed the atlas images for all objects that the photometric software reports as resolved and that have cz < 1500 km/s, as well as all galaxies without spectra for which µ 50,r < 19. We fit a double PSF model to the r-band image using an idlutils utility we wrote for this purpose called multi psf fit. We classify as double stars all objects that pass both the following criteria: where f model and f PSF are the "model" and "PSF" fluxes reported by the photometric pipeline (Abazajian et al. 2004), and χ 2 double and χ 2 single are measures from multi psf fit of the residuals between a model and the image using the double star and single star models. The first criterion is necessary to exclude cases where the galaxy model fit is much more appropriate; the second criterion simply measures how much better the double star fit is. We have set the parameters conservatively, in the sense that there are essentially no objects reported as double stars that are not in fact double stars. On the other hand, this conservatism means that there are some double stars not flagged as such by this procedure.
The results of this procedure exist in the file:

$VAGC REDUX/doublestar/doublestar.fits
The important column in this file is ISDOUBLE, which has a "1" if the procedure above concludes that the object is a double star, and "0" otherwise.
It is worthwhile noting that because M stars have such distinctive spectra, any resolved object flagged in object sdss spectro as an M star is in fact a double star or an M star in the foreground of a distant galaxy. So when searching for very low redshift objects, one should exclude any object with a subclass of M star. On the other hand, it is common for low redshift galaxies to be classified as stars of other types.

Eyeball quality checks
For various purposes, we have performed a number of eyeball checks on the photometry and the spectra. We do not claim any completeness in terms of what set of objects we have checked, but it is usually productive to exclude objects we have flagged as errors in this list.
The quality file is at: For each object which we have quality checked, we have set values in a bitmask flag whose values and meanings are listed in Table 2.
For objects with DONE set and no other flags, we have concluded that the object is dealt with by PHOTO more-or-less correctly. Any other flag that is set indicates an error, and the object should be ignored, with two exceptions: if USE ANYWAY is set, we have concluded that keeping the object and its measurements is better than excluding it; if USE PARENT is set, we recommend using the measurement of the parent.
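The decision logic for the quality bitmask can be written compactly. In the sketch below the bit positions are placeholders chosen for illustration (the actual values are those listed in Table 2), BAD_DEBLEND stands in for any error flag, and giving USE PARENT precedence over USE ANYWAY when both are set is an assumption.

```python
# Hypothetical bit assignments for illustration; see Table 2 for the real ones.
DONE = 1 << 0
BAD_DEBLEND = 1 << 1   # stand-in for any error flag
USE_ANYWAY = 1 << 2
USE_PARENT = 1 << 3

def quality_decision(flags):
    """Map a quality bitmask to 'unchecked', 'use', 'use_parent', or 'reject'."""
    if flags == 0:
        return "unchecked"      # the object was never eyeballed
    if flags & USE_PARENT:
        return "use_parent"     # take the parent's measurements instead
    if flags & USE_ANYWAY:
        return "use"            # flawed, but better kept than dropped
    if flags & ~DONE:
        return "reject"         # some error flag is set
    return "use"                # DONE and nothing else
```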
In cases of USE PARENT, we have created a set of files for the measurement of the parents, which is at

$VAGC REDUX/parents
This directory is more-or-less constructed to be parallel with the $VAGC REDUX directory. The list of objects whose parents we have processed is in the file: $VAGC REDUX/parents/object sdss imaging parents.fits, which also has the photometric information for each object (as in the object sdss imaging file). We calculate K-corrections and Sérsic fits for the parents and store them in the appropriate directories below this level, as fully documented on the web site listed in Table 1.
In some cases, the parent is centered in an odd place, such as an HII region on the outskirts of the galaxy, rather than the center. These objects we flag as BAD PARENT CENTER. Although the redshifts and some photometric measurements will be fine, structural measurements (such as Sérsic fits) will be misleading.

Large-scale structure geometry
In a separate directory: we store the information we use to describe the large-scale structure geometry of the survey and with which to do large-scale structure science. Recall that, like $VAGC REDUX, $LSS REDUX is an environment variable describing the root location of the large-scale structure data. The relevant value will be listed in the online documentation, as it will in principle change over time.
There are two basic files. First, there is one file fully describing the geometry, lss combmask.drtwo14.fits, which is readable with the idlutils routine read fits polygons and which contains a row for every spherical polygon. This geometry is the area covered by the imaging, by the set of survey tiles, and not near Tycho stars (as described below). The columns of this file are:
1. SECTOR: sector number (see description of tiling geometry in Section 3.1.2)
2. FGOTMAIN: fraction of Main sample targets which have good redshifts in this sector
3. MMAX: flux limit based on targeting limit and change in calibration since targeting (r-band magnitudes)
4. DIFFRUN: whether the "best" imaging in this field is the same observation as the "target" imaging
5. ITILING GEOMETRY: position (zero-indexed) of the tiling polygon (see Section 3.1.2) to which this polygon belongs
6. ITARGET GEOMETRY: position (zero-indexed) of the target polygon to which this polygon belongs (in the file $VAGC REDUX/sdss/sdss target geometry.fits)
7. ILSS: index into the lss geometry file described below
8. RA: estimated center of the field this was based on (J2000 degrees)
9. DEC: estimated center of the field this was based on (J2000 degrees)
RA and DEC are not always inside the given polygon; however, the polygon is always fully contained within a 0.36 deg circle surrounding that center.
Note that MMAX varies with position for two reasons. First, the explicit targeting limits changed with time over the course of the survey. Second, the calibration of the data has improved since the targeting, such that the flux limit in the recalibrated data varies across the sky. MMAX accounts for both effects.
There is an IDL utility in the vagc product called get sdss icomb which takes right ascension and declination values, and quickly checks which row of the lss combmask file it is contained in. Because this piece of code is relatively simple and self-contained, we reproduce it in full in the Appendix.
Second, there is a file describing the relationship each object in object sdss imaging has with the geometry: This file can be used to make some simple cuts to select galaxies with redshifts. The SDSS redshifts include those which have the ZWARNING flag set but we have determined to be good and flagged GOOD Z (see Section 4.7), as well as those which we have matched using the extended matching criterion described in Section 4.5.
In addition, there are two files describing the geometry, out of which we have built lss combmask:

lss geometry.drtwo14.fits
lss bsmask.drtwo14.fits
Each of these files contains a set of spherical polygons. The first file contains the geometry of the survey as a whole. The second file contains a mask cut out around bright stars (stars with B < 13 in the Tycho catalog; Høg et al. 2000). The radius of the circle around each bright star is set according to the formula: where θ is in arcmin, B′ is the Tycho magnitude B if 6 < B < 11.5, but B′ = 6 if B < 6 and B′ = 11.5 if B > 11.5. This radius is about that at which the mean density of galaxies near Tycho stars drops to half the background value (I. Strateva, private communication). The mask is designed such that each polygon describing it is fully contained in a single polygon of the lss geometry file (which one is stored in the ILSS column).
Finally, the directory: $LSS REDUX/drtwo14/random/ contains one hundred random catalogs with one million points each, distributed with constant surface density in the area included by lss combmask. In addition to right ascension and declination, each file contains ILSS, indicating which polygon of lss geometry the random point is in, and EBV, the E(B − V) value for this direction from Schlegel et al. (1998).

A low redshift catalog (0.0033 < z < 0.05)
One of the areas of the SDSS which requires special care is the treatment of galaxies at low redshift. In order to study the properties of galaxies at low redshifts and, correspondingly, at low luminosities, we have done some simple checks of the SDSS catalog in this regime, cleaned up the catalog where it was simple to do so, and put together a "low-redshift" catalog of galaxies with estimated comoving distances in the range 10 < d < 150 h−1 Mpc.
For the purposes of this catalog, we checked the atlas images and spectra of a number of galaxies. We flagged bad deblends as errors and set other quality flags according to Section 4.7. In particular, in DR2 we have checked all objects in the catalog that we have selected as Main-sample-like objects (see Section 3.1.1), that have a spectrum, and that satisfy one of the following criteria:
1. M r > −15 and z > 0.003, if a good redshift exists in the sample redshift range, the object is not classified as a double star according to the algorithm in Section 4.6, and it is not classified as an M star. We deemed about 22% of these to be deblending errors in the latest reductions; for about 72% of these errors (16% of the total number) using the parent photometry is sufficient. So we recover about 94% of the objects in this category.
2. 0.003 < z < 0.01, if a good redshift exists, the object is not classified as a double star according to the algorithm in Section 4.6, and it is not a star. Again, about 19% are deblending errors, for 70% of which (14% of the total number) the parent photometry is sufficient, thus recovering about 96% of the galaxies in this category.
The redshifts we use include those matched to the imaging objects using the criterion described in Section 4.5. Note that we do not require that the object be spectroscopically classified as a galaxy; a certain number of low redshift galaxies are classified as stars, especially in cases where the spectrum has a low signal-to-noise ratio. The low redshift cut corresponds to 10 h−1 Mpc. We do not consider anything below this redshift because the sample becomes highly incomplete (due to shredding by the photometric pipeline of large resolved galaxies) and the distance estimates for such objects are highly affected by peculiar velocities.
For galaxies with USE PARENT set, we replace the SDSS child's photometry with that of the parent, using the results described in Section 4.7.
In the DR2 area, 28,089 galaxies pass the above criteria. Weighted by the completeness, the effective area of the sample is 2220.9 square degrees.
To compute the global number densities of galaxies as a function of their properties, it is necessary to compute the number-density contribution 1/V max for each galaxy, where V max is the volume covered by the survey in which this galaxy could have been observed, accounting for the flux, redshift limits, and completeness as a function of angle (Schmidt 1968). We calculate this volume as follows: where f(θ, φ) is the spectroscopic completeness as a function of angle, averaging about 85% across the survey. We determine this on a sector-by-sector basis (it is the FGOTMAIN value described in Section 4.8). z max(θ, φ) is defined for this sample by:

$$ z_{\max}(\theta, \phi) = \min\left( z_{m,\max}(\theta, \phi),\ 0.05 \right), \qquad z_{\min} = 0.0033 . $$
Note that over this redshift range we can ignore the contribution of the surface brightness limits to V max , since cosmological surface brightness dimming is such a small effect.
There is a complication at low redshift: our estimate of the luminosity has a large uncertainty due to galaxy peculiar velocities. V max as calculated above is a nonlinear function of luminosity, such that an uncertainty in luminosity will yield an underestimate of V max . For a fair estimate of V max , we use: where p(L)dL = p(r)dr (from Equation 8 above). This estimate is an average of V max for all possible luminosities based on the probability of the galaxy having that luminosity. In practice, there is only a small difference between the results one finds using the regular V max estimator and this one.
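A sector-by-sector V max sum can be sketched as below. This is an illustration under explicit assumptions, not the catalog's calculation: distances use the Hubble-law approximation d = cz/H0 with h = 1 (reasonable only because z ≤ 0.05, where it places the redshift limits at roughly 10 and 150 h−1 Mpc), each sector is described by a hypothetical tuple of solid angle in steradians, completeness, and the flux-limited maximum redshift, and the averaging over the luminosity probability distribution is omitted.

```python
C_KMS = 299792.458   # speed of light in km/s
H0 = 100.0           # km/s/Mpc for h = 1, as in the text
Z_MIN = 0.0033       # lower redshift limit of the sample
Z_CAP = 0.05         # upper redshift limit of the sample

def hubble_distance(z):
    """Distance in h^-1 Mpc from the Hubble law (low-redshift approximation)."""
    return C_KMS * z / H0

def vmax(sectors):
    """Completeness-weighted volume in (h^-1 Mpc)^3.

    `sectors` is an iterable of (area_sr, completeness, z_m_max) tuples."""
    total = 0.0
    for area_sr, completeness, z_m_max in sectors:
        z_max = min(z_m_max, Z_CAP)
        if z_max <= Z_MIN:
            continue  # the galaxy could not be seen anywhere in this sector
        # Volume per steradian of the shell between the two redshift limits.
        shell = (hubble_distance(z_max) ** 3 - hubble_distance(Z_MIN) ** 3) / 3.0
        total += area_sr * completeness * shell
    return total
```

A luminous galaxy whose flux-limited redshift exceeds 0.05 everywhere simply receives the full completeness-weighted sample volume, while fainter galaxies receive smaller volumes and hence larger 1/V max weights.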
The catalog is available in the file: $VAGC REDUX/lowz/lowz catalog.drtwo14.fits Its columns are described in Table 3.
Atlas images (that is, images with neighboring objects removed; Stoughton et al. 2002) of all the objects in the catalog are available in the directories: where each directory contains galaxies in a particular hour of right ascension (J2000). The name of each atlas image is based on the IAU name of the object, e.g. lowz-atlas-J044112.00+003202.3.fits. As described on the web site, each file contains ten HDUs. HDUs 0, 2, 4, 6, and 8 contain the ugriz images of each galaxy. HDUs 1, 3, 5, 7, and 9 contain the inverse variance ugriz images of each galaxy. In addition, there is a file of the form psf-J044112.00+003202.3.fits, which contains five HDUs, the estimated PSFs at the center of the object in ugriz.
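The HDU layout and file naming described above reduce to simple index arithmetic. The helper names below are hypothetical; they only encode the stated convention (images in the even HDUs, inverse variances in the odd HDUs, bands in ugriz order) and do not read any file.

```python
BANDS = "ugriz"

def atlas_hdu(band, inverse_variance=False):
    """HDU index of a band's image (even) or inverse variance (odd)."""
    return 2 * BANDS.index(band) + (1 if inverse_variance else 0)

def psf_hdu(band):
    """HDU index of a band's PSF in the five-HDU PSF file."""
    return BANDS.index(band)

def atlas_filename(iau_name):
    """Atlas image file name for an IAU object name."""
    return "lowz-atlas-%s.fits" % iau_name

def psf_filename(iau_name):
    """PSF file name for an IAU object name."""
    return "psf-%s.fits" % iau_name
```

With a FITS reader of the user's choice, the r-band image would then be the extension numbered `atlas_hdu("r")` and its inverse variance the one numbered `atlas_hdu("r", inverse_variance=True)`.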
We have already used this low redshift catalog for an investigation of the population of low luminosity galaxies in the SDSS survey (Blanton et al. in preparation). We expect the catalog and images to be useful in a number of other ways. For example, it provides a nice low redshift sample for comparison to high redshift galaxy samples.

Software tools
Generally speaking we have used a combination of C and IDL in constructing this catalog. Astronomers will likely find many of the utilities we have used to construct the dataset useful for analyzing it as well. The source code and online documentation for some of these tools are listed in Table 1.
Here are some short descriptions of the tools themselves:
1. idlutils: A general astronomically useful set of IDL utilities maintained by David Schlegel and Doug Finkbeiner, incorporating the Goddard IDL library maintained by Wayne Landsman, and contributed to by many others too numerous to mention. This library contains readers for the FITS files, spherical polygon files (see Section 2), and FTCL parameter files (a special ASCII format) that our catalog contains. In particular:
(a) mrdfits is a general FITS reader, which can read FITS images and tables (.fits files).
(b) read mangle polygons will read spherical polygon files produced by mangle (.ply files) and read fits polygons will read spherical polygon files in FITS format (and idlutils has code which makes it easy to work with the resulting structures). See the mangle web site listed in Table 1, or Section 2, for details on spherical polygon files.
(c) yanny readone is a FTCL parameter file reader (.par files, a special SDSS ASCII format; see the DR2 web site listed in Table 1 for details).
2. photoop: An SDSS-specific set of utilities written in Perl, IDL, and C by David Schlegel, Doug Finkbeiner, and Nikhil Padmanabhan. These tools are primarily for performing the photometric reductions and calibrations, but also contain image analysis utilities that may be useful to users.
3. vagc: The code responsible for producing this catalog. It contains some useful utilities for reading in the data from the catalog.
4. kcorrect: A set of utilities for calculating K-corrections and photometric redshifts, tuned to work for SDSS and 2MASS data. Our catalog contains K-corrections already calculated, but the user may find this useful.
5. mangle: A general set of tools for handling window functions on the sphere, developed by Hamilton & Tegmark (2004). It is described more fully in Section 2. These routines can be useful to the user for creating random catalogs from the geometrical descriptions given here, and also for checking whether certain directions are inside or outside the surveys. This code is also distributed as part of idlutils.
We should note that none of these tools are necessary for using the data, only recommended as generally useful packages.

Summary
Here we have presented a catalog of galaxies combining information from SDSS, 2MASS, FIRST, PSCz, RC3, and 2dFGRS. The main improvements of this catalog over the standard SDSS release (in addition to the matches to other catalogs) are a better calibration and an explicit description of the geometry, including all of the information necessary to perform large-scale structure analyses with the catalog. We have also included structural measurements, K-corrections, peculiar velocity corrections, and quality checks of many objects. The catalog is fully documented on the web site listed in Table 1.
As we continue to develop the NYU-VAGC, we expect to add considerable functionality. For example:
1. We plan to extract images and, where possible, measure structural parameters for galaxies detected in other surveys which overlap the SDSS.
2. We plan to include parameters for NYU-VAGC galaxies from other surveys (e.g. the Spitzer Space Telescope and the Galaxy Evolution Explorer).
3. We plan to continue adding SDSS data as it is released, and improving the treatment of deblended galaxies.
4. The SDSS ubercalibration procedure will continue to improve, as the SDSS collaboration takes more and more overlapping data.
We note that this project will be more successful if users provide feedback about how the catalog could be improved, since we do not expect that we can predict from pure thought what astronomers will find useful.

Table 2. Eyeball quality flags (partial)
HII: an HII region plucked out of a larger galaxy
8 USE ANYWAY: there is a bad deblend, but we recommend you include the object in your sample anyway

Table 3. Columns of the low-redshift catalog (partial)
...: CCD y position in the field (centers of pixels are half-integers)
OBJC COLC: CCD x position in the field (centers of pixels are half-integers)
PLATE: SDSS spectroscopic plate
FIBERID: SDSS spectroscopic fiber number
MJD: date of SDSS spectroscopic observation
QUALITY: eyeball quality flag (see Section 2 for full description)
ABSMAG[8]: absolute magnitude (AB) in the ugrizJHKs bandpasses (first five from SDSS Petrosian magnitude, last three from 2MASS XSC converted to the AB system as described in the text), K-corrected and Galactic extinction corrected (Schlegel et al. 1998)
SERSIC N[5]: Sérsic index estimated from radial profile in ugriz
SERSIC TH50[5]: 50% light radius from Sérsic fit in ugriz (arcsec)
SERSIC FLUX[5]: total flux from Sérsic fit in ugriz (nanomaggies; see Equation 4 in the text)
VDISP: estimated velocity dispersion from spectrum
VDISP ERR: estimated uncertainty in velocity dispersion from spectrum
CLASS: spectroscopic classification, as output by the SDSS spectroscopic pipeline (note that occasionally this is incorrect; in particular, a number of galaxies in our sample are classified as stars spectroscopically because the signal-to-noise ratio of the spectrum does not allow reliable discrimination between the two)
SUBCLASS: spectroscopic subclassification (e.g., stellar type), as output by the SDSS spectroscopic pipeline
VMAX: maximum volume in the sample over which we could have observed this object
NEDNAME: name of NED match
NEDCZ: redshift from NED match
ZLG: Local Group relative redshift from SDSS
ZDIST: peculiar velocity corrected Local Group relative redshift from SDSS
ZDIST ERR: uncertainty in ZDIST

Fig. 1.-The top two panels show the r − i color of the bluest stars in the magnitude range 16 < m r < 18.5 in each contiguous set of twenty fields in each run of the SDSS (all magnitudes extinction-corrected according to the dust maps of Schlegel et al. 1998). The bottom panel shows the calibration of the scale. Only the Northern Equatorial data is shown. The top panel shows this quantity for the data calibrated to the SDSS standard system using the photometric telescope. The middle panel shows the same for the ubercalibrated data, as described in the text.

Fig. 9.-Residuals of the fit Sérsic parameters as a function of the input Sérsic parameters for a set of 1200 simulated galaxies inserted into raw data and processed with the SDSS photometric pipeline plus the Sérsic fitting procedure. The fluxes and sizes are those associated with the Sérsic fit. The greyscale represents the conditional probability of the y-axis measurement given the x-axis input; the lines show the quartiles of that distribution.