               NEWS NLREG has been selected as the "Editor"s Pick" by SoftSeek. NLREG is in use at hundreds of universities, laboratories, and government agencies around the world (over 20 countries). For a list of a few organizations using NLREG click here. If you have categorical variables, you may want to use a Decision Tree to model your data. Check out the DTREG Decision Tree Builder. You also should check out the News Rover program that automatically scans Usenet newsgroups, downloads messages of interest to you, decodes binary file attachments, reconstructs files split across multiple messages, and eliminates spam and duplicate files.  ## Cylindrical Regression -- Fit a Cylinder to Data Points

This example shows an NLREG program that fits a cylinder to a set of data points. The center line and radius of the cylinder are calculated so as to minimize the sum of the squared distances between a set of points and the surface of the cylinder.

A cylinder can be defined by specifying its radius, R, and the line that runs through its center (i.e., its center line). A line in 3D space is defined by a point on the line (X0,Y0,Z0) and a direction vector <Vx,Vy,Vz> that specifies the direction of the line.

So the problem of fitting a cylinder involves fitting a line and the radius of the cylinder from the line. The cylinder fitting program is built on the NLREG 3D line fitting program. Once the center line has been calculated, the cylinder program minimizes the sum of the squared distances of the points from the surface of the cylinder.

The NLREG statements for this analysis are as follows:

```/*
*  Fit a cylinder to a set of points in (X,Y,Z) space.
*/
Title "Fit a cylinder to a set of data points";
/*
*  The input values are a set of (Xp,Yp, Zp) coordinates
*  for each point to be fit.
*/
Variables Xp, Yp, Zp;
/*
*  First we deal with the center line of the cylinder.
*
*  The parametric equation for a 3D line is:
*
*   Xp = X0 + Vx*t
*   Yp = Y0 + Vy*t
*   Zp = Z0 + Vz*t
*
*  Where (X0,Y0,Z0) is some point on the line and
*   is a vector defining the direction of the line.
*  t is the parameter whose value is varied to define points on the line.
*
*  Thus, with this definition, there are six parameters: X0, Y0, Z0, Vx, Vy, Vz.
*  But this overspecifies the line because a 3D line can be defined by 4 parameters
*  as long as it is not parallel to the X, Y or Z plane.
*  So we need to eliminate two parameters.
*
*  Rather than allowing an arbitrary point (X0,Y0,Z0) to specify a point on the line,
*  we will force Z0 to be 0 and make X0 and Y0 be the coordinates on the X-Y
*  plane where the line penetrates the plane (i.e., where Z is zero).  This eliminates
*  Z0 as a parameter that needs to be computed.  We can do this as long as we know that
*  the line is not parallel to the X-Y plane, so it intersects it at some point.
*
*  Next, we will work on the direction vector  that defines the direction of the line.
*  The direction of a line can be defined by the amount that X, Y and Z change when
*  the parameter t is increased by 1 (i.e., delta t = 1).  Scaling the direction vector by a
*  non-zero factor changes its length but not its direction (e.g., the direction defined by
*  the vector <1,2,3> is the same as <2,4,6>, but the second vector is twice as long).
*  If we scale the direction vector to force Vz to be 1, then we can define a revised
*  Vx2 = Vx / Vz and Vy2 = Vy / Vz.
*  So we will force Vz to be 1 and define Vx and Vy as multiples of Vz.
*  The scaled direction vector is .
*  This eliminates Vz as a parameter that needs to be computed.
*  Note that this is only valid if Vz is not zero which means the line is not parallel
*  to the X-Y plane.  You can divide by Vx or Vy if you want to allow the
*  line to be parallel to the X-Y plane but not some other plane.
*
*  Now that we know how to fit a line, we consider how to fit the cylinder.
*
*  Let the radius of the cylinder be R.
*  Our goal is to fit the cylinder so that we minimize the sum of the squared residuals
*  which are the distances from the surface of the cylinder to each point.
*  If a point at (X,Y,Z) has a distance 'd' from the line, then its distance from
*  the surface of the cylinder on a line from the point orthogonal to the center line
*  is (d-R).  So the goal is to minimize the sum of these residuals.
*/
Constant Z0 = 0;
Constant Vz = 1;
/*
*  The parameters to be computed are:
*
*  R  = The radius of the cylinder.
*  X0 = The X coordinate on the X-Y plane where the line penetrates.
*  Y0 = The Y coordinate on the X-Y plane where the line penetrates.
*  Vx = The change in X value relative to the change in Z for a change of 1 on the t parameter.
*  Vy = The change in Y value relative to the change in Z for a change of 1 on the t parameter.
*/
Parameters R, X0, Y0, Vx, Vy;
/*
*  Define some work variables.
*/
Double A, Dx, Dy, Dz, Distance, length;
Double Ux, Uy, Uz;
/*
*  Compute a unit vector (Ux,Uy,Uz) with length = 1 for the direction.
*  This normalizes the direction vector .
*/
length = sqrt(Vx^2 + Vy^2 + Vz^2);
Ux = Vx / length;
Uy = Vy / length;
Uz = Vz / length;
/*
*  Compute the distance for a point (Xp,Yp,Zp) from the center line.
*/
A = Ux*(Xp-X0) + Uy*(Yp-Y0) + Uz*(Zp-Z0);
Dx = (Xp-X0) - A*Ux;
Dy = (Yp-Y0) - A*Uy;
Dz = (Zp-Z0) - A*Uz;
Distance = sqrt(Dx^2 + Dy^2 + Dz^2);
/*
*  The function to be minimized by NLREG is the sum of the squared
*  distances of the points from the surface of the cylinder.
*  This is (Distance-R).  Since we are squaring the residuals,
*  we don't care if Distance is greater than or less than R.
*/
Function (Distance - R);
/*
*  Data values for the Xp,Yp,Zp points follow.
*/
Data;
[data values go here]
```
Note that there is no dependent variable or equal sign to the left of the function. NLREG will determine the values of the parameters such that the sum of the squared values of the function (i.e., the sum of the squared distances) is minimized.