When analysing continuous data it is usually best to use the raw data as input,
rather than variance/covariance matrices. A major advantage of raw data is that
it allows Mx to use all the available data, even if data are missing for some
variables.
Mx can read raw data in two formats: from a variable length data file, or from
a rectangular data file. For details, see the Mx manual. We will work with the rectangular
format.
Rectangular datafiles are in ASCII format. It does not matter what extension
the datafile has, but we will use .dat. Data files may be in free format as
well as in fixed format, as long as every variable is separated by at least
one space and every line has an equal number of variables (which also means that
missing values can never be spaces). Fixed format is used in the example datafiles.
The datafile may contain numeric values only (no letters or other characters).
The decimal separator must be a dot.
Do not include variable names at the top of your datafile.
Every line represents one case, where a case denotes all members of the
same family; in our examples this is usually a twin pair.
It is no problem if only one member of a twin pair has been measured.
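The layout rules above can be checked before a file is handed to Mx. The following is only a minimal sketch in Python, not part of Mx: the helper name check_rectangular is hypothetical, and it assumes that every field is a plain number written with digits, an optional minus sign and an optional dot-separated decimal part.

```python
import re

# Assumed shape of a valid field: optional minus sign, digits,
# optionally a dot followed by more digits (e.g. 3, 23.50, -99.00).
_NUMBER = re.compile(r"^-?\d+(\.\d+)?$")

def check_rectangular(lines):
    """Return a list of (line_number, message) problems; empty if none found."""
    problems = []
    expected = None  # number of fields on the first line
    for number, line in enumerate(lines, start=1):
        fields = line.split()  # free format: any run of spaces separates fields
        if expected is None:
            expected = len(fields)
        elif len(fields) != expected:
            problems.append(
                (number, f"expected {expected} fields, found {len(fields)}")
            )
        for field in fields:
            if not _NUMBER.match(field):
                problems.append((number, f"non-numeric field {field!r}"))
    return problems
```

A header line with variable names, a decimal comma, or a dot used as a missing value would each be reported as a non-numeric field. A blank field cannot be detected directly, but it shows up as a line with too few fields.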
For sex and other categorical variables that will be used as covariates,
coding should start at zero (0 = female, 1 = male, rather than 1 and 2).
Mx reads missing values as a string, which means that -99.00 is not the same
as -99.000. If we define missing = -99.00, all independent variables should
be written with two decimals and use -99.00 as the missing value. Do not use
dots or empty fields for missing values.
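Because the missing code is matched as a string, it helps to write every value through one formatting routine. The sketch below is an illustration in Python under assumed names (MISSING, format_value and format_row are hypothetical); it writes two decimals throughout and substitutes -99.00 for values that are absent.

```python
MISSING = "-99.00"  # assumed missing code, matching the example in the text

def format_value(value, decimals=2):
    """Format one observed value; None is written as the missing code."""
    if value is None:
        return MISSING
    # Fixed number of decimals, always with a dot as decimal separator.
    return f"{value:.{decimals}f}"

def format_row(values):
    """Join the formatted values of one case with single spaces."""
    return " ".join(format_value(v) for v in values)
```

For example, format_row([23.5, None, 70]) gives the line "23.50 -99.00 70.00", so the missing code is spelled identically wherever it occurs.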
For variables used as covariates, a missing value other than the missing
value assigned to independent variables must be used (see below, use of definition
variables).
In order to be able to estimate sex differences in variance components,
zygosity should be split into 6 categories:
1 = mz male
2 = dz male
3 = mz female
4 = dz female
5 = dosmf (first twin=male, second twin=female)
6 = dosfm (first twin=female, second twin=male)
If there is no information on which twin was born first, 5 zygosity categories
can be used; a single DOS category (always ordered male-female) is then enough.
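The six-category coding can be derived from zygosity and the sex of each twin. The following Python sketch is one possible way to do this; the function name zygosity_code is hypothetical, and it assumes sex is coded 0 = female, 1 = male as recommended for covariates.

```python
def zygosity_code(is_mz, sex1, sex2):
    """Return the zygosity category (1-6) for one twin pair.

    is_mz: True for monozygotic pairs, False for dizygotic pairs.
    sex1, sex2: sex of the first and second twin (0 = female, 1 = male).
    """
    if sex1 == sex2:
        if sex1 == 1:
            return 1 if is_mz else 2  # MZ male / DZ male
        return 3 if is_mz else 4      # MZ female / DZ female
    if is_mz:
        # Opposite-sex pairs cannot be monozygotic; flag as a data error.
        raise ValueError("MZ twins must be of the same sex")
    return 5 if sex1 == 1 else 6      # dosmf / dosfm
```

If birth order is unknown, the same function can still be used after placing the male twin first in every opposite-sex pair, so that only category 5 occurs.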
Each data file should contain at least the following columns:
COUNTRY (numeric code 1-9: should never be missing)
FAMID (numeric, unique code identifying the family/twin pair: should never
be missing)
ZYGOS (coded as specified before, should not be missing)
a list of variables with suffix 1 for the first twin and suffix
2 for the second twin (e.g. AGE1 SEX1 BMI1 WEIGHT1 AGE2 SEX2 BMI2 WEIGHT2)
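As an illustration of this column layout, the Python sketch below builds the column labels and one data line for a family. All names (VARIABLES, MISSING, column_labels, family_row) are hypothetical. For brevity it writes every variable with two decimals and a single missing code, whereas covariates need their own missing value as noted above. The labels are for use in the analysis script only and are not written to the datafile.

```python
VARIABLES = ["AGE", "SEX", "BMI", "WEIGHT"]  # per-twin variables from the example
MISSING = "-99.00"                           # assumed missing code

def column_labels(variables=VARIABLES):
    """Column order: COUNTRY FAMID ZYGOS, then twin 1 and twin 2 variables."""
    labels = ["COUNTRY", "FAMID", "ZYGOS"]
    for twin in (1, 2):
        labels += [f"{name}{twin}" for name in variables]
    return labels

def family_row(country, famid, zygos, twin1, twin2, variables=VARIABLES):
    """Build one line per family; a twin given as None is written as missing."""
    fields = [str(country), str(famid), str(zygos)]
    for twin in (twin1, twin2):
        for name in variables:
            value = None if twin is None else twin.get(name)
            fields.append(MISSING if value is None else f"{value:.2f}")
    return " ".join(fields)
```

A pair in which only the first twin was measured is passed as family_row(1, 101, 3, twin1_values, None), which fills all second-twin columns with the missing code so that every line keeps the same number of variables.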