EDF 5481 METHODS OF EDUCATIONAL RESEARCH
INSTRUCTOR: DR. SUSAN CAROL LOSH
FALL 2001
WHY EXAMINE WEB-BASED DATABASES?
As you have already learned, it is expensive
and time-consuming to collect data, especially datasets that are sizable
or comprehensive. In the early 1970s, the United States federal government
initiated a series of what have come to be called "Social Indicators."
The idea was to collect data from different domains (education, health,
the status of women and ethnic minorities, public opinion, etc.) and to
continue these series over time, thereby tracking change and continuity
among Americans. At the same time, other countries, particularly Canada,
Western Europe, and Japan, also began indicator series, thus making possible
international comparisons. One example is the Third International Mathematics
and Science Study (TIMSS). Data were collected in 42 countries in 1995
and in 38 countries in 1999. A recent addition addresses experience with
computers and the World Wide Web.
Considerable effort has been devoted
to making many of these indicator series compatible over time:
-
Questions are asked in the same way
-
Changes to questions are established via
"split-ballot" testing, i.e., experiments to see if the revised questions
work the same way as the original questions
-
Variables are defined in the same way
-
Coding categories remain constant
-
If coding changes are made, care is taken
to make new coding systems compatible with the old, such as the detailed
census three-digit occupational codes
A series may have an "oversight board."
These boards monitor the content and form of the indicator series. Thus,
principal investigators cannot arbitrarily change either content or form
without input from a panel of expert professionals.
The number of data archives is already
HUGE and it seems to be growing by the minute. Some of the large archives,
such as The Roper Center or the Howard W. Odum Institute for Research in
Social Science at the University of North Carolina, are simply staggering
in the amount of data that they hold.
As you look through some of the pages,
you will see that several times I have given the warning: "set aside a
day to explore this archive." Do take this warning seriously! One of these
archives may hold the answer to your proposed dissertation or provide the
basis for a nice conference paper or article. They are definitely worth
exploring.
With resources such as these, the novice--and
even the experienced--researcher should seriously reconsider whether they
want to gather all of their own data from scratch.
WHY THESE ARCHIVES ARE IMPORTANT TO YOU
-
There is no point in "reinventing the wheel."
Why do a small local study when data already exist on regional, national
or even international levels? An example is using the "CIRP" to look at
college student beliefs, attitudes, and accomplishments instead of convenience
samples of your buddy's classes.
-
"There is plenty of gold in them thar hills."
Most of these databases are so huge that no one investigator could ever
analyze everything in them. With each successive year, the possibilities
for analysis grow. Further, other researchers may have ideas for analysis
that did not occur to the original Principal Investigator. In other words,
there is plenty of data for you to do an original analysis--without all
the backbreaking work of collecting the data too.
-
Many of these archives offer an unprecedented
opportunity to track trends over time. How did computer use change from
the early 1980s to the late 1990s? What kind of educational preparation
do students receive who rise to eminence later on? What are the average
student characteristics in research universities as opposed to liberal
arts colleges, and how did these characteristics change over time? What
are gender differences in Internet use over time?
-
YOUR time, resources, and energy. Many
researchers, especially junior faculty, have limited resources. With one
eye on the tenure clock, junior faculty have limited time too. It takes
time, often A LOT of time, to gather your own data. If existing archives
have variables that are directly pertinent to your research interests,
it is often in your best professional interests to use these archives.
Obviously, using pre-existing archives
is not for everyone. Many students in disciplines that lend themselves
to "quick and dirty" experiments can quickly collect data with relatively
little financial investment. However, even these researchers may be interested
in "triangulation" with survey data or historical records.
CLICK HERE TO ENTER THE ONLINE DATABASE MENU
QUESTIONS YOU SHOULD CONSIDER ABOUT ONLINE DATABASES
-
What is the unit of analysis? Is it an
individual? An organization, such as a college or university? A time point
for a country or state series? Archives vary and the unit is not always
an individual.
-
What kinds of variables does the archive
cover? Degree attainment? Health practices? Drug or alcohol usage? Attitudes?
-
What is the time frame covered by the archive?
Examples: the average school FCAT scores for 1998-2001 or The General Social
Survey from 1972-2000.
-
What is the geographic frame covered by
the archive (state? local? United States? international?)
-
Who were the sponsor(s) of the archive
(e.g., NSF? NCES? United Faculty of Florida?)
-
How did the archive come to be?
-
Were the data collected especially for
the archive (such as IPEDS)? Or were the data compiled from other sources
(such as Web CASPAR)?
-
Does the archive contain any tutorials
that instruct how to use it (online or otherwise)?
-
How are the data available? Are they ready
for online analysis? Are the data available to download into your computer?
Are the data contained in .pdf format tables? Are there alternative
ways to obtain the data (such as CD-ROM)? If so, how can the data be obtained?
-
Can you simply download the data or must
you obtain a CD-ROM or other medium from the archive agency?
-
Is there a charge for the data? If so,
what is the cost? Most archival costs are surprisingly reasonable, when
you consider the effort involved in the first place. For example, the cost
of the ENTIRE General Social Survey archive, from 1972 to 2000, over 40,000
interviews, in SPSS ready format, and including a hard copy of the Codebook
is about $300. Compare this with the millions of dollars it cost
to gather the data. Don't forget: you will incur time and financial costs
to gather and process your own data. It may, indeed, turn out to be cheaper
to use the archive.
-
What kinds of analyses can be done online?
Frequency distributions? Cross-tabulations? Multiple regression or other
multivariate analyses?
-
Is a questionnaire available or some other
original document describing each variable in detail?
-
What is mentioned about coverage or response
rate? For example, data are missing from several states in early data series
about abortion. Some surveys have completed interviews with fewer than half
of the originally contacted respondents. In other cases, such as the CIRP,
response rates can vary considerably from college to college.
-
Do you need any kind of license from the
data agency? Many data sets at the National Science Foundation, the National
Center for Education Statistics, and other agencies require you to have
a license if you work with what are called "unit record" data. Unit
record data are the "raw data" where each record is an individual or an
institution. This means the person or institution could plausibly be identified.
Obtaining a license is typically not a problem for legitimate researchers
but it does necessitate some paperwork so be prepared to check about this
and budget some time accordingly.
-
How recently has the database been monitored
or updated? See if you can find a date on the page, typically at the very
top or the very bottom of the page. "Old pages" may have missing links
or unfixed errors, may omit the most recent updates to files, or simply
may not work.
-
Were the data gathered over time by different
agencies or different principal investigators? If so, changes
in variables, definitions, or coding may have occurred. You may find differences
attributable to these changes, rather than to changes in the concepts you
are studying, which is a threat to internal validity.
-
How far back does the data series extend?
The longer the series, the more likely you are to encounter strange alphabetic
and non-alphanumeric codes, or inconsistencies in definitions or measures.
-
Were data compiled from different agencies
into a single archive? Again, check for inconsistencies in definitions (even
of the same variable!) across agencies.
-
See if the description of the archive notes
any problems or missing information.
-
What are your computer skills? Some databases
are in ASCII format which you can probably download into a spreadsheet
such as Quattro Pro or Excel. But the field delimiters vary widely: some
use spaces, others use commas, still others rely on a format statement
so that the data can be read. Do you know how to analyze data using a spreadsheet
program? If not, do you know how to transfer spreadsheet data into a statistical
program such as SPSS or SAS? Do you have file management skills so that
you can insert value labels, variable labels and missing data codes? In
other cases, you may have to save or print tabular displays and hand enter
the data into a spreadsheet (very carefully). As you can see, it is VERY
helpful to have good computer skills--or to have some good friends who
do!
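To make the delimiter issue concrete, here is a minimal Python sketch of reading the same three records from comma-delimited, space-delimited, and fixed-column ASCII text, then applying value labels and a missing data code. Everything in it is an assumption for illustration: the variable names (id, year, netuse), the column positions, the code 9 for "missing," and the records themselves are made up and do not come from any archive. A real file should be read according to its own codebook.

```python
import csv
import io

# Made-up records for illustration: respondent id, year, coded answer.
COMMA_TEXT = "id,year,netuse\n1,1998,1\n2,1998,2\n3,2000,9\n"
SPACE_TEXT = "id year netuse\n1 1998 1\n2 1998 2\n3 2000 9\n"

# Fixed-column file with no header line. The layout plays the role of a
# "format statement": id in columns 1-3, year in 4-7, netuse in column 8.
FIXED_TEXT = "  119981\n  219982\n  320009\n"
FIXED_LAYOUT = {"id": (0, 3), "year": (3, 7), "netuse": (7, 8)}

# Hypothetical codebook entries for the netuse variable.
VALUE_LABELS = {"1": "yes", "2": "no"}
MISSING_CODES = {"9"}


def read_comma(text):
    """Read comma-delimited text whose first line names the variables."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def read_space(text):
    """Read space-delimited text whose first line names the variables."""
    lines = [line for line in text.splitlines() if line.strip()]
    names = lines[0].split()
    return [dict(zip(names, line.split())) for line in lines[1:]]


def read_fixed(text, layout):
    """Read fixed-column text using (start, end) positions per variable."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            rows.append({name: line[start:end].strip()
                         for name, (start, end) in layout.items()})
    return rows


def apply_labels(rows, column, labels, missing):
    """Replace codes with labels; missing data codes become None."""
    labeled = []
    for row in rows:
        new_row = dict(row)
        code = row[column]
        new_row[column] = None if code in missing else labels.get(code, code)
        labeled.append(new_row)
    return labeled


rows = read_fixed(FIXED_TEXT, FIXED_LAYOUT)
labeled_rows = apply_labels(rows, "netuse", VALUE_LABELS, MISSING_CODES)
```

All three readers return the same list of records, so the later steps (labeling, analysis, or export to a statistical program) do not need to know which delimiter the archive used.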
Any original problems when the data
were first gathered will STILL be there when the data are archived. See
what you can find out about issues with question format, sampling, coding
categories, and other sources of bias and random error. Sometimes (for
example, in the General Social Survey) there will be considerable information
about matters such as response rate; sometimes there will not.
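When an archive does report how many respondents were contacted and how many completed interviews, the response rate is simple arithmetic. The Python sketch below assumes the simplest definition (completed divided by contacted); agencies may define response rate differently, so check the archive's documentation. The college names and counts are made up for illustration.

```python
def response_rate(completed, contacted):
    """Return completed interviews as a proportion of those contacted."""
    if contacted <= 0:
        raise ValueError("contacted must be a positive number")
    return completed / contacted


# Made-up counts for illustration: (completed, contacted) per college.
COLLEGES = {"College A": (450, 500), "College B": (120, 400)}

rates = {name: response_rate(done, asked)
         for name, (done, asked) in COLLEGES.items()}

# Flag any college where fewer than half of those contacted responded.
low_response = [name for name, rate in rates.items() if rate < 0.5]
```

A list like low_response is a quick way to see which units deserve a closer look before you pool them with the rest of the data.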
Always remember this classic cliché:
do the best you can with what you got. Despite any problems, online databases
and archives are a terrific resource for us all.
WHERE TO START HUNTING FOR ONLINE ARCHIVES
-
Professional associations in your field
-
The FSU on-line library system
-
Search engines using your topic of interest
(see McMillan)
-
One link leads to another. I found the
International Social Survey Program link from the General Social Survey
Web site.
-
Ask your major professor
-
Check with faculty and graduate students
in Information and Library Sciences
-
Many recent textbooks have online supplements
or Web sites that list archives
-
Check McMillan, chapters 3 and 4 for information
on Subject Directories and Search Engines (pp. 86-87; 90; 93; 96-97).
November 26 2001
Susan Carol Losh
Always
under construction as new databases are entered.