About Genealogy Programs and Gedcom Files

This page is about genealogy programs. It briefly discusses what is available in different genealogy programs, and then covers "gedcom" files, the files that can be used to send family histories from one person to another.

Genealogy Programs

Genealogy programs are software programs that run on your PC (or, for some of them, your Mac) and allow you to record information about individuals, their families and relationships, important facts, photos and other details, and then to link these families together into a tree that reflects a family history. All of the programs have several essential ingredients:

Other facilities often include

There are a number of good genealogical programs available, most for a very reasonable cost; some are free. Each of these programs has a way to enter data about your family (including "importing" a gedcom file created by someone else) and a way to display charts and graphs and print books and reports. Each program also has a way to "export" a gedcom file, so that the file can be given to someone else and they can "import" the data into their own genealogy program.

I've used various programs over the five years or so that I've been collecting family history. Some have been better than others, and some are no longer available. I started with EasyTree, a very easy-to-use program that I purchased along with a set of CDs containing a lot of genealogical information. I used EasyTree for a couple of years, but had a great deal of trouble "importing" information from other genealogical programs that people would send to me or that I would get from the internet. Further, their customer support was totally absent. After a while, I ended up losing a piece of my database, at which point I looked for something more reliable than EasyTree. I don't believe EasyTree is produced any more, although there may still be "boxed" versions of it for sale, bundled with some genealogy CDs, for a very cheap price.

After EasyTree, I evaluated several programs, and finally settled on Family Origins. It was slightly more difficult to use than EasyTree but proved to be much more robust. It had backup capabilities, which are so important for making sure you don't lose your data, and it had a reasonable path to upgrade to new versions as they became available. One of the programs I tried during the evaluation was Legacy, but I decided against it because it was very slow. In 2003, Family Origins was discontinued and replaced with a new program, whose name I can't remember. I tried it during a free trial period, but it didn't have all the facilities of Family Origins and was not nearly as reliable. I'm sure it's gotten much better since then.

However, during the time that I used Family Origins, the folks at Legacy improved the speed of their product substantially. Along with newer, faster computers, that removed the speed problem I had encountered when I tried it in 2001 or so. The folks at Legacy have produced a very nice product, and they offer a free edition for people just getting started with genealogy and family history. Their "DeLuxe" edition costs US$30 or so, which is quite reasonable for a program with this much function. I found that I needed the additional facilities in the DeLuxe edition and paid the extra money. Since then I've been extremely happy with Legacy. It's up to Version 5, which is very rich in function. It has worked reliably and provides a mechanism for backing up your family history files. But more than anything else, the Legacy people are just really nice to work with. They keep the price of upgrades to new versions very reasonable. They provide timely software updates, which sometimes add a new feature or two as well as fixing a few problems. Overall, they have a fine product, at a reasonable price, with excellent support and upgrade policies. Legacy is available on the internet at http://www.legacyfamilytree.com/.

The "gold standard" for genealogy programs is allegedly a program called "The Master Genealogist", or something very similar. I know of one person who uses it and swears by it. But for me, it's way overpriced at over US$100. Further, I don't know of anything it might have that I don't get with Legacy. One other program, though, bears some discussion. It's a program called PAF, distributed free of charge by the Mormon Church. I have used it; it works well, but its functions are limited, and it has a heavy bias toward the various LDS events. However, I haven't looked at it since late 2002, and it could well have changed since then. It's available at the Mormon Church website; follow through to software downloads.

GEDCOM files

"gedcom" files are files that can be transferred (sent) from one person to another that contain genealogical information. Information such as the names of individuals, their relationships, records about them, their photos and other things can all be included in a gedcom file.

Gedcom files are produced by a genealogy program, as discussed above.

I'm going to go over some of the basics and then present a detailed explanation of what it's all about ... If this is all old news, please forgive me, but I just want to make it as thorough as I can without boring you to death with infinite details.

A "gedcom" (pronounced G-E-D com) file is a file containing information of a genealogical nature. It consists of various pieces of data, such as individual names, family group information, birth dates and all the other things, with each line of text (called a "record") containing one piece of information. Each record consists of only ASCII characters, as opposed to unprintable binary values, and is laid out in a particular manner. The manner in which it is laid out is called the "file format", in this case the gedcom file format. The gedcom file format was developed by the Mormon Church, solely for the purpose of exchanging genealogical data. This exchange needs to happen whenever someone wants to take data from one genealogical database and add it to another. For example, two people in different places might want to share genealogical data. Or a person might want to move their genealogical data from one genealogy program to another, for whatever reason. Or a person might want to build a small, manageable portion of a family's data separately before integrating it into the larger family database. Because the gedcom file is a "transportable" form of genealogical data, it allows all these things to happen.

All files of gedcom data should have .ged as the filename extension in Windows, e.g., "London-Longmans.ged".

If one looked inside a gedcom file, using a tool such as Windows Wordpad, one would see that each line of text (record) begins with a "level" indicator, a number starting at 0 that places the record in a hierarchy. (The GEDCOM standard allows levels up to 99, although most files use only the first few.) A level 0 record is the top record in a group of records, and all of its subservient pieces of data (records) have a level of 1 or greater. A subservient record contains information which has no meaning or value without its more-important construct (object). For example, a birthdate is meaningless unless we talk about a specific individual. Any level 2 records go with the preceding level 1 record, any level 1 records go with the preceding level 0 record, and so forth. For example, an individual would appear as follows:
-----------------------
0 @I8@ INDI
1 NAME Tyler Christopher /LONGMAN/
1 SEX M
1 BIRT
2 DATE 28 OCT 1986
2 PLAC Fremont, California
2 NOTE @BI8@
1 RELI
2 NOTE Catholic
1 FAMC @F2@
1 CHAN
2 DATE 4 FEB 2001
----------------------
In this example, extracted from the Dorset-Louisiana Longman gedcom file, the level 0 record, the first "line of text", begins a record for a particular individual. Note the 0 in the first column, and then the INDI record type on the right side of the line. INDI is the gedcom-prescribed nomenclature for an "INDI"vidual's record. Each new individual would have a level 0 record with INDI as shown in this example. (The @I8@ bit of information is described later).

Note that the second line, beginning "1 NAME Tyl ...", is subservient to the level 0 record which started the new individual. In this case, the level 1 record contains the "tag" NAME, which means that this particular line of text contains the name of the individual begun by the previous level 0 record. In the same way, "1 SEX" contains the sex of the individual begun by the previous level 0 INDI record, and "1 BIRT" begins a group of records of birth details. Note that "1 BIRT" only defines the beginning of birth information, in the same way that "0 ... INDI" only defines the beginning of a new individual. Just as the level 1 record "1 NAME" contains the name of the person whose information was started with the level 0 record, the level 2 records following "1 BIRT" (2 DATE, 2 PLAC, and 2 NOTE) contain the date, place and note information relating to the level 1 BIRT record. In other words, 2 DATE, 2 PLAC, and 2 NOTE are all details of 1 BIRT (birth). All information in a gedcom file is organized using this same general schema. It's important to realize that the level 2 records we just looked at, namely 2 DATE, 2 PLAC, and 2 NOTE, could be the date, place and note for any event; it is only because they are grouped after a "1 BIRT" record that we know they relate to a birth. They could as easily have occurred after a "1 DEAT"h record, or a "1 MARR"iage record, and probably many other record types as well.
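
To make the level scheme concrete, here is a minimal sketch in Python of how a program might read these lines and attach each record to the record above it. It is only an illustration of the grouping rule described above, not how any particular genealogy program does it; the names parse_gedcom and find_value are my own inventions, and the sample text is the individual record shown earlier.

```python
import re

# One GEDCOM line: level, optional @ID@ identifier, tag, optional value.
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\S+)(?:\s(.*))?$")


def parse_gedcom(text):
    """Return the level 0 records in `text` as a list of nested dicts."""
    roots = []
    stack = []  # stack[n] is the most recent record seen at level n
    for raw in text.splitlines():
        if not raw.strip():
            continue
        match = LINE_RE.match(raw)
        if match is None:
            raise ValueError(f"not a GEDCOM line: {raw!r}")
        level = int(match.group(1))
        if level > len(stack):
            raise ValueError(f"level skips a step: {raw!r}")
        node = {
            "xref": match.group(2),
            "tag": match.group(3),
            "value": match.group(4) or "",
            "children": [],
        }
        del stack[level:]  # forget records at this level or deeper
        if level == 0:
            roots.append(node)
        else:
            # A level n record belongs to the latest level n-1 record.
            stack[level - 1]["children"].append(node)
        stack.append(node)
    return roots


def find_value(node, *path):
    """Follow a path of tags downward, e.g. ("BIRT", "DATE"); None if absent."""
    for tag in path:
        node = next((c for c in node["children"] if c["tag"] == tag), None)
        if node is None:
            return None
    return node["value"]


SAMPLE = """\
0 @I8@ INDI
1 NAME Tyler Christopher /LONGMAN/
1 SEX M
1 BIRT
2 DATE 28 OCT 1986
2 PLAC Fremont, California
2 NOTE @BI8@
1 RELI
2 NOTE Catholic
1 FAMC @F2@
1 CHAN
2 DATE 4 FEB 2001
"""

person = parse_gedcom(SAMPLE)[0]
print(find_value(person, "BIRT", "DATE"))  # the date grouped under BIRT
print(find_value(person, "CHAN", "DATE"))  # the date grouped under CHAN
```

The two calls at the end ask for the same DATE tag but get different answers, because each DATE belongs to whichever level 1 record came before it.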

It is the Mormon Church's definition of levels and the short identifiers (tags, usually three or four letters long), such as DEAT, BIRT, INDI, etc., that tell us what each record must look like. For a complete, detailed description of all the record levels and types, go to the Mormon Church website (I don't remember exactly where offhand, but I've looked at it before on their site).

The one thing that I didn't touch on here, as shown in the first "0 ... INDI" record, is the @I8@ bit of information. This is an identifier for the record, and it is what "pointers" in other records refer to. In other words, if there are other records that are related (such as a "family group" record in this case), they contain the same @I8@ value in order to point at this individual. Typically, this arrangement is used to point to individual records from family group records; since a "0 ... FAM" record (meaning "family" information) has level 1 records that indicate the family members, each of those level 1 records holds a pointer to one individual's INDI record.
I scanned the file from which I extracted the sample INDIvidual record, and included the "family group" record here, so that you can see how the two are linked via the @I8@ value; in this case, the individual referred to by "0 @I8@ INDI" in the previous gedcom file fragment is a "CHIL"d in the family. "0 ... FAM" is the beginning record for a family group. Note that the lines for husband (1 HUSB) and wife (1 WIFE) have pointers to those particular INDIviduals, whose INDI records would of course carry the identifiers @I1@ and @I4@, respectively, so that they can be matched. The link also runs the other way: the "1 FAMC @F2@" line in the individual's record above points back to this family. The use of pointers like this allows relationships to be established between entities of equal importance, namely individuals and families in this case, without repeating information or making one piece of information subservient to another, as would happen if they were grouped by level, e.g., 0, 1, etc.

0 @F2@ FAM
1 HUSB @I1@
1 WIFE @I4@
1 MARR
2 DATE 26 APR 1985
2 PLAC Milpitas, California
1 CHAN
2 DATE 10 APR 2001
2 NOTE 09:32
1 CHIL @I7@
1 CHIL @I8@
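
As a rough sketch of how these pointers get followed, the Python below indexes every level 0 record by its identifier and then looks up each person that the family record points to. The function names are hypothetical, and the names given to @I1@, @I4@ and @I7@ are made-up placeholders, since their real INDI records are not shown on this page; only the @I8@ and @F2@ records come from the fragments above.

```python
def index_records(text):
    """Map each level 0 identifier (e.g. "@I8@") to its type and level 1 lines."""
    records = {}
    current = None
    for line in text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) < 2:
            continue
        if parts[0] == "0":
            current = None
            if len(parts) == 3 and parts[1].startswith("@"):
                current = {"type": parts[2], "fields": []}
                records[parts[1]] = current
        elif parts[0] == "1" and current is not None:
            value = parts[2] if len(parts) == 3 else ""
            current["fields"].append((parts[1], value))
        # Level 2 and deeper lines are ignored in this sketch.
    return records


def family_members(records, family_id):
    """Return (role, name) pairs for everyone a FAM record points to."""
    members = []
    for tag, value in records[family_id]["fields"]:
        if tag in ("HUSB", "WIFE", "CHIL"):
            person = records.get(value)  # follow the pointer
            name = dict(person["fields"]).get("NAME", "?") if person else "?"
            members.append((tag, name))
    return members


# The names for @I1@, @I4@ and @I7@ are invented placeholders.
FAMILY_SAMPLE = """\
0 @I1@ INDI
1 NAME Father /EXAMPLE/
0 @I4@ INDI
1 NAME Mother /EXAMPLE/
0 @I7@ INDI
1 NAME Sibling /EXAMPLE/
0 @I8@ INDI
1 NAME Tyler Christopher /LONGMAN/
1 FAMC @F2@
0 @F2@ FAM
1 HUSB @I1@
1 WIFE @I4@
1 MARR
2 DATE 26 APR 1985
2 PLAC Milpitas, California
1 CHIL @I7@
1 CHIL @I8@
"""

records = index_records(FAMILY_SAMPLE)
for role, name in family_members(records, "@F2@"):
    print(role, name)
```

Notice that neither record is nested inside the other; the family and the individuals all sit at level 0, and only the matching @...@ values tie them together.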

Obviously, this is not a totally thorough explanation, but it should give you an idea of what you would find in a gedcom file. For a detailed look at a particular gedcom file, you can use Microsoft Windows "Wordpad" to open and view the raw data.

Now, lest you think that the Mormon Church's "gedcom" standard makes everything perfect, it doesn't. There are many areas in which the standard is vague, and different genealogy programs can and often do interpret the data slightly differently. This is why some genealogical programs cannot correctly interpret ALL the data created by another, different genealogical program. And not all genealogical programs support all the record types and tags, although such things as INDIvidual, FAMily, SOURce, BIRTh and many others obviously have to be supported by all programs. One example of a tag not supported by some genealogical programs is EMAL, which contains an E-mail address.
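
One way to see this problem coming, sketched below in Python, is to list the tags in a gedcom file that a given program would not understand. The SUPPORTED_TAGS set is purely hypothetical (it does not describe any real program), and the EMAL line in the sample is made up for the illustration.

```python
from collections import Counter

# Hypothetical list of tags that some imaginary program understands.
SUPPORTED_TAGS = {"INDI", "FAM", "NAME", "SEX", "BIRT", "DATE", "PLAC", "NOTE"}


def unsupported_tags(text, supported=SUPPORTED_TAGS):
    """Count every tag in `text` that is not in `supported`."""
    counts = Counter()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        # Level 0 lines may carry an @ID@ between the level and the tag.
        if parts[1].startswith("@") and len(parts) > 2:
            tag = parts[2]
        else:
            tag = parts[1]
        if tag not in supported:
            counts[tag] += 1
    return counts


# The EMAL line and its address are invented for this example.
TAG_SAMPLE = """\
0 @I8@ INDI
1 NAME Tyler Christopher /LONGMAN/
1 EMAL someone@example.com
1 RELI
2 NOTE Catholic
"""

for tag, count in sorted(unsupported_tags(TAG_SAMPLE).items()):
    print(tag, count)
```

A program that reports such tags, rather than silently dropping them, at least lets you know which pieces of information did not make it across.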

Of course, each genealogical program is free to display the data to you in whatever manner it likes. By the same token, each program can collect its data from you, via the keyboard, in any manner it likes, but if a gedcom file is created, then the gedcom standard must be used to organize the resulting data. It's also important to recognize that each genealogical program (such as Generations, or Legacy, or PAF, or Family Origins) can store its data in its own file (database) in any manner that it likes, which they do for programming ease and speed of operation.

The process of "importing" and "exporting" gedcom files is akin to the importing and exporting of goods across a country's borders. Once a genealogy program, such as Legacy or PAF or Generations, has data stored in its database, one may request that the data be "export"ed to a gedcom file. This causes the program to read its own database and write a new file, laid out as defined by the gedcom standard, to your disk. Once this file is created, it may be "import"ed by any other genealogical program, which reads the data from the gedcom file and saves it in the second program's own database. Hence a single gedcom file can be read by a program to create an entire genealogical database. By the same token, one may import any number of gedcom files to create a single database. This might be the case where different parts of the family information are gathered by different people, each one creating (via export) and sending a gedcom file to a single person, who in turn builds a complete database of all the information. And of course, this person could export a gedcom file of the entire family information for distribution back to the contributing individuals.
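
To give a feel for what "export" involves, here is a bare-bones sketch in Python that turns a program's own data (here just a list of dictionaries, a layout I made up) into gedcom lines. Real gedcom files begin with a "0 HEAD" record and end with "0 TRLR"; a real header carries a good deal more than the single line shown, so treat this as an outline only. The @I1@ style identifiers are simply assigned by the exporting code.

```python
def export_gedcom(people):
    """Write a list of person dicts as GEDCOM text (header kept to one line)."""
    lines = ["0 HEAD"]
    for number, person in enumerate(people, start=1):
        lines.append(f"0 @I{number}@ INDI")
        # GEDCOM marks the surname by putting it between slashes.
        lines.append(f"1 NAME {person['given']} /{person['surname']}/")
        if person.get("sex"):
            lines.append(f"1 SEX {person['sex']}")
        birth_date = person.get("birth_date")
        birth_place = person.get("birth_place")
        if birth_date or birth_place:
            lines.append("1 BIRT")
            if birth_date:
                lines.append(f"2 DATE {birth_date}")
            if birth_place:
                lines.append(f"2 PLAC {birth_place}")
    lines.append("0 TRLR")
    return "\n".join(lines) + "\n"


people = [
    {
        "given": "Tyler Christopher",
        "surname": "LONGMAN",
        "sex": "M",
        "birth_date": "28 OCT 1986",
        "birth_place": "Fremont, California",
    }
]

print(export_gedcom(people), end="")
```

Importing is the same journey in reverse: the receiving program reads these lines and stores the information in whatever form its own database uses.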

Generally, the ability to import and export gedcom files is available from the "File" menu of any genealogical program (in the menu bar at the top, e.g., "File", "Edit", "Tools", "Help" and the like).

Hopefully, this gives you a general understanding of what a gedcom file is and isn't, and of how its "standardization" lets genealogy programs both import and export their information via gedcom files.