Understanding Soundex - BRANDL Surname Genealogy Resources

BRANDL Surname and Family Research Resources Site

BRANDL SURNAME HOME

Our site has 1880 and 1900 Soundex census data for selected US states, and several other databases on the web offer a Soundex search option. The EIDB (Ellis Island Database) and JewishGen's ShtetlSeeker (Daitch-Mokotoff Soundex) are well-known examples. Some people may wonder what 'Soundex' means and how it works. Here is a thorough explanation of the American Soundex method that also describes several of its shortcomings. So if your search for an ancestor was in vain, he may still be there, hidden behind the peculiarities of that indexing method.

The following article is taken from LEGACY NEWS, January 29, 2003, and is republished here with Millennia's permission. Copyright 2003 by Millennia Corp.


Tips from the Experts
Understanding the Limitations of Soundex

The Soundex is a surname indexing system for the 1880, 1900, 1910, and 1920 United States censuses. New York passenger arrivals after 1910 and other records are also indexed with the Soundex system. Legacy [Note: Legacy's Genealogical Software] contains within it a Soundex calculator. [.....]

The Soundex is a phonetically coded surname index based on the way a surname sounds rather than the way it is spelled. Same-sounding surnames, like SMITH, SMITHE, SCHMIDT and SMYTH, are coded identically and are filed together. This helps the researcher find names that sound alike but are spelled differently. The Soundex is not a perfect system, and there are variations in how a surname can be coded. This is critical to know when you don't find the person you expected in the Soundex.

The American Soundex System code consists of the first letter of the name followed by three digits. These three digits are determined by dropping the letters a, e, i, o, u, h, w and y and coding the remaining letters of the name according to the table below. There are only two additional rules. (1) If two or more consecutive letters have the same code, they are coded as one letter. (2) If there is an insufficient number of letters to make the three digits, the remaining digits are set to zero.

Soundex Table

1. b, f, p, v
2. c, g, j, k, q, s, x, z
3. d, t
4. l
5. m, n
6. r

Examples:

Miller M460
Peterson P362
Peters P362
Auerbach A612
Uhrbach U612
Moskowitz M232
Moskovitz M213
Ashcroft A226 (A261 if the H is not treated as a separator; see Soundex Limitations)
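
The rules and table above can be sketched in Python. This is a minimal illustration, not the Legacy calculator: the names GROUPS, CODES and soundex are made up for this example. It assumes that letters count as consecutive when they touch in the original spelling or are separated only by H or W, while a vowel or Y between them keeps both.

```python
import unicodedata

# Letter groups from the Soundex table; the position in the list gives the digit.
GROUPS = ["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"]
CODES = {ch: str(i) for i, grp in enumerate(GROUPS, start=1) for ch in grp}


def soundex(name):
    """Return a four-character code such as 'M460' ('' if the name has no letters)."""
    # Separate accents from their letters, then keep only the letters A-Z.
    plain = unicodedata.normalize("NFKD", name).upper()
    letters = [c for c in plain if "A" <= c <= "Z"]
    if not letters:
        return ""

    digits = []
    prev = CODES.get(letters[0], "")  # the first letter absorbs a same-coded neighbor
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code:
            if code != prev:          # rule 1: a run of one code counts once
                digits.append(code)
            prev = code
        elif c not in "HW":           # vowels and Y end a run; H and W do not (assumption)
            prev = ""
    # Rule 2: pad with zeros, then keep the letter plus three digits.
    return (letters[0] + "".join(digits) + "000")[:4]


for surname in ["Miller", "Peterson", "Peters", "Auerbach", "Uhrbach", "Moskovitz"]:
    print(surname, soundex(surname))
```

Each line of output pairs a surname with its code, so the result can be compared with the examples listed above.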

To Calculate a Soundex Code by Hand:

1. Print name on a piece of paper.
2. Cross out spaces, punctuation, accents and other marks.
3. Cross out the second letter of adjacent duplicate characters.
4. Cross out any letter that directly follows a letter with the same Soundex number (in Jackson, both the K and the S after the C).
5. Cross out any of the following characters A, E, I, O, U, H, W, Y (unless first letter of surname).
6. Convert characters in positions 2 to 4 to a number.

B, P, F, V = 1
C, S, K, G, J, Q, X, Z = 2
D, T = 3
L = 4
M, N = 5
R = 6

7. Fill any unused positions with zeros, e.g., Lee is L000 and Bailey is B400. There is always one letter followed by three digits.
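
To check a hand calculation, it can help to see what happens to each letter. The sketch below prints one line per letter and then the finished code. The names TABLE, NUMBER and trace are invented for this illustration, and it assumes that a vowel or Y between two letters with the same number keeps both, while H and W do not.

```python
import unicodedata

# The conversion table from step 6, keyed by letter group.
TABLE = {"BPFV": "1", "CSKGJQXZ": "2", "DT": "3", "L": "4", "MN": "5", "R": "6"}
NUMBER = {ch: num for group, num in TABLE.items() for ch in group}


def trace(name):
    """Print the fate of each letter and return the finished code."""
    # Spaces, punctuation and accent marks go; only the letters A-Z remain.
    plain = unicodedata.normalize("NFKD", name).upper()
    letters = [c for c in plain if "A" <= c <= "Z"]
    if not letters:
        return ""

    print(letters[0], "first letter, always kept")
    digits = []
    prev = NUMBER.get(letters[0], "")
    for c in letters[1:]:
        num = NUMBER.get(c, "")
        if not num:
            print(c, "crossed out (vowel, H, W or Y)")
            if c not in "HW":       # assumption: H and W do not break up a run
                prev = ""
        elif num == prev:
            print(c, "crossed out (repeats the previous number)")
        else:
            print(c, "becomes", num)
            digits.append(num)
            prev = num

    # Pad with zeros; only the first letter and three digits are kept.
    code = (letters[0] + "".join(digits) + "000")[:4]
    print("code:", code)
    return code


trace("Bailey")
```

Calling trace with any other surname prints the same kind of listing, which makes it easy to spot where a hand calculation went a different way.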

Soundex Limitations:

· Names that sound alike do not always have the same Soundex code. For example, Lee (L000) and Leigh (L200) are pronounced identically, but have different Soundex codes because the silent g in Leigh is given a code.

· Names that sound alike but start with a different first letter will always have a different Soundex code. Thus, names such as Carr (C600) and Karr (K600) should be calculated separately.

· Soundex is based on English pronunciation, so European names may not be soundexed correctly. For example, some French surnames with silent last letters will not code according to pronunciation. This is true of French names such as Beaux, where the x is silent. Sometimes this surname is also spelled Beau (B000) and is pronounced identically to Beaux (B200), yet the two have different Soundex codes. Although I have given only a French example, this could be true of any name that does not use English pronunciation.

· Sometimes names that don't sound alike have the same Soundex code. When I am searching for the surname Powers (P620), I have to wade through Pierce, Price, Perez and Park which all have the same Soundex code. Yet Power (P600), a common way to spell Powers 100 years ago, has a different Soundex code.

· Surnames with prefixes were usually coded without the prefix, but not always. If you are searching for a surname such as DiCaprio or LaBianca, you should try the Soundex both with and without the prefix.

· US Census Soundex confusion arises with names such as Ashcraft. If the original Soundex coder neither coded the H nor treated it as a separator between S and C, which share the same code, then S and C counted as adjacent letters to be coded only once, and the Soundex is A261. In the 1920 NY Census, Ashcraft is found under A261.

Those who coded the Soundex for the United States censuses may or may not have used this rule. They sometimes treated the H as a separator and gave a number code to both the S and the C instead of coding them as one letter. When coding a name like Ashcraft, the Soundex calculator in Legacy recognizes these variations in approach and displays both A226 and A261.

The important thing to know is that the US Census was not consistent in using the letters H and W as separators between adjacent letters. If you are trying to calculate the Soundex for a name in which H or W separates two letters with the same code, it is best to calculate the Soundex using both methods to locate the name in the US census. This would be true of any name that has any of the letters C, S, G, J, K, Q, X, Z on both sides of the letter H or W, such as SHC, SHS, CHS, KHZ, SWS, KWS, CWK.

· A surname of more than one word, or a surname that commonly comes before a given name, such as Native American and Chinese surnames, may have been coded under the name which appears last, even though it might not be the actual surname. In the case of multi-word surnames, only the last word may have been coded.
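
Two of the points above, prefixes and the inconsistent handling of H and W, both come down to checking more than one code. The sketch below lists the candidates for a surname. It is only an illustration under stated assumptions: search_codes, soundex and the hw_separates switch are invented names, and the prefix list is a short, hypothetical sample that matches only when a capital letter follows the prefix.

```python
import re
import unicodedata

GROUPS = ["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"]
CODES = {ch: str(i) for i, grp in enumerate(GROUPS, start=1) for ch in grp}

# Hypothetical, incomplete prefix list. A capital letter must follow,
# so "LaBianca" loses its prefix but "Lambert" is left alone.
PREFIX = re.compile(r"^(?:Di|De|Del|Du|La|Le|Van|Von)[\s'-]*(?=[A-Z])")


def soundex(name, hw_separates=False):
    """Code a name; hw_separates=True codes letters on both sides of H or W."""
    plain = unicodedata.normalize("NFKD", name).upper()
    letters = [c for c in plain if "A" <= c <= "Z"]
    if not letters:
        return ""
    digits, prev = [], CODES.get(letters[0], "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code:
            if code != prev:
                digits.append(code)
            prev = code
        elif hw_separates or c not in "HW":
            prev = ""  # this letter ends the run of same-coded letters
    return (letters[0] + "".join(digits) + "000")[:4]


def search_codes(surname):
    """List every code worth checking for one surname, without repeats."""
    spellings = [surname]
    stripped = PREFIX.sub("", surname)
    if stripped != surname:
        spellings.append(stripped)
    found = []
    for spelling in spellings:
        for flag in (False, True):
            code = soundex(spelling, hw_separates=flag)
            if code and code not in found:
                found.append(code)
    return found


for surname in ["Ashcraft", "DiCaprio", "LaBianca", "Powers"]:
    print(surname, search_codes(surname))
```

The helper does not cover the other limitations: alternate first letters (Carr and Karr), silent letters (Lee and Leigh) and multi-word surnames still have to be tried by hand.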

The National Archives offers a free brochure titled "Using the Census Soundex," General Information Leaflet 55 (Washington, DC: National Archives and Records Administration, 1995). The brochure is available by sending a message to [email protected] (include your name, postal address, and "GIL 55 please").

Here are three recommended articles on the subject:

National Archives: The Soundex Indexing System http://www.archives.gov/research_room/genealogy/census/soundex.html

Kathi Reid: Surname to Soundex Converter http://www.geocities.com/Heartland/Hills/3916/soundex.html

Gary Mokotoff: Soundexing and Genealogy http://www.avotaynu.com/soundex.html


Anybody who wants to suggest additions or corrections, or who has comments or other hints concerning this page, is welcome to send an email to freep@newbrandl.com


Copyright Notice: All files on this site are copyrighted by their authors. They may not be reproduced on another site without specific written permission from the administrator and/or the author. Public data as presented here are not copyrightable, but their format, form of compilation and layout, notes and comments etc., are. It is, however, permissible to print or save the contents to a personal computer for personal use only, or to link to this page from other websites. Copyright © 2002