14.3.2.  Regular Expressions: Phone Number Recognition


The Problem

Example 14.5. src/regexp/testphone.txt

src/regexp> ./testphone
Enter a phone number (or q to quit): 16175738000
 validated: (US/Canada) +1 617-573-8000
Enter a phone number (or q to quit): 680111111111
 validated: (Palau) + 680 (0)11-11-11-111
Enter a phone number (or q to quit): 777888888888
 validated: (Unknown - but possibly valid) + 777 (0)88-88-88-888
Enter a phone number (or q to quit): 86333333333
 validated: (China) + 86 (0)33-33-33-333
Enter a phone number (or q to quit): 962444444444
 validated: (Jordan) + 962 (0)44-44-44-444
Enter a phone number (or q to quit): 56777777777
 validated: (Chile) + 56 (0)77-77-77-777
Enter a phone number (or q to quit): 351666666666
 validated: (Portugal) + 351 (0)66-66-66-666
Enter a phone number (or q to quit): 31888888888
 validated: (Netherlands) + 31 (0)88-88-88-888
Enter a phone number (or q to quit): 20398478
Unknown format
Enter a phone number (or q to quit): 2828282828282
Unknown format
Enter a phone number (or q to quit): q
src/regexp>

Example 14.6. src/regexp/testphoneread.cpp

[ . . . . ]
QRegExp filtercharacters ("[\\s+()\\-]");        1

QRegExp usformat                                 2
("(\\+?1[- ]?)?\\(?(\\d{3})\\)?[\\s-]?(\\d{3})[\\s-]?(\\d{4})");

QRegExp genformat
("(00)?([3-9]\\d{1,2})(\\d{2})(\\d{7})$");       3

QRegExp genformat2
("(\\d\\d)(\\d\\d)(\\d{3})");                    4


QString countryName(QString ccode) {
   if(ccode == "31") return "Netherlands";
   else if(ccode == "351") return "Portugal";
[ . . . . ]
   // Add more codes as needed ...
   else return "Unknown - but possibly valid";
}

QString stdinReadPhone() {                       5
   QString str;
   bool knownFormat=false;
   do {                                          6
      cout << "Enter a phone number (or q to quit): ";
      cout.flush();
      str = cin.readLine();
      if (str=="q")
         return str;
      str.remove(filtercharacters);              7
      if (genformat.exactMatch(str)) {
         QString country = genformat.cap(2);
         QString citycode = genformat.cap(3);
         QString rest = genformat.cap(4);
         if (genformat2.exactMatch(rest)) {
            knownFormat = true;
            QString number = QString("%1-%2-%3")
                               .arg(genformat2.cap(1))
                               .arg(genformat2.cap(2))
                               .arg(genformat2.cap(3));
            str = QString("(%1) + %2 (0)%3-%4").arg(countryName(country))
                    .arg(country).arg(citycode).arg(number);
        }
     }
[ . . . . ]
     if (not knownFormat) {
        cout << "Unknown format" << endl;
     }
  } while (not knownFormat) ;
  return str;
}

int main() {
    QString str;
    do {
        str =  stdinReadPhone();
        if (str != "q")
            cout << " validated: " << str << endl;
    } while (str != "q");
    return 0;
}
[ . . . . ]

1

Remove these characters from the string that the user supplies.

2

All U.S. format numbers have country code 1 and 3 + 3 + 4 = 10 digits. Whitespace, dashes, and parentheses between these digit groups are ignored, but they help make the digit groups recognizable.

3

Landline country codes in Europe begin with 3 or 4, Latin America with 5, Southeast Asia and Oceania with 6, East Asia with 8, and Central, South, and Western Asia with 9. Country codes may be 2 or 3 digits long, and the expression also accepts an optional leading 00. Counting the country code, a number is expected to have (2 or 3) + 2 + 7 = 11 or 12 digits: the country code, a 2-digit city code, and 7 remaining digits. This program does not attempt to interpret city codes.

4

The last 7 digits will be arranged as 2 + 2 + 3.

5

Ensures the user-entered phone string complies with a regular expression, and extracts the proper components from it. Returns a properly formatted phone string.

6

Keep asking until you get a valid number.

7

Remove all dashes, spaces, parens, and so on.
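
The steps described in the callouts can be condensed into a minimal sketch that builds without Qt. Several things here are assumptions rather than part of the original listing: std::regex stands in for QRegExp, the function names (stripSeparators, formatUs, formatGeneral, normalizePhone) are hypothetical, the U.S. pattern is a simplified one that assumes separators have already been removed, and the country-name lookup is left out.

```cpp
#include <cassert>  // only needed if you add assert-based spot checks
#include <regex>
#include <string>

// Hypothetical helper: remove whitespace, dashes, plus signs, and parentheses,
// standing in for str.remove(filtercharacters) in the listing.
std::string stripSeparators(const std::string& raw) {
    static const std::regex separators("\\s|-|[+()]");
    return std::regex_replace(raw, separators, "");
}

// Simplified U.S./Canada check on separator-free input: optional leading 1,
// then 3 + 3 + 4 digits. Returns an empty string when there is no match.
std::string formatUs(const std::string& digits) {
    static const std::regex us("1?(\\d{3})(\\d{3})(\\d{4})");
    std::smatch m;
    if (!std::regex_match(digits, m, us))
        return "";
    return "+1 " + m[1].str() + "-" + m[2].str() + "-" + m[3].str();
}

// General form: optional 00, country code (2 or 3 digits, first digit 3-9),
// 2-digit city code, then 7 digits regrouped as 2 + 2 + 3.
// Returns an empty string when there is no match.
std::string formatGeneral(const std::string& digits) {
    static const std::regex general("(00)?([3-9]\\d{1,2})(\\d{2})(\\d{7})");
    static const std::regex lastSeven("(\\d\\d)(\\d\\d)(\\d{3})");
    std::smatch m;
    if (!std::regex_match(digits, m, general))
        return "";
    const std::string rest = m[4].str();  // named copy: smatch needs an lvalue
    std::smatch n;
    if (!std::regex_match(rest, n, lastSeven))
        return "";
    return "+ " + m[2].str() + " (0)" + m[3].str() + "-" +
           n[1].str() + "-" + n[2].str() + "-" + n[3].str();
}

// Strip separators, then try the U.S. form before the general form.
std::string normalizePhone(const std::string& raw) {
    const std::string digits = stripSeparators(raw);
    const std::string us = formatUs(digits);
    return us.empty() ? formatGeneral(digits) : us;
}
```

With these definitions, normalizePhone("31888888888") yields "+ 31 (0)88-88-88-888", which is the Netherlands line of Example 14.5 without the country name. std::regex_match requires the whole string to match, much as exactMatch() does in the listing. An empty result plays the role of the "Unknown format" branch.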




[62] The phone number situation in Europe is quite complex, and specialists have been working for years to develop a system that would work for, and be acceptable to, all EU members. You can get an idea of what is involved by visiting this Wikipedia page.