BusinessCardParser

This program is a business card reader with the following attributes:

How to Set Up

Names are extracted from the document using the NLTK named entity chunker
- The chunker assigns a label such as PERSON, ORGANIZATION, or TIME to each chunk of information
- A reference used for NLTK is NLTK Reference
- NLTK also labels job titles as PERSON chunks
  - A running list of common job keywords is kept, and each PERSON chunk is checked against it before being accepted as a name
  - This list should be extended to cover the job titles expected on the cards; the code includes only a few examples
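
The keyword check described above can be sketched as follows. This is a minimal illustration, not the code in Entegra_Challenge.py: `JOB_KEYWORDS`, `pick_name`, and the `(text, label)` pair format are assumptions made for the example. In practice the labelled chunks would come from something like `nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize(text)))`, which is left out here because it requires the NLTK data packages to be downloaded.

```python
# Hypothetical keyword list; the real list should grow with the job titles seen.
JOB_KEYWORDS = {"engineer", "developer", "manager", "director", "analyst"}


def pick_name(chunks):
    """Return the first PERSON chunk that contains no job keyword, else None.

    `chunks` is a list of (text, label) pairs, a simplified stand-in for the
    labelled chunks produced by the NLTK named entity chunker.
    """
    for text, label in chunks:
        if label != "PERSON":
            continue
        words = {word.lower() for word in text.split()}
        if words & JOB_KEYWORDS:
            continue  # looks like a job title, not a name
        return text
    return None


chunks = [
    ("Senior Software Engineer", "PERSON"),  # job title mislabelled as PERSON
    ("Jane Doe", "PERSON"),
    ("Acme Corp", "ORGANIZATION"),
]
print(pick_name(chunks))  # Jane Doe
```
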
Phone numbers were extracted using regular expressions
- A reference used for phone number regular expressions is Phone Number RegEx
- The regular expression was then customized for this application
- A good tool for testing and visualizing regular expressions is https://www.debuggex.com/
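
As a rough illustration of regex-based phone extraction, the sketch below matches common North American formats and returns digits only. The pattern and the names `PHONE_RE` and `extract_phone` are assumptions for this example; the customized expression used in the project may differ.

```python
import re

# Illustrative pattern: optional country code, then 3-3-4 digits with
# optional parentheses, spaces, dots, or dashes as separators.
PHONE_RE = re.compile(
    r"(?:\+?(\d{1,3})[\s.-]?)?\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})"
)


def extract_phone(text):
    """Return the first phone number in `text` as a digit string, or None."""
    match = PHONE_RE.search(text)
    if match is None:
        return None
    # Join the captured digit groups, skipping the country code if absent.
    return "".join(group for group in match.groups() if group)


print(extract_phone("Tel: (410) 555-1234"))  # 4105551234
print(extract_phone("Phone: +1 410-555-1234"))  # 14105551234
```
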
Email addresses were also extracted using regular expressions
- A reference used for regular expressions for email addresses is Email Address RegEx
- This regular expression was then customized for this application
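
Email extraction can be sketched the same way. The pattern below is a deliberately simple one that covers typical business card addresses, not the full address grammar; `EMAIL_RE` and `extract_email` are hypothetical names, and the project's own expression may differ.

```python
import re

# Illustrative pattern: local part, "@", domain labels, and a
# top-level domain of at least two letters.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_email(text):
    """Return the first email address found in `text`, or None."""
    match = EMAIL_RE.search(text)
    return match.group(0) if match else None


print(extract_email("Email: jane.doe@example.com"))  # jane.doe@example.com
```
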

The function get_contact_info(document) returns an instance of ContactInfo(name, email, phone)
The attributes can then be read through the respective getter methods of the ContactInfo class
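
A minimal sketch of what such a class might look like is shown below. The constructor arguments follow the description above, but the getter names `get_name`, `get_email`, and `get_phone` are assumptions; check Entegra_Challenge.py for the actual method names.

```python
class ContactInfo:
    """Holds the name, email address, and phone number parsed from a card."""

    def __init__(self, name, email, phone):
        self._name = name
        self._email = email
        self._phone = phone

    # Getter names below are assumed for illustration.
    def get_name(self):
        return self._name

    def get_email(self):
        return self._email

    def get_phone(self):
        return self._phone


contact = ContactInfo("Jane Doe", "jane.doe@example.com", "4105551234")
print(contact.get_name())   # Jane Doe
print(contact.get_email())  # jane.doe@example.com
print(contact.get_phone())  # 4105551234
```
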
