de id regex improvements
laxmirad edited this page Sep 21, 2018 · 5 revisions
- Combine x_year_old and x_years_old into one regex, for example: \b(?i)([9][0-9]|[1][0-9][0-9])(?=\syear old|\syr old|\syears old)\b
- Fold x_yo into the regex above, escaping the dots in "y.o." so they match only literal periods: \b(?i)([9][0-9]|[1][0-9][0-9])(?=\syear old|\syr old|\syears old|\sy\.o\.|\syo\b)\b
- city_zip.txt could be combined with the city_state.zip
- Hospital1.txt and hospital2.txt can be combined.
- num_streetname*.txt - There are a number of regexes that capture street names with numbers in them. They seem similar to at_street_number_dash_street.txt and could potentially be combined with it.
- num_streetname_city_state.txt can be combined with num_streetname_extension?.txt
- num_street_san.txt is redundant; it can be added to one of the regexes mentioned above. It is also case-sensitive: it only looks for San or SAN and would not capture san or SAn.
- room_#.txt can be combined with box_room.txt
- street_&_street and street_and_street can be combined
- Too many regexes capture the floor number. They can be combined.
- xxx_xxx_CCCC.txt and xxx_xxx_xxxx.txt can be combined
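
The combined age regex proposed in the first two items can be sketched in Python as follows. This is only an illustration under a few assumptions: the inline `(?i)` is replaced by `re.IGNORECASE` (Python 3.11 and later reject a global inline flag that is not at the start of the pattern), the dots in "y.o." are escaped, and the names `AGE_OVER_89` and `find_ages` are hypothetical, not taken from the existing pattern files.

```python
import re

# Hypothetical combined pattern covering x_year_old, x_years_old and x_yo.
# Matches ages 90-199 only when followed by an age suffix.
# Assumptions: re.IGNORECASE stands in for the inline (?i), and the
# dots in "y.o." are escaped so they match literal periods only.
AGE_OVER_89 = re.compile(
    r"\b(9[0-9]|1[0-9][0-9])(?=\s(?:years? old|yr old|y\.o\.|yo\b))",
    re.IGNORECASE,
)


def find_ages(text):
    """Return every age of 90 or above that carries an age suffix."""
    return [m.group(1) for m in AGE_OVER_89.finditer(text)]


if __name__ == "__main__":
    for sample in (
        "Patient is 92 years old",
        "a 101 yo male",
        "95 Y.O. female",
        "45 years old",
    ):
        print(sample, "->", find_ages(sample))
```

Passing the flag once at compile time also addresses the case-sensitivity problem noted for num_street_san.txt: a single case-insensitive pattern matches San, SAN, san and SAn alike.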