Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

is it possible to add country, currency, company as custom datatypes ? #143

Open
seshurajup opened this issue Oct 11, 2020 · 2 comments
Open
Labels
enhancement New feature or request

Comments

@seshurajup
Copy link

How to extend the string datatype to subdomains like country, country code, currency for the finite domain values

@seshurajup seshurajup added the enhancement New feature or request label Oct 11, 2020
@sbrugman
Copy link
Contributor

@seshurajup extending the string data type to country, country code and currency is a great application of visions. (In the future they may even be included by default).

An example for country code (standardized as iso 3166 alpha-2 codes). There are multiple ways of defining your type. The example implementation below tests if all values are within a discrete set of country codes:

def is_len_2(series):
    return (series.str.len() == 2).all() and not series.hasnans


def is_alpha2(series):
    iso_3166_alpha_iso_2_codes = [
        'AF', 'AX', 'AL', 'DZ', 'AS', 'AD', 'AO', 'AI', 'AQ', 'AG', 'AR',
       'AM', 'AW', 'AU', 'AT', 'AZ', 'BS', 'BH', 'BD', 'BB', 'BY', 'BE',
       'BZ', 'BJ', 'BM', 'BT', 'BO', 'BQ', 'BA', 'BW', 'BV', 'BR', 'IO',
       'BN', 'BG', 'BF', 'BI', 'CV', 'KH', 'CM', 'CA', 'KY', 'CF', 'TD',
       'CL', 'CN', 'CX', 'CC', 'CO', 'KM', 'CG', 'CD', 'CK', 'CR', 'CI',
       'HR', 'CU', 'CW', 'CY', 'CZ', 'DK', 'DJ', 'DM', 'DO', 'EC', 'EG',
       'SV', 'GQ', 'ER', 'EE', 'SZ', 'ET', 'FK', 'FO', 'FJ', 'FI', 'FR',
       'GF', 'PF', 'TF', 'GA', 'GM', 'GE', 'DE', 'GH', 'GI', 'GR', 'GL',
       'GD', 'GP', 'GU', 'GT', 'GG', 'GN', 'GW', 'GY', 'HT', 'HM', 'VA',
       'HN', 'HK', 'HU', 'IS', 'IN', 'ID', 'IR', 'IQ', 'IE', 'IM', 'IL',
       'IT', 'JM', 'JP', 'JE', 'JO', 'KZ', 'KE', 'KI', 'KP', 'KR', 'KW',
       'KG', 'LA', 'LV', 'LB', 'LS', 'LR', 'LY', 'LI', 'LT', 'LU', 'MO',
       'MG', 'MW', 'MY', 'MV', 'ML', 'MT', 'MH', 'MQ', 'MR', 'MU', 'YT',
       'MX', 'FM', 'MD', 'MC', 'MN', 'ME', 'MS', 'MA', 'MZ', 'MM', 'NA',
       'NR', 'NP', 'NL', 'NC', 'NZ', 'NI', 'NE', 'NG', 'NU', 'NF', 'MK',
       'MP', 'NO', 'OM', 'PK', 'PW', 'PS', 'PA', 'PG', 'PY', 'PE', 'PH',
       'PN', 'PL', 'PT', 'PR', 'QA', 'RE', 'RO', 'RU', 'RW', 'BL', 'SH',
       'KN', 'LC', 'MF', 'PM', 'VC', 'WS', 'SM', 'ST', 'SA', 'SN', 'RS',
       'SC', 'SL', 'SG', 'SX', 'SK', 'SI', 'SB', 'SO', 'ZA', 'GS', 'SS',
       'ES', 'LK', 'SD', 'SR', 'SJ', 'SE', 'CH', 'SY', 'TW', 'TJ', 'TZ',
       'TH', 'TL', 'TG', 'TK', 'TO', 'TT', 'TN', 'TR', 'TM', 'TC', 'TV',
       'UG', 'UA', 'AE', 'GB', 'US', 'UM', 'UY', 'UZ', 'VU', 'VE', 'VN',
       'VG', 'VI', 'WF', 'EH', 'YE', 'ZM', 'ZW'
    ]
    return series.isin(iso_3166_alpha_iso_2_codes).all()


class CountryCode(VisionsBaseType):
    @classmethod
    def get_relations(cls):
        return [IdentityRelation(cls, String)]
    
    @classmethod
    def contains_op(cls, series, state):
        return is_len_2(series) and is_alpha2(series)

More information is available in the documentation (or comment below).

@seshurajup
Copy link
Author

seshurajup commented Oct 13, 2020

How i can contribute to visions, to incorporate year, country, country code, other fine list of file types as extension of string as you described. https://dylan-profiler.github.io/visions/visions/creator/contributing is broken

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants