A comprehensive reference of locale identifiers that combine language and region codes with their associated formatting conventions for dates, numbers, and currencies. Whether you are building a formatting engine, configuring i18n libraries, or validating user locale preferences, this dataset covers the locales recognized by major platforms.
Pro tip: Import this dataset into your application to look up the correct date and number format for each user's locale, instead of hardcoding format strings for each region.
About the Locale Codes Dataset
This dataset maps locale identifiers to their associated regional formatting conventions. Each entry includes the locale code (a language-region pair such as en-US or fr-FR, written with a hyphen as in BCP 47; POSIX systems write the same pair with an underscore, as in en_US), the language name, the region or country name, the standard date format pattern, the number formatting convention (decimal and thousands separators), and the default currency code. The list covers locales recognized by major platforms including CLDR, ICU, Java, .NET, and POSIX systems, making it a broad reference for any application that needs to format data according to regional conventions.
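As a rough illustration of the entry layout described above, one row could be modeled as a small record type. This is only a sketch: the class name, the field names, and the sample values are assumptions chosen for readability, not the dataset's actual column names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocaleEntry:
    """One dataset row. Every field name here is hypothetical."""
    code: str                 # locale code, e.g. "de-DE"
    language: str             # language name, e.g. "German"
    region: str               # region or country name, e.g. "Germany"
    date_pattern: str         # date format pattern, e.g. "DD.MM.YYYY"
    decimal_separator: str    # e.g. ","
    thousands_separator: str  # e.g. "."
    currency: str             # default currency code, e.g. "EUR"


# Illustrative values only, not copied from the dataset.
example = LocaleEntry(
    code="de-DE",
    language="German",
    region="Germany",
    date_pattern="DD.MM.YYYY",
    decimal_separator=",",
    thousands_separator=".",
    currency="EUR",
)
```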
Common Use Cases
Locale data is critical for any application that serves an international audience:
- Date formatting: Look up the correct date pattern for any locale to render dates in the format users expect, whether it is MM/DD/YYYY for the United States, DD/MM/YYYY for much of Europe, or a year-first order such as YYYY-MM-DD in parts of East Asia.
- Number formatting: Apply the correct decimal separator and thousands grouping character based on locale. Users in Germany expect 1.234,56 while users in the United States expect 1,234.56 for the same number.
- Currency display: Pair the locale with its default currency code to format monetary values correctly, including symbol placement, spacing, and decimal precision.
- Form validation: Use locale-specific format patterns to validate user input for dates and numeric fields without rejecting valid regional formats.
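The formatting and validation cases above can be sketched with a few small helpers. This is a minimal sketch under stated assumptions: the function names are hypothetical, the separators and patterns are passed in by hand rather than read from the dataset, and date patterns are assumed to use only the tokens YYYY, MM, and DD.

```python
from datetime import date


def format_number(value, decimal_sep, thousands_sep, places=2):
    """Format value with the given separators and a fixed number of decimals."""
    # Start from the "1,234.56" shape, then swap in the locale separators.
    text = f"{value:,.{places}f}"
    # A placeholder keeps the two replacements from clobbering each other.
    return (text.replace(",", "\x00")
                .replace(".", decimal_sep)
                .replace("\x00", thousands_sep))


def parse_number(text, decimal_sep, thousands_sep):
    """Parse locale-formatted user input; raises ValueError if it is not numeric."""
    # Lenient on purpose: grouping positions are not checked.
    return float(text.replace(thousands_sep, "").replace(decimal_sep, "."))


def format_date(d, pattern):
    """Render a date using a pattern built from the tokens YYYY, MM and DD."""
    return (pattern.replace("YYYY", f"{d.year:04d}")
                   .replace("MM", f"{d.month:02d}")
                   .replace("DD", f"{d.day:02d}"))


german = format_number(1234.56, ",", ".")              # "1.234,56"
american = format_number(1234.56, ".", ",")            # "1,234.56"
european = format_date(date(2024, 3, 5), "DD/MM/YYYY")  # "05/03/2024"
```

Currency display is left out here because symbol placement and decimal precision vary by currency and are better delegated to a full formatting library.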
Locale Code Structure
A locale code typically combines a language code with a region code, separated by a hyphen (the BCP 47 form, en-US) or an underscore (the POSIX form, en_US). The language portion usually follows ISO 639-1 (two-letter codes like en, fr, de), with three-letter ISO 639 codes used for languages that have no two-letter code, while the region portion follows ISO 3166-1 alpha-2 (two-letter country codes like US, FR, DE). Some locales include additional subtags for script (such as zh-Hans for Simplified Chinese versus zh-Hant for Traditional Chinese) or variant (such as ca-ES-valencia for Valencian Catalan). This dataset normalizes locale codes to the most commonly used format across web and software platforms, for compatibility with JavaScript's Intl API, PHP's locale functions, and database collation settings.
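The structure described above can be sketched as a small parser that splits a code into its subtags and rewrites it in hyphenated form. This is a simplified illustration, not a full BCP 47 parser: the names parse_locale, normalize, and LocaleParts are hypothetical, and extensions, private-use subtags, and grandfathered tags are not handled.

```python
from typing import NamedTuple, Optional, Tuple


class LocaleParts(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]
    variants: Tuple[str, ...]


def parse_locale(code: str) -> LocaleParts:
    """Split a locale code into language, script, region and variant subtags."""
    # Drop POSIX suffixes such as ".UTF-8" or "@euro", then unify separators.
    base = code.split(".")[0].split("@")[0]
    subtags = base.replace("_", "-").split("-")
    language = subtags[0].lower()
    script = region = None
    variants = []
    for tag in subtags[1:]:
        before_region = region is None and not variants
        if before_region and script is None and len(tag) == 4 and tag.isalpha():
            script = tag.title()   # four letters: script, e.g. "Hans"
        elif before_region and (
            (len(tag) == 2 and tag.isalpha()) or (len(tag) == 3 and tag.isdigit())
        ):
            region = tag.upper()   # two letters or three digits: region
        else:
            variants.append(tag.lower())
    return LocaleParts(language, script, region, tuple(variants))


def normalize(code: str) -> str:
    """Rebuild the code in hyphenated form with conventional casing."""
    parts = parse_locale(code)
    pieces = [parts.language, parts.script, parts.region, *parts.variants]
    return "-".join(p for p in pieces if p)
```

For instance, normalize("en_US.UTF-8") yields "en-US", and parse_locale("ca-ES-valencia") reports "valencia" as a variant.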
How to Use in Your Application
Download the JSON format to build a locale configuration service that returns the correct formatting rules for any user preference. The locale codes can be passed to APIs such as Intl.NumberFormat, and the pattern fields can be adapted to the format strings used by libraries like Moment.js and date-fns. For server-side applications, the SQL export creates a lookup table you can query to format values before rendering templates. The CSV format is ideal for importing into spreadsheets for translation planning or for auditing which locales your application currently supports versus which ones are missing.
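A locale configuration lookup of the kind described above might look like the following. This is a sketch under assumptions: the JSON key names ("locale", "date_format", "decimal", "thousands", "currency") and the two sample rows are placeholders, so adjust them to match the actual export.

```python
import json

# Stand-in for the downloaded JSON file. The key names are assumptions.
SAMPLE_JSON = """
[
  {"locale": "en-US", "date_format": "MM/DD/YYYY",
   "decimal": ".", "thousands": ",", "currency": "USD"},
  {"locale": "de-DE", "date_format": "DD.MM.YYYY",
   "decimal": ",", "thousands": ".", "currency": "EUR"}
]
"""


def build_index(raw_json):
    """Index dataset rows by lowercased locale code for case-insensitive lookup."""
    return {row["locale"].lower(): row for row in json.loads(raw_json)}


def formatting_rules(index, locale_code, default="en-US"):
    """Return the row for locale_code, or the default locale's row if absent."""
    key = locale_code.replace("_", "-").lower()
    return index.get(key, index[default.lower()])


index = build_index(SAMPLE_JSON)
rules = formatting_rules(index, "de_DE")
```

Building the index once at startup keeps each lookup to a single dictionary access.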
Handling Locale Fallback
Applications should implement a locale fallback strategy for cases where the user's preferred locale is not fully supported. A common approach is to strip the region subtag and fall back to the base language locale. For example, if your application does not have specific formatting rules for en-AU (English, Australia), it can fall back to en (English) as the base. An application may also try a closer regional match first, such as en-GB (English, United Kingdom), before falling back to the base language. This dataset provides the building blocks for constructing such fallback chains by grouping locales under their parent language code, allowing developers to build graceful degradation into their formatting logic.
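One way to sketch such a fallback chain is to try the exact code, then any hand-picked regional siblings, then progressively shorter prefixes, and finally a default. The function names, the siblings mapping, and the choice of "en" as the default are all assumptions made for illustration.

```python
def fallback_chain(code, siblings=None, default="en"):
    """List candidate locales from most to least specific, ending at default."""
    subtags = code.replace("_", "-").split("-")
    exact = "-".join(subtags)
    chain = [exact]
    # Optional hand-picked close matches, e.g. {"en-AU": ["en-GB"]}.
    chain += (siblings or {}).get(exact, [])
    # Strip one trailing subtag at a time: zh-Hant-TW, zh-Hant, zh.
    chain += ["-".join(subtags[:n]) for n in range(len(subtags) - 1, 0, -1)]
    if default not in chain:
        chain.append(default)
    return chain


def resolve(code, supported, siblings=None, default="en"):
    """Return the first supported locale in the fallback chain, else None."""
    for candidate in fallback_chain(code, siblings, default):
        if candidate in supported:
            return candidate
    return None


chain = fallback_chain("en-AU", siblings={"en-AU": ["en-GB"]})
```

Keeping the sibling preferences in a separate mapping makes regional choices like en-AU to en-GB explicit rather than buried in the truncation logic.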