What Will I Learn?
- The impact of locale on reading files.
- How to resolve conflicts in locale settings.
- A C++ class (related to locale) that can be used directly.
Requirements
- Basic C++
- Basic use of C++ classes
- Basic knowledge of reading XML files
Difficulty
- Intermediate
Tutorial Contents
Locale
In C/C++ programs, the locale (the system's national or regional settings) determines the current character encoding, date format, number format, and other region-dependent conventions. Whether the locale is set correctly affects string handling in a program (how is a wchar_t printed? what format does strftime() produce? and so on). Every program should therefore handle its locale setting with care.
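As a rough illustration of how a program inspects and changes these settings, the sketch below uses the standard `std::setlocale` function. The helper names `current_locale` and `use_c_numeric` are made up for this example; the category constants such as `LC_NUMERIC` come from `<clocale>`.

```cpp
#include <cassert>
#include <clocale>
#include <string>

// Returns the name of the locale currently active for one category.
// Passing nullptr as the second argument only queries; it changes nothing.
std::string current_locale(int category) {
    const char* name = std::setlocale(category, nullptr);
    return name ? std::string(name) : std::string();
}

// Switches only the numeric category to the classic "C" locale and reports
// whether the call succeeded. Other categories, such as LC_TIME (dates) and
// LC_CTYPE (character handling), are left as they were.
bool use_c_numeric() {
    return std::setlocale(LC_NUMERIC, "C") != nullptr;
}
```

Because the locale is split into categories, a program can in principle keep localized dates and text while forcing "." as the decimal separator for numbers.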
A problem
Some time ago I ran into a bug: the data I read from an XML file was wrong whenever the computer's region was set to Sweden. The cause was that, in order to display text correctly, the software sets its locale at startup to match the computer's settings, in this case Swedish. In the Swedish locale the decimal separator is "," rather than ".", so floating-point numbers read from the XML file were parsed incorrectly.
The software has to initialize its locale at startup, and it also has to read XML files while it is running. If this conflict is not resolved, the data read from those files is likely to be wrong.
// set up locale
_tsetlocale(LC_ALL, TEXT(""));
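The effect can be reproduced without installing a Swedish locale. The sketch below is a simulation under stated assumptions, not the code from the original bug: `CommaDecimal`, `parse_with`, `parse_classic`, and `parse_comma` are hypothetical names, and a custom `std::numpunct` facet on a string stream stands in for the Swedish numeric rules.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical stand-in for Swedish-style numeric rules: ',' is the decimal
// separator. A custom facet avoids depending on installed system locales.
struct CommaDecimal : std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
};

// Parses one double from text using the numeric rules of the given locale.
double parse_with(const std::string& text, const std::locale& loc) {
    std::istringstream in(text);
    in.imbue(loc);
    double value = 0.0;
    in >> value;
    return value;
}

// Classic "C" rules: '.' is the decimal separator.
double parse_classic(const std::string& text) {
    return parse_with(text, std::locale::classic());
}

// Simulated comma-decimal rules; the locale takes ownership of the facet.
double parse_comma(const std::string& text) {
    return parse_with(text, std::locale(std::locale::classic(), new CommaDecimal));
}
```

With these helpers, `parse_classic("3.14")` yields 3.14, while `parse_comma("3.14")` stops at the unexpected "." and yields 3. That silent loss of the fractional part is the kind of error described above.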
Solution
The idea: temporarily switch the locale to English while reading the file, so that the data is parsed correctly. Resetting the locale by hand around every read would make the code very repetitive, so the solution is to write a class in a shared place that temporarily modifies the locale. The code is as follows:
#ifndef IC_LOCALE_ENV_H
#define IC_LOCALE_ENV_H
#include <locale.h>
#include <stdlib.h>
#include <tchar.h>
#include <mbctype.h>
#include "Language.h"

// Temporarily changes the CRT locale (and optionally the thread locale) and
// restores the original settings in the destructor.
// BOOL, LCID, and ASSERT are expected to come from the project's Windows/MFC headers.
class CEyeLocaleEnv
{
public:
    // bSetThread:      also set the thread locale to the user default LCID.
    // bLocAsEngUS1252: use a locale that does not depend on localization:
    //                  "C" if bSysDefault is true, otherwise
    //                  "English_United States.1252".
    //                  If false, use the user default locale.
    CEyeLocaleEnv(BOOL bSetThread,
        bool bLocAsEngUS1252 = false, bool bSysDefault = true)
    {
        m_OrgLocale = NULL;
        m_bSetThread = FALSE;
        m_OrgLCID = 0;
#ifdef _MSC_VER
        m_bSetThread = bSetThread;
        // Save the original CRT locale
        m_OrgLocale = _tcsdup( _tsetlocale( LC_ALL, NULL ) );
        LCID Locale = GetUserDefaultLCID();
        if( m_bSetThread )
        {
            // Save the original thread locale
            m_OrgLCID = GetThreadLocale();
            // Set the thread locale to the user default LCID
            SetThreadLocale( Locale );
        }
        if(bLocAsEngUS1252)
        {
            if(!bSysDefault)
                _tsetlocale( LC_ALL, _T("English_United States.1252") );
            else
                _tsetlocale( LC_ALL, _T("C") );
        }
        else
        {
            // Build "Language_Country.CodePage" for the user default locale
            TCHAR cpname[32];
            int nbytes = 0;
            GetLocaleInfo (Locale, LOCALE_SENGLANGUAGE, cpname, 32);
            TCHAR cpchain1[128];
            nbytes = GetLocaleInfo (Locale, LOCALE_SENGCOUNTRY , cpchain1, 128);
            TCHAR cpchain[7];
            nbytes = GetLocaleInfo (Locale, LOCALE_IDEFAULTANSICODEPAGE, cpchain, 7);
            TCHAR NewLocal[256];
            _stprintf_s( NewLocal, 256, _T("%s_%s.%s"),cpname,cpchain1,cpchain);
            _tsetlocale( LC_ALL, NewLocal );
        }
#endif
    }
    ~CEyeLocaleEnv()
    {
#ifdef _MSC_VER
        if( m_OrgLocale )
        {
            // Restore the original CRT locale
            _tsetlocale( LC_ALL, m_OrgLocale );
            ASSERT( _tcscmp(m_OrgLocale, _tsetlocale(LC_ALL,NULL) ) == 0 );
            free(m_OrgLocale);
        }
        if( m_bSetThread )
        {
            // Restore the original thread locale
            SetThreadLocale( m_OrgLCID );
            ASSERT( m_OrgLCID == GetThreadLocale() );
        }
#endif
    }
protected:
    LPTSTR m_OrgLocale;
    BOOL m_bSetThread;
    LCID m_OrgLCID;
};
#endif // IC_LOCALE_ENV_H
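While reading the file, you only need to construct a temporary variable of this class. Its constructor switches the locale, and its destructor restores the original settings when the variable goes out of scope.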
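The original listing for this step is not shown, so here is a minimal portable sketch of the same pattern rather than the author's exact call. `ScopedCLocale` and `read_number` are hypothetical names; the guard uses only the standard `std::setlocale` and always switches to the "C" locale, which roughly corresponds to constructing `CEyeLocaleEnv` with `bLocAsEngUS1252 = true` and `bSysDefault = true`.

```cpp
#include <cassert>
#include <clocale>
#include <cstdlib>
#include <string>

// Hypothetical, portable stand-in for CEyeLocaleEnv: forces the "C" locale
// for its lifetime and restores the previous locale on destruction.
class ScopedCLocale {
public:
    ScopedCLocale() {
        // Copy the name first: the pointer returned by setlocale may be
        // invalidated by the next setlocale call.
        if (const char* current = std::setlocale(LC_ALL, nullptr)) {
            saved_ = current;
            has_saved_ = true;
        }
        std::setlocale(LC_ALL, "C");
    }
    ~ScopedCLocale() {
        if (has_saved_) std::setlocale(LC_ALL, saved_.c_str());
    }
    // Not copyable: two guards must not restore the same saved state.
    ScopedCLocale(const ScopedCLocale&) = delete;
    ScopedCLocale& operator=(const ScopedCLocale&) = delete;

private:
    std::string saved_;
    bool has_saved_ = false;
};

// Parses a number the way a file reader might, with '.' guaranteed to be
// the decimal separator while the guard is alive.
double read_number(const char* text) {
    ScopedCLocale guard;  // the temporary variable described above
    return std::strtod(text, nullptr);
}  // guard is destroyed here and the previous locale comes back
```

Note that `std::setlocale` changes state for the whole process, so a guard like this is only safe if no other thread relies on the locale at the same time.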
By the same principle, you can also write the following class to handle the problem of reading strings from a CStdioFile.
// Works around a problem with CStdioFile::ReadString.
// When reading ANSI text, if the locale info is en-us but the MBCS info is zh-cn,
// the result is wrong, so the locale is temporarily set to match the MBCS info.
class CMBCSLocalEnv
{
public:
    CMBCSLocalEnv()
    {
        m_OrgMBCSLoc = NULL;
#ifdef _MSC_VER
        // Save the original CRT locale
        m_OrgMBCSLoc = _tcsdup( _tsetlocale( LC_ALL, NULL ) );
        CPINFOEX cpInfo;
        int nCP = _getmbcp();
        GetCPInfoEx(nCP, 0, &cpInfo);
        // Map the application's current language to an LCID
        ILangSupport* piLangSprt = EyeGetLangSupport();
        ELangAreaID lID = piLangSprt->GetCurLanguageID();
        LCID mbcsID = 0;
        if (lID == LA_ZH_CN_NO_LANG_DLL)
        {
            mbcsID = 0x0804;   // Chinese (Simplified, China)
        }
        else if (lID == LA_ZH_TW_NO_LANG_DLL)
        {
            mbcsID = 0x0404;   // Chinese (Traditional, Taiwan)
        }
        else if (lID == LA_EN_US_NO_LANG_DLL)
        {
            mbcsID = 0x0409;   // English (United States)
        }
        else if (lID == LA_UNKNOWN)
        {
            return;            // leave the locale unchanged
        }
        else
            mbcsID = (LCID)lID;
        // Build "Language_Country.CodePage" for that LCID
        TCHAR cpname[32];
        int nbytes = 0;
        GetLocaleInfo (mbcsID, LOCALE_SENGLANGUAGE, cpname, 32);
        TCHAR cpchain1[128];
        nbytes = GetLocaleInfo (mbcsID, LOCALE_SENGCOUNTRY , cpchain1, 128);
        TCHAR cpchain[7];
        nbytes = GetLocaleInfo (mbcsID, LOCALE_IDEFAULTANSICODEPAGE, cpchain, 7);
        TCHAR NewLocal[256];
        _stprintf_s( NewLocal, 256, _T("%s_%s.%s"),cpname,cpchain1,cpchain);
        _tsetlocale( LC_ALL, NewLocal );
#endif
    }
    ~CMBCSLocalEnv()
    {
        if (m_OrgMBCSLoc == NULL)
        {
            return;
        }
#ifdef _MSC_VER
        // Restore the original CRT locale
        _tsetlocale( LC_ALL, m_OrgMBCSLoc );
        ASSERT( _tcscmp(m_OrgMBCSLoc, _tsetlocale(LC_ALL,NULL) ) == 0 );
        free(m_OrgMBCSLoc);
#endif
    }
private:
    LPTSTR m_OrgMBCSLoc;
};
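Both classes assemble the locale name passed to `_tsetlocale` from three `GetLocaleInfo` results (English language name, English country name, default ANSI code page) using the pattern `%s_%s.%s`. A minimal portable sketch of just that string assembly follows; the helper name `build_locale_name` is hypothetical, and plain `std::string` is assumed in place of the `TCHAR` buffers.

```cpp
#include <cassert>
#include <string>

// Joins the three parts into the "Language_Country.CodePage" form used by
// the classes above, e.g. "English_United States.1252".
std::string build_locale_name(const std::string& language,
                              const std::string& country,
                              const std::string& code_page) {
    return language + "_" + country + "." + code_page;
}
```

For example, `build_locale_name("English", "United States", "1252")` gives the same string that the first class passes to `_tsetlocale` when `bSysDefault` is false.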
Conclusion
When reading data from a file in C++, you should pay attention not only to the encoding of the file but also to errors caused by the locale.
The classes shown above can be used directly.
Thank you for your attention.
@hushuilan
Posted on Utopian.io - Rewarding Open Source Contributors
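I've found that you're really awesome.
You flatter me. There are plenty of far more impressive people on Steemit; mine is nothing special.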
Nice! Messing with encodings and locales is always a single piece of joy ;-)
Haw! Messing with encodings and locales also tends to lead to bugs.
Your contribution cannot be approved because it is not as informative as other contributions. See the Utopian Rules. Contributions need to be informative and descriptive in order to help readers and developers understand them.
You can contact us on Discord.
[utopian-moderator]