Key Changes:
Encoding Specification:
- Changed 'r', encoding='utf-8' to 'r', encoding='utf-8-sig'.
- The '-sig' option tells Python to skip the BOM (Byte Order Mark) at the start of the file if it exists. With plain 'utf-8', the BOM stays in the decoded text as a stray '\ufeff' character, which can break later parsing.
Error Handling:
- Added a try-except block around the file opening code to catch and handle UnicodeDecodeError.
This approach should help you process UTF-8 files that start with a BOM, and the error handling keeps a file in some other character set from crashing the script. However, keep in mind that this is a workaround: 'utf-8-sig' still only decodes UTF-8, so files saved in other encodings will not be read correctly until they are opened with their actual encoding.
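The two changes can be sketched together as follows. This is a minimal illustration, not the original script: the function name read_text and the choice to print a message and return None on failure are assumptions made for the example.

```python
def read_text(path):
    """Return the file's text, or None if it cannot be decoded as UTF-8.

    Hypothetical helper for illustration; the name and the None return
    value are assumptions, not part of the original code.
    """
    try:
        # 'utf-8-sig' decodes UTF-8 and drops a leading BOM if one is present.
        with open(path, 'r', encoding='utf-8-sig') as f:
            return f.read()
    except UnicodeDecodeError as exc:
        # Assumed handling: report the problem so the caller can skip the file.
        print(f"Could not decode {path}: {exc}")
        return None
```

Returning None instead of re-raising lets a loop over many files skip the ones that are not UTF-8 and continue with the rest.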