
I have a file named "abc枚.xlsx", which contains a non-ASCII character, and I'd like to remove all non-ASCII characters so that it is renamed to "abc.xlsx".

Here is what I've tried:

import os
import string
os.chdir(src_dir)  #src_dir is a path to my directory that contains the odd file
for file_name in os.listdir(): 
    new_file_name = ''.join(c for c in file_name if c in string.printable)
    os.rename(file_name, new_file_name)

The following error results at os.rename():

builtins.WindowsError: (2, 'The system cannot find the file specified')

This is on a Windows system; sys.getfilesystemencoding() returns mbcs, if that helps.

How can I get around this error and rename the file?

Vijchti
  • This is Python 3.X, correct? (`os.listdir()` raises an exception on 2.X unless you pass it a path) – Brigand Jul 25 '13 at 22:51
  • Try converting the original filename to Unicode. Your loop will break a multi-byte character into single bytes, and some of them may be invalid filename characters even if they're printable. – Mark Ransom Jul 25 '13 at 23:00
  • @MarkRansom: `file_name` should already be a Unicode string (the optional path defaults to `'.'`, a Unicode string, so `listdir()` must return Unicode strings). – jfs Jul 25 '13 at 23:20
  • Show the output of `print(ascii(file_name), ascii(new_file_name))` for the names that cause the error. – jfs Jul 25 '13 at 23:23
  • @J.F.Sebastian the OP hasn't verified that this is Python 3 yet. – Mark Ransom Jul 26 '13 at 01:59
  • @MarkRansom: path is optional for `listdir()` in Python 3.2+ only. – jfs Jul 26 '13 at 03:43
  • Sorry for the late response. This is Python 3.2. – Vijchti Jul 26 '13 at 17:49
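
The diagnostic suggested in the comments can be sketched roughly as follows. This is only an illustration, assuming Python 3: it applies the question's filter to the example name and prints both names with `ascii()`, which escapes non-ASCII characters so they stay visible on a console that cannot display them. The helper name `strip_non_printable` is hypothetical and simply wraps the expression from the question.

```python
import string


def strip_non_printable(name):
    # Hypothetical helper: keep only characters found in string.printable
    # (ASCII letters, digits, punctuation and whitespace), exactly as the
    # generator expression in the question does.
    return ''.join(c for c in name if c in string.printable)


file_name = 'abc枚.xlsx'  # the example name from the question
new_file_name = strip_non_printable(file_name)

# ascii() escapes every non-ASCII character, so the printed names are
# unambiguous regardless of the console encoding.
print(ascii(file_name), ascii(new_file_name))
```

If the two printed names are identical, the file had nothing to strip, and renaming it to itself is not the source of the problem.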

1 Answer


Here you go; this works with Python 2.7 as well:

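import os
import string

# src_dir is the path to the directory that contains the file, as in the question.
for file_name in os.listdir(src_dir):
    # Keep only the characters found in string.printable (all of them ASCII).
    new_file_name = ''.join(c for c in file_name if c in string.printable)
    # Pass full paths so the rename does not depend on the current working directory.
    os.rename(os.path.join(src_dir, file_name), os.path.join(src_dir, new_file_name))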

Cheers! Don't forget to up-vote if you find this answer useful! ;)
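
The loop above renames every entry, including names that have nothing to strip, and it does not check whether the cleaned name is empty or already taken. A slightly more defensive variant might look like the sketch below. It is not part of the original answer: the function name `rename_to_printable` and its return value are hypothetical, and it assumes Python 3 and that skipping a conflicting name is acceptable.

```python
import os
import string


def rename_to_printable(src_dir):
    """Rename entries in src_dir, dropping characters not in string.printable.

    Hypothetical helper; returns a list of (old_name, new_name) pairs for the
    entries that were actually renamed.
    """
    renamed = []
    for file_name in os.listdir(src_dir):
        new_file_name = ''.join(c for c in file_name if c in string.printable)
        if new_file_name == file_name:
            continue  # nothing to strip, leave the entry alone
        old_path = os.path.join(src_dir, file_name)
        new_path = os.path.join(src_dir, new_file_name)
        # Skip names that would become empty or collide with an existing entry.
        if not new_file_name or os.path.exists(new_path):
            continue
        os.rename(old_path, new_path)
        renamed.append((file_name, new_file_name))
    return renamed
```

Calling `rename_to_printable(src_dir)` on the question's directory would be expected to return the pair for "abc枚.xlsx" and "abc.xlsx", leaving files that are already pure ASCII untouched.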

Simanas
  • Thank you, I'm not yet able to do this with proper utf-8 and regular expression, but at least it worked. – Omiod Jul 02 '16 at 10:29