Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("c:\blablabla.txt", 1)
Do Until objFile.AtEndOfStream
    Wscript.Echo objFile.ReadLine
Loop
objFile.Close
import codecs
import sys

inputfile = sys.argv[1]
f = codecs.open(inputfile, "r", "utf-16")
for line in f:
    print line
C:\test>more i.am.utf16.file.txt
here i am , utf-16 encoded file

C:\test>file i.am.utf16.file.txt
i.am.utf16.file.txt; Little-endian UTF-16 Unicode text, with no line terminators

C:\test>od -c i.am.utf16.file.txt
0000000 ■ h \0 e \0 r \0 e \0 \0 i \0 \0
0000020 a \0 m \0 \0 , \0 \0 u \0 t \0 f \0
0000040 - \0 1 \0 6 \0 \0 e \0 n \0 c \0 o \0
0000060 d \0 e \0 d \0 \0 f \0 i \0 l \0 e \0
0000100 \n \0
0000102

C:\test>python test.py i.am.utf16.file.txt
here i am , utf-16 encoded file
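The od dump above starts with a byte-order mark before the first letter, and that mark is what lets the plain "utf-16" codec pick the byte order by itself. Here is a rough Python 3 sketch of checking for it by hand. The function name detect_utf16_bom and the sample bytes are made up for illustration, and it only looks at UTF-16 marks (it does not try to tell them apart from UTF-32).

```python
import codecs


def detect_utf16_bom(raw: bytes) -> str:
    """Return the codec name implied by a leading UTF-16 BOM, or 'unknown'."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if raw.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return "unknown"


# Hypothetical sample: a little-endian BOM followed by UTF-16LE text.
sample = codecs.BOM_UTF16_LE + "here i am".encode("utf-16-le")

print(detect_utf16_bom(sample))   # codec name implied by the BOM
print(sample.decode("utf-16"))    # the generic codec consumes the BOM itself
```

If the function reports "unknown", the file has no mark and you have to name the byte order yourself when decoding.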
I just realized that what I posted might not work. Do you want all the Unicode characters removed from the string, or do you want them replaced with a space, a box, or something else?

I will reply with new code when you confirm this. I have already put way too much time into this thread, so bear with me until I can figure this out.
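For reference while that gets sorted out, both options asked about above (dropping the characters, or swapping in a placeholder) are short in Python 3. This is only a sketch: strip_non_ascii and replace_non_ascii are names invented here, and "non-ASCII" is assumed to be what is meant by "Unicode characters".

```python
def strip_non_ascii(text: str) -> str:
    """Drop every character that cannot be encoded as ASCII."""
    return text.encode("ascii", errors="ignore").decode("ascii")


def replace_non_ascii(text: str, placeholder: str = " ") -> str:
    """Swap each non-ASCII character for a placeholder string."""
    return "".join(ch if ord(ch) < 128 else placeholder for ch in text)


# The folder name from this thread, with the kana written as an escape.
path = '"C:\\Folder \u304d"'

print(strip_non_ascii(path))
print(replace_non_ascii(path, "?"))
```

Either way the result no longer names the real folder, so this only helps if the string is for display and not for opening the path.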
for /f "usebackq delims=" %x in ('"C:\Folder き"') do set myvar=%x
%ComSpec% /u /c echo:"C:\Folder き">unicode.txt
for /f "delims=" %x in ('type unicode.txt') do set myvar=%x
type unicode.txt
C : \ F o l d e r M0"
What am I doing wrong?
import codecs
import sys

inputfile = sys.argv[1]
f = codecs.open(inputfile, "r", "utf-16")
for line in f:
    print(line)
D:\>more < unicode.txt
"C:\Folder ?"
D:\>chcp 65001
Active code page: 65001

D:\>more < unicode.txt
Not enough memory.
ghostdog74, I installed Python 3.1.2 and saved your script as read.py.

My UTF-16LE file uni.txt contains:
"C:\Folder き"

Both files are inside the current directory.

This is my MS-DOS command prompt:

What am I doing wrong?
D:\>python D:\test.py unicode.txt
Traceback (most recent call last):
  File "D:\test.py", line 5, in <module>
    for line in f:
  File "C:\Python\lib\codecs.py", line 679, in next
    return self.reader.next()
  File "C:\Python\lib\codecs.py", line 610, in next
    line = self.readline()
  File "C:\Python\lib\codecs.py", line 525, in readline
    data = self.read(readsize, firstline=True)
  File "C:\Python\lib\codecs.py", line 472, in read
    newchars, decodedbytes = self.decode(data, self.errors)
  File "C:\Python\lib\encodings\utf_16.py", line 90, in decode
    raise UnicodeError,"UTF-16 stream does not start with BOM"
UnicodeError: UTF-16 stream does not start with BOM
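The error above says the file has no byte-order mark, so the generic "utf-16" stream reader refuses to guess the byte order. One possible workaround, sketched below for Python 3, is to check for the mark yourself and fall back to an explicit codec. The helper names decode_utf16 and read_utf16_file are invented for this example, and the little-endian default is an assumption (it matches the UTF-16LE file described earlier, but it is not something Python can confirm for you).

```python
import codecs


def decode_utf16(raw: bytes, default: str = "utf-16-le") -> str:
    """Decode UTF-16 bytes, using `default` when no BOM is present."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # A BOM is present, so the generic codec can work out the byte order.
        return raw.decode("utf-16")
    # No BOM: assume the caller-supplied byte order (little-endian by default).
    return raw.decode(default)


def read_utf16_file(path: str, default: str = "utf-16-le") -> str:
    """Read a whole file in binary mode and decode it with decode_utf16."""
    with open(path, "rb") as handle:
        return decode_utf16(handle.read(), default)


# Hypothetical BOM-less input, standing in for the contents of unicode.txt.
raw = '"C:\\Folder \u304d"'.encode("utf-16-le")
print(ascii(decode_utf16(raw)))  # ascii() keeps the console encoding out of it
```

On a real file it would be called as read_utf16_file("unicode.txt"). Printing the decoded text directly can still fail or show "?" if the console code page has no glyph for the character, which is a separate problem from decoding.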
data = open("c:\\test\\file1", 'rb').read()
decoded = data.decode('utf-16')
print decoded