118} How to handle irregular empty fields of a CSV file?
This comes out of a real-life production script problem I had
myself. The CSV (Comma Separated Values) file format is useful for
many tasks because of its relation to Excel. Consider the following,
simple example CSV file myfile.csv (mimics one common address book
format)
a11,a12,a13
a21,,a23
The script
@echo off & setlocal enableextensions enabledelayedexpansion
for /f "tokens=1-3 delims=," %%a in ('type "myfile.csv"') do (
set s1_=%%a
set s2_=%%b
set s3_=%%c
if defined s1_ (echo !s1_!) else (echo ..)
if defined s2_ (echo !s2_!) else (echo ..)
if defined s3_ (echo !s3_!) else (echo ..)
)
endlocal & goto :EOF
will produce
a11
a12
a13
a21
a23
..
The .. line obviously is mislocated. In other words, the last two
items are in a wrong order.
Is there a way out? One option is inserting a "-" within the empty ,,
pair before going through the file
@echo off & setlocal enableextensions enabledelayedexpansion
::
:: Make a demonstration test file
echo a11,a12,a13>"mytest.csv"
echo a21,,a23>>"mytest.csv"
::
:: The actual routine
for /f "delims=" %%a in ('type "mytest.csv"') do (
set s_=%%a
set s_=!s_:,,=,-,!
set s_=!s_:,,=,-,!
for /f "tokens=1-3 delims=," %%a in ("!s_!") do (
set s1_=%%a
set s2_=%%b
set s3_=%%c
if not [!s1_!]==[-] (echo !s1_!) else (echo ..)
if not [!s2_!]==[-] (echo !s2_!) else (echo ..)
if not [!s3_!]==[-] (echo !s3_!) else (echo ..)
)
)
::
:: Clean up
for %%f in ("mytest.csv") do if exist %%f del %%f
endlocal & goto :EOF
Now the output is the more logical
a11
a12
a13
a21
..
a23
There is a subtle catch. Performing the substitution twice in the
script above is not a mistake. It is required since in the more
generic case where can be either an odd or and even number of
adjacent ,, pairs. Ensuring the substitution globally takes two
passes. This is not only script-related. Even any text editor or
SED requires the same.
Also a Visual Basic Script (VBScript) aided solution can be used
@echo off & setlocal enableextensions
::
:: Make a demonstration test file
echo a11,a12,a13>"mytest.csv"
echo a21,,a23>>"mytest.csv"
::
:: Build a Visual Basic Script
set skip=
set vbs_=%temp%\tmp$$$.vbs
findstr "'%skip%VBS" "%~f0" > "%vbs_%"
::
:: Run the script with Microsoft Windows Script Host Version 5.6
cscript //nologo "%vbs_%" < "mytest.csv"
::
:: Clean up
for %%f in ("%vbs_%" "mytest.csv") do if exist %%f del %%f
endlocal & goto :EOF
'
'.............................................
'The Visual Basic Script
'
Do While Not WScript.StdIn.AtEndOfStream 'VBS
str = WScript.StdIn.ReadLine 'VBS
str = str + "," 'VBS
str = Replace (str, ",,",",-,") 'VBS
str = Replace (str, ",,",",-,") 'VBS
'
p1 = Instr (1, str, ",", 1) 'VBS
str1 = Mid (str, 1, p1-1) 'VBS
If str1 <> "-" Then 'VBS
WScript.StdOut.WriteLine str1 'VBS
Else 'VBS
WScript.StdOut.WriteLine ".." 'VBS
End If 'VBS
'
p2 = Instr (p1+1, str, ",", 1) 'VBS
str2 = Mid (str, p1+1, p2-p1-1) 'VBS
If str2 <> "-" Then 'VBS
WScript.StdOut.WriteLine str2 'VBS
Else 'VBS
WScript.StdOut.WriteLine ".." 'VBS
End If 'VBS
'
p3 = Instr (p2+1, str, ",", 1) 'VBS
str3 = Mid (str, p2+1, p3-p2-1) 'VBS
If str3 <> "-" Then 'VBS
WScript.StdOut.WriteLine str3 'VBS
Else 'VBS
WScript.StdOut.WriteLine ".." 'VBS
End If 'VBS
'
WScript.StdOut.WriteLine 'VBS
Loop 'VBS
The output will be
C:\_D\TEST>cmdfaq
a11
a12
a13
a21
..
a23